Amazon Polly makes it easy to convert text into natural-sounding speech using AI-powered voices. Whether you prefer clicking through a web interface or automating everything on a Linux server, Polly has you covered.

In this guide, we’ll:

  • Launch an Amazon Linux EC2 instance
  • Use Amazon Polly from the AWS Console (GUI)
  • Generate speech using the AWS CLI
  • Create audio files programmatically with Python

What Is Amazon Polly?

Amazon Polly is a managed text-to-speech service that:

  • Converts text into lifelike speech
  • Supports multiple languages and neural voices
  • Outputs MP3, Ogg Vorbis, and PCM audio formats
  • Requires no infrastructure management

Prerequisites

You’ll need:

  • An AWS account
  • An EC2 key pair
  • Basic Linux knowledge
  • An IAM user or role with Polly permissions

Step 1: Launch an Amazon Linux EC2 Instance

  1. Go to the AWS EC2 console
  2. Click Launch Instance
  3. Choose Amazon Linux 2 as the AMI
  4. Select t2.micro or t3.micro as the instance type
  5. Select your EC2 key pair
  6. Allow SSH (port 22) in the security group
  7. Launch the instance

Copy the public IP address once the instance is running.


Step 2: Connect to the EC2 Instance

ssh -i your-key.pem ec2-user@<EC2_PUBLIC_IP>

You are now logged into your Amazon Linux server.


Step 3: Using Amazon Polly via the AWS Console (GUI)

Before touching the command line, let’s explore Polly using the AWS Management Console — this is the fastest way to experiment.

Accessing the Polly Console

  1. Log in to the AWS Management Console
  2. Search for Polly
  3. Click Amazon Polly
  4. Open the Text-to-Speech page

No EC2 instance is required for this step.


Generating Speech in the GUI

  1. In the Text-to-Speech editor, enter your text:
Welcome to Amazon Polly. This audio was created using the AWS Console.

  2. Choose a voice (e.g., Joanna, Matthew)
  3. Select Engine:
    • Standard
    • Neural (more natural, recommended)
  4. Choose Language
  5. Click Listen ▶️

You’ll hear the generated speech instantly.
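
The voice and engine choices in the console can also be looked up from code. The following is a minimal sketch, assuming boto3 is installed and credentials are configured (both covered in later steps). The function name list_voices and its defaults are illustrative, not part of any AWS API; describe_voices is the underlying Polly call.

```python
def list_voices(language_code="en-US", engine="neural", client=None):
    """Return the sorted voice IDs Polly reports for a language and engine."""
    if client is None:
        # Imported lazily so the function can be exercised with a stub client.
        import boto3
        client = boto3.client("polly")

    voice_ids = []
    params = {"LanguageCode": language_code, "Engine": engine}
    while True:
        page = client.describe_voices(**params)
        voice_ids.extend(voice["Id"] for voice in page["Voices"])
        token = page.get("NextToken")
        if not token:
            break
        # More results are available; request the next page.
        params["NextToken"] = token
    return sorted(voice_ids)
```

Calling list_voices() with no arguments should return the neural US English voices available in your configured region. The exact list depends on the region.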


Downloading the Audio File

  1. Select Output format (MP3 or OGG)
  2. Click Download
  3. Save the file locally

This is perfect for:

  • Testing voices
  • Demos and presentations
  • Content creation workflows

Using SSML in the GUI (Optional)

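Enable the SSML option to control how the text is spoken. Note that some tags, including <emphasis>, are supported only by the Standard engine: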

<speak>
  Welcome to <emphasis level="strong">Amazon Polly</emphasis>.
  <break time="1s"/>
  This is an example using SSML.
</speak>

SSML allows:

  • Pauses
  • Emphasis
  • Speaking rate control
  • Pronunciation tuning
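
The same SSML can be sent from code. The sketch below is one possible approach, assuming boto3 and working credentials. The helper names build_ssml and synthesize_ssml are hypothetical. It uses only the <break> tag so that it behaves the same with either engine, and it escapes the input so characters such as & do not break the markup.

```python
from xml.sax.saxutils import escape


def build_ssml(sentences, pause="1s"):
    """Wrap plain sentences in <speak>, separated by a <break> of the given length."""
    separator = f'<break time="{pause}"/>'
    body = separator.join(escape(sentence) for sentence in sentences)
    return f"<speak>{body}</speak>"


def synthesize_ssml(ssml, out_path, voice_id="Joanna", client=None):
    """Send SSML to Polly and write the resulting MP3 to out_path."""
    if client is None:
        # Imported lazily so build_ssml can be used without boto3 installed.
        import boto3
        client = boto3.client("polly")

    response = client.synthesize_speech(
        Text=ssml,
        TextType="ssml",  # tells Polly the input is SSML rather than plain text
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    with open(out_path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())
```

A typical call would be synthesize_ssml(build_ssml(["Welcome to Amazon Polly.", "This is an example using SSML."]), "ssml-output.mp3").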

Step 4: Configure AWS Credentials on Linux

Recommended: IAM Role

Attach an IAM role to the EC2 instance with:

  • AmazonPollyFullAccess

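With a role attached, no access keys need to be stored on the server. The CLI and SDK still need to know which region to use, which you can set by running: aws configure set region us-east-1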

Alternative: AWS CLI Credentials

aws configure

Enter:

  • Access key
  • Secret key
  • Region (e.g., us-east-1)
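
Before calling Polly, it can help to confirm that credentials and a region are actually being picked up, whichever option you chose. This is a small sketch, assuming boto3 is installed (see Step 6). The function name describe_aws_setup and the keys of the returned dictionary are made up for illustration.

```python
def describe_aws_setup(session=None):
    """Report whether boto3 can resolve credentials and a default region."""
    if session is None:
        # Imported lazily so the function can be exercised with a stub session.
        import boto3
        session = boto3.Session()

    credentials = session.get_credentials()  # None when nothing is configured
    return {
        "has_credentials": credentials is not None,
        # How the credentials were found, e.g. an IAM role or a shared credentials file.
        "credential_source": getattr(credentials, "method", None),
        "region": session.region_name,
    }
```

If has_credentials is False or region is None, revisit the IAM role attachment or run aws configure again before moving on.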

Step 5: Using Amazon Polly from the AWS CLI

Generate speech directly from Linux:

aws polly synthesize-speech \
  --voice-id Joanna \
  --output-format mp3 \
  --text "This audio was generated from the AWS CLI" \
  cli-output.mp3

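Install an audio player (on Amazon Linux 2, mpg123 may not be available in the default repositories, so you may need to enable EPEL first):

sudo yum install -y mpg123

Play the file. An EC2 instance has no audio output device, so to actually hear the result, copy the file to your local machine (for example with scp) and play it there:

mpg123 cli-output.mp3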

Step 6: Using Amazon Polly with Python

Install Dependencies

sudo yum install -y python3 python3-pip
pip3 install --user boto3

Python Script Example

Create the script:

nano polly_tts.py

Add:

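import boto3

# Create a Polly client; credentials and region come from the IAM role or aws configure
polly = boto3.client("polly")

# Request MP3 audio for the given text
response = polly.synthesize_speech(
    Text="Hello from Amazon Polly using Python on Amazon Linux",
    OutputFormat="mp3",
    VoiceId="Matthew"
)

# AudioStream is a streaming body; read it fully and write the bytes to disk
with open("python-output.mp3", "wb") as audio_file:
    audio_file.write(response["AudioStream"].read())

print("Audio file created: python-output.mp3")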

Run it:

python3 polly_tts.py
mpg123 python-output.mp3
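
For anything beyond a one-off script, the same call can be wrapped in a reusable function. The sketch below is one way to do it, not the only one. It assumes the chosen voice supports the neural engine in your region (Matthew does in many regions, but check yours), and the name synthesize_to_file is hypothetical. The Engine parameter selects between the Standard and Neural engines described in Step 3.

```python
from contextlib import closing


def synthesize_to_file(text, out_path, voice_id="Matthew", engine="neural", client=None):
    """Synthesize text with Polly, save it as MP3, and return the byte count."""
    if client is None:
        # Imported lazily so the function can be exercised with a stub client.
        import boto3
        client = boto3.client("polly")

    response = client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
        Engine=engine,  # "standard" or "neural"
    )
    # Close the streaming body once the audio has been read.
    with closing(response["AudioStream"]) as stream:
        audio = stream.read()

    with open(out_path, "wb") as audio_file:
        audio_file.write(audio)
    return len(audio)
```

Accepting an optional client makes it easy to reuse one client across many calls, and to substitute a stub in tests so no AWS request is made.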

Comparing GUI vs CLI vs Code

Method               Best For
AWS Console (GUI)    Voice testing, demos, learning
AWS CLI              Automation, scripting
Python / SDK         Application integration

Security Best Practices

  • Prefer IAM roles over access keys
  • Use least-privilege IAM policies
  • Monitor usage with CloudWatch
  • Avoid committing credentials to Git
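
As an illustration of least privilege, a policy for the examples in this guide only needs to allow speech synthesis and voice lookup. The sketch below builds such a policy document in Python and prints it as JSON. The function name and the Sid value are made up for this example, and you should review the actions against what your own application calls.

```python
import json


def polly_synthesis_policy():
    """Return an IAM policy document that allows only synthesis and voice listing."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowSpeechSynthesisOnly",  # illustrative statement ID
                "Effect": "Allow",
                "Action": ["polly:SynthesizeSpeech", "polly:DescribeVoices"],
                "Resource": "*",
            }
        ],
    }


# Print the policy as JSON so it can be pasted into the IAM policy editor.
print(json.dumps(polly_synthesis_policy(), indent=2))
```

Attaching a policy like this to the EC2 role is narrower than AmazonPollyFullAccess, which also permits actions such as managing lexicons.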

Conclusion

Amazon Polly is flexible enough for beginners and powerful enough for production systems. Whether you use the AWS Console GUI, CLI, or Python SDK, Polly lets you bring natural-sounding speech to your applications quickly and securely.

Once you’re comfortable, you can combine Polly with:

  • S3 for audio storage
  • Lambda for serverless processing
  • Transcribe for full speech workflows
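
As a starting point for the S3 combination, the sketch below synthesizes speech and uploads the MP3 to a bucket. It assumes the bucket already exists and that the IAM role also allows s3:PutObject on it. The function name speech_to_s3 and the bucket and key in the usage note are placeholders.

```python
def speech_to_s3(text, bucket, key, voice_id="Joanna", polly=None, s3=None):
    """Synthesize text with Polly and store the MP3 in S3; return its s3:// URI."""
    if polly is None or s3 is None:
        # Imported lazily so the function can be exercised with stub clients.
        import boto3
        polly = polly or boto3.client("polly")
        s3 = s3 or boto3.client("s3")

    response = polly.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    audio = response["AudioStream"].read()

    # ContentType lets browsers and players recognize the object as MP3 audio.
    s3.put_object(Bucket=bucket, Key=key, Body=audio, ContentType="audio/mpeg")
    return f"s3://{bucket}/{key}"
```

For example, speech_to_s3("Hello from Polly", "my-audio-bucket", "greetings/hello.mp3") would return s3://my-audio-bucket/greetings/hello.mp3. The same function body could be called from a Lambda handler. For long texts, Polly's asynchronous start_speech_synthesis_task call, which writes to S3 directly, may be a better fit.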

Happy building—and enjoy giving your apps a voice 🔊🚀