Creating an AWS Linux System and Using Amazon Polly (CLI, Python, and GUI)
Amazon Polly makes it easy to convert text into natural-sounding speech using AI-powered voices. Whether you prefer clicking through a web interface or automating everything on a Linux server, Polly has you covered.
In this guide, we’ll:
- Launch an Amazon Linux EC2 instance
- Use Amazon Polly from the AWS Console (GUI)
- Generate speech using the AWS CLI
- Create audio files programmatically with Python
What Is Amazon Polly?
Amazon Polly is a managed text-to-speech service that:
- Converts text into lifelike speech
- Supports multiple languages and neural voices
- Outputs MP3, Ogg Vorbis, and PCM audio
- Requires no infrastructure management
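To check which voices support a given engine, Polly's DescribeVoices API can be queried. The following is a minimal sketch, assuming boto3 is installed and credentials are configured (both are covered later in this guide). The helper names `voices_for_engine` and `fetch_voices` are illustrative and not part of boto3.

```python
def voices_for_engine(voices, engine="neural"):
    """Return sorted voice IDs from a DescribeVoices result that support `engine`."""
    return sorted(
        v["Id"] for v in voices if engine in v.get("SupportedEngines", [])
    )


def fetch_voices(language_code="en-US", region_name="us-east-1"):
    """Call DescribeVoices, following pagination (needs boto3, credentials, network)."""
    import boto3  # imported lazily so the pure helper above works without boto3

    polly = boto3.client("polly", region_name=region_name)
    voices, token = [], None
    while True:
        kwargs = {"LanguageCode": language_code}
        if token:
            kwargs["NextToken"] = token
        page = polly.describe_voices(**kwargs)
        voices.extend(page["Voices"])
        token = page.get("NextToken")
        if not token:
            return voices
```

On a configured machine, `print(voices_for_engine(fetch_voices()))` would list the US English voices that offer the neural engine.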
Prerequisites
You’ll need:
- An AWS account
- An EC2 key pair
- Basic Linux knowledge
- An IAM user or role with Polly permissions
Step 1: Launch an Amazon Linux EC2 Instance
- Go to the AWS EC2 console
- Click Launch Instance
- Choose Amazon Linux 2
- Select t2.micro or t3.micro
- Allow SSH (port 22) in the security group
- Launch the instance
Copy the public IP address once the instance is running.
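The console steps above can also be scripted. Here is a rough sketch using boto3, assuming your credentials allow EC2 and SSM calls. The key pair name and security group ID are placeholders you would supply, the function names are made up for this example, and the AMI is looked up from the public SSM parameter that AWS publishes for Amazon Linux 2.

```python
# Public SSM parameter that resolves to the latest Amazon Linux 2 AMI in a region
AL2_AMI_PARAMETER = "/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2"


def build_launch_params(ami_id, key_name, security_group_id, instance_type="t3.micro"):
    """Build keyword arguments for ec2.run_instances (a single instance)."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "KeyName": key_name,
        "SecurityGroupIds": [security_group_id],
        "MinCount": 1,
        "MaxCount": 1,
    }


def launch_instance(key_name, security_group_id, region_name="us-east-1"):
    """Launch one instance, wait until it runs, and return its public IP (if any)."""
    import boto3  # lazy import: only needed when actually calling AWS

    ssm = boto3.client("ssm", region_name=region_name)
    ami_id = ssm.get_parameter(Name=AL2_AMI_PARAMETER)["Parameter"]["Value"]

    ec2 = boto3.client("ec2", region_name=region_name)
    params = build_launch_params(ami_id, key_name, security_group_id)
    instance_id = ec2.run_instances(**params)["Instances"][0]["InstanceId"]

    # Block until the instance reaches the "running" state
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    desc = ec2.describe_instances(InstanceIds=[instance_id])
    return desc["Reservations"][0]["Instances"][0].get("PublicIpAddress")
```

The security group passed in is assumed to already allow inbound SSH on port 22, as in the console walkthrough.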
Step 2: Connect to the EC2 Instance
ssh -i your-key.pem ec2-user@<EC2_PUBLIC_IP>
You are now logged into your Amazon Linux server.
Step 3: Using Amazon Polly via the AWS Console (GUI)
Before running any Polly commands on the instance, explore the service in the AWS Management Console. This is the fastest way to experiment.
Accessing the Polly Console
- Log in to the AWS Management Console
- Search for Polly
- Click Amazon Polly
- Open the Text-to-Speech page
No EC2 instance is required for this step.
Generating Speech in the GUI
In the Text-to-Speech editor:
- Enter your text:
  Welcome to Amazon Polly. This audio was created using the AWS Console.
- Choose a voice (e.g., Joanna, Matthew)
- Select Engine:
  - Standard
  - Neural (more natural, recommended)
- Choose Language
- Click Listen ▶️
You’ll hear the generated speech instantly.
Downloading the Audio File
- Select Output format (MP3 or OGG)
- Click Download
- Save the file locally
This is perfect for:
- Testing voices
- Demos and presentations
- Content creation workflows

Using SSML in the GUI (Optional)
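Turn on the SSML option to control how the text is spoken. The emphasis tag is only available with standard voices, so select the Standard engine for this example: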
<speak>
Welcome to <emphasis level="strong">Amazon Polly</emphasis>.
<break time="1s"/>
This is an example using SSML.
</speak>
SSML allows:
- Pauses
- Emphasis
- Speaking rate control
- Pronunciation tuning
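The same SSML features can be used from code. The sketch below builds an SSML document from plain sentences, escaping characters such as & that would otherwise break the markup, and passes it to Polly with TextType set to "ssml". It assumes boto3 and configured credentials (see the later steps). `build_ssml` and `synthesize_ssml` are hypothetical helper names.

```python
from xml.sax.saxutils import escape


def build_ssml(sentences, pause="1s", rate=None):
    """Join sentences into one SSML document with a pause between them."""
    body = f'<break time="{pause}"/>'.join(escape(s) for s in sentences)
    if rate:
        # Optional speaking-rate control, e.g. "slow", "fast", or "90%"
        body = f'<prosody rate="{rate}">{body}</prosody>'
    return f"<speak>{body}</speak>"


def synthesize_ssml(ssml, path, voice_id="Joanna"):
    """Send SSML to Polly and save the MP3 (needs boto3, credentials, network)."""
    import boto3  # lazy import so build_ssml works without boto3

    polly = boto3.client("polly")
    response = polly.synthesize_speech(
        Text=ssml, TextType="ssml", OutputFormat="mp3", VoiceId=voice_id
    )
    with open(path, "wb") as out:
        out.write(response["AudioStream"].read())
    return path
```

For example, `build_ssml(["Welcome to Amazon Polly.", "This is an example using SSML."])` produces a document with a one-second pause between the two sentences.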
Step 4: Configure AWS Credentials on Linux
Recommended: IAM Role
Attach an IAM role to the EC2 instance with:
- AmazonPollyFullAccess
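With a role attached, no access keys need to be stored on the server. The CLI and SDK still need a default Region, which you can set with: aws configure set region us-east-1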
Alternative: AWS CLI Credentials
aws configure
When prompted, enter:
- Access key ID
- Secret access key
- Default Region (e.g., us-east-1)
- Default output format (e.g., json)
Step 5: Using Amazon Polly from the AWS CLI
Generate speech directly from Linux:
aws polly synthesize-speech \
--voice-id Joanna \
--output-format mp3 \
--text "This audio was generated from the AWS CLI" \
cli-output.mp3
Install an audio player:
sudo yum install -y mpg123
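Play the file (an EC2 instance has no sound device, so to actually hear it, copy the file to your own machine with scp and play it there):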
mpg123 cli-output.mp3
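For batch jobs, the CLI call above can be wrapped in a small script. This is one possible sketch in Python, assuming the aws command is on the PATH and already configured. The function names and the output file naming scheme are invented for this example.

```python
import subprocess


def build_polly_command(text, outfile, voice_id="Joanna", output_format="mp3"):
    """Return the argument list for `aws polly synthesize-speech`."""
    return [
        "aws", "polly", "synthesize-speech",
        "--voice-id", voice_id,
        "--output-format", output_format,
        "--text", text,
        outfile,
    ]


def synthesize_lines(lines, prefix="line"):
    """Run the CLI once per line of text and return the files written."""
    outputs = []
    for index, text in enumerate(lines, start=1):
        outfile = f"{prefix}-{index:02d}.mp3"
        # check=True raises CalledProcessError if the CLI exits with an error
        subprocess.run(build_polly_command(text, outfile), check=True)
        outputs.append(outfile)
    return outputs
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the text contains spaces or quotation marks.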
Step 6: Using Amazon Polly with Python
Install Dependencies
sudo yum install -y python3 python3-pip
pip3 install --user boto3
Python Script Example
Create the script:
nano polly_tts.py
Add:
import boto3

# Uses the Region and credentials configured in Step 4
polly = boto3.client("polly")

# Request MP3 audio for the given text
response = polly.synthesize_speech(
    Text="Hello from Amazon Polly using Python on Amazon Linux",
    OutputFormat="mp3",
    VoiceId="Matthew"
)

# AudioStream is a streaming body; write its bytes to disk
with open("python-output.mp3", "wb") as file:
    file.write(response["AudioStream"].read())

print("Audio file created: python-output.mp3")
Run it:
python3 polly_tts.py
mpg123 python-output.mp3
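The script above works for short text, but a single SynthesizeSpeech request accepts only a limited amount of input (3,000 billed characters at the time of writing). A rough way to handle longer text is to split it at sentence boundaries and append the audio from each request to one file. The helper names below are hypothetical, and joining MP3 segments by simple appending usually plays fine but is not guaranteed for every player.

```python
import re
from contextlib import closing

MAX_BILLED_CHARS = 3000  # per-request limit assumed here; check the current quota


def split_text(text, max_chars=MAX_BILLED_CHARS):
    """Split text into chunks of at most max_chars, breaking at sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut into fixed-size pieces
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def synthesize_to_file(text, path, voice_id="Matthew", engine="standard"):
    """Synthesize text chunk by chunk and append the MP3 bytes to `path`."""
    import boto3  # lazy import so split_text can be used without boto3

    polly = boto3.client("polly")
    with open(path, "wb") as out:
        for chunk in split_text(text):
            response = polly.synthesize_speech(
                Text=chunk, OutputFormat="mp3", VoiceId=voice_id, Engine=engine
            )
            # Close the streaming body once its bytes are written
            with closing(response["AudioStream"]) as stream:
                out.write(stream.read())
    return path
```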
Comparing GUI vs CLI vs Code
| Method | Best For |
|---|---|
| AWS Console (GUI) | Voice testing, demos, learning |
| AWS CLI | Automation, scripting |
| Python / SDK | Application integration |
Security Best Practices
- Prefer IAM roles over access keys
- Use least-privilege IAM policies
- Monitor usage with CloudWatch
- Avoid committing credentials to Git
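As an illustration of least privilege, the snippet below prints a policy document that allows only the two Polly actions this guide uses. It is a sketch rather than a recommended production policy, and the function name is made up for this example. These two actions do not take resource-level restrictions, which is why the resource is "*".

```python
import json


def polly_synthesis_policy():
    """Return an IAM policy document limited to listing voices and synthesizing speech."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["polly:SynthesizeSpeech", "polly:DescribeVoices"],
                "Resource": "*",
            }
        ],
    }


# Print JSON that can be pasted into the IAM policy editor
print(json.dumps(polly_synthesis_policy(), indent=2))
```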
Conclusion
Amazon Polly is flexible enough for beginners and powerful enough for production systems. Whether you use the AWS Console GUI, CLI, or Python SDK, Polly lets you bring natural-sounding speech to your applications quickly and securely.
Once you’re comfortable, you can combine Polly with:
- S3 for audio storage
- Lambda for serverless processing
- Transcribe for full speech workflows
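As a starting point for the S3 option, Polly can write audio directly to a bucket through its asynchronous StartSpeechSynthesisTask API. The sketch below assumes a bucket you own and a role that is also allowed to put objects into it. The bucket name is a placeholder and the helper names are illustrative.

```python
def build_s3_task_params(text, bucket, key_prefix="polly/", voice_id="Joanna"):
    """Build keyword arguments for polly.start_speech_synthesis_task."""
    return {
        "Text": text,
        "VoiceId": voice_id,
        "OutputFormat": "mp3",
        "OutputS3BucketName": bucket,
        "OutputS3KeyPrefix": key_prefix,
    }


def start_s3_task(text, bucket, region_name="us-east-1"):
    """Start an asynchronous synthesis task and return (task ID, output URI)."""
    import boto3  # lazy import: only needed when actually calling AWS

    polly = boto3.client("polly", region_name=region_name)
    task = polly.start_speech_synthesis_task(**build_s3_task_params(text, bucket))
    return task["SynthesisTask"]["TaskId"], task["SynthesisTask"]["OutputUri"]
```

The task runs in the background, so the audio file may take a moment to appear at the returned URI.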
Happy building—and enjoy giving your apps a voice 🔊🚀
