Getting Started
Generating Audio With a Prompt
The prompt attribute on the SSML speak tag lets you control the style, emotion, or context of your synthesized audio dynamically, without creating a new voice. It tells the synthesis system how to interpret and vocalize your content, for both text-to-speech and speech-to-speech, so the audio output sounds more natural and better suited to its context.
By adding a prompt to your SSML, you can:
- Adjust the speaking style (casual, formal, enthusiastic)
- Set an emotional tone (happy, serious, sympathetic)
- Provide context for the content (news reading, storytelling, conversation)
- Customize the delivery without creating multiple voice designs
How It Works
When you include a prompt attribute in the speak tag of your SSML, the TTS system uses it as guidance for how to render the audio. This is similar to creating a voice from a prompt, except that the styling is applied dynamically at synthesis time.
<speak prompt="Speak this like an excited sports announcer">
The home team just scored the winning goal in the final seconds!
</speak>
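If the prompt or the text comes from user input or a config file, both need XML escaping before they are placed in the SSML. The following is a minimal sketch in Python using only the standard library; `build_prompt_ssml` is a hypothetical helper name for illustration, not part of any SDK.

```python
from xml.sax.saxutils import escape, quoteattr


def build_prompt_ssml(text: str, prompt: str) -> str:
    """Wrap text in a <speak> element carrying a prompt attribute.

    Hypothetical helper: it only builds the SSML string and does not
    call any synthesis service.
    """
    # quoteattr escapes &, <, > and chooses a quoting style that is safe
    # for the attribute value; escape handles &, <, > in the element body.
    return f"<speak prompt={quoteattr(prompt)}>{escape(text)}</speak>"


ssml = build_prompt_ssml(
    "The home team just scored the winning goal in the final seconds!",
    "Speak this like an excited sports announcer",
)
print(ssml)
```

The resulting string can then be sent wherever you would normally submit SSML.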
SSML Prompt Examples
Basic Prompt
<speak prompt="Speak in a calm, soothing voice">
Welcome to your guided meditation session. Let's begin by taking a deep breath.
</speak>
Contextual Speaking Styles
<speak prompt="You are a knowledgeable science educator explaining complex concepts in an engaging way">
Black holes form when massive stars collapse at the end of their life cycle.
The gravity is so strong that nothing, not even light, can escape once it passes the event horizon.
</speak>
Emotional Delivery
<speak prompt="Speak with excitement and enthusiasm, like you're sharing amazing news">
We've just received confirmation that our project has been approved with full funding!
</speak>
Combined with Other SSML Features
The prompt attribute works alongside other SSML features:
<speak prompt="Speak in an excited, upbeat tone">
<prosody pitch="x-high">
Wow, this is a really exciting announcement!
</prosody>
</speak>
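When nesting other SSML elements inside a prompted speak tag, generating the markup with an XML library keeps the document well-formed. This sketch uses Python's standard `xml.etree.ElementTree` to reproduce the example above; it only builds the string and assumes nothing about how you submit it.

```python
import xml.etree.ElementTree as ET

# Root <speak> element with the prompt attribute.
speak = ET.Element("speak", {"prompt": "Speak in an excited, upbeat tone"})

# Nested <prosody> element; the prompt and the prosody settings are combined.
prosody = ET.SubElement(speak, "prosody", {"pitch": "x-high"})
prosody.text = "Wow, this is a really exciting announcement!"

# encoding="unicode" returns a str without an XML declaration.
ssml = ET.tostring(speak, encoding="unicode")
print(ssml)
```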
Best Practices
- Keep prompts concise and focused
- Use natural language to describe the desired style
- Experiment with different prompts to find the best fit for your content
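To experiment with different prompts, one option is to generate several SSML variants of the same text and compare the resulting audio. The sketch below is illustrative only: the candidate prompts are sample values, `prompt_variants` is a hypothetical helper, and the synthesis step is left out because it depends on your setup.

```python
from xml.sax.saxutils import escape, quoteattr

TEXT = "We've just received confirmation that our project has been approved with full funding!"

# Illustrative candidates only; not recommended values.
CANDIDATE_PROMPTS = [
    "Speak with excitement and enthusiasm",
    "Speak in a calm, professional tone",
    "Speak like a news anchor reading a headline",
]


def prompt_variants(text: str, prompts: list[str]) -> list[str]:
    """Return one SSML string per candidate prompt, all wrapping the same text."""
    return [f"<speak prompt={quoteattr(p)}>{escape(text)}</speak>" for p in prompts]


variants = prompt_variants(TEXT, CANDIDATE_PROMPTS)
for ssml in variants:
    print(ssml)
```

Synthesizing each variant and listening to them side by side makes it easier to pick the prompt that best fits your content.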