What You'll Learn
When to Use This Guide
Use these optimization techniques when your AI-generated video or audio card:
Mispronounces company names, acronyms, or technical terms
Has awkward pacing or sounds rushed
Needs more natural pauses between sentences or ideas
Requires emphasis on specific words or phrases
Sounds unnatural despite having correct text
These techniques work for both:
Video cards with AI avatars and voiceover
Audio cards with AI voices (you can choose from multiple voice options per language)
Not a pronunciation issue? If your video was rejected for content policy reasons, see: Avoid your video getting rejected
Quick Tips for Better AI Voice Quality
Follow these essential practices to improve your AI-generated content:
Don't mix languages
Example: Don't use English words in a Spanish script. The AI detects language automatically and mixing confuses the voice engine.
Spell words correctly
Make sure you've used correct spelling in your script. Misspellings can cause pronunciation errors or unexpected results.
Insert breaks if needed
You can add pauses into your script by inserting break tags:
<break time="2s" /> (creates a 2-second pause)
Use punctuation marks
A script without proper commas and periods sounds too fast and hard to follow. Use periods, commas, hyphens, and question marks to help the AI sound natural.
Fix pronunciation with hyphens
Sometimes splitting words with hyphens helps the AI pronounce them correctly:
Example: "con-tent" instead of "content"
Most Important Tip
Improving voice quality is all about creative use of:
Periods and commas for pacing
Break tags for strategic pauses
Hyphens for pronunciation control
Don't be afraid to experiment with different combinations to get the sound you want!
Adding Breaks and Pauses
Our AI voices support SSML (Speech Synthesis Markup Language). The most useful feature is the ability to add custom breaks wherever you need them.
How to Add Breaks
Wherever you want a pause in your text, insert a break tag. You can specify the time in seconds (s) or milliseconds (ms):
<break time="2s" />
Example Usage
Original text:
Hey John! How are you doing today?
Problem: The default break after "John!" might feel too short or too long.
Solution: Add a custom break:
Hey John!<break time="50ms"/>How are you doing today?
When to use breaks:
Separate sentences for clarity
Add dramatic pauses for emphasis
Create breathing room between ideas
Slow down rushed sections
Works for both video and audio cards!
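If you prepare scripts outside the editor, a small helper can keep break tags consistent. The following is a minimal sketch in Python, not a 7taps feature: the function names `break_tag` and `join_with_pauses` are hypothetical, and it assumes only what this guide states, namely that break times can be given in seconds or milliseconds.

```python
def break_tag(milliseconds: int) -> str:
    """Return an SSML break tag for a pause of the given length."""
    if milliseconds <= 0:
        raise ValueError("pause length must be positive")
    # Use whole seconds when possible, otherwise milliseconds.
    if milliseconds % 1000 == 0:
        return f'<break time="{milliseconds // 1000}s" />'
    return f'<break time="{milliseconds}ms" />'


def join_with_pauses(sentences: list[str], milliseconds: int) -> str:
    """Join sentences, placing the same break tag between each pair."""
    return break_tag(milliseconds).join(s.strip() for s in sentences)


script = join_with_pauses(["Hey John!", "How are you doing today?"], 500)
print(script)
# Hey John!<break time="500ms" />How are you doing today?
```

Paste the resulting text into your card as usual, then adjust the pause length by ear.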
Correcting Pronunciation
Pronouncing company names, acronyms, business terms, or technical language can be challenging for AI. Getting pronunciation right is usually a matter of inserting hyphens or adjusting how you spell words.
Words
Try inserting hyphens to make the word sound how you want:
Example:
Content → con-tent
Project → pro-ject
Research → re-search
Tip: Break words at syllable boundaries where pronunciation problems occur.
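If the same words need respelling in many scripts, you could keep them in a small lookup table and apply it before pasting. This is only a sketch under that assumption: `RESPELLINGS` and `apply_respellings` are hypothetical names, and the entries are the examples from this section.

```python
import re

# Hypothetical respellings; replace with the words your scripts need.
RESPELLINGS = {
    "content": "con-tent",
    "project": "pro-ject",
    "research": "re-search",
}


def apply_respellings(script: str, respellings: dict[str, str]) -> str:
    """Replace whole words with their hyphenated respellings."""
    def swap(match: re.Match) -> str:
        word = match.group(0)
        fixed = respellings[word.lower()]
        # Keep a leading capital so sentence starts still look right.
        return fixed.capitalize() if word[0].isupper() else fixed

    # \b keeps "content" from matching inside "contents".
    pattern = r"\b(" + "|".join(map(re.escape, respellings)) + r")\b"
    return re.sub(pattern, swap, script, flags=re.IGNORECASE)


print(apply_respellings("Content for the research project.", RESPELLINGS))
# Con-tent for the re-search pro-ject.
```

Only respell words that actually sound wrong; words the voice already handles are best left untouched.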
Acronyms
If an acronym is mispronounced, spell it out phonetically, either letter by letter or as a single word:
AI → a-eye
AWS → a-"double you"-s
NASA → nassa
If you want each letter pronounced separately, add spaces between letters:
NYC → N Y C
FBI → F B I
HR → H R
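Spacing out letters by hand is easy to forget in a long script. As a rough sketch (the function names are hypothetical, and you supply the list of acronyms yourself), the letter-by-letter rule can be automated like this:

```python
import re


def spell_out(acronym: str) -> str:
    """Put a space between letters so each one is read separately."""
    return " ".join(acronym)


def spell_out_acronyms(script: str, acronyms: set[str]) -> str:
    """Space out only the acronyms listed, leaving other words alone."""
    def swap(match: re.Match) -> str:
        word = match.group(0)
        return spell_out(word) if word in acronyms else word

    # Match runs of two or more capital letters as whole words.
    return re.sub(r"\b[A-Z]{2,}\b", swap, script)


print(spell_out_acronyms("Ask HR about the NYC office of NASA.", {"HR", "NYC"}))
# Ask H R about the N Y C office of NASA.
```

Acronyms that should be read as a word, such as NASA in this example, are simply left out of the set.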
Numbers
Change how you spell numbers depending on how you want them to sound:
Ten eighty-nine → 10 89
Two five eight six → 2 5 8 6
One hundred and forty-eight → 148
For years:
2024 as "two thousand twenty-four" → 2,024
1999 as "nineteen ninety-nine" → leave as 1999
For phone numbers: Add spaces to get natural pronunciation:
(206) 555-3131 → 2 0 6 5 5 5 31 31
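The phone number pattern above can be sketched as a small Python helper. It assumes a 10-digit number and copies the grouping from the example (single digits, then two pairs); `speakable_phone` is a hypothetical name, and other groupings may sound better for your voice.

```python
import re


def speakable_phone(number: str) -> str:
    """Space out a 10-digit phone number for clearer reading."""
    digits = re.sub(r"\D", "", number)  # drop brackets, spaces, dashes
    if len(digits) != 10:
        raise ValueError("expected a 10-digit phone number")
    # First six digits one by one, last four as two pairs,
    # matching the "2 0 6 5 5 5 31 31" example.
    singles = " ".join(digits[:6])
    pairs = f"{digits[6:8]} {digits[8:]}"
    return f"{singles} {pairs}"


print(speakable_phone("(206) 555-3131"))
# 2 0 6 5 5 5 31 31
```

Pairs that begin with a zero, such as "05", may be read unexpectedly, so listen to the result and fall back to single digits if needed.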
Using Punctuation for Natural Delivery
Punctuation isn't just for grammar; it's a powerful tool for controlling AI voice delivery.
How Different Punctuation Affects Delivery
Commas (,)
Add shorter pauses than periods
Create natural breathing points
Help separate ideas within sentences
Periods (.)
Add longer breaks
Create downward inflection (statement tone)
Best for breaking long sentences into shorter ones
Quotation marks ("")
Add emphasis to that word or phrase
Makes the AI "pay attention" to specific content
Example: This is the "most important" step
Example: Same Sentence, Different Results
Without punctuation:
Here's a demonstration of how a sentence without any breaks or commas at all compares to a sentence that has them as you can see the video without can be difficult to follow because there are no breaks or pauses in it.
Result: Rushed, hard to follow
With strategic punctuation:
Here's a demonstration of how a sentence, without any breaks or commas at all, compares to a sentence that has them. As you can see, the video without can be difficult to follow, because there are no breaks or pauses in it.
Result: Natural, easy to understand
Pro Tips for Punctuation
Questions: End with question marks to get upward inflection
Emphasis: Use quotes around key phrases: "the most critical step"
Lists: Use commas between items for natural pacing
Long sentences: Break into two with a period, even if grammatically you wouldn't
This works identically for video and audio cards!
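To find sentences that may sound rushed before you generate a card, you could scan the script for long sentences with no commas. The sketch below is one possible check, not a rule from the voice engine: `flag_rushed_sentences` is a hypothetical name and the 20-word threshold is an arbitrary starting point.

```python
import re


def flag_rushed_sentences(script: str, max_words: int = 20) -> list[str]:
    """Return sentences that are long and contain no comma."""
    # Split after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    return [
        s for s in sentences
        if len(s.split()) > max_words and "," not in s
    ]


draft = (
    "Short one. This sentence keeps going without a single pause and it "
    "keeps adding more words so that the voice never gets a chance to "
    "breathe at all."
)
for sentence in flag_rushed_sentences(draft):
    print("Consider adding commas or a period:", sentence)
```

Flagged sentences are candidates for a comma, a period, or a break tag; the final judgment is still how the result sounds.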
Advanced Phonetic Spelling (Basic Introduction)
Sometimes hyphens aren't enough. For difficult words, you can use phonetic spelling to tell the AI exactly how to pronounce each syllable.
Basic Example
Word: Desert (the dry place, not dessert)
Phonetic spelling: de-zert
Word: Content (the stuff, not being satisfied)
Phonetic spelling: con-tent
When You Need Advanced Techniques
If you're still struggling with pronunciation after trying:
Hyphens for syllable breaks
Different punctuation combinations
SSML break tags
Then you may need advanced respelling techniques with detailed phonetic charts.
For comprehensive advanced techniques, see Advanced Pronunciation Techniques for AI Content, which covers:
Full phonetic alphabet charts
Respelling system with :: notation
Complex vowel and consonant combinations
Emphasis techniques
Upward inflection methods
Supported Languages and Voices
AI-generated videos and audio cards support different language options. Below are the complete lists for each card type.
How It Works
Type your script in any supported language
Language is automatically detected from your text
Audio cards: Choose from multiple voice options per language (2-4 voices per language)
Video cards: Voice is matched to your selected avatar
Video Card Languages
Video cards support the following languages:
Afrikaans - Natural • Arabic - Natural • Austrian (AT) - Natural • Austrian (CH) - Natural • Bulgarian - Natural • Burmese - Natural • Catalan - Natural • Chinese (CN) - Natural • Chinese (HK) - Natural • Chinese (TW) - Natural • Croatian - Natural • Czech - Natural • Danish - Natural • Dutch (BE) - Natural • Dutch (NL) - Natural • English (AU) - Natural • English (CA) - Natural • English (GB) - Natural • English (NZ) - Natural • English (US) - Professional/Natural • Estonian - Natural • Filipino - Natural • Finnish - Natural • French (BE) - Natural • French (CA) - Natural • French (CH) - Natural • French (FR) - Natural • Galician - Natural • German - Natural • Greek - Natural • Gujarati - Natural • Hebrew - Natural • Hungarian - Natural • Indonesian - Natural • Irish - Natural • Italian - Natural • Japanese - Original • Javanese - Natural • Kannada - Original • Khmer - Natural • Korean - Natural • Latvian - Natural • Lithuanian - Natural • Malay - Natural • Maltese - Natural • Marathi - Natural • Norwegian - Natural • Persian - Natural • Polish - Natural • Portuguese (BR) - Natural • Portuguese (PT) - Natural • Romanian - Natural • Russian - Natural • Slovak - Natural • Slovenian - Natural • Spanish (ES) - Natural • Spanish (MX) - Natural • Spanish (US) - Natural • Swedish - Natural • Thai - Natural • Ukrainian - Natural • Vietnamese - Default • Welsh - Natural • Zulu - Natural
Audio Card Languages
Audio cards support the following languages with multiple voice options per language:
Chinese (Mandarin) • Chinese (Cantonese) • English (US) • English (UK) • English (AU) • French • French (BE) • French (CA) • German • Italian • Japanese • Korean • Portuguese • Portuguese (BR) • Spanish • Spanish (MX)
Note: Each audio card language includes 2-4 different voice options. If one voice struggles with specific pronunciations, try selecting a different voice to see if it handles your content more naturally.
Choosing Between Video and Audio
Use video cards when:
You want visual engagement with an avatar
Content benefits from facial expressions and presence
You need the broader language support (60+ languages)
Use audio cards when:
Visual elements aren't necessary for comprehension
You want faster generation times
You prefer voice-only delivery
You want to test pronunciation before committing to video
Tips for Specific Card Types
Video Cards
Avatar selection matters: Some avatars may handle certain pronunciations better than others
Regeneration is quick: If pronunciation isn't right, adjust your script and regenerate (5-10 minutes)
Subtitles are permanent: Yellow auto-subtitles are burnt into the video and can't be edited afterward, so test pronunciation before finalizing
See also: Video Cards: Complete Guide
Audio Cards
Try different voices: Each language has multiple voice options, and some may pronounce your specific content more naturally
Voice selection affects tone: Different voices have different delivery styles
Character limits: Same 700-character limit as video cards (~1 minute of audio)
Faster testing: Audio cards generate faster than videos, making them good for testing pronunciation techniques
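Because both card types share the 700-character limit, it can help to check script length before pasting. This guide does not say whether break tags count toward the limit, so the sketch below reports both counts; `script_length` and `fits_limit` are hypothetical names.

```python
import re

CHAR_LIMIT = 700  # limit stated in this guide


def script_length(script: str, count_tags: bool = True) -> int:
    """Count characters, optionally ignoring break tags."""
    if count_tags:
        return len(script)
    # Assumption: tags look like <break time="..." /> as shown in this guide.
    return len(re.sub(r"<break\b[^>]*/>", "", script))


def fits_limit(script: str, count_tags: bool = True) -> bool:
    """Return True if the script is within the character limit."""
    return script_length(script, count_tags) <= CHAR_LIMIT


sample = 'Hi!<break time="2s" />Bye.'
print(script_length(sample), script_length(sample, count_tags=False))
# 26 7
```

Checking with `count_tags=True` is the safer choice, since it never underestimates the length.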
Quick Troubleshooting Guide
Problem | Solution |
Word is mispronounced | Try hyphens: "pro-ject" instead of "project" |
Sounds too fast | Add commas for shorter pauses, periods for longer ones |
Need specific pause | Use break tags: <break time="2s" /> |
Acronym sounds wrong | Add spaces for letters: "N Y C" or use phonetic: "nassa" |
Number sounds weird | Spell differently: "2 5 8 6" instead of "2586" |
No emphasis on key word | Use quotes: "This is the 'most important' step" |
Unnatural inflection | Try different punctuation combinations |
Still not working | See Advanced Pronunciation Techniques for detailed phonetic spelling |
Common Questions
Q: Do these techniques work the same for video and audio cards?
A: Yes! Both use the same AI voice technology, so all techniques (hyphens, breaks, punctuation, phonetic spelling) work identically for both card types.
Q: Can I use these techniques in multiple languages?
A: Absolutely. Hyphens, break tags, and punctuation work across all supported languages. The phonetic spelling techniques may need language-specific adjustments.
Q: How do I know which voice to use for audio cards?
A: Try generating with different voices to hear which one handles your specific content best. Each voice has slightly different characteristics and may pronounce certain words more naturally.
Q: Why does the same script sound different with different avatars?
A: Each video avatar is paired with a specific voice profile that may have subtle pronunciation differences. If one avatar struggles with your content, try selecting a different one.
Q: Can I preview before finalizing?
A: For video cards, you need to wait for full generation (up to 10 minutes). For audio cards, generation is typically faster, making them good for testing pronunciation before committing to a video version.
Q: What if none of these techniques work?
A: Check out our Advanced Pronunciation Techniques guide, which includes detailed phonetic spelling charts and advanced emphasis methods. If you're still stuck, contact support. We can help!
Q: Do punctuation changes affect the subtitles/captions?
A: Yes, punctuation appears in auto-generated subtitles and closed captions, so consider readability when adding commas, periods, or quotes for voice control.
Q: Is there a limit to how many break tags I can use?
A: No specific limit, but excessive breaks make content feel choppy. Use them strategically for natural pacing.
Related Resources
Video and audio card guides:
Video Cards: Add AI-Generated or Custom Videos to Your Courses - Complete video card guide
Advanced Pronunciation Techniques for AI Content - Detailed phonetic spelling and advanced methods
Avoid Your Video Getting Rejected - Content moderation policies
Need help? Contact 7taps support through the Help button in your course editor or email our support team.
This article is part of the 7taps Help Center. For more guides on creating effective microlearning, visit our complete documentation.
