Using Storyline’s AI Text-to-Speech Functionality

Over the last few months I’ve leaned heavily into the ElevenLabs AI text-to-speech voices in Articulate Storyline, and I have some tips to share.

First, make sure you are using the drop-down labeled AI Text-to-Speech. The regular Text-to-Speech option only gives you access to the older Amazon Polly voices, and they don’t compare.

Second, the new voices have a drawback: you can’t use SSML to modify pronunciation yet, so I phonetically spell things out to get better output. Here are a few examples (with a small scripting sketch after the list):

  1. Abbreviations and acronyms -> Spell these out the way you want them pronounced. For example, POS means point-of-sale, so I write it as “pose”.
  2. # * & -> Write special characters out as words (“hashtag”) because the AI butchers them.
  3. Capital letters -> The AI was pronouncing UP FRONT in my script like “you pee front,” ha! So I had to script it as Up Front.
  4. One-offs like “Read only” -> I write this as “reed only,” otherwise the AI pronounces it as “red only” every time.
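
I make these swaps by hand, but if you draft your narration in plain text first, a small find-and-replace script can apply them consistently before you paste the text into Storyline. Here’s a minimal sketch in Python; the mapping and function name are just illustrative, not anything built into Storyline:

```python
import re

# Illustrative mapping from written terms to phonetic spellings the AI voices
# read correctly. Add your own problem terms as you find them.
PHONETIC_FIXES = {
    "POS": "pose",             # point-of-sale acronym
    "#": "hashtag",            # special characters get butchered otherwise
    "UP FRONT": "Up Front",    # all caps can be read letter by letter
    "Read only": "reed only",  # avoids the "red only" pronunciation
}

def prep_narration(script: str) -> str:
    """Apply the phonetic swaps to a narration script before pasting it
    into Storyline's AI Text-to-Speech box."""
    for written, spoken in PHONETIC_FIXES.items():
        if any(ch.isalpha() for ch in written):
            # Whole-word match for words and phrases.
            pattern = r"\b" + re.escape(written) + r"\b"
        else:
            # Literal match for lone symbols like #.
            pattern = re.escape(written)
        script = re.sub(pattern, spoken, script)
    return script

print(prep_narration("Tap the # key on the POS screen, which is Read only."))
# -> "Tap the hashtag key on the pose screen, which is reed only."
```

Keeping the fixes in one place also means a later revision gets the same pronunciations without me re-checking every line.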

My basic workflow: generate the audio in Storyline, export the .wav file to Camtasia, edit/splice the audio into my video, then export an .mp4 back to Storyline. This lets me skip the step where I used to record and edit audio in Audacity.

These new and improved text-to-speech voices save time not only in production, but also in revisions and future updates.

Have you found any funny quirks with pronunciation as well? I hope my tips help!

View original post on LinkedIn.

[Image: A woman in a studio smiles in front of a microphone on a desk.]
