HUMBLECAT.ORG

Blind and Visually Impaired Community

Full History - 2021 - 12 - 03 - ID#r84xa7
23
Text-to-Speech Narration is Being Forced on Audio Description Users (self.Blind)
submitted by WadjetAD
A debate continues among audio description users: Should audio description narrators perform in a neutral style which mirrors the objective quality of description or opt for a more performance-oriented cadence which reacts to each scene’s tone? There is a case to be made for either style but, despite this disagreement, AD users seem to agree on at least one thing: Text-to-Speech (TTS) narration is terrible.

Users’ complaints about audio description are usually peeves, issues that could use some massaging to improve the experience by a small degree. However, grievances concerning a TTS narrator nearly always describe a ruined experience and an inability to suffer through this type of narration.

If it seems obvious that a grating computer voice is no substitute for mellifluous human tones, that’s because it is. The thousands of complaints and internet comments on the subject merely confirm what is all but a fact.

Given the obvious drop in quality from a human voice actor to a TTS narrator, we must conclude that the latter’s use is willful ignorance on the part of providers. It’s especially upsetting that the offending streaming services use a mixed bag of TTS and human description. This tactic intentionally makes it harder for consumers to ‘vote with their dollars’ because in on-demand marketplaces they have no way of knowing if a title has TTS description before purchasing it.

This issue widens a familiar chink in the armor of the otherwise fabulous 21st Century Communications and Video Accessibility Act: It specifies that a certain percentage of a company’s content must be described but does not ensure the quality of the audio description. This gives companies that provide description only to stay out of legal trouble free rein to produce unlistenable audio description narration tracks.

If legal trouble is the only thing that will motivate some folks, then we’ve got to implement legislative protections against this type of low-end content’s production. Therefore, I urge the reader to reach out to the American Council of the Blind or similar organizations that consolidate the voice of the VI community. Make these representatives aware of the egregiousness of this issue and how common your grievance is.

Some visually impaired users think that a Devil’s bargain can be struck. They believe that while TTS is lower quality, its automated nature would proliferate audio description more quickly. This misconception stems from some users’ belief that text-to-speech audio description is also *written* by a computer. This is not so. A program advanced enough to decide what images best serve a visual story and craft a supplemental narrative has not yet been built. Given that scriptwriting is the longest, most costly part of audio description production, further implementing TTS would have only a marginal impact on audio description’s availability.

Wadjet will never produce a description track with a TTS narrator. We are committed to hiring wonderful performers who not only voice our scripts with verve and style, but who also reflect the cultures and life experiences represented in the programs and visual narratives we proudly make accessible.

I am excited to hear everyone's thoughts! If you enjoyed this post, please consider visiting it on my blog and leaving a like or a comment.

fastfinge 10 points 1y ago
I *want* TTS audio description. I want an audio description script sent to my phone, and synced with what is on the TV using the microphone. That way:

* I could enjoy audio description in my AirPods without bothering sighted friends
* Audio description could contain more detail because I could speed up the TTS
* Level of detail could be easily configurable, without making more work for a voice actor
* Audio description would be available everywhere, even if the DVD player or online streaming stick didn't support it, because my phone would be syncing and narrating it
* A show wouldn't be ruined because the audio description was mixed at the wrong volume level
* I could enjoy Atmos and surround sound content; right now, we're lucky if the audio description track is in stereo, never mind surround
* Movies in the theatre would be much easier, because I wouldn't have to request the special headphones and go to the exact one showing that supports audio description

The existing solution where a voice actor puts audio description on a second audio track is entirely unsatisfactory, and needs to go.
WadjetAD [OP] 7 points 1y ago
Thank you for your perspective! There are actually applications which exist currently that do what you're describing, such as SpectrumAccess.

But implementing text-to-speech narration would not accomplish any of the bullet points in your list, except for a configurable level of detail. However, writing many different scripts to correspond to the different speeds at which TTS could talk makes this solution logistically impossible.

The rest of your points come down to licensing deals and pressing companies to see the value in creating higher quality content and better ways to experience it.
fastfinge 4 points 1y ago
> But implementing text-to-speech narration would not accomplish any of the bullet points in your list,

Even without a configurable detail level, it would be nice to speed up the audio description, to better enjoy musicals etc. The less time I'm listening to description, the more time I can enjoy the music or other content, without missing what's happening. The best way to do that is TTS. And adding configurable detail doesn't mean writing multiple scripts. It just means writing a single script, and tagging each phrase in the script with a detail level. Then the device only speaks the parts of the script at or under your configured level of detail. This is super easy to do with TTS, and absolutely impossible to do with a voice actor.
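
A minimal sketch of the tagging idea described above, assuming a hypothetical script format in which each phrase carries a start time and an integer detail level (1 for essential, higher numbers for optional detail). The `Cue` class, the `select_cues` function, and the sample lines are illustrative only and do not come from any existing audio description tool or standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    """One phrase of a description script (hypothetical format)."""
    start_seconds: float  # when the phrase should be spoken
    text: str             # what the TTS voice or Braille display outputs
    detail_level: int     # 1 = essential; higher = more optional detail


def select_cues(script, max_level):
    """Return the cues at or under max_level, in playback order."""
    ordered = sorted(script, key=lambda cue: cue.start_seconds)
    return [cue for cue in ordered if cue.detail_level <= max_level]


# Invented sample script, for illustration only.
SCRIPT = [
    Cue(12.0, "Anna enters the ballroom.", 1),
    Cue(14.5, "She wears a green velvet gown.", 2),
    Cue(17.0, "Candles flicker in gilt sconces.", 3),
    Cue(21.0, "She spots Marc across the room.", 1),
]

if __name__ == "__main__":
    # A listener who configured detail level 2 hears three of the four phrases.
    for cue in select_cues(SCRIPT, max_level=2):
        print(f"{cue.start_seconds:6.1f}s  {cue.text}")
```

Only one script is written; the same tagged file serves every listener, and the device decides at playback time which phrases to speak.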

Also, sending audio description to the phone as text, and letting the phone read it out, has other advantages I didn't list. If I wanted, I could read the AD on my Braille display, instead of listening to it. That would mean I don't have to wear headphones at all anymore, and can enjoy the audio parts of the content like everyone else, without a headset in the way. Also, it would mean that instead of 300 MB of data, the audio description might take 1 MB of data, if that.

Lastly, when it comes to pushing for higher quality levels of content, there's a hard limit here. No streaming provider is going to store and provide multiple, lossless Atmos tracks of a show or movie. It also won't fit on a Blu-ray. And over-the-air TV doesn't have the bandwidth for it. Captions just require a text stream, and that's one of the reasons they're so widely available. Audio description requires an entirely secondary audio track, and nobody wants to spend the storage or bandwidth to provide that, and in many cases it just isn't available in the first place (like in the case of over-the-air TV or a Blu-ray). If we could get audio descriptions to be stored and streamed as text, I think we'd find they'd be more available to everyone.
SightlessBastard 5 points 1y ago
I would just like to address a few points here. I have heard pretty often now that the AD talks during the songs in musicals. These movies aren't my preferred genre, but I have watched a few Disney movies, and at least there, there was no narration during the songs. Could you maybe provide me with an example?

But apart from that, this problem could be solved easily by providing two different audio tracks with narration. At least for musicals, that would work: one with more detailed narration during songs, and one with no narration during the music.

> If I wanted, I could read the AD on my Braille display, instead of listening to it...

Well, I guess you could theoretically do that. But would you really sit with a Braille display in a theatre?

> ...instead of 300 MB of data...

We have an app here in my country that basically does the same thing that Spectrum Access does. You download the track with the narration and sync it via your phone with the movie. These files might have a size of 50 or maybe 60 megs, but never ever 300.

Now, regarding streaming services and their storage capacities. If you take an episode of, for example, a Netflix show with a duration of about an hour, you can assume that the actual episode on the Netflix servers has a size of about 1 terabyte. Netflix has their AD tracks mostly in 5.1. So, yes, I would guess your estimate of 300 megs would be correct here. But I highly doubt that this would matter to them, since they calculate in terabytes. And I won't even start with Apple TV+. I don't even wanna know how big their raw files must be, since they provide their AD not only in 10 different languages but also, as far as I know, in Dolby Atmos.
fastfinge 3 points 1y ago
> there was no narration during the songs. Could you maybe provide me with an example here?

The audio description on Les Misérables is really bad for this. I haven't re-watched it in a few years, but I especially remember I Dreamed a Dream and some of the choruses of One Day More being a problem.

However, I've also had problems with Disney where I wanted narration on some of the songs, because I wanted to know more about the dance choreography for a class assignment. This type of thing is almost impossible to find descriptions of.

> But would you really sit with a Braille display in a theatre?

No, but I would on my couch. No audio description volume changes, no downmixing, and no headphones. For the first time I could really enjoy my Sonos surround system for TV, watch a show with my friends, and not miss out on what's happening.

> Now, regarding streaming services and their storage capacities.

This is only a tiny part of the problem. I want audio description on Blu-ray and over-the-air TV. In all of these cases, bandwidth is *extremely* limited. There's only so much space on the Blu-ray, not enough for Atmos audio description tracks. And similarly, broadcast TV only has so much bandwidth to work with in the digital signal. Lastly, why do you think that free services like YouTube don't offer audio description? They're not a premium service, so they probably can't afford to provide the storage that Apple or Netflix do. But I want audio description there, too. If it was text sent to my phone, I could have it in all of these places.
WadjetAD [OP] 3 points 1y ago
Disney has a separate style guide, from which they rarely deviate, that dictates no narration during songs. So this "issue" is only found on other services. I am also interested in a solution where the user has different options, since it really comes down to personal preference. Hearing the song unobstructed might be more pleasurable but nearly always causes the audience to miss visual gags, character moments and plot-pertinent visuals.
SightlessKombat 1 points 1y ago
By way of an example, Hamilton's film version, if memory serves. I may come back to this topic later, but wanted to throw that your way.
WadjetAD [OP] 3 points 1y ago
Thank you for these insights, I clearly have not thought about the advantages that text files carry as deeply as you! Providing a text file with the purchase or streaming of a described program would cost next to nothing as well. I wonder how many people are interested in this option; it's creative and I've never heard it before.
purple_goat_8138 4 points 1y ago
I'd be interested in it. Personally, I like TTS description. I want to be focused on the movie and what's happening in it. I don't want to hear a describer gasp or deliver emotive content. I just want to hear what's happening. TTS is great for this. Just the facts, ma'am.
SightlessKombat 1 points 1y ago
I've been promoting transcripts for videogame sequences for a number of years as part of the #TranscribingGames project, for instance. Myself, I don't see why the same couldn't be applied to movies while still keeping standard human-narrated AD intact.
Nighthawk321 9 points 1y ago
I'm on the audio description advisory board for Descriptive Video Works. It's good to read posts like these so we know what points to address during our meetings. Thank you!
bondolo 6 points 1y ago
This seems like a commercial perspective and not a purely aesthetic one.
WorldlyLingonberry40 3 points 1y ago
The PornHub selection of videos is disappointing. They hired someone who does not watch porn to describe the content. There is a really good interview with the person somewhere on YouTube.
Superfreq2 3 points 1y ago
This is similar to the double-edged sword of auto-generated captions, and I know that auto-generated TTS descriptions are coming soon too.

On one side, much more stuff can be described this way, even if it's potentially subpar and quality will probably vary significantly between services. Opening up some access for a lot of things is still extremely helpful short term, and thankfully we're joining the game a lot farther in when it comes to AI quality than the Deaf/HoH community had to, plus TTS voices are better than ever.

On the other side however, it encourages companies who might otherwise do better audio description with a human to go for the cheaper option, because they don't understand or value the difference.
The Deaf/HoH community is constantly fighting the "but I thought the auto captions were fine and why do you want me to spend all this extra money on proper ones" battle, so I'm not looking forward to that part of it.


We can already see this to some degree with "Aira" access locations. Getting help on demand is obviously great, and some see it as more independence because at least you're doing more of it on your own even if you still have help.
Yet when the staff just say "can't you use Aira" even if you don't have a phone/enough data/a good battery level, or you just don't prefer it, that becomes a problem, and worries about overreliance on tech and less of a focus on staff training become a concern.

Still, the possibilities this could open up are exciting and this shift was all but inevitable. I just hope we aren't shooting ourselves in the foot in the long term.
[deleted] 3 points 1y ago
[deleted]
SqornshellousZ 4 points 1y ago
I think you missed the point. There is no incentive to use a better engine especially if it's more expensive.
[deleted] 2 points 1y ago
[deleted]
SqornshellousZ 1 points 1y ago
Have you ever considered a career in politics? /s
BlindWizard 3 points 1y ago
I have been pretty impressed lately with the audio description through Netflix, especially in one scene of The Mitchells vs. the Machines where they are fighting the Furbies 😂 I agree 100% that it needs better quality, and honestly, if they could even get actors from the particular movie or show to do the audio description, it would be more immersive and take away from the experience less. Audio description is distracting in a lot of ways still, but I do believe they are getting much better. But the by-rote, plain, deadpan description that you see in the majority of places definitely needs to go, in my opinion. Like you said, though, there are arguments for both.
AnElusiveDreamer 2 points 1y ago
Who does this? I’ve never heard audio description done with TTS, but I don’t doubt that it’s out there. Can I have some examples?
WadjetAD [OP] 4 points 1y ago
From what I've heard Amazon is the main offender.
SightlessBastard 2 points 1y ago
As far as I know, it’s mostly Amazon. You can look at the A Nightmare on Elm Street movies. I normally wouldn’t have bothered with them, since I hate TTS descriptions. I can’t stand them. They are just terrible. But I was always a big fan of these movies. And now I wish Netflix would acquire this license, so we could get a proper audio description.
liamjh27 1 points 1y ago
I’ve just heard it for the first time on The Protégé on Amazon Prime. It was terrible. Really would not like this going forward.
AnElusiveDreamer 1 points 1y ago
I found it on Amazon as well, and it is definitely not good!
©2023 HumbleCat Inc   •   HumbleCat is a 501(c)3 nonprofit based in Michigan, USA.