Blind and Visually Impaired Community

Full History - 2016 - 03 - 13 - ID#4a9924
3
IBM Watson has a selection of text to speech voices developers can use. This link leads to the demo (only works in Firefox). Check my comments for audio recordings people not using Firefox can hear, and my thoughts. (text-to-speech-demo.mybluemix.net)
submitted by fastfinge
fastfinge [OP] 1 points 7y ago
This is a little off-topic for this sub, but blind people do use text to speech every day, so I thought a lot of us might be as interested in this as I am.

Here are some examples of how Watson sounds. I converted the files IBM creates into wav (they're opus by default) so everyone could play them; I don't think OS X or iOS have opus support at all yet.
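
The post doesn't say which tool did the conversion. As a rough sketch only, one way to turn an Opus file into WAV is to call ffmpeg from Python. This assumes ffmpeg is installed and on the PATH; the file name `allison.opus` and both function names are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: str, dst: str) -> list[str]:
    """Return the ffmpeg argument list that transcodes src into dst."""
    # ffmpeg infers the output format (WAV here) from the destination extension.
    return ["ffmpeg", "-i", src, dst]


def convert_to_wav(src: str) -> str:
    """Convert an audio file to a WAV file next to it and return the new path."""
    dst = str(Path(src).with_suffix(".wav"))
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
    return dst
```

Calling `convert_to_wav("allison.opus")` would write `allison.wav` alongside the original, provided ffmpeg can decode the input.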

Allison: https://www.sendspace.com/pro/dl/qo03o0

Kate: https://www.sendspace.com/pro/dl/lek3bu

Lisa: https://www.sendspace.com/pro/dl/zo0mgq

The text I used comes from $1, and I believe it's a short enough passage that it qualifies as fair use. It reads as follows:
> A number of Sicilian fisherman are hauling in their empty nets when a huge rocketship crashes into the ocean. Two of the alarmed men overcome their fear and row out to the half-submerged wreck. They rescue two people, one of them Col. Calder, before the rocket begins to sink. Judging from the angle, it appeared to be embedded in the sea floor. Why did it suddenly start to sink? Anyway, the ship had traveled to Venus and back. The mission was top secret and the public is not informed of the achievement until later.

Based on all the hype about Watson and how wonderful it is, I was really expecting a lot better! Maybe it's me, but compared to state-of-the-art speech engines like voiceware, or even Google's TTS offering, I'm really not impressed. Speaking of Google, I also played around with the $1, and I wasn't impressed with that, either. Judging by French-to-English translation, Google is much, much better there, too.
impablomations 2 points 7y ago
The 'Kate' voice is especially bad.

The SSML version of the default text on the same page, which is supposedly more advanced because it can simulate emotion in speech, is actually pretty bad too.

Allison - SSML version: https://www.sendspace.com/pro/dl/sl6pn0

It's a short demo paragraph explaining that an online order can't be fulfilled.

It sounds extremely sarcastic and condescending, as if the speaker is 'talking down' to someone they feel is beneath them.

It definitely suffers from the 'uncanny valley' effect: in its attempt to sound lifelike, the overexpressive voice ends up sounding even less so.

Definitely pretty disappointing, considering how much IBM brags about how powerful and adaptable Watson is.
fastfinge [OP] 1 points 7y ago
Agreed. Here's an ad-free download link by the way: https://www.sendspace.com/pro/dl/sl6pn0

impablomations 1 points 7y ago
Nice, thanks.

I'll edit my comment to use your ad-free link.
fastfinge [OP] 1 points 7y ago
No problem. I use the heck out of my sendspace pro account. Surprising, as I run like 3 servers myself that I could upload to, but then I'd have to use an ftp client and fix file permissions and pay for extra storage and so on. Sendspace just makes quick one-time uploads for stuff like this so easy!