FB Speech Synthesis anyone ?

TESLACOIL
Posts: 1769
Joined: Jun 20, 2010 16:04
Location: UK

FB Speech Synthesis anyone ?

Post by TESLACOIL »

The Problem !

Messing about with SAPI voices is limited and uninspiring, and the results are pretty ropey. The lack of expressive control is a key flaw.
I'm interested in creating a 'highly fit for purpose artificial voice', one where you have direct access to 101 emotional presets.


Code: Select all

SPK# "Say this" ; 52 ' like you really mean it
SPK# "Sapi sucks, sooooh baaahd ! " ; 77#12 ' sarcastic and contemptuous
SPK# "This problem is not going away anytime soon, us coders HAVE to solve this sooner or later" ; 61 ' grumpy
SPK# " Somewhere over the rainbow " ; 101 ' sing it like an angel

Something like that anyway, you get my drift.
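
One way to picture the preset idea is a plain lookup table from preset number to a handful of prosody settings. The sketch below is in Python purely for illustration. Everything in it is an assumption: the `Prosody` fields, the numeric values, and the reading of `77#12` as "blend preset 77 with preset 12" are invented to match the comments in the examples above, not taken from any existing engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prosody:
    pitch_scale: float  # multiplier on the voice's base pitch
    rate_scale: float   # multiplier on speaking rate
    energy: float       # loudness, 0.0 to 1.0


# Hypothetical presets; the numbers come from the SPK# examples,
# the parameter values are made up for illustration only.
PRESETS = {
    52: Prosody(1.00, 0.95, 0.8),   # like you really mean it
    61: Prosody(0.90, 0.90, 0.6),   # grumpy
    77: Prosody(1.10, 0.85, 0.7),   # sarcastic (assumed meaning)
    12: Prosody(0.95, 1.00, 0.5),   # contemptuous (assumed meaning)
    101: Prosody(1.20, 0.80, 0.7),  # sing it like an angel
}


def parse_preset(spec):
    """Split a spec such as '77#12' into a tuple of preset numbers."""
    return tuple(int(part) for part in spec.split("#"))


def blend(*preset_ids):
    """Average the settings of several presets into one Prosody."""
    chosen = [PRESETS[i] for i in preset_ids]
    n = len(chosen)
    return Prosody(
        pitch_scale=sum(p.pitch_scale for p in chosen) / n,
        rate_scale=sum(p.rate_scale for p in chosen) / n,
        energy=sum(p.energy for p in chosen) / n,
    )


mixed = blend(*parse_preset("77#12"))
```

A real engine would need far more than three knobs per emotion, but a table like this is what would make the presets "direct access" from user code.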





The solution

There are two main approaches.

1) Build a truly extensive voice bank, e.g. 500 gigs of WAVs, using a real human voice as the source. But it's not exactly portable or downloadable.

2) Think à la Moog synthesizer and attack it digitally, using well-known techniques from music synthesizers.
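
As a rough idea of what approach 2 could look like, here is a minimal source-filter sketch in Python: a buzzy pulse train (standing in for the vocal cords) is pushed through a cascade of two-pole resonators (standing in for formants). This is a textbook-style formant synthesizer under stated assumptions, not anyone's finished design. The function names are made up, and the formant frequencies and bandwidths are only approximate values commonly quoted for an adult male "ah".

```python
import math

SAMPLE_RATE = 16000


def glottal_source(pitch_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Crude voice source: one unit impulse per pitch period."""
    n = round(duration_s * sample_rate)
    period = max(1, round(sample_rate / pitch_hz))
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def resonator(signal, freq_hz, bandwidth_hz, sample_rate=SAMPLE_RATE):
    """Two-pole resonator modelling a single formant peak."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    a2 = -r * r
    gain = 1.0 - a1 - a2  # unity gain at 0 Hz
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = gain * x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def vowel(pitch_hz, formants, duration_s=0.3):
    """Run the source through each formant in turn, then normalise."""
    signal = glottal_source(pitch_hz, duration_s)
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, freq_hz, bandwidth_hz)
    peak = max(abs(s) for s in signal) or 1.0
    return [s / peak for s in signal]


# Approximate (frequency, bandwidth) pairs in Hz for "ah"; illustrative only.
AH = [(730, 80), (1090, 90), (2440, 120)]
samples = vowel(120, AH)
```

The samples are floats in the range -1 to 1 and would still need writing out to a WAV or a sound device. Moving the pitch and formant values over time is where the expressive control would have to come from.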



Now I need to solve this problem and solve it well, much better than the current cruddy crop of voice-ware. I'm sat here with $10,000+ worth of analog and digital recording studio and a cast-iron will, so I can go either way.

Right now I'm leaning towards the WAV voice bank for coding simplicity, BUT ULTIMATELY we all know this needs to be solved digitally.

So if anyone in the FB community wants to have a SERIOUS pop at speech synthesis, as in building some highly usable end-user software that's dead easy to integrate with your code, then let me know. My code skills are a tad cruddy, but I have a great deal of experience and equipment for sound recording and processing, plus a huge amount of determination to solve this problem one way or the other.

This isn't a trivial problem, NOR is it that hard either. It is attackable by a determined team of 2, 3 or 4 people.


If you're up for a challenge and into producing some truly worthwhile code then let me know.
TESLACOIL
Posts: 1769
Joined: Jun 20, 2010 16:04
Location: UK

Re: FB Speech Synthesis anyone ?

Post by TESLACOIL »

Voice tech is now what I'd call borderline useful. The voice in the video is far from perfect but it's just about bearable to listen to. As is typical with current approaches, as soon as you ask the voice to say something outside its main voice bank the quality plummets.

50 Google Now Voice Commands
http://www.youtube.com/watch?v=2vT0AWDq3DE

Some new ideas are hereby needed.
lassar
Posts: 306
Joined: Jan 17, 2006 1:35

Re: FB Speech Synthesis anyone ?

Post by lassar »

I agree that these modern-day voice synthesizers are barking up the wrong tree.

For a truly good voice, you need to simulate the vocal tract.

Here are a few Google search terms.

simulate "vocal cords" "Speech Synthesis"
simulate "Vocal Tract" "Speech Synthesizer" algorithm

You will find a whole bunch of PDFs on this subject.
TESLACOIL
Posts: 1769
Joined: Jun 20, 2010 16:04
Location: UK

Re: FB Speech Synthesis anyone ?

Post by TESLACOIL »

Agreed!

And you also have to emulate in software the influence that human emotions have on that vocal tract.

When it comes to something as expressive as the human voice, cutting corners also means cutting capability.

I'm kind of surprised that no one has bothered to 'nail it'. When it comes to the human voice it's like we are still in the DOS era: no GUI, no mouse, no internet... A small team can nail it, it's not that hard. Just need a couple of people with vision and it's a wrap.
MichaelW
Posts: 3500
Joined: May 16, 2006 22:34
Location: USA

Re: FB Speech Synthesis anyone ?

Post by MichaelW »

lassar wrote: For a truly good voice, you need to simulate the Vocal Tract.
And, probably more importantly, the brain controlling that vocal tract.
TESLACOIL
Posts: 1769
Joined: Jun 20, 2010 16:04
Location: UK

Re: FB Speech Synthesis anyone ?

Post by TESLACOIL »

Just as a reminder, 101 ways in which a 'proper' artificial voice would be useful:

People with vocal disabilities

Blind people

Partially deaf people who require clear speaking

Voice-overs and voice acting of any kind.

Characters in computer games.

E-readers, audio books and bedtime stories

Speech therapy

Learning a new language.

Translating between languages.

Giving a computer the ability to communicate with other devices that use speech recognition.

Virtual teaching assistant

Intelligent personal assistant

e.g. SILVIA, Samsung's S Voice, LG's Voice Mate, Google Now, Microsoft Cortana, HTC's Hidi and Apple's Siri
http://en.wikipedia.org/wiki/Intelligen ... _assistant

Sophisticated telephone answering systems that don't drive you or your customers nuts !!!


Musicians & multimedia artists. It takes skill to direct others; voice actors, silicon or carbon, benefit from skilled direction. Thus I foresee a new musical device, somewhat like a Moog synthesizer, whose primary musical function is a human voice singing, talking, rapping etc. Composition at first, but then as the tech gets better and the artists' skill increases, live performance also becomes viable. Think Vocaloid++ but under real-time control.

And of course... robots and androids, which is my personal angst thang <AKA motivation to strongly solve voice issues>



I have no doubt that in the future voices of the past will speak again: Freddie Mercury prowling the stage, Abraham Lincoln emulation droids giving history lessons, characters from history, pop culture and the multitude of subcultures past and present joining in the chorus.
Merick
Posts: 1038
Joined: May 28, 2007 1:52

Re: FB Speech Synthesis anyone ?

Post by Merick »

You shouldn't need gigabytes of WAV files. If you take a look at Utau, it can do impressive singing voices using less than 100 MB of voice data.
D.J.Peters
Posts: 8586
Joined: May 28, 2005 3:28

Re: FB Speech Synthesis anyone ?

Post by D.J.Peters »

Only a hint:
FBSound includes a realtime pitch shifter (for free).
You can give spoken words a melody or make the words sing.
Of course you can manipulate speed and pitch independently.
In all other free open source sound libs you can only set the speed, not the pitch.
The results from the other sound libs sound more like Mickey Mouse than a real person singing.
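
To show why "speed only" ends up sounding like Mickey Mouse, here is a small Python sketch of naive resampling. It is not FBSound's API or algorithm, just an illustration with made-up function names: playing a 200 Hz tone back 1.5 times faster shortens it and raises its pitch by the same factor.

```python
import math

SAMPLE_RATE = 8000


def sine(freq_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Generate a test tone as a list of floats."""
    n = round(duration_s * sample_rate)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def resample_linear(samples, speed):
    """Naive speed change: step through the input `speed` samples at a time."""
    out = []
    k = 0
    pos = 0.0
    while pos < len(samples) - 1:
        i = int(pos)
        frac = pos - i
        # Linear interpolation between the two neighbouring samples.
        out.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)
        k += 1
        pos = k * speed
    return out


def zero_crossing_freq(samples, sample_rate=SAMPLE_RATE):
    """Rough pitch estimate: count upward zero crossings per second."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * sample_rate / len(samples)


tone = sine(200, 1.0)
fast = resample_linear(tone, 1.5)

print(len(tone), round(zero_crossing_freq(tone)))
print(len(fast), round(zero_crossing_freq(fast)))
```

The second line of output shows a clip about two thirds as long with a pitch near 300 Hz. Changing pitch while keeping the duration (or the other way round) needs a time-stretching step such as overlap-add or a phase vocoder on top of this, which is not shown here.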

Yes I like FBSound ;-)

Joshy
anonymous1337
Posts: 5494
Joined: Sep 12, 2005 20:06
Location: California

Re: FB Speech Synthesis anyone ?

Post by anonymous1337 »

Google's voice recognition is just outright amazing. I often prefer to text or search on my phone by talking. It is honestly easy enough to replace typing on my laptop keyboard much of the time.

It works in part via neural networks and analysis of spectrograms.
TESLACOIL
Posts: 1769
Joined: Jun 20, 2010 16:04
Location: UK

Re: FB Speech Synthesis anyone ?

Post by TESLACOIL »

Basic speech recognition has been around for a while but it is only useful in a narrow range of instances.

Speech synthesis is the output side & what the OP is about. It too is only useful in a narrow range of circumstances.

What's needed is some 'easy plug n play software' that greatly broadens the range of circumstances.
rolliebollocks
Posts: 2655
Joined: Aug 28, 2008 10:54
Location: new york

Re: FB Speech Synthesis anyone ?

Post by rolliebollocks »

I'm thinking of having SAPI do all my speaking for me, so as not to offend anyone with my contemptuous pitch.
TESLACOIL
Posts: 1769
Joined: Jun 20, 2010 16:04
Location: UK

Re: FB Speech Synthesis anyone ?

Post by TESLACOIL »

Writing good end-user software requires commitment & vision, Rollie. I'm not doing this for kicks, I'm doing it to solve 'real problems'. Some projects are just for fun... this isn't one of them! Just making that abundantly clear in advance. I doubt anyone on here wants to make a big commitment but I thought I'd ask anyway. In the main I code to solve problems rather than for fun, so my outlook is different from typical hobbyists.

Avanna, an English Vocaloid: not bad actually, they have their own merit. Plenty of freeware/trialware to play around with on the web.
http://www.youtube.com/watch?v=y7__D7dKSE0