Modifying SAPI Text-to-Speech using XML Mark-up

Post your FreeBASIC source, examples, tips and tricks here. Please don’t post code without including an explanation.
Post Reply
delyall
Posts: 30
Joined: Feb 13, 2010 0:19

Modifying SAPI Text-to-Speech using XML Mark-up

Post by delyall »

In SAPI you can mark-up the text to be spoken with XML tags which can be used to change such things as
- the volume of the speech
- how fast the text is spoken
- the voice pitch used etc.

This program defines a bunch of simple functions to make use of this capability.

WHAT VOICES ARE AVAILABLE TO SAPI ON MY COMPUTER?
You can find the voices that are available on your computer from the Control Panel. These days the Control Panel is hidden in Windows.
To check do this:
- Press Win + R keys to open the Run command dialog box.
- Type "Control" and then click OK.
- Control Panel > Ease of Access > Speech Recognition > Text to Speech
- The Voice Selection shows what voices are available. You can use this to change your (default) voice.

On my computer I have 3 voices
- Microsoft David (US)
- Microsoft Hazel (UK)
- Microsoft Zira (US)

UPDATE
I have deleted the code that was posted here. Please see my updated code later in this thread.
Last edited by delyall on May 12, 2024 6:14, edited 1 time in total.
SARG
Posts: 1785
Joined: May 27, 2005 7:15
Location: FRANCE

Re: Modifying SAPI Text-to-Speech using XML Mark-up

Post by SARG »

Thanks, working fine.

To use with default dialect (fb)
comment line 3 -> DEFINT A-Z
and change key$ by key in SUB CheckForProgramClose() (need to dim key as string)

On my french PC I get only 2 voices : Hortense (french girl) and Zira (US). Obvioulsly the texts should be adapted :D
Last edited by SARG on May 12, 2024 7:50, edited 1 time in total.
delyall
Posts: 30
Joined: Feb 13, 2010 0:19

Re: Modifying SAPI Text-to-Speech using XML Mark-up

Post by delyall »

I believe the French Language id for French (Standard) = 40c
Unfortunately I can't test this. And yeah I'm using FBLite not FB so thanks for your suggested changes SARG.
Last edited by delyall on May 12, 2024 2:07, edited 2 times in total.
delyall
Posts: 30
Joined: Feb 13, 2010 0:19

Re: Modifying SAPI Text-to-Speech using XML Mark-up

Post by delyall »

UPDATED CODE
FRENCH LANGUAGE VOICE ADDED
This code should also help anyone wanting to use a language other than English.

I was able to install the voice - Microsoft Hortense, which is a French voice.
Just a reminder that SAPI only has access to voices that are displayed in -
Control Panel > Ease of Access > Speech Recognition > Text to Speech
The program uses the default voice set up there but the other voices are available programmatically.

I have added 2 new Functions/Subs -
TSS_FR Says the text using the French voice. The French voice has to be installed.
SpeakFR This is a replacement for the Speak subroutine. It defaults to saying the text passed to it using the French voice. The French voice has to be installed.

If you have a French voice on your computer you can test this out by un-commenting lines - 126, 127, 150.
Hopefully this should make usage clear.

Code: Select all

'$lang: "fb"
'
' See https://learn.microsoft.com/en-us/previous-versions/windows/desktop/ms717077(v=vs.85)
'     https://www.freebasic.net/forum/viewtopic.php?t=25278
'     https://www.freebasic.net/forum/viewtopic.php?t=28445
'     https://www.homeandlearn.co.uk/extras/speech/speech-project-intro.html
'     https://learn.microsoft.com/en-us/windows-hardware/manufacture/desktop/available-language-packs-for-windows?view=windows-11
'
' You can check what voices/accents are available on your computer from the Control Panel:
' - Press Win + R to open the Run command dialog box. 
' - Type "Control" and then click OK.
' - Control Panel > Ease of Access > Speech Recognition > Text to Speech
'
' TDeL May 2024
'
'********************************************************************************"
'*                           SUBROUTINES / FUNCTIONS                            *"
'********************************************************************************"
'
DECLARE SUB Speak(BYVAL textIn AS STRING)
DECLARE SUB SpeakFR(BYVAL textIn AS STRING) ' Speak subroutine that uses the French voice if on is present. This is a replacement for Speak if the whole dialog is to be in French.
'
DECLARE FUNCTION TSS_AU (BYVAL textIn AS STRING) AS STRING
DECLARE FUNCTION TSS_FR (BYVAL textIn AS STRING) AS STRING
DECLARE FUNCTION TSS_UK (BYVAL textIn AS STRING) AS STRING
DECLARE FUNCTION TSS_US (BYVAL textIn AS STRING) AS STRING
DECLARE FUNCTION TSSEmph (BYVAL textIn AS STRING) AS STRING
DECLARE FUNCTION TSSFemale (BYVAL textIn AS STRING) AS STRING
DECLARE FUNCTION TSSLang (BYVAL textIn AS STRING, BYVAL countrycode AS STRING) AS STRING
DECLARE FUNCTION TSSMale (BYVAL textIn AS STRING) AS STRING
DECLARE FUNCTION TSSPause (BYVAL milliseconds AS INTEGER) AS STRING
DECLARE FUNCTION TSSPitch (BYVAL textIn AS STRING, BYVAL level AS INTEGER) AS STRING
DECLARE FUNCTION TSSRate (BYVAL textIn AS STRING, BYVAL speed AS INTEGER) AS STRING
DECLARE FUNCTION TSSSpell (BYVAL textIn AS STRING) AS STRING
DECLARE FUNCTION TSSVolume (BYVAL textIn AS STRING, BYVAL level AS INTEGER) AS STRING
'
'********************************************************************************"
'*             USER DEFINED SPEAKERS - YOU CAN DELETE THESE                     *"
'********************************************************************************"
'
DECLARE SUB Speaker1 (BYVAL textIn AS STRING) 
DECLARE SUB Speaker2 (BYVAL textIn AS STRING)
DECLARE SUB SpeakerFR (BYVAL textIn AS STRING) 
'
DECLARE SUB CheckForProgramClose()
'
'
'********************************************************************************"
'*                                   INCLUDES                                   *"
'********************************************************************************"
'
#define UNICODE
#Include "DispHelper/DispHelper.bi"
'
'********************************************************************************"
'*                    DISPHELPER / SAPI DECLARATIONS                             *"
'********************************************************************************"
'
DIM SHARED AS IDispatch PTR voice
'
'********************************************************************************"
'*                                                                              *"
'*                               M A I N L I N E                                *"
'*                                                                              *"
'********************************************************************************"
'
SCREEN 12
WINDOWTITLE "SAPI with XML Markups"
'
'********************************************************************************"
'*  INITIALISATION - This must occur before any calls to Speak Subroutine       *"
'********************************************************************************"
dhInitialize(TRUE) 
' Can SAPI be opened? If not, terminate the program.
IF dhCreateObject("SAPI.SpVoice", NULL, @voice) <> 0 THEN PRINT "Unable to open SAPI. The program will terminate.": dhUninitialize(TRUE): END ' Note: Voice is defined as SHARED above
'
'
'
'***** SAPI Tests *****

Speak ("This is a test of SAPI XML markup using calls to functions written in FreeBasic.")
Speak (TSSPause(250))
CheckForProgramClose()

Speak ("You can change the volume of the voice.")
Speak (TSSVolume ("This sentence is loud.", 100))
Speak (TSSVolume ("This sentence is soft.", 8))
Speak (TSSPause(250))
CheckForProgramClose()

Speak ("You can change how fast or slowly the voice speaks.")
Speak (TSSRate ("This sentence is fast.", 8))
Speak (TSSRate ("This sentence is slow.", -8))
Speak (TSSPause(250))
CheckForProgramClose()

Speak ("You can change the pitch of the voice.")
Speak (TSSPitch ("This sentence is spoken in a high pitch.", 10))
Speak (TSSPitch ("This sentence is spoken in a low pitch.", -10))
Speak (TSSPause(250))
CheckForProgramClose()

Speak ("You can add emphasis to words.")
Speak ("This is important." + TSSEmph ("Please pay attention."))
Speak (TSSPause(250))
CheckForProgramClose()

Speak ("Voice control functions can be combined.")
Speak (TSSRate (TSSPitch ("This sentence is low and slow.", -10), -5))
Speak (TSSPause(250))
CheckForProgramClose()

Speak ("You can spell words rather than saying them.")
Speak ("How do you spell apple?" + TSSSpell ("Apple"))
Speak (TSSPause(250))
CheckForProgramClose()

Speak ("You can put a break" + TSSPause(1500) + "into a sentence.")
Speak (TSSPause(250))
CheckForProgramClose()

Speak ("You can speak in different accents if they are available on your computer. Let's give it a try.")
Speak (TSS_US ("This sentence should be in an American accent."))
Speak (TSS_UK ("This sentence should be in a UK accent. This may not work on your computer."))
Speak (TSS_AU ("This sentence should be in an Australian accent. This may not work on your computer."))
'Speak (TSS_FR ("Cette phrase est en français. Cela peut ne pas fonctionner sur votre ordinateur.")) ' Uncomment this if you have a French voice on your computer. 
'SpeakFR ("C'est la même voix française. La méthode est différente. Utilise (SpeakFR).") ' Uncomment this if you have a French voice on your computer. 
Speak (TSSPause(250))
CheckForProgramClose()

Speak ("You can speak in in male and female voice if they are available on your computer. You should now hear a female voice followed by a male voice. This may not work on your computer.")
Speak(TSSFemale ("This sentence is in an female voice."))
Speak(TSSMale ("This sentence is in an male voice."))
Speak (TSSPause(250))
CheckForProgramClose()

' ***** A 2 Speaker conversation THIS CAN BE DELETED *****

Speak ("Users can create their own subroutines that combine the TSS functions to create distinct speakers.")
Speak (TSSPause(200))

Speaker1 ("Hello. How did the move go? What kind of place are you living in now?")
Speaker2 ("Great, thanks! It's just a little flat actually, right next to the park. The area's nice and quiet too, so, yeah - all good so far!")
Speaker1 ("Great! What's it like inside?")
Speaker2 ("So it's a bit small, you know, but it's cosy. And the de-cor is quite modern too so that makes it feel a bit more spacious. I really love the kitchen, because it's got loads of natural light.")
Speaker1 ("Ok! Anything you're not so happy about?")
Speaker2 ("Well, I'd be a lot happier if there was somewhere to hang coats, because there's nothing right now. And also, the parking outside is a real nightmare too. But apart from that, I'm really happy with it.")
Speaker1 ("I'll see you later. I have to get home before the traffic gets too heavy." + TSSEmph("Bye"))

'SpeakerFR ("Excusez-moi. Je cherche l’Église Sainte-Agathe. Connaissez-vous le chemin?") 'Uncomment this if you have a French voice on your computer.

Speak (TSSPause(250))
CheckForProgramClose()

'*********************************************************

Speak ("And that's all folks. Press any key to end the program.")    

'
'********************************************************************************"
'*  UNINITIALISATION - This must occur after all calls to Speak Subroutine       *"
'********************************************************************************"
dhUninitialize(TRUE) 
'
'
PRINT: PRINT "Press any key to end ..."
SLEEP
END


SUB Speak(BYVAL textIn AS STRING)
  '
  dhCallMethod(voice, ".Speak(%s, %d)", "<speak>" + textIn + "</speak>", 0) ' Note: Voice is defined as SHARED above
  '
END SUB

SUB SpeakFR(BYVAL textIn AS STRING) 'French version. Requires a French voice to be on the computer. Version française. Nécessite une voix française pour être sur l'ordinateur.
  '
  dhCallMethod(voice, ".Speak(%s, %d)", "<speak>" + TSS_FR(textIn) + "</speak>", 0) ' Note: Voice is defined as SHARED above
  '
END SUB

FUNCTION TSSLang(BYVAL textIn AS STRING, BYVAL countrycode AS STRING) AS STRING
  '
  ' Accent / country codes:
  '    409	  English	United States
  '    809	  English	United Kingdom
  '    1009	  English	Canada
  '    1809	  English	Ireland
  '    4009	  English	India
  '    C09	  English	Australia
  '    40C    French Standard
  ' Check what voices/accents are available on your computer from the Control Panel > Ease of Access > Speech Recognition > Text to Speech
  '
  TSSLang = "<lang langid=" + CHR(34) + countrycode + CHR(34) + "> " + textIn + " </lang>"
  '
END FUNCTION

FUNCTION TSS_AU (BYVAL textIn AS STRING) AS STRING
  ' Australian English language
  TSS_AU = TSSLang(textIn, "C09")
END FUNCTION

FUNCTION TSS_FR (BYVAL textIn AS STRING) AS STRING
  ' Standard French language. Requires a French voice to be on the computer. Langue française. Nécessite une voix française pour être sur l'ordinateur.
  TSS_FR = TSSLang(textIn, "40C")
END FUNCTION

FUNCTION TSS_UK (BYVAL textIn AS STRING) AS STRING
  ' UK English language
  TSS_UK = TSSLang(textIn, "809")
END FUNCTION

FUNCTION TSS_US (BYVAL textIn AS STRING) AS STRING
  ' US English language
  TSS_US = TSSLang(textIn, "409")
END FUNCTION

FUNCTION TSSEmph (BYVAL textIn AS STRING) AS STRING
  ' Emphasis on text
  TSSEmph = "<emph>" + textIn + "</emph>"
END FUNCTION

FUNCTION TSSFemale (BYVAL textIn AS STRING) AS STRING
  TSSFemale = "<voice required=" + CHR(34) + "Gender=Female" +  CHR(34) +">" + textIn + "</voice>"
END FUNCTION

FUNCTION TSSMale (BYVAL textIn AS STRING) AS STRING
  TSSMale = "<voice required=" + CHR(34) + "Gender=Male" +  CHR(34) +">" + textIn + "</voice>"
END FUNCTION

FUNCTION TSSPause (BYVAL milliseconds AS INTEGER) AS STRING
  ' Insert a specified number of milliseconds of silence 
  TSSPause = "<silence msec=" + CHR(34) + STR(milliseconds) + CHR(34) + "/>"
END FUNCTION

FUNCTION TSSPitch (BYVAL textIn AS STRING, BYVAL level AS INTEGER) AS STRING
  ' Raises or lowers the voice pitch
  ' Pitch levels between -10 and 10
  IF level > 10 THEN level = 10
  IF level < -10 THEN level = -10
  TSSPitch = "<pitch absmiddle=" + CHR(34) + STR(level) +  CHR(34) +">" + textIn + "</pitch>"
END FUNCTION

FUNCTION TSSRate (BYVAL textIn AS STRING, BYVAL speed AS INTEGER) AS STRING
  ' Make voice speak faster or slower
  ' Rate between -10 and 10
  IF speed > 10 THEN speed = 10
  IF speed < -10 THEN speed = -10
  TSSRate = "<rate absspeed=" + CHR(34) + STR(speed) + CHR(34) +">" + textIn + "</rate>"
END FUNCTION

FUNCTION TSSSpell (BYVAL textIn AS STRING) AS STRING
  ' Spell out the text
  TSSSpell = "<spell>" + textIn + "</spell>"
END FUNCTION

FUNCTION TSSVolume (BYVAL textIn AS STRING, BYVAL level AS INTEGER) AS STRING
  ' Raise or lower voice volume
  ' Volume levels between 0 and 100
  IF level > 100 THEN level = 100
  IF level < 0 THEN level = 0
  TSSVolume = "<volume level=" + CHR(34) + STR(level) + CHR(34) +">" + textIn + "</volume>"
END FUNCTION
'
'********************************************************************************"
'*        USER DEFINED SPEAKERS  - YOU CAN DELETE THESE                          *"
'********************************************************************************"
'
SUB Speaker1 (BYVAL textIn AS STRING) 
  Speak (TSSPause(50))
  Speak (TSSRate (TSSPitch (textIn, 5), 2))
END SUB

SUB Speaker2 (BYVAL textIn AS STRING) 
  Speak (TSSPause(50))
  Speak (TSSRate (TSSPitch (textIn, -6), 0))
  'Speak (TSSMale (TSSRate (TSSPitch (textIn, -6), 0))) 'Try this instead if you have both male and female voices.
END SUB

SUB SpeakerFR (BYVAL textIn AS STRING) 
  Speak (TSSPause(50))
  SpeakFR (TSSRate (TSSPitch (textIn, -6), 0))
END SUB

' ***********************************************************************************
SUB CheckForProgramClose()
  ' This is just a quick and dirty way to end the program while it is running.
  ' The program will possibly continue speaking for a liitle while before closing.
  ' Delete this as appropriate.
  DIM key AS STRING
  key = INKEY
  IF key = CHR(27) OR key = CHR(255, 107) THEN dhUninitialize(TRUE): END
  WHILE INKEY <> "": WEND             ' Flush buffer
END SUB 
' ***********************************************************************************

Last edited by delyall on May 12, 2024 6:56, edited 2 times in total.
delyall
Posts: 30
Joined: Feb 13, 2010 0:19

Re: Modifying SAPI Text-to-Speech using XML Mark-up

Post by delyall »

SARG wrote: May 11, 2024 10:17 On my french PC I get only 2 voices : Hortense (french girl) and Zira (US). Obvioulsly the texts should be adapted :D
Got French working using Microsoft Hortense voice. See revised post.
SARG
Posts: 1785
Joined: May 27, 2005 7:15
Location: FRANCE

Re: Modifying SAPI Text-to-Speech using XML Mark-up

Post by SARG »

Tested, it works nicely. :-)

As there are only French and US voices the output in English UK and other voices is awfull ;-)

I use the same API in a game I'm writing but with less parameters so your code interests me. Thanks again.
delyall
Posts: 30
Joined: Feb 13, 2010 0:19

Re: Modifying SAPI Text-to-Speech using XML Mark-up

Post by delyall »

SARG wrote: May 12, 2024 9:25 As there are only French and US voices the output in English UK and other voices is awfull ;-)
As far as I can tell if a voice isn't present it uses the default voice. In your case if that is Hortense (French) it will sound awful.
Last edited by fxm on May 12, 2024 11:00, edited 1 time in total.
Reason: Enabled BBCode.
SARG
Posts: 1785
Joined: May 27, 2005 7:15
Location: FRANCE

Re: Modifying SAPI Text-to-Speech using XML Mark-up

Post by SARG »

delyall wrote: May 12, 2024 10:03 As far as I can tell if a voice isn't present it uses the default voice. In your case if that is Hortense (French) it will sound awful.
yes I expressed myself badly : English texts are spoken awfully except those when US voice is selected. :lol:
delyall
Posts: 30
Joined: Feb 13, 2010 0:19

Re: Modifying SAPI Text-to-Speech using XML Mark-up

Post by delyall »

A subroutine to bring up the Text-to-Speech dialog box. Saves having to click through a number of Windows links. Can be added to your TSS program to help users see what voices they have and change the default voice.

Code: Select all

'SAPIVoices.bas

'$lang: "fb"

#include "file.bi"
#define WIN_INCLUDEALL
#include "windows.bi"

DECLARE SUB DisplayAvailableSAPIVoices()

SCREEN 12
DisplayAvailableSAPIVoices 
PRINT "Press any key to end program ..."
SLEEP
END

SUB DisplayAvailableSAPIVoices()
  '
  DIM SAPICplFile AS STRING
  SAPICplFile = ENVIRON("SystemRoot") + "\System32\Speech\SpeechUX\sapi.cpl"
  IF FILEEXISTS (SAPICplFile) THEN
    ShellExecute(NULL, NULL, "control.exe", SAPICplFile, NULL, SW_SHOWNORMAL)
  ELSE
    SAPICplFile = ENVIRON("SystemRoot") + "\SysWOW64\Speech\SpeechUX\sapi.cpl"
    IF FILEEXISTS (SAPICplFile) THEN 
      ShellExecute(NULL, NULL, "control.exe", SAPICplFile, NULL, SW_SHOWNORMAL)
    ELSE
      MessageBox(GetForegroundWindow(), "Unable to display available voices." + CHR$(0), "Information" + CHR$(0), MB_OK + MB_ICONWARNING + MB_ICONINFORMATION + MB_DEFBUTTON1 + MB_SETFOREGROUND )
    END IF
  END IF  
  '
END SUB 

Post Reply