<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <html> <!-- Created on October 18, 2014 by texi2html 5.0 texi2html was written by: Lionel Cons <Lionel.Cons@cern.ch> (original author) Karl Berry <karl@freefriends.org> Olaf Bachmann <obachman@mathematik.uni-kl.de> and many others. Maintained by: Many creative people. Send bugs and suggestions to <texi2html-bug@nongnu.org> --> <head> <title>Flite: a small, fast speech synthesis engine: 7 APIs</title> <meta name="description" content="Flite: a small, fast speech synthesis engine: 7 APIs"> <meta name="keywords" content="Flite: a small, fast speech synthesis engine: 7 APIs"> <meta name="resource-type" content="document"> <meta name="distribution" content="global"> <meta name="Generator" content="texi2html 5.0"> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <style type="text/css"> <!-- a.summary-letter {text-decoration: none} blockquote.smallquotation {font-size: smaller} div.display {margin-left: 3.2em} div.example {margin-left: 3.2em} div.lisp {margin-left: 3.2em} div.smalldisplay {margin-left: 3.2em} div.smallexample {margin-left: 3.2em} div.smalllisp {margin-left: 3.2em} pre.display {font-family: serif} pre.format {font-family: serif} pre.menu-comment {font-family: serif} pre.menu-preformatted {font-family: serif} pre.smalldisplay {font-family: serif; font-size: smaller} pre.smallexample {font-size: smaller} pre.smallformat {font-family: serif; font-size: smaller} pre.smalllisp {font-size: smaller} span.nocodebreak {white-space:pre} span.nolinebreak {white-space:pre} span.roman {font-family:serif; font-weight:normal} span.sansserif {font-family:sans-serif; font-weight:normal} ul.no-bullet {list-style: none} --> </style> </head> <body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000"> <a name="APIs"></a> <table class="header" cellpadding="1" cellspacing="1" border="0"> <tr><td valign="middle" 
align="left">[<a href="flite_5.html#Structure" title="Beginning of this chapter or previous chapter"> << </a>]</td> <td valign="middle" align="left">[<a href="flite_5.html#cst_005fval" title="Previous section in reading order"> < </a>]</td> <td valign="middle" align="left">[ Up ]</td> <td valign="middle" align="left">[<a href="#flite-binary" title="Next section in reading order"> > </a>]</td> <td valign="middle" align="left">[<a href="flite_7.html#Converting-FestVox-Voices" title="Next chapter"> >> </a>]</td> <td valign="middle" align="left"> </td> <td valign="middle" align="left"> </td> <td valign="middle" align="left"> </td> <td valign="middle" align="left"> </td> <td valign="middle" align="left">[<a href="flite.html#Abstract" title="Cover (top) of document">Top</a>]</td> <td valign="middle" align="left">[<a href="flite_toc.html#SEC_Contents" title="Table of contents">Contents</a>]</td> <td valign="middle" align="left">[Index]</td> <td valign="middle" align="left">[<a href="flite_abt.html#SEC_About" title="About (help)"> ? </a>]</td> </tr></table> <a name="APIs-1"></a> <h1 class="chapter">7 APIs</h1> <p>Flite is a library that we expect will be embedded in other applications. Included with the distribution is a small example executable that allows synthesis of strings of text and text files from the command line. 
</p> <hr> <a name="flite-binary"></a> <h2 class="section">7.1 flite binary</h2> <p>The example flite binary may be suitable for very simple applications. Unlike Festival, its start-up time is very short (less than 25 ms on a 500 MHz Pentium III), making it practical (on larger machines) to call it each time you need to synthesize something. </p><div class="example"> <pre class="example">flite TEXT OUTPUTTYPE </pre></div> <p>If <code>TEXT</code> contains a space, it is treated as a string of text and converted to speech; if it does not contain a space, <code>TEXT</code> is treated as a file name and the contents of that file are converted to speech. The option <code>-t</code> specifies that <code>TEXT</code> is to be treated as text (not a file name), and <code>-f</code> forces treatment as a file. 
Thus </p><div class="example"> <pre class="example">flite -t hello </pre></div> <p>will say the word "hello" while </p><div class="example"> <pre class="example">flite hello </pre></div> <p>will say the content of the file ‘<tt>hello</tt>’. Likewise </p><div class="example"> <pre class="example">flite "hello world." </pre></div> <p>will say the words "hello world" while </p><div class="example"> <pre class="example">flite -f "hello world" </pre></div> <p>will say the contents of a file ‘<tt>hello world</tt>’. If no argument is specified, text is read from standard input. </p> <p>The second argument, <code>OUTPUTTYPE</code>, is the name of a file to which the output is written. If it is <code>play</code>, the audio is played directly on the audio device. If it is <code>none</code>, the audio is created but discarded; this is used for benchmarking. If it is <code>stream</code>, the audio is streamed through a callback function (though this is not particularly useful in the command-line version). If <code>OUTPUTTYPE</code> is omitted, <code>play</code> is assumed. You can also set the output type explicitly with the <code>-o</code> flag. 
</p><div class="example"> <pre class="example">flite -f doc/alice -o alice.wav </pre></div> <hr> <a name="Voice-selection"></a> <h2 class="section">7.2 Voice selection</h2> <p>All the voices in the distribution are collected into a single simple list in the global variable <code>flite_voice_list</code>. You can select a voice from this list on the command line </p><div class="example"> <pre class="example">flite -voice awb -f doc/alice -o alice.wav </pre></div> <p>and list the voices currently supported in the binary with </p><div class="example"> <pre class="example">flite -lv </pre></div> <p>The voices that get linked together are those listed in the <code>VOICES</code> variable in ‘<tt>main/Makefile</tt>’. You can change that list as you require. 
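</p> <p>As an illustration of selecting a voice by name from a list, here is a minimal, self-contained sketch. It does not use Flite itself: <code>demo_voice</code>, <code>demo_voice_list</code> and <code>demo_voice_select</code> are hypothetical stand-ins invented for this example, and the rules it follows (a <code>NULL</code> name yields the first voice, an unknown name yields <code>NULL</code>) are the ones this chapter gives for <code>flite_voice_select</code>. </p>

```c
#include <string.h>

/* Hypothetical stand-in for a registered voice; this is NOT Flite's cst_voice. */
typedef struct {
    const char *name;
} demo_voice;

/* Hypothetical stand-in for the list of voices linked into the binary. */
static demo_voice demo_voice_list[] = { { "kal" }, { "awb" } };
static const size_t demo_voice_count =
    sizeof(demo_voice_list) / sizeof(demo_voice_list[0]);

/* Look a voice up by name.
   - name == NULL: return the first voice in the list.
   - name matches an entry: return that entry.
   - otherwise: return NULL. */
static demo_voice *demo_voice_select(const char *name)
{
    size_t i;

    if (name == NULL)
        return &demo_voice_list[0];

    for (i = 0; i < demo_voice_count; i++) {
        if (strcmp(demo_voice_list[i].name, name) == 0)
            return &demo_voice_list[i];
    }
    return NULL;
}
```

<p>A caller would check the result against <code>NULL</code> before using it, since an unknown name is not an error in this sketch but simply produces no voice. </p> <p>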
</p> <hr> <a name="C-example"></a> <h2 class="section">7.3 C example</h2> <p>Each voice in Flite is held in a structure, a pointer to which is returned by the voice registration function. In the standard distribution, the example diphone voice is <code>cmu_us_kal</code>. </p> <p>Here is a simple C program that uses the flite library </p><div class="example"> <pre class="example">#include "flite.h"

/* Voice registration function provided by the cmu_us_kal voice library */
cst_voice *register_cmu_us_kal(const char *voxdir);

int main(int argc, char **argv)
{
    cst_voice *v;

    if (argc != 2)
    {
        fprintf(stderr,"usage: flite_test FILE\n");
        exit(-1);
    }

    flite_init();

    /* Register the voice and get a pointer to its structure */
    v = register_cmu_us_kal(NULL);

    /* Synthesize the file and play it on the audio device */
    flite_file_to_speech(argv[1],v,"play");

    return 0;
}
</pre></div> <p>Assuming the shell variable <code>FLITEDIR</code> is set to the flite directory, the following will compile the program (with appropriate changes for your platform if necessary). 
</p><div class="example"> <pre class="example">gcc -Wall -g -o flite_test flite_test.c -I$FLITEDIR/include -L$FLITEDIR/lib -lflite_cmu_us_kal -lflite_usenglish -lflite_cmulex -lflite -lm </pre></div> <hr> <a name="Public-Functions"></a> <h2 class="section">7.4 Public Functions</h2> <p>Although you are, of course, welcome to call lower-level functions, there are a few key functions that will satisfy most users of flite. </p><dl compact="compact"> <dt><code>void flite_init(void);</code></dt> <dd><p>This must be called before any other flite function can be called. As of Flite 1.1, it actually does nothing at all, but there is no guarantee that this will remain true. 
</p></dd> <dt><code>cst_wave *flite_text_to_wave(const char *text,cst_voice *voice);</code></dt> <dd><p>Returns a waveform (as defined in ‘<tt>include/cst_wave.h</tt>’) synthesized from the given text string by the given voice. </p></dd> <dt><code>float flite_file_to_speech(const char *filename, cst_voice *voice, const char *outtype);</code></dt> <dd><p>Synthesizes all the sentences in the file ‘<tt>filename</tt>’ with the given voice. Output (at present) can only reasonably be <code>play</code> or <code>none</code>. If the feature <code>file_start_position</code> is set to an integer, that point is used as the start position in the file to be synthesized. </p></dd> <dt><code>float flite_text_to_speech(const char *text, cst_voice *voice, const char *outtype);</code></dt> <dd><p>Synthesizes the text in the string pointed to by <code>text</code> with the given voice. <code>outtype</code> may be a file name to which the generated waveform is written, "play" to send it to the audio device, or "none" to discard it. The return value is the number of seconds of speech generated. </p></dd> <dt><code>cst_utterance *flite_synth_text(const char *text,cst_voice *voice);</code></dt> <dd><p>Synthesizes the given text with the given voice and returns an utterance from it for further processing and access. </p></dd> <dt><code>cst_utterance *flite_synth_phones(const char *phones,cst_voice *voice);</code></dt> <dd><p>Synthesizes the given phones with the given voice and returns an utterance from it for further processing and access. </p></dd> <dt><code>cst_voice *flite_voice_select(const char *name);</code></dt> <dd><p>Returns a pointer to the voice named <code>name</code>. Returns <code>NULL</code> if there is no match; if <code>name == NULL</code>, the first voice in the voice list is returned. 
</p></dd> <dt><code>int flite_voice_add_lex_addenda(cst_voice *v, const cst_string *lexfile);</code></dt> <dd><p>Loads the pronunciations from <code>lexfile</code> into the lexicon identified in the given voice (which will cause all other voices using that lexicon to also get this new addenda list). An example lexicon file is given in ‘<tt>flite/tools/examples.lex</tt>’. Words may be in double quotes, and an optional part-of-speech tag may be given. A colon separates the headword/postag from the list of phonemes. Stress values (if used in the lexicon) must be specified. Bad phonemes will be reported on standard output. </p></dd> </dl> <hr> <a name="Streaming-Synthesis"></a> <h2 class="section">7.5 Streaming Synthesis</h2> <p>In Flite 1.4, support was added for streaming synthesis. 
You may provide a callback function that is called with waveform data as soon as it is available. This can potentially reduce the delay between sending text to the synthesizer and having audio available. </p> <p>The support is through a callback function of type </p><div class="example"> <pre class="example">int audio_stream_chunk(const cst_wave *w, int start, int size, int last, void *user) </pre></div> <p>If the utterance feature <code>streaming_info</code> is set (it can be set in a voice or in an utterance), the LPC or MLSA resynthesis functions will call the provided function as buffers become available. The LPC and MLSA waveform synthesis functions are used for diphone, limited domain, unit selection and clustergen voices. Note that explicit support is required for streaming, so new waveform synthesis functions may not have this functionality. </p> <p>An example streaming function is provided in ‘<tt>src/audio/au_streaming.c</tt>’ and is used by the example flite main program when <code>stream</code> is given as the playing option. (In the command-line program, though, the function isn’t really useful.) </p> <p>In order to use streaming, you must provide a callback function in your particular thread. This is done by adding features to the voice in your thread. Suppose your function is declared as </p> <div class="example"> <pre class="example">int example_audio_stream_chunk(const cst_wave *w, int start, int size, int last, void *user) </pre></div> <p>You can add this function as the streaming function through the statements </p><div class="example"> <pre class="example">cst_audio_streaming_info *asi;
...
asi = new_audio_streaming_info();
asi->asc = example_audio_stream_chunk;
feat_set(voice->features,
         "streaming_info",
         audio_streaming_info_val(asi));
</pre></div> <p>You may also include your own pointer to any additional information you want to pass to your function. 
For example </p><div class="example"> <pre class="example">typedef struct {
    cst_audiodev *fd;
    int count;
} my_callback_struct;

cst_audio_streaming_info *asi;
my_callback_struct *mcs;
...
/* Allocate and fill in the data handed to the callback as its user argument */
mcs = cst_alloc(my_callback_struct,1);
mcs->fd=NULL;
mcs->count=1;

asi = new_audio_streaming_info();
asi->asc = example_audio_stream_chunk;
asi->userdata = mcs;
feat_set(voice->features,
         "streaming_info",
         audio_streaming_info_val(asi));
</pre></div> <hr> <p> <font size="-1"> This document was generated on <i>October 18, 2014</i> using <a href="http://www.nongnu.org/texi2html/"><i>texi2html 5.0</i></a>. </font> <br> </p> </body> </html>