This is festival.info, produced by Makeinfo version 3.12h from festival.texi.

This file documents the `Festival' Speech Synthesis System, a general text to speech system for making your computer talk and developing new synthesis techniques.

Copyright (C) 1996-2001 University of Edinburgh

Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.

Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.

Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the authors.

File: festival.info,  Node: Problems,  Next: References,  Prev: Examples,  Up: Top

Problems
********

There will be many problems with Festival, both in installing it and in running it. It is a young system and there is a lot to it. We believe the basic design is sound, and that the problems will be missing or incomplete features rather than fundamental flaws.

We are always open to suggestions on how to improve Festival and fix problems. We don't guarantee we'll have the time to fix them, but we are interested in hearing what problems you have.

Before you smother us with mail, here is an incomplete list of general problems we have already identified:

   * The more documentation we write, the more we realize how much more documentation is required. Most of the Festival documentation was written by someone who knows the system very well, and makes many English mistakes. A good rewrite by someone else would be a good start.

   * The system is far too slow. Although machines are getting faster, it still takes too long to start the system and get it to speak some given text.
     Even so, on reasonable machines, Festival can generate speech several times faster than it takes to say it. But even if it is five times faster, it will take 2 seconds to generate a 10 second utterance. A 2 second wait is too long. Faster machines would improve this, but a change in design is a better solution.

   * The system is too big. It takes a long time to compile even on quite large machines, and its footprint is still in the tens of megabytes, as is the run-time requirement. Although we have spent some time trying to fix this (optional modules have made it possible to build a much smaller binary), we haven't done enough yet.

   * The signal quality of the voices isn't very good by the standard of today's synthesizers, even given the improvement in quality since the last release. This is partly our fault for not spending the time (or perhaps also not having enough expertise) on the low-level waveform synthesis parts of the system. This will improve in the future with better signal processing (under development) and better synthesis techniques (also under development).

File: festival.info,  Node: References,  Next: Feature functions,  Prev: Problems,  Up: Top

References
**********

_allen87_
     Allen J., Hunnicutt S. and Klatt, D. _Text-to-speech: the MITalk system_, Cambridge University Press, 1987.

_abelson85_
     Abelson H. and Sussman G. _Structure and Interpretation of Computer Programs_, MIT Press, 1985.

_black94_
     Black A. and Taylor, P. "CHATR: a generic speech synthesis system.", _Proceedings of COLING-94_, Kyoto, Japan 1994.

_black96_
     Black, A. and Hunt, A. "Generating F0 contours from ToBI labels using linear regression", _ICSLP96_, vol. 3, pp 1385-1388, Philadelphia, PA. 1996.

_black97b_
     Black, A. and Taylor, P. "Assigning Phrase Breaks from Part-of-Speech Sequences", _Eurospeech97_, Rhodes, Greece, 1997.

_black97c_
     Black, A. and Taylor, P. "Automatically clustering similar units for unit selection in speech synthesis", _Eurospeech97_, Rhodes, Greece, 1997.
_black98_
     Black, A., Lenzo, K. and Pagel, V., "Issues in building general letter to sound rules.", 3rd ESCA Workshop on Speech Synthesis, Jenolan Caves, Australia, 1998.

_black99_
     Black, A., and Lenzo, K., "Building Voices in the Festival Speech Synthesis System," unpublished document, Carnegie Mellon University, available at `http://www.cstr.ed.ac.uk/projects/festival/docs/festvox/'

_breiman84_
     Breiman, L., Friedman, J., Olshen, R. and Stone, C. _Classification and regression trees_, Wadsworth and Brooks, Pacific Grove, CA. 1984.

_campbell91_
     Campbell, N. and Isard, S. "Segment durations in a syllable frame", _Journal of Phonetics_, 19:1 37-47, 1991.

_DeRose88_
     DeRose, S. "Grammatical category disambiguation by statistical optimization". _Computational Linguistics_, 14:31-39, 1988.

_dusterhoff97_
     Dusterhoff, K. and Black, A. "Generating F0 contours for speech synthesis using the Tilt intonation theory", _Proceedings of ESCA Workshop of Intonation_, September, Athens, Greece. 1997.

_dutoit97_
     Dutoit, T. _An introduction to Text-to-Speech Synthesis_, Kluwer Academic Publishers, 1997.

_hunt89_
     Hunt, M., Zwierynski, D. and Carr, R. "Issues in high quality LPC analysis and synthesis", _Eurospeech89_, vol. 2, pp 348-351, Paris, France. 1989.

_jilka96_
     Jilka M. _Regelbasierte Generierung natuerlich klingender Intonation des Amerikanischen Englisch_, Magisterarbeit, Institute of Natural Language Processing, University of Stuttgart. 1996.

_moulines90_
     Moulines, E. and Charpentier, F. "Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones", _Speech Communication_, 9(5/6) pp 453-467. 1990.

_pagel98_
     Pagel, V., Lenzo, K., and Black, A. "Letter to Sound Rules for Accented Lexicon Compression", ICSLP98, Sydney, Australia, 1998.

_ritchie92_
     Ritchie G, Russell G, Black A and Pulman S. _Computational Morphology: practical mechanisms for the English Lexicon_, MIT Press, Cambridge, Mass.

_vansanten96_
     van Santen, J., Sproat, R., Olive, J.
     and Hirschberg, J. eds, "Progress in Speech Synthesis," Springer Verlag, 1996.

_silverman92_
     Silverman K., Beckman M., Pitrelli, J., Ostendorf, M., Wightman, C., Price, P., Pierrehumbert, J., and Hirschberg, J. "ToBI: a standard for labelling English prosody." _Proceedings of ICSLP92_, vol. 2, pp 867-870, 1992.

_sproat97_
     Sproat, R., Taylor, P., Tanenblatt, M. and Isard, A. "A Markup Language for Text-to-Speech Synthesis", _Eurospeech97_, Rhodes, Greece, 1997.

_sproat98_
     Sproat, R. eds, "Multilingual Text-to-Speech Synthesis: The Bell Labs approach", Kluwer 1998.

_sable98_
     Sproat, R., Hunt, A., Ostendorf, M., Taylor, P., Black, A., Lenzo, K., and Edgington, M. "SABLE: A standard for TTS markup." ICSLP98, Sydney, Australia, 1998.

_taylor91_
     Taylor P., Nairn I., Sutherland A. and Jack M. "A real time speech synthesis system", _Eurospeech91_, vol. 1, pp 341-344, Genoa, Italy. 1991.

_taylor96_
     Taylor P. and Isard, A. "SSML: A speech synthesis markup language", to appear in _Speech Communications_.

_wwwxml97_
     World Wide Web Consortium Working Draft "Extensible Markup Language (XML) Version 1.0 Part 1: Syntax", `http://www.w3.org/pub/WWW/TR/WD-xml-lang-970630.html'

_yarowsky96_
     Yarowsky, D., "Homograph disambiguation in text-to-speech synthesis", in "Progress in Speech Synthesis," eds. van Santen, J., Sproat, R., Olive, J. and Hirschberg, J. pp 157-172. Springer Verlag, 1996.

File: festival.info,  Node: Feature functions,  Next: Variable list,  Prev: References,  Up: Top

Feature functions
*****************

This chapter contains a list of the basic feature functions available for stream items in utterances. *Note Features::. These are the basic features, which can be combined with relative features (such as `n.' for next, and relations to follow links). Some of these features are implemented as short C++ functions (e.g. `asyl_in') while others are simple features on an item (e.g. `pos').
Note that functional features take precedence over simple features, so accessing a feature called "X" will always use the function called "X", even if a simple feature called "X" exists on the item.

Unlike previous versions, there are no features that are built in on all items except `addr' (reintroduced in 1.3.1), which returns a unique string for that item (it is the hex address of the item within the machine). Features may be defined through Scheme too; these all have the prefix `lisp_'.

The feature functions are listed in the form RELATION.NAME, where RELATION is the name of the stream that the function is appropriate to and NAME is its name. Note that you will not require the RELATION part of the name if the stream item you are applying the function to is of that type.

`ANY.addr'
     Returned by popular demand, returns the address of the given item, which is guaranteed unique for this session.

`ANY.lisp_*'
     Apply the Lisp function named after lisp_. The function is called with a stream item. It must return an atomic value. This method may be inefficient and is primarily designed to allow quick prototyping of new feature functions.

`Intonation.lisp_last_tilt_accent'
     Returns the most recent tilt accent.

`Intonation.lisp_last_tilt_boundary'
     Returns the most recent tilt boundary.

`Intonation.lisp_next_tilt_accent'
     Returns the next tilt accent.

`Intonation.lisp_next_tilt_boundary'
     Returns the next tilt boundary.

`Intonation.peak_anchor_segment_type ie'
     Determines whether the segment anchor for a peak is the first consonant of a syl - C0 -, the vowel of a syl - V0 -, or segments after that - C1->X,V1->X. If the segment is in a following syl, the return value will be preceded by a 1 - e.g. 1V1

`Segment.diphone_phone_name'
     This is produced by the diphone module to contain the desired phone name for the desired diphone. This adds things like _ if part of a consonant cluster, or $ to denote syllable boundaries.
     These are generated on a per voice basis by function(s) specified by diphone_module_hooks. Identification of dark ll's etc. may also be included. Note that this is not necessarily the name of the diphone selected: if it is not found, some of these characters will be removed and fallback values will be used.

`Segment.lisp_pos_in_syl seg'
     Finds the position in a syllable of a segment - returns a number.

`Segment.ph_*'
     Access phoneset features for a segment. This definition covers multiple feature functions where ph_ may be extended with any features that are defined in the phoneset (e.g. vc, vlng, cplace etc.).

`Segment.pos_in_syl'
     The position of this segment in the syllable it is related to. The index counts from 0. If this segment is not related to a syllable this returns 0.

`Segment.seg_coda_fric'
     Returns 1 if the coda of the syllable this segment is in contains a fricative, 0 otherwise.

`Segment.seg_onset_stop'
     Returns 1 if the onset of the syllable this segment is in contains a stop, 0 otherwise.

`Segment.seg_onsetcoda'
     Returns onset if this segment is before the vowel in the syllable it is contained within. Returns coda if it is the vowel or after. If the segment is not in a syllable it returns onset.

`Segment.seg_pitch'
     Pitch at the middle of this segment.

`Segment.segment_duration'
     The duration of the given stream item, calculated as the end of this item minus the end of the previous item in the Segment relation.

`Segment.segment_end'
     The end time of the given segment.

`Segment.segment_mid'
     The middle time of the given segment.

`Segment.segment_start'
     The start time of the given segment.

`Segment.syl_final'
     Returns 1 if this segment is the last segment in the syllable it is related to, or if it is not related to any syllable.

`Segment.syl_initial'
     Returns 1 if this segment is the first segment in the syllable it is related to, or if it is not related to any syllable.

`Syllable.accented'
     Returns 1 if syllable is accented, 0 otherwise.
     A syllable is accented if there is at least one IntEvent related to it.

`Syllable.asyl_in'
     Returns number of accented syllables since last phrase break, not including this one. Accentedness is as defined by the syl_accented feature.

`Syllable.asyl_out'
     Returns number of accented syllables to the next phrase break, not including this one. Accentedness is as defined by the syl_accented feature.

`Syllable.last_accent'
     Returns the number of syllables since last accented syllable.

`Syllable.lisp_last_stress'
     Number of syllables from previous stressed syllable. 0 if this syllable is stressed. It is effectively assumed that the syllable before the first syllable is stressed.

`Syllable.lisp_next_stress'
     Number of syllables to next stressed syllable. 0 if this syllable is stressed. It is effectively assumed that the syllable after the last syllable is stressed.

`Syllable.lisp_tilt_accent'
     Returns "a" if there is a tilt accent related to this syllable, 0 otherwise.

`Syllable.lisp_tilt_accented'
     Returns 1 if there is a tilt accent related to this syllable, 0 otherwise.

`Syllable.lisp_tilt_boundaried'
     Returns 1 if there is a tilt boundary related to this syllable, 0 otherwise.

`Syllable.lisp_tilt_boundary'
     Returns boundary label if there is a tilt boundary related to this syllable, 0 otherwise.

`Syllable.lisp_time_to_next_vowel syl'
     The time from vowel_start to next vowel_start.

`Syllable.next_accent'
     Returns the number of syllables to the next accented syllable.

`Syllable.old_syl_break'
     Like syl_break but 2 and 3 are promoted to 4 (to be compatible with some older models).

`Syllable.pos_in_word'
     The position of this syllable in the word it is related to. The index counts from 0. If this syllable is not related to a word then 0 is returned.

`Syllable.position_type'
     The type of syllable with respect to the word it is related to.
     This may be any of: single for single syllable words, initial for word initial syllables in a poly-syllabic word, final for word final syllables in poly-syllabic words, and mid for syllables within poly-syllabic words.

`Syllable.ssyl_in'
     Returns number of stressed syllables since last phrase break, not including this one.

`Syllable.ssyl_out'
     Returns number of stressed syllables to next phrase break, not including this one.

`Syllable.stress'
     The lexical stress of the syllable as specified from the lexicon entry corresponding to the word related to this syllable.

`Syllable.sub_phrases'
     Returns the number of non-major phrase breaks since last major phrase break. Major phrase breaks are 4, as returned by syl_break; minor phrase breaks are 2 and 3.

`Syllable.syl_accent'
     Returns the name of the accent related to the syllable. NONE is returned if there are no accents, and multi is returned if there is more than one.

`Syllable.syl_break'
     The break level after this syllable. Word internal syllables return 0; final syllables in non phrase final words return 1. Final syllables in phrase final words return the name of the phrase they are related to. Note the occasional "-" that may appear in phrase names is removed, so that this feature function returns a number in the range 0,1,2,3,4.

`Syllable.syl_coda_type'
     Return the van Santen and Hirschberg classification. -V for unvoiced, +V-S for voiced but no sonorants, and +S for sonorants.

`Syllable.syl_codasize'
     Returns the number of segments after the vowel in this syllable. If there is no vowel in the syllable this will return the total number of segments in the syllable.

`Syllable.syl_endpitch'
     Pitch at the end of this syllable.

`Syllable.syl_in'
     Returns number of syllables since last phrase break. This is 0 if this syllable is phrase initial.

`Syllable.syl_midpitch'
     Pitch at the mid vowel of this syllable.

`Syllable.syl_numphones'
     Returns number of phones in syllable.
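As an illustration of the `Syllable.position_type' and `Syllable.syl_break' descriptions above, the following Python sketch models the two rules on plain values. It is not Festival code: the function names, their arguments, and the assumption that phrase names are digit strings such as "4" or "3-" are ours, chosen only to make the rules concrete.

```python
def position_type(index, num_syls):
    """Classify a syllable by its 0-based index in a word of num_syls syllables."""
    if num_syls == 1:
        return "single"
    if index == 0:
        return "initial"
    if index == num_syls - 1:
        return "final"
    return "mid"


def syl_break(word_final, phrase_final, phrase_name=None):
    """Break level after a syllable, following the syl_break description.

    word_final:   True if the syllable is the last one in its word.
    phrase_final: True if that word is the last one in its phrase.
    phrase_name:  name of the phrase; assumed here to be a digit string
                  that may carry a "-", e.g. "3-" (an assumption).
    """
    if not word_final:
        return 0  # word internal syllable
    if not phrase_final:
        return 1  # last syllable of a non phrase final word
    # Phrase final: the phrase name with any "-" removed, as a number.
    return int(phrase_name.replace("-", ""))
```

Under these assumptions, the middle syllable of a three-syllable word is classified as mid, and the last syllable of a phrase named "3-" gets break level 3.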
`Syllable.syl_onset_type'
     Return the van Santen and Hirschberg classification. -V for unvoiced, +V-S for voiced but no sonorants, and +S for sonorants.

`Syllable.syl_onsetsize'
     Returns the number of segments before the vowel in this syllable. If there is no vowel in the syllable this will return the total number of segments in the syllable.

`Syllable.syl_out'
     Returns number of syllables to next phrase break. This is 0 if this syllable is phrase final.

`Syllable.syl_pc_unvox'
     Percentage of total duration of unvoiced segments from start of syllable (i.e. percentage to start of first voiced segment).

`Syllable.syl_startpitch'
     Pitch at the start of this syllable.

`Syllable.syl_vowel'
     Returns the name of the vowel within this syllable. Note this is not the general form you probably want. You can't refer to ph_* features of this. Returns "novowel" if no vowel can be found.

`Syllable.syl_vowel_start'
     Start position of vowel in syllable. If there is no vowel the start position of the syllable is returned.

`Syllable.syllable_duration'
     The duration of the given stream item, calculated as the end of the last daughter minus the end of the previous item in the Segment relation of the first daughter.

`Syllable.syllable_end'
     The end time of the given syllable.

`Syllable.syllable_start'
     The start time of the given syllable.

`Syllable.tobi_accent'
     Returns the ToBI accent related to syllable. ToBI accents are those which contain a *. NONE is returned if there are none. If there is more than one ToBI accent related to this syllable the first one is returned.

`Syllable.tobi_endtone'
     Returns the ToBI endtone related to syllable. ToBI end tones are those IntEvent labels which contain a % or a - (i.e. end tones or phrase accents). NONE is returned if there are none. If there is more than one ToBI end tone related to this syllable the first one is returned.

`Syllable.lisp_get_onset_length'
     Length from start of syllable to start of vowel.
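The selection rules given for `Syllable.tobi_accent' and `Syllable.tobi_endtone' above can be sketched in a few lines of Python. This is only a model of the documented behaviour, not the Festival implementation: it assumes the IntEvent labels related to a syllable are available as a plain list of strings, and the function names simply mirror the feature names.

```python
def tobi_accent(labels):
    """Return the first label containing "*", or "NONE" if there is none."""
    for label in labels:
        if "*" in label:
            return label
    return "NONE"


def tobi_endtone(labels):
    """Return the first label containing "%" or "-", or "NONE" if there is none."""
    for label in labels:
        if "%" in label or "-" in label:
            return label
    return "NONE"
```

For a syllable carrying the labels "H*" and "L-L%", this model reports "H*" as the accent and "L-L%" as the end tone.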
`Syllable.lisp_get_rhyme_length'
     Length from start of the vowel to end of syllable.

`SylStructure.lisp_length_to_last_seg'
     Length from start of the vowel to start of last segment of syllable.

`SylStructure.lisp_num_postvocalic_c'
     Finds the number of postvocalic consonants in a syllable.

`SylStructure.sonority_scale_coda syl'
     Returns value on sonority scale (1-6, where 6 is most sonorous) for the coda of a syllable, based on least sonorant portion.

`SylStructure.sonority_scale_onset syl'
     Returns value on sonority scale (1-6, where 6 is most sonorous) for the onset of a syllable, based on least sonorant portion.

`SylStructure.lisp_syl_numphones syl'
     Finds the number of segments in a syllable.

`SylStructure.vowel_frontness syl'
     Classifies vowels as front, back or mid.

`SylStructure.lisp_vowel_height syl'
     Classifies vowels as high, low or mid.

`SylStructure.vowel_length syl'
     Returns the df.length feature of a syllable's vowel.

`Token.prepunctuation'
     Preceding punctuation symbol found before token in original string/file.

`Token.punc'
     Succeeding punctuation symbol found after token in original string/file.

`Token.whitespace'
     Whitespace found before token in original string/file.

`Word.blevel'
     A crude translation of phrase break into ToBI like phrase level. Values may be 0,1,2,3,4.

`Word.cap'
     Returns 1 if this word starts with a capital letter, 0 otherwise.

`Word.content_words_in'
     Number of content words from the start of this phrase.

`Word.content_words_out'
     Number of content words to the end of this phrase.

`Word.contentp'
     Returns 1 if this word is a content word as defined by gpos, 0 otherwise.

`Word.gpos'
     Returns a guess at the part of speech of this word. The lisp a-list guess_pos is used to look up this word. If no part of speech is found in there "content" is returned. This allows a quick efficient method for part of speech tagging into closed class and content words.

`Word.n_content'
     Next content word. Note this doesn't use the standard n.
     notation as it may have to search a number of words forward before finding a non-function word. Uses gpos to define content/function word distinction. This also works for Tokens.

`Word.nn_content'
     Next next content word. Note this doesn't use the standard n.n. notation as it may have to search a number of words forward before finding the second non-function word. Uses gpos to define content/function word distinction. This also works for Tokens.

`Word.num_break'
     1 if this is the last word in a numeric token and it is followed by a numeric token.

`Word.p_content'
     Previous content word. Note this doesn't use the standard p. notation as it may have to search a number of words backward before finding the first non-function word. Uses gpos to define content/function word distinction. This also works for Tokens.

`Word.pbreak'
     Result from statistical phrasing module, may be B or NB denoting phrase break or non-phrase break after the word.

`Word.pbreak_score'
     Log likelihood score from statistical phrasing module, for pbreak value.

`Word.pos'
     Part of speech tag value returned by the POS tagger module.

`Word.pos_in_phrase'
     The position of this word in the phrase this word is in.

`Word.pos_score'
     Part of speech tag log likelihood from Viterbi search.

`Word.pp_content'
     Previous previous content word. Note this doesn't use the standard p.p. notation as it may have to search a number of words backward before finding the second non-function word. Uses gpos to define content/function word distinction. This also works for Tokens.

`Word.word_break'
     The break level after this word. Non-phrase final words return 1. Phrase final words return the name of the phrase they are in.

`Word.word_duration'
     The duration of the given stream item. This is defined as the end of the last segment in the last syllable (via the SylStructure relation) minus the end of the segment immediately preceding the first segment in the first syllable.

`Word.word_end'
     The end time of the given word.
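The relationship between `Word.gpos' and the content word features (`Word.n_content', `Word.p_content') described above can be modelled as follows. This Python sketch is not Festival code. The tiny GUESS_POS table, its tag names, and the "0" returned when no content word exists are all assumptions made for illustration; the real guess_pos a-list is voice and language specific.

```python
# Hypothetical stand-in for the guess_pos a-list: tag -> words in that class.
GUESS_POS = [
    ("det", ["the", "a", "an"]),
    ("in", ["of", "to", "in", "on"]),
    ("aux", ["is", "was", "are"]),
]


def gpos(word):
    """Return the closed class tag for word, or "content" if it is not listed."""
    for tag, words in GUESS_POS:
        if word in words:
            return tag
    return "content"


def n_content(words, i):
    """Next content word after position i, skipping any function words."""
    for word in words[i + 1:]:
        if gpos(word) == "content":
            return word
    return "0"  # assumption: placeholder value when nothing is found


def p_content(words, i):
    """Previous content word before position i, skipping any function words."""
    for word in reversed(words[:i]):
        if gpos(word) == "content":
            return word
    return "0"  # assumption: placeholder value when nothing is found
```

In "the cat is on the mat", the next content word after "cat" is "mat", three function words further on, which is why these features cannot be expressed with the plain n. notation.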
`Word.word_numsyls'
     Returns number of syllables in a word.

`Word.word_start'
     The start time of the given word.

`Word.words_out'
     Number of words to end of this phrase.

File: festival.info,  Node: Variable list,  Next: Function list,  Prev: Feature functions,  Up: Top

Variable list
*************

This chapter contains a list of variables currently defined within Festival available for general use. This list is automatically generated from the documentation strings of the variables as they are defined within the system, so it has some chance of being up to date.

Cross references to sections elsewhere in the manual are given where appropriate.

`!'
     In interactive mode, this variable's value is the return value of the previously evaluated expression.

`*module-descriptions*'
     An association list recording the description objects for proclaimed modules.

`*ostype*'
     Contains the name of the operating system type that Festival is running on, e.g. SunOS5, FreeBSD, linux etc. The value is taken from the Makefile variable OSTYPE at compile time.

`*properties*'
     Array for holding symbol property lists.

`after_analysis_hooks'
     List of functions to be applied after analysis and before synthesis.

`after_synth_hooks'
     List of functions to be applied after all synthesis modules have been applied. This is primarily designed to allow waveform manipulation, particularly resampling and volume changes.

`auto-text-mode-alist'
     Following Emacs' auto-mode-alist, this provides a mechanism for auto selecting a TTS text mode based on the filename being analyzed. Its format is exactly the same as Emacs' in that it consists of an alist of dotted pairs of regular expression and text mode name.

`before_synth_hooks'
     List of functions to be run on synthesised utterances before synthesis starts.

`default-voice-priority-list'
     List of voice names. The first of them available becomes the default voice.

`default_access_strategy'
     How to access units from databases.
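The lookup that `auto-text-mode-alist' describes, an ordered list of (regular expression, text mode name) pairs consulted with the filename, can be sketched in Python as below. This is a model only: the two entries are hypothetical, taking the first matching pair and using a search (rather than a whole-string match) are assumptions, and Python's regular expression syntax differs in detail from the one Festival uses.

```python
import re

# Hypothetical entries: (regular expression, text mode name).
AUTO_TEXT_MODE_ALIST = [
    (r"\.email$", "email"),
    (r"\.sable$", "sable"),
]


def select_text_mode(filename, alist=AUTO_TEXT_MODE_ALIST):
    """Return the mode of the first pair whose regex matches filename, else None."""
    for pattern, mode in alist:
        if re.search(pattern, filename):
            return mode
    return None  # no entry matched; the caller would use its default mode
```

With these entries, a file named "inbox/msg1.email" selects the "email" mode, while "notes.txt" matches nothing.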
`default_after_analysis_hooks'
     The default list of functions to be run on all synthesized utterances after analysis but before synthesis.

`default_after_synth_hooks'
     The default list of functions to be run on all synthesized utterances after Wave_Synth. This will normally be nil, but if for some reason you need to change the gain or rescale *all* waveforms you could set the function here, in your siteinit.scm.

`default_before_synth_hooks'
     The default list of functions to be run on all synthesized utterances before synthesis starts.

`diphone_module_hooks'
     A function or list of functions that will be applied to the utterance at the start of the diphone module. It can be used to map segment names to those that will be used by the diphone database itself. Typical use specifies _ and $ for consonant clusters and syllable boundaries, mapping to dark ll's etc. Reduction and tap type phenomena should probably be done by post lexical rules, though the distinction is not a clear one.

`duffint_params'
     Default parameters for Default (duff) intonation target generation. This is an assoc list of parameters. Two parameters are supported: start specifies the start F0 in Hertz for an utterance, and end specifies the end F0.

`editline_histsize'
     The number of lines to be saved in the user's history file when a Festival session ends. The histfile is ".festival_history" in the user's home directory. Note this value is only checked when the command interpreter is started, hence this should be set in a user's ".festivalrc" or system init file. Resetting it at the command interpreter will have no effect.

`editline_no_echo'
     When running under Emacs as an inferior process, we don't want to echo the content of the line, only the prompt.

`english_homographs'
     A list of tokens that are dealt with by a homograph disambiguation tree in english_token_pos_cart_trees.

`english_phr_break_params'
     Parameters for English phrase break statistical model.

`eou_tree'
     End of utterance tree.
     A decision tree used to determine if the given token marks the end of an utterance. It may look one token ahead to do this. [*note Utterance chunking::.]

`etc-path'
     A list of directories where binaries specific to Festival may be located. This variable is automatically set to LIBDIR/etc/OSTYPE/ and that path is added to the end of the UNIX PATH environment variable.

`festival_version'
     A string containing the current version number of the system.

`festival_version_number'
     A list of major, minor and subminor version numbers of the current system, e.g. (1 0 12).

`FP_duration'
     In using Fixed_Prosody as used in Phones type utterances and hence SayPhones, this is the fixed value in ms for phone durations.

`FP_F0'
     In using Fixed_Prosody as used in Phones type utterances and hence SayPhones, this is the value in Hertz for the monotone F0.

`guess_pos'
     An assoc-list of simple part of speech tag to list of words in that class. This basically only contains closed class words; all other words may be assumed to be content words. This was built from information in the f2b database and is used by the ffeature gpos.

`home-directory'
     Place looked at for .festivalrc etc.

`hush_startup'
     If set to non-nil, the copyright banner is not displayed at start up.

`int_tilt_params'
     Parameters for tilt intonation model.

`kal_diphone_dir'
     The default directory for the kal diphone database.

`lexdir'
     The directory where the lexicon(s) are, by default.

`libdir'
     The pathname of the run-time library directory. Note resetting is almost definitely not what you want to do. This value is automatically set at start up from the value specified at compile-time or the value specified with -libdir on the command line. A number of other variables depend on this value.

`load-path'
     A list of directories containing .scm files. Used for various functions such as load_library and require. Follows the same use as EMACS.
     By default it is set up to the compile-time library directory but may be changed by the user at run time, by adding a user's own library directory or even replacing all of the standard library. [*note Site initialization::.]

`manual-browser'
     The Unix program name of your Netscape Navigator browser. [*note Getting some help::.]

`manual-url'
     The default URL for the Festival Manual in html format. You may reset this to a file://.../... type URL on your local machine. [*note Getting some help::.]

`mbrola_database'
     The name of the MBROLA database to use during MBROLA Synthesis.

`mbrola_progname'
     The program name for mbrola.

`Param'
     A feature set for arbitrary parameters for modules.

`pbreak_ngram_dir'
     The directory containing the ngram models for predicting phrase breaks. By default this is the standard library directory.

`phr_break_params'
     Parameters for phrase break statistical model. This is typically set by a voice selection function to the parameters for a particular model.

`pos_map'
     A reverse assoc list of predicted pos tags to some other tag set. Note using this changes the pos tag, losing the actual predicted value. Rather than map here you may find it more appropriate to map tag sets locally in the modules that use them (e.g. phrasing and lexicons).

`pos_model_dir'
     The directory containing the various models for the POS module. By default this is the same directory as lexdir. The directory should contain two models: a part of speech lexicon with reverse log probabilities and an ngram model for the same part of speech tag set.

`pos_ngram_name'
     The name of a loaded ngram containing the a posteriori ngram model for predicting part of speech. The a priori model is held as a lexicon called poslex.

`pos_p_start_tag'
     This variable's value is the tag most likely to appear before the start of a sentence. It is used when looking for pos context before an utterance. Typically it should be some type of punctuation tag.
`pos_pp_start_tag'
     This variable's value is the tag most likely to appear before pos_p_start_tag and any position preceding that. It is typically some type of noun tag. This is used to provide pos context for early words in an utterance.

`pos_supported'
     If set to non-nil use part of speech prediction, if nil just get pos information from the lexicon.

`postlex_mrpa_r_cart_tree'
     For removing final R when not between vowels.

`postlex_rules_hooks'
     A function or list of functions which encode post lexical rules. This will be voice specific, though some rules will be shared across languages.

`postlex_vowel_reduce_cart_tree'
     CART tree for vowel reduction.

`postlex_vowel_reduce_cart_tree_hand'
     A CART tree for vowel reduction. This is hand-written.

`postlex_vowel_reduce_table'
     Mapping of vowels to their reduced form. This is an assoc list of phoneset name to an assoc list of full vowel to reduced form.

`provided'
     List of file names (omitting .scm) that have been provided. This list is checked by the require function to find out if a file needs to be loaded. If that file is already in this list it is not loaded. Typically a file will have (provide 'MYNAME) at its end so that a call to (require 'MYNAME) will only load MYNAME.scm once.

`server_access_list'
     If non-nil this is the exhaustive list of machines and domains from which clients may access the server. This is a list of REGEXs that the client host must match. Remember to add the backslashes before the dots. [*note Server/client API::.]

`server_deny_list'
     If non-nil this is a list of machines which are to be denied access to the server absolutely, irrespective of any other control features. The list is a list of REGEXs that are used to match the client hostname. This list is checked first, then server_access_list, then passwd. [*note Server/client API::.]

`server_log_file'
     If set to t server log information is printed to standard output of the server process. If set to nil no output is given.
     If set to anything else the value is used as the name of the file to which server log information is appended. Note this value is checked at server start time; there is no way a client may change this. [*note Server/client API::.]

`server_max_clients'
     In server mode, the maximum number of clients supported at any one time. When more than this number of clients attach simultaneously, the last ones are denied access. Default value is 10. [*note Server/client API::.]

`server_passwd'
     If non-nil clients must send this passwd to the server followed by a newline before they can get a connection. It would be normal to set this for the particular server task. [*note Server/client API::.]

`server_port'
     In server mode, the inet port number on which the server will wait for connections. The default value is 1314. [*note Server/client API::.]

`sgml_parse_progname'
     The name of the program to use to parse SGML files. Typically this is nsgml-1.0 from the sp SGML package. [*note XML/SGML requirements::.]

`sonority_glides'
     List of glides (only good w/ radio_speech)

`sonority_liq'
     List of liquids (only good w/ radio_speech)

`sonority_nas'
     List of nasals (only good w/ radio_speech)

`sonority_v_obst'
     List of voiced obstruents for use in sonority scaling (only good w/ radio_speech)

`sonority_vless_obst'
     List of voiceless obstruents for use in sonority scaling (only good w/ radio_speech)

`SynthTypes'
     List of synthesis types and functions used by the utt.synth function to call appropriate methods for wave synthesis.

`system-voice-path'
     Additional directory not near the load path where voices can be found; this can be redefined in lib/sitevars.scm if desired.

`tilt_accent_list'
     List of events containing accents in tilt model.

`tilt_boundary_list'
     List of events containing boundaries in tilt model.

`tobi_support_yn_questions'
     If set, a crude final rise will be added to utterances that are judged to be yes/no questions, namely those ending in a ? and not starting with a wh- word.
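The order of host checks described under `server_deny_list' and `server_access_list' above (deny list first, then the access list) can be modelled with the Python sketch below. It is not the server's implementation: the function name and host names are hypothetical, whole-string matching of each regular expression is an assumption, an empty or missing list stands in for nil, and the later `server_passwd' check is not modelled.

```python
import re


def host_allowed(host, deny_list=None, access_list=None):
    """Model of the documented host checks for an incoming client."""
    # The deny list is checked first and overrides everything else.
    if deny_list and any(re.fullmatch(rx, host) for rx in deny_list):
        return False
    # A non-nil access list is exhaustive: the host must match an entry.
    if access_list:
        return any(re.fullmatch(rx, host) for rx in access_list)
    # No access list set: any host that was not denied may connect.
    return True
```

Note the backslashes before the dots in patterns such as r".*\.lab\.example": without them a dot would match any character.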
`token.letter_pos'
     The part of speech tag (valid for your part of speech tagger) for
     individual letters. When the tokenizer decides to pronounce a
     token as a list of letters this tag is added to each letter in
     the list. Note this should be from the part of speech set used
     in your tagger, which may not be the same one that appears in the
     actual lexical entry (if you map them afterwards). This
     specifically allows "a" to come out as ae rather than @.

`token.prepunctuation'
     A string of characters which are to be treated as preceding
     punctuation when tokenizing text. Prepunctuation symbols will be
     removed from the text of the token and made available through the
     "prepunctuation" feature. [*note Tokenizing::.]

`token.punctuation'
     A string of characters which are to be treated as punctuation
     when tokenizing text. Punctuation symbols will be removed from
     the text of the token and made available through the
     "punctuation" feature. [*note Tokenizing::.]

`token.singlecharsymbols'
     Characters which always have to be split off as separate tokens.
     This would be unusual in standard text, but is useful in parsing
     some types of file. [*note Tokenizing::.]

`token.unknown_word_name'
     When all else fails and a pronunciation for a word or character
     can't be found, this word will be said instead. If you make this
     "" then the unknown word will simply be omitted. This will only
     really be called when there is a bug in the lexicon and
     characters are missing from the lexicon. Note this word should
     be in the lexicon.

`token.whitespace'
     A string of characters which are to be treated as whitespace when
     tokenizing text. Whitespace is treated as a separator and
     removed from the text of a token and made available through the
     "whitespace" feature. [*note Tokenizing::.]

`token_most_common'
     A list of (English) words which were found to be most common in a
     text database and are used as discriminators in token analysis.

`token_pos_cart_trees'
     This is a list of pairs of regex plus CART tree. Tokens that
     match the regex will have the CART tree applied, setting the
     result as the token_pos feature on the token. The list is
     checked in order and only the first match will be applied.

`tts_hooks'
     Function or list of functions to be called during text to speech.
     The function tts_file chunks data into Utterances of type Token
     and applies this hook to the utterance. This typically contains
     the utt.synth function and utt.play. [*note TTS::.]

`tts_text_modes'
     An a-list of text modes data for file type specific tts
     functions. See the manual for an example. [*note Text modes::.]

`UttTypes'
     List of types and functions used by the utt.synth function to
     call appropriate methods.

`var-docstrings'
     An assoc-list of variable names and their documentation strings.

`voice-location-trace'
     Set to t to print voice locations as they are found.

`voice-locations'
     Association list recording where voices were found.

`voice-path'
     List of places to look for voices. If not set, it is initialised
     from load-path by appending "voices/" to each directory with
     system-voice-path appended.

`voice_default'
     A variable whose value is a function name that is called on start
     up to set the default voice. [*note Site initialization::.]

`Internal variable containing list of voice descriptions as'
     described by proclaim_voice.

`xml_dtd_dir'
     The directory holding standard DTDs for the XML parser.

`xxml_elements'
     List of Scheme actions to perform on finding xxML tags.

`xxml_hooks'
     Function or list of functions to be applied to an utterance when
     parsed with xxML, before tts_hooks.

`xxml_token_hooks'
     Functions to apply to each token.

`xxml_word_features'
     An assoc list of features to be added to the current word when in
     xxml parse mode.
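
As a small illustration of the tts_hooks entry above, the sketch below sets the hook list to the two functions that entry says it typically contains. It is only an illustration of the documented form (a function or list of functions). Where such a setting would be placed, for example in a site or user initialization file, is an assumption and not something stated in the entry.

```scheme
;; Apply synthesis and then playback to each utterance that tts_file
;; builds.  Both functions are the ones named in the tts_hooks entry.
(set! tts_hooks (list utt.synth utt.play))
```

The hooks are listed in the order synthesis, then playback, following the order in which the tts_hooks entry names them.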