The aim of this project was to realize a polyglot speech synthesis system for four languages (German, French, Italian, and English). A large number of multilingual speech synthesis systems already exist; they can produce speech in different languages, but are generally not flexible enough to switch languages within a single sentence. Switching languages usually requires changing the application, because the underlying concepts and structures differ considerably, having been developed by different research groups. Moreover, each language is usually synthesized with a different voice.
In contrast to multilingual synthesis, our polyglot synthesis should be able to synthesize all four languages with the same voice and within the same system, so that stretches of a foreign language can be synthesized within sentences of a different base language. Such mixed-language or polyglot synthesis is needed, for example, to synthesize French or English proper names within a German utterance in applications such as a reverse telephone directory service for Switzerland.
During the first phase of the project (1997/98), an inventory of synthesis units for the four languages was created. To this end, a quadrilingual speaker with excellent phonetic skills had to be found to read the carrier words from which the natural-speech units (diphones and possibly triphones) were extracted. The carrier material for all four languages was recorded in a professional audio studio, and the tools for the automatic extraction of the synthesis units were completed.
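As a rough illustration of what a diphone inventory has to cover, the following Python sketch lists the diphones (transitions from one phone to the next) needed for a given phone sequence. It is not the project's extraction tooling: the function name `to_diphones`, the boundary symbol `_`, and the phone symbols in the example are assumptions made purely for illustration.

```python
def to_diphones(phones, boundary="_"):
    """Return the diphone sequence that covers a list of phone symbols.

    A diphone spans the transition between two adjacent phones. The
    sequence is padded with a boundary (silence) symbol on both sides,
    so n phones yield n + 1 diphones.
    """
    padded = [boundary] + list(phones) + [boundary]
    # Pair every symbol with its successor.
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


# Hypothetical phone symbols, for illustration only.
print(to_diphones(["h", "a", "l", "o"]))
```

For a polyglot inventory, the set of such transitions would also have to include pairs whose two phones come from different languages, since a language switch can occur inside a sentence.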
In the second phase, which ended in mid-2000, the concept of the monolingual German TTS system SVOX was extended to support mixed-language processing in every part of the system.
For further information about POSSY, please refer to [HPT98], [Hub98b], or [TH99].
Supported by: The project was fully financed by Swisscom.
In collaboration with: TTS99 was a joint project of Swisscom AG, University of Geneva (LATL), and ETH Zürich.