Catalog Reference : ELRA-E0023
EvaSy Evaluation Package
The EvaSy Evaluation Package was produced within the French national project EvaSy (Evaluation of speech synthesis systems), as part of the Technolangue programme funded by the French Ministry of Research and New Technologies (MRNT). The EvaSy project carried out a campaign for the evaluation of speech synthesis systems using French text data. It extends the only previous campaign carried out for French in this field, which took place within the AUPELF campaigns (Actions de recherche Concertées, 1996-1999).
This package includes the material used or produced during the EvaSy evaluation campaign: resources, protocols, scoring tools, results of the campaign, etc. The aim of these evaluation packages is to enable external players to evaluate their own systems and compare their results with those obtained during the campaign itself.
The campaign is divided into three actions:
1) Evaluation of grapheme-to-phoneme conversion: evaluating the capacity of speech synthesis systems to phonetise text data.
2) Evaluation of prosody: evaluating the capacity of speech synthesis systems to predict prosody (duration and fundamental frequency of phonemes) from the text itself.
3) Global evaluation of the quality of speech synthesis systems:
- ACR tests (Absolute Category Rating): evaluating the overall quality of speech synthesis voices by asking a number of subjects to rate several general characteristics of each voice, such as its naturalness, fluency and intelligibility.
- SUS tests (Semantically Unpredictable Sentences): evaluating the intelligibility of the speech synthesis voice using sentences that are syntactically correct but semantically unpredictable (they have no meaning).
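As an illustration of how ACR ratings might be summarised, the sketch below averages listener scores per system and per rated characteristic. The 1-to-5 scale, the tuple layout and the function name `mean_scores` are assumptions made for this example; the scoring protocol actually used in the campaign is the one described in the package itself.

```python
from statistics import mean

def mean_scores(ratings):
    """Average the ratings for each (system, characteristic) pair.

    `ratings` is an iterable of (system, characteristic, score) tuples,
    with scores assumed to lie on a 1-to-5 category scale (hypothetical layout).
    """
    grouped = {}
    for system, characteristic, score in ratings:
        grouped.setdefault((system, characteristic), []).append(score)
    return {key: mean(scores) for key, scores in grouped.items()}

# Hypothetical ratings from two listeners for one system.
example = [
    ("system_A", "naturalness", 4),
    ("system_A", "naturalness", 2),
    ("system_A", "intelligibility", 5),
    ("system_A", "intelligibility", 4),
]
summary = mean_scores(example)
```

Here `summary[("system_A", "naturalness")]` is the mean of the two naturalness ratings.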
The EvaSy evaluation package contains the following data and tools:
1) For the evaluation of the grapheme-to-phoneme conversion module:
- About 8,000 proper names (4,115 first name-surname pairs) extracted from the Le Monde newspaper, 1992–2000 (over 200 million words), manually phonetised with variants and annotated with linguistic tags. The reference phonetisation was checked and corrected after the adjudication phase.
- A corpus of emails (about 115,000 words), anonymised, segmented by paragraph and phonetised in SAMPA. The reference phonetisation was not checked. These data were not evaluated within EvaSy.
- The SCLITE tool (developed by NIST), used to compare the reference phonetisation with the one from the evaluated system and to count the phoneme errors (inserted, deleted or substituted phonemes).
- The Post-align tool, used to align the reference phonetisation with the one from the evaluated system on a word-by-word basis.
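The kind of phoneme-level comparison described above can be sketched with a minimum-edit-distance alignment. This is not SCLITE itself and does not reproduce its options or output format. It is only a minimal illustration, with a hypothetical function name, of how substituted, deleted and inserted phonemes can be counted between a reference and a system phonetisation.

```python
def count_phoneme_errors(reference, hypothesis):
    """Return (substitutions, deletions, insertions) for a minimum-edit alignment.

    Both arguments are lists of phoneme symbols (for example SAMPA strings).
    """
    n, m = len(reference), len(hypothesis)
    # cost[i][j] = (total, subs, dels, ins) for reference[:i] vs hypothesis[:j]
    cost = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        cost[i][0] = (i, 0, i, 0)      # only deletions
    for j in range(1, m + 1):
        cost[0][j] = (j, 0, 0, j)      # only insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, a = cost[i - 1][j - 1]
            if reference[i - 1] == hypothesis[j - 1]:
                diag = (t, s, d, a)            # match, no extra cost
            else:
                diag = (t + 1, s + 1, d, a)    # substitution
            t, s, d, a = cost[i - 1][j]
            up = (t + 1, s, d + 1, a)          # reference phoneme missing
            t, s, d, a = cost[i][j - 1]
            left = (t + 1, s, d, a + 1)        # extra phoneme in hypothesis
            cost[i][j] = min(diag, up, left)   # total error count compared first
    _, subs, dels, ins = cost[n][m]
    return subs, dels, ins

# "bonjour" in SAMPA, against a system output with one wrong and one missing phoneme.
ref = ["b", "o~", "Z", "u", "R"]
hyp = ["b", "o", "Z", "u"]
subs, dels, ins = count_phoneme_errors(ref, hyp)
error_rate = (subs + dels + ins) / len(ref)
```

Dividing the total number of errors by the number of reference phonemes gives a phoneme error rate, which is one common way to summarise such counts.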
2) For the evaluation of the prosodic module:
- Text data: 7 phonetically balanced sentences extracted from the BREF corpus (cf. ELRA-S0067), lasting from 4 to 11 seconds.
- Speech data: 7 sentences read by one speaker.
- The Mbroli tool, which converts *.pho prosodic files into *.wav speech files, together with the MBROLA fr1 diphone database.
- The Mbrolign tool, which aligns the phonemes with the signal, extracts the prosodic parameters of the signal and copies them into the MBROLA diphone database.
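For readers unfamiliar with *.pho files, the sketch below reads the layout commonly documented for MBROLA input: one phoneme per line, followed by its duration in milliseconds and optional pairs of (position in percent of the duration, fundamental frequency in Hz), with lines starting with ";" treated as comments. This layout is an assumption that should be checked against the files shipped in the package, and `parse_pho` is a hypothetical helper, not part of any of the tools listed above.

```python
def parse_pho(text):
    """Parse .pho-style text into (phoneme, duration_ms, pitch_targets) tuples.

    pitch_targets is a list of (position_percent, f0_hz) pairs; it may be empty.
    """
    segments = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        fields = line.split()
        phoneme = fields[0]
        duration_ms = float(fields[1])
        values = [float(v) for v in fields[2:]]
        # Remaining values are read pairwise: position (%), then F0 (Hz).
        targets = list(zip(values[0::2], values[1::2]))
        segments.append((phoneme, duration_ms, targets))
    return segments

# Hypothetical sample, not taken from the package.
sample = """; illustrative fragment
_ 100
b 60 0 110
o~ 120 50 125 100 115
"""
segments = parse_pho(sample)
total_duration_ms = sum(duration for _, duration, _ in segments)
```

With the durations and pitch targets in this form, predicted values from a system can be compared phoneme by phoneme with the reference values.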
3) For the global evaluation of the quality of speech synthesis systems:
a) For ACR tests (Absolute Category Rating):
- Text data: 40 passages of 5 sentences each, with a duration of 20 seconds, extracted from the EUROM1f French corpus (cf. ELRA-S0014-01).
- Speech data: the 40 passages read by one EUROM1 speaker.
b) For SUS tests (Semantically Unpredictable Sentences):
- Text data: 24 lists of 12 SUS sentences. Phonemes are also distributed by list.
- Speech data: 24 lists read by a professional speaker.
A description of the project (in French) is available at the following address:
Distribution medium :
Number of languages
Sampling rate : 18 files at 10 kHz, 453 files at 16 kHz, 38 files at 22.05 kHz
Source Channel :
Member Prices
Academic - Evaluation 150.00 EUR
Commercial - Evaluation 500.00 EUR
Non Member Prices
Academic - Evaluation 300.00 EUR
Commercial - Evaluation 1000.00 EUR
Copyright © 2008