Acoustic database for Polish unit selection speech synthesis – ELRA Catalogue

Last view: 2024-04-18

Last update: 2017-10-05

Acoustic database for Polish unit selection speech synthesis

ISLRN: 981-910-282-065-4

ID: ELRA-S0339

This database contains parliamentary statements and newspaper reviews read by a semi-professional male speaker. It consists of 2150 selected sentences, annotated and manually verified, including 100 rare phonemes embedded in words. Prompts vary in length from 2.3 to 13.4 seconds, with an average length of 6.3 seconds.

The recordings were made in an anechoic chamber with a single table-stand dynamic microphone (Rode NT1000), at a 48 kHz sampling frequency and 16-bit resolution. The total duration of the recordings is 3.45 hours.
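
As a rough guide to storage requirements, the uncompressed size of the 48 kHz recordings can be estimated from the figures above. This is only a sketch: it assumes single-channel (mono) linear PCM with no container overhead, which the catalogue entry does not state, and the function name is illustrative.

```python
def pcm_size_bytes(duration_hours, sample_rate_hz, bits_per_sample, channels=1):
    """Return the uncompressed linear-PCM payload size in bytes."""
    seconds = duration_hours * 3600
    bytes_per_sample = bits_per_sample // 8
    return round(seconds * sample_rate_hz * bytes_per_sample * channels)

# Figures from the catalogue entry; channels=1 (mono) is an assumption.
size_48k = pcm_size_bytes(3.45, 48_000, 16)
print(f"about {size_48k / 1e9:.2f} GB of raw audio at 48 kHz")
```

The actual package size may differ, for example because of file headers, annotation files, or the additional 16 kHz version.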

The signal was automatically aligned with the transcription and then manually corrected using the Praat speech analysis program. The database is phonetically annotated, with manual correction, and corresponds to a lexicon of 11761 words with phonetic transcriptions.

The package also includes an edited version of the speech database re-sampled at 16 kHz. In all of these files, the DC offset and the identified distortions that could affect the quality of speech synthesis were removed using a high-pass filter.
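
The entry does not say which high-pass filter was used. As an illustration only, a first-order DC-blocking filter, y[n] = x[n] - x[n-1] + R*y[n-1], is one common way to remove a constant offset from a signal. The coefficient R = 0.995, the function name, and the synthetic test signal below are all assumptions, not details taken from the database documentation.

```python
import math

def dc_block(samples, r=0.995):
    """First-order DC-blocking high-pass filter.

    Implements y[n] = x[n] - x[n-1] + r * y[n-1]. A value of r close
    to 1 keeps the cutoff frequency low, so only the offset and very
    slow drift are attenuated.
    """
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + r * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out

# Synthetic signal (not corpus data): a 200 Hz tone sampled at 16 kHz
# with a constant offset of 0.25 added.
rate = 16_000
signal = [0.25 + 0.5 * math.sin(2 * math.pi * 200 * n / rate) for n in range(rate)]
filtered = dc_block(signal)

# Compare the mean (DC component) over the last half second,
# after the filter's start-up transient has died away.
tail = rate // 2
mean_before = sum(signal[-tail:]) / tail
mean_after = sum(filtered[-tail:]) / tail
print(f"mean before: {mean_before:.4f}, mean after: {mean_after:.6f}")
```

A production pipeline would more likely use a designed filter from a signal-processing library; the point here is only that a high-pass stage drives the mean of the signal towards zero while leaving the tone in place.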

For a more detailed description, see Oliver, D. and Szklanny, K., “Creation and analysis of a Polish speech database for use in unit selection synthesis”, LREC 2006, Genoa, Italy: http://www.lrec-conf.org/proceedings/lrec2006/pdf/688_pdf.pdf

MEMBER	academic	commercial
Licence: Non Commercial Use - ELRA END USER	250.00 €	1000.00 €
Licence: Commercial Use - ELRA VAR	1000.00 €	1000.00 €

NON MEMBER	academic	commercial
Licence: Non Commercial Use - ELRA END USER	300.00 €	2000.00 €
Licence: Commercial Use - ELRA VAR	2000.00 €	2000.00 €

Distribution

Availability start date: 20/03/2012

Contact Person: Valérie Mapelli

audio

Monolingual audio corpus

Languages

Polish

Linguality

Linguality type: Monolingual

Size

3.45 Hours

Classification

Audio genre: Other

Audio Formats

Recording

Recording device type details: Table stand dynamic microphone

Source channel: Other

Metadata

Created: 05/12/2005

Metadata Language: French, English (fr, en)

Version

Version: 1.0

Last Updated: 20/03/2012

Usage

Actual Use - NLP Applications

Use specific to NLP: Speech Synthesis
