S0308 : Egyptian Arabic Speecon database The Egyptian Arabic Speecon database comprises recordings of 550 adult Egyptian speakers and 50 child Egyptian speakers, who uttered over 290 items and 210 items respectively (read and spontaneous).
|
T0374 : Terminology database of natural sciences This dictionary covers the three kingdoms: Animal, Vegetable, Mineral. It contains 50,000 species, many breeds and varieties, and numerous synonyms in French, English and Latin. Minerals are given with their chemical formulas. About 7,900 definitions in French are included, as well as linguistic variants.
|
W0053 : Catalan-Spanish Parallel Corpus This corpus contains more than 100 million words from 10 years of bilingual articles published in “El Periódico de Catalunya”. The data are aligned at sentence level and stored in text files, with one sentence per line. The data are provided in plain text, with no encoding whatsoever.
|
S0307 : BABEL Polish database The BABEL Polish Database is a speech database produced by a research consortium funded by the European Union under the COPERNICUS programme (COPERNICUS Project 1304). It consists of the basic "common" set, which contains the Many Talker Set (30 males, 30 females), the Few Talker Set (5 males, 5 females) and the Very Few Talker Set (1 male, 1 female).
|
S0305 : EPAC Corpus: orthographic transcriptions This corpus consists of approx. 100 hours of manual orthographic transcriptions, produced from 1,677 hours of untranscribed recordings from the ESTER Evaluation Campaign (Technolangue programme). It also includes automatic transcriptions of the full 1,677 hours.
|
| (last update: July 2010) |