17 Language Resources
- Arabic
- Chinese
- English
- French
- German
- Italian
- Japanese
- Modern Greek (1453-)
- Persian
- Russian
- Spanish; Castilian
ID: ELRA-E0018
ISLRN: 875-865-064-331-9
The ARCADE II Evaluation Package was produced within the French national project ARCADE II (Evaluation of parallel text alignment systems), as part of the Technolangue programme funded by the French Ministry of Research and New Technologies (MRNT). The ARCADE II project made it possible to carry out a cam...

MEMBER | academic | commercial |
---|---|---|
Licence: Evaluation Use - ELRA EVALUATION | 150.00 € | 500.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Evaluation Use - ELRA EVALUATION | 300.00 € | 1000.00 € |
- Bulgarian
- Czech
- Dutch; Flemish
- English
- Finnish
- French
- German
- Hungarian
- Italian
- Persian
- Portuguese
- Russian
- Spanish; Castilian
- Swedish
ID: ELRA-E0036
ISLRN: 378-279-085-589-0
The Cross-Language Evaluation Forum (CLEF) promotes R&D in multilingual information access (MLIA) by (i) developing an infrastructure for the testing, tuning and evaluation of information retrieval systems operating on European languages in both monolingual and cross-language contexts, and (ii) c...

MEMBER | academic | commercial |
---|---|---|
Licence: Evaluation Use - ELRA EVALUATION | 150.00 € | 500.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Evaluation Use - ELRA EVALUATION | 300.00 € | 1000.00 € |
Special offers are also available. Check here for details.
- English
- German
- Russian
ID: ELRA-E0037
ISLRN: 609-362-685-537-2
The Cross-Language Evaluation Forum (CLEF) promotes R&D in multilingual information access (MLIA) by (i) developing an infrastructure for the testing, tuning and evaluation of information retrieval systems operating on European languages in both monolingual and cross-language contexts, and (ii) c...

MEMBER | academic | commercial |
---|---|---|
Licence: Evaluation Use - ELRA EVALUATION | 150.00 € | 500.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Evaluation Use - ELRA EVALUATION | 300.00 € | 1000.00 € |
Special offers are also available. Check here for details.
- Arabic
- Chinese
- Croatian
- Czech
- Danish
- Dutch; Flemish
- English
- Finnish
- French
- German
- Hindi
- Italian
- Japanese
- Korean
- Modern Greek (1453-)
- Norwegian
- Persian
- Polish
- Portuguese
- Russian
- Spanish; Castilian
- Swedish
- Thai
- Turkish
- Vietnamese
ID: ELRA-S0383
ISLRN: 398-655-047-044-5
The Collins Multilingual database covers Real Life Daily vocabulary. It is composed of a multilingual lexicon in 32 languages (the WordBank, see ELRA-T0376) and a multilingual set of sentences in 28 languages (the PhraseBank, see ELRA-T0377). This version includes the audio files corresponding t...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 3360.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 4480.00 € |
- Arabic
- Chinese
- Croatian
- Czech
- Danish
- Dutch; Flemish
- English
- Finnish
- French
- German
- Italian
- Japanese
- Korean
- Modern Greek (1453-)
- Norwegian
- Polish
- Portuguese
- Russian
- Spanish; Castilian
- Swedish
- Thai
- Turkish
- Vietnamese
ID: ELRA-S0382
ISLRN: 309-438-781-042-2
The Collins Multilingual database covers Real Life Daily vocabulary. It is composed of a multilingual lexicon in 32 languages (the WordBank, see ELRA-T0376) and a multilingual set of sentences in 28 languages (the PhraseBank, see ELRA-T0377). This version includes the corresponding audio files c...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 3640.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5200.00 € |
- Albanian
- Bulgarian
- Chinese
- Czech
- Danish
- Dutch; Flemish
- English
- Estonian
- French
- German
- Italian
- Japanese
- Latin
- Lithuanian
- Malay (macrolanguage)
- Modern Greek (1453-)
- Norwegian
- Portuguese
- Russian
- Scottish Gaelic; Gaelic
- Serbian
- Spanish; Castilian
- Swedish
- Turkish
- Uzbek
ID: ELRA-W0004
ISLRN: 511-168-567-582-5
The European Corpus Initiative (ECI) was founded to oversee the acquisition and preparation of a large multilingual corpus, and supports existing and projected national and international efforts to carefully design, collect and publish large-scale multilingual written and spoken corpora. ECI has ...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 50.00 € | 50.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 50.00 € | 50.00 € |
- Arabic
- Bulgarian
- Chinese
- Croatian
- Czech
- French
- German
- Hausa
- Japanese
- Korean
- Polish
- Portuguese
- Russian
- Spanish; Castilian
- Swahili (macrolanguage)
- Swedish
- Tamil
- Thai
- Turkish
- Ukrainian
- Vietnamese
ID: ELRA-S0400
ISLRN: 331-592-378-424-7
The GlobalPhone 2000 Speaker Package contains transcribed read speech spoken by 2000 native speakers in 22 languages. The data are sampled from the GlobalPhone Speech and Text Data available in the ELRA Catalogue, i.e.: Arabic (ELRA-S0192), Bulgarian (ELRA-S0319), Chinese-Mandarin (ELRA-S0193), C...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 1200.00 € | 6000.00 € |
Licence: Commercial Use - ELRA VAR | 6000.00 € | 6000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 1400.00 € | 7200.00 € |
Licence: Commercial Use - ELRA VAR | 7200.00 € | 7200.00 € |
- Arabic
- Bulgarian
- Chinese
- Croatian
- Czech
- French
- German
- Hausa
- Japanese
- Korean
- Polish
- Portuguese
- Russian
- Spanish; Castilian
- Swahili (macrolanguage)
- Swedish
- Tamil
- Thai
- Turkish
- Ukrainian
- Vietnamese
ID: ELRA-S0399
ISLRN: 204-945-263-927-6
The GlobalPhone Multilingual Model Package contains about 22 hours of transcribed read speech spoken by native speakers in 22 languages. The data are sampled from the GlobalPhone Speech and Text Data available in the ELRA Catalogue, i.e.: Arabic (ELRA-S0192), Bulgarian (ELRA-S0319), Chinese-Manda...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 1200.00 € | 6000.00 € |
Licence: Commercial Use - ELRA VAR | 6000.00 € | 6000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 1400.00 € | 7200.00 € |
Licence: Commercial Use - ELRA VAR | 7200.00 € | 7200.00 € |
- Russian
ID: ELRA-S0202
ISLRN: 045-784-413-420-9
The GlobalPhone corpus developed in collaboration with the Karlsruhe Institute of Technology (KIT) was designed to provide read speech data for the development and evaluation of large continuous speech recognition systems in the most widespread languages of the world, and to provide a uniform, mu...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 600.00 € | 3000.00 € |
Licence: Commercial Use - ELRA VAR | 3000.00 € | 3000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 700.00 € | 3600.00 € |
Licence: Commercial Use - ELRA VAR | 3600.00 € | 3600.00 € |
Special offers are also available. Check here for details.
- Russian
ID: ELRA-W0080
ISLRN: 024-620-556-146-2
The NE3L project (Named Entities 3 Languages) consisted of annotating several corpora in different languages with named entities. The text-format data were extracted from newspapers and cover various topics. Three languages were annotated: Arabic, Chinese and Russian. For this project, 5...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |
- Arabic
- Chinese
- Danish
- Dutch; Flemish
- English
- Finnish
- French
- German
- Hebrew
- Italian
- Japanese
- Korean
- Modern Greek (1453-)
- Northern Sami
- Norwegian
- Polish
- Portuguese
- Russian
- Spanish; Castilian
- Swedish
- Turkish
ID: ELRA-W0336
ISLRN: 471-919-856-164-1
Parallel corpora for nearly 400 language pairs and numerous multilingual combinations, including 10 million bilingual segments and 90 million tokens in 20 languages: Arabic, Chinese (Simplified), Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Italian, Japanese, Korean, North Sami...

MEMBER | academic | commercial |
---|---|---|
Licence: Commercial Use - ELRA VAR | 0.10 € | 0.10 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Commercial Use - ELRA VAR | 0.11 € | 0.11 € |
Special offers are also available. Check here for details.
- Russian
ID: ELRA-S0050
ISLRN: 428-147-317-182-1
The STC Russian speech database was recorded in 1996-1998. The main purpose of the database is to investigate individual speaker variability and to validate speaker recognition algorithms. The database was recorded through a 16-bit Vibra-16 Creative Labs sound card with an 11,025 Hz sampling rate...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 400.00 € | 2000.00 € |
Licence: Commercial Use - ELRA VAR | 2000.00 € | 2000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 800.00 € | 4000.00 € |
Licence: Commercial Use - ELRA VAR | 4000.00 € | 4000.00 € |
- Russian
ID: ELRA-S0443
ISLRN: 605-103-836-140-2
1960 Russian native speakers participated in the recording with authentic accent. The recorded script is designed by linguists and covers a wide range of topics including generic, interactive, in-vehicle and home. The text is manually proofread with high accuracy. It matches with mainstream Androi...

MEMBER | academic | commercial |
---|---|---|
Licence: Commercial Use - ELRA VAR | 180861.00 € | 180861.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Commercial Use - ELRA VAR | 180861.00 € | 180861.00 € |
Special offers are also available. Check here for details.
- Russian
ID: ELRA-S0228-95
ISLRN: 999-440-415-744-8
This corpus comprises 19,164 entries uttered by 30 speakers (16 males and 14 females), recorded over 2 channels (desktop in quiet office). Speech samples are stored as a sequence of 16-bit 44.1kHz for a total of 4.15 hours of speech per channel.

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 6000.00 € | 6000.00 € |
Licence: Commercial Use - ELRA VAR | 6000.00 € | 6000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 6000.00 € | 6000.00 € |
Licence: Commercial Use - ELRA VAR | 6000.00 € | 6000.00 € |
- Russian
ID: ELRA-S0228-84
ISLRN: 206-347-009-523-5
This corpus comprises 59,968 entries uttered by 50 speakers (25 males and 25 females), recorded over 4 channels (desktop in quiet office). Speech samples are stored as a sequence of 16-bit 44.1kHz for a total of 25.85 hours of speech per channel.

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |
- Russian
ID: ELRA-S0228-91
ISLRN: 014-637-825-596-3
This corpus comprises 99,940 entries uttered by 50 speakers (25 males and 25 females), recorded over 4 channels (desktop in quiet office). Speech samples are stored as a sequence of 16-bit 44.1kHz for a total of 32.13 hours of speech per channel.

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |
- Dutch; Flemish
- English
- Finnish
- French
- German
- Italian
- Portuguese
- Russian
- Spanish; Castilian
- Swedish
ID: ELRA-E0008
ISLRN: 317-005-302-361-6
The CLEF Test Suite contains the data used for the main tracks of the CLEF campaigns carried out from 2000 to 2003: Multilingual text retrieval, Bilingual text retrieval, Monolingual text retrieval, and Domain-specific text retrieval. The CLEF Test Suite is composed of: • The multilingual docum...

MEMBER | academic | commercial |
---|---|---|
Licence: Evaluation Use - ELRA EVALUATION | 150.00 € | 500.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Evaluation Use - ELRA EVALUATION | 300.00 € | 1000.00 € |
Special offers are also available. Check here for details.