1681 Language Resources (Page 67 of 85)


- Pushto; Pashto
ID: ELRA-S0381
ISLRN: 918-508-885-913-7

This corpus contains transcribed broadcast news recordings in Pashto. Recordings are collected from 5 sources: Ashna TV, Azadi Radio, Deewa Radio, Mashaal Radio and Shamshad TV. The corpus contains 108 hours of recordings covering more than 1,000 speakers. Transcriptions are provided together ...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 2000.00 € | 20000.00 € |
Licence: Commercial Use - ELRA VAR | 20000.00 € | 20000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 3500.00 € | 28000.00 € |
Licence: Commercial Use - ELRA VAR | 28000.00 € | 28000.00 € |


- English
- Pushto; Pashto
ID: ELRA-W0097
ISLRN: 612-936-517-010-2

This is a parallel corpus, which contains 10,000 Pashto words translated into English by two different translators. The source texts have been collected from the following news websites: Azadiradio, Mashaal and Voice of America Pashto. The content has also been translated into French (see ELRA-W...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 350.00 € | 1000.00 € |
Licence: Commercial Use - ELRA VAR | 1000.00 € | 1000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 500.00 € | 2000.00 € |
Licence: Commercial Use - ELRA VAR | 2000.00 € | 2000.00 € |


- English
- Pushto; Pashto
ID: ELRA-W0095
ISLRN: 006-102-605-738-4

This is a parallel corpus, which contains 10,000 Pashto words translated into English. The source texts come from 3 broadcast news transcriptions of the TRAD Pashto Broadcast News Speech Corpus (ELRA-S0381). These texts are VOA Ashna TV programs recorded on 15/01/2011, 18/01/2011 and 19/01/2011. ...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 350.00 € | 1000.00 € |
Licence: Commercial Use - ELRA VAR | 1000.00 € | 1000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 500.00 € | 2000.00 € |
Licence: Commercial Use - ELRA VAR | 2000.00 € | 2000.00 € |


- French
- Pushto; Pashto
ID: ELRA-W0096
ISLRN: 649-628-149-051-7

This is a parallel corpus, which contains 10,000 Pashto words translated into French by two different translators. The source texts have been collected from the following news websites: Azadiradio, Mashaal and Voice of America Pashto. The content has also been translated into English (see ELRA-W...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 350.00 € | 1000.00 € |
Licence: Commercial Use - ELRA VAR | 1000.00 € | 1000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 500.00 € | 2000.00 € |
Licence: Commercial Use - ELRA VAR | 2000.00 € | 2000.00 € |


- French
- Pushto; Pashto
ID: ELRA-W0094
ISLRN: 547-897-479-723-3

This is a parallel corpus, which contains 10,000 Pashto words translated into French by two different translators. The source texts come from 3 broadcast news transcriptions of the TRAD Pashto Broadcast News Speech Corpus (ELRA-S0381). These texts are VOA Ashna TV programs recorded on 15/01/2011,...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 350.00 € | 1000.00 € |
Licence: Commercial Use - ELRA VAR | 1000.00 € | 1000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 500.00 € | 2000.00 € |
Licence: Commercial Use - ELRA VAR | 2000.00 € | 2000.00 € |


- French
- Pushto; Pashto
ID: ELRA-W0093
ISLRN: 802-643-297-429-4

The corpus consists of the transcription of 106 hours of recordings in Pashto translated into French. The transcriptions are extracted from the TRAD Pashto Broadcast News Speech Corpus (ELRA-S0381). It contains about 832,000 source words and 747,000 target words. No audio file is provided. Pasht...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 3000.00 € | 10000.00 € |
Licence: Commercial Use - ELRA VAR | 10000.00 € | 10000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 4000.00 € | 18000.00 € |
Licence: Commercial Use - ELRA VAR | 18000.00 € | 18000.00 € |


- Pushto; Pashto
ID: ELRA-W0092
ISLRN: 394-903-293-388-0

This is a monolingual text corpus in Pashto. The corpus contains about 112,000,000 tokens collected from 46 different blogs and websites. Sources that were either identified and negotiated or freely available were crawled in 2012, cleaned and XML-formatted. Pashto is an Indo-Iranian language spoken by th...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 1200.00 € | 3500.00 € |
Licence: Commercial Use - ELRA VAR | 3500.00 € | 3500.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 2000.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |


- Arabic
- English
ID: ELRA-W0126
ISLRN: 986-364-744-303-9

The dataset is composed of two distinct resources: 1) A collection of mixed English and Arabizi text intended to train and test a system for the automatic detection of code-switching in mixed English and Arabizi texts. The training part of the corpus contains: 522 tweets composed of 5,207 token...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 500.00 € |
Licence: Commercial Use - ELRA VAR | 500.00 € | 500.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 650.00 € |
Licence: Commercial Use - ELRA VAR | 650.00 € | 650.00 € |


- English
ID: ELRA-S0120
ISLRN: 502-719-830-448-5

LDC reference: https://catalog.ldc.upenn.edu/LDC2002T03

The Translanguage English Database (TED) Transcripts corpus contains transcriptions of thirty-nine of the 188 speeches of the TED Corpus made at Eurospeech'93 in Berlin. The thirty-nine transcripts in this publication are in Universal Tra...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 € |


- English
- Norwegian
ID: ELRA-W0156
ISLRN: 909-695-133-060-3

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Translation memories containing translations of EU legis...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution - CC-BY-4.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution - CC-BY-4.0 | 0.00 € | 0.00 € |


- English
- Swedish
ID: ELRA-W0236
ISLRN: 709-518-556-855-4

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Translation memory from the Swedish National Audit Office.

MEMBER | academic | commercial |
---|---|---|
Licence: Other - Public Domain | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Other - Public Domain | 0.00 € | 0.00 € |


- English
- Lithuanian
ID: ELRA-W0165
ISLRN: 691-158-541-313-8

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Translation Memories of Lithuanian legislation from Seim...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution - CC-BY-4.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution - CC-BY-4.0 | 0.00 € | 0.00 € |


- English
- French
- Modern Greek (1453-)
ID: ELRA-W0307
ISLRN: 954-287-236-137-4

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Trilingual (Greek-English-French) documents - standard f...

MEMBER | academic | commercial |
---|---|---|
Licence: Other - Public Domain | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Other - Public Domain | 0.00 € | 0.00 € |


- English
- French
- German
ID: ELRA-W0013
ISLRN: 717-350-913-018-8

The TSNLP project (LRE 62-089) has produced a database of test suites for English, French and German containing over 4,000 test items (sentences or sentence fragments) per language, which have been constructed for evaluating natural language processing systems, but which may also be useful for ...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 100.00 € |
Licence: Commercial Use - ELRA VAR | 100.00 € | 100.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 100.00 € |
Licence: Commercial Use - ELRA VAR | 100.00 € | 100.00 € |



- English
ID: ELRA-W0048
ISLRN: 799-660-957-954-5

TUNA (Towards a UNified Algorithm for the generation of referring expressions) is a research project funded by the UK's Engineering and Physical Sciences Research Council (EPSRC). The TUNA Corpus of Referring Expressions was built with contributions from 50 native or fluent speakers of Engl...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 45.00 € |
Licence: Commercial Use - ELRA VAR | 45.00 € | 45.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 45.00 € |
Licence: Commercial Use - ELRA VAR | 45.00 € | 45.00 € |


- Turkish
ID: ELRA-S0121
ISLRN: 192-049-804-522-1

This Turkish speech database was produced by the department of Théorie des Circuits et Traitement de Signal at the Faculté Polytechnique de Mons. The corpus was designed to provide read speech data for speech recognition purposes. The database contains 14 hours of speech (1618 words) from 43 Turk...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 400.00 € | 3000.00 € |
Licence: Commercial Use - ELRA VAR | 3000.00 € | 3000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 800.00 € | 6000.00 € |
Licence: Commercial Use - ELRA VAR | 6000.00 € | 6000.00 € |


- Turkish
ID: ELRA-S0178
ISLRN: 539-782-381-710-6

The Turkish Speecon database is divided into 2 sets: 1) The first set comprises the recordings of 550 adult Turkish speakers (280 males, 270 females), recorded over 4 microphone channels in 4 recording environments (office, entertainment, car, public place). 2) The second set comprises the reco...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 50000.00 € | 67000.00 € |
Licence: Commercial Use - ELRA VAR | 67000.00 € | 67000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 60000.00 € | 75000.00 € |
Licence: Commercial Use - ELRA VAR | 75000.00 € | 75000.00 € |


- French
ID: ELRA-S0088
ISLRN: 167-303-567-257-8

The Twin database named TWINDB1 includes recordings of 45 French speakers, consisting of 9 pairs of identical twins (8 males and 10 females) with similar voices, and 27 other speakers (13 males and 14 females) including 4 non-twin siblings. Each twin or sibling spoke for a total of 24 to 30 minu...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 200.00 € | 400.00 € |
Licence: Commercial Use - ELRA VAR | 400.00 € | 400.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 400.00 € | 800.00 € |
Licence: Commercial Use - ELRA VAR | 800.00 € | 800.00 € |


- Arabic
ID: ELRA-S0228-130
ISLRN: 737-957-734-087-9

This corpus was recorded in a quiet office/home environment over 2 channels and collected from a total of 168 speakers, including 94 males and 74 females, all of whom have been carefully screened to ensure their standard and clear pronunciation. The audio scripts cover information such as news and...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 36000.00 € | 36000.00 € |
Licence: Commercial Use - ELRA VAR | 36000.00 € | 36000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 36000.00 € | 36000.00 € |
Licence: Commercial Use - ELRA VAR | 36000.00 € | 36000.00 € |


- English
ID: ELRA-S0215
ISLRN: 773-101-261-598-6

The UK English Speecon database is divided into 2 sets: 1) The first set comprises the recordings of 606 adult UK English speakers (325 males, 281 females), recorded over 4 microphone channels in 4 recording environments (office, entertainment, car, public place), and consisting of about 195 ho...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 50000.00 € | 67000.00 € |
Licence: Commercial Use - ELRA VAR | 67000.00 € | 67000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 60000.00 € | 75000.00 € |
Licence: Commercial Use - ELRA VAR | 75000.00 € | 75000.00 € |