1680 Language Resources (Page 55 of 84)


- Portuguese
ID: ELRA-S0228-83
ISLRN: 044-289-806-584-3
This corpus comprises 49,988 entries uttered by 50 speakers (26 males and 24 females), recorded over 2 channels (desktop, in a quiet office). Speech samples are stored as 16-bit, 48 kHz audio, for a total of 26.41 hours of speech per channel.

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5400.00 € | 5400.00 € |
Licence: Commercial Use - ELRA VAR | 5400.00 € | 5400.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5400.00 € | 5400.00 € |
Licence: Commercial Use - ELRA VAR | 5400.00 € | 5400.00 € |


- Portuguese
ID: ELRA-S0228-122
ISLRN: 733-763-220-983-6
This corpus was recorded in a quiet office environment over 2 channels and collected from a total of 200 speakers, including 102 males and 98 females, all of whom have been carefully screened to ensure their standard and clear pronunciation. The audio scripts cover information such as keywords. S...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 21600.00 € | 21600.00 € |
Licence: Commercial Use - ELRA VAR | 21600.00 € | 21600.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 21600.00 € | 21600.00 € |
Licence: Commercial Use - ELRA VAR | 21600.00 € | 21600.00 € |


- Portuguese
ID: ELRA-S0180
ISLRN: 824-839-200-501-4
The Portuguese Speecon database is divided into 2 sets: 1) The first set comprises the recordings of 553 adult Portuguese speakers (266 males, 287 females), recorded over 4 microphone channels in 4 recording environments (office, entertainment, car, public place). 2) The second set comprises th...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 50000.00 € | 67000.00 € |
Licence: Commercial Use - ELRA VAR | 67000.00 € | 67000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 60000.00 € | 75000.00 € |
Licence: Commercial Use - ELRA VAR | 75000.00 € | 75000.00 € |


- Swedish
ID: ELRA-W0010
ISLRN: 860-303-374-818-4
Språkdata has made available the first of its many Swedish corpora, PRESS 65. It consists of one million running words taken from Swedish newspapers from the year 1965. It has been categorised according to text type and is annotated down to the sentence level.

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 12000.00 € | 12000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 20000.00 € | 20000.00 € |



- English
ID: ELRA-S0091
ISLRN: 095-481-429-979-3
The Pronunciation lexicon of British place names, surnames and first names was produced by the University of Poitiers (France) with funding from ELRA in the framework of the European Commission project LRsP&P (Language Resources Production & Packaging - LE4-8335). This lexicon is an SGML-enc...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 25000.00 € |
Licence: Commercial Use - ELRA VAR | 25000.00 € | 25000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 15000.00 € | 40000.00 € |
Licence: Commercial Use - ELRA VAR | 40000.00 € | 40000.00 € |


- Portuguese
ID: ELRA-W0060
ISLRN: 294-303-577-819-2
The PTPARL Corpus contains 1,076 texts consisting of adapted transcriptions of the Portuguese Parliament sessions. The corpus contains 1,000,441 tokens. The corpus is delivered in one file, in two different formats. The txt version has one sentence per line, an identification number for each ...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 € |


- English
- Polish
ID: ELRA-W0187
ISLRN: 141-723-057-887-8
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. A collection of parallel Polish-English texts published ...

MEMBER | academic | commercial |
---|---|---|
Licence: Other - Public Domain | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Other - Public Domain | 0.00 € | 0.00 € |


- English
- Polish
ID: ELRA-W0185
ISLRN: 865-835-648-658-1
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. A collection of parallel Polish-English texts published ...

MEMBER | academic | commercial |
---|---|---|
Licence: Other - Public Domain | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Other - Public Domain | 0.00 € | 0.00 € |


- French
ID: ELRA-S0349
ISLRN: 074-668-446-920-0
The Quaero Broadcast News Extended Named Entity corpus consists of the manual annotation of (i) the ESTER 2 corpus (see ELRA-S0338) and (ii) the Quaero Speech Recognition Evaluation corpus (manual and automatic transcriptions coming from 3 different ASR systems). The first part is the training co...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 3000.00 € |
Licence: Commercial Use - ELRA VAR | 3000.00 € | 3000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |


- French
ID: ELRA-W0073
ISLRN: 864-217-681-552-4
The Quaero Old Press Extended Named Entity corpus consists of the manual annotation of 76 newspaper issues published in 1890-1891 and provided by the French National Library (Bibliothèque Nationale de France). Three different titles are used (Le Temps, La Croix and Le Figaro) for a total of 295 p...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 3000.00 € |
Licence: Commercial Use - ELRA VAR | 3000.00 € | 3000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |


- Korean
ID: ELRA-W0034
ISLRN: 079-092-657-220-3
Monolingual corpus in .txt format, produced by KAIST KORTERM, containing 1,020,000 eojeols (Korean terms) in Korean. This corpus is morphologically analyzed, POS tagged, and rectified 3 times by specialists.

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 667.00 € | 4000.00 € |
Licence: Commercial Use - ELRA VAR | 4000.00 € | 4000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 1333.00 € | 8000.00 € |
Licence: Commercial Use - ELRA VAR | 8000.00 € | 8000.00 € |


- English
- Modern Greek (1453-)
ID: ELRA-W0243
ISLRN: 497-530-909-088-2
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. A collection of 32 reports (16 in EL and 16 in EN) of th...

MEMBER | academic | commercial |
---|---|---|
Licence: Other - Open Under-PSI | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Other - Open Under-PSI | 0.00 € | 0.00 € |



- French
ID: ELRA-E0044
ISLRN: 360-758-359-485-0
The REPERE project (REconnaissance de PERsonnes dans des Emissions audiovisuelles) consists of a series of 3 evaluation campaigns for multimedia information processing systems. The project was funded by the DGA (Délégation Générale de l’Armement, France). The REPERE Evaluation Package contains t...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 300.00 € | 5000.00 € |
Licence: Evaluation Use - ELRA EVALUATION | 1000.00 € | |
Licence: Commercial Use - ELRA VAR | 20000.00 € | 20000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 2000.00 € | 7500.00 € |
Licence: Evaluation Use - ELRA EVALUATION | 6500.00 € | |
Licence: Commercial Use - ELRA VAR | 25000.00 € | 25000.00 € |


- Romanian; Moldavian; Moldovan
ID: ELRA-W0085
ISLRN: 312-617-089-348-7
ROCO is a Romanian journalistic corpus containing approximately 7.1 million tokens, the number of types being 231,626. It is rich in proper names, numerals and named entities. The corpus contains morphosyntactic information (MSD annotations) which has been assigned automatically with the high...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 3000.00 € |
Licence: Commercial Use - ELRA VAR | 3000.00 € | 3000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |


- English
- Romanian; Moldavian; Moldovan
ID: ELRA-W0270
ISLRN: 131-157-185-289-5
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Romanian-English corpus with studies, reports and statis...

MEMBER | academic | commercial |
---|---|---|
Licence: Other - Open Under-PSI | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Other - Open Under-PSI | 0.00 € | 0.00 € |


- English
- Romanian; Moldavian; Moldovan
ID: ELRA-W0192
ISLRN: 050-476-818-226-7
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Bilingual Romanian – English literature corpus built fro...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution - CC-BY-4.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution - CC-BY-4.0 | 0.00 € | 0.00 € |


- English
- Romanian; Moldavian; Moldovan
ID: ELRA-W0170
ISLRN: 085-350-774-090-4
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. The New Civil Procedure Code in Romanian and English (bi...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution - CC-BY-4.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution - CC-BY-4.0 | 0.00 € | 0.00 € |


- English
- Romanian; Moldavian; Moldovan
ID: ELRA-W0194
ISLRN: 100-905-126-706-7
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Bilingual Romanian – English news corpus built from Sout...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution - CC-BY-4.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution - CC-BY-4.0 | 0.00 € | 0.00 € |


- English
- Romanian; Moldavian; Moldovan
ID: ELRA-W0206
ISLRN: 422-693-047-625-3
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Parallel aligned corpus in tmx format built from the Rom...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution - CC-BY-4.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution - CC-BY-4.0 | 0.00 € | 0.00 € |