1681 Language Resources (Page 46 of 85)


- Russian
ID: ELRA-W0080
ISLRN: 024-620-556-146-2
The NE3L project (Named Entities 3 Languages) consisted of annotating corpora in several languages with named entities. The text data were extracted from newspapers and cover various topics. Three languages were annotated: Arabic, Chinese and Russian. For this project, 5...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |


- Arabic
ID: ELRA-S0219
ISLRN: 479-507-036-103-9
This corpus was produced within the NEMLAR project (http://www.nemlar.org). Two other resources, produced within the same project, are also available: NEMLAR Written Corpus (ELRA-W0042) and the NEMLAR Speech Synthesis Corpus (ELRA-S0220). The NEMLAR Broadcast News Speech Corpus consists of about...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 150.00 € | 500.00 € |
Licence: Commercial Use - ELRA VAR | 2000.00 € | 2000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 300.00 € | 1000.00 € |
Licence: Commercial Use - ELRA VAR | 4000.00 € | 4000.00 € |

Special offers are also available. Check here for details.


- Arabic
ID: ELRA-S0220
ISLRN: 361-216-121-305-9
This corpus was produced within the NEMLAR project (http://www.nemlar.org). Two other resources, produced within the same project, are also available: NEMLAR Written Corpus (ELRA-W0042) and the NEMLAR Broadcast News Speech Corpus (ELRA-S0219). The NEMLAR Speech Synthesis Corpus contains the reco...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 500.00 € | 1250.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 1000.00 € | 2500.00 € |
Licence: Commercial Use - ELRA VAR | 10000.00 € | 10000.00 € |

Special offers are also available. Check here for details.


- Arabic
ID: ELRA-W0042
ISLRN: 050-693-158-326-9
This corpus was produced within the NEMLAR project (http://www.nemlar.org). Two other resources, produced within the same project, are also available: NEMLAR Broadcast News Speech Corpus (ELRA-S0219) and the NEMLAR Speech Synthesis Corpus (ELRA-S0220). The NEMLAR Written Corpus consists of about...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 150.00 € | 250.00 € |
Licence: Commercial Use - ELRA VAR | 1000.00 € | 1000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 300.00 € | 500.00 € |
Licence: Commercial Use - ELRA VAR | 2000.00 € | 2000.00 € |

Special offers are also available. Check here for details.


- Nepali (macrolanguage)
ID: ELRA-W0076
ISLRN: 325-796-965-405-9
The Nepali Monolingual written corpus is one of the 3 resources that constitute the Nepali National Corpus. The Nepali National Corpus was produced in 2006 in the framework of the project Bhasha Sanchar (“language communication”), also known as Nelralec, for Nepali Language Resources and Localiza...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |


- Nepali (macrolanguage)
ID: ELRA-S0368
ISLRN: 688-800-566-571-0
The Nepali Spoken Corpus is one of the 3 resources that constitute the Nepali National Corpus. The Nepali National Corpus was produced in 2006 in the framework of the project Bhasha Sanchar (“language communication”), also known as Nelralec, for Nepali Language Resources and Localization for Educ...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |


- Arabic
ID: ELRA-S0157
ISLRN: 663-177-513-755-1
The NetDC Arabic BNSC (Broadcast News Speech Corpus) is a corpus developed by ELDA in the framework of the European-funded project Network of Data Centres (NetDC). The project was done in collaboration with the LDC (Linguistic Data Consortium), which has produced a similar corpus from the news br...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 100.00 € | 1350.00 € |
Licence: Commercial Use - ELRA VAR | 1350.00 € | 1350.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 200.00 € | 2700.00 € |
Licence: Commercial Use - ELRA VAR | 2700.00 € | 2700.00 € |


- English
- French
ID: ELRA-T0362
ISLRN: 761-442-215-246-0
Extended version of ELRA-T0090 GEOBASE. The terms were selected and collated by Dr M.S.N. CARPENTER during the course of his translation activities over the past ten years. The terms have been validated by publication in the scientific literature. Conceived as a bilingual terminological resource,...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 3420.00 € | 4788.00 € |
Licence: Commercial Use - ELRA VAR | 4788.00 € | 4788.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 4788.00 € | 6840.00 € |
Licence: Commercial Use - ELRA VAR | 6840.00 € | 6840.00 € |


- English
ID: ELRA-L0045
ISLRN: 044-694-748-731-5
This is Oxford University Press's most comprehensive single-volume dictionary, with 170,000 entries covering all varieties of English worldwide. The NODE data set constitutes a fully integrated range of formal data types suitable for language engineering and NLP applications: It is available in X...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
6125.00 €

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
8750.00 €


- English
ID: ELRA-L0047
ISLRN: 869-866-137-463-6
The New Oxford Thesaurus of English is a completely new top-of-the-range thesaurus offering more alternative and opposite words than any of its competitors. The synonyms are arranged in order of “relevance” to the look-up word, starting with an individually tagged core synonym, and followed by la...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
4900.00 €

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
7000.00 €


- English
ID: ELRA-L0046
ISLRN: 003-258-865-840-0
The DIMAP version of NODE (first edition) is a machine-tractable version of the machine-readable dictionary files in the DIMAP dictionary maintenance programs, adding syntactic and semantic information in the conversion. In addition, DIMAP provides several mechanisms that will allow research into...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
7000.00 €

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
10000.00 €


- Spanish; Castilian
ID: ELRA-S0444
ISLRN: 469-588-696-069-6
1,630 native Spanish speakers of non-Spanish nationality, such as Mexicans and Colombians, participated in the recording with authentic accents. The recorded script was designed by linguists and covers a wide range of topics including generic, interactive, in-vehicle and home. The text is manually proofr...
MEMBER | academic | commercial |
---|---|---|
Licence: Commercial Use - ELRA VAR | 180975.00 € | 180975.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Commercial Use - ELRA VAR | 180975.00 € | 180975.00 € |

Special offers are also available. Check here for details.


- Arabic
ID: ELRA-W0127
ISLRN: 305-450-745-774-1
Normalized Arabic Fragments for Inestimable Stemming (NAFIS) is an Arabic stemming gold standard corpus composed of a collection of sentences, selected to be representative of Arabic stemming tasks and manually annotated. Indeed, NAFIS is: Comprehensive: The content of NAFIS can be generalized...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 € |


- Norwegian
ID: ELRA-S0301
ISLRN: 184-180-634-505-7
EUROM1 is the first really multilingual speech database produced in Europe. Equivalent corpora for each of the European languages were collected with the same number of speakers selected in the same way, and recorded in the same conditions with common file formats. Initially eight European countr...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 800.00 € | 800.00 € |
Licence: Commercial Use - ELRA VAR | 800.00 € | 800.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 1600.00 € | 1600.00 € |
Licence: Commercial Use - ELRA VAR | 1600.00 € | 1600.00 € |


- Norwegian
ID: ELRA-S0081
ISLRN: 231-756-812-990-0
The Norwegian SpeechDat(II) FDB-1000 comprises 1016 Norwegian speakers (517 males, 499 females) recorded over the Norwegian fixed telephone network. The FDB-1000 database is partitioned into 4 CDs. The speech databases made within the SpeechDat(II) project were validated by SPEX, the Netherlands,...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 15000.00 € | 18000.00 € |
Licence: Commercial Use - ELRA VAR | 18000.00 € | 18000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 25000.00 € | 25000.00 € |
Licence: Commercial Use - ELRA VAR | 25000.00 € | 25000.00 € |


- Portuguese
ID: ELRA-W0089
ISLRN: 412-883-442-173-8
NPChunks is a training corpus containing approximately 1,000 sentences, with a total of 24,243 tokens, selected randomly from the written part of the CINTIL corpus. For more information on the CINTIL corpus, see ELRA-W0050, ISLRN: 176-775-844-396-0. The corpus is PoS-annotated at token level, ...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 € |


- English
ID: ELRA-L0130
ISLRN: 007-544-786-822-8
The NRC Emotion Lexicon was originally built by Saif M. Mohammad and Peter D. Turney through crowdsourcing. The NRC was created to assist with emotion analysis, as other emotion lexicons were smaller at the time. To fix this problem, Saif crowdsourced a huge collection...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - CC-BY-NC-4.0 |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - CC-BY-NC-4.0 |


- Mongolian
ID: ELRA-W0120
ISLRN: 492-817-146-504-9
This is a corpus of Mongolian text, mostly from domains like online or printed daily newspapers, literature, and laws. The collected raw text was reduced from 5 to 4.8 million words after cleaning. The cleaned corpus comprises: - 144 texts from laws until 2009, - 288 texts from literature t...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 7000.00 € |
Licence: Commercial Use - ELRA VAR | 7000.00 € | 7000.00 € |


- English
ID: ELRA-L0059
ISLRN: 496-636-168-539-2
Oxford University Press has developed two lists of offensive words and expressions, specifically developed for filter applications in the contexts of web pages and email. Each list features a grading system describing vocabulary type and offensive strength for each term, plus collocational infor...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
4000.00 €

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
5000.00 €


- English
ID: ELRA-L0060
ISLRN: 359-137-869-557-3
Oxford University Press has developed two lists of offensive words and expressions, specifically developed for filter applications in the contexts of web pages and email. Each list features a grading system describing vocabulary type and offensive strength for each term, plus collocational inform...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
2000.00 €

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
2500.00 €