1680 Language Resources (Page 12 of 84)


- English
ID: ELRA-S0228-78
ISLRN: 040-245-794-542-7
This corpus comprises 50,858 entries uttered by 51 speakers (28 males and 23 females), recorded over 2 channels (desktop in quiet office/home). Speech samples are stored as 16-bit samples at 48 kHz, for a total of 29.7 hours of speech per channel.

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5400.00 € | 5400.00 € |
Licence: Commercial Use - ELRA VAR | 5400.00 € | 5400.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5400.00 € | 5400.00 € |
Licence: Commercial Use - ELRA VAR | 5400.00 € | 5400.00 € |
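As a rough guide to download and storage needs, the audio figures quoted for ELRA-S0228-78 (16-bit, 48 kHz, 29.7 hours per channel, 2 channels) can be turned into a size estimate. The sketch below assumes uncompressed PCM with one mono stream per channel and ignores file headers and any non-audio files; the catalogue does not state the delivery format, so treat the result as an estimate only. The helper name `pcm_size_bytes` is hypothetical.

```python
def pcm_size_bytes(hours, sample_rate_hz, bits_per_sample, channels=1):
    """Estimate the raw PCM payload size in bytes (assumes no compression, no headers)."""
    seconds = hours * 3600
    bytes_per_sample = bits_per_sample // 8
    return round(seconds * sample_rate_hz * bytes_per_sample * channels)


# Figures taken from the ELRA-S0228-78 description above.
per_channel = pcm_size_bytes(29.7, 48_000, 16)
both_channels = pcm_size_bytes(29.7, 48_000, 16, channels=2)
print(f"{per_channel / 1e9:.2f} GB per channel, {both_channels / 1e9:.2f} GB in total")
# prints "10.26 GB per channel, 20.53 GB in total"
```

The same helper applies to the other speech entries on this page by swapping in their sampling rate, duration, and channel count.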


- English
ID: ELRA-S0228-101
ISLRN: 535-041-750-483-4
This corpus comprises 63,495 entries uttered by 54 speakers (27 males and 27 females), recorded over 3 channels (mobile in noisy café/restaurant/street). Speech samples are stored as 16-bit samples at 16 kHz, for a total of 22.3 hours of speech per channel.

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5400.00 € | 5400.00 € |
Licence: Commercial Use - ELRA VAR | 5400.00 € | 5400.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5400.00 € | 5400.00 € |
Licence: Commercial Use - ELRA VAR | 5400.00 € | 5400.00 € |


- English
ID: ELRA-S0228-111
ISLRN: 920-976-101-187-7
This corpus was recorded in a quiet office/home environment over 3 channels and collected from a total of 302 speakers, including 149 males and 153 females, all of whom have been carefully screened to ensure their standard and clear pronunciation. The audio scripts come from news and tweets. Spee...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 32400.00 € | 32400.00 € |
Licence: Commercial Use - ELRA VAR | 32400.00 € | 32400.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 32400.00 € | 32400.00 € |
Licence: Commercial Use - ELRA VAR | 32400.00 € | 32400.00 € |


- Bulgarian
ID: ELRA-W0329
ISLRN: 832-960-876-604-2
The Bulgarian Event Corpus is composed of 324,905 tokens appropriate for training Named Entity Recognition (NER), Named Entity Linking (NEL) and Event Recognition models for Bulgarian in a multidomain context within Humanities. The texts are domain related. They include documents from the area of So...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: ? - CC-BY-SA-3.0 | 0.00 € | |
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € |


- Bulgarian
ID: ELRA-L0075
ISLRN: 450-247-052-039-5
This database contains 81,647 entries in Bulgarian with a linguistic environment tool (for WINDOWS XP). The data may be used for morphological analysis and synthesis, syntactic agreement checking, and phonetic stress determination. Structure of entries: Local linguistic variant File format: MS ACCESS ...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 2000.00 € | 10000.00 € |
Licence: Commercial Use - ELRA VAR | 10000.00 € | 10000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 4000.00 € | 16000.00 € |
Licence: Commercial Use - ELRA VAR | 16000.00 € | 16000.00 € |


- Bulgarian
ID: ELRA-L0030
ISLRN: 611-552-122-892-7
This dictionary contains 67,500 entries divided into 242 inflectional types (including proper nouns), morphosyntactic information for each entry, and a morphological engine (MS DOS and WINDOWS 95/NT) for morphological analysis and generation. The data may be used for morphological analysis and syn...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 45.00 € | 6000.00 € |
Licence: Commercial Use - ELRA VAR | 6000.00 € | 6000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 100.00 € | 12000.00 € |
Licence: Commercial Use - ELRA VAR | 12000.00 € | 12000.00 € |


- Bulgarian
ID: ELRA-W0328
ISLRN: 761-430-854-533-2
The Bulgarian Treebank Corpus is composed of 156,149 tokens (11,138 sentences) drawn from three main sources: Grammar Notebooks (1,391 sentences), News (6,698 sentences), and Other (3,049 sentences). It is available with syntactical and morphological annotation on a sentence basis in...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |


- Bulgarian
ID: ELRA-L0132
ISLRN: 188-702-981-369-5
The Bulgarian Valency Frame Lexicon is composed of 9,547 lexical entries organized by frames, with 960 mappings to Princeton WordNet, available in XML format. It is a treebank-driven resource of valency frames extracted from BulTreeBank. The frames were manually curated. The frames followed the surf...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |


- Bulgarian
- English
ID: ELRA-M0041
ISLRN: 941-120-951-927-7
The Bulgarian WordNet is a network of lexical-semantic relations, an electronic thesaurus with a structure modelled on that of the Princeton WordNet and those constructed in the EuroWordNet and BalkaNet projects. Bulgarian WordNet describes the meaning of a lexical unit by placing it within a network ...

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 300.00 € | 3000.00 € |
Licence: Commercial Use - ELRA VAR | 4500.00 € | 4500.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 600.00 € | 6000.00 € |
Licence: Commercial Use - ELRA VAR | 9000.00 € | 9000.00 € |


- Arabic
ID: ELRA-L0133
ISLRN: 462-532-124-988-8
Comprehensive Arabic LEMmas is a lexicon covering a large list of Arabic lemmas and their corresponding inflected word forms (stems) with details (POS + Root). Each lexical entry represents a lemma followed by all its possible stems and each stem is enriched by its morphological features, especia...

MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Non Commercial Use, No Derivatives - CC-BY-NC-ND | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Non Commercial Use, No Derivatives - CC-BY-NC-ND | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 7500.00 € | 7500.00 € |


- English
ID: ELRA-S0228-85
ISLRN: 942-019-580-826-2
This corpus comprises 6,976 entries uttered by 150 speakers (80 males and 70 females), recorded over 4 channels (desktop in quiet office). Speech samples are stored as 16-bit samples at 48 kHz, for a total of 3.86 hours of speech per channel.

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 4050.00 € | 4050.00 € |
Licence: Commercial Use - ELRA VAR | 4050.00 € | 4050.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 4050.00 € | 4050.00 € |
Licence: Commercial Use - ELRA VAR | 4050.00 € | 4050.00 € |


- English
ID: ELRA-S0228-89
ISLRN: 836-335-444-460-7
This corpus comprises 2,250 entries uttered by 150 speakers (106 males and 44 females), recorded over the telephone network. Speech samples are stored as 16-bit samples at 8 kHz, for a total of 2.83 hours of speech.

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 3240.00 € | 3240.00 € |
Licence: Commercial Use - ELRA VAR | 3240.00 € | 3240.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 3240.00 € | 3240.00 € |
Licence: Commercial Use - ELRA VAR | 3240.00 € | 3240.00 € |


- English
ID: ELRA-S0228-90
ISLRN: 229-685-009-012-2
This corpus comprises 2,400 entries uttered by 150 speakers (106 males and 44 females), recorded over the telephone network. Speech samples are stored as 16-bit samples at 8 kHz, for a total of 2.24 hours of speech.

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 3240.00 € | 3240.00 € |
Licence: Commercial Use - ELRA VAR | 3240.00 € | 3240.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 3240.00 € | 3240.00 € |
Licence: Commercial Use - ELRA VAR | 3240.00 € | 3240.00 € |


- English
ID: ELRA-S0228-87
ISLRN: 668-176-572-368-0
This corpus comprises 1,500 entries uttered by 150 speakers (106 males and 44 females), recorded over the telephone network. Speech samples are stored as 16-bit samples at 8 kHz, for a total of 2.09 hours of speech.

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 3240.00 € | 3240.00 € |
Licence: Commercial Use - ELRA VAR | 3240.00 € | 3240.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 3240.00 € | 3240.00 € |
Licence: Commercial Use - ELRA VAR | 3240.00 € | 3240.00 € |


- English
ID: ELRA-S0228-88
ISLRN: 616-328-968-271-5
This corpus comprises 1,500 entries uttered by 150 speakers (106 males and 44 females), recorded over the telephone network. Speech samples are stored as 16-bit samples at 8 kHz, for a total of 3.6 hours of speech.

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 3240.00 € | 3240.00 € |
Licence: Commercial Use - ELRA VAR | 3240.00 € | 3240.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 3240.00 € | 3240.00 € |
Licence: Commercial Use - ELRA VAR | 3240.00 € | 3240.00 € |


- French
ID: ELRA-S0228-72
ISLRN: 360-129-212-036-3
This corpus comprises 75,147 entries uttered by 50 speakers (25 males and 25 females), recorded over 3 channels (mobile in quiet office). Speech samples are stored as 16-bit samples at 16 kHz, for a total of 25.67 hours of speech per channel.

MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5400.00 € | 5400.00 € |
Licence: Commercial Use - ELRA VAR | 5400.00 € | 5400.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5400.00 € | 5400.00 € |
Licence: Commercial Use - ELRA VAR | 5400.00 € | 5400.00 € |


- English
ID: ELRA-S0423
ISLRN: 560-196-682-941-2
466 native Canadian speakers were involved, balanced for gender. The recorded corpus is rich in content and covers a wide range of domains, including generic command and control, human-machine interaction, smart home, and in-car categories. The transcription corpus has been manually proof...

MEMBER | academic | commercial |
---|---|---|
Licence: Commercial Use - ELRA VAR | 45229.50 € | 45229.50 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Commercial Use - ELRA VAR | 45229.50 € | 45229.50 € |
Special offers are also available. Check here for details.


- Chinese
ID: ELRA-S0427
ISLRN: 722-447-977-629-5
995 local Cantonese speakers participated in the recording and communicated face to face in a natural way. They discussed a number of given topics freely, covering a wide range of fields; the speech is natural and fluent, consistent with real dialogue scenarios. Text is transcribed m...

MEMBER | academic | commercial |
---|---|---|
Licence: Commercial Use - ELRA VAR | 98030.50 € | 98030.50 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Commercial Use - ELRA VAR | 98030.50 € | 98030.50 € |
Special offers are also available. Check here for details.


- Chinese
ID: ELRA-S0478
ISLRN: 049-624-028-135-7
This corpus covers 4,888 speakers from Guangdong Province, recorded in a quiet indoor environment. The recorded content covers 500,000 commonly used spoken sentences, including high-frequency words in weico and everyday expressions. The average number of repetitions is 1.5 and the average sentence...

MEMBER | academic | commercial |
---|---|---|
Licence: Commercial Use - ELRA VAR | 141246.00 € | 141246.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Commercial Use - ELRA VAR | 141246.00 € | 141246.00 € |
Special offers are also available. Check here for details.