1681 Language Resources (Page 43 of 85)

 MEDIA speech database for French    
  • French

ID: ELRA-S0272

ISLRN: 195-971-767-455-9

The MEDIA speech database for French was produced by ELDA within the French national project MEDIA (Automatic evaluation of man-machine dialogue systems), as part of the Technolangue programme funded by the French Ministry of Research and New Technologies (MRNT). It contains 1,258 transcribed ...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 5000.00 €
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 10000.00 €
Licence: Commercial Use - ELRA VAR | 10000.00 € | 10000.00 €
 Memorandum for a ESM programme (Processed)    
  • English
  • Modern Greek (1453-)

ID: ELRA-W0210

ISLRN: 043-737-892-695-4

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Memorandum of Understanding for a three-year European St...

MEMBER | academic | commercial
Licence: Other - Public Domain | 0.00 € | 0.00 €
NON MEMBER | academic | commercial
Licence: Other - Public Domain | 0.00 € | 0.00 €
 Metalogue Multi-Issue Bargaining Dialogue    
  • English

ID: ELRA-S0394

ISLRN: 217-906-813-531-9

Metalogue Multi-Issue Bargaining Dialogue was developed by the Metalogue Consortium (http://cordis.europa.eu/project/rcn/110655_en.html) under the European Community's Seventh Framework Programme for Research and Technological Development (https://ec.europa.eu/research/fp7/index_e...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 €
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 250.00 € | 250.00 €
Licence: Commercial Use - ELRA VAR | 250.00 € | 250.00 €
 Methodological Reconciliation (Processed)    
  • English
  • Modern Greek (1453-)

ID: ELRA-W0208

ISLRN: 462-928-711-185-4

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Methodological Reconciliation Table Council Directive 20...

MEMBER | academic | commercial
Licence: Other - Public Domain | 0.00 € | 0.00 €
NON MEMBER | academic | commercial
Licence: Other - Public Domain | 0.00 € | 0.00 €
 Mexican Spanish Kids Speech Recognition Corpus (Desktop)    
  • Spanish; Castilian

ID: ELRA-S0228-94

ISLRN: 217-568-306-452-3

This corpus comprises 19,156 entries uttered by 30 speakers (16 males and 14 females), recorded over 2 channels (desktop in quiet office). Speech samples are stored as 16-bit, 44.1 kHz audio, for a total of 5 hours of speech per channel.

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 5400.00 € | 5400.00 €
Licence: Commercial Use - ELRA VAR | 5400.00 € | 5400.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 5400.00 € | 5400.00 €
Licence: Commercial Use - ELRA VAR | 5400.00 € | 5400.00 €
 Mexican Spanish Speech Recognition Corpus (Mobile)    
  • Spanish; Castilian

ID: ELRA-S0228-104

ISLRN: 866-276-372-885-9

This corpus was recorded in a quiet office environment over 3 channels and collected from a total of 826 speakers, including 408 males and 418 females, all of whom have been carefully screened to ensure their standard and clear pronunciation. The audio scripts cover information such as news. Spee...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 81000.00 € | 81000.00 €
Licence: Commercial Use - ELRA VAR | 81000.00 € | 81000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 81000.00 € | 81000.00 €
Licence: Commercial Use - ELRA VAR | 81000.00 € | 81000.00 €
 MGB-5 Moroccan Dialect    
  • Arabic

ID: ELRA-S0404

ISLRN: 938-639-614-524-5

The MGB-5 Moroccan Dialect comprises 14 hours of Moroccan Arabic speech extracted from 93 YouTube videos distributed across seven genres: comedy, cooking, family/children, fashion, drama, sports, and science clips. Given that dialectal Arabic does not have a clearly defined orthography, differ...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 1500.00 €
Licence: Commercial Use - ELRA VAR | 1500.00 € | 1500.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 2000.00 €
Licence: Commercial Use - ELRA VAR | 2000.00 € | 2000.00 €
 MHATLex      
  • French

ID: ELRA-S0100

ISLRN: 740-149-502-864-8

MHATLex is a new enhanced lexical resource for written and speech automatic processing for French. It is derived from BDLex (see ELRA-S0004). It contains three levels of representation:
- Syntactic level: S
- Phonological word level: W
- Phonetic level: P
At the W level, a word has two repr...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 1500.00 € | 5000.00 €
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 2500.00 € | 7500.00 €
Licence: Commercial Use - ELRA VAR | 7500.00 € | 7500.00 €
 MICROAES    
  • Spanish; Castilian

ID: ELRA-S0165

ISLRN: 313-534-255-935-8

The ATLAS Spanish Microphone Database (MICROAES) has been collected in Spain by Applied Technologies on Language and Speech, S.L. (ATLAS). This database comprises microphone recordings from 300 different speakers, who have been selected from five different dialectal areas. Sex and age distributio...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 18000.00 € | 28000.00 €
Licence: Commercial Use - ELRA VAR | 28000.00 € | 28000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 22000.00 € | 32000.00 €
Licence: Commercial Use - ELRA VAR | 32000.00 € | 32000.00 €
 MIST Multi-lingual Interoperability in Speech Technology database    
  • Dutch; Flemish
  • English
  • French
  • German

ID: ELRA-S0238

ISLRN: 189-835-264-931-4

In 1996, some 75 Dutch people participated in recording a multi-purpose continuous speech database. Most of them were recruited from the TNO Human Factors Research Institute, where the recordings were made. The main part of the database consisted of Dutch sentences. However, most speakers partici...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 400.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 500.00 €
 Mixed Speech with Chinese and English Data by Mobile Phone - 1,535 Hours    
  • Chinese
  • English

ID: ELRA-S0457

ISLRN: 451-966-049-653-3

The data was recorded by 3,972 native Chinese speakers with accents covering seven major dialect areas. The recorded text is a mixture of Chinese and English sentences, covering general scenes and human-computer interaction scenes. It is rich in content and accurate in transcription. It can be used...

MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 145825.00 € | 145825.00 €
NON MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 145825.00 € | 145825.00 €

Special offers are also available.

 MLCC Multilingual and Parallel Corpora    
  • Danish
  • Dutch; Flemish
  • English
  • French
  • German
  • Italian
  • Modern Greek (1453-)
  • Portuguese
  • Spanish; Castilian

ID: ELRA-W0023

ISLRN: 963-635-729-341-8

The MLCC text corpus has two main components - one set to allow comparable studies to be carried out in different languages and one set as the basis for translation studies. The first set is referred to as the Polylingual Document Collection, a collection of newspaper articles from financial new...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 1600.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 3600.00 €
 Modern French Corpus including Anaphors Tagging    
  • French

ID: ELRA-W0032

ISLRN: 488-420-763-510-8

The corpus that includes the tagging of the anaphors was created by the CRISTAL-GRESEC (Stendhal-Grenoble 3 University, France) team and XRCE (Xerox Research Centre Europe, France) in the framework of the call launched by the DGLF-LF (national institution for the French language and the languages...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 250.00 € | 250.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 1000.00 € | 1000.00 €
 Monolingual documents from the Government of Lithuania (Processed)    
  • Lithuanian

ID: ELRA-W0299

ISLRN: 268-109-862-136-1

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Monolingual documents received from the Government of th...

MEMBER | academic | commercial
Licence: Attribution - CC-BY-4.0 | 0.00 € | 0.00 €
NON MEMBER | academic | commercial
Licence: Attribution - CC-BY-4.0 | 0.00 € | 0.00 €
 Monolingual Greek corpus    
  • Modern Greek (1453-)

ID: ELRA-W0014

ISLRN: 546-958-429-693-4

Monolingual Greek corpus of 1 million words. The corpus consists of articles written in 1996 from the Greek daily newspaper ELEFTHEROTIPIA. Each file contains annotated text with SGML mark-up accompanied by a text header.

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 360.00 € | 360.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 600.00 € | 600.00 €
 Monolingual Vietnamese Annotated Corpus    
  • Vietnamese

ID: ELRA-W0310

ISLRN: 004-081-406-421-7

The Monolingual Vietnamese Annotated Corpus consists of 100,000 sentences, manually annotated with word boundaries, POS, named entities, with an average length of 20 words per sentence. The corpus is provided in XML format and is annotated according to TEI-encoding guidelines.

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 500.00 € | 900.00 €
Licence: Commercial Use - ELRA VAR | 1800.00 € | 1800.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 800.00 € | 1300.00 €
Licence: Commercial Use - ELRA VAR | 2500.00 € | 2500.00 €
 Morphological lexicon - Dutch    
  • Dutch; Flemish

ID: ELRA-L0203-01

ISLRN: 760-041-713-085-4

Morphological lists linking inflected forms to their lemmas, distributed as follows (catalogue references from ELRA-L0203-01 to ELRA-L0203-15):
Language | Code | Lemmas | Word forms
Dutch | nl | 157,000 | 205,603
English | en | 69,308 | 160,441
French | fr | 79,843 | 442,085
German | de | 95,282 | 456,244
Hebrew | he | ...

MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 1570.00 € | 1570.00 €
NON MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 1648.50 € | 1648.50 €
 Morphological lexicon - English    
  • English

ID: ELRA-L0203-02

ISLRN: 946-506-741-622-0

Morphological lists linking inflected forms to their lemmas, distributed as follows (catalogue references from ELRA-L0203-01 to ELRA-L0203-15):
Language | Code | Lemmas | Word forms
Dutch | nl | 157,000 | 205,603
English | en | 69,308 | 160,441
French | fr | 79,843 | 442,085
German | de | 95,282 | 456,244
Hebrew | he | ...

MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 693.08 € | 693.08 €
NON MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 727.73 € | 727.73 €
 Morphological lexicon - French    
  • French

ID: ELRA-L0203-03

ISLRN: 337-580-829-362-9

Morphological lists linking inflected forms to their lemmas, distributed as follows (catalogue references from ELRA-L0203-01 to ELRA-L0203-15):
Language | Code | Lemmas | Word forms
Dutch | nl | 157,000 | 205,603
English | en | 69,308 | 160,441
French | fr | 79,843 | 442,085
German | de | 95,282 | 456,244
Hebrew | he | ...

MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 798.43 € | 798.43 €
NON MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 838.35 € | 838.35 €
 Morphological lexicon - German    
  • German

ID: ELRA-L0203-04

ISLRN: 490-423-051-831-0

Morphological lists linking inflected forms to their lemmas, distributed as follows (catalogue references from ELRA-L0203-01 to ELRA-L0203-15):
Language | Code | Lemmas | Word forms
Dutch | nl | 157,000 | 205,603
English | en | 69,308 | 160,441
French | fr | 79,843 | 442,085
German | de | 95,282 | 456,244
Hebrew | he | ...

MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 952.82 € | 952.82 €
NON MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 1000.46 € | 1000.46 €