118 Language Resources (Page 1 of 6)


 ARCADE II Evaluation Package    
  • Arabic
  • Chinese
  • English
  • French
  • German
  • Italian
  • Japanese
  • Modern Greek (1453-)
  • Persian
  • Russian
  • Spanish; Castilian

ID: ELRA-E0018

ISLRN: 875-865-064-331-9

The ARCADE II Evaluation Package was produced within the French national project ARCADE II (Evaluation of parallel text alignment systems), as part of the Technolangue programme funded by the French Ministry of Research and New Technologies (MRNT). The ARCADE II project made it possible to carry out a cam...

MEMBER | academic | commercial
Licence: Evaluation Use - ELRA EVALUATION | 150.00 € | 500.00 €
NON MEMBER | academic | commercial
Licence: Evaluation Use - ELRA EVALUATION | 300.00 € | 1000.00 €

 Cantonese Conversational Speech Data by Mobile Phone and Voice Recorder - 607 Hours    
  • Chinese

ID: ELRA-S0427

ISLRN: 722-447-977-629-5

995 local Cantonese speakers took part in the recordings, conversing face to face in a natural way. They discussed a number of given topics freely, covering a wide range of fields; the speech is natural and fluent, in line with real dialogue scenes. Text is transcribed m...

MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 98030.50 € | 98030.50 €
NON MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 98030.50 € | 98030.50 €

Special offers are also available. Check here for details.

 Cantonese Dialect Speech Data by Mobile Phone - 1,652 Hours    
  • Chinese

ID: ELRA-S0478

ISLRN: 049-624-028-135-7

The data was collected from 4,888 speakers from Guangdong Province and recorded in a quiet indoor environment. The recorded content covers 500,000 commonly used spoken sentences, including high-frequency words in weico and everyday expressions. The average number of repetitions is 1.5 and the average sentence...

MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 141246.00 € | 141246.00 €
NON MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 141246.00 € | 141246.00 €

Special offers are also available. Check here for details.

 Cantonese Speecon database    
  • Chinese

ID: ELRA-S0287

ISLRN: 537-563-219-913-3

The Cantonese Speecon database is divided into 2 sets: 1) The first set comprises the recordings of 550 adult Cantonese speakers (273 males, 277 females), recorded over 4 microphone channels in 4 recording environments (office, entertainment, car, public place). 2) The second set comprises the ...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 50000.00 € | 67000.00 €
Licence: Commercial Use - ELRA VAR | 67000.00 € | 67000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 60000.00 € | 75000.00 €
Licence: Commercial Use - ELRA VAR | 75000.00 € | 75000.00 €

 Changsha Dialect Speech Data by Mobile Phone - 997 Hours    
  • Chinese

ID: ELRA-S0453

ISLRN: 520-610-210-012-3

2,000 Changsha natives participated in the recording, covering multiple age groups, with a balanced gender distribution and authentic accents. The recorded text is rich in content, covering general, interactive, car, home and other categories. Local people from Changsha checked and proofread the data. The ac...

MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 94715.00 € | 94715.00 €
NON MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 94715.00 € | 94715.00 €

Special offers are also available. Check here for details.

 Chinese Children Speech data by Mobile phone - 3,255 Hours    
  • Chinese

ID: ELRA-S0458

ISLRN: 607-995-858-759-4

Audio data of Chinese children captured by mobile phone, with a total duration of 3,255 hours. The 9,780 speakers are children aged 6 to 12, with accents covering seven dialect areas; the recorded text contains common children's language such as essay stories, numbers, and their interactions in cars, at hom...

MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 247380.00 € | 247380.00 €
NON MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 247380.00 € | 247380.00 €

Special offers are also available. Check here for details.

 Chinese Digital Speech Data by Mobile Phone - 11,010 People    
  • Chinese

ID: ELRA-S0419

ISLRN: 434-094-443-871-0

11,010 native Chinese speakers participated in the recording, with an equal gender split. Each speaker reads 30 sentences, each a 4- to 8-digit number. Format: 16 kHz, 16-bit, uncompressed WAV, mono channel. Recording environment: quiet indoor environment, without echo. Recording content (read speech): four to eight...

MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 41838.00 € | 41838.00 €
NON MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 41838.00 € | 41838.00 €

Special offers are also available. Check here for details.

 Chinese Mandarin (North) database    
  • Chinese

ID: ELRA-S0398

ISLRN: 353-548-770-894-7

This database contains the recordings of 500 Chinese Mandarin speakers from Northern China (250 males and 250 females), from 18 to 60 years old, recorded in quiet studios located in Shenzhen and in Hong Kong Special Administrative Region, People’s Republic of China. Demographics of native sp...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 5200.00 € | 7400.00 €
Licence: Commercial Use - ELRA VAR | 7400.00 € | 7400.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 7400.00 € | 7400.00 €
Licence: Commercial Use - ELRA VAR | 7400.00 € | 7400.00 €

 Chinese Mandarin (South) database    
  • Chinese

ID: ELRA-S0397

ISLRN: 503-886-852-083-2

This database contains the recordings of 1000 Chinese Mandarin speakers from Southern China (500 males and 500 females), from 18 to 60 years old, recorded in quiet studios located in Shenzhen and in Hong Kong Special Administrative Region, People’s Republic of China. Demographics of native s...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 10400.00 € | 14800.00 €
Licence: Commercial Use - ELRA VAR | 14800.00 € | 14800.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 14800.00 € | 14800.00 €
Licence: Commercial Use - ELRA VAR | 14800.00 € | 14800.00 €

 Chinese Mandarin Speech Recognition Corpus (Mobile) - 204.2 hours    
  • Chinese

ID: ELRA-S0228-67

ISLRN: 509-044-363-238-7

This corpus comprises 120,144 entries uttered by 400 speakers (199 males and 201 females), recorded over the mobile telephone network. Speech is stored as 16-bit samples at 16 kHz, for a total of 204.2 hours of speech.

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 24000.00 € | 24000.00 €
Licence: Commercial Use - ELRA VAR | 24000.00 € | 24000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 24000.00 € | 24000.00 €
Licence: Commercial Use - ELRA VAR | 24000.00 € | 24000.00 €

 Chinese Mandarin Speech Recognition Corpus (Mobile) - 67.4 hours    
  • Chinese

ID: ELRA-S0228-61

ISLRN: 599-273-322-100-1

This corpus comprises 91,729 entries uttered by 304 speakers (151 males and 153 females), recorded over the mobile telephone network. Speech is stored as 16-bit samples at 16 kHz, for a total of 67.4 hours of speech.

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 18000.00 € | 18000.00 €
Licence: Commercial Use - ELRA VAR | 18000.00 € | 18000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 18000.00 € | 18000.00 €
Licence: Commercial Use - ELRA VAR | 18000.00 € | 18000.00 €

 Chinese Mandarin Speech Recognition Corpus (Mobile) - 85 hours    
  • Chinese

ID: ELRA-S0228-60

ISLRN: 654-695-177-609-6

This corpus comprises 60,216 entries uttered by 201 speakers (101 males and 100 females), recorded over the mobile telephone network. Speech is stored as 16-bit samples at 16 kHz, for a total of 85 hours of speech.

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 12000.00 € | 12000.00 €
Licence: Commercial Use - ELRA VAR | 12000.00 € | 12000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 12000.00 € | 12000.00 €
Licence: Commercial Use - ELRA VAR | 12000.00 € | 12000.00 €

 Chinese-Vietnamese Parallel Corpus    
  • Chinese
  • Vietnamese

ID: ELRA-W0312

ISLRN: 128-772-037-486-0

The Chinese-Vietnamese Parallel Corpus consists of 200,000 sentence pairs, with an average length of 15 words per sentence. The corpus is provided in XML format and is annotated according to TEI-encoding guidelines.

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 200.00 € | 400.00 €
Licence: Commercial Use - ELRA VAR | 1400.00 € | 1400.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 300.00 € | 600.00 €
Licence: Commercial Use - ELRA VAR | 2100.00 € | 2100.00 €

 Chinese-Vietnamese - PhraseBank with audio files    
  • Chinese
  • Vietnamese

ID: ELRA-S0485

ISLRN: 428-557-564-826-7

A Chinese-Vietnamese PhraseBank of 4002 sentence pairs from daily conversations, with audio files spoken by native speakers. Scripts include Pinyin, Topic, Cat and the Vietnamese translation, with corresponding audio in Chinese and Vietnamese. The corpus is provided in XML and WAV formats.

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 400.00 € | 500.00 €
Licence: Commercial Use - ELRA VAR | 900.00 € | 900.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 600.00 € | 750.00 €
Licence: Commercial Use - ELRA VAR | 1350.00 € | 1350.00 €

 Collins Multilingual database (MLD) – PhraseBank with audio files    
  • Arabic
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Hindi
  • Italian
  • Japanese
  • Korean
  • Modern Greek (1453-)
  • Norwegian
  • Persian
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swedish
  • Thai
  • Turkish
  • Vietnamese

ID: ELRA-S0383

ISLRN: 398-655-047-044-5

The Collins Multilingual database covers Real Life Daily vocabulary. It is composed of a multilingual lexicon in 32 languages (the WordBank, see ELRA-T0376) and a multilingual set of sentences in 28 languages (the PhraseBank, see ELRA-T0377). This version includes the audio files corresponding t...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER
3360.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER
4480.00 €

 Collins Multilingual database (MLD) – WordBank with audio files    
  • Arabic
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Italian
  • Japanese
  • Korean
  • Modern Greek (1453-)
  • Norwegian
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swedish
  • Thai
  • Turkish
  • Vietnamese

ID: ELRA-S0382

ISLRN: 309-438-781-042-2

The Collins Multilingual database covers Real Life Daily vocabulary. It is composed of a multilingual lexicon in 32 languages (the WordBank, see ELRA-T0376) and a multilingual set of sentences in 28 languages (the PhraseBank, see ELRA-T0377). This version includes the corresponding audio files c...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER
3640.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER
5200.00 €

 ECI/MCI (European Corpus Initiative/Multilingual Corpus I)    
  • Albanian
  • Bulgarian
  • Chinese
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Estonian
  • French
  • German
  • Italian
  • Japanese
  • Latin
  • Lithuanian
  • Malay (macrolanguage)
  • Modern Greek (1453-)
  • Norwegian
  • Portuguese
  • Russian
  • Scottish Gaelic; Gaelic
  • Serbian
  • Spanish; Castilian
  • Swedish
  • Turkish
  • Uzbek

ID: ELRA-W0004

ISLRN: 511-168-567-582-5

The European Corpus Initiative (ECI) was founded to oversee the acquisition and preparation of a large multilingual corpus, and supports existing and projected national and international efforts to carefully design, collect and publish large-scale multilingual written and spoken corpora. ECI has ...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 50.00 € | 50.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 50.00 € | 50.00 €

 English-Chinese-Vietnamese Trilingual Parallel Corpus    
  • Chinese
  • English
  • Vietnamese

ID: ELRA-W0314

ISLRN: 637-630-726-817-9

The English-Chinese-Vietnamese Trilingual Parallel Corpus consists of 20,046 trilingual sets of sentence pairs. The corpus is provided in XML format and is annotated according to TEI-encoding guidelines.

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 150.00 € | 500.00 €
Licence: Commercial Use - ELRA VAR | 1000.00 € | 1000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 225.00 € | 750.00 €
Licence: Commercial Use - ELRA VAR | 1500.00 € | 1500.00 €

 GlobalPhone 2000 Speaker Package    
  • Arabic
  • Bulgarian
  • Chinese
  • Croatian
  • Czech
  • French
  • German
  • Hausa
  • Japanese
  • Korean
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swahili (macrolanguage)
  • Swedish
  • Tamil
  • Thai
  • Turkish
  • Ukrainian
  • Vietnamese

ID: ELRA-S0400

ISLRN: 331-592-378-424-7

The GlobalPhone 2000 Speaker Package contains transcribed read speech spoken by 2000 native speakers in 22 languages. The data are sampled from the GlobalPhone Speech and Text Data available in the ELRA Catalogue, i.e.: Arabic (ELRA-S0192), Bulgarian (ELRA-S0319), Chinese-Mandarin (ELRA-S0193), C...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 1200.00 € | 6000.00 €
Licence: Commercial Use - ELRA VAR | 6000.00 € | 6000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 1400.00 € | 7200.00 €
Licence: Commercial Use - ELRA VAR | 7200.00 € | 7200.00 €

 GlobalPhone Chinese-Mandarin    
  • Chinese

ID: ELRA-S0193

ISLRN: 976-318-571-969-1

The GlobalPhone corpus developed in collaboration with the Karlsruhe Institute of Technology (KIT) was designed to provide read speech data for the development and evaluation of large continuous speech recognition systems in the most widespread languages of the world, and to provide a uniform, mu...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 600.00 € | 3000.00 €
Licence: Commercial Use - ELRA VAR | 3000.00 € | 3000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 700.00 € | 3600.00 €
Licence: Commercial Use - ELRA VAR | 3600.00 € | 3600.00 €

Special offers are also available. Check here for details.
