Available (94)
Audio (82)
Text (12)
Video (4)
True (4)
TEI (1)

Resource Type:

Corpus:
Lexical/Conceptual:
Tool/Service:
Language Description:

Media Type:

Text:
Audio:
Image:
Video:
Text Numerical:
Text N-Gram:

94 Language Resources (Page 1 of 5)

« Previous | Next »Order by:

 2006 CoNLL Shared Task - Ten Languages    
  • Bulgarian
  • Danish
  • Dutch; Flemish
  • German
  • Japanese
  • Portuguese
  • Slovenian
  • Spanish; Castilian
  • Swedish
  • Turkish

ID: ELRA-W0086

ISLRN: 578-227-532-044-0

2006 CoNLL Shared Task - Ten Languages consists of dependency treebanks in ten languages used as part of the CoNLL 2006 shared task on multi-lingual dependency parsing. The languages covered in this release are: Bulgarian, Danish, Dutch, German, Japanese, Portuguese, Slovene, Spanish, Swedish and...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
0.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
0.00 € submit
 aGender    
  • German

ID: ELRA-S0365

ISLRN: 038-476-412-610-4

aGender contains speech sample recordings over public telephone lines with read and (semi-)spontaneous speech. Native German speakers called a voice portal from their private phone, and read text + answered some open questions. The purpose of the corpus is the automatic detection of gender and/or...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
327.00 € submit
8127.00 € submit
Licence: Commercial Use - ELRA VAR
8127.00 € submit
8127.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
455.00 € submit
8255.00 € submit
Licence: Commercial Use - ELRA VAR
8255.00 € submit
8255.00 € submit
 Alcohol Language Corpus (BAS ALC)    
  • German

ID: ELRA-S0299

ISLRN: 780-368-852-139-3

ALC contains recordings of German speakers that are either intoxicated or sober. The type of speech ranges from read single digits to full conversation style. Recordings were done during drinking test where speakers drank beer or wine to reach a self-chosen level of alcoholic intoxication. The ac...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
510.00 € submit
510.00 € submit
Licence: Commercial Use - ELRA VAR
510.00 € submit
510.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
1020.00 € submit
1020.00 € submit
Licence: Commercial Use - ELRA VAR
1020.00 € submit
1020.00 € submit
 ANITA (Audio eNhancement In Telecom Applications)    
  • English
  • French
  • German
  • Spanish; Castilian

ID: ELRA-S0156

ISLRN: 537-894-870-719-4

ANITA (Audio eNhancement In secured Telecommunication Applications) is a European project launched on the initiative of EADS TELECOM with the objective of reducing audio acoustics noise in secured communications in adverse environments (sirens, alarms, engines, water pumps, stress situations, etc...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
1000.00 € submit
2000.00 € submit
Licence: Commercial Use - ELRA VAR
2000.00 € submit
2000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
1500.00 € submit
2500.00 € submit
Licence: Commercial Use - ELRA VAR
2500.00 € submit
2500.00 € submit
 AURORA Project database - Subset of SpeechDat-Car - German database - Evaluation Package    
  • German

ID: ELRA-AURORA-CD0003-03

ISLRN: 613-751-084-730-0

The Aurora project was originally set up to establish a world wide standard for the feature extraction software which forms the core of the front-end of a DSR (Distributed Speech Recognition) system. ETSI formally adopted this activity as work items 007 and 008. The two work items within ETSI ar...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
200.00 € submit
1000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
200.00 € submit
1000.00 € submit
 Austrian SpeechDat(AT) FDB-1000 database    
  • German

ID: ELRA-S0142

ISLRN: 989-950-794-642-6

The SpeechDat(AT) FDB-1000 database contains the recordings of 1,000 Austrian speakers (544 males, 456 females) recorded over the Austrian fixed telephone network. The database is partitioned into 5 CD-ROMs, in ISO 9660 format. Speech samples are stored as sequences of 8-bit 8 kHz A-law, uncompr...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
16000.00 € submit
20000.00 € submit
Licence: Commercial Use - ELRA VAR
20000.00 € submit
20000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
20000.00 € submit
25000.00 € submit
Licence: Commercial Use - ELRA VAR
25000.00 € submit
25000.00 € submit

This resource is also available in a bundle. Check here for bundled pricing.

 Austrian SpeechDat(AT) MDB-1000 database    
  • German

ID: ELRA-S0143

ISLRN: 294-112-593-120-6

The Austrian SpeechDat(AT) MDB-1000 database contains the recordings of 1,000 Austrian speakers (543 males, 457 females) recorded over the Austrian mobile telephone network. The database is partitioned into 5 CD-ROMs, in ISO 9660 format. Speech samples are stored as sequences of 8-bit 8 kHz A-la...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
24000.00 € submit
30000.00 € submit
Licence: Commercial Use - ELRA VAR
30000.00 € submit
30000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
30000.00 € submit
35000.00 € submit
Licence: Commercial Use - ELRA VAR
35000.00 € submit
35000.00 € submit

This resource is also available in a bundle. Check here for bundled pricing.

 BAS PHATT 1.0.X (sub-set)    
  • German

ID: ELRA-S0282-01

ISLRN: 704-844-083-488-7

The Ph@ttSessionz speech database, funded by the German Ministry of Science and Education (BMBF), contains recordings of 864 adolescent speakers of German (age range 12-20). The recordings were performed via the WWW in public schools (Gymnasium) in 41 locations in Germany. The speech material rec...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
512.00 € submit
2512.00 € submit
Licence: Commercial Use - ELRA VAR
2512.00 € submit
2512.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
512.00 € submit
3512.00 € submit
Licence: Commercial Use - ELRA VAR
3512.00 € submit
3512.00 € submit
 BAS PHATT 1.1.X (complete corpus)    
  • German

ID: ELRA-S0282-02

ISLRN: 847-046-185-654-8

The Ph@ttSessionz speech database, funded by the German Ministry of Science and Education (BMBF), contains recordings of 864 adolescent speakers of German (age range 12-20). The recordings were performed via the WWW in public schools (Gymnasium) in 41 locations in Germany. The speech material rec...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
1917.30 € submit
3917.30 € submit
Licence: Commercial Use - ELRA VAR
3917.30 € submit
3917.30 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
3834.75 € submit
6834.75 € submit
Licence: Commercial Use - ELRA VAR
6834.75 € submit
6834.75 € submit
 BITS Logatome Synthesis Corpus – BITS-LG    
  • German

ID: ELRA-S0217

ISLRN: 887-235-135-658-7

BITS stands for "BAS Infrastructures for Technical Speech Processing" and was funded by the German Ministry of Science and Education during 2003-2005. The BITS synthesis corpus consists of two parts: a set of logatome recordings for controlled diphone synthesis (ELRA-S0217) and a set of sentence ...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
627.17 € submit
4627.17 € submit
Licence: Commercial Use - ELRA VAR
4627.17 € submit
4627.17 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
754.35 € submit
9000.00 € submit
Licence: Commercial Use - ELRA VAR
9000.00 € submit
9000.00 € submit
 BITS Unit Selection Synthesis Corpus    
  • German

ID: ELRA-S0224

ISLRN: 553-776-339-039-5

BITS stands for "BAS Infrastructures for Technical Speech Processing" and was funded by the German Ministry of Science and Education during 2003-2005. The BITS synthesis corpus consists of two parts: a set of logatome recordings for controlled diphone synthesis (ELRA-S0217) and a set of sentenc...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
627.17 € submit
4627.17 € submit
Licence: Commercial Use - ELRA VAR
4627.17 € submit
4627.17 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
754.35 € submit
9000.00 € submit
Licence: Commercial Use - ELRA VAR
9000.00 € submit
9000.00 € submit
 Collins Multilingual database (MLD) – PhraseBank with audio files    
  • Arabic
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Hindi
  • Italian
  • Japanese
  • Korean
  • Modern Greek (1453-)
  • Norwegian
  • Persian
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swedish
  • Thai
  • Turkish
  • Vietnamese

ID: ELRA-S0383

ISLRN: 398-655-047-044-5

The Collins Multilingual database covers Real Life Daily vocabulary. It is composed of a multilingual lexicon in 32 languages (the WordBank, see ELRA-T0376) and a multilingual set of sentences in 28 languages (the PhraseBank, see ELRA-T0377). This version includes the audio files corresponding t...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
3360.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
4480.00 € submit
 Collins Multilingual database (MLD) – WordBank with audio files    
  • Arabic
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Italian
  • Japanese
  • Korean
  • Modern Greek (1453-)
  • Norwegian
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swedish
  • Thai
  • Turkish
  • Vietnamese

ID: ELRA-S0382

ISLRN: 309-438-781-042-2

The Collins Multilingual database covers Real Life Daily vocabulary. It is composed of a multilingual lexicon in 32 languages (the WordBank, see ELRA-T0376) and a multilingual set of sentences in 28 languages (the PhraseBank, see ELRA-T0377). This version includes the corresponding audio files c...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
3640.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
5200.00 € submit
 deL1L2IM corpus    
  • German

ID: ELRA-W0083

ISLRN: 339-799-085-669-8

The deL1L2IM corpus, created between May and August 2012 and last updated in August 2014, has been collected within the framework of a PhD project on the development of a learning method implying conversations with an artificial companion. This PhD work is presented as a qualitative investigation...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
0.00 € submit
Licence: Commercial Use - ELRA VAR
0.00 € submit
0.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
0.00 € submit
Licence: Commercial Use - ELRA VAR
0.00 € submit
0.00 € submit
 ECI-ELSNET Italian & German tagged sub-corpus    
  • German
  • Italian

ID: ELRA-W0005

ISLRN: 869-857-775-378-7

The objective is to provide a small but fine grained morphosyntactically tagged corpus, 50.000 running words for each of the two languages (Italian and German) to be used in research work on tagging methods and models. The text for German comes from the Frankfurter Rundschau extracted from the EC...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
20.00 € submit
20.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
45.00 € submit
45.00 € submit
 ECI/MCI (European Corpus Initiative/Multilingual Corpus I)    
  • Albanian
  • Bulgarian
  • Chinese
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Estonian
  • French
  • German
  • Italian
  • Japanese
  • Latin
  • Lithuanian
  • Malay (macrolanguage)
  • Modern Greek (1453-)
  • Norwegian
  • Portuguese
  • Russian
  • Scottish Gaelic; Gaelic
  • Serbian
  • Spanish; Castilian
  • Swedish
  • Turkish
  • Uzbek

ID: ELRA-W0004

ISLRN: 511-168-567-582-5

The European Corpus Initiative (ECI) was founded to oversee the acquisition and preparation of a large multilingual corpus, and supports existing and projected national and international efforts to carefully design, collect and publish large-scale multilingual written and spoken corpora. ECI has ...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
50.00 € submit
50.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
50.00 € submit
50.00 € submit
 Erlanger Bahnansage - ERBA    
  • German

ID: ELRA-S0013

ISLRN: 839-887-396-051-4

Over 10.000 utterances read by over 100 German speakers (60 male and 40 female), in the domain of train inquiries. All recordings were made in a quiet office room (4 CDROMs).

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
511.29 € submit
6263.29 € submit
Licence: Commercial Use - ELRA VAR
6263.29 € submit
6263.29 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
1022.58 € submit
6774.58 € submit
Licence: Commercial Use - ELRA VAR
6774.58 € submit
6774.58 € submit
 EUROM1g German    
  • German

ID: ELRA-S0014-03

ISLRN: 348-315-352-666-4

The first really multilingual speech database produced in Europe. Equivalent corpora for each of the European languages: same number of speakers selected in the same way, and recorded in the same conditions with common file formats. Initially eight European countries have made recordings: Italy, ...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
800.00 € submit
800.00 € submit
Licence: Commercial Use - ELRA VAR
800.00 € submit
800.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
1600.00 € submit
1600.00 € submit
Licence: Commercial Use - ELRA VAR
1600.00 € submit
1600.00 € submit
 GeFRePaC - German French Reciprocal Parallel Corpus    
  • French
  • German

ID: ELRA-W0031

ISLRN: 086-761-267-762-3

The German-French Reciprocal Parallel Corpus (GeFRePaC) was produced by the Multilinguale Forschung/Multilingual Research Abteilung Lexik, Institut für Deutsche Sprache (Germany) through a funding from ELRA in the framework of the European Commission project LRsP&P (Language Resources Production ...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
0.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
0.00 € submit
 German Kids Speech Recognition Corpus (Desktop)    
  • German

ID: ELRA-S0228-100

ISLRN: 623-099-052-821-2

This corpus comprises 9,572 entries uttered by 30 speakers (15 males and 15 females), recorded over 2 channels (desktop in quiet office/home). Speech samples are stored as a sequence of 16-bit 44.1kHz for a total of 4.25 hours of speech per channel.

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
6000.00 € submit
6000.00 € submit
Licence: Commercial Use - ELRA VAR
6000.00 € submit
6000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
6000.00 € submit
6000.00 € submit
Licence: Commercial Use - ELRA VAR
6000.00 € submit
6000.00 € submit

« Previous | Next »