15 Language Resources

Order by:

 APASCI    
  • Italian

ID: ELRA-S0039

ISLRN: 501-292-014-931-9

APASCI is an Italian speech database recorded in an insulated room with a Sennheiser MKH 416 T microphone. It includes 5,290 phonetically rich sentences and 10,800 isolated digits, for a total of 58,924 word occurrences (2,191 different words) and 641 minutes of speech. The speech material was re...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
800.00 € submit
20000.00 € submit
Licence: Commercial Use - ELRA VAR
20000.00 € submit
20000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
1600.00 € submit
25000.00 € submit
Licence: Commercial Use - ELRA VAR
25000.00 € submit
25000.00 € submit
 ARCADE/ROMANSEVAL corpus    
  • English
  • French
  • Italian

ID: ELRA-W0018

ISLRN: 681-769-134-114-2

The ARCADE/ROMANSEVAL corpus was used as a reference corpus in two international competitions: · ARCADE, an exercise on multilingual text alignment financed by AUPELF-UREF · ROMANSEVAL, part of the SENSEVAL exercise sponsored by ACL-SIGLEX and EURALEX, on word sense disambiguation. The corpus con...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
2000.00 € submit
Licence: Commercial Use - ELRA VAR
2000.00 € submit
2000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
5000.00 € submit
Licence: Commercial Use - ELRA VAR
5000.00 € submit
5000.00 € submit
 CLIPS_MT_MANUAL    
  • Italian

ID: ELRA-S0369

ISLRN: 865-893-448-970-5

CLIPS_MT_MANUAL is a sub-corpus of the original Italian CLIPS corpus (Corpora e Lessici dell'Italiano Parlato e Scritto). This corpus contains 3228 inspected and partially repaired WAV signal files, each containing one dialogue turn (*.wav), 3228 corrected original CLIPS annotation files (*.acs, ...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
127.82 € submit
255.64 € submit
Licence: Commercial Use - ELRA VAR
255.64 € submit
255.64 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
255.65 € submit
511.30 € submit
Licence: Commercial Use - ELRA VAR
511.30 € submit
511.30 € submit
 C-ORAL-ROM - Integrated reference corpora for spoken romance languages. Multi-media edition; tools of analysis; standard linguistic measurements for validation in HLT    
  • French
  • Italian
  • Portuguese
  • Spanish; Castilian

ID: ELRA-S0172

ISLRN: 318-977-046-077-4

Description The C-ORAL-ROM resource is a multilingual corpus of spontaneous1 speech for the main romance languages of around 1,200,000 words (IST 2000-26228). The resource comprises three components: a)Multimedia corpus; b)Speech software; c)Appendix. The corpus consists of four comparable recor...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
1500.00 € submit
10000.00 € submit
Licence: Commercial Use - ELRA VAR
10000.00 € submit
10000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
3000.00 € submit
20000.00 € submit
Licence: Commercial Use - ELRA VAR
20000.00 € submit
20000.00 € submit
 Italian Speech Corpus 1 (Appen)    
  • Italian

ID: ELRA-S0147

ISLRN: 458-657-455-735-5

The Italian Speech Corpus 1 contains the recordings of 202 native Italian speakers (112 males, 90 females) recorded in an office and a closed public place, over 4 channels, in a range of low to medium background noise environments (Plantronics Audio 10 (computer/desk mic), Shure SM58 (desk mounte...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
1200.00 € submit
9500.00 € submit
Licence: Commercial Use - ELRA VAR
9500.00 € submit
9500.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
1500.00 € submit
15000.00 € submit
Licence: Commercial Use - ELRA VAR
15000.00 € submit
15000.00 € submit
 Italian SpeechDat-Car database    
  • Italian

ID: ELRA-S0144

ISLRN: 513-325-829-468-0

The Italian SpeechDat-Car database contains the recordings of 300 Italian speakers (149 females, 151 males) recorded over the GSM telephone network, in a car. This database is partitioned into 14 DVDs. The speech data files are in two formats. Four of the 5 microphones were recorded on the comput...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
90000.00 € submit
90000.00 € submit
Licence: Commercial Use - ELRA VAR
90000.00 € submit
90000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
120000.00 € submit
120000.00 € submit
Licence: Commercial Use - ELRA VAR
120000.00 € submit
120000.00 € submit
 Italian Speecon database    
  • Italian

ID: ELRA-S0213

ISLRN: 239-555-046-548-2

The Italian Speecon database is divided into 2 sets: 1) The first set comprises the recordings of 550 adult Italian speakers (273 males, 277 females), recorded over 4 microphone channels in 4 recording environments (office, entertainment, car, public place). 2) The second set comprises the reco...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
50000.00 € submit
67000.00 € submit
Licence: Commercial Use - ELRA VAR
67000.00 € submit
67000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
60000.00 € submit
75000.00 € submit
Licence: Commercial Use - ELRA VAR
75000.00 € submit
75000.00 € submit
 Italian Syntactic-Semantic Treebank (ISST)    
  • Italian

ID: ELRA-W0044

ISLRN: 927-246-660-947-9

ISST comprises 89,941 tokens for the financial-domain part and 215,606 tokens for the general part. It is formatted in XML. ISST has a five-level structure covering orthographic, morpho-syntactic, syntactic and semantic levels of linguistic description. Syntactic annotation is distributed over t...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
100.00 € submit
1500.00 € submit
Licence: Commercial Use - ELRA VAR
1500.00 € submit
1500.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
150.00 € submit
2500.00 € submit
Licence: Commercial Use - ELRA VAR
2500.00 € submit
2500.00 € submit
 Italian TTS Speech Corpus (Appen)    
  • Italian

ID: ELRA-S0148

ISLRN: 976-246-706-503-6

The Italian TTS Speech Corpus contains the recordings of 1 native Italian speaker (male, 50 years old) recorded in a studio over 1 channel (Shure SM15 unidirectional professional head-word condenser microphone). The data collection and transcription were performed by Appen (Australia). Speech sam...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
2000.00 € submit
9000.00 € submit
Licence: Commercial Use - ELRA VAR
9000.00 € submit
9000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
3500.00 € submit
11000.00 € submit
Licence: Commercial Use - ELRA VAR
11000.00 € submit
11000.00 € submit
 MULTEXT JOC Corpus    
  • English
  • French
  • German
  • Italian
  • Spanish; Castilian

ID: ELRA-W0017

ISLRN: 900-482-746-635-0

This CD-ROM contains a part of the corpus developed in the MULTEXT project financed by the European Commission (LRE 62-050). This part contains raw, tagged and aligned data from the Written Questions and Answers of the Official Journal of the European Community. The corpus contains approx. 5 mill...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
2000.00 € submit
Licence: Commercial Use - ELRA VAR
2000.00 € submit
2000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
5000.00 € submit
Licence: Commercial Use - ELRA VAR
5000.00 € submit
5000.00 € submit
 MULTEXT Prosodic database    
  • English
  • French
  • German
  • Italian
  • Spanish; Castilian

ID: ELRA-S0060

ISLRN: 098-719-242-965-4

This database comprises one CD-ROM for each five languages (French, English, Italian, German and Spanish), totalling 4 hours and 20 minutes of speech and involving 50 different speakers (5 male and 5 female per language). The recordings on which the corpus is based consist of passages of about f...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
45.00 € submit
2000.00 € submit
Licence: Commercial Use - ELRA VAR
2000.00 € submit
2000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
100.00 € submit
5000.00 € submit
Licence: Commercial Use - ELRA VAR
5000.00 € submit
5000.00 € submit
 PortMedia French and Italian corpus    
  • French
  • Italian

ID: ELRA-S0371

ISLRN: 135-793-959-390-8

The PortMedia French and Italian corpus was produced by ELDA, with the same paradigm and specifications as the MEDIA speech database (ELRA-S0272) but on a different domain. The method chosen for the corpus construction process is that of a ‘Wizard of Oz’ (WoZ) system. This consists of simulating...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
300.00 € submit
20000.00 € submit
Licence: Evaluation Use - ELRA EVALUATION
1000.00 € submit
1000.00 € submit
Licence: Commercial Use - ELRA VAR
20000.00 € submit
20000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
2000.00 € submit
25000.00 € submit
Licence: Evaluation Use - ELRA EVALUATION
6500.00 € submit
6500.00 € submit
Licence: Commercial Use - ELRA VAR
25000.00 € submit
25000.00 € submit

This resource is also available in a bundle. Check here for bundled pricing.

 SPK    
  • Italian

ID: ELRA-S0049

ISLRN: 604-522-044-889-5

SPK is an Italian speech database of isolated and connected digits. It was designed and collected at the Istituto per la Ricerca Scientifica e Tecnologica (ITC/IRST), Trento, Italy. SPK was conceived for speaker recognition and verification purposes.With this CD-ROM, speech material corresponding...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
400.00 € submit
800.00 € submit
Licence: Commercial Use - ELRA VAR
800.00 € submit
800.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
800.00 € submit
1600.00 € submit
Licence: Commercial Use - ELRA VAR
1600.00 € submit
1600.00 € submit
 The "SIVA" Speech Database for Speaker Verification and Identification    
  • Italian

ID: ELRA-S0028

ISLRN: 405-085-829-227-5

The Italian speech database SIVA (?Speaker Identification and Verification Archives: SIVA?), is a database comprising more than two thousands calls, collected over the public switched telephone network, and available very soon via ELRA. The SIVA database consists of four speaker categories: male...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
1000.00 € submit
3000.00 € submit
Licence: Commercial Use - ELRA VAR
3000.00 € submit
3000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
3000.00 € submit
4500.00 € submit
Licence: Commercial Use - ELRA VAR
4500.00 € submit
4500.00 € submit
 Venice Italian Treebank (VIT)    
  • Italian

ID: ELRA-W0040

ISLRN: 008-645-281-539-8

The VIT, Venice Italian Treebank is the effort of the collaboration of people working at the Laboratory of Computational Linguistics of the University of Venice in the years 1995-2005. It is partly the result of annotation carried out internally with no specific project in mind and no financial s...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
3000.00 € submit
7000.00 € submit
Licence: Commercial Use - ELRA VAR
7000.00 € submit
7000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
4000.00 € submit
10000.00 € submit
Licence: Commercial Use - ELRA VAR
10000.00 € submit
10000.00 € submit