23 Language Resources (Page 1 of 2)


 APASCI    
  • Italian

ID: ELRA-S0039

ISLRN: 501-292-014-931-9

APASCI is an Italian speech database recorded in an insulated room with a Sennheiser MKH 416 T microphone. It includes 5,290 phonetically rich sentences and 10,800 isolated digits, for a total of 58,924 word occurrences (2,191 different words) and 641 minutes of speech. The speech material was re...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 800.00 €, commercial: 20000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 20000.00 €, commercial: 20000.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 1600.00 €, commercial: 25000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 25000.00 €, commercial: 25000.00 €)
 ARCADE/ROMANSEVAL corpus    
  • English
  • French
  • Italian

ID: ELRA-W0018

ISLRN: 681-769-134-114-2

The ARCADE/ROMANSEVAL corpus was used as a reference corpus in two international competitions: ARCADE, an exercise on multilingual text alignment financed by AUPELF-UREF; and ROMANSEVAL, part of the SENSEVAL exercise sponsored by ACL-SIGLEX and EURALEX, on word sense disambiguation. The corpus con...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 0.00 €, commercial: 2000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 2000.00 €, commercial: 2000.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 0.00 €, commercial: 5000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 5000.00 €, commercial: 5000.00 €)
 AURORA Project database - Subset of SpeechDat-Car - Italian database - Evaluation Package    
  • Italian

ID: ELRA-AURORA-CD0003-05

ISLRN: 928-117-373-440-7

The Aurora project was originally set up to establish a worldwide standard for the feature extraction software which forms the core of the front-end of a DSR (Distributed Speech Recognition) system. ETSI formally adopted this activity as work items 007 and 008. The two work items within ETSI are ...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 1000.00 €, commercial: 1000.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 1000.00 €, commercial: 1000.00 €)
 CLIPS_MT_MANUAL    
  • Italian

ID: ELRA-S0369

ISLRN: 865-893-448-970-5

CLIPS_MT_MANUAL is a sub-corpus of the original Italian CLIPS corpus (Corpora e Lessici dell'Italiano Parlato e Scritto). This corpus contains 3228 inspected and partially repaired WAV signal files, each containing one dialogue turn (*.wav), 3228 corrected original CLIPS annotation files (*.acs, ...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 127.82 €, commercial: 255.64 €)
Licence: Commercial Use - ELRA VAR (academic: 255.64 €, commercial: 255.64 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 255.65 €, commercial: 511.30 €)
Licence: Commercial Use - ELRA VAR (academic: 511.30 €, commercial: 511.30 €)
 C-ORAL-ROM - Integrated reference corpora for spoken romance languages. Multi-media edition; tools of analysis; standard linguistic measurements for validation in HLT    
  • French
  • Italian
  • Portuguese
  • Spanish; Castilian

ID: ELRA-S0172

ISLRN: 318-977-046-077-4

The C-ORAL-ROM resource is a multilingual corpus of spontaneous speech for the main Romance languages of around 1,200,000 words (IST 2000-26228). The resource comprises three components: a) Multimedia corpus; b) Speech software; c) Appendix. The corpus consists of four comparable recor...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 1500.00 €, commercial: 10000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 10000.00 €, commercial: 10000.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 3000.00 €, commercial: 20000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 20000.00 €, commercial: 20000.00 €)
 ECI-ELSNET Italian & German tagged sub-corpus    
  • German
  • Italian

ID: ELRA-W0005

ISLRN: 869-857-775-378-7

The objective is to provide a small but fine-grained morphosyntactically tagged corpus, 50,000 running words for each of the two languages (Italian and German), to be used in research work on tagging methods and models. The text for German comes from the Frankfurter Rundschau extracted from the EC...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 20.00 €, commercial: 20.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 45.00 €, commercial: 45.00 €)
 European Parliament Interpretation Corpus (EPIC)    
  • English
  • Italian
  • Spanish; Castilian

ID: ELRA-S0323

ISLRN: 716-168-855-843-2

The EPIC corpus is a parallel corpus of European Parliament speeches and their corresponding simultaneous interpretations. This corpus includes source speeches in Italian, English and Spanish and interpreted speeches in all possible combinations and directions (from English into Italian and Spani...

MEMBER (academic, commercial)
Licence: Non Commercial Use - ELRA END USER (0.00 €)
NON MEMBER (academic, commercial)
Licence: Non Commercial Use - ELRA END USER (0.00 €)
 IBNC - An Italian Broadcast News Corpus    
  • Italian

ID: ELRA-S0093

ISLRN: 133-155-327-792-1

The Italian Broadcast News Corpus (IBNC) was produced by the ITC-IRST (Italy) with funding from ELRA in the framework of the European Commission project LRsP&P (Language Resources Production & Packaging - LE4-8335). RAI, the major Italian broadcasting company, supplied studio-quality recordings...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 5000.00 €, commercial: 15000.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 8000.00 €, commercial: 25000.00 €)
 Italian Speech Corpus 1 (Appen)    
  • Italian

ID: ELRA-S0147

ISLRN: 458-657-455-735-5

The Italian Speech Corpus 1 contains the recordings of 202 native Italian speakers (112 males, 90 females) recorded in an office and a closed public place, over 4 channels, in a range of low to medium background noise environments (Plantronics Audio 10 (computer/desk mic), Shure SM58 (desk mounte...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 1200.00 €, commercial: 9500.00 €)
Licence: Commercial Use - ELRA VAR (academic: 9500.00 €, commercial: 9500.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 1500.00 €, commercial: 15000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 15000.00 €, commercial: 15000.00 €)
 Italian SpeechDat-Car database    
  • Italian

ID: ELRA-S0144

ISLRN: 513-325-829-468-0

The Italian SpeechDat-Car database contains the recordings of 300 Italian speakers (149 females, 151 males) recorded over the GSM telephone network, in a car. This database is partitioned into 14 DVDs. The speech data files are in two formats. Four of the 5 microphones were recorded on the comput...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 90000.00 €, commercial: 90000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 90000.00 €, commercial: 90000.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 120000.00 €, commercial: 120000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 120000.00 €, commercial: 120000.00 €)
 Italian Speecon database    
  • Italian

ID: ELRA-S0213

ISLRN: 239-555-046-548-2

The Italian Speecon database is divided into 2 sets: 1) The first set comprises the recordings of 550 adult Italian speakers (273 males, 277 females), recorded over 4 microphone channels in 4 recording environments (office, entertainment, car, public place). 2) The second set comprises the reco...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 50000.00 €, commercial: 67000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 67000.00 €, commercial: 67000.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 60000.00 €, commercial: 75000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 75000.00 €, commercial: 75000.00 €)
 Italian Syntactic-Semantic Treebank (ISST)    
  • Italian

ID: ELRA-W0044

ISLRN: 927-246-660-947-9

ISST comprises 89,941 tokens for the financial-domain part and 215,606 tokens for the general part. It is formatted in XML. ISST has a five-level structure covering orthographic, morpho-syntactic, syntactic and semantic levels of linguistic description. Syntactic annotation is distributed over t...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 100.00 €, commercial: 1500.00 €)
Licence: Commercial Use - ELRA VAR (academic: 1500.00 €, commercial: 1500.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 150.00 €, commercial: 2500.00 €)
Licence: Commercial Use - ELRA VAR (academic: 2500.00 €, commercial: 2500.00 €)
 Italian TTS Speech Corpus (Appen)    
  • Italian

ID: ELRA-S0148

ISLRN: 976-246-706-503-6

The Italian TTS Speech Corpus contains the recordings of 1 native Italian speaker (male, 50 years old) recorded in a studio over 1 channel (Shure SM15 unidirectional professional head-worn condenser microphone). The data collection and transcription were performed by Appen (Australia). Speech sam...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 2000.00 €, commercial: 9000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 9000.00 €, commercial: 9000.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 3500.00 €, commercial: 11000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 11000.00 €, commercial: 11000.00 €)
 MLCC Multilingual and Parallel Corpora    
  • Danish
  • Dutch; Flemish
  • English
  • French
  • German
  • Italian
  • Modern Greek (1453-)
  • Portuguese
  • Spanish; Castilian

ID: ELRA-W0023

ISLRN: 963-635-729-341-8

The MLCC text corpus has two main components - one set to allow comparable studies to be carried out in different languages and one set as the basis for translation studies. The first set is referred to as the Polylingual Document Collection, a collection of newspaper articles from financial newsp...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 0.00 €, commercial: 1600.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 0.00 €, commercial: 3600.00 €)
 MULTEXT JOC Corpus    
  • English
  • French
  • German
  • Italian
  • Spanish; Castilian

ID: ELRA-W0017

ISLRN: 900-482-746-635-0

This CD-ROM contains a part of the corpus developed in the MULTEXT project financed by the European Commission (LRE 62-050). This part contains raw, tagged and aligned data from the Written Questions and Answers of the Official Journal of the European Community. The corpus contains approx. 5 mill...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 0.00 €, commercial: 2000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 2000.00 €, commercial: 2000.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 0.00 €, commercial: 5000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 5000.00 €, commercial: 5000.00 €)
 MULTEXT Prosodic database    
  • English
  • French
  • German
  • Italian
  • Spanish; Castilian

ID: ELRA-S0060

ISLRN: 098-719-242-965-4

This database comprises one CD-ROM for each of the five languages (French, English, Italian, German and Spanish), totalling 4 hours and 20 minutes of speech and involving 50 different speakers (5 male and 5 female per language). The recordings on which the corpus is based consist of passages of about f...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 45.00 €, commercial: 2000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 2000.00 €, commercial: 2000.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 100.00 €, commercial: 5000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 5000.00 €, commercial: 5000.00 €)
 PANACEA Environment Italian monolingual corpus    
  • Italian

ID: ELRA-W0069

ISLRN: 843-358-936-298-5

The PANACEA Environment Italian monolingual corpus was acquired in the framework of the PANACEA project (Platform for Automatic, Normalized Annotation and Cost-Effective Acquisition of Language Resources for Human Language Technologies), under the European Commission's Seventh Framework Programme...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 0.00 €, commercial: 0.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 0.00 €, commercial: 0.00 €)
 PANACEA Labour Italian monolingual corpus    
  • Italian

ID: ELRA-W0070

ISLRN: 393-864-255-110-7

The PANACEA Labour Italian monolingual corpus was acquired in the framework of the PANACEA project (Platform for Automatic, Normalized Annotation and Cost-Effective Acquisition of Language Resources for Human Language Technologies), under the European Commission's Seventh Framework Programme. T...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 0.00 €, commercial: 0.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 0.00 €, commercial: 0.00 €)
 PAROLE Italian Corpus    
  • Italian

ID: ELRA-W0043

ISLRN: 608-362-291-385-1

The PAROLE Italian Corpus comprises 3,135,651 words collected from four different domains:
  • newspapers: 2,179,800 words from La Stampa, La Repubblica, Il Corriere della Sera, L’Unione Sarda, Il Sole 24ore, between 1992 and 1996,
  • periodicals: 143,810 words from Casaviva, 100cose, Epoca, Espan...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 100.00 €, commercial: 100.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 150.00 €, commercial: 150.00 €)
 PortMedia French and Italian corpus    
  • French
  • Italian

ID: ELRA-S0371

ISLRN: 135-793-959-390-8

The PortMedia French and Italian corpus was produced by ELDA, with the same paradigm and specifications as the MEDIA speech database (ELRA-S0272) but on a different domain. The method chosen for the corpus construction process is that of a ‘Wizard of Oz’ (WoZ) system. This consists of simulating...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 300.00 €, commercial: 20000.00 €)
Licence: Evaluation Use - ELRA EVALUATION (academic: 1000.00 €, commercial: 1000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 20000.00 €, commercial: 20000.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 2000.00 €, commercial: 25000.00 €)
Licence: Evaluation Use - ELRA EVALUATION (academic: 6500.00 €, commercial: 6500.00 €)
Licence: Commercial Use - ELRA VAR (academic: 25000.00 €, commercial: 25000.00 €)

This resource is also available in a bundle, with separate bundled pricing.
