43 Language Resources (Page 2 of 3)


 GlobalPhone Portuguese (Brazilian)    
  • Portuguese

ID: ELRA-S0201

ISLRN: 803-518-309-388-6

The GlobalPhone corpus, developed in collaboration with the Karlsruhe Institute of Technology (KIT), was designed to provide read speech data for the development and evaluation of large continuous speech recognition systems in the most widespread languages of the world, and to provide a uniform, mu...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 600.00 € | 3000.00 €
Licence: Commercial Use - ELRA VAR | 3000.00 € | 3000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 700.00 € | 3600.00 €
Licence: Commercial Use - ELRA VAR | 3600.00 € | 3600.00 €

Special offers are also available. Check here for details.

 GlobalPhone Portuguese (Brazilian) Pronunciation Dictionary      
  • Portuguese

ID: ELRA-S0355

ISLRN: 240-605-210-375-4

The GlobalPhone pronunciation dictionaries, created within the framework of the multilingual speech and language corpus GlobalPhone, were developed in collaboration with the Karlsruhe Institute of Technology (KIT). The GlobalPhone pronunciation dictionaries contain the pronunciations of all wo...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 600.00 € | 3000.00 €
Licence: Commercial Use - ELRA VAR | 3000.00 € | 3000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 700.00 € | 3600.00 €
Licence: Commercial Use - ELRA VAR | 3600.00 € | 3600.00 €

Special offers are also available. Check here for details.

 LABEL-LEX (MW)    
  • Portuguese

ID: ELRA-L0054

ISLRN: 502-837-497-805-9

LABEL-LEX (MW) is a Portuguese formalized lexicon containing 88,619 inflected multiword lexical units (formally, sequences of simple words). The units are distributed as follows: - 85,881 nouns, with information about type, gender, number, inflected forms, irregular inflected forms and subcatego...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 3000.00 € | 10000.00 €
Licence: Commercial Use - ELRA VAR | 10000.00 € | 10000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 15000.00 €
Licence: Commercial Use - ELRA VAR | 15000.00 € | 15000.00 €
 LABEL-LEX (SW)    
  • Portuguese

ID: ELRA-L0055

ISLRN: 154-511-437-811-6

LABEL-LEX (SW) is a Portuguese formalized lexicon, containing 1,545,481 simple inflected words. The words are distributed as follows: - 142,236 nouns, with information about type, gender, number, inflected forms, and irregular inflected forms - 3,155 adverbs, with information about degree, polari...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 2500.00 € | 10000.00 €
Licence: Commercial Use - ELRA VAR | 10000.00 € | 10000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 15000.00 €
Licence: Commercial Use - ELRA VAR | 15000.00 € | 15000.00 €
 Learner Corpus of Portuguese L2 – COPLE2    
  • Portuguese

ID: ELRA-W0331

ISLRN: 936-320-703-366-7

The Learner Corpus of Portuguese as Second/Foreign Language (COPLE2) is a corpus of written and oral texts produced by students of Portuguese as Foreign/Second Language courses in the Instituto de Cultura e Língua Portuguesa (the Institute of Portuguese Language and Culture) (ICLP – FLUL) and by ...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 1000.00 €
Licence: Commercial Use - ELRA VAR | 1000.00 € | 1000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 1000.00 €
Licence: Commercial Use - ELRA VAR | 1000.00 € | 1000.00 €
 LECTRA (LECture TRAnscriptions in European Portuguese)    
  • Portuguese

ID: ELRA-S0366

ISLRN: 298-379-572-530-5

This corpus is composed of the audio and the manual transcriptions of the LECTRA Corpus: classroom LECture TRAnscriptions in European Portuguese. The corpus includes seven one-semester university courses. All lectures were taught at the Technical University of Lisbon (IST), recorded in the presence of ...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 400.00 € | 1250.00 €
Licence: Commercial Use - ELRA VAR | 1250.00 € | 1250.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 800.00 € | 2500.00 €
Licence: Commercial Use - ELRA VAR | 2500.00 € | 2500.00 €
 LEX-MWE-PT - Word Combination in Portuguese    
  • Portuguese

ID: ELRA-L0097

ISLRN: 353-430-176-260-6

LEX-MWE-PT is a lexicon of European Portuguese containing multiword expressions (MWE) extracted from a balanced 50.8M-word written corpus – a subcorpus of the Reference Corpus of Contemporary Portuguese (CRPC). This corpus covers different genres, being mainly constituted by journalistic texts (5...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 100.00 € | 1000.00 €
Licence: Commercial Use - ELRA VAR | 1000.00 € | 1000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 180.00 € | 1800.00 €
Licence: Commercial Use - ELRA VAR | 1800.00 € | 1800.00 €
 LT Corpus    
  • Portuguese

ID: ELRA-W0059

ISLRN: 569-208-468-863-2

The LT Corpus is composed of 70 fiction texts by renowned Portuguese authors. The corpus contains 1,781,083 tokens. The texts date from before 1940. The corpus is delivered in one file, in two different formats. The txt version has one sentence per line, an identification number for each tex...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 2500.00 €
Licence: Commercial Use - ELRA VAR | 2500.00 € | 2500.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 3000.00 €
Licence: Commercial Use - ELRA VAR | 3000.00 € | 3000.00 €
 LusoLEX European Portuguese Lexicon    
  • Portuguese

ID: ELRA-L0033

ISLRN: 686-955-010-935-8

LusoLEX is a multifunctional monolingual lexicon of the European variety of Portuguese, developed by the Natural Language Group of INESC. It has about 61,000 entries (lemmas) and 1,600 corresponding inflexion paradigms. The set of entries includes compound words and the inflexion paradigms includ...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 3000.00 € | 25000.00 €
Licence: Commercial Use - ELRA VAR | 25000.00 € | 25000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 30000.00 €
Licence: Commercial Use - ELRA VAR | 30000.00 € | 30000.00 €

This resource is also available in a bundle. Check here for bundled pricing.

 MCL - Multifunctional Computational Lexicon of Contemporary Portuguese    
  • Portuguese

ID: ELRA-L0096

ISLRN: 489-956-642-755-8

MCL is a frequency lexicon of 26,443 lemmas with 140,315 tokens and a minimum lemma frequency of 6, extracted from CORLEX, a contemporary Portuguese corpus (16,210,438 words). CORLEX is a subcorpus of the Reference Corpus of Contemporary Portuguese and contains written and spoken texts of several...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 €
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 €
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 €
 Morphological lexicon - Portuguese    
  • Portuguese

ID: ELRA-L0203-11

ISLRN: 678-879-069-421-5

Morphological lists linking inflected forms to their lemmas, distributed as follows (catalogue references from ELRA-L0203-01 to ELRA-L0203-15):
Language | Code | Lemmas | Word forms
Dutch | nl | 157,000 | 205,603
English | en | 69,308 | 160,441
French | fr | 79,843 | 442,085
German | de | 95,282 | 456,244
Hebrew | he | ...

MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 1260.00 € | 1260.00 €
NON MEMBER | academic | commercial
Licence: Commercial Use - ELRA VAR | 1323.00 € | 1323.00 €
 NPChunks    
  • Portuguese

ID: ELRA-W0089

ISLRN: 412-883-442-173-8

NPChunks is a training corpus containing approximately 1,000 sentences, with a total of 24,243 tokens, selected randomly from the written part of the CINTIL corpus. For more information on the CINTIL corpus, see ELRA-W0050, ISLRN: 176-775-844-396-0. The corpus is PoS-annotated at token level, ...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 €
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 €
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 €
 PAROLE Portuguese Corpus - complete version    
  • Portuguese

ID: ELRA-W0024-01

ISLRN: 150-996-959-735-6

The PAROLE Portuguese corpus contains approximately 3 million running words of European Portuguese, distributed by medium as follows:
* Newspaper: about 65%, from 3 titles covering the period 1996-1997;
* Book: about 20%, from 12 titles by 3 publishing houses;
* Periodical: about ...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 875.00 € | 1575.00 €
Licence: Commercial Use - ELRA VAR | 2450.00 € | 2450.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 1250.00 € | 2250.00 €
Licence: Commercial Use - ELRA VAR | 3500.00 € | 3500.00 €
 PAROLE Portuguese Corpus - tagged subset    
  • Portuguese

ID: ELRA-W0024-02

ISLRN: 421-666-892-484-5

The PAROLE Portuguese corpus contains approximately 3 million running words of European Portuguese, distributed by medium as follows:
* Newspaper: about 65%, from 3 titles covering the period 1996-1997;
* Book: about 20%, from 12 titles by 3 publishing houses;
* Periodical: about ...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 525.00 € | 875.00 €
Licence: Commercial Use - ELRA VAR | 1750.00 € | 1750.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 750.00 € | 1250.00 €
Licence: Commercial Use - ELRA VAR | 2500.00 € | 2500.00 €
 PAROLE Portuguese Lexicon    
  • Portuguese

ID: ELRA-L0035

ISLRN: 288-684-309-273-5

The PAROLE Portuguese Lexicon consists of 20,000 entries with morphosyntactic and syntactic encoding, according to the PAROLE common encoding standards. The data is in SGML format. Introduction to the PAROLE project: the LE-PAROLE project (MLAP/LE2-4017) aims to offer a large-sca...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 1400.00 € | 3500.00 €
Licence: Commercial Use - ELRA VAR | 10500.00 € | 10500.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 2000.00 € | 5000.00 €
Licence: Commercial Use - ELRA VAR | 15000.00 € | 15000.00 €
 Portuguese SpeechDat(II) FDB-4000    
  • Portuguese

ID: ELRA-S0092

ISLRN: 886-605-380-771-9

The Portuguese SpeechDat(II) FDB-4000 comprises 4027 Portuguese speakers (1861 males, 2166 females) recorded over the Portuguese fixed telephone network. This database is partitioned into 11 CDs. The speech databases made within the SpeechDat(II) project were validated by SPEX, the Netherlands, t...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 28000.00 € | 40000.00 €
Licence: Commercial Use - ELRA VAR | 40000.00 € | 40000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 48000.00 € | 56000.00 €
Licence: Commercial Use - ELRA VAR | 56000.00 € | 56000.00 €
 Portuguese SpeechDat(M) database    
  • Portuguese

ID: ELRA-S0068

ISLRN: 181-020-544-041-9

The Portuguese SpeechDat(M) database contains the recordings of 1,001 speakers (453 males, 548 females). This speech database was collected by Portugal Telecom within the European SpeechDat project. Speech signals are stored as sequences of 8 kHz, 8-bit A-law. Files are stored according to the f...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 11000.00 € | 14000.00 €
Licence: Commercial Use - ELRA VAR | 14000.00 € | 14000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 14000.00 € | 20000.00 €
Licence: Commercial Use - ELRA VAR | 20000.00 € | 20000.00 €
 Portuguese Speech Recognition Corpus (Desktop)    
  • Portuguese

ID: ELRA-S0228-83

ISLRN: 044-289-806-584-3

This corpus comprises 49,988 entries uttered by 50 speakers (26 males and 24 females), recorded over 2 channels (desktop in quiet office). Speech samples are stored as 16-bit, 48 kHz sequences, for a total of 26.41 hours of speech per channel.

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 6000.00 € | 6000.00 €
Licence: Commercial Use - ELRA VAR | 6000.00 € | 6000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 6000.00 € | 6000.00 €
Licence: Commercial Use - ELRA VAR | 6000.00 € | 6000.00 €
 Portuguese Speecon database    
  • Portuguese

ID: ELRA-S0180

ISLRN: 824-839-200-501-4

The Portuguese Speecon database is divided into 2 sets: 1) The first set comprises the recordings of 553 adult Portuguese speakers (266 males, 287 females), recorded over 4 microphone channels in 4 recording environments (office, entertainment, car, public place). 2) The second set comprises th...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 50000.00 € | 67000.00 €
Licence: Commercial Use - ELRA VAR | 67000.00 € | 67000.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 60000.00 € | 75000.00 €
Licence: Commercial Use - ELRA VAR | 75000.00 € | 75000.00 €
 PTPARL Corpus    
  • Portuguese

ID: ELRA-W0060

ISLRN: 294-303-577-819-2

The PTPARL Corpus contains 1,076 texts consisting of adapted transcriptions of the Portuguese Parliament sessions. The corpus contains 1,000,441 tokens. The corpus is delivered in one file, in two different formats. The txt version has one sentence per line, an identification number for each ...

MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 €
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 €
NON MEMBER | academic | commercial
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 €
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 €
