32 Language Resources (Page 2 of 2)


 MCL - Multifunctional Computational Lexicon of Contemporary Portuguese    
  • Portuguese

ID: ELRA-L0096

ISLRN: 489-956-642-755-8

MCL is a frequency lexicon of 26,443 lemmas and 140,315 tokens, with a minimum lemma frequency of 6, extracted from CORLEX, a contemporary Portuguese corpus (16,210,438 words). CORLEX is a subcorpus of the Reference Corpus of Contemporary Portuguese and contains written and spoken texts of several...

MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 0.00 € | commercial: 0.00 €
Licence: Commercial Use - ELRA VAR | academic: 0.00 € | commercial: 0.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 0.00 € | commercial: 0.00 €
Licence: Commercial Use - ELRA VAR | academic: 0.00 € | commercial: 0.00 €
 MLCC Multilingual and Parallel Corpora    
  • Danish
  • Dutch; Flemish
  • English
  • French
  • German
  • Italian
  • Modern Greek (1453-)
  • Portuguese
  • Spanish; Castilian

ID: ELRA-W0023

ISLRN: 963-635-729-341-8

The MLCC text corpus has two main components: one set to allow comparable studies to be carried out in different languages, and one set as the basis for translation studies. The first set is referred to as the Polylingual Document Collection, a collection of newspaper articles from financial newsp...

MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 0.00 € | commercial: 1600.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 0.00 € | commercial: 3600.00 €
 NPChunks    
  • Portuguese

ID: ELRA-W0089

ISLRN: 412-883-442-173-8

NPChunks is a training corpus containing approximately 1,000 sentences, with a total of 24,243 tokens, selected randomly from the written part of the CINTIL corpus. For more information on the CINTIL corpus, see ELRA-W0050, ISLRN: 176-775-844-396-0. The corpus is PoS-annotated at token level, in...

MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 0.00 € | commercial: 0.00 €
Licence: Commercial Use - ELRA VAR | academic: 0.00 € | commercial: 0.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 0.00 € | commercial: 0.00 €
Licence: Commercial Use - ELRA VAR | academic: 0.00 € | commercial: 0.00 €
 PAROLE Portuguese Corpus - complete version    
  • Portuguese

ID: ELRA-W0024-01

ISLRN: 150-996-959-735-6

The PAROLE Portuguese corpus contains approximately 3 million running words of European Portuguese distributed by medium, as follows: * Newspaper: about 65%, covering the period 1996-1997 of 3 titles; * Book: about 20%, concerning 12 titles from 3 publishing houses; * Periodical: about ...

MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 875.00 € | commercial: 1575.00 €
Licence: Commercial Use - ELRA VAR | academic: 2450.00 € | commercial: 2450.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 1250.00 € | commercial: 2250.00 €
Licence: Commercial Use - ELRA VAR | academic: 3500.00 € | commercial: 3500.00 €
 PAROLE Portuguese Corpus - tagged subset    
  • Portuguese

ID: ELRA-W0024-02

ISLRN: 421-666-892-484-5

The PAROLE Portuguese corpus contains approximately 3 million running words of European Portuguese distributed by medium, as follows: * Newspaper: about 65%, covering the period 1996-1997 of 3 titles; * Book: about 20%, concerning 12 titles from 3 publishing houses; * Periodical: about ...

MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 525.00 € | commercial: 875.00 €
Licence: Commercial Use - ELRA VAR | academic: 1750.00 € | commercial: 1750.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 750.00 € | commercial: 1250.00 €
Licence: Commercial Use - ELRA VAR | academic: 2500.00 € | commercial: 2500.00 €
 PAROLE Portuguese Lexicon    
  • Portuguese

ID: ELRA-L0035

ISLRN: 288-684-309-273-5

The PAROLE Portuguese Lexicon comprises 20,000 entries with morphosyntactic and syntactic encoding, in accordance with the PAROLE common encoding standards. The data are in SGML format. *** Introduction to the PAROLE project: the LE-PAROLE project (MLAP/LE2-4017) aims to offer a large-sca...

MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 1400.00 € | commercial: 3500.00 €
Licence: Commercial Use - ELRA VAR | academic: 10500.00 € | commercial: 10500.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 2000.00 € | commercial: 5000.00 €
Licence: Commercial Use - ELRA VAR | academic: 15000.00 € | commercial: 15000.00 €
 Portuguese SpeechDat(II) FDB-4000    
  • Portuguese

ID: ELRA-S0092

ISLRN: 886-605-380-771-9

The Portuguese SpeechDat(II) FDB-4000 comprises 4027 Portuguese speakers (1861 males, 2166 females) recorded over the Portuguese fixed telephone network. This database is partitioned into 11 CDs. The speech databases made within the SpeechDat(II) project were validated by SPEX, the Netherlands, t...

MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 28000.00 € | commercial: 40000.00 €
Licence: Commercial Use - ELRA VAR | academic: 40000.00 € | commercial: 40000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 48000.00 € | commercial: 56000.00 €
Licence: Commercial Use - ELRA VAR | academic: 56000.00 € | commercial: 56000.00 €
 Portuguese SpeechDat(M) database    
  • Portuguese

ID: ELRA-S0068

ISLRN: 181-020-544-041-9

The Portuguese SpeechDat(M) database contains the recordings of 1,001 speakers (453 males, 548 females). This speech database was collected by Portugal Telecom within the European SpeechDat project. Speech signals are stored as sequences of 8 kHz, 8-bit A-law. Files are stored according to the f...

MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 11000.00 € | commercial: 14000.00 €
Licence: Commercial Use - ELRA VAR | academic: 14000.00 € | commercial: 14000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 14000.00 € | commercial: 20000.00 €
Licence: Commercial Use - ELRA VAR | academic: 20000.00 € | commercial: 20000.00 €
 Portuguese Speecon database    
  • Portuguese

ID: ELRA-S0180

ISLRN: 824-839-200-501-4

The Portuguese Speecon database is divided into 2 sets: 1) The first set comprises the recordings of 553 adult Portuguese speakers (266 males, 287 females), recorded over 4 microphone channels in 4 recording environments (office, entertainment, car, public place). 2) The second set comprises th...

MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 50000.00 € | commercial: 67000.00 €
Licence: Commercial Use - ELRA VAR | academic: 67000.00 € | commercial: 67000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 60000.00 € | commercial: 75000.00 €
Licence: Commercial Use - ELRA VAR | academic: 75000.00 € | commercial: 75000.00 €
 PTPARL Corpus    
  • Portuguese

ID: ELRA-W0060

ISLRN: 294-303-577-819-2

The PTPARL Corpus contains 1,076 texts consisting of adapted transcriptions of the Portuguese Parliament sessions. The corpus contains 1,000,441 tokens. The corpus is delivered in one file, in two different formats. The txt version has one sentence per line, an identification number for each te...

MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 0.00 € | commercial: 0.00 €
Licence: Commercial Use - ELRA VAR | academic: 0.00 € | commercial: 0.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 0.00 € | commercial: 0.00 €
Licence: Commercial Use - ELRA VAR | academic: 0.00 € | commercial: 0.00 €
 Spoken Portuguese Corpus    
  • Portuguese

ID: ELRA-S0345

ISLRN: 969-074-010-182-2

The Spoken Portuguese corpus was collected among sociolinguistically diverse speakers having Portuguese as mother tongue or as second language. In a total of 86 recordings, the texts exemplify the Portuguese spoken in Portugal (30), in Brazil (20), in the African countries with Portuguese as its ...

MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 0.00 € | commercial: 2500.00 €
Licence: Commercial Use - ELRA VAR | academic: 2500.00 € | commercial: 2500.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 0.00 € | commercial: 3000.00 €
Licence: Commercial Use - ELRA VAR | academic: 3000.00 € | commercial: 3000.00 €
 The CINTIL Corpus – International Corpus of Portuguese    
  • Portuguese

ID: ELRA-W0050

ISLRN: 176-775-844-396-0

CINTIL (Corpus Internacional do Português) is a linguistically interpreted written and spoken corpus of European Portuguese. It is composed of one million annotated tokens, each of which was verified by human expert annotators. The annotation comprises information on part-of-speech, open class lemm...

MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 250.00 € | commercial: 10000.00 €
Licence: Commercial Use - ELRA VAR | academic: 10000.00 € | commercial: 10000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER | academic: 250.00 € | commercial: 15000.00 €
Licence: Commercial Use - ELRA VAR | academic: 15000.00 € | commercial: 15000.00 €
