11 Language Resources

 CINTIL-DeepBank    
  • Portuguese

ID: ELRA-W0062

ISLRN: 368-672-631-502-0

The CINTIL-DeepBank (Branco et al., 2010) is a corpus of sentences annotated with their full-fledged deep grammatical representations, composed of 10,039 sentences and 110,166 tokens taken from different sources and domains: news (8,861 sentences; 101,430 tokens), and novels (399 sentences; 3,082...

Prices (academic / commercial):
  Member, Non Commercial Use - ELRA END USER: 0.00 € / 3000.00 €
  Member, Commercial Use - ELRA VAR: 3000.00 € / 3000.00 €
  Non-member, Non Commercial Use - ELRA END USER: 0.00 € / 3000.00 €
  Non-member, Commercial Use - ELRA VAR: 3000.00 € / 3000.00 €

 CINTIL-DependencyBank    
  • Portuguese

ID: ELRA-W0061

ISLRN: 133-035-138-613-6

The CINTIL-DependencyBank (Silva and Branco, 2012) is a corpus of sentences annotated with their syntactic dependency graphs and grammatical function tags, composed of 10,039 sentences and 110,166 tokens taken from different sources and domains: news (8,861 sentences; 101,430 tokens), novels (399 ...

Prices (academic / commercial):
  Member, Non Commercial Use - ELRA END USER: 0.00 € / 3000.00 €
  Member, Commercial Use - ELRA VAR: 3000.00 € / 3000.00 €
  Non-member, Non Commercial Use - ELRA END USER: 0.00 € / 3000.00 €
  Non-member, Commercial Use - ELRA VAR: 3000.00 € / 3000.00 €

 CINTIL-PropBank    
  • Portuguese

ID: ELRA-W0056

ISLRN: 723-486-478-286-6

The CINTIL-PropBank is a corpus of sentences annotated with their constituency structure and semantic role tags, composed of 10,039 sentences and 110,166 tokens taken from different sources and domains: news (8,861 sentences; 101,430 tokens), and novels (399 sentences; 3,082 tokens). In addition,...

Prices (academic / commercial):
  Member, Non Commercial Use - ELRA END USER: 0.00 € / 3000.00 €
  Member, Commercial Use - ELRA VAR: 3000.00 € / 3000.00 €
  Non-member, Non Commercial Use - ELRA END USER: 0.00 € / 3000.00 €
  Non-member, Commercial Use - ELRA VAR: 3000.00 € / 3000.00 €

 CINTIL-TreeBank    
  • Portuguese

ID: ELRA-W0055

ISLRN: 411-691-515-701-9

The CINTIL-TreeBank is a corpus of syntactic constituency trees of Portuguese texts composed of 10,039 sentences and 110,166 tokens taken from different sources and domains: news (8,861 sentences; 101,430 tokens), novels (399 sentences; 3,082 tokens). In addition, there are 779 sentences (5,654 t...

Prices (academic / commercial):
  Member, Non Commercial Use - ELRA END USER: 0.00 € / 3000.00 €
  Member, Commercial Use - ELRA VAR: 3000.00 € / 3000.00 €
  Non-member, Non Commercial Use - ELRA END USER: 0.00 € / 3000.00 €
  Non-member, Commercial Use - ELRA VAR: 3000.00 € / 3000.00 €

 Learner Corpus of Portuguese L2 – COPLE2    
  • Portuguese

ID: ELRA-W0331

ISLRN: 936-320-703-366-7

The Learner Corpus of Portuguese as Second/Foreign Language (COPLE2) is a corpus of written and oral texts produced by students of Portuguese as Foreign/Second Language courses in the Instituto de Cultura e Língua Portuguesa (the Institute of Portuguese Language and Culture) (ICLP – FLUL) and by ...

Prices (academic / commercial):
  Member, Non Commercial Use - ELRA END USER: 0.00 € / 1000.00 €
  Member, Commercial Use - ELRA VAR: 1000.00 € / 1000.00 €
  Non-member, Non Commercial Use - ELRA END USER: 0.00 € / 1000.00 €
  Non-member, Commercial Use - ELRA VAR: 1000.00 € / 1000.00 €

 LT Corpus    
  • Portuguese

ID: ELRA-W0059

ISLRN: 569-208-468-863-2

The LT Corpus is composed of 70 fiction texts by renowned Portuguese authors. The corpus contains 1,781,083 tokens. The texts date from before 1940. The corpus is delivered in one file, in two different formats. The txt version has one sentence per line, an identification number for each tex...

Prices (academic / commercial):
  Member, Non Commercial Use - ELRA END USER: 0.00 € / 2500.00 €
  Member, Commercial Use - ELRA VAR: 2500.00 € / 2500.00 €
  Non-member, Non Commercial Use - ELRA END USER: 0.00 € / 3000.00 €
  Non-member, Commercial Use - ELRA VAR: 3000.00 € / 3000.00 €

 NPChunks    
  • Portuguese

ID: ELRA-W0089

ISLRN: 412-883-442-173-8

NPChunks is a training corpus containing approximately 1,000 sentences, with a total of 24,243 tokens, selected randomly from the written part of the CINTIL corpus. For more information on the CINTIL corpus, see ELRA-W0050, ISLRN: 176-775-844-396-0. The corpus is PoS-annotated at token level, ...

Prices (academic / commercial):
  Member, Non Commercial Use - ELRA END USER: 0.00 € / 0.00 €
  Member, Commercial Use - ELRA VAR: 0.00 € / 0.00 €
  Non-member, Non Commercial Use - ELRA END USER: 0.00 € / 0.00 €
  Non-member, Commercial Use - ELRA VAR: 0.00 € / 0.00 €

 PAROLE Portuguese Corpus - complete version    
  • Portuguese

ID: ELRA-W0024-01

ISLRN: 150-996-959-735-6

The PAROLE Portuguese corpus contains approximately 3 million running words of European Portuguese, distributed by medium as follows: * Newspaper: about 65%, from 3 titles covering the period 1996-1997; * Book: about 20%, from 12 titles by 3 publishing houses; * Periodical: about ...

Prices (academic / commercial):
  Member, Non Commercial Use - ELRA END USER: 875.00 € / 1575.00 €
  Member, Commercial Use - ELRA VAR: 2450.00 € / 2450.00 €
  Non-member, Non Commercial Use - ELRA END USER: 1250.00 € / 2250.00 €
  Non-member, Commercial Use - ELRA VAR: 3500.00 € / 3500.00 €

 PAROLE Portuguese Corpus - tagged subset    
  • Portuguese

ID: ELRA-W0024-02

ISLRN: 421-666-892-484-5

The PAROLE Portuguese corpus contains approximately 3 million running words of European Portuguese, distributed by medium as follows: * Newspaper: about 65%, from 3 titles covering the period 1996-1997; * Book: about 20%, from 12 titles by 3 publishing houses; * Periodical: about ...

Prices (academic / commercial):
  Member, Non Commercial Use - ELRA END USER: 525.00 € / 875.00 €
  Member, Commercial Use - ELRA VAR: 1750.00 € / 1750.00 €
  Non-member, Non Commercial Use - ELRA END USER: 750.00 € / 1250.00 €
  Non-member, Commercial Use - ELRA VAR: 2500.00 € / 2500.00 €

 PTPARL Corpus    
  • Portuguese

ID: ELRA-W0060

ISLRN: 294-303-577-819-2

The PTPARL Corpus contains 1,076 texts consisting of adapted transcriptions of the Portuguese Parliament sessions. The corpus contains 1,000,441 tokens. The corpus is delivered in one file, in two different formats. The txt version has one sentence per line, an identification number for each ...

Prices (academic / commercial):
  Member, Non Commercial Use - ELRA END USER: 0.00 € / 0.00 €
  Member, Commercial Use - ELRA VAR: 0.00 € / 0.00 €
  Non-member, Non Commercial Use - ELRA END USER: 0.00 € / 0.00 €
  Non-member, Commercial Use - ELRA VAR: 0.00 € / 0.00 €

 The CINTIL Corpus – International Corpus of Portuguese    
  • Portuguese

ID: ELRA-W0050

ISLRN: 176-775-844-396-0

CINTIL-Corpus Internacional do Português is a linguistically interpreted written and spoken corpus of European Portuguese. It is composed of one million annotated tokens, each of which was verified by human expert annotators. The annotation comprises information on part-of-speech, open class lemm...

Prices (academic / commercial):
  Member, Non Commercial Use - ELRA END USER: 0.00 € / 10000.00 €
  Member, Commercial Use - ELRA VAR: 10000.00 € / 10000.00 €
  Non-member, Non Commercial Use - ELRA END USER: 0.00 € / 15000.00 €
  Non-member, Commercial Use - ELRA VAR: 15000.00 € / 15000.00 €