2 Language Resources

ROCO Romanian journalistic corpus
  • Romanian; Moldavian; Moldovan

ID: ELRA-W0085

ISLRN: 312-617-089-348-7

ROCO is a Romanian journalistic corpus containing approximately 7.1 million tokens and 231,626 types. It is rich in proper names, numerals and named entities. The corpus contains morphosyntactic information (MSD annotations) which has been assigned automatically with the high...

MEMBER
Licence: Non Commercial Use - ELRA END USER
  academic: 0.00 €
  commercial: 3000.00 €
Licence: Commercial Use - ELRA VAR
  academic: 3000.00 €
  commercial: 3000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
  academic: 0.00 €
  commercial: 5000.00 €
Licence: Commercial Use - ELRA VAR
  academic: 5000.00 €
  commercial: 5000.00 €

ROMBAC - Romanian balanced corpus
  • Romanian; Moldavian; Moldovan

ID: ELRA-W0088

ISLRN: 162-192-982-061-0

ROMBAC is a Romanian corpus containing equal shares of texts from 5 different genres: journalism, legalese, fiction, medicine and biographical data for Romanian literary personalities. For each genre, texts have been selected containing around 7,000,000 words, so that the entire corpus counts aro...

MEMBER
Licence: Non Commercial Use - ELRA END USER
  academic: 0.00 €
  commercial: 5000.00 €
Licence: Commercial Use - ELRA VAR
  academic: 5000.00 €
  commercial: 5000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
  academic: 500.00 €
  commercial: 8000.00 €
Licence: Commercial Use - ELRA VAR
  academic: 8000.00 €
  commercial: 8000.00 €