113 Language Resources (Page 2 of 6)


 CINTIL-DependencyBank    
  • Portuguese

ID: ELRA-W0061

ISLRN: 133-035-138-613-6

The CINTIL-DependencyBank (Silva and Branco, 2012) is a corpus of sentences annotated with their syntactic dependency graphs and grammatical function tags. It is composed of 10,039 sentences and 110,166 tokens taken from different sources and domains: news (8,861 sentences; 101,430 tokens), novels (399 ...

MEMBER (academic | commercial)
  Licence: Non Commercial Use - ELRA END USER
    0.00 € | 3000.00 €
  Licence: Commercial Use - ELRA VAR
    3000.00 € | 3000.00 €
NON MEMBER (academic | commercial)
  Licence: Non Commercial Use - ELRA END USER
    0.00 € | 3000.00 €
  Licence: Commercial Use - ELRA VAR
    3000.00 € | 3000.00 €
 CINTIL-PropBank    
  • Portuguese

ID: ELRA-W0056

ISLRN: 723-486-478-286-6

The CINTIL-PropBank is a corpus of sentences annotated with their constituency structure and semantic role tags, composed of 10,039 sentences and 110,166 tokens taken from different sources and domains: news (8,861 sentences; 101,430 tokens), and novels (399 sentences; 3,082 tokens). In addition,...

MEMBER (academic | commercial)
  Licence: Non Commercial Use - ELRA END USER
    0.00 € | 3000.00 €
  Licence: Commercial Use - ELRA VAR
    3000.00 € | 3000.00 €
NON MEMBER (academic | commercial)
  Licence: Non Commercial Use - ELRA END USER
    0.00 € | 3000.00 €
  Licence: Commercial Use - ELRA VAR
    3000.00 € | 3000.00 €
 CINTIL-TreeBank    
  • Portuguese

ID: ELRA-W0055

ISLRN: 411-691-515-701-9

The CINTIL-TreeBank is a corpus of syntactic constituency trees of Portuguese texts composed of 10,039 sentences and 110,166 tokens taken from different sources and domains: news (8,861 sentences; 101,430 tokens), novels (399 sentences; 3,082 tokens). In addition, there are 779 sentences (5,654 t...

MEMBER (academic | commercial)
  Licence: Non Commercial Use - ELRA END USER
    0.00 € | 3000.00 €
  Licence: Commercial Use - ELRA VAR
    3000.00 € | 3000.00 €
NON MEMBER (academic | commercial)
  Licence: Non Commercial Use - ELRA END USER
    0.00 € | 3000.00 €
  Licence: Commercial Use - ELRA VAR
    3000.00 € | 3000.00 €
 CLEFeHealth 2013 Task 3 Evaluation Package    
  • English

ID: ELRA-E0042

ISLRN: 749-772-201-451-6

The CLEF Initiative (Conference and Labs of the Evaluation Forum) promotes the systematic evaluation of information access systems through experimentation on shared tasks, with an emphasis on multilingual and multimodal information. The CLEFeHealth 2013 Task 3 Evaluation Package contains data ...

MEMBER (academic | commercial)
  Licence: Evaluation Use - ELRA EVALUATION
    0.00 € | 0.00 €
NON MEMBER (academic | commercial)
  Licence: Evaluation Use - ELRA EVALUATION
    0.00 € | 0.00 €
 CLEFeHealth 2014 Task 3 Evaluation Package    
  • English

ID: ELRA-E0043

ISLRN: 725-020-897-275-7

The CLEF Initiative (Conference and Labs of the Evaluation Forum) promotes the systematic evaluation of information access systems through experimentation on shared tasks, with an emphasis on multilingual and multimodal information. It is structured in two main parts: a series of Evaluation Labs ...

MEMBER (academic | commercial)
  Licence: Evaluation Use - ELRA EVALUATION
    0.00 € | 0.00 €
NON MEMBER (academic | commercial)
  Licence: Evaluation Use - ELRA EVALUATION
    0.00 € | 0.00 €
 CLEF QAST (2007-2009) – Evaluation Package    
  • English
  • French
  • Spanish; Castilian

ID: ELRA-E0039

ISLRN: 460-370-870-489-0

The Cross-Language Evaluation Forum (CLEF) promotes R&D in multilingual information access (MLIA) by (i) developing an infrastructure for the testing, tuning and evaluation of information retrieval systems operating on European languages in both monolingual and cross-language contexts, and (ii) c...

MEMBER (academic | commercial)
  Licence: Evaluation Use - ELRA EVALUATION
    150.00 € | 500.00 €
NON MEMBER (academic | commercial)
  Licence: Evaluation Use - ELRA EVALUATION
    300.00 € | 1000.00 €

Special offers are also available.

 Corpus for fine-grained analysis and automatic detection of irony on Twitter    
  • English

ID: ELRA-W0337

ISLRN: 478-366-550-085-8

The Corpus for fine-grained analysis and automatic detection of irony on Twitter was carefully annotated by trained annotators (Master’s students in Linguistics) using a detailed annotation scheme for irony categorization, which describes four labels: ‘ironic by means of a polarity contrast’, ‘si...

MEMBER (academic | commercial)
  Licence: Non Commercial Use - ELRA END USER
    0.00 € | 100.00 €
  Licence: Commercial Use - ELRA VAR
    100.00 € | 100.00 €
NON MEMBER (academic | commercial)
  Licence: Non Commercial Use - ELRA END USER
    0.00 € | 200.00 €
  Licence: Commercial Use - ELRA VAR
    200.00 € | 200.00 €
 Corpus of Contemporaneous Spanish Novels    
  • Spanish; Castilian

ID: ELRA-W0041

ISLRN: 837-873-214-287-0

This corpus consists of 11 novels written in Castilian Spanish by Inmaculada Ferrer-Vidal Turull, a contemporary author. The list of novels consists of: - La búsqueda: 113,639 words - Tristeza: 41,125 words - Cuarto menguante: 42,419 words - Recuerdos: 55,694 words - Sucedió en Abril: 46,040 w...

MEMBER (academic | commercial)
  Licence: Non Commercial Use - ELRA END USER
    400.00 € | 800.00 €
  Licence: Commercial Use - ELRA VAR
    800.00 € | 800.00 €
NON MEMBER (academic | commercial)
  Licence: Non Commercial Use - ELRA END USER
    500.00 € | 1000.00 €
  Licence: Commercial Use - ELRA VAR
    1000.00 € | 1000.00 €
 Corpus of Icelandic texts from the Central Bank of Iceland (Processed)    
  • Icelandic

ID: ELRA-W0298

ISLRN: 420-670-865-427-1

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Corpus of Icelandic texts from the Central Bank of Icela...

MEMBER (academic | commercial)
  Licence: Attribution, Other - Open Under-PSI
    0.00 € | 0.00 €
NON MEMBER (academic | commercial)
  Licence: Attribution, Other - Open Under-PSI
    0.00 € | 0.00 €
 Danish Gigaword Corpus    
  • Danish

ID: ELRA-W0318

ISLRN: 024-504-318-388-3

The Danish Gigaword Project (DAGW) maintains a corpus for Danish with over a billion words. The general goals are to create a dataset that is: 1. representative; 2. accessible; 3. a suitable common starting point for Danish NLP models. The present version 1.0 was collected from various webs...

MEMBER (academic | commercial)
  Licence: Attribution - CC-BY-4.0
    0.00 € | 0.00 €
NON MEMBER (academic | commercial)
  Licence: Attribution - CC-BY-4.0
    0.00 € | 0.00 €
 Danish Propbank    
  • Danish

ID: ELRA-W0117

ISLRN: 213-212-351-142-5

The Danish Propbank (DPB) is a multi-layer treebank, annotated not only with morphosyntactic, but also with semantic information, in particular propositions/frames with VerbNet classes and semantic roles for both arguments and satellites. In addition, the corpus has been annotated with 20 Named E...

MEMBER (academic | commercial)
  Licence: Non Commercial Use - ELRA END USER
    150.00 € | 3000.00 €
  Licence: Commercial Use - ELRA VAR
    5000.00 € | 5000.00 €
NON MEMBER (academic | commercial)
  Licence: Non Commercial Use - ELRA END USER
    800.00 € | 7000.00 €
  Licence: Commercial Use - ELRA VAR
    7000.00 € | 7000.00 €
 DEFT'08 Evaluation Package    
  • French

ID: ELRA-E0035

ISLRN: 161-881-080-899-5

DEFT (DEfi Fouille de Texte – Text Mining Challenge) organizes evaluation campaigns in the field of text mining. The DEFT 2008 edition addresses the classification of texts by topic and genre. Automatic classification has multiple applications in text mining. Many application fiel...

MEMBER (academic | commercial)
  Licence: Evaluation Use - ELRA EVALUATION
    240.91 € | 240.91 €
NON MEMBER (academic | commercial)
  Licence: Evaluation Use - ELRA EVALUATION
    313.18 € | 313.18 €
 deL1L2IM corpus    
  • German

ID: ELRA-W0083

ISLRN: 339-799-085-669-8

The deL1L2IM corpus, created between May and August 2012 and last updated in August 2014, was collected within the framework of a PhD project on the development of a learning method involving conversations with an artificial companion. This PhD work is presented as a qualitative investigation...

MEMBER (academic | commercial)
  Licence: Non Commercial Use - ELRA END USER
    0.00 € | 0.00 €
  Licence: Commercial Use - ELRA VAR
    0.00 € | 0.00 €
NON MEMBER (academic | commercial)
  Licence: Non Commercial Use - ELRA END USER
    0.00 € | 0.00 €
  Licence: Commercial Use - ELRA VAR
    0.00 € | 0.00 €
 Dutch PAROLE Distributable Corpus    
  • Dutch; Flemish

ID: ELRA-W0019

ISLRN: 440-290-917-102-7

The Dutch PAROLE Distributable Corpus is a 3-million-word selection from the 20-million-word Dutch PAROLE Reference corpus. The Dutch corpus was annotated and checked according to the common core PAROLE tagset. The Dutch data were also checked for type. The Dutch PAROLE Distributable...

MEMBER (academic | commercial)
  Licence: Non Commercial Use - ELRA END USER
    270.00 € | 800.00 €
  Licence: Commercial Use - ELRA VAR
    1600.00 € | 1600.00 €
NON MEMBER (academic | commercial)
  Licence: Non Commercial Use - ELRA END USER
    300.00 € | 1300.00 €
  Licence: Commercial Use - ELRA VAR
    2500.00 € | 2500.00 €

Special offers are also available.

 EASy Evaluation Package    
  • French

ID: ELRA-E0034

ISLRN: 238-723-334-894-5

The EASy Evaluation Package was produced within the French national project EASy (Evaluation of syntactic parsers of French), as part of the Technolangue programme funded by the French Ministry of Research and New Technologies (MRNT). The project made it possible to carry out a campaign for the evaluation...

MEMBER (academic | commercial)
  Licence: Evaluation Use - ELRA EVALUATION
    150.00 € | 500.00 €
NON MEMBER (academic | commercial)
  Licence: Evaluation Use - ELRA EVALUATION
    300.00 € | 1000.00 €
 ECI-ELSNET Italian & German tagged sub-corpus    
  • German
  • Italian

ID: ELRA-W0005

ISLRN: 869-857-775-378-7

The objective is to provide a small but fine-grained morphosyntactically tagged corpus of 50,000 running words for each of the two languages (Italian and German), to be used in research work on tagging methods and models. The text for German comes from the Frankfurter Rundschau extracted from the EC...

MEMBER (academic | commercial)
  Licence: Non Commercial Use - ELRA END USER
    20.00 € | 20.00 €
NON MEMBER (academic | commercial)
  Licence: Non Commercial Use - ELRA END USER
    45.00 € | 45.00 €
 Ema-lon Manipuri Corpus (including word embedding and language model)    
  • English
  • Manipuri

ID: ELRA-W0316

ISLRN: 588-170-827-016-7

The Ema-lon Manipuri Corpus consists of a set of resources for the Manipuri language (locally known as Meiteilon) for the purpose of machine translation. The main source for these resources is the Sangai Express news website. The resources that constitute the present corpus are listed below: 1. EM C...

MEMBER (academic | commercial)
  Licence: Attribution, Non Commercial Use - CC-BY-NC-4.0
    0.00 € | 0.00 €
NON MEMBER (academic | commercial)
  Licence: Attribution, Non Commercial Use - CC-BY-NC-4.0
    0.00 € | 0.00 €
 EQueR Evaluation Package    
  • French

ID: ELRA-E0022

ISLRN: 725-358-759-122-3

The EQueR Evaluation Package was produced within the French national project EQueR (Evaluation campaign for Question-Answering systems), as part of the Technolangue programme funded by the French Ministry of Research and New Technologies (MRNT). The EQueR project made it possible to carry out a campaign f...

MEMBER (academic | commercial)
  Licence: Evaluation Use - ELRA EVALUATION
    150.00 € | 500.00 €
NON MEMBER (academic | commercial)
  Licence: Evaluation Use - ELRA EVALUATION
    300.00 € | 1000.00 €
 EvaSy Evaluation Package      
  • French

ID: ELRA-E0023

ISLRN: 340-228-754-954-9

The EvaSy Evaluation Package was produced within the French national project EvaSy (Evaluation of speech synthesis systems), as part of the Technolangue programme funded by the French Ministry of Research and New Technologies (MRNT). The EvaSy project made it possible to carry out a campaign for the evalu...

MEMBER (academic | commercial)
  Licence: Evaluation Use - ELRA EVALUATION
    150.00 € | 500.00 €
NON MEMBER (academic | commercial)
  Licence: Evaluation Use - ELRA EVALUATION
    300.00 € | 1000.00 €
 German Political Speeches Corpus    
  • German

ID: ELRA-W0330

ISLRN: 381-445-879-769-5

This corpus consists of a collection of political speeches in German crawled from the online archives of the German Presidency (Bundespräsident) and the Chancellery (Bundesregierung). For the German Presidency the speeches are available from July 1, 1984 to February 17, 2012 and the corpus con...

MEMBER (academic | commercial)
  Licence: Attribution, Share Alike - CC-BY-SA
    0.00 € | 0.00 €
NON MEMBER (academic | commercial)
  Licence: Attribution, Share Alike - CC-BY-SA
    0.00 € | 0.00 €
