    Catalog Reference : ELRA-W0122
    2007 CoNLL Shared Task - Greek, Hungarian & Italian
    2007 CoNLL Shared Task - Greek, Hungarian & Italian consists of dependency treebanks in three languages that were used in the CoNLL 2007 shared task on multilingual dependency parsing and domain adaptation. The languages covered in this release are Greek, Hungarian and Italian.

    The Conference on Computational Natural Language Learning (CoNLL) is accompanied every year by a shared task intended to promote natural language processing applications and evaluate them in a standard setting. In 2006 and 2007, the shared task was devoted to the parsing of syntactic dependencies using corpora from up to thirteen languages. The task aimed to define and extend the then-current state of the art in dependency parsing, a technology that complemented previous tasks by producing a different kind of syntactic description of input text. The 2007 shared task added a domain adaptation track for English alongside the multilingual track. More information about CoNLL and about the 2007 shared task is available, respectively, at: http://www.signll.org/conll/ and http://www.conll.org/previous-tasks.

    The source data in the treebanks in this release consists principally of various texts (e.g., textbooks, news, literature) annotated in dependency format. In general, dependency grammar is based on the idea that the verb is the center of the clause structure and that the other units in the sentence are connected to the verb through directed links, or dependencies. The resulting structure is a one-to-one correspondence: for every element in the sentence there is exactly one node in the sentence structure that corresponds to that element. In constituency or phrase structure grammars, on the other hand, clauses are divided into noun phrases and verb phrases, and one or more nodes may correspond to a single element of the sentence. All of the data sets in this release are dependency treebanks.
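
The one-to-one correspondence described above can be sketched in a few lines of Python. This is a minimal illustration, not the official reader for these treebanks. It assumes the tab-separated, ten-column CoNLL-X layout used in the 2006 and 2007 shared tasks (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL). The English sample sentence, its tags and its relation labels are invented for illustration and do not come from the Greek, Hungarian or Italian data; the names `Token`, `parse_sentence` and `dependents` are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    idx: int     # 1-based position of the word in the sentence
    form: str    # surface form of the word
    head: int    # position of the governing word, 0 for the root
    deprel: str  # label of the dependency link to the head


def parse_sentence(block: str) -> List[Token]:
    """Read one sentence: one line per word, ten tab-separated columns (assumed CoNLL-X layout)."""
    tokens = []
    for line in block.strip().splitlines():
        cols = line.split("\t")
        # Columns 1, 2, 7 and 8 hold ID, FORM, HEAD and DEPREL.
        tokens.append(Token(int(cols[0]), cols[1], int(cols[6]), cols[7]))
    return tokens


def dependents(tokens: List[Token], head_idx: int) -> List[str]:
    """Return the forms of all words whose head is the word at head_idx."""
    return [t.form for t in tokens if t.head == head_idx]


# Invented sample: "The parser reads text", with the verb as the root.
SAMPLE = "\n".join([
    "1\tThe\tthe\tD\tDT\t_\t2\tdet\t_\t_",
    "2\tparser\tparser\tN\tNN\t_\t3\tsubj\t_\t_",
    "3\treads\tread\tV\tVBZ\t_\t0\troot\t_\t_",
    "4\ttext\ttext\tN\tNN\t_\t3\tobj\t_\t_",
])

tokens = parse_sentence(SAMPLE)
root = [t for t in tokens if t.head == 0][0]
```

Each input line yields exactly one `Token`, so the structure has one node per word, and the verb `reads` is the only token whose head is 0. A constituency tree for the same sentence would need extra nodes for the noun phrase and the verb phrase.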

    The individual data sets are:
    Greek Dependency Treebank (Greek)
    The Szeged Treebank (SzTB) (Hungarian)
    ISST-CoNLL (Italian)

    This corpus is distributed jointly with LDC. The LDC Catalogue Reference is: https://catalog.ldc.upenn.edu/LDC2018T07.

    ISLRN : 270-733-242-642-3
    Creation date : 2007
    Technical Information
    Distribution medium : Downloadable
    Contents : written corpus
    Member Prices
    Academic - Research : Free
    Commercial - Research : Free
    Non-Member Prices
    Academic - Research : Free
    Commercial - Research : Free

    Copyright © 2008 ELRA