    Catalog Reference : ELRA-L0072-01
    PAROLE-SIMPLE-CLIPS PISA Italian Lexicon – Full lexicon
    This lexicon is subdivided into five different subsets:
    L0072-01 Full lexicon
    L0072-02 Phonetic layer
    L0072-03 Morphological layer
    L0072-04 Syntactic layer
    L0072-05 Semantic layer

    PAROLE-SIMPLE-CLIPS is a four-level, general purpose lexicon that has been elaborated over three different projects. The kernel of the morphological and syntactic lexicons was built in the framework of the LE-PAROLE project. The linguistic model and the core of the semantic lexicon were elaborated in the LE-SIMPLE project, while the phonological level of description and the extension of the lexical coverage were performed in the context of the Italian project Corpora e Lessici dell'Italiano Parlato e Scritto (CLIPS).

    The PAROLE-SIMPLE-CLIPS Pisa Italian Lexicon comprises a total of 387,267 phonetic units, 53,044 morphological units (53,044 lemmas), 37,406 syntactic units (28,111 lemmas) and 28,346 semantic units (19,216 lemmas). It was encoded at the semantic level, in full accordance with the international standards set out in the PAROLE-SIMPLE model and based on EAGLES. Syntactic and semantic encoding were performed jointly with Thamus (Consortium for Multilingual Documentary Engineering), which is responsible for 25,000 extra entries (to be released soon).
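The unit and lemma counts above imply that, on average, a lemma has more than one syntactic or semantic unit. The short sketch below works out those averages from the published figures; the variable and function names are illustrative only, and the phonetic layer is left out because no lemma count is given for it.

```python
# Counts copied from the catalogue description: (units, lemmas) per layer.
LAYER_COUNTS = {
    "morphological": (53_044, 53_044),
    "syntactic": (37_406, 28_111),
    "semantic": (28_346, 19_216),
}


def units_per_lemma(counts):
    """Return the average number of units per lemma for each layer."""
    return {layer: units / lemmas for layer, (units, lemmas) in counts.items()}


ratios = units_per_lemma(LAYER_COUNTS)
for layer, ratio in ratios.items():
    print(f"{layer}: {ratio:.2f} units per lemma")
```

With these figures the averages come to roughly 1.00 (morphological), 1.33 (syntactic) and 1.48 (semantic) units per lemma.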

    PAROLE-SIMPLE-CLIPS therefore offers the advantage of being compatible with the other eleven PAROLE-SIMPLE lexicons, which were built for European languages and share a common theoretical model, representation language and building methodology.

    A PAROLE-SIMPLE-CLIPS entry gathers together all the phonological, morphological and inherent syntactic and semantic properties of a headword. Its subcategorization pattern or patterns are described in terms of optionality, syntactic function and syntagmatic realization, as well as the morpho-syntactic, syntactic and lexical properties of each slot filler. At the semantic level, the theoretical approach adopted by the SIMPLE model is essentially grounded on a revisited version of some fundamental aspects of the Generative Lexicon.

    A SIMPLE-CLIPS semantic unit is richly endowed with a wide range of fine-grained, structured information that is highly relevant for NLP applications. First among these is ontological typing: the lexicon is structured in terms of a multidimensional type system based on both hierarchical and non-hierarchical conceptual relations, taking into account the principle of orthogonal inheritance. Other relevant information types in a word entry are its domain of use; the type of denoted event; synonymy and morphological derivation relations; membership in a class of regular polysemy; and any relevant distinctive semantic features. Particularly noteworthy is the information encoded in the Extended Qualia Structure (a set of 60 semantic relations that model both the different meaning dimensions of a word sense and its relationships to other lexical units) and in the Predicative Representation, which describes the semantic scenario in which the word sense is involved and characterizes its participants in terms of thematic roles and semantic constraints.

    In a word’s description, lexical information is interrelated across the four description levels. Syntactic and semantic information, in particular, are linked to each other through the projection of the predicate-argument structure onto its syntactic realization(s).
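As a rough illustration of how a semantic unit's predicate-argument structure might be projected onto a syntactic frame, the sketch below uses Python dataclasses. It is not the lexicon's actual encoding: all class names, field names and the sample values for "mangiare" are hypothetical and chosen only to mirror the properties named above (optionality, syntactic function, syntagmatic realization, thematic roles and semantic constraints).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SyntacticSlot:
    function: str            # syntactic function, e.g. "subject"
    realization: str         # syntagmatic realization, e.g. "NP"
    optional: bool = False   # optionality of the slot


@dataclass
class SyntacticUnit:
    unit_id: str
    slots: list[SyntacticSlot] = field(default_factory=list)


@dataclass
class SemanticArgument:
    role: str                # thematic role of the participant
    constraint: str          # semantic constraint on the participant


@dataclass
class SemanticUnit:
    unit_id: str
    semantic_type: str       # ontological type (hypothetical label)
    arguments: list[SemanticArgument] = field(default_factory=list)


def project(sem: SemanticUnit, syn: SyntacticUnit,
            arg_to_slot: dict[int, int]) -> list[tuple[str, str]]:
    """Pair each semantic argument's role with the function of its slot."""
    return [
        (sem.arguments[arg].role, syn.slots[slot].function)
        for arg, slot in sorted(arg_to_slot.items())
    ]


# Hypothetical sample data, not taken from the lexicon itself.
syn_unit = SyntacticUnit("SYNU_mangiare_1", [
    SyntacticSlot("subject", "NP"),
    SyntacticSlot("object", "NP", optional=True),
])
sem_unit = SemanticUnit("USEM_mangiare_1", "Relational_act", [
    SemanticArgument("Agent", "Animate"),
    SemanticArgument("Patient", "Food"),
])

links = project(sem_unit, syn_unit, {0: 0, 1: 1})
print(links)
```

Keeping the argument-to-slot mapping separate from both units means that one semantic unit could, in principle, be linked to several syntactic realizations without duplicating either description.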

    References :
    Ruimy N., Corazzari O., Gola E., Spanu A., Calzolari N., Zampolli A. 2003. The PAROLE model and the Italian Syntactic lexicon. In A. Zampolli, N. Calzolari, L. Cignoni, (eds.), Computational Linguistics in Pisa - Linguistica Computazionale a Pisa. Linguistica Computazionale, Special Issue, XVIII-XIX, (2003). Pisa-Roma, IEPI. Tomo II, 793-820.

    Lenci A., Busa F., Ruimy N., Gola E., Monachini M., Calzolari N., Zampolli A. et al., 2000. SIMPLE Linguistic Specifications, SIMPLE LE4-8346 EC Project, Deliverable D2.1 & D2.2, WP02, Final version, March 2000, ILC and University of Pisa, 404 pp. (http://www.ub.es/gilcub/SIMPLE/simple.html#Specifications).

    Ruimy N., Monachini M., Gola E., Calzolari N., Del Fiorentino M.C., Ulivieri M., Rossi S. 2003. A computational semantic lexicon of Italian: SIMPLE. In A. Zampolli, N. Calzolari, L. Cignoni, (eds.), Computational Linguistics in Pisa - Linguistica Computazionale a Pisa. Linguistica Computazionale, Special Issue, XVIII-XIX, (2003). Pisa-Roma, IEPI. Tomo II, 821-864.

    Ruimy N., Monachini M., Distante R., Guazzini E., Molino S., Ulivieri M., Calzolari N., Zampolli A. 2002. CLIPS, A Multi-level Italian Computational Lexicon: a Glimpse to Data. LREC 2002: Third LREC. Las Palmas de Gran Canaria, Spain 29th, 30th & 31 May 2002. Proceedings, Volume III, Paris, The European Languages Resources Association (ELRA). 792-799.
    Identification
    Period of coverage :
    Version :
    Version history : Update frequency: monthly; last update: May 2005
    Production
    Project : PAROLE-SIMPLE-CLIPS
    Creation date : 1996-2003
    Technical Information
    Platform : PC
    Fileformat : Plain text
    Contents : written lexicon
    Members Prices
    Academic - Commercial 12000.00 EUR
    Academic - Research 1500.00 EUR
    Commercial - Commercial 12000.00 EUR
    Commercial - Research 12000.00 EUR
    Non Member Prices
    Academic - Commercial 15600.00 EUR
    Academic - Research 2000.00 EUR
    Commercial - Commercial 15600.00 EUR
    Commercial - Research 15600.00 EUR

    Copyright © 2008 ELRA