ELRA
    Catalog Reference : W0030
    Al-Hayat Arabic Corpus
    The corpus was developed in the course of a research project at the University of Essex, in collaboration with the Open University.
    The corpus contains Al-Hayat newspaper articles, with added value for developing Language Engineering and Information Retrieval applications.
    The data are distributed across 7 subject-specific databases that follow the Al-Hayat subject tags: General, Car, Computer, News, Economics, Science, and Sport.
    Mark-up, numbers, special characters and punctuation have been removed. The total file size is 268 MB. The dataset contains 18,639,264 distinct tokens in 42,591 articles, organised in 7 domains.
    Technical Information
    Distribution medium : CD-ROM
    Contents : written corpus
    Members Prices
    Academic - Commercial 960.00 EUR
    Academic - Research 480.00 EUR
    Commercial - Commercial 960.00 EUR
    Commercial - Research 960.00 EUR
    Non Member Prices
    Academic - Commercial 1440.00 EUR
    Academic - Research 720.00 EUR
    Commercial - Commercial 1440.00 EUR
    Commercial - Research 1440.00 EUR

    Copyright © 2008 ELRA