Catalog Reference: W0030
Al-Hayat Arabic Corpus
The corpus was developed in the course of a research project at the University of Essex, in collaboration with the Open University.
The corpus contains Al-Hayat newspaper articles, enriched to support the development of Language Engineering and Information Retrieval applications.
The data are distributed across 7 subject-specific databases that follow the Al-Hayat subject tags: General, Car, Computer, News, Economics, Science, and Sport.
Mark-up, numbers, special characters and punctuation have been removed. The total file size is 268 MB. The dataset contains 18,639,264 distinct tokens in 42,591 articles, organised in 7 domains.
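As an illustration only, the kind of clean-up described above (removing mark-up, numbers, special characters and punctuation) could be sketched in Python as follows. The function name `clean_text`, the tag pattern and the decision to keep combining marks such as Arabic diacritics are assumptions made for this sketch; the procedure actually used to prepare the corpus is not described in this entry.

```python
import re
import unicodedata

# Assumption: mark-up takes the form of angle-bracket tags such as <p> or <DOC>.
TAG_RE = re.compile(r"<[^>]+>")


def clean_text(raw: str) -> str:
    """Drop tags, digits, punctuation and symbols; collapse whitespace.

    Hypothetical helper, not the corpus developers' own tool.
    """
    no_tags = TAG_RE.sub(" ", raw)
    kept = []
    for ch in no_tags:
        # Unicode general categories: L* = letters, M* = combining marks.
        # Everything else (N* digits, P* punctuation, S* symbols, ...) is
        # replaced by a space so that neighbouring words stay separated.
        if unicodedata.category(ch)[0] in ("L", "M"):
            kept.append(ch)
        else:
            kept.append(" ")
    # split() with no argument collapses runs of whitespace.
    return " ".join("".join(kept).split())


if __name__ == "__main__":
    print(clean_text("<p>السعر ١٢٣ دولار.</p>"))
```

Working from Unicode categories rather than an ASCII punctuation list means that Arabic punctuation (for example the Arabic comma) and Arabic-Indic digits are removed along with their Latin counterparts.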
Technical Information
Distribution medium: CD-ROM
Contents: written corpus
Number of languages: 1
Language(s): Arabic
Members Prices
Academic - Commercial 960.00 EUR
Academic - Research 480.00 EUR
Commercial - Commercial 960.00 EUR
Commercial - Research 960.00 EUR
Non Member Prices
Academic - Commercial 1440.00 EUR
Academic - Research 720.00 EUR
Commercial - Commercial 1440.00 EUR
Commercial - Research 1440.00 EUR
Copyright © 2008 ELRA