ELRA
    Catalog Reference: ELRA-W0051
    English-Persian parallel Corpus
    The corpus contains about 3,500,000 English and Persian (Farsi) words aligned at sentence level (about 100,000 sentences, distributed over 50,021 entries). The files are encoded in Unicode. The corpus was originally created with SQL Server but is distributed as a Microsoft Access file. The texts cover a variety of text types, which are distributed as follows:
    - Art: 1804 entries (3.61%)
    - Culture: 5097 entries (10.19%)
    - Idiom: 435 entries (0.87%)
    - Law: 2266 entries (4.53%)
    - Literature: 11470 entries (22.93%)
    - Medicine: 1089 entries (2.18%)
    - Others: 16989 entries (33.96%)
    - Poetry: 692 entries (1.38%)
    - Politics: 5493 entries (10.98%)
    - Proverb: 292 entries (0.58%)
    - Religion: 686 entries (1.37%)
    - Science: 3708 entries (7.41%)
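
The percentages above can be re-derived from the entry counts. The following is a minimal sketch, not part of the catalogue itself: the names `ENTRY_COUNTS` and `category_shares` are hypothetical, and the counts are simply copied from the list above. It sums the entries and computes each text type's share of the total.

```python
# Entry counts per text type, copied from the catalogue listing.
# The dictionary name and the helper below are illustrative assumptions.
ENTRY_COUNTS = {
    "Art": 1804,
    "Culture": 5097,
    "Idiom": 435,
    "Law": 2266,
    "Literature": 11470,
    "Medicine": 1089,
    "Others": 16989,
    "Poetry": 692,
    "Politics": 5493,
    "Proverb": 292,
    "Religion": 686,
    "Science": 3708,
}


def category_shares(counts):
    """Return (total entries, {text type: share in percent, 2 decimals})."""
    total = sum(counts.values())
    shares = {name: round(100 * n / total, 2) for name, n in counts.items()}
    return total, shares


total, shares = category_shares(ENTRY_COUNTS)
print(f"Total entries: {total}")
for name, pct in shares.items():
    print(f"{name}: {ENTRY_COUNTS[name]} entries ({pct:.2f}%)")
```

Because each share is rounded independently, the listed percentages need not add up to exactly 100.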
    Technical Information
    Distribution medium: CD-ROM
    Contents: written corpus
    Member Prices
    Academic - Commercial 2500.00 EUR
    Academic - Research 500.00 EUR
    Commercial - Commercial 2500.00 EUR
    Commercial - Research 2500.00 EUR
    Non-Member Prices
    Academic - Commercial 3000.00 EUR
    Academic - Research 600.00 EUR
    Commercial - Commercial 3000.00 EUR
    Commercial - Research 3000.00 EUR

    Copyright © 2008 ELRA