  • Catalog Reference : ELRA-S0250
    TC-STAR English-Spanish Training Corpora for Machine Translation: Aligned Final Text Editions of EPPS
    TC-STAR is a European integrated project focusing on all core technologies for Speech-to-Speech Translation (SST): Automatic Speech Recognition (ASR), Spoken Language Translation (SLT), and Text to Speech Synthesis (TTS).

    This corpus consists of 34 million (English) and 38 million (Spanish) running words of bilingual, sentence-segmented and aligned text in English and Spanish. The texts were obtained from the Final Text Editions provided by the European Parliament (http://www.europarl.europa.eu) and cover April 1996 to Sept. 2004, Dec. 2004 to May 2005, and Dec. 2005 to May 2006. The data is accompanied by tools for further preprocessing.

    ISLRN : 219-619-756-916-1
    Project : TC-STAR
    Technical Information
    Distribution medium : Downloadable
    Prices (EUR)
    Licence                    Members    Non-members
    Academic - Commercial      4250.00    5600.00
    Academic - Research        3000.00    3925.00
    Commercial - Commercial    4250.00    5600.00
    Commercial - Research      4250.00    5600.00

    Copyright © 2008 ELRA