ELRA
  • Catalog Reference : ELRA-B0009
    TC-STAR English Training Corpora for ASR: Transcriptions of EPPS Speech
    TC-STAR is a European integrated project focusing on all core technologies for Speech-to-Speech Translation (SST): Automatic Speech Recognition (ASR), Spoken Language Translation (SLT), and Text to Speech Synthesis (TTS).

    This corpus consists of transcriptions of 92 hours of EPPS (European Parliament Plenary Sessions) speeches held or interpreted in European English (a mixture of native and non-native English). The recordings (not included in the present package) were obtained from Europe by Satellite (https://ec.europa.eu/avservices/ebs/schedule.cfm) between May 2004 and May 2006. The corpus comprises 63 transcription files, stored in Transcriber XML format.
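A minimal sketch of how a Transcriber XML file might be read, for readers planning to process the transcriptions. The element and attribute names used here (`Turn`, `speaker`, `startTime`, `endTime`) are assumptions based on the general Transcriber format and are not confirmed by this catalogue entry; the function name `iter_turns` is hypothetical.

```python
import xml.etree.ElementTree as ET


def iter_turns(source):
    """Yield one dict per <Turn> element found in a Transcriber-style file.

    `source` is a file path or a file object. The tag and attribute names
    are assumed, not taken from the TC-STAR specifications.
    """
    root = ET.parse(source).getroot()
    for turn in root.iter("Turn"):
        # Text sits between child elements such as <Sync/>, so collect
        # all text fragments and normalise the whitespace.
        text = " ".join("".join(turn.itertext()).split())
        yield {
            "speaker": turn.get("speaker"),
            "start": float(turn.get("startTime", "0")),
            "end": float(turn.get("endTime", "0")),
            "text": text,
        }
```

Calling `list(iter_turns(path))` on one transcription file would give the speaker turns with their time stamps, which is usually the starting point for aligning text with audio.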

    The speech databases made within the TC-STAR project were validated by SPEX, in the Netherlands, to assess their compliance with the TC-STAR format and content specifications.

    For corresponding recordings, see ELRA-S0251.

    ISLRN : 521-254-874-619-5
    Production
    Project : TC-STAR
    Applications
    Existing applications : Automatic speech recognition
    Technical Information
    Distribution medium : Downloadable
    Contents : speech corpus
    Resource files
  • Validation report
  • TC-STAR English Training Corpora for ASR: Recordings of EPPS Speech
    TC-STAR is a European integrated project focusing on all core technologies for Speech-to-Speech Translation (SST): Automatic Speech Recognition (ASR), Spoken Language Translation (SLT), and Text to Speech Synthesis (TTS).

    This corpus consists of around 290 hours of recordings of EPPS (European Parliament Plenary Sessions) speeches held or interpreted in European English (a mixture of native and non-native English). Of these, 92 hours were transcribed; the transcriptions are not included in the present package. The recordings were obtained from Europe by Satellite (https://ec.europa.eu/avservices/ebs/schedule.cfm) between May 2004 and May 2006.

    The speech signals were supplied by EbS over the internet in Real Media format and via satellite in MPEG-1 Layer 2 format. The signals were decoded, resampled, and stored as WAVE files (RIFF, Resource Interchange File Format). Each file contains a single channel with 16-bit resolution at a sample rate of 16 kHz.
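The audio format stated above (single channel, 16-bit, 16 kHz) can be checked on a file with Python's standard `wave` module. This is only a sketch: the function names and the `EXPECTED` dictionary are hypothetical helpers, not part of the corpus distribution.

```python
import wave

# Format described in the catalogue entry: mono, 16-bit samples, 16 kHz.
EXPECTED = {"channels": 1, "sample_width_bytes": 2, "sample_rate_hz": 16000}


def read_wav_format(path):
    """Return the channel count, sample width and sample rate of a WAVE file."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
        }


def matches_expected_format(path):
    """True if the file has the mono / 16-bit / 16 kHz layout."""
    return read_wav_format(path) == EXPECTED
```

Running `matches_expected_format` over every file after download is a cheap way to catch truncated or wrongly converted audio before training.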

    The speech databases made within the TC-STAR project were validated by SPEX, in the Netherlands, to assess their compliance with the TC-STAR format and content specifications.

    For corresponding transcriptions, see ELRA-S0249.

    ISLRN : 428-162-628-204-7
    Production
    Project : TC-STAR
    Applications
    Existing applications : Automatic speech recognition
    Technical Information
    Distribution medium : Downloadable
    Contents : speech corpus
    Resource files
  • Validation report
  • Member Prices
    Academic - Commercial 6850.00 EUR
    Academic - Research 4800.00 EUR
    Commercial - Commercial 6850.00 EUR
    Commercial - Research 6850.00 EUR
    Non-Member Prices
    Academic - Commercial 9000.00 EUR
    Academic - Research 6270.00 EUR
    Commercial - Commercial 9000.00 EUR
    Commercial - Research 9000.00 EUR

    Resources available separately
    The component resources can also be obtained on their own:
    ELRA-S0249 (transcriptions)
    ELRA-S0251 (recordings)

    Copyright © 2008 ELRA