ELRA
Broadcast Resources
    Displaying 1 to 15 (of 15 products)

    ELRA-E0021
    ESTER Evaluation Package (Available since 28/06/2007)


    The ESTER Evaluation Package was produced within the French national project ESTER (Evaluation of Broadcast News enriched transcription systems), as part of the Technolangue programme funded by the French Ministry of Research and New Technologies (MRNT). The ESTER project made it possible to carry out a campaign for the evaluation of Broadcast News enriched transcription systems for French.
    This package includes the material that was used for the ESTER evaluation campaign: resources, protocols, scoring tools, results of the campaign, etc., that were used or produced during the campaign. The aim of this evaluation package is to enable external players to evaluate their own system and compare their results with those obtained during the campaign itself.
    The campaign is divided into three tasks: orthographic transcription, segmentation and information extraction (named entity tracking).
    For research or commercial use, please refer to ELRA-S0241 ESTER Corpus.
    Language(s) : French
    ISLRN : 110-079-844-983-7

    Members Academic org. Commercial org.
    Evaluation Use 300.00 EUR 1000.00 EUR
    * 1,700 hours of non-transcribed radio broadcast news recordings are available on hard disk at an extra cost of 100.00 EUR.

    Non-members Academic org. Commercial org.
    Evaluation Use 2000.00 EUR 6500.00 EUR
    * 1,700 hours of non-transcribed radio broadcast news recordings are available on hard disk at an extra cost of 100.00 EUR.


    ELRA-E0044
    REPERE Evaluation Package (Available since 25/02/2015)


    The REPERE Evaluation Package contains the visual annotation of 60 hours of French TV news shows, for the purpose of person recognition within TV programs. The annotation covers both persons and written information appearing on screen. The data provided consists of:
    - video files with indexes and with manual transcriptions in XGTF format (Viper),
    - audio files compressed in WAV format with transcriptions in TRS format (Transcriber).
    Language(s) : French
    ISLRN : 360-758-359-485-0

    Members Academic org. Commercial org.
    Research Use 300.00 EUR 5000.00 EUR
    Commercial Use 20000.00 EUR 20000.00 EUR
    Evaluation Use 1000.00 EUR

    Non-members Academic org. Commercial org.
    Research Use 2000.00 EUR 7500.00 EUR
    Commercial Use 25000.00 EUR 25000.00 EUR
    Evaluation Use 6500.00 EUR


    ELRA-E0046
    ETAPE Evaluation Package (Available since 23/02/2017)


    The ETAPE Evaluation Package consists of ca. 30 hours of radio and TV data, selected to include mostly unplanned speech and a reasonable proportion of multi-speaker data. All data were carefully transcribed, including named entity annotation.
    This package includes the material that was used for the ETAPE evaluation campaign: resources, scoring tools, results of the campaign, etc., that were used or produced during the campaign. The aim of this evaluation package is to enable external players to evaluate their own system and compare their results with those obtained during the campaign itself.
    Language(s) : French
    ISLRN : 425-777-374-455-4

    Members Academic org. Commercial org.
    Research Use 300.00 EUR 5000.00 EUR
    Commercial Use 20000.00 EUR 20000.00 EUR
    Evaluation Use 1000.00 EUR

    Non-members Academic org. Commercial org.
    Research Use 2000.00 EUR 7500.00 EUR
    Commercial Use 25000.00 EUR 25000.00 EUR
    Evaluation Use 6500.00 EUR


    ELRA-S0093
    IBNC - An Italian Broadcast News Corpus (Available since 15/12/2000)


    Produced with funding from ELRA in the framework of the European Commission project LRsP&P (Language Resources Production & Packaging - LE4-8335), the collection consists of 150 broadcast programs from the RAI, for a total time of about 30 hours, broadcast on 36 different days between 1992 and 1999. The recordings are down-sampled to 16 kHz, 16 bit, and encoded in the NIST Sphere PCM format.
    Language(s) : Italian
    ISLRN : 133-155-327-792-1

    Members Academic org. Commercial org.
    Research Use 5000.00 EUR 15000.00 EUR

    Non-members Academic org. Commercial org.
    Research Use 8000.00 EUR 25000.00 EUR


    ELRA-S0157
    NetDC Arabic BNSC (Broadcast News Speech Corpus) (Available since 08/02/2007)


    The NetDC Arabic BNSC (Broadcast News Speech Corpus) is a corpus developed by ELDA in the framework of the European-funded project Network of Data Centres (NetDC). The project was carried out in collaboration with the LDC (Linguistic Data Consortium), which has produced a similar corpus from the news broadcast by Voice of America Arabic in the United States. The database contains ca. 22.5 hours of broadcast news speech recorded from Radio Orient (France) during a 3-month period.
    Language(s) : Arabic
    ISLRN : 663-177-513-755-1

    Members Academic org. Commercial org.
    Research Use 100.00 EUR 1350.00 EUR
    Commercial Use 1350.00 EUR 1350.00 EUR

    Non-members Academic org. Commercial org.
    Research Use 200.00 EUR 2700.00 EUR
    Commercial Use 2700.00 EUR 2700.00 EUR


    ELRA-S0172
    C-ORAL-ROM - Integrated reference corpora for spoken romance languages. Multi-media edition; tools of analysis; standard linguistic measurements for validation in HLT (C-ORAL-ROM) (Available since 23/12/2004)


    C-ORAL-ROM is a multilingual corpus which consists of four comparable collections of recorded spontaneous speech sessions in French, Italian, Portuguese, and Spanish. It contains around 1,200,000 words (around 300,000 words per language) and provides the acoustic source of each session together with the orthographic transcription, session metadata, and text-to-speech synchronization, in Win Pitch Corpus format. The multimedia corpus comes with the speech software Win Pitch Corpus.
    Language(s) : Italian - French - Spanish, Castilian - Portuguese
    ISLRN : 318-977-046-077-4

    Members Academic org. Commercial org.
    Research Use 1500.00 EUR 10000.00 EUR
    Commercial Use 10000.00 EUR 10000.00 EUR

    Non-members Academic org. Commercial org.
    Research Use 3000.00 EUR 20000.00 EUR
    Commercial Use 20000.00 EUR 20000.00 EUR


    ELRA-S0219
    NEMLAR Broadcast News Speech Corpus (Available since 11/08/2006)


    The Nemlar Broadcast News Speech Corpus consists of about 40 hours of Standard Arabic news broadcasts. The broadcasts were recorded from four different radio stations: Medi1, Radio Orient, RMC – Radio Monte Carlo, RTM – Radio Television Maroc. All files were recorded in linear PCM format, 16 kHz, 16 bit.
    Language(s) : Arabic
    ISLRN : 479-507-036-103-9

    Members Academic org. Commercial org.
    Research Use 150.00 EUR 500.00 EUR
    Commercial Use 2000.00 EUR 2000.00 EUR

    Non-members Academic org. Commercial org.
    Research Use 300.00 EUR 1000.00 EUR
    Commercial Use 4000.00 EUR 4000.00 EUR
    Special prices available.


    ELRA-S0241
    ESTER Corpus (Available since 28/06/2007)


    The ESTER Corpus is a subset of the ESTER Evaluation Package (catalogue ref. ELRA-E0021), which was produced within the French national project ESTER (Evaluation of Broadcast News enriched transcription systems), as part of the Technolangue programme funded by the French Ministry of Research and New Technologies (MRNT). The ESTER project made it possible to carry out a campaign for the evaluation of Broadcast News enriched transcription systems for French.
    This corpus includes the material that was used for the ESTER evaluation campaign, excluding the textual data (available in this catalogue under the references ELRA-W0015 and ELRA-W0023).
    Language(s) : French
    ISLRN : 055-636-352-982-9

    Members Academic org. Commercial org.
    Research Use 300.00 EUR 5000.00 EUR
    Commercial Use 20000.00 EUR 20000.00 EUR
    Evaluation Use 1000.00 EUR
    * 1,700 hours of non-transcribed radio broadcast news recordings are available on hard disk at an extra cost of 100.00 EUR.

    Non-members Academic org. Commercial org.
    Research Use 2000.00 EUR 7500.00 EUR
    Commercial Use 25000.00 EUR 25000.00 EUR
    Evaluation Use 6500.00 EUR
    * 1,700 hours of non-transcribed radio broadcast news recordings are available on hard disk at an extra cost of 100.00 EUR.


    ELRA-S0275
    Slovenian BNSI Broadcast News Speech Corpus (Available since 22/04/2008)


    This speech database consists of TV news shows (both the evening news, “TV Dnevnik”, and the late night news, “Odmevi”) from the archive of the Slovenian national broadcaster RTV Slovenia. The recordings took place between June 1999 and May 2003. The database comprises a total of 36 hours of recordings, transcribed and manually checked using the Transcriber tool. 1,565 speakers were recorded (1,069 males, 477 females, 19 unspecified).
    Language(s) : Slovenian
    ISLRN : 502-280-144-938-4

    Members Academic org. Commercial org.
    Research Use 6000.00 EUR 19000.00 EUR
    Commercial Use 19000.00 EUR 19000.00 EUR

    Non-members Academic org. Commercial org.
    Research Use 10000.00 EUR 33000.00 EUR
    Commercial Use 33000.00 EUR 33000.00 EUR


    ELRA-S0305
    EPAC Corpus: orthographic transcriptions (Available since 29/03/2010)


    This corpus consists of approx. 100 hours of manual orthographic transcriptions, which were produced from 1,677 hours of non-transcribed recordings from the ESTER Evaluation Campaign (Technolangue programme). The corpus also includes automatic transcriptions of the full 1,677 hours.
    Language(s) : French
    ISLRN : 483-703-007-740-8

    Members Academic org. Commercial org.
    Research Use 300.00 EUR 5000.00 EUR
    Commercial Use 20000.00 EUR 20000.00 EUR

    Non-members Academic org. Commercial org.
    Research Use 2000.00 EUR 7500.00 EUR
    Commercial Use 25000.00 EUR 25000.00 EUR


    ELRA-S0338
    ESTER 2 Corpus (Available since 29/03/2012)


    The ESTER 2 Corpus, produced within the ESTER 2 evaluation campaign, consists of a manually transcribed radio broadcast news corpus amounting to about 100 hours and quick transcriptions of African radio broadcasts amounting to about 6 hours. An annotation of named entities is provided within the development data (about 6 hours).
    Language(s) : French
    ISLRN : 123-207-221-143-8

    Members Academic org. Commercial org.
    Research Use 300.00 EUR 5000.00 EUR
    Commercial Use 20000.00 EUR 20000.00 EUR

    Non-members Academic org. Commercial org.
    Research Use 2000.00 EUR 7500.00 EUR
    Commercial Use 25000.00 EUR 25000.00 EUR


    ELRA-S0349
    Quaero Broadcast News Extended Named Entity corpus (Available since 13/02/2013)


    This corpus consists of the manual annotation of (i) the ESTER 2 (see also ELRA-S0338) manual transcription corpus and (ii) the Quaero Speech Recognition Evaluation corpus (manual and automatic transcriptions coming from 3 different ASR systems). The corpus is fully manually annotated according to the Quaero extended and structured named entity definition.
    Language(s) : French
    ISLRN : 074-668-446-920-0

    Members Academic org. Commercial org.
    Research Use 0.00 EUR 3000.00 EUR
    Commercial Use 3000.00 EUR 3000.00 EUR

    Non-members Academic org. Commercial org.
    Research Use 0.00 EUR 5000.00 EUR
    Commercial Use 5000.00 EUR 5000.00 EUR


    ELRA-S0374
    FoxPersonTracks: a Benchmark for Person Re-Identification from TV Broadcast Shows (Available since 06/04/2016)


    FoxPersonTracks is a person track dataset dedicated to person re-identification. The dataset is built from a set of real-life TV shows broadcast on the French channels BFMTV and LCP TV, provided during the REPERE challenge. It contains a total of 4,604 persontracks (short video sequences featuring an individual with no background) from 266 persons. The dataset also provides re-identification results using space-time histograms as a baseline, together with an evaluation tool to ease comparison with other re-identification methods.
    Language(s) : French
    ISLRN : 168-132-570-218-1

    Members Academic org. Commercial org.
    Research Use 0.00 EUR 2000.00 EUR
    Commercial Use 2000.00 EUR 2000.00 EUR

    Non-members Academic org. Commercial org.
    Research Use 0.00 EUR 2500.00 EUR
    Commercial Use 2500.00 EUR 2500.00 EUR


    ELRA-S0381
    TRAD Pashto Broadcast News Speech Corpus (Available since 06/04/2016)


    This corpus contains 108 hours of transcribed broadcast news recordings, covering more than 1,000 speakers. Transcriptions are provided together with the audio files and include about 46,000 segments and 1.1M words.
    Language(s) : Pushto
    ISLRN : 918-508-885-913-7

    Members Academic org. Commercial org.
    Research Use 2000.00 EUR 20000.00 EUR
    Commercial Use 20000.00 EUR 20000.00 EUR

    Non-members Academic org. Commercial org.
    Research Use 3500.00 EUR 28000.00 EUR
    Commercial Use 28000.00 EUR 28000.00 EUR


    ELRA-S0391
    The FAME! Speech Corpus (Available since 07/04/2017)


    This Frisian corpus consists of 203 audio segments, each approximately 5 minutes long, extracted from various radio programs covering a time span of almost 50 years (1966-2015), which adds a longitudinal dimension to the database. The content of the recordings is very diverse, including radio programs about culture, history, literature, sports, nature, agriculture, politics, society and languages. There are 309 identified speakers in the FAME! Speech Corpus, 21 of whom appear at least 3 times in the database. The total duration of the manually annotated radio broadcasts is 18 hours, 33 minutes and 57 seconds.
    Language(s) : Frisian
    ISLRN : 340-994-352-616-4

    Members Academic org. Commercial org.
    Research Use 0.00 EUR 1500.00 EUR
    Commercial Use 1500.00 EUR 1500.00 EUR

    Non-members Academic org. Commercial org.
    Research Use 0.00 EUR 3500.00 EUR
    Commercial Use 3500.00 EUR 3500.00 EUR



    Copyright © 2008 ELRA