ELRA
Products meeting the search criteria
    Displaying 1 to 20 (of 57 products)

    ELRA-S0228-01
    Mandarin Chinese Speech Synthesis Corpus (Basic Corpus) (Available since 17/01/2007.)


    This corpus contains the recordings of 1 native Chinese speaker (female).
    The corpus is composed of 20 texts with 109,227 words and has been proofread manually. The corpus contents include: phrases, digit strings, letter strings, uncommon words, neutral tone, final retroflexion, Latin alphabet, interrogative sentences, 282 English words.
    The speaker has been recorded in a professional recording studio over 2 channels: microphone and glottis wave (fundamental frequency) signals for a total of 18.2 hours.
    Speech samples are stored as sequences of 16-bit 44.1 kHz PCM on two channels. The total data size is 5.67 Gb across 12,679 files. The data is encoded in GB-2312 format.
    The transcriptions include labels for four-class pause boundaries.
    This database is intended for use in text-to-speech and speech synthesis applications.
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 13932.00 EUR 13932.00 EUR
    Commercial Use 13932.00 EUR 13932.00 EUR

    Non-Members Academic org. Commercial org.
    Research Use 13932.00 EUR 13932.00 EUR
    Commercial Use 13932.00 EUR 13932.00 EUR
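
As a rough illustration of the GB-2312 note in the ELRA-S0228-01 description, the sketch below reads a GB-2312 text file into a Python string. The function name `read_transcription` is hypothetical, and the catalogue does not describe the corpus file layout, so any path passed to it is an assumption.

```python
from pathlib import Path


def read_transcription(path, encoding="gb2312"):
    """Read a text file and decode it from GB-2312 (Python codec name: gb2312)."""
    # Hypothetical helper: the actual file names in the corpus are not
    # documented in the catalogue entry.
    return Path(path).read_text(encoding=encoding)
```

Decoding explicitly avoids relying on the platform default encoding, which on most systems is not GB-2312.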


    ELRA-S0228-02
    Mandarin Chinese Speech Synthesis Corpus (Available since 17/01/2007.)


    This corpus contains the recordings of 1 native Chinese speaker (female).
    This corpus complements the Basic Corpus (ELRA-S0228/01) and aims to cover a variety of speech context data, which does not include syllables.
    The corpus is composed of 28 texts with 75,841 words and has been proofread manually. The corpus contents include: text of statements, digit strings, uncommon words, letter strings, measurement units, neutral tone, final retroflexion, Latin alphabet, interrogative sentences, English words and room-ordering stimulation.
    The speaker has been recorded in a professional recording studio over 2 channels: microphone and glottis wave (fundamental frequency) signals for a total of 30.2 hours.
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 5971.00 EUR 5971.00 EUR
    Commercial Use 5971.00 EUR 5971.00 EUR

    Non-Members Academic org. Commercial org.
    Research Use 5971.00 EUR 5971.00 EUR
    Commercial Use 5971.00 EUR 5971.00 EUR


    ELRA-S0228-03
    Mandarin Chinese Speech Synthesis Corpus (Integrated Corpus) (Available since 17/01/2007.)


    The Mandarin Chinese Speech Synthesis Integrated Corpus includes both Basic and Accessory Corpora (see ELRA-S0228/01 and ELRA-S0228/02).
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 19903.00 EUR 19903.00 EUR
    Commercial Use 19903.00 EUR 19903.00 EUR

    Non-Members Academic org. Commercial org.
    Research Use 19903.00 EUR 19903.00 EUR
    Commercial Use 19903.00 EUR 19903.00 EUR


    ELRA-S0228-04
    Mandarin Chinese Telephone Speech Recognition Corpus - Person Name, Place Name (Mobile telephone 265) (Available since 17/01/2007.)


    This corpus comprises 6,952 entries uttered by 265 speakers of different dialects, ages and various educational levels (134 males and 131 females), recorded over the mobile telephone network. The database comprises 13,942 Chinese personal names and place names. Speech samples are stored as a sequence of 16-bit 8kHz WAV for a total of 17.6 hours of speech. The total capacity of the data is 964 Mb.
    Each speaker read 15-30 items. Text files are stored in Unicode format. All data have been proofread manually.
    The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
    The corpus is intended for testing and for telephone natural speech recognition systems.
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 3383.00 EUR 3383.00 EUR
    Commercial Use 3383.00 EUR 3383.00 EUR

    Non-Members Academic org. Commercial org.
    Research Use 3383.00 EUR 3383.00 EUR
    Commercial Use 3383.00 EUR 3383.00 EUR
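
The figures quoted for ELRA-S0228-04 (16-bit, 8 kHz, 17.6 hours, 964 Mb) can be sanity-checked with simple arithmetic. The sketch below assumes uncompressed single-channel PCM and ignores WAV headers; the catalogue does not state the channel count, and the function name `pcm_size_bytes` is illustrative rather than part of any ELRA tooling.

```python
def pcm_size_bytes(hours, sample_rate_hz, bits_per_sample=16, channels=1):
    """Estimate the size of uncompressed PCM audio, ignoring file headers."""
    seconds = hours * 3600
    bytes_per_sample = bits_per_sample // 8
    return round(seconds * sample_rate_hz * bytes_per_sample * channels)


MIB = 1024 ** 2  # assumption: the catalogue's "Mb" is read here as mebibytes

# 17.6 hours of 16-bit, 8 kHz audio, assumed mono
size = pcm_size_bytes(hours=17.6, sample_rate_hz=8000)
print(f"{size / MIB:.1f} MiB")
```

Under these assumptions the estimate comes to roughly 967 MiB, which is close to the 964 Mb listed; the small difference would be consistent with rounding of the quoted duration.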


    ELRA-S0228-05
    Mandarin Chinese Telephone Speech Recognition Corpus - Person Name, Place Name (Available since 17/01/2007.)


    This corpus comprises 7,298 entries uttered by 285 speakers of different dialects, ages and various educational levels (144 males and 141 females), recorded over the fixed telephone network. The database comprises 14,492 Chinese personal names and place names. Speech samples are stored as a sequence of 16-bit 8kHz WAV for a total of 17.6 hours of speech. The total capacity of the data is 968 Mb.
    Each speaker read 15-30 items. Text files are stored in Unicode format. All data have been proofread manually.
    The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
    The corpus is intended for testing and for telephone natural speech recognition systems.
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 2786.00 EUR 2786.00 EUR
    Commercial Use 2786.00 EUR 2786.00 EUR

    Non-Members Academic org. Commercial org.
    Research Use 2786.00 EUR 2786.00 EUR
    Commercial Use 2786.00 EUR 2786.00 EUR


    ELRA-S0228-06
    Mandarin Chinese Telephone Speech Recognition Corpus - Digit String (Available since 17/01/2007.)


    This corpus comprises 5,309 entries uttered by 265 speakers of different dialects, ages and various educational levels (134 males and 131 females), recorded over the fixed telephone network. The database comprises 7,606 Chinese digit strings. Speech samples are stored as a sequence of 16-bit 8kHz WAV for a total of 11.8 hours of speech. The total capacity of the data is 648 Mb.
    Each speaker read 25-30 items. Text files are stored in Unicode format. All data have been proofread manually.
    The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
    The corpus is intended for testing and for telephone natural speech recognition systems.
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 1791.00 EUR 1791.00 EUR
    Commercial Use 1791.00 EUR 1791.00 EUR

    Non-Members Academic org. Commercial org.
    Research Use 1791.00 EUR 1791.00 EUR
    Commercial Use 1791.00 EUR 1791.00 EUR


    ELRA-S0228-07
    Mandarin Chinese Telephone Speech Recognition Corpus - Digit String (Available since 17/01/2007.)


    This corpus comprises 6,140 entries uttered by 265 speakers of different dialects, ages and various educational levels (144 males and 141 females), recorded over the mobile telephone network. The database comprises 8,109 Chinese digit strings. Speech samples are stored as a sequence of 16-bit 8kHz WAV for a total of 11.8 hours of speech. The total capacity of the data is 669 Mb.
    Each speaker read 25-30 items. Text files are stored in Unicode format. All data have been proofread manually.
    The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
    The corpus is intended for testing and for telephone natural speech recognition systems.
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 1592.00 EUR 1592.00 EUR
    Commercial Use 1592.00 EUR 1592.00 EUR

    Non-Members Academic org. Commercial org.
    Research Use 1592.00 EUR 1592.00 EUR
    Commercial Use 1592.00 EUR 1592.00 EUR


    ELRA-S0228-08
    Mandarin Chinese Telephone Speech Recognition Corpus - Stock (Available since 17/01/2007.)


    This corpus comprises 3,085 entries uttered by 265 speakers of different dialects, ages and various educational levels (134 males and 131 females), recorded over the mobile telephone network. The database comprises 6,972 Chinese stocks. Speech samples are stored as a sequence of 16-bit 8kHz WAV for a total of 7 hours of speech. The total capacity of the data is 387 Mb.
    Each speaker read 15-30 items. Text files are stored in Unicode format. All data have been proofread manually.
    The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
    The corpus is intended for testing and for telephone natural speech recognition systems.
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 1592.00 EUR 1592.00 EUR
    Commercial Use 1592.00 EUR 1592.00 EUR

    Non-Members Academic org. Commercial org.
    Research Use 1592.00 EUR 1592.00 EUR
    Commercial Use 1592.00 EUR 1592.00 EUR


    ELRA-S0228-09
    Mandarin Chinese Telephone Speech Recognition Corpus - Stock (Available since 17/01/2007.)


    This corpus comprises 3,077 entries uttered by 285 speakers of different dialects, ages and various educational levels (144 males and 141 females), recorded over the fixed telephone network. The database comprises 7,239 Chinese stocks. Speech samples are stored as a sequence of 16-bit 8kHz WAV for a total of 7 hours of speech. The total capacity of the data is 373 Mb.
    Each speaker read 15-30 items. Text files are stored in Unicode format. All data have been proofread manually.
    The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
    The corpus is intended for testing and for telephone natural speech recognition systems.
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 1394.00 EUR 1394.00 EUR
    Commercial Use 1394.00 EUR 1394.00 EUR

    Non-Members Academic org. Commercial org.
    Research Use 1394.00 EUR 1394.00 EUR
    Commercial Use 1394.00 EUR 1394.00 EUR


    ELRA-S0228-10
    Mandarin Chinese Telephone Speech Recognition Corpus – SMS (Mobile telephone 64) (Available since 17/01/2007.)


    This corpus comprises 1,079 entries uttered by 64 speakers of different dialects, ages and various educational levels (52 males and 12 females), recorded over the mobile telephone network. The database comprises 3,190 Chinese short messages (SMS). Speech samples are stored as a sequence of 16-bit 8kHz WAV for a total of 3 hours of speech. The total capacity of the data is 161 Mb.
    Each speaker read 50 items. Text files are stored in Unicode format. All data have been proofread manually.
    The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
    The corpus is intended for testing and for telephone natural speech recognition systems.
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 796.00 EUR 796.00 EUR
    Commercial Use 796.00 EUR 796.00 EUR

    Non-Members Academic org. Commercial org.
    Research Use 796.00 EUR 796.00 EUR
    Commercial Use 796.00 EUR 796.00 EUR


    ELRA-S0228-11
    Mandarin Chinese Telephone Speech Recognition Corpus – SMS (Fixed phone 86) (Available since 17/01/2007.)


    This corpus comprises 1,648 entries uttered by 86 speakers of different dialects, ages and various educational levels (64 males and 22 females), recorded over the fixed telephone network. The database comprises 4,282 Chinese short messages (SMS). Speech samples are stored as a sequence of 16-bit 8kHz WAV for a total of 3.7 hours of speech. The total capacity of the data is 205 Mb.
    Each speaker read 50 items. Text files are stored in Unicode format. All data have been proofread manually.
    The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
    The corpus is intended for testing and for telephone natural speech recognition systems.
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 796.00 EUR 796.00 EUR
    Commercial Use 796.00 EUR 796.00 EUR

    Non-Members Academic org. Commercial org.
    Research Use 796.00 EUR 796.00 EUR
    Commercial Use 796.00 EUR 796.00 EUR


    ELRA-S0228-12
    Mandarin Chinese Desktop Speech Recognition Corpus - SMS (200 people) (Available since 17/01/2007.)


    This corpus comprises 7,276 entries uttered by 200 speakers of different dialects, ages and various educational levels (87 males and 113 females), recorded over 4 channels (Mic1: SHURE SM58; Mic2: ANC-700 Head-mounted; Mic3: TELEX M-60; Mic4: ACOUSTIC MAGIC). The database comprises 23,949 short messages (SMS) per channel. Speech samples are stored as a sequence of 16-bit 22.05kHz WAV for 35.6 hours of speech per channel. The total capacity of the data is 21.1 Gb.
    Each speaker read 120 items. Text files are stored in Unicode format. All data have been proofread manually.
    The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
    The corpus is intended for testing and for telephone natural speech recognition systems.
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 5175.00 EUR 5175.00 EUR
    Commercial Use 5175.00 EUR 5175.00 EUR
    * Price for one channel only. For 2 channels: 8280€

    Non-Members Academic org. Commercial org.
    Research Use 5175.00 EUR 5175.00 EUR
    Commercial Use 5175.00 EUR 5175.00 EUR
    * Price for one channel only. For 2 channels: 8280€


    ELRA-S0228-13
    Mandarin Chinese Desktop Speech Recognition Corpus - Digit String (200 people) (Available since 17/01/2007.)


    This corpus comprises 1,500 entries uttered by 200 speakers of different dialects, ages and various educational levels (87 males and 113 females), recorded over 4 channels (Mic1: SHURE SM58; Mic2: ANC-700 Head-mounted; Mic3: TELEX M-60; Mic4: ACOUSTIC MAGIC). The database comprises 6,000 digit strings per channel. Speech samples are stored as a sequence of 16-bit 22.05kHz WAV for 11.5 hours of speech per channel. The total capacity of the data is 6.82 Gb.
    Each speaker read 30 items. Text files are stored in Unicode format. All data have been proofread manually.
    The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
    The corpus is intended for testing and for telephone natural speech recognition systems.
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 1194.00 EUR 1194.00 EUR
    Commercial Use 1194.00 EUR 1194.00 EUR
    * Price for one channel only. For 2 channels: 1911€

    Non-Members Academic org. Commercial org.
    Research Use 1194.00 EUR 1194.00 EUR
    Commercial Use 1194.00 EUR 1194.00 EUR
    * Price for one channel only. For 2 channels: 1911€


    ELRA-S0228-14
    Mandarin Chinese Desktop Speech Recognition Corpus - Person Name, Place Name (10 people) (Available since 17/01/2007.)


    This corpus comprises 782 entries uttered by 10 speakers of different dialects, ages and various educational levels (3 males and 7 females), recorded over 4 channels (Mic1: SHURE SM58; Mic2: ANC-700 Head-mounted; Mic3: TELEX M-60; Mic4: ACOUSTIC MAGIC). The database comprises 800 Chinese items per channel: 30 stocks, 10 nation names, 10 Chinese city names, 30 person names. Speech samples are stored as a sequence of 16-bit 22.05kHz WAV for 0.97 hours of speech per channel. The total capacity of the data is 587 Mb.
    Each speaker read 120 items. Text files are stored in Unicode format. All data have been proofread manually.
    The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
    The corpus is intended for testing and for telephone natural speech recognition systems.
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 100.00 EUR 100.00 EUR
    Commercial Use 100.00 EUR 100.00 EUR
    * Price for one channel only. For 2 channels: 159€

    Non-Members Academic org. Commercial org.
    Research Use 100.00 EUR 100.00 EUR
    Commercial Use 100.00 EUR 100.00 EUR
    * Price for one channel only. For 2 channels: 159€


    ELRA-S0228-15
    Mandarin Chinese Desktop Speech Recognition Corpus - SMS (120 people) (Available since 17/01/2007.)


    This corpus comprises 7,142 entries uttered by 120 speakers of different dialects, ages and various educational levels (59 males and 61 females), recorded through a head-mounted noise-canceling microphone. The database comprises 16,499 short messages (SMS). Speech samples are stored as a sequence of 16-bit 22.05kHz WAV for 21.7 hours of speech. The total capacity of the data is 3.2 Gb.
    Each speaker read 120-150 items. Text files are stored in Unicode format. All data have been proofread manually.
    The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
    The corpus is intended for testing and for telephone natural speech recognition systems.
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 3185.00 EUR 3185.00 EUR
    Commercial Use 3185.00 EUR 3185.00 EUR

    Non-Members Academic org. Commercial org.
    Research Use 3185.00 EUR 3185.00 EUR
    Commercial Use 3185.00 EUR 3185.00 EUR
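
One way to confirm that delivered audio matches the advertised format (16-bit, 22.05 kHz WAV for ELRA-S0228-15) is to read each file's header with Python's standard `wave` module. This is a minimal sketch: `wav_matches` is a hypothetical helper, the catalogue gives no file names or directory layout, and the code assumes plain PCM WAV files that the `wave` module can open.

```python
import wave


def wav_matches(path, sample_rate_hz, bits_per_sample=16):
    """Return True if the WAV header reports the expected rate and sample width."""
    with wave.open(str(path), "rb") as wav:
        return (
            wav.getframerate() == sample_rate_hz
            # getsampwidth() returns the sample width in bytes
            and wav.getsampwidth() * 8 == bits_per_sample
        )
```

A call such as `wav_matches(some_path, 22050)` would then flag any file whose header disagrees with the description, before it is used for testing.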


    ELRA-S0228-16
    Mandarin Chinese Desktop Speech Recognition Corpus - Digit String (120 people) (Available since 17/01/2007.)


    This corpus comprises 1,500 entries uttered by 120 speakers of different dialects, ages and various educational levels (59 males and 61 females), recorded through a head-mounted noise-canceling microphone. The database comprises 3,600 digit strings. Speech samples are stored as a sequence of 16-bit 22.05kHz WAV for a total of 6.2 hours of speech. The total capacity of the data is 945 Mb.
    Each speaker read 120-150 items. Text files are stored in Unicode format. All data have been proofread manually.
    The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
    The corpus is intended for testing and for telephone natural speech recognition systems.
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 597.00 EUR 597.00 EUR
    Commercial Use 597.00 EUR 597.00 EUR

    Non-Members Academic org. Commercial org.
    Research Use 597.00 EUR 597.00 EUR
    Commercial Use 597.00 EUR 597.00 EUR


    ELRA-S0228-17
    Mandarin Chinese Desktop Speech Recognition Corpus - Person Name, Place Name (70 people) (Available since 17/01/2007.)


    This corpus comprises 9,667 entries uttered by 70 speakers of different dialects, ages and various educational levels (38 males and 32 females), recorded through a head-mounted noise-canceling microphone. The database comprises 12,596 items. Speech samples are stored as a sequence of 16-bit 22.05kHz WAV for a total of 15 hours of speech. The total capacity of the data is 2.17 Gb.
    Each speaker read 60 person names, 20 country names, 10 Chinese city names, 30 street names, 50 company and organization names, 10 geographical names. Text files are stored in Unicode format. All data have been proofread manually.
    The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
    The corpus is intended for testing and for telephone natural speech recognition systems.
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 1991.00 EUR 1991.00 EUR
    Commercial Use 1991.00 EUR 1991.00 EUR

    Non-Members Academic org. Commercial org.
    Research Use 1991.00 EUR 1991.00 EUR
    Commercial Use 1991.00 EUR 1991.00 EUR


    ELRA-S0228-18
    Mandarin Chinese Desktop Speech Recognition Corpus - Stock (70 people) (Available since 17/01/2007.)


    This corpus comprises 1,586 entries uttered by 70 speakers of different dialects, ages and various educational levels (38 males and 32 females), recorded through a head-mounted noise-canceling microphone. The database comprises 4,199 items. Speech samples are stored as a sequence of 16-bit 22.05kHz WAV for a total of 5.1 hours of speech. The total capacity of the data is 776 Mb.
    Each speaker read 60 stocks. Text files are stored in Unicode format. All data have been proofread manually.
    The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
    The corpus is intended for testing and for telephone natural speech recognition systems.
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 597.00 EUR 597.00 EUR
    Commercial Use 597.00 EUR 597.00 EUR

    Non-Members Academic org. Commercial org.
    Research Use 597.00 EUR 597.00 EUR
    Commercial Use 597.00 EUR 597.00 EUR


    ELRA-S0228-19
    Mandarin Chinese Desktop Speech Recognition Corpus - Spontaneous Speech (50 people) (Available since 17/01/2007.)


    This corpus comprises spontaneous speech (elicited) from 50 speakers of different dialects, ages and various educational levels (21 males and 29 females), who spoke on 36 different topics in a working environment, recorded through a head-mounted noise-cancelling microphone. The database comprises 600 speech files. Speech samples are stored as a sequence of 16-bit 44.1kHz WAV for a total of 8 hours of speech. The total capacity of the data is 2.37 Gb.
    Text files are stored in Unicode format. All data have been proofread manually.
    The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
    The corpus is intended for testing and for telephone natural speech recognition systems.
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 2986.00 EUR 2986.00 EUR
    Commercial Use 2986.00 EUR 2986.00 EUR

    Non-Members Academic org. Commercial org.
    Research Use 2986.00 EUR 2986.00 EUR
    Commercial Use 2986.00 EUR 2986.00 EUR


    ELRA-S0228-20
    Mandarin Chinese Desktop Speech Recognition Corpus - Stock, Person Name, Digit String, Simple Chinese Sentences, Spontaneous Speech (50 people) (Available since 17/01/2007.)


    This corpus comprises 8,206 entries including stocks, person names, digit strings and 8,511 speech files composed of spontaneous speech, uttered by 50 speakers of different dialects, ages and various educational levels (22 males and 28 females), recorded from a stand microphone (SHURE SM58). Speech samples are stored as a sequence of 16-bit 44.1kHz WAV for a total of 24 hours of speech. The total capacity of the data is 7 Gb.
    Text files are stored in Unicode format. All data have been proofread manually.
    The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
    The corpus is intended for testing and for telephone natural speech recognition systems.
    Language(s): Chinese

    Members Academic org. Commercial org.
    Research Use 2192.00 EUR 2192.00 EUR
    Commercial Use 2192.00 EUR 2192.00 EUR

    Non-Members Academic org. Commercial org.
    Research Use 2192.00 EUR 2192.00 EUR
    Commercial Use 2192.00 EUR 2192.00 EUR



    Copyright © 2008 ELRA
    ELRACatalogue 0.8.0