| Products meeting the search criteria |
 |
|
 |
| Displaying 1 to 20 (of 57 products) |
Result Pages: 1 |
This corpus contains recordings of one native Chinese speaker (female).
The corpus consists of 20 texts totalling 109,227 words and has been proofread manually. Its contents include phrases, digit strings, letter strings, uncommon words, neutral tone, final retroflexion, the Latin alphabet, interrogative sentences and 282 English words.
The speaker was recorded in a professional recording studio over 2 channels, microphone and glottis wave (fundamental frequency) signals, for a total of 18.2 hours.
Speech samples are stored as sequences of 16-bit 44.1 kHz PCM on two channels. The total data size is 5.67 GB across 12,679 files. The data is encoded in GB-2312 format.
The transcriptions include labels for four-class pause boundaries.
This database is intended for use in text-to-speech and speech synthesis applications.
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 13932.00 EUR | 13932.00 EUR |
| Commercial Use | 13932.00 EUR | 13932.00 EUR |
| Non-members | Academic org. | Commercial org. |
| Research Use | 13932.00 EUR | 13932.00 EUR |
| Commercial Use | 13932.00 EUR | 13932.00 EUR |
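The description above states that the data is encoded in GB-2312. A minimal sketch of reading such a text file in Python follows, assuming the transcriptions are plain-text files; the helper `read_gb2312` and the file name `sample.txt` are hypothetical and do not come from the corpus documentation.

```python
from pathlib import Path


def read_gb2312(path):
    """Return the contents of a GB-2312 encoded text file as a Python str."""
    # "gb2312" is a codec shipped with Python's standard library.
    # Decoding is strict by default, so bytes outside GB-2312 raise UnicodeDecodeError.
    return Path(path).read_text(encoding="gb2312")


if __name__ == "__main__":
    import tempfile

    # Hypothetical sample file; real file names are not given in the listing.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.txt"
        sample.write_bytes("你好".encode("gb2312"))
        print(read_gb2312(sample))
```

If strict decoding fails on real files, trying the `gbk` or `gb18030` codecs, which extend GB-2312, may be a reasonable next step.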
|
|
|
This corpus contains recordings of one native Chinese speaker (female).
This corpus complements the Basic Corpus (ELRA-S0228/01) and aims to cover a variety of speech context data that does not include syllables.
The corpus consists of 28 texts totalling 75,841 words and has been proofread manually. Its contents include text of statements, digit strings, uncommon words, letter strings, measurement units, neutral tone, final retroflexion, the Latin alphabet, interrogative sentences, English words and room-ordering stimulation.
The speaker was recorded in a professional recording studio over 2 channels, microphone and glottis wave (fundamental frequency) signals, for a total of 30.2 hours.
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 5971.00 EUR | 5971.00 EUR |
| Commercial Use | 5971.00 EUR | 5971.00 EUR |
| Non-members | Academic org. | Commercial org. |
| Research Use | 5971.00 EUR | 5971.00 EUR |
| Commercial Use | 5971.00 EUR | 5971.00 EUR |
|
|
|
The Mandarin Chinese Speech Synthesis Integrated Corpus includes both the Basic and Accessory Corpora (see ELRA-S0228/01 and ELRA-S0228/02).
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 19903.00 EUR | 19903.00 EUR |
| Commercial Use | 19903.00 EUR | 19903.00 EUR |
| Non-members | Academic org. | Commercial org. |
| Research Use | 19903.00 EUR | 19903.00 EUR |
| Commercial Use | 19903.00 EUR | 19903.00 EUR |
|
|
|
This corpus comprises 6,952 entries uttered by 265 speakers of different dialects, ages and educational levels (134 males and 131 females), recorded over the mobile telephone network. The database comprises 13,942 Chinese personal names and place names. Speech samples are stored as 16-bit 8 kHz WAV, for a total of 17.6 hours of speech. The total data size is 964 MB.
Each speaker read 15-30 items. Text files are stored in Unicode format. All data have been proofread manually.
The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
The corpus is intended for testing telephone natural speech recognition systems.
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 3383.00 EUR | 3383.00 EUR |
| Commercial Use | 3383.00 EUR | 3383.00 EUR |
| Non-members | Academic org. | Commercial org. |
| Research Use | 3383.00 EUR | 3383.00 EUR |
| Commercial Use | 3383.00 EUR | 3383.00 EUR |
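The stated duration, sample rate and bit depth of a corpus can be cross-checked against its stated size with simple arithmetic. The sketch below does this for the mobile-network names corpus above (17.6 hours, 8 kHz, 16-bit). It assumes uncompressed single-channel PCM and ignores WAV header overhead; neither assumption is stated in the listing, and the function name `pcm_size_bytes` is hypothetical.

```python
def pcm_size_bytes(hours, sample_rate_hz, bits_per_sample, channels=1):
    """Approximate size in bytes of uncompressed PCM audio, ignoring file headers."""
    seconds = hours * 3600
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
    return seconds * bytes_per_second


if __name__ == "__main__":
    size = pcm_size_bytes(hours=17.6, sample_rate_hz=8000, bits_per_sample=16)
    # Report in units of 2**20 bytes; that the listing uses this unit is an assumption.
    print(f"{size / 2**20:.0f}")
```

Under these assumptions the estimate comes out at roughly 967 units of 2^20 bytes, close to the listed figure of 964; the small gap is consistent with the duration being rounded to one decimal place.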
|
|
|
This corpus comprises 7,298 entries uttered by 285 speakers of different dialects, ages and educational levels (144 males and 141 females), recorded over the fixed telephone network. The database comprises 14,492 Chinese personal names and place names. Speech samples are stored as 16-bit 8 kHz WAV, for a total of 17.6 hours of speech. The total data size is 968 MB.
Each speaker read 15-30 items. Text files are stored in Unicode format. All data have been proofread manually.
The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
The corpus is intended for testing telephone natural speech recognition systems.
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 2786.00 EUR | 2786.00 EUR |
| Commercial Use | 2786.00 EUR | 2786.00 EUR |
| Non-members | Academic org. | Commercial org. |
| Research Use | 2786.00 EUR | 2786.00 EUR |
| Commercial Use | 2786.00 EUR | 2786.00 EUR |
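Because the listing states that the telephone recordings are 16-bit 8 kHz WAV, a recipient might want to sanity-check delivered files against that description. The sketch below uses Python's standard `wave` module; the function names are hypothetical, and it assumes the files are standard PCM WAV files that this module can read.

```python
import wave


def check_wav_format(path, expected_rate_hz=8000, expected_bits=16):
    """Return True if the WAV header matches the expected sample rate and bit depth."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        bits = wav.getsampwidth() * 8  # getsampwidth() returns bytes per sample
    return rate == expected_rate_hz and bits == expected_bits


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds, computed from its header."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```

Summing `wav_duration_seconds` over all files in a delivery would give a total that can be compared with the number of hours quoted for the corpus.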
|
|
|
This corpus comprises 5,309 entries uttered by 265 speakers of different dialects, ages and educational levels (134 males and 131 females), recorded over the fixed telephone network. The database comprises 7,606 Chinese digit strings. Speech samples are stored as 16-bit 8 kHz WAV, for a total of 11.8 hours of speech. The total data size is 648 MB.
Each speaker read 25-30 items. Text files are stored in Unicode format. All data have been proofread manually.
The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
The corpus is intended for testing telephone natural speech recognition systems.
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 1791.00 EUR | 1791.00 EUR |
| Commercial Use | 1791.00 EUR | 1791.00 EUR |
| Non-members | Academic org. | Commercial org. |
| Research Use | 1791.00 EUR | 1791.00 EUR |
| Commercial Use | 1791.00 EUR | 1791.00 EUR |
|
|
|
This corpus comprises 6,140 entries uttered by 265 speakers of different dialects, ages and educational levels (144 males and 141 females), recorded over the mobile telephone network. The database comprises 8,109 Chinese digit strings. Speech samples are stored as 16-bit 8 kHz WAV, for a total of 11.8 hours of speech. The total data size is 669 MB.
Each speaker read 25-30 items. Text files are stored in Unicode format. All data have been proofread manually.
The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
The corpus is intended for testing telephone natural speech recognition systems.
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 1592.00 EUR | 1592.00 EUR |
| Commercial Use | 1592.00 EUR | 1592.00 EUR |
| Non-members | Academic org. | Commercial org. |
| Research Use | 1592.00 EUR | 1592.00 EUR |
| Commercial Use | 1592.00 EUR | 1592.00 EUR |
|
|
|
This corpus comprises 3,085 entries uttered by 265 speakers of different dialects, ages and educational levels (134 males and 131 females), recorded over the mobile telephone network. The database comprises 6,972 Chinese stocks. Speech samples are stored as 16-bit 8 kHz WAV, for a total of 7 hours of speech. The total data size is 387 MB.
Each speaker read 15-30 items. Text files are stored in Unicode format. All data have been proofread manually.
The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
The corpus is intended for testing telephone natural speech recognition systems.
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 1592.00 EUR | 1592.00 EUR |
| Commercial Use | 1592.00 EUR | 1592.00 EUR |
| Non-members | Academic org. | Commercial org. |
| Research Use | 1592.00 EUR | 1592.00 EUR |
| Commercial Use | 1592.00 EUR | 1592.00 EUR |
|
|
|
This corpus comprises 3,077 entries uttered by 285 speakers of different dialects, ages and educational levels (144 males and 141 females), recorded over the fixed telephone network. The database comprises 7,239 Chinese stocks. Speech samples are stored as 16-bit 8 kHz WAV, for a total of 7 hours of speech. The total data size is 373 MB.
Each speaker read 15-30 items. Text files are stored in Unicode format. All data have been proofread manually.
The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
The corpus is intended for testing telephone natural speech recognition systems.
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 1394.00 EUR | 1394.00 EUR |
| Commercial Use | 1394.00 EUR | 1394.00 EUR |
| Non-members | Academic org. | Commercial org. |
| Research Use | 1394.00 EUR | 1394.00 EUR |
| Commercial Use | 1394.00 EUR | 1394.00 EUR |
|
|
|
This corpus comprises 1,079 entries uttered by 64 speakers of different dialects, ages and educational levels (52 males and 12 females), recorded over the mobile telephone network. The database comprises 3,190 Chinese short messages (SMS). Speech samples are stored as 16-bit 8 kHz WAV, for a total of 3 hours of speech. The total data size is 161 MB.
Each speaker read 50 items. Text files are stored in Unicode format. All data have been proofread manually.
The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
The corpus is intended for testing telephone natural speech recognition systems.
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 796.00 EUR | 796.00 EUR |
| Commercial Use | 796.00 EUR | 796.00 EUR |
| Non-members | Academic org. | Commercial org. |
| Research Use | 796.00 EUR | 796.00 EUR |
| Commercial Use | 796.00 EUR | 796.00 EUR |
|
|
|
This corpus comprises 1,648 entries uttered by 86 speakers of different dialects, ages and educational levels (64 males and 22 females), recorded over the fixed telephone network. The database comprises 4,282 Chinese short messages (SMS). Speech samples are stored as 16-bit 8 kHz WAV, for a total of 3.7 hours of speech. The total data size is 205 MB.
Each speaker read 50 items. Text files are stored in Unicode format. All data have been proofread manually.
The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
The corpus is intended for testing telephone natural speech recognition systems.
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 796.00 EUR | 796.00 EUR |
| Commercial Use | 796.00 EUR | 796.00 EUR |
| Non-members | Academic org. | Commercial org. |
| Research Use | 796.00 EUR | 796.00 EUR |
| Commercial Use | 796.00 EUR | 796.00 EUR |
|
|
|
This corpus comprises 7,276 entries uttered by 200 speakers of different dialects, ages and educational levels (87 males and 113 females), recorded over 4 channels (Mic1: SHURE SM58; Mic2: ANC-700 Head-mounted; Mic3: TELEX M-60; Mic4: ACOUSTIC MAGIC). The database comprises 23,949 short messages (SMS) per channel. Speech samples are stored as 16-bit 22.05 kHz WAV, for 35.6 hours of speech per channel. The total data size is 21.1 GB.
Each speaker read 120 items. Text files are stored in Unicode format. All data have been proofread manually.
The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
The corpus is intended for testing telephone natural speech recognition systems.
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 5175.00 EUR | 5175.00 EUR |
| Commercial Use | 5175.00 EUR | 5175.00 EUR |
| * Price for one channel only. For 2 channels: 8280€ |
| Non-members | Academic org. | Commercial org. |
| Research Use | 5175.00 EUR | 5175.00 EUR |
| Commercial Use | 5175.00 EUR | 5175.00 EUR |
| * Price for one channel only. For 2 channels: 8280€ |
|
|
|
This corpus comprises 1,500 entries uttered by 200 speakers of different dialects, ages and educational levels (87 males and 113 females), recorded over 4 channels (Mic1: SHURE SM58; Mic2: ANC-700 Head-mounted; Mic3: TELEX M-60; Mic4: ACOUSTIC MAGIC). The database comprises 6,000 digit strings per channel. Speech samples are stored as 16-bit 22.05 kHz WAV, for 11.5 hours of speech per channel. The total data size is 6.82 GB.
Each speaker read 30 items. Text files are stored in Unicode format. All data have been proofread manually.
The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
The corpus is intended for testing telephone natural speech recognition systems.
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 1194.00 EUR | 1194.00 EUR |
| Commercial Use | 1194.00 EUR | 1194.00 EUR |
| * Price for one channel only. For 2 channels: 1911€ |
| Non-members | Academic org. | Commercial org. |
| Research Use | 1194.00 EUR | 1194.00 EUR |
| Commercial Use | 1194.00 EUR | 1194.00 EUR |
| * Price for one channel only. For 2 channels: 1911€ |
|
|
|
This corpus comprises 782 entries uttered by 10 speakers of different dialects, ages and educational levels (3 males and 7 females), recorded over 4 channels (Mic1: SHURE SM58; Mic2: ANC-700 Head-mounted; Mic3: TELEX M-60; Mic4: ACOUSTIC MAGIC). The database comprises 800 Chinese items per channel: 30 stocks, 10 nation names, 10 Chinese city names and 30 person names. Speech samples are stored as 16-bit 22.05 kHz WAV, for 0.97 hours of speech per channel. The total data size is 587 MB.
Each speaker read 120 items. Text files are stored in Unicode format. All data have been proofread manually.
The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
The corpus is intended for testing telephone natural speech recognition systems.
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 100.00 EUR | 100.00 EUR |
| Commercial Use | 100.00 EUR | 100.00 EUR |
| * Price for one channel only. For 2 channels: 159€ |
| Non-members | Academic org. | Commercial org. |
| Research Use | 100.00 EUR | 100.00 EUR |
| Commercial Use | 100.00 EUR | 100.00 EUR |
| * Price for one channel only. For 2 channels: 159€ |
|
|
|
This corpus comprises 7,142 entries uttered by 120 speakers of different dialects, ages and educational levels (59 males and 61 females), recorded through a head-mounted noise-canceling microphone. The database comprises 16,499 short messages (SMS). Speech samples are stored as 16-bit 22.05 kHz WAV, for 21.7 hours of speech. The total data size is 3.2 GB.
Each speaker read 120-150 items. Text files are stored in Unicode format. All data have been proofread manually.
The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
The corpus is intended for testing telephone natural speech recognition systems.
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 3185.00 EUR | 3185.00 EUR |
| Commercial Use | 3185.00 EUR | 3185.00 EUR |
| Non-members | Academic org. | Commercial org. |
| Research Use | 3185.00 EUR | 3185.00 EUR |
| Commercial Use | 3185.00 EUR | 3185.00 EUR |
|
|
|
This corpus comprises 1,500 entries uttered by 120 speakers of different dialects, ages and educational levels (59 males and 61 females), recorded through a head-mounted noise-canceling microphone. The database comprises 3,600 digit strings. Speech samples are stored as 16-bit 22.05 kHz WAV, for a total of 6.2 hours of speech. The total data size is 945 MB.
Each speaker read 120-150 items. Text files are stored in Unicode format. All data have been proofread manually.
The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
The corpus is intended for testing telephone natural speech recognition systems.
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 597.00 EUR | 597.00 EUR |
| Commercial Use | 597.00 EUR | 597.00 EUR |
| Non-members | Academic org. | Commercial org. |
| Research Use | 597.00 EUR | 597.00 EUR |
| Commercial Use | 597.00 EUR | 597.00 EUR |
|
|
|
This corpus comprises 9,667 entries uttered by 70 speakers of different dialects, ages and educational levels (38 males and 32 females), recorded through a head-mounted noise-canceling microphone. The database comprises 12,596 items. Speech samples are stored as 16-bit 22.05 kHz WAV, for a total of 15 hours of speech. The total data size is 2.17 GB.
Each speaker read 60 person names, 20 country names, 10 Chinese city names, 30 street names, 50 company and organization names, and 10 geographical names. Text files are stored in Unicode format. All data have been proofread manually.
The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
The corpus is intended for testing telephone natural speech recognition systems.
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 1991.00 EUR | 1991.00 EUR |
| Commercial Use | 1991.00 EUR | 1991.00 EUR |
| Non-members | Academic org. | Commercial org. |
| Research Use | 1991.00 EUR | 1991.00 EUR |
| Commercial Use | 1991.00 EUR | 1991.00 EUR |
|
|
|
This corpus comprises 1,586 entries uttered by 70 speakers of different dialects, ages and educational levels (38 males and 32 females), recorded through a head-mounted noise-canceling microphone. The database comprises 4,199 items. Speech samples are stored as 16-bit 22.05 kHz WAV, for a total of 5.1 hours of speech. The total data size is 776 MB.
Each speaker read 60 stocks. Text files are stored in Unicode format. All data have been proofread manually.
The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
The corpus is intended for testing telephone natural speech recognition systems.
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 597.00 EUR | 597.00 EUR |
| Commercial Use | 597.00 EUR | 597.00 EUR |
| Non-members | Academic org. | Commercial org. |
| Research Use | 597.00 EUR | 597.00 EUR |
| Commercial Use | 597.00 EUR | 597.00 EUR |
|
|
|
This corpus comprises elicited spontaneous speech from 50 speakers of different dialects, ages and educational levels (21 males and 29 females), who spoke on 36 different topics in a working environment and were recorded through a head-mounted noise-cancelling microphone. The database comprises 600 speech files. Speech samples are stored as 16-bit 44.1 kHz WAV, for a total of 8 hours of speech. The total data size is 2.37 GB.
Text files are stored in Unicode format. All data have been proofread manually.
The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
The corpus is intended for testing telephone natural speech recognition systems.
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 2986.00 EUR | 2986.00 EUR |
| Commercial Use | 2986.00 EUR | 2986.00 EUR |
| Non-members | Academic org. | Commercial org. |
| Research Use | 2986.00 EUR | 2986.00 EUR |
| Commercial Use | 2986.00 EUR | 2986.00 EUR |
|
|
|
This corpus comprises 8,206 entries including stocks, person names and digit strings, and 8,511 speech files composed of spontaneous speech, uttered by 50 speakers of different dialects, ages and educational levels (22 males and 28 females) and recorded from a stand microphone (SHURE SM58). Speech samples are stored as 16-bit 44.1 kHz WAV, for a total of 24 hours of speech. The total data size is 7 GB.
Text files are stored in Unicode format. All data have been proofread manually.
The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words and duplicates.
The corpus is intended for testing telephone natural speech recognition systems.
Language(s): Chinese
|
| Members | Academic org. | Commercial org. |
| Research Use | 2192.00 EUR | 2192.00 EUR |
| Commercial Use | 2192.00 EUR | 2192.00 EUR |
| Non-members | Academic org. | Commercial org. |
| Research Use | 2192.00 EUR | 2192.00 EUR |
| Commercial Use | 2192.00 EUR | 2192.00 EUR |
|
|
|