ELRA
  • Catalog Reference : ELRA-S0165
    MICROAES
    The ATLAS Spanish Microphone Database (MICROAES) was collected in Spain by Applied Technologies on Language and Speech, S.L. (ATLAS). It comprises microphone recordings from 300 speakers selected from five dialectal areas. Sex and age distribution were also taken into account in speaker selection.

    The corpus has 30 sets of 15 paragraphs, giving a total of 450 paragraphs. Each 15-paragraph set contains at least two allophones from the extended SAMPA symbols. For this purpose, coarticulation effects between words were considered.

    The recording platform is based on a laptop using a PCMCIA slot as interface to the audio equipment. Up to four microphones are recorded simultaneously:

    * Sennheiser ME 104 (close distance)
    * Nokia Lavalier HDC-6D (close distance)
    * Sennheiser ME 64 (medium distance)
    * Haun MBNM-550 E-L (far distance)

    In this database, all recordings were made in an office, with no discussion or meeting taking place during the recordings. The signals are stored in a raw file format, i.e. without headers in the signal file. Each of the four speech channels is recorded at 16 kHz with 16-bit quantization.
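
    A minimal sketch of reading one such headerless file in Python. It assumes one file per channel, signed 16-bit linear PCM samples, and little-endian byte order by default; none of these is stated in this entry, and the actual byte order should be taken from the corresponding SAM label file. The names `read_raw_pcm` and `duration_seconds` are hypothetical helpers, not part of the database.

```python
import array
import sys

SAMPLE_RATE_HZ = 16_000   # stated in the catalogue entry
BYTES_PER_SAMPLE = 2      # 16-bit quantization


def read_raw_pcm(path, little_endian=True):
    """Read a headerless file of signed 16-bit samples into an array of ints.

    Assumption: the file holds a single channel of linear PCM. The real byte
    order should be taken from the matching SAM label file.
    """
    with open(path, "rb") as f:
        data = f.read()
    # Drop a trailing odd byte, if any, so the buffer is a whole number of samples.
    data = data[: len(data) - (len(data) % BYTES_PER_SAMPLE)]
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(data)
    # array.array uses native byte order; swap if the file's order differs.
    if little_endian != (sys.byteorder == "little"):
        samples.byteswap()
    return samples


def duration_seconds(samples):
    """Duration implied by the sample count at 16 kHz."""
    return len(samples) / SAMPLE_RATE_HZ
```

    Because the files carry no header, nothing in the signal file itself can confirm these choices; a wrong byte order typically shows up as loud noise rather than as an error.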

    A description of the sample rate, the quantization, and byte order used is held in the SAM label file that corresponds to each speech file. This label file also contains information about the signal quality value of the speech file.
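
    The sketch below shows one way such a label file might be read. It assumes the label file is plain text with one `MNEMONIC: value` pair per line, and that it uses the SpeechDat-style mnemonics `SAM` (sampling rate in Hz), `SSB` (significant bits per sample) and `SBF` (byte order, with `lo_hi` meaning little-endian). This entry does not list the field names, so they should be checked against the documentation shipped with the database.

```python
def parse_sam_label(text):
    """Parse 'KEY: value' lines into a dict, keeping the first value per key."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields.setdefault(key.strip(), value.strip())
    return fields


def signal_format(fields):
    """Extract sample rate, bit depth and byte order from parsed fields.

    Assumption: SAM, SSB and SBF carry the meanings described in the lead-in.
    """
    return {
        "sample_rate_hz": int(fields["SAM"]),
        "bits": int(fields["SSB"]),
        "little_endian": fields["SBF"].lower() == "lo_hi",
    }


# Hypothetical label content, made up for illustration only.
example_label = "SAM: 16000\nSSB: 16\nSBF: lo_hi\n"
print(signal_format(parse_sam_label(example_label)))
```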

    The transcription included in this database is an orthographic, lexical transcription with a few details that represent audible acoustic events (speech and non-speech) present in the corresponding waveform files. The transcription includes segment markers, placed at speaker pauses, that divide each paragraph into portions of less than 10 seconds.
    The lexicon file included in this database has more than 7,400 words, with the corresponding pronunciation information in SAMPA phonemic notation.

    The database contains 30 hours of speech and is distributed in 30 ISO 9660 CD-ROM volumes or 5 ISO 9660 DVD-ROM volumes.
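
    As a rough consistency check on these figures, the raw data size implied by 16 kHz, 16-bit audio can be worked out directly. The sketch assumes that the 30 hours is counted per channel and that all four channels cover the full duration; this entry states neither, so the result is only an order-of-magnitude estimate. The helper `raw_bytes` is hypothetical.

```python
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2   # 16-bit quantization
HOURS = 30
CHANNELS = 4           # assumption: every recording has all four microphones


def raw_bytes(hours, channels=1):
    """Size in bytes of headerless PCM audio of the given duration."""
    return int(hours * 3600 * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * channels)


per_channel = raw_bytes(HOURS)              # 3_456_000_000 bytes
all_channels = raw_bytes(HOURS, CHANNELS)   # 13_824_000_000 bytes
print(per_channel / 1e9, all_channels / 1e9)
```

    Under these assumptions the audio alone comes to roughly 13.8 GB, which is the same order of magnitude as the capacity of 30 CD-ROMs or 5 single-layer DVD-ROMs.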

    ISLRN : 313-534-255-935-8
    Technical Information
    Distribution medium : Downloadable
    Contents : speech corpus
    Members Prices
    Academic - Commercial 28000.00 EUR
    Academic - Research 18000.00 EUR
    Commercial - Commercial 28000.00 EUR
    Commercial - Research 28000.00 EUR
    Non Member Prices
    Academic - Commercial 32000.00 EUR
    Academic - Research 22000.00 EUR
    Commercial - Commercial 32000.00 EUR
    Commercial - Research 32000.00 EUR

    Copyright © 2008 ELRA