2 Language Resources

 EthioSpeech    
  • Amharic
  • Oromo
  • Sidamo
  • Somali
  • Tigrinya

ID: ELRA-S0494

ISLRN: 886-456-351-764-8

EthioSpeech Corpora comprises over 391 hours of recorded read speech in six Ethiopian languages, with ca. 200 speakers per language: Amharic (68 hours), Tigrigna (62 hours), Oromo (70 hours), Somali (56 hours), Afar (68 hours), and Sidama (68 hours). The dominating domain is media (m...

Member prices
Licence: Non Commercial Use - ELRA END USER
  • Academic: 0.00 €
  • Commercial: 4500.00 €
Licence: Commercial Use - ELRA VAR
  • Academic: 4500.00 €
  • Commercial: 4500.00 €

Non-member prices
Licence: Non Commercial Use - ELRA END USER
  • Academic: 0.00 €
  • Commercial: 5400.00 €
Licence: Commercial Use - ELRA VAR
  • Academic: 5400.00 €
  • Commercial: 5400.00 €

 MiLQ: Mixed-Language Query Test Set for Bilingual Web Search – Evaluation Package    
  • Chinese
  • Finnish
  • French
  • German
  • Persian
  • Russian
  • Somali
  • Swahili (macrolanguage)

ID: ELRA-E0047

ISLRN: 200-586-423-805-2

MiLQ is a benchmark of mixed-language (code-switched) search queries created by bilingual speakers for evaluating Information Retrieval with mixed-language queries. It provides query versions where English expressions are embedded within native-language structures. This work is derived from The C...

Member prices
Licence: Evaluation Use - ELRA EVALUATION
  • Academic: 0.00 €
  • Commercial: 0.00 €

Non-member prices
Licence: Evaluation Use - ELRA EVALUATION
  • Academic: 0.00 €
  • Commercial: 0.00 €