11 Language Resources

Order by:

 Database of Persian Names    
  • Persian

ID: ELRA-L0127

ISLRN: 739-878-734-567-6

A unique resource that has been developed in cooperation with a team of native-speaker experts in Persian phonology. The data includes a confidence rank to indicate the relative likelihood that a variant will be encountered in the real world.

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
6000.00 € submit
10000.00 € submit
Licence: Commercial Use - ELRA VAR
12000.00 € submit
20000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
7500.00 € submit
12500.00 € submit
Licence: Commercial Use - ELRA VAR
15000.00 € submit
25000.00 € submit
 Farsdat (Farsi Speech Database)    
  • Persian

ID: ELRA-S0112

ISLRN: 141-131-349-230-0

The Persian Speech Database Farsdat comprises the recordings of 300 Iranian speakers, who differ from each other with regards to age, sex, education level, and dialect (10 dialect regions of Iran were represented: Tehrani, Torki, Esfahani, Jonubi, Shomali, Khorassani, Baluchi, Kordi, Lori, and Ya...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
800.00 € submit
2500.00 € submit
Licence: Commercial Use - ELRA VAR
2500.00 € submit
2500.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
1200.00 € submit
5000.00 € submit
Licence: Commercial Use - ELRA VAR
5000.00 € submit
5000.00 € submit
 Large Farsdat    
  • Persian

ID: ELRA-S0380

ISLRN: 067-486-870-902-0

Large Farsdat (L-FARSDAT) is a Persian (Farsi) Speech Database containing about 73 hours of read speech from formal Farsi texts (newspapers) which have been recorded by 100 speakers through a unidirectional desktop microphone. Each speaker uttered 20-25 pages of text from various subjects and rec...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
6400.00 € submit
25000.00 € submit
Licence: Commercial Use - ELRA VAR
25000.00 € submit
25000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
8000.00 € submit
30000.00 € submit
Licence: Commercial Use - ELRA VAR
30000.00 € submit
30000.00 € submit
 Persian 1984 corpus (Multext-East framework)    
  • Persian

ID: ELRA-W0054

ISLRN: 851-240-629-673-1

This corpus contains the Persian (Farsi) translation of a part of the novel “1984” (G. Orwell) annotated in the Multext-East framework (Multilingual Text Tools and Corpora for Eastern and Central European Languages). The aim of the Multext-East project was to develop standardized language resourc...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
45.00 € submit
2000.00 € submit
Licence: Commercial Use - ELRA VAR
2000.00 € submit
2000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
100.00 € submit
5000.00 € submit
Licence: Commercial Use - ELRA VAR
5000.00 € submit
5000.00 € submit
 Persian Audio Dictionary      
  • Persian

ID: ELRA-S0401

ISLRN: 133-181-128-420-9

This dictionary consists of more than 50,000 entries (along with almost all wordforms and proper names) with corresponding audio files in MP3 and English transliterations. The words have been recorded with standard Persian (Farsi) pronunciation (all by a single speaker). This dictionary is provid...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
900.00 € submit
900.00 € submit
Licence: Commercial Use - ELRA VAR
4500.00 € submit
4500.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
1100.00 € submit
1100.00 € submit
Licence: Commercial Use - ELRA VAR
5400.00 € submit
5400.00 € submit
 Persian Ezafe Construction Dataset    
  • Persian

ID: ELRA-W0315

ISLRN: 663-014-610-121-2

The Persian Ezafe Construction Dataset includes gold Ezafe tags in almost 30 thousand Persian sentences. The sentences were manually annotated by six annotators who where all native Persian speakers and linguists. The inter-annotator agreement of a small portion of the data (one thousand sentence...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
500.00 € submit
2500.00 € submit
Licence: Commercial Use - ELRA VAR
2500.00 € submit
2500.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
750.00 € submit
3750.00 € submit
Licence: Commercial Use - ELRA VAR
3750.00 € submit
3750.00 € submit
 Persian Kids’ Speech Corpus    
  • Persian

ID: ELRA-S0487

ISLRN: 822-550-731-416-7

The Persian Kids’ Speech Corpus consists of speech signals recorded by 286 children (141 girls, 145 boys), from 6 to 9 years old, through an Andreas Mic Anti-Noise microphone and a Premium Speechmike headphone. The CoolEdit Pro2.1 software was utilized to record the speech at 16 kHz, single-chann...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
50.00 € submit
1000.00 € submit
Licence: Commercial Use - ELRA VAR
1000.00 € submit
1000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
100.00 € submit
1500.00 € submit
Licence: Commercial Use - ELRA VAR
1500.00 € submit
1500.00 € submit
 Persian Lexicon    
  • Persian

ID: ELRA-L0087

ISLRN: 547-614-436-004-7

This is a Persian (Farsi) lexicon of more than 40,000 entries of non-inflected forms of words. Each word is transliterated based on the proposed framework from MBROLA (Text-To-Speech synthesizer). The database includes a large variety of descriptors for each entry (plural, homograph, ...). This...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
500.00 € submit
5000.00 € submit
Licence: Commercial Use - ELRA VAR
5000.00 € submit
5000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
700.00 € submit
7000.00 € submit
Licence: Commercial Use - ELRA VAR
7000.00 € submit
7000.00 € submit
 Persian Multext-East framework lexicon    
  • Persian

ID: ELRA-L0086

ISLRN: 884-966-712-343-0

This is a Persian (Farsi) morphosyntactic lexicon derived from the Persian 1984 corpus (Multext-East framework) (see ELRA-W0054). It contains the full inflectional paradigms of a superset of lemmas that appear in the Persian 1984 corpus. Each entry gives the word-form, its lemma and morphosyntact...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
45.00 € submit
500.00 € submit
Licence: Commercial Use - ELRA VAR
500.00 € submit
500.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
45.00 € submit
2000.00 € submit
Licence: Commercial Use - ELRA VAR
2000.00 € submit
2000.00 € submit
 Persian Speech Corpus    
  • Persian

ID: ELRA-S0393

ISLRN: 068-845-898-304-0

This about 2.5-hour Single-Speaker Speech corpus has been developed using the same methodologies used in the PhD work carried out by Nawar Halabi at the University of Southampton. The corpus was recorded in Persian (Tehrani accent) by one male speaker using a professional studio, through a "Blubb...

MEMBERacademiccommercial
Licence: Attribution, Non Commercial Use, Share Alike - CC-BY-NC-SA
0.00 € submit
0.00 € submit
Licence: Commercial Use - ELRA VAR
4000.00 € submit
4000.00 € submit
NON MEMBERacademiccommercial
Licence: Attribution, Non Commercial Use, Share Alike - CC-BY-NC-SA
0.00 € submit
0.00 € submit
Licence: Commercial Use - ELRA VAR
5000.00 € submit
5000.00 € submit
 Persian Speech Corpus    
  • Persian

ID: ELRA-S0415

ISLRN: 058-406-130-314-1

This dataset contains more than 31 hours and 30 minutes of Persian scripted monologue and dialogue data, recorded from 89 Persian speakers (39 males and 50 females) between 17-80 years old in Iran (Tehrani dialect). Recordings were made between April and January 2022. Data consists of read and sp...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
1000.00 € submit
4000.00 € submit
Licence: Commercial Use - ELRA VAR
4000.00 € submit
4000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
2000.00 € submit
6000.00 € submit
Licence: Commercial Use - ELRA VAR
6000.00 € submit
6000.00 € submit