
22 Language

http://ohsu.eagle-i.net/i/0000013c-3a71-3952-c825-3bd680000000

Resource Description

"The 22 Language corpus consists of telephone speech from 22 languages: Eastern Arabic, Cantonese, Czech, Farsi, French, German, Hindi, Hungarian, Japanese, Korean, Malay, Mandarin, Italian, Polish, Portuguese, Russian, Spanish, Swedish, Swahili, Tamil, Vietnamese, and English. Unfortunately French is not available. The corpus contains fixed vocabulary utterances (e.g. days of the week) as well as fluent continuous speech. We were expecting at least 300 callers in each language. Each utterance is verified by a native speaker to determine if the caller followed instructions when answering the prompts. Some of the calls in each language are transcribed orthographically."

Used by

Center for Spoken Language Understanding

Version

1.5

Data Input

Telephone speech from 22 languages

Website(s)

http://www.cslu.ogi.edu/corpora/22lang/
