eagle-i Oregon Health & Science University

22 Language

eagle-i ID


Resource Type

  1. Database


  1. Resource Description
    "The 22 Language corpus consists of telephone speech from 22 languages: Eastern Arabic, Cantonese, Czech, Farsi, French, German, Hindi, Hungarian, Japanese, Korean, Malay, Mandarin, Italian, Polish, Portuguese, Russian, Spanish, Swedish, Swahili, Tamil, Vietnamese, and English. Unfortunately French is not available. The corpus contains fixed vocabulary utterances (e.g. days of the week) as well as fluent continuous speech. We were expecting at least 300 callers in each language. Each utterance is verified by a native speaker to determine if the caller followed instructions when answering the prompts. Some of the calls in each language are transcribed orthographically."
  2. Used by
    Center for Spoken Language Understanding
  3. Version
  4. Data Input
    Telephone speech from 22 languages
  5. Website(s)
Copyright © 2016 by the President and Fellows of Harvard College
The eagle-i Consortium is supported by NIH Grant #5U24RR029825-02.