Software for semi-supervised discriminative training of language models

eagle-i ID

http://ohsu.eagle-i.net/i/0000013a-d7e6-260e-c825-3bd680000000

Resource Type

Software

Properties

Related grant number

NSF Award Abstract #0964102
Resource Description

"This project is conducting fundamental research in statistical language modeling to improve human language technologies, including automatic speech recognition (ASR) and machine translation (MT). A language model (LM) is conventionally optimized, using text in the target language, to assign high probability to well-formed sentences. This method has a fundamental shortcoming: the optimization does not explicitly target the kinds of distinctions necessary to accomplish the task at hand, such as discriminating (for ASR) between different words that are acoustically confusable or (for MT) between different target-language words that express the multiple meanings of a polysemous source-language word. Discriminative optimization of the LM, which would overcome this shortcoming, requires large quantities of paired input-output sequences: speech and its reference transcription for ASR or source-language (e.g. Chinese) sentences and their translations into the target language (say, English) for MT. Such resources are expensive, and limit the efficacy of discriminative training methods. In a radical departure from convention, this project is investigating discriminative training using easily available, *unpaired* input and output sequences: un-transcribed speech or monolingual source-language text and unpaired target-language text. Two key ideas are being pursued: (i) unlabeled input sequences (e.g. speech or Chinese text) are processed to learn likely confusions encountered by the ASR or MT system; (ii) unpaired output sequences (English text) are leveraged to discriminate between these well-formed sentences from the (supposed) ill-formed sentences the system could potentially confuse them with. This self-supervised discriminative training, if successful, will advance machine intelligence in fundamental ways that impact many other applications."
Used by

Center for Spoken Language Understanding
Website(s)

http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0964102
http://www.ohsu.edu/xd/education/schools/school-of-medicine/departments/basic-science-departments/biomedical-engineering/center-for-spoken-language-understanding/semi-supervised-discriminative.cfm?WT_rank=1
Developed by

Roark, Brian E., Ph.D.
Shafran, Izhak, Ph.D.

Provenance Metadata About This Resource Record

workflow state

Published
contributor

nvasilevsky (Nicole Vasilevsky)
created

2012-11-06T16:47:09.541-06:00
creator

nvasilevsky (Nicole Vasilevsky)
modified

2012-11-06T16:49:56.860-06:00
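
Illustrative sketch

The two key ideas in the Resource Description (learning likely confusions from unlabeled input, then using unpaired target-language text to separate well-formed sentences from the alternatives a system might confuse them with) can be illustrated with a toy example. This is a minimal sketch and not the software described in this record. It assumes the confusions have already been learned and are given as a hand-written table, and it swaps in a simple perceptron update over unigram and bigram features as the discriminative learner. All function names and the toy data are hypothetical.

```python
from collections import Counter


def ngram_features(words):
    """Count unigram (str) and bigram (tuple) features of a sentence."""
    padded = ["<s>"] + list(words) + ["</s>"]
    feats = Counter(words)
    feats.update(zip(padded, padded[1:]))
    return feats


def confusable_variants(words, confusions):
    """Simulate system errors: swap one word at a time for a confusable one.

    `confusions` stands in for step (i); here it is an assumed, hand-written
    table rather than something learned from untranscribed speech.
    """
    variants = []
    for i, word in enumerate(words):
        for alt in confusions.get(word, ()):
            variants.append(words[:i] + [alt] + words[i + 1:])
    return variants


def score(weights, feats):
    """Linear score of a feature vector under the current weights."""
    return sum(weights.get(f, 0.0) * count for f, count in feats.items())


def train_perceptron(sentences, confusions, epochs=5):
    """Step (ii): prefer each well-formed sentence over its simulated confusions."""
    weights = {}
    for _ in range(epochs):
        for sentence in sentences:
            words = sentence.split()
            gold = ngram_features(words)
            for variant in confusable_variants(words, confusions):
                rival = ngram_features(variant)
                # Update only when the rival is not scored strictly lower.
                if score(weights, rival) >= score(weights, gold):
                    for f, count in gold.items():
                        weights[f] = weights.get(f, 0.0) + count
                    for f, count in rival.items():
                        weights[f] = weights.get(f, 0.0) - count
    return weights


# Toy usage: unpaired target-language text plus an assumed confusion table.
toy_text = ["i want to go there", "they lost their keys"]
toy_confusions = {"there": ["their"], "their": ["there"], "to": ["two", "too"]}
toy_weights = train_perceptron(toy_text, toy_confusions)
```

Only target-language text and a confusion table are needed here, with no paired input-output data. A real system would derive the confusions from the recognizer or translation model itself rather than from a fixed table.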