eagle-i Oregon Health & Science University

Efficient hidden structure annotation via structural multiple-sequence alignments

Resource Type

  1. Algorithmic software component


  1. Resource Description
    "This software is meant as a development in finite-state syntactic processing models for natural language that use features encoding global structural constraints derived through multiple sequence alignment (MSA) techniques, to significantly improve accuracy without expensive context-free inference. MSAs are widely used in computational biology for building finite-state models that capture long-distance dependencies in sequences (e.g., in RNA secondary structure). Given a large set of functionally aligned sequences in MSA format, finite-state models can be constructed that allow for the efficient alignment of new sequences with the given MSA. In natural language processing (NLP), only very rarely have MSA techniques been used, and then to characterize phonetic or semantic similarity. This software explores the definition of a purely syntactic functional alignment between semantically unrelated strings from the same language, to define a structural MSA for constructing finite-state syntactic models. The software had two specific aims. The first aim was to develop natural language sequence processing algorithms and models that could: a) define sequence alignments with respect to syntactic function; b) build structural MSAs based on defined functional alignments; c) derive finite-state models to efficiently align new sequences with the built MSA; and d) extract features from an alignment with the MSA for improved sequence modeling. The second aim was to empirically validate this approach within a number of large-scale text processing applications in multiple domains and languages. These algorithms are expected to provide improved finite-state natural language models that will contribute to the state-of-the-art in critical text processing applications." This software processes natural language.
  2. Used by
    Center for Spoken Language Understanding
  3. Software purpose
    Natural language processing objective
  4. Website(s)
  5. Developed by
    Roark, Brian E., Ph.D.
  6. Software license
    Open source software license
Copyright © 2016 by the President and Fellows of Harvard College
The eagle-i Consortium is supported by NIH Grant #5U24RR029825-02 / Copyright 2016