OpenFst Library
OpenFst is a library for constructing, combining, optimizing, and searching
weighted finite-state transducers (FSTs). Weighted finite-state transducers
are automata in which each transition has an input label, an output label,
and a weight. The more familiar finite-state acceptor is represented as a
transducer in which each transition's input and output labels are equal.
Finite-state acceptors are used to represent sets of strings (specifically,
regular or rational sets); finite-state transducers are used to represent
binary relations between pairs of strings (specifically, rational
transductions). The weights can be used to represent the cost of taking a
particular transition.
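To make the definitions above concrete, here is a minimal sketch in plain Python. It is not OpenFst's API: the names Arc, Fst, add_arc, is_acceptor, and transduce are hypothetical and chosen only for illustration. The sketch assumes there are no epsilon (empty) labels and that the weights are costs that add up along a path.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Arc:
    """One transition: input label, output label, weight, destination state."""
    ilabel: str
    olabel: str
    weight: float
    nextstate: int


@dataclass
class Fst:
    """Hypothetical container for a weighted transducer (not OpenFst's class)."""
    start: int = 0
    finals: dict = field(default_factory=dict)  # state -> final weight
    arcs: dict = field(default_factory=dict)    # state -> list of Arc

    def add_arc(self, src, ilabel, olabel, weight, dst):
        self.arcs.setdefault(src, []).append(Arc(ilabel, olabel, weight, dst))

    def is_acceptor(self):
        # An acceptor is a transducer whose arcs all have ilabel == olabel.
        return all(a.ilabel == a.olabel
                   for arc_list in self.arcs.values() for a in arc_list)


def transduce(fst, symbols):
    """Return (output labels, total cost) for every accepting path that
    reads `symbols`, cheapest first. Assumes no epsilon labels."""
    results = []
    stack = [(fst.start, 0, [], 0.0)]
    while stack:
        state, pos, out, cost = stack.pop()
        if pos == len(symbols):
            if state in fst.finals:
                results.append((out, cost + fst.finals[state]))
            continue
        for arc in fst.arcs.get(state, []):
            if arc.ilabel == symbols[pos]:
                stack.append((arc.nextstate, pos + 1,
                              out + [arc.olabel], cost + arc.weight))
    return sorted(results, key=lambda r: r[1])


# A transducer relating the input "a b" to two weighted outputs.
t = Fst(start=0, finals={2: 0.0})
t.add_arc(0, "a", "x", 0.5, 1)
t.add_arc(0, "a", "z", 1.0, 1)
t.add_arc(1, "b", "y", 1.5, 2)

# An acceptor for the single string "a b": labels are equal on every arc.
acc = Fst(start=0, finals={2: 0.0})
acc.add_arc(0, "a", "a", 0.0, 1)
acc.add_arc(1, "b", "b", 0.0, 2)

paths = transduce(t, ["a", "b"])
```

Here `paths` holds both outputs related to the input, ordered by accumulated cost, while `acc.is_acceptor()` is true because each of its arcs carries the same input and output label.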
FSTs have key applications in speech recognition and synthesis, machine
translation, optical character recognition, pattern matching, string
processing, machine learning, and information extraction and retrieval,
among others. Often a weighted transducer is used to represent a
probabilistic model (e.g., an n-gram model or a pronunciation model). FSTs
can be optimized by determinization and minimization, models can be applied
to hypothesis sets (also represented as automata) or cascaded by
finite-state composition, and the best results can be selected by
shortest-path algorithms.
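The cascade-then-select pattern described above can be sketched in plain Python. This is an illustration under stated assumptions, not OpenFst's implementation or API: the dictionary layout and the functions compose and shortest_path are hypothetical, the machines are assumed to be epsilon-free, and weights are assumed to be non-negative costs that add along a path (so Dijkstra's algorithm applies). Determinization and minimization are not shown.

```python
import heapq


def compose(a, b):
    """Epsilon-free composition: pair up arcs where a's output label
    equals b's input label, adding their weights."""
    start = (a["start"], b["start"])
    arcs, finals = {}, {}
    queue, seen = [start], {start}
    while queue:
        state = queue.pop()
        qa, qb = state
        if qa in a["finals"] and qb in b["finals"]:
            finals[state] = a["finals"][qa] + b["finals"][qb]
        for (i1, o1, w1, n1) in a["arcs"].get(qa, []):
            for (i2, o2, w2, n2) in b["arcs"].get(qb, []):
                if o1 == i2:
                    nxt = (n1, n2)
                    arcs.setdefault(state, []).append((i1, o2, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return {"start": start, "finals": finals, "arcs": arcs}


def shortest_path(fst):
    """Dijkstra over states; returns (cost, input labels, output labels)
    of the cheapest accepting path, or None if there is none."""
    sink = object()  # virtual state reached by paying a final weight
    heap = [(0.0, 0, fst["start"], (), ())]
    tie = 1          # unique tie-breaker so states are never compared
    done = set()
    while heap:
        cost, _, state, ins, outs = heapq.heappop(heap)
        if state is sink:
            return cost, list(ins), list(outs)
        if state in done:
            continue
        done.add(state)
        if state in fst["finals"]:
            heapq.heappush(
                heap, (cost + fst["finals"][state], tie, sink, ins, outs))
            tie += 1
        for (i, o, w, nxt) in fst["arcs"].get(state, []):
            if nxt not in done:
                heapq.heappush(
                    heap, (cost + w, tie, nxt, ins + (i,), outs + (o,)))
                tie += 1
    return None


# Arcs are (input label, output label, weight, next state).
# First model: two competing rewrites of "a", then "b" -> "z".
model_a = {"start": 0, "finals": {2: 0.0},
           "arcs": {0: [("a", "x", 0.5, 1), ("a", "y", 0.25, 1)],
                    1: [("b", "z", 1.0, 2)]}}
# Second model: a one-state relabeling with its own costs.
model_b = {"start": 0, "finals": {0: 0.0},
           "arcs": {0: [("x", "p", 0.25, 0), ("y", "q", 1.0, 0),
                        ("z", "r", 0.5, 0)]}}

cascade = compose(model_a, model_b)
best = shortest_path(cascade)
```

In this toy cascade the rewrite "a" to "y" is the cheaper choice in the first model alone, but the second model penalizes "y", so the best path through the composition goes through "x" instead. That interaction is the reason for composing the models before selecting the best path.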
This library was developed at Google Research (M. Riley, J. Schalkwyk,
W. Skut) and NYU's Courant Institute (C. Allauzen, M. Mohri). It is intended
to be comprehensive, flexible, and efficient, and to scale well to large
problems. It is an open source project distributed under the Apache license.