
META TOPICPARENT 
name="WebHome" 
Efficiency 

 Fst Type: The specific FstType can affect both time and space in an application.
 Mutable:
VectorFst is recommended for general mutations, while EditFst can be used for small changes to an existing FST.


< < 

 Immutable:
ConstFst uses similar space to VectorFst but is much faster to read and write and possibly to access since it has less fragmentation. The CompactFst types permit smaller memory and file representations of acceptors, of unweighted acceptors and transducers and weighted and unweighted strings. These types admit to optional memorymapping (coming soon).
 Specialized: For some applications, specialized FST types can be created that are tailored to the structure of the automaton. One relatively simple way to do this is to extend the
CompactFst types by defining a new compactor. A more general, but complex, way is to define a entirely new FST type. For example, NGramFst (coming soon) can be used to very compactly represent ngram language models. This type admits to memorymapping.

> > 

 Immutable:
ConstFst uses similar space to VectorFst but is much faster to read and write and possibly to access since it has less fragmentation. The CompactFst types permit smaller memory and file representations of acceptors, of unweighted acceptors and transducers and weighted and unweighted strings. These types admit optional memorymapping (coming soon).
 Specialized: For some applications, specialized FST types can be created that are tailored to the structure of the automaton. One relatively simple way to do this is to extend the
CompactFst types by defining a new compactor. A more general, but complex, way is to define an entirely new FST type. For example, NGramFst (coming soon) can be used to very compactly represent ngram language models. This type admits memorymapping.


 Arc Type: A particularly simple way to improve the representation size is to use an Arc that has
StateId , Label , and Weight types appropriately sized to a problem.


< < 
 Interface Type: When creating state or arc iterators, the specific interface type can greatly affect efficiency. Using the most specific Fst type as the template argument of the iterators will access specializations that avoid virtual function and other overhead of more generic calls. E.g. prefer
ArcIterator<VectorFst<Arc>>(fst, s) to ArcIterator<Fst<Arc>>(fst, s) , ArcIterator<ExpandedFst<Arc>>(fst, s) , or ArcIterator<MutableFst<Arc>>(fst, s) if performance is critical.

> > 
 Interface Type: When creating state or arc iterators, the specific interface type can greatly affect efficiency. Using the most specific Fst type as the template argument of the iterators will access specializations that avoid virtual function and other overhead of more generic calls. E.g., prefer
ArcIterator<VectorFst<Arc>>(fst, s) to ArcIterator<Fst<Arc>>(fst, s) , ArcIterator<ExpandedFst<Arc>>(fst, s) , or ArcIterator<MutableFst<Arc>>(fst, s) if performance is critical.


Efficient Algorithms 
 General Issues
 Association Order: Although an operation may be associative, efficiency may greatly vary depending on the order of application. For example, prefer
(A ∪ B) ∪ C to A ∪ (B ∪ C) if A and B are small and C is large. Similar considerations apply to concatenation and intersection/composition.


< < 
 Optimization: What optimizations should be applied is very problemdependent. Epsilon removal, determinization and minimization of an unweighted automaton will give in the smallest, irredundant result. However this could be exponentially larger than the (nondeterministic) input, so applying some or or none of these operations may be preferred. In the weighted and/or transducer case, the input may not even be determinizable although they can be determinized and minimized as an appropriately encoded automaton.
 Eager vs Delayed Computation: If only a small portion of the result is visited (e.g. of a composition), then prefer delayed computation (e.g. invoke
ComposeFst ). On the other hand, if much of the result is visited, then prefer the eager version (e.g., invoke Compose ) due to lower overhead.
 Caching: Various operations such as the delayed FST operations and the CompactFst classes may use caching to avoid recomputation on repeated accesses. The size of the cache and whether (LRU) garbage collection is performed is controlled by options. In some cases, caching can also be disabled per state by arc iterator flags.
 Properties: An FST maintains stored property bits such as whether it is an acceptor, acyclic, or inputlabel sorted. Many algorithms check for various of these properties, which will be computed in linear time if they are not already stored. This linear pass can thus be avoided if the needed bits are already set (e.g. by the user).

> > 
 Optimization: What optimizations should be applied is very problemdependent. Epsilon removal, determinization and minimization of an unweighted automaton will give in the smallest, irredundant result. However this could be exponentially larger than the (nondeterministic) input, so applying some or none of these operations may be preferred. In the weighted and/or transducer case, the input may not even be determinizable although it can be determinized and minimized as an appropriately encoded automaton.
 Eager vs Delayed Computation: If only a small portion of the result is visited (e.g., of a composition), then prefer delayed computation (e.g., invoke
ComposeFst ). On the other hand, if much of the result is visited, then prefer the eager version (e.g., invoke Compose ) due to lower overhead.
 Caching: Various operations such as the delayed FST operations and the CompactFst classes may use caching to avoid recomputation on repeated access. The size of the cache and whether (LRU) garbage collection is performed is controlled by options. In some cases, caching can also be disabled per state by arc iterator flags.
 Properties: An FST maintains stored property bits such as whether it is an acceptor, acyclic, or inputlabel sorted. Many algorithms check for various of these properties, which will be computed in linear time if they are not already stored. This linear pass can thus be avoided if the needed bits are already set (e.g., by the user).


AlgorithmSpecific Issues 
 

< < 

> > 
 Shortest Path/Distance: The shortest path/distance algorithms use a queue discipline to determine the order of expansion of states:
 LifoQueue: when the input is unweighted
 StateOrderQueue: when the input is topologicallyordered
 TopOrderQueue: when the input is acyclic, but not topologicallyordered
 ShortestFirstQueue: when the input is cyclic, but has no negative weights
 FifoQueue: otherwise
 The queue is selected based on the property bits. This choice and possible testing can be overridden by passing the queue as an option. With the
ShortestFirstQueue , the shortest path algorithm can be passed the first_path=true option to stop when the first solution is discovered (correct if the weights are between 0 and 1). This is a case where delayed computation of the FST being searched may be preferred.


