TWiki
>
FST Web
>
FstAdvancedUsage
(revision 67) (raw view)
Edit
Attach
%TOC% ---+ <nop>OpenFst Advanced Usage Below are a variety of topics covered in greater depth or of more specialized interest than found in the Quick Tour. Reading the [[FstQuickTour][Quick Tour]] first is recommended. #ArcIterators ---++ Arc Iterators An arc iterator %DOX{fst::ArcIterator[%H%]}% is used to access the transitions leaving an FST state. It has the form: <pre> template <class F> class ArcIterator { typedef typename F::Arc Arc; typedef typename Arc::StateId StateId; public: ArcIterator(const &F fst, StateId s); %RED% // End of iterator? %ENDCOLOR% bool Done() const; %RED% // Current arc (when !Done) %ENDCOLOR% const Arc& Value() const; %RED% // Advances to next arc (when !Done) %ENDCOLOR% void Next(); %RED% // Returns current position %ENDCOLOR% size_t Position(); %RED% // Returns to initial position %ENDCOLOR% void Reset(); %RED% // Arc access by position %ENDCOLOR% void Seek(size_t pos); %RED% // Returns arc flags %ENDCOLOR% uint32 Flags() const; %RED% // Sets arc flags %ENDCOLOR% void SetFlags(uint32 flags, uint32 mask); }; </pre> It is templated on the Fst class =F= to allow efficient specializations but defaults to a generic version on the abstract base [[FstAdvancedUsage#BaseFsts][Fst]] class. See [[FstConventions][here]] for conventions that arc iterator use must respect. All current <nop>OpenFst library =Seek()= methods are constant time. An example use of an arc iterator is shown [[FstQuickTour#ArcIterator][here]]. A =MutableArcIterator= %DOX{fst::MutableArcIterator[%H%]}% is similar to an =ArcIterator= except its constructor takes a pointer to a =MutableFst= and it additionally has a =SetValue()= method. #ArcFilters ---++ Arc Filters Arc filters are accepted by various operations to control which arcs are transitioned. An arc filter has the form: <pre> template <class Arc> class SomeArcFilter { public: %RED% // Return true iff arc is to be transitioned. %ENDCOLOR% bool operator()(const Arc &arc) const; }; </pre> Pre-defined arc filters include: | *Name* | *Description* | ** | | =AnyArcFilter= | Accept all arcs | %DOX{fst::AnyArcFilter[%H%]}% | | =EpsilonArcFilter= | Accept only arcs with input and output epsilons |%DOX{fst::AnyArcFilter[%H%]}% | | =InputEpsilonArcFilter= | Accept only arcs with input epsilons | %DOX{fst::InputEpsilonArcFilter[%H%]}% | | =OutputEpsilonArcFilter= | Accept only arcs with output epsilons | %DOX{fst::InputEpsilonArcFilter[%H%]}% | #ArcMappers ---++ Arc Mappers _Arc mappers_ are function objects used by the [[ArcMapDoc][ArcMap]] operation to transform arcs and/or final states. An arc mapper has the form: <pre>%RED%// This determines how final weights are mapped. %ENDCOLOR% enum MapFinalAction { %RED% // A final weight is mapped into a final weight. An error // is raised if this is not possible. %ENDCOLOR% MAP_NO_SUPERFINAL, %RED% // A final weight is mapped to an arc to the superfinal state // when the result cannot be represented as a final weight. // The superfinal state will be added only if it is needed. %ENDCOLOR% MAP_ALLOW_SUPERFINAL, %RED% // A final weight is mapped to an arc to the superfinal state // unless the result can be represented as a final weight of weight // Zero(). The superfinal state is always added (if the input is // not the empty Fst). %ENDCOLOR% MAP_REQUIRE_SUPERFINAL }; %RED% // This determines how symbol tables are mapped. %ENDCOLOR% enum MapSymbolsAction { %RED% // Symbols should be cleared in the result by the map. %ENDCOLOR% MAP_CLEAR_SYMBOLS, %RED% // Symbols should be copied from the input FST by the map. %ENDCOLOR% MAP_COPY_SYMBOLS, %RED% // Symbols should not be modified in the result by the map itself. // (They may set by the mapper). %ENDCOLOR% MAP_NOOP_SYMBOLS }; class SomeArcMapper { public: %RED% // Assumes input arc type is A and result arc type is B %ENDCOLOR% typedef A FromArc; typedef B ToArc; %RED% // Maps an arc type A to arc type B. %ENDCOLOR% B operator()(const A &arc); %RED% // Specifies final action the mapper requires (see above). // The mapper will be passed final weights as arcs of the // form A(0, 0, weight, kNoStateId). %ENDCOLOR% MapFinalAction FinalAction() const; %RED% // Specifies input symbol table action the mapper requires (see above). %ENDCOLOR% MapSymbolsAction InputSymbolsAction() const; %RED% // Specifies output symbol table action the mapper requires (see above). %ENDCOLOR% MapSymbolsAction OutputSymbolsAction() const; %RED% // This specifies the known properties of an Fst mapped by this // mapper. It takes as argument the input Fst's known properties %ENDCOLOR% uint64 Properties(uint64 props) const; };</pre> The following arc mappers are defined in the <nop>OpenFst library: | *Name* | *Description* | ** | | =FromGallicMapper= | Extracts output label from [[FstAdvancedUsage#FstWeights][gallic]] weight | %DOX{fst::FromGallicMapper[%H%]}% | | =IdentityArcMapper= | Maps to self | %DOX{fst::IdentityArcMapper[%H%]}% | | =InvertWeightMapper= | Reciprocate all non-<span style="text-decoration:overline">0</span> weights | %DOX{fst::InvertWeightMapper[%H%]}% | | =PlusMapper= | Adds (⊕) a constant to all weights | %DOX{fst::PlusMapper[%H%]}% | | =QuantizeMapper= | Quantize all weights | %DOX{fst::QuantizeMapper[%H%]}% | | =ReverseWeightMapper= | [[FstWeightRequirements][Reverse]] all weights | %DOX{fst::ReverseWeightMapper[%H%]}% | | =RmWeightMapper= | Map all non-<span style="text-decoration:overline">0</span> weights to <span style="text-decoration:overline">1</span> | %DOX{fst::RmWeightMapper[%H%]}% | | =SuperFinalMapper= | Redirects final states to new superfinal state | %DOX{fst::SuperFinalMapper[%H%]}% | | =TimesMapper= | (Right) multiplies (⊗) a constant to all weights | %DOX{fst::TimesMapper[%H%]}% | | =ToGallicMapper= | Combines output label and weight into [[FstAdvancedUsage#FstWeights][gallic]] weight | %DOX{fst::ToGallicMapper[%H%]}% | | =WeightConvertMapper= | Converts arc weight types (assuming appropriate =WeightConvert= class specialization), leaving labels and nextstates the same. | %DOX{fst::WeightConvertMapper[%H%]}% | =ToGallicMapper= and =FromGallicMapper= are used, for example, to implement transducer [[DeterminizeDoc][determinization]] and [[MinimizeDoc][minimization]] using weighted acceptor versions of these algorithms. Other specialized arc mappers are used to implement [[EncodeDecodeDoc][Decode]], [[EncodeDecodeDoc][Encode]], [[InvertDoc][Invert]], and [[ProjectDoc][Project]]. #FstArcs ---++Arcs An =Arc= is a type that represents an FST transition from a given source state. It specifies an input label, an output label, a weight, and a destination state ID and it has a type name. In particular, it has the following form: <pre> struct SomeArc { typedef W Weight; typedef L Label; typedef S StateId; static const string &Type(); Label ilabel; Label olabel; Weight weight; StateId nextstate; }; </pre> where =W= is a valid [[FstWeightRequirements][weight type]], and =L= and =S= are signed integral types. The following arc types are defined in the <nop>OpenFst library: | *Name* | *Label Type* | *State ID Type* | *Weight Type* | *Registered* | | =ExpectationArc<A, W>= | =int= | =int= | =ExpectationWeight<A::Weight, W>= | | =GallicArc<A, S>= | =A::Label= | =A::StateId= | =GallicWeight<A::Label, A::Weight, S>= | | | =LexicographicArc<W1, W2>= | =int= | =int= | =LexicographicWeight<W1, W2>= | | | =LogArc= | =int= | =int= | =LogWeight= | %Y% | | =Log64Arc= | =int= | =int= | =Log64Weight= | %Y% | | =MinMaxArc= | =int= | =int= | =MinMaxWeight= | | | =PowerArc<A, n>= | =int= | =int= | =PowerWeight<A::Weight, n>= | | =ProductArc<W1, W2>= | =int= | =int= | =ProductWeight<W1, W2>= | | | =SignedLogArc= | =int= | =int= | =SignedLogWeight= | %Y% | | =SignedLog64Arc= | =int= | =int= | =SignedLog64Weight= | %Y% | | =SparsePowerArc<A>= | =int= | =int= | =SparsePowerWeight<A::Weight>= | | =StdArc= | =int= | =int= | =TropicalWeight= | %Y% | | =StringArc<S>= | =int= | =int= | =StringWeight<int, S>= | | Additional arc information: * [[FstAdvancedUsage#FstWeights][Corresponding weight types]] * [[FstQuickTour#FstArcs][Elementary arc information]] * [[FstAdvancedUsage#FstInputOutput][Fst I/O:]] How to register arc types for I/O. * [[FstAdvancedUsage#NewArcs][User-defined arcs]] #BaseFsts ---++Base Fsts Every =Fst= %DOX{fst::Fst[%H%]}% must specify an initial state, the final weights, arc and epsilon counts per states, an Fst type name, the Fst's [[FstAdvancedUsage#Properties][properties]], how to copy, read and write the Fst, and the input and output symbol tables (if any). In particular, the base =Fst= class has the interface: <pre> template <class A> class Fst { public: typedef A Arc; typedef typename A::Weight Weight; typedef typename A::StateId StateId; %RED% // Initial state %ENDCOLOR% virtual StateId Start() const = 0; %RED% // States's final weight %ENDCOLOR% virtual Final(StateId) const = 0: %RED% // State's arc count %ENDCOLOR% virtual NumArcs(StateId) const = 0; %RED% // States's input epsilon count %ENDCOLOR% virtual NumInputEpsilons(StateId) const = 0; %RED% // State's output epsilon count %ENDCOLOR% virtual NumOutputEpsilons(StateId) const = 0; %RED% // If test=false, return stored properties bits for mask (some poss. unknown) // If test=true, return property bits for mask (computing o.w. unknown) %ENDCOLOR% virtual Properties(uint64 mask, bool test) const = 0; %RED% // Fst type name %ENDCOLOR% virtual const string& Type() const = 0; %RED% // Get a copy of this Fst %ENDCOLOR% virtual Fst<A> *Copy() const = 0; %RED% // Read an Fst from an input stream; returns NULL on error %ENDCOLOR% static Fst<A> *Read(istream &strm, const FstReadOptions &opts); %RED% // Read an Fst from a file; return NULL on error // Empty filename reads from standard input %ENDCOLOR% static Fst<A> *Read(const string &filename); %RED% // Write an Fst to an output stream; return false on error %ENDCOLOR% virtual bool Write(ostream &strm, const FstWriteOptions &opts); %RED% // Write an Fst to a file; return false on error // Empty filename writes to standard output %ENDCOLOR% virtual bool Write(const string &filename); %RED% // Return input label symbol table; return NULL if not specified %ENDCOLOR% virtual const SymbolTable* InputSymbols() const = 0; %RED% // Return output label symbol table; return NULL if not specified %ENDCOLOR% virtual const SymbolTable* OutputSymbols() const = 0; }; </pre> =Fst= is an abstract class (note the pure virtual methods). All <nop>OpenFst FSTs must meet this interface. The companion [[FstAdvancedUsage#StateIterators][state iterator]] and [[FstAdvancedUsage#ArcIterators][arc iterator]] classes provide access to the states and transitions of the FST. #FstCaching ---++ Caching Most of the [[FstQuickTour#DelayedFsts][delayed Fst classes]] use internal caching to save expanded states and arcs. This caching is controlled by this struct: <pre>struct CacheOptions { %RED% // enable GC %ENDCOLOR% bool gc; %RED% // # of bytes allowed before GC %ENDCOLOR% size_t gc_limit; CacheOptions(bool g, size_t l) : gc(g), gc_limit(l) {} CacheOptions() : gc(FLAGS_fst_default_cache_gc), gc_limit(FLAGS_fst_default_cache_gc_limit) {} };</pre> All <nop>OpenFst cached Fsts have constructors that accept this (or a class derived from it) as an argument. The member defaults are controlled by [[FstAdvancedUsage#CommandLineOptions][global flags]]. These options can be used for: * _Maximal caching_: If =gc= is =false=, then any expanded state will be cached for the extent of the FST. This case is useful when states are revisited and memory is not a concern. * _Bounded caching_: If =gc= is =true=, then the cache will be garbage-collected when it grows past =gc_limit=. This case is useful when states are revisited and memory is a concern. This is the default case (based on the global flags). * _Minimal caching_: It is generally not possible to avoid all caching in such an FST since the cache is used to implement the arc iterators efficiently (creating an iterator computes and writes the state's arcs to the cache, iterating reads from the cache). However, if (1) =gc= is =true=, (2) =gc_limit= is 0, and (3) arcs iterators have been created (and then destroyed) _only one state at a time_, then only information for that state is cached and this case is especially optimized. This case is useful when states are not revisited (e.g. when a cached FST is simply being copied to a mutable FST). #CommandLineOptions ---++ Command Line Flags <NOP>OpenFst has several global options in the library proper that most users can ignore, leaving them with their default values: | *Option* | *Type* | *Default* | *Description* | | =FLAGS_fst_compat_symbols= | bool | true | Require symbol tables to match when appropriate | | =FLAGS_fst_default_cache_gc= | bool | true | Enable garbage collection of cached Fsts | | =FLAGS_fst_default_cache_gc_limit= | int64 | 1048576 | Byte size that triggers garbage collection of cached Fsts | | =FLAGS_fst_error_fatal= | bool | true | FST errors are fatal; o.w. return objects flagged as bad: e.g., FSTs - =kError= prop. true, FST weights - not a =Member()= | | =FLAGS_fst_field_separator= | string | " \t" | Set of characters used as a separator between printed fields | | =FLAGS_fst_weight_parentheses= | string | "" | Characters enclosing the first weight of a printed composite weight (and derived classes) to ensure proper I/O of nested composite weights; must have size 0 (none) or 2 (open and close parenthesis) | | =FLAGS_fst_weight_separator= | string | "," | Character separator between printed composite weights; must be a single character | | =FLAGS_fst_verify_properties= | bool | false | Verify Fst properties are correctly set when queried | The first ensures the arguments of binary FST operations (e.g. [[ComposeDoc][composition]]) have compatible symbol tables (e..g output symbol table matches input symbol table for composition). The second two are used to control the [[FstAdvancedUsage#FstCaching][caching]] of expanded state and arc information found in most [[FstQuickTour][delayed Fst classes]]; the default values should normally be satisfactory. The next determines how [[FstAdvancedUsage#ErrorHandling][errors are handled]]. The next is used in the textual representation of FSTs and symbol tables. The next two are used to control the text formating of [[FstAdvancedUsage#FstWeights][ProductWeight]] and other weight tuples. The last is used to ensure that the [[FstAdvancedUsage#FstProperties][properties]] of an FST have been correctly set; it is used for debugging only since it incurs considerable computational cost. In each of the Fst distribution installed binaries, the above options, as well as any of those defined specific to the binary, can be set from the command line using e.g. =--fst_default_cache_gc=false= or =--fst_weight_parenthesis="("= . Additionally, the option =--help= and =--v=N= (where N = 0,1,2,..) will print out usage information and set the verbosity level of logging, respectively. The flag processing is modeled after the Google [[http://code.google.com/p/google-gflags][gflags]] package. In a user-defined binary, the command line options processing will all also work if the user calls: <verbatim> SetFlags(usage, &argc, &argv, true); </verbatim> In that case, the user can set his own flags as well, following the conventions in [[http://cims.nyu.edu/~openfst/doxygen/html/flags_8h-source.html][<fst/flags.h>]]. Alternatively, the user can process options in his own way and directly assign to any of the above global options if he wishes to modify their defaults. #ComposeFilters ---++ Composition Filters A _composition filter_ determines which matches are allowed to proceed in [[ComposeDoc][composition]]. The basic filters handle correct epsilon matching. In particular, they ensure that redundant epsilon paths, which would be incorrect with [[FstWeightRequirements][non-idempotent]] weights, are not created. More generally, composition filters can be used to block or modify composition paths for efficiency or other purposes usually working in tandem with specialized [[FstAdvancedUsage#LookAheadMatchers][matchers]]. Their interface is: <pre>template <class M1, M2> class SomeComposeFilter { public: typedef typename M1::FST1 FST1; typedef typename M1::FST2 FST2; typedef typename FST1::Arc Arc; typedef ... FilterState; typedef ... Matcher1; typedef ... Matcher2; typedef ... FilterState; typedef typename Arc::StateId StateId; typedef typename Arc::Weight Weight; %RED% // Required constructor. The filter is either passed composition matchers or constructs // them internally. This is done so the filter can possibly modify the result (useful e.g. with lookahead). %ENDCOLOR% SomeComposeFilter(const FST1 &fst1, const FST2 &fst2, M1 *matcher1 = 0, M2 *matcher2); %RED% // Return start state of filter. %ENDCOLOR% FilterState Start() const; %RED% // Specifies current composition state. %ENDCOLOR% void SetState(StateId s1, StateId s2, const FilterState &f); Matcher2 *GetMatcher2(); %RED% // Apply filter at current composition state to these transitions. // If an arc label to be matched is kNolabel, then that side does not consume a symbol. // Returns the new filter state or, if disallowed, FilterState::NoState(). // The filter is permitted to modify its inputs, e.g. for optimizations. %ENDCOLOR% FilterState FilterArc(A *arc1, A *arc2) const; %RED% // Apply filter at current composition state to these final weights // (cf. superfinal transitions). The filter may modify its inputs, // e.g. for optimizations. %ENDCOLOR% void FilterFinal(Weight *final1, Weight *final2) const; %RED% // Return resp matchers. Ownership stays with the filter. These // methods allow the filter to access and possibly modify // the composition matchers (useful e.g. with lookahead). %ENDCOLOR% Matcher1 *GetMatcher1(); }; </pre> The filter's state is represented by the type =SomeComposeFilter::FilterState= and is stored in the composition [[FstAdvancedUsage#StateTables][state table]] tuple. It has the form: <pre>class SomeFilterState { public: %RED% // Required constructors %ENDCOLOR% SomeFilterState(); SomeFilterState(const SomeFilterState &f); %RED% // An invalid filter state. %ENDCOLOR% static const SomeFilterState NoState(); %RED% // Maps state to integer for hashing. %ENDCOLOR% size_t Hash() const; %RED% // Equality of filter states. %ENDCOLOR% bool operator==(const SomeFilterState &f) const; %RED% // Inequality of filter states. %ENDCOLOR% bool operator!=(const SomeFilterState &f) const; %RED% // Assignment to filter states. %ENDCOLOR% SomeFilterState& operator=(const SomeFilterState& f); }; </pre> The following composition filters are defined in the <nop>OpenFst library: | *Name* | *Description* | ** | | =SequenceComposeFilter= | Requires epsilons on FST1 to be read before epsilons on FST2 | %DOX{fst::SequenceComposeFilter[%H%]}% | | =AltSequenceComposeFilter= | Requires epsilons on FST2 to be read before epsilons on FST1 |%DOX{fst::AltSequenceComposeFilter[%H%]}% | | =MatchComposeFilter= | Requires epsilons on FST1 to be matched with epsilons on FST2 whenever possible |%DOX{fst::MatchComposeFilter[%H%]}% | | =LookAheadComposeFilter= | Used with a lookahead matcher to block non-coaccessible paths | %DOX{fst::LookAheadComposeFilter[%H%]}% | | =PushWeightsComposeFilter= | Adds weight-pushing to a lookahead composition filter | %DOX{fst::PushWeightsComposeFilter[%H%]}% | | =PushLabelsComposeFilter= | Adds label-pushing to a lookahead composition filter | %DOX{fst::PushLabelsComposeFilter[%H%]}% | =SequenceComposeFilter= is the default composition filter. It can be [[FstAdvancedUsage#OperationOptions][changed]] by using the version of =ComposeFst= that accepts =ComposeFstOptions=. %DOX{fst::ComposeFstOptions[%H%]}% See [[FstAdvancedUsage#LookAheadMatchers][lookahead matchers]] for more information about composition with lookahead. #ErrorHandling ---++ Error Handling If =FLAGS_fst_error_fatal= is true (the default), then most serious errors cause program exit. The exception are most functions and methods that return NULL, a boolean, or =NoWeight()= on error - typically I/O and weight operations. If =FLAGS_fst_error_fatal= is false, then no operation is fatal. In that case, operations that return FSTs set the =kError= property bit. Otherwise classes have an Error() method that should be checked and functions return a boolean. It is intended that a sequence of operations preserve the =kError= property bit and the =NoWeight()=, so that it should suffice to check for error at the end of the sequence. #ExpandedFsts ---++ Expanded Fsts An =ExpandedFst= %DOX{fst::ExpandedFst[%H%]}% is an [[FstAdvancedUsage#BaseFsts][Fst]] that has an additional method that specifies the state count as well as methods to copy and read the expanded FST. In particular, an =ExpandedFst= class has the interface: <pre> template <class A> class ExpandedFst : public Fst<class A> { public: typedef A Arc; typedef typename A::StateId StateId; %RED% // State count %ENDCOLOR% StateId NumStates(); %RED% // Get a copy of this ExpandedFst %ENDCOLOR% virtual ExpandedFst<A> *Copy() const = 0; %RED% // Read an ExpandedFst from an input stream; returns NULL on error %ENDCOLOR% static ExpandedFst<A> *Read(istream &strm, const FstReadOptions &opts); %RED% // Read an ExpandedFst from a file; return NULL on error // Empty filename reads from standard input %ENDCOLOR% static ExpandedFst<A> *Read(const string &filename); }; </pre> =ExpandedFst= is an abstract class (note the pure virtual methods). Examples are =VectorFst= %DOX{fst::VectorFst[%H%]}% and =ConstFst= %DOX{fst::ConstFst[%H%]}% #FstInputOutput ---++ FST Input/Output The following describes methods for reading and writing binary file representations of FSTS. Note these binary file representations are [[http://en.wikipedia.org/wiki/Endianness][machine architecture dependent]]; use the [[FstQuickTour#CreatingShellFsts][textual]] file format cross-platform independence. The code: <verbatim> VectorFst<Arc> ifst; ... ifst.Write("a.fst"); VectorFst<Arc> *ofst = VectorFst<Arc>::Read("a.fst"); </verbatim> writes and reads a defined Fst type (=VectorFst=) and arc type (=Arc=) to and from a file in a straight-forward way. ---+++ Library Registration The call: <verbatim> Fst<Arc> *fst = Fst<Arc>::Read("a.fst"); </verbatim> reads the same =VectorFst= from the file as above, but returns a base =Fst=. This form, useful for code that works generically for different Fst types, can not work unless the Fst and arc type are appropriately _registered_. Some arc types (see [[FstAdvancedUsage#FstArcs][here]]) are already registered for common Fst types defined in the <nop>OpenFst library. Other arc type =Arc= and Fst type =F= pairs can be registered with the following call: <verbatim> REGISTER_FST(F, Arc); </verbatim> To avoid code bloat in a given program, registering arc types, in particular, should be used sparingly. ---+++ Script Registration In the above examples, the user provided the arc type as a template parameter. However, the call: <verbatim> $ fstdeterminize in.fst >out.fst </verbatim> works e.g. for both =StdArc= and =LogArc= arcs. This is accomplished by calling in =main(argc, argv)=: <verbatim> namespace script { FstClass *ifst = FstClass::Read(in_name); VectorFstClass ofst(ifst->ArcType()); Determinize(*ifst, &ofst); ofst.Write(out_name); } </verbatim> where: <verbatim> class VectorFstClass; void Determinize(const FstClass &ifst, MutableFstClass *ofst); </verbatim> are a class and function in the [[FstAdvancedUsage#FstScript][fst:script]] namespace that do not depend on the =Arc= template parameter. These forms, useful for code that works generically for different Arc types, can not work unless the arc type is appropriately _registered_. Some arc types (see [[FstAdvancedUsage#FstArcs][here]]) are already registered. Other arc types =Arc= can be registered with the following calls: <verbatim> REGISTER_FST_CLASS(VectorFstClass, Arc); REGISTER_FST_OPERATION(Determinize, Arc, DeterminizeArgs); </verbatim> If =Arc= defines a new weight type, it can be registered at the script level (enabling [[FstAdvancedUsage#WeightClass][WeightClass]] support) with the call: <verbatim> REGISTER_FST_WEIGHT(Arc::Weight); </verbatim> To avoid code bloat in a given program, registering arc types should be used sparingly. ---+++ FST Dynamic Shared Objects The examples above show how users can modify programs to be able to read new arc and Fst types. However, it would not be ideal to have to do so for all the distribution binaries or other existing programs. Instead, this can be done more easily with _dynamic shared objects (DSOs)_. To add a new Fst type, =MyFst= with =MyFst::Type()= = ="my_fst"=, use the code: <pre> // Register some arc types with this Fst type %ENDCOLOR% REGISTER_FST(MyFst, StdArc); REGISTER_FST(MyFst, LogArc); </pre> compiled into a dynamic shared object =my_fst.so=. If =my_fst.so= can be found in the =LD_LIBRARY_PATH= (or equivalent), you should be able to read the new Fst type with existing programs. To add a new arc type, =MyArc= with =MyArc::Type()= = ="my_arc"=, use the code: <pre> %RED% // Register some Fst types with this arc type %ENDCOLOR% REGISTER_FST(VectorFst, MyArc); REGISTER_FST(ConstFst, MyArc); %RED% // Register the fst::script operations with this arc type %ENDCOLOR% REGISTER_FST_OPERATIONS(MyArc); %RED% // Register some other operation with this arc type %ENDCOLOR% REGISTER_FST_OPERATION(Operations, MyArc, Args); </pre> compiled into a dynamic shared object =my_arc.so=. If can be found in =LD_LIBRARY_PATH= (or equivalent), you should be able to read the new arc type with existing programs. #FstScript ---++ fst::script The <nop>OpenFst library offers an additional abstraction layer that facilitates scripting with FSTs of different arc types. It allows users to load FSTs and perform various operations on them without specifying the underlying FST arc types. This higher-level layer, found in the =fst::script= namespace, is somewhat less efficient, due to indirection, and in general exposes fewer methods and options. It is used principally to implement the shell-level FST commands and [[PythonExtensions][Python bindings]]. From C++, users are encouraged in most circumstances to use the lower-level templated classes and operations for their efficiency and completeness. However, for simple 'glue' code that will work seamlessly with different arc types (esp. I/O), this higher-level might be appropriate. To use the scripting layer, include =<fst/fstscript.h>= in the installation include directory and link to =libfstscript.{a,so}= in the installation library directory. The [[FstExtensions][extension]] libraries work similarly; e.g. for the FAR extension, include =<fst/extension/farscript.h>= and link to =lib/libfstfarscript.{a,so}=. In the following, all classes and methods mentioned are in the namespace <tt>fst::script</tt>. ---+++ Class Overview At FST script level, the =Fst<Arc>= class [[FstQuickTour#OperationCallingCpp][template hierarchy]] is partially mirrored with a template-free =FstClass= hierarchy. For example, these classes are defined: ---++++ <tt>FstClass</tt> <pre> class FstClass { public: %RED%// Construct an FstClass from a templated Fst, hiding its arc type. %ENDCOLOR% template<class Arc> explicit FstClass(const Fst<Arc> &fst); %RED%// Copy constructor %ENDCOLOR% explicit FstClass(const FstClass &other); %RED%// Read an arc-templated Fst from disk, and return as an FstClass %ENDCOLOR% static FstClass *Read(const string &fname); %RED%// String representation of the arc type %ENDCOLOR% virtual const string &ArcType() const; %RED%// String representation of the underlying Fst type (e.g. 'vector') %ENDCOLOR% virtual const string &FstType() const; %RED%// String representation of the arc's weight type %ENDCOLOR% virtual const string &WeightType() const; %RED%// A pointer to this Fst's input symbol table %ENDCOLOR% virtual const SymbolTable *InputSymbols() const; %RED%// A pointer to this Fst's output symbol table %ENDCOLOR% virtual const SymbolTable *OutputSymbols() const; %RED%// Write the underlying arc-templated Fst to disk %ENDCOLOR% virtual void Write(const string &fname); %RED%// Return an integer representing all the properties (see Fst::Properties) %ENDCOLOR% virtual uint64 Properties(uint64 mask, bool test) const; %RED%// Call to get the underlying FST, if you know the concrete arc type // e.g. for an FstClass fc, // const Fst<StdArc> &f = *(fc.GetFst<StdArc>()); // Returns NULL if the given arc type doesn't match the underlying FST. %ENDCOLOR% template<class Arc> const Fst<Arc> *GetFst() const; virtual ~FstClass(); } </pre> Unlike its lower-level analogue <tt>Fst<Arc></tt>, <tt>FstClass</tt> is not abstract; it is a container which can be constructed from an arbitrary <tt>Fst<Arc></tt>. ---++++ <tt>MutableFstClass</tt> <pre> class MutableFstClass : public FstClass { public: %RED%// Construct a MutableFstClass from some kind of MutableFst<> %ENDCOLOR% template<class Arc> explicit MutableFstClass(const MutableFst<Arc> &fst); %RED%// If your code knows the arc type of the underlying MutableFst<>, it // can use this method to extract a pointer to it. This pointer can be used // to change the underlying MutableFst<> %ENDCOLOR% template<class Arc> MutableFst<Arc> *GetMutableFst(); %RED%// Set the input symbol table of the underlying MutableFst %ENDCOLOR% virtual void SetInputSymbols(SymbolTable *is); %RED%// Set the output symbol table of the underlying MutableFst %ENDCOLOR% virtual void SetOutputSymbols(SymbolTable *os); }; </pre> ---++++ <tt>VectorFstClass</tt> <noautolink> <pre class="prettyprint"> class VectorFstClass : public MutableFstClass { public: %RED%// Construct a copy of "other" as a VectorFstClass %ENDCOLOR% explicit VectorFstClass(const FstClass &other); %RED%// Construct a blank VectorFstClass with the given arc type %ENDCOLOR% explicit VectorFstClass(const string &arc_type); %RED%// Wrap the given VectorFst<Arc> %ENDCOLOR% template<class Arc> explicit VectorFstClass(VectorFst<Arc> *fst); }; </pre> ---++++ <tt>WeightClass</tt> This class hides the weight type similar to how the classes above hide the arc type of an FST. It is useful for weight I/O and for passing weights into the operations that require them. <pre> class WeightClass { public: %RED%// Construct a Zero %ENDCOLOR% WeightClass(); %RED%// Wrap a weight of the given type %ENDCOLOR% template<class W> explicit WeightClass(const W &weight); %RED%// Construct a weight given the string representation of its type // (e.g. "tropical") and a string representation of the weight // itself. %ENDCOLOR% WeightClass(const string &weight_type, const string &weight_str); %RED%// Copy constructor and assign %ENDCOLOR% WeightClass(const WeightClass &other); WeightClass &operator = (const WeightClass &other); %RED%// If you know the correct weight type, you can get it with this method. Will return NULL if an incorrect type is attempted. %ENDCOLOR% template<class W> W *GetWeight() const; %RED%// Constants representing zero and one in all possible weight types %ENDCOLOR% static const WeightClass &Zero(); static const WeightClass &One(); ~WeightClass(); }; </pre> ---+++ Operations In general, many of the [[FstQuickTour#AvailableOperations][operations]] that are implemented for the underlying templated FSTs are implemented for instances of <tt>FstClass</tt>, sometimes with modified option lists. Check =<fst/fstscript.h>=. #FstScriptCodeExamples ---+++ Example The code in the [[FstQuickTour#FstApplicationFromCpp][quick tour]] could be reimplemented using the <tt>fst::script</tt> library as follows. This new version will work with FSTs of all arc types, not just <tt>StdArc</tt> (though the arc types in <tt>input.fst</tt> and <tt>model.fst</tt> must match). <pre> %RED%// Reads in an input FST. %ENDCOLOR% FstClass *input = FstClass::Read("input.fst"); %RED%// Reads in the transduction model. %ENDCOLOR% FstClass *model = FstClass::Read("model.fst"); %RED%// The FSTs must be sorted along the dimensions they will be joined. // In fact, only one needs to be so sorted. // This could have instead been done for "model.fst" when it was created. %ENDCOLOR% ArcSort(input, OLABEL_COMPARE); ArcSort(model, ILABEL_COMPARE); %RED%// Container for composition result. %ENDCOLOR% VectorFstClass result(input->ArcType()); %RED%// Create the composed FST. %ENDCOLOR% Compose(*input, *model, &result); %RED%// Just keeps the output labels. %ENDCOLOR% Project(&result, PROJECT_OUTPUT); </pre> #FstTypes ---++ <nop>Fst Types The following non-abstract FST types with file representations are defined in the <nop>OpenFst library: | *Name* | *Usage* | *Description* | *Registered* | ** | | vector | =VectorFst<A>= | General-purpose mutable FST | =libfst.{a,so}= | %DOX{fst::VectorFst[%H%]}% | | const | =ConstFst<A>= | General-purpose expanded, immutable FST (# arcs < 2<sup>32</sup>) | =libfst.{a,so}= | %DOX{fst::ConstFst[%H%]}% | | constN, N=8,16,64 | =ConstFst<A, uintN>= | General-purpose expanded, immutable FST (# arcs < 2<sup>N</sup>) | =fst/libfstconst.{a,so}=, =fst/constN-fst.so= | %DOX{fst::ConstFst[%H%]}% | | compact_string | =CompactFst<A, StringCompactor<A>>= | Compact, immutable, unweighted, string FST (# arcs < 2<sup>32</sup>) | =libfst.{a,so}= | %DOX{fst::CompactFst[%H%]}% | | compactN_string, N=8,16,64 | =CompactFst<A, StringCompactor<A>, uintN>= | Compact, immutable, unweighted string FST (# arcs < 2<sup>N</sup>) | =fst/libfstcompact.{a,so}=, =fst/compactN-fst.{a,so}= | %DOX{fst::CompactFst[%H%]}% | | compact_weighted_string | =CompactFst<A, WeightedStringCompactor<A>>= | Compact, immutable, weighted, string FST (# arcs < 2<sup>32</sup>) | =libfst.{a,so}= | %DOX{fst::CompactFst[%H%]}% | | compactN_weighted_string, N=8,16,64 | =CompactFst<A, WeightedStringCompactor<A>, uintN>= | Compact, immutable, weighted, string FST (# arcs < 2<sup>N</sup>) | =fst/libfstcompact.{a,so}=, =fst/compactN_weighted-fst.{a,so}= | %DOX{fst::CompactFst[%H%]}% | | compact_acceptor | =CompactFst<A, AcceptorCompactor<A>>= | Compact, immutable, weighted FSA (# arcs < 2<sup>32</sup>) | =libfst.{a,so}= | %DOX{fst::CompactFst[%H%]}% | | compactN_acceptor, N=8,16,64 | =CompactFst<A, AcceptorCompactor<A>, uintN>= | Compact, immutable, weighted FSA (# arcs < 2<sup>N</sup>) | =fst/libfstcompact.{a,so}=, =fst/compactN_acceptor-fst.{a,so}= | %DOX{fst::CompactFst[%H%]}% | | compact_unweighted | =CompactFst<A, UnweightedCompactor<A>>= | Compact, immutable, unweighted FST (# arcs < 2<sup>32</sup>) | =libfst.{a,so}= | %DOX{fst::CompactFst[%H%]}% | | compactN_unweighted N=8,16,64 | =CompactFst<A, UnweightedCompactor<A>, uintN>= | Compact, immutable, unweighted FST (# arcs < 2<sup>N</sup>) | =fst/libfstcompact.{a,so}=, =fst/compactN_unweighted-fst.{a,so}= | %DOX{fst::CompactFst[%H%]}% | | compact_unweighted_acceptor | =CompactFst<A, UnweightedAcceptorCompactor<A>>= | Compact, immutable, unweighted FSA (# arcs < 2<sup>32</sup>) | =libfst.{a,so}= | %DOX{fst::CompactFst[%H%]}% | | compactN_unweighted_acceptor, N=8,16,64 | =CompactFst<A, UnweightedAcceptorCompactor<A>, uintN>= | Compact, immutable, unweighted FSA (# arcs < 2<sup>N</sup>) | =fst/libfstcompact.{a,so}=, =fst/compactN_unweighted_acceptor-fst.{a,so}= | %DOX{fst::CompactFst[%H%]}% | | ilabel_lookahead | ={Std,Log}ILabelLookAheadFst= | Immutable FST with input label lookahead matcher | =fst/libfstlookahead.{a,so}=, =fst/ilabel_lookahead-fst.{a,so}= | %DOX{namespacefst.html#StdILabelLookAheadFst[%H%]}% | | olabel_lookahead | ={Std,Log}OLabelLookAheadFst= | Immutable FST with output label lookahead matcher | =fst/libfstllookahead.{a,so}=, =fst/olabel_lookahead-fst.{a,so}= | %DOX{namespacefst.html#StdOLabelLookAheadFst[%H%]}% | | arc_lookahead | ={Std,Log}ArcLookAheadFst= | Immutable FST with arc lookahead matcher | =fst/libfstllookahead.{a,so}=, =fst/arc_lookahead-fst.{a,so}= | %DOX{namespacefst.html#StdArcLookAheadFst[%H%]}% | | ngram | =NGramFst<A>= | Immutable FST for n-gram language models | =fst/libfstngram.{a,so}= | %DOX{namespacefst.html#NGramFst[%H%]}% | These FST types are registered for =[[FstAdvancedUsage#Arcs][StdArc]]= and =[[FstAdvancedUsage#Arcs][LogArc]]= in the indicated libraries. The user must [[FstAdvancedUsage#FST_Input_Output][register]] other types themselves for general FST I/O. Note the libraries other than =libfst.{a,so}= are [[FstExtensions][extensions]] that must be built and linked separately (to avoid code bloat). For each of these, there is a version that contains all variants of that extension (e.g., =lib/libfstconst.{a,so}=) that should be specified at compile time . Alternatively, there are per variant libraries (e.g. =lib/constN-fst.so=) that will be dynamically loaded into any binary compiled with <nop>OpenFst when the =LD_LIBRARY_PATH= (or equivalent) includes e.g. =/usr/local/lib/fst=. Note =Std{I,O}LabelLookAheadFst=, despite its name, uses the =LogWeight::Plus()= during weight-pushing in composition (only). This choice was made for reasons of efficiency and convenience; it can circumvented by changing the _accumulator_ %DOX{fst::FastLogAccumulator[%H%]}% used. Non-abstract FST types without file representations include the on-the-fly Fst [[FstQuickTour#AvailableOperations][operations]] and the following: | *Name* | *Description* | ** | | =EditFst<A>= | Wraps an =ExpandedFst= as a =MutableFst=, sharing non-mutated components. | %DOX{fst::EditFst[%H%]}% | #FstConvert You can convert to a non-abstract FST Type =F<A>= by calling its =F<A>(const Fst<A> &)= constructor from C++ or the =fstconvert -fst_type=Fname= shell-level command (when there is a file representation). #LookAheadMatchers ---++ <nop>Look-Ahead Matchers _Lookahead matchers_ are [[FstAdvancedUsage#FstMatchers][matchers]] that implement additional functionality to allow looking-ahead along paths. When used in combination with a [[FstAdvancedUsage#ComposeFilters][lookahead filter]] in composition, this can result in considerable efficiency improvements. See Cyril Allauzen, Michael Riley and Johan Schalkwyk, [[%ATTACHURL%/ciaa10.pdf]["Filters for Efficient Composition of Weighted Finite-State Transducers"]], _Proceedings of the Fifteenth International Conference on Implementation and Application of Automata_, (CIAA 2010), Winnipeg, MB. The matcher interface is augmented with the following methods: <pre>template <class F> class SomeLookAheadMatcher { public: typedef F FST; typedef F::Arc Arc; typedef typename Arc::StateId StateId; typedef typename Arc::Label Label; typedef typename Arc::Weight Weight; %RED% // Required constructors.%ENDCOLOR% LookAheadMatcher(const F &fst, MatchType match_type); LookAheadMatcher(const LookAheadMatcher &matcher); Below are methods for looking ahead for a match to a label and more generally, to a rational set. Each returns false if there is definitely not a match and returns true if there possibly is a match %RED% // LABEL LOOKAHEAD: Can 'label' be read from the current matcher state // after possibly following epsilon transitions? %ENDCOLOR% bool LookAheadLabel(Label label) const; %RED% // RATIONAL LOOKAHEAD: The next methods allow looking ahead for an // arbitrary rational set of strings, specified by an FST and a state // from which to begin the matching. If the lookahead FST is a // transducer, this looks on the side different from the matcher // 'match_type' (cf. composition). %ENDCOLOR% %RED% // Are there paths P from 's' in the lookahead FST that can be read from // the cur. matcher state? %ENDCOLOR% bool LookAheadFst(const Fst<Arc>& fst, StateId s); %RED% // Gives an estimate of the combined weight of the paths P in the // lookahead and matcher FSTs for the last call to LookAheadFst. // A trivial implementation returns Weight::One(). Non-trivial // implementations are useful for weight-pushing in composition. %ENDCOLOR% Weight LookAheadWeight() const; %RED% // Is there is a single non-epsilon arc found in the lookahead FST // that begins P (after possibly following any epsilons) in the last // call LookAheadFst? If so, return true and copy it to '*arc', o.w. // return false. A trivial implementation returns false. Non-trivial // implementations are useful for label-pushing in composition. %ENDCOLOR% bool LookAheadPrefix(Arc *arc); %RED% // Optionally pre-specifies the lookahead FST that will be passed // to LookAheadFst() for possible precomputation. If copy is true, // then 'fst' is a copy of the FST used in the previous call to // this method (useful to avoid unnecessary updates). %ENDCOLOR% void InitLookAheadFst(const Fst<Arc>& fst, bool copy = false); }; </pre> The following lookahead matchers are defined in the <nop>OpenFst library: | *Name* | *Description* | ** | | =ILabelLookAheadMatcher= | Look-ahead to first non-epsilon input label on a path | %DOX{fst::LabelLookAheadMatcher[%H%]}% | | =OLabelLookAheadMatcher= | Look-ahead to first non-epsilon output label on a path | %DOX{fst::LabelLookAheadMatcher[%H%]}% | | =ArcLookAheadMatcher= | Look-ahead to first transition on a path | %DOX{fst::ArcLookAheadMatcher[%H%]}% | There are [[FstAdvancedUsage#FstTypes][FST types]] that are provided with these matchers. When these are used in composition, no special options need to be passed; the appropriate matcher and filter are selected automatically. The ilabel (olabel) lookahead matcher has some special properties. It currently requires that there are no input (output) epsilon cycles. Further, it may relabel the input (output) alphabet in order to efficiently look-ahead. The class =LabelLookAheadRelabeler= (in =<fst/lookahead-matcher.h>=) can be used to obtain the mapping between the old and new alphabet (=LabelLookAheadRelabeler::RelabelPairs=) and to relabel and sort other FSTs with the new labeling to make them suitable for composition (=LabelLookAheadRelabeler::Relabel=). Alternatively, the flag =--save_relabel_ipairs= (=--save_relabel_opairs=) can be used to send the relabeling information to a file when the lookahead matcher is constructed (useful when =[[FstAdvancedUsage#FstConvert][fstconvert]]= is used to create a lookahead FST from the command line). #FstMatchers ---++ Matchers _Matchers_ can find and iterate through requested labels at FST states; their principal use is in composition matching. In the simplest form, these are just a search or hash keyed on labels. More generally, they may implement matching special symbols that represent sets of labels such as ρ (rest), σ (all) or φ (fail), which can be used for more compact automata representations and faster matching. The Matcher interface is: <pre>%RED%// Specifies matcher action. %ENDCOLOR% enum MatchType { MATCH_INPUT, %RED%// Match input label. %ENDCOLOR% MATCH_OUTPUT, %RED%// Match output label. %ENDCOLOR% MATCH_NONE, %RED%// Match nothing. %ENDCOLOR% MATCH_UNKNOWN, %RED%// Match type unknown.%ENDCOLOR% };</pre> <pre>template <class F> class SomeMatcher { public: typedef F FST; typedef F::Arc Arc; typedef typename Arc::StateId StateId; typedef typename Arc::Label Label; typedef typename Arc::Weight Weight; %RED% // Required constructors. %ENDCOLOR% SomeMatcher(const F &fst, MatchType type); SomeMatcher(const SomeMatcher &matcher); %RED% // Returns the match type that can be provided (depending on // compatibility of the input FST). It is either // the requested match type, MATCH_NONE, or MATCH_UNKNOWN. // If 'test' is false, a constant time test is performed, but // MATCH_UNKNOWN may be returned. If 'test' is true, // a definite answer is returned, but may involve more costly // computation (e.g., visiting the Fst). %ENDCOLOR% MatchType Type(bool test) const; %RED% // Specifies the current state. %ENDCOLOR% void SetState(StateId s); %RED% // This finds matches to a label at the current state. // Returns true if a match found. kNoLabel matches any // 'non-consuming' transitions, e.g., epsilon transitions, // which do not require a matching symbol. %ENDCOLOR% bool Find(Label label); %RED% // These iterate through any matches found: // No more matches. %ENDCOLOR% bool Done() const; %RED% // Current arc (when !Done) %ENDCOLOR% const A& Value() const; %RED% // Advance to next arc (when !Done) %ENDCOLOR% void Next(); %RED% // Indicates preference for being the side used for matching // in composition/intersection.%ENDCOLOR% ssize_t Priority(StateId s); %RED% // Return matcher FST. %ENDCOLOR% const F& GetFst() const; %RED% // This specifies the known Fst properties as viewed from this // matcher. It takes as argument the input Fst's known properties. %ENDCOLOR% uint64 Properties(uint64 props) const; };</pre> The following matchers are defined in the <nop>OpenFst library (see also the [[FstAdvancedUsage#LookAheadMatchers][lookahead matcher]] topic). | *Name* | *Description* | ** | | =SortedMatcher= | Binary search on sorted input | %DOX{fst::SortedMatcher[%H%]}% | | =RhoMatcher<M>= | ρ symbol handling; templated on underlying matcher | %DOX{fst::RhoMatcher[%H%]}% | | =SigmaMatcher<M>= | σ symbol handling; templated on underlying matcher | %DOX{fst::SigmaMatcher[%H%]}% | | =PhiMatcher<M>= | φ symbol handling; templated on underlying matcher| %DOX{fst::PhiMatcher[%H%]}% | | =MultiEpsMatcher<M>= | Treats specified non-0 labels as non-consuming labels (in addition to 0) | %DOX{fst::MultiEpsMatcher[%H%]}% | | =ExplicitMatcher<M>= | Suppresses any implicit matches of non-consuming labels | %DOX{fst::ExplicitMatcher[%H%]}% | =SortedMatcher= requires the underlying Fst be sorted on the appropriate side. How it matches epsilons requires some explanation. =Find(0)= matches any epsilons on the underlying Fst explicitly (as if they were any other symbol) but also returns an implicit self-loop (namely =Arc(kNoLabel, 0, Weight::One(), current_state)= if the =match_type= is =MATCH_INPUT= and =Arc(0, kNoLabel, Weight::One(), current_state)= if the =match_type= is =MATCH_OUTPUT=). In other words, an epsilon matches at every state without moving forward on the matched FST, a natural interpretation. This behavior implements epsilon-transition handling in composition, or, more generally, a 'non-consuming' match as with the =MultiEpsMatcher= (with =kNoLabel= informing composition of such a match). A [[FstAdvancedUsage#ComposeFilters][composition filter]] determines which of these epsilon transitions are ultimately accepted. Any matcher used in composition and related algorithms must implement these implicit matches for correct epsilon handling. In some other uses, the implicit matches may not be needed. In that case, an =ExplicitMatcher= can be used to conveniently suppress them (or the user can recognize the =kNoLabel= loop and skip them). The special symbols referenced above behave as described in this table: | | *Consumes no symbol* | *Consumes symbol* | | *Matches all* | ε | σ | | *Matches rest* | φ | ρ | The ε symbol is assigned label =0= by [[FstConventions][convention]]. The numeric label of the other special symbols is determined by a constructor argument to their respective matchers. The ρ, σ and φ matchers augment the functionality of their underlying template argument matcher. In this way, matchers can be cascaded (with special symbol precedence determined by the order). A design choice for these matchers is whether to _remove_ the special symbol in the result (used for the ρ, σ, and φ matchers) or _return_ it (used for epsilon-handling). The first case is equivalent to (but more efficient than) applying special-symbol removal prior to composition (c.f., [[RmEpsilonDoc][epsilon removal]]). This case requires that only one of the FSTs in composition contain such symbols for any paired states. The second case requires well-defined semantics and that composition proper identify and handle any non-consuming symbols on each FST. (The result of =Find(kNoLabel)= identifies on one FST, while the matcher's returning a =kNolabel= loop handles the other, both described above.) The template =Matcher<F>= selects the pre-designated matcher for =Fst= type =F=; it is typically =SortedMatcher=. [[ComposeDoc][Composition]] uses this matcher by default. It can be [[FstAdvancedUsage#OperationOptions][changed]] by using the version of =ComposeFst= that accepts =ComposeFstOptions= %DOX{fst::ComposeFstOptions[%H%]}%. Note two matchers (usually of the same C++ type but different =MatchType=) are used in composition -- one for each FST. Whether actual match queries are performed on one or both FSTs depends on the matcher constructor arguments, the matcher capabilities (queried by =Type()=) and composition itself. An example access of an FST's matcher is [[FstQuickTour#FstMatcher][here]]. An example use of a ρ matcher in composition is [[FstAdvancedUsage#OperationOptions][here]]; σ and φ matcher usage is similar. #MutableFsts ---++ Mutable Fsts A =MutableFst= %DOX{fst::MutableFst[%H%]}% is an [[FstAdvancedUsage#ExpandedFsts][ExpandedFst]] that has additional methods that specifiy how to set the start state, final weights, [[FstAdvancedUsage#FstProperties][properties]] and the input and output symbols, how to add and delete states and arcs, as well as methods to copy and read the mutable FST. In particular, a =MutableFst= class has the interface: <pre> template <class A> class MutableFst : public ExpandedFst<class A> { public: typedef A Arc; typedef typename A::StateId StateId; typedef typename A::Weight Weight; %RED% // Set the initial state %ENDCOLOR% virtual void SetStart(StateId) = 0; %RED% // Set the initial state %ENDCOLOR% virtual void SetFinal(StateId, Weight) = 0; %RED% // Set property bits wrt mask %ENDCOLOR% virtual void SetProperties(uint64 props, uint64 mask) = 0; %RED% // Add a state, return its ID %ENDCOLOR% virtual StateId AddState() = 0; %RED% // Add an arc to state %ENDCOLOR% virtual void AddArc(StateId, const A &arc) = 0; %RED% // Delete some states %ENDCOLOR% virtual void DeleteStates(const vector<StateId>&) = 0; %RED% // Delete all states %ENDCOLOR% virtual void DeleteStates() = 0; %RED% // Delete some arcs at state %ENDCOLOR% virtual void DeleteArcs(StateId, size_t n) = 0; %RED% // Delete all arcs at state %ENDCOLOR% virtual void DeleteArcs(StateId) = 0; %RED% // Get a copy of this MutableFst %ENDCOLOR% virtual MutableFst<A> *Copy() const = 0; %RED% // Read an MutableFst from an input stream; returns NULL on error %ENDCOLOR% static MutableFst<A> *Read(istream &strm, const FstReadOptions &opts); %RED% // Read an MutableFst from a file; return NULL on error // Empty filename reads from standard input %ENDCOLOR% static MutableFst<A> *Read(const string &filename); %RED% // Set input label symbol table; NULL signifies not unspecified %ENDCOLOR% virtual void SetInputSymbols(const SymbolTable* isyms) = 0; %RED% // Set output label symbol table; NULL signifies not unspecified %ENDCOLOR% virtual void SetOutputSymbols(const SymbolTable* osyms) = 0; }; </pre> =MutableFst= is an abstract class (note the pure virtual methods). An example is =VectorFst= %DOX{fst::VectorFst[%H%]}%. The companion [[FstAdvancedUsage#ArcIterators][mutable arc iterator]] class provides access to and modification of the transitions of the FST #NaturalOrders ---++ Natural Orders The _natural order_ ≤ associated with a [[FstGlossary#SemiringDef][semiring]] is defined as a ≤ b iff a ⊕ b = a. In the <nop>OpenFst library, we define the strict version of this order as: <verbatim> template <class W> NaturalLess() { bool operator()(const W &w1, const W &w2) const { return (Plus(w1, w2) == w1) && w1 != w2; } }; </verbatim> An order is _left monotonic_ w.r.t a semring iff a ≤ b ⇒ ∀c, c ⊕ a ≤ c ⊕ b and c ⊗ a ≤ c ⊗ b; _right monotonic_ is defined similarly. An order is negative iff <span style="text-decoration:overline">1</span> ≤ <span style="text-decoration:overline">0</span>. The natural order is a left (right) monotonic and negative partial order iff the semiring is [[FstWeightRequirements][idempotent]] and left (right) [[FstWeightRequirements][distributive]]. It is a total order iff the semiring has the [[FstWeightRequirements][path property]]. See Mohri, "Semiring Framework and Algorithms for Shortest-Distance Problems", _Journal of Automata, Languages and Combinatorics_ 7(3):321-350, 2002. This is the default total order (under the requirements above) that we use for the shortest path and pruning algorithms. This order is the _natural_ one to use given that it generally needs to be total, monotonic and. negative: _total_ so that all weights can be compared, _monotonic_ so there is a practical algorithm, and _negative_ so that the "free" weight <span style="text-decoration:overline">1</span> is preferred to the "disallowed" weight <span style="text-decoration:overline">0</span>. #OperationOptions ---++ Operation Options Many [[FstQuickTour#AvailableOperations][FST operations]] have versions that accept options, especially option structures, that have not been documented in this Wiki for brevity other than to mention some of the parameters that can be changed. For example, most of the [[FstQuickTour#DelayedFsts][delayed Fsts]] have constructors that accept options that control [[FstAdvancedUsage#FstCaching][caching behavior]]. Here is an example that selects minimal [[FstAdvancedUsage#FstCaching][caching]] and the [[FstAdvancedUsage#FstMatchers][rho matcher]] (for fst2 ρ's) in composition:: <verbatim> typedef RhoMatcher< SortedMatcher<StdFst> > RM; ComposeFstOptions<StdArc, RM> opts; opts.gc_limit = 0; opts.matcher1 = new RM(fst1, MATCH_NONE, kNoLabel); opts.matcher2 = new RM(fst2, MATCH_INPUT, SomeRhoLabel); StdComposeFst cfst(fst1, fst2, opts); </verbatim> Follow the links to the code under each operation's documentation for the specific details. ---++ Properties Each =Fst= has associated with it a set of stored properties that assert facts about it. These are queried in an FST with the =Properties()= method and set in a =MutableFst= with the =SetProperties()= method. <nop>OpenFst library operations use these properties to optimize their performance. <nop>OpenFst library operations and mutable FSTs attempt to preserve as much property information in their results as possible without significant added computation. Some properties are binary - they are either true or false. For each such property, there is a single stored bit that is set if true and not set if false. The binary =Fst= properties are: | *Name* | *Description* | | =kError= | an [[FstAdvancedUsage#ErrorHandling][error]] was detected while constructing/using the FST | | =kExpanded= | Is an [[FstAdvancedUsage#ExpandedFst][ExpandedFst]] | | =kMutable= | Is a [[FstAdvancedUsage#MutableFst][MutableFst]] | Other properties are trinary - they are either true, false or unknown. For each such property, there are two stored bits; one is set if true, the other is set if false and neither is set if unknown. |*Type* | *Name* | *Description* | | Acceptor | =kAcceptor= | Input and output label are equal for each arc | | | =kNotAcceptor= | Input and output label are not equal for some arc | | Accessible | =kAccessible= | All states reachable from the initial state | | | =kNotAccessible= | Not all states reachable from the initial state | | | =kCoAccessible= | All states can reach a final state | | | =kNotCoAccessible= | Not all states can reach a final state | | Cyclic | =kCyclic= | Has cycles | | | =kAcyclic= | Has no cycles | | | =kInitialCyclic= | Has cycles containing the initial state | | | =KInitialAcyclic= | Has no cycles containing the initial state | | Deterministic | =kIDeterministic=| Input labels are unique leaving each state | | | =kNonIDeterministic=| Input labels are not unique leaving some state | | | =kODeterministic=| Output labels are unique leaving each state | | | =kNonODeterministic=| Output labels are not unique leaving some state | | Epsilons | =kEpsilons= | Has input/output epsilons | | | =KNoEpsilons= | Has no input/output epsilons | | | =kIEpsilons= | Has input epsilons | | | =KNoIEpsilons= | Has no input epsilons | | | =kOEpsilons= | Has output epsilons | | | =KNoOEpsilons= | Has no output epsilons | | Sorted | =kILabelSorted= | Input labels sorted for each state | | | =kNotILabelSorted= | Input labels not sorted for each state | | | =kOLabelSorted= | Output labels sorted for each state | | | =kNotOLabelSorted= | Output labels not sorted for each state | | | =kTopSorted= | States topologically sorted | | | =kNotTopSorted= | States not topologically sorted | | Weighted | =kWeighted= | Non-trivial arc or final weights | | | =kNotWeighted= | Only trivial arc and final weights | The call =fst.Properties(mask, false)= returns the stored property bits set in the mask bits; some properties may be unknown. it is a constant-time operation. The call =fst.Properties(mask, true)= returns the stored property bits set in the mask bits after computing and updating any of those set in the mask that are unknown. It is a linear-time (_O(V + E)_) operation if any of the requested bits were unknown. Note =fstinfo --test_properties=false= will show the stored properties bits, while =fstinfo= or =fstinfo --test_properties=true= will compute unknown properties. #StateIterators ---++ State Iterators A state iterator %DOX{fst::StateIterator[%H%]}% is used to access the states of an FST. It has the form: <pre> template <class F> class StateIterator { typedef typename F::Arc Arc; typedef typename Arc::StateId StateId; public: StateIterator(const &F fst); %RED% // End of iterator? %ENDCOLOR% bool Done() const; %RED% // Current state ID (when !Done) %ENDCOLOR% StateId Value() const; %RED% // Advance to next state (when !Done) %ENDCOLOR% void Next(); %RED% // Return to initial position %ENDCOLOR% void Reset(); }; </pre> It is templated on the Fst class =F= to allow efficient specializations but defaults to a generic version on the abstract base [[FstAdvancedUsage#BaseFsts][Fst]] class. See [[FstConventions][here]] for conventions that state iterator use must respect. An example use of a state iterator is shown [[FstQuickTour#StateIterator][here]]. #StateMappers ---++ State Mappers _State mappers_ are function objects used by the [[StateMapDoc][StateMap]] operation to transform states. A state mapper has the form: <pre>%RED%// This determines how symbol tables are mapped. %ENDCOLOR% enum MapSymbolsAction { %RED% // Symbols should be cleared in the result by the map. %ENDCOLOR% MAP_CLEAR_SYMBOLS, %RED% // Symbols should be copied from the input FST by the map. %ENDCOLOR% MAP_COPY_SYMBOLS, %RED% // Symbols should not be modified in the result by the map itself. // (They may set by the mapper). %ENDCOLOR% MAP_NOOP_SYMBOLS }; class SomeStateMapper { public: %RED% // Assumes input arc type is A and result arc type is B %ENDCOLOR% typedef A FromArc; typedef B ToArc; %RED% // Typical constructor. %ENDCOLOR% SomeStateMapper(const Fst<A> &fst);%RED% // Required copy constructor that allows updating Fst argument.%ENDCOLOR% SomeStateMapper(const SomStateMapper &mapper, const Fst<A&ft; *fst = 0);%RED% // Specifies initial state of result%ENDCOLOR% B::StateId Start() const;%RED% // Specifies state's final weight in result%ENDCOLOR% B::Weight Final(B::StateId s) const;%RED% // These methods iterate through a state's arcs in result // Specifies state to iterator over %ENDCOLOR% void SetState(B::StateId s); %RED% // End of arcs? %ENDCOLOR% bool Done() const; %RED% // Current arc %ENDCOLOR% const B &Value() const; %RED% // Advance to next arc (when !Done) %ENDCOLOR% void Next(); %RED% // Specifies input symbol table action the mapper requires (see above). %ENDCOLOR% MapSymbolsAction InputSymbolsAction() const; %RED% // Specifies output symbol table action the mapper requires (see above). %ENDCOLOR% MapSymbolsAction OutputSymbolsAction() const; %RED% // This specifies the known properties of an Fst mapped by this // mapper. It takes as argument the input Fst's known properties %ENDCOLOR% uint64 Properties(uint64 props) const; };</pre> The following state mappers are defined in the <nop>OpenFst library: | *Name* | *Description* | ** | | =ArcSumMapper= | Sums weights of identically labeled multi-arcs | %DOX{fst::ArcSumMapper[%H%]}% | | =ArcUniqueMapper= | Keeps one of identically labelled and weighted multi-arcs | %DOX{fst::ArcUniqueMapper[%H%]}% | | =IdentityStateMapper= | Maps to self | %DOX{fst::IdentityStateMapper[%H%]}% | Another specialized state mapper is used to implement [[ArcSortDoc][ArcSort]]. #StateQueues ---++ State Queues State queues are used by, among others, the [[ShortestPathDoc][shortest path]] and [[ShortestDistanceDoc][shortest distance]] algorithms and by the [[FstAdvancedUsage#FstVisitors][Visit]] operation. A =state queue= has the form: <pre> template <class StateId> class SomeQueue { public: %RED% // Ctr: may need args (e.g., Fst, comparator) for some queues %ENDCOLOR% SomeQueue(...); %RED% // Returns the head of the queue %ENDCOLOR% StateId Head() const; %RED% // Inserts a state %ENDCOLOR% void Enqueue(StateId s); %RED% // Removes the head of the queue %ENDCOLOR% void Dequeue(); %RED% // Updates ordering of state s when weight changes, if necessary %ENDCOLOR% void Update(StateId s); %RED% // Does the queue contain no elements? %ENDCOLOR% bool Empty() const; %RED% // Remove all states from queue %ENDCOLOR% void Clear(); }; </pre> Pre-defined state queues include: | *Queue* | *Description* | ** | | =AutoQueue= | Automatically-selected from Fst properties | %DOX{fst::AutoQueue[%H%]}% | | =FifoQueue= | First-In, first-Out | %DOX{fst::FifoQueue[%H%]}% | | =LifoQueue= | Last-In, first-Out | %DOX{fst::LifoQueue[%H%]}% | | =NaturalAStarQueue= | A* (under natural order with provided estimate) | %DOX{fst::NaturalAStarQueue[%H%]}% | | =NaturalPruneQueue= | Pruning meta-queue (within provided threshold under natural order) | %DOX{fst::NaturalPruneQueue[%H%]}% | | =NaturalShortestFirstQueue= | Priority (least weight under natural order) | %DOX{fst::NaturalShortestFirstQueue[%H%]}% | | =SccQueue= | Component graph top-ordered meta-queue | %DOX{fst::SccQueue[%H%]}% | | =StateOrderQueue= | State-ID ordered | %DOX{fst::StateOrderQueue[%H%]}% | | =TopOrderQueue= | Topologically ordered | %DOX{fst::TopOrderQueue[%H%]}% | Some queues accept [[FstAdvancedUsage#ArcFilters][arc filters]] to control which transitions are explored. #StateTables ---++State Tables _State tables_ determine the bijective mapping between state tuples (e.g. in [[ComposeDoc][composition]] triples of two FST states and a [[FstAdvancedUsage#ComposeFilters][composition filter state]]) and their corresponding state IDs. They are classes, templated on state tuples, of the form: <pre>template <class T> class SomeStateTable { typedef typename T StateTuple; %RED% // Required constructors. %ENDCOLOR% SomeStateTable(); %RED% // Lookup state ID by tuple. If it doesn't exist, then add it. %ENDCOLOR% StateId FindState(const StateTuple &); %RED% // Lookup state tuple by state ID. %ENDCOLOR% const StateTuple<StateId> &Tuple(StateId) const; }; </pre> A state tuple has the form: <pre>template <class S> struct SomeStateTuple { typedef typename S StateId; %RED% // Required constructor. %ENDCOLOR% SomeStateTuple(); %RED% // Data %ENDCOLOR% ... }; </pre> A specific state tuple is a =ComposeStateTuple= %DOX{fst::ComposeStateTuple[%H%]}% that has data members =StateId state_id1=, =StateId state_id2=, and =FilterState filter_state=. The following state tables are defined in the <nop>OpenFst library: | *Name* | *Description* | ** | | =HashStateTable= | Hash map implementation | %DOX{fst::HashStateTable[%H%]}% | | =CompactHashStateTable= | Hash set implementation | %DOX{fst::CompactHashStateTable[%H%]}% | | =VectorStateTable= | Vector implementation | %DOX{fst::VectorStateTable[%H%]}% | | =VectorHashStateTable= | Vector and hash set implementation | %DOX{fst::VectorHashStateTable[%H%]}% | | =ErasableStateTable= | Deque implementation - permits erasures | %DOX{fst::ErasableStateTable[%H%]}% | Different state tables provide different time and space tradeoffs for applications. Composition state tables are defined using state tables with =ComposeStateTuple=. They are the principal data structure used by composition other than the result [[FstAdvancedUsage#FstCaching][cache]]. The following composition state tables are defined in the <nop>OpenFst library: | *Name* | *State Table* | *Description* | ** | | =GenericComposeStateTable= | =CompactHashStateTable= | General-purpose choice | %DOX{fst::GenericComposeStateTable[%H%]}% | | =ProductComposeStateTable= | =VectorStateTable= | Efficient when the composition state space is densely populated | %DOX{fst::ProductComposeStateTable[%H%]}% | | =StringDetComposeStateTable= | =VectorStateTable= | Efficient when FST1 is a string and FST2 is deterministic | %DOX{fst::StringDetComposeStateTable[%H%]}% | | =DetStringComposeStateTable= | =VectorStateTable= | Efficient when FST1 is deterministic and FST2 is a string | %DOX{fst::DetStringComposeStateTable[%H%]}% | | =EraseableComposeStateTable= | =ErasableStateTable= | Allows composition state tuple erasure | %DOX{fst::ErasableComposeStateTable[%H%]}% | =GenericComposeStateTable= is the default composition state table. It can be [[FstAdvancedUsage#OperationOptions][changed]] by using the version of =ComposeFst= that accepts =ComposeFstOptions=. %DOX{fst::ComposeFstOptions[%H%]}% #SymbolTables ---++ Symbol Tables _Symbol tables_ store the bijective mapping between textual labels, used in [[FstQuickTour#SymbolTables][reading and printing]] an FST textual file, and their integer assignment, used in the FST's internal representation. Symbol tables are usually read in with [[FstQuickTour#FstCompile][fstcompile]], can be stored by the FST, and used to print out the FST with [[FstQuickTour#FstPrint][fstprint]],. Here are a examples of symbol table manipulation: <pre>%RED%// Various ways to reading symbol tables %ENDCOLOR% StdFst *fst = StdFst::Read("some.fst"); SymbolTable *isyms = fst->InputSymbolTable(); SymbolTable *osyms = fst->OutputSymbolTable(); SymbolTable *syms = SymbolTable::ReadText("some.syms"); %RED% // Adding and accessing symbols and keys %ENDCOLOR% syms->AddSymbol("kumquat", 7); int64 key = syms->Find("kumquat"); string symbol = syms->Find(7); %RED% // Various ways of writing symbol tables %ENDCOLOR% fst->SetInputSymbols(isyms); fst->SetOutputSymbols(osyms); fst->Write("some.fst"): syms->WriteText("some.syms"); </pre> #NewArcs ---++ User-defined Fst Arcs and Weights A user may define his own weight type so long as it meets the necessary [[FstWeightRequirements][requirements]]. A user may define his own arc type so long as has the right [[FstAdvancedUsage#FstArcs][form]]. Some Fst I/O with new arc types requires [[FstAdvancedUsage#FstInputOutput][registration]]. #NewFsts ---++ User-defined Fst Classes %ICON{wip}% #FstVisitors ---++Visitors The simplest way to traverse an FST is in state order using a [[FstQuickTour#StateIterator][state iterator]]. A very general traversal method is to use: <pre> Visit(fst, visitor, queue); %DOX{namespacefst.html#Visit[%H%]}% </pre> where the =visitor= object specfies the actions taken in the traversal while the [[FstAdvancedUsage#StateQueues][state queue]] object specifies the traversal order. A =visitor= has the form: <pre>%RED%// Visitor Interface - class determines actions taken during a visit. // If any of the boolean member functions return false, the visit is // aborted by first calling FinishState() on all unfinished (grey) // states and then calling FinishVisit(). %ENDCOLOR% template <class Arc> class SomeVisitor { public: typedef typename Arc::StateId StateId; SomeVisitor(T *return_data); %RED% // Invoked before visit %ENDCOLOR% void InitVisit(const Fst<Arc> &fst); %RED% // Invoked when state discovered (2nd arg is visitation root) %ENDCOLOR% bool InitState(StateId s, StateId root); %RED% // Invoked when arc to white/undiscovered state examined %ENDCOLOR% bool WhiteArc(StateId s, const Arc &a); %RED% // Invoked when arc to grey/unfinished state examined %ENDCOLOR% bool GreyArc(StateId s, const Arc &a); %RED% // Invoked when arc to black/finished state examined %ENDCOLOR% bool BlackArc(StateId s, const Arc &a); %RED% // Invoked when state finished %ENDCOLOR% void FinishState(StateId s); %RED% // Invoked after visit %ENDCOLOR% void FinishVisit(); }; </pre> While a depth-first search can be implemented using =Visit()= with the =LifoQueue()=, it is often better to use the more specialized =DFSVisit()%DOX{namespacefst.html#DfsVisit[%H%]}%= in [[http://cims.nyu.edu/~openfst/doxygen/html/dfs-visit_8h-source.html][<fst/dfs-visit.h>]] since it is somewhat more space-efficient and the specialized visitor interface described there has additional funcitionality for a DFS. Pre-defined FST visitors include: | *Visitor* | *Type* | *Description* | ** | | =CopyVisitor= | Visit | Copies in a queue-specified order | %DOX{fst::CopyVisitor[%H%]}% | | =SccVisitor= | <nop>DfsVisit | Finds strongly-connected components, [[FstGlossary#AccessibleDef][accessibility]] and [[FstGlossary#CoAccessibleDef][coaccessibility]] | %DOX{fst::SccVisitor[%H%]}% | | =TopOrderVisitor= | <nop>DfsVisit | Finds topological order | %DOX{fst::TopOrderVisitor[%H%]}% | The visit operations optionally accept [[FstAdvancedUsage#ArcFilters][arc filters]] to control which transitions are explored. #FstWeights ---++ Weights A =Weight= is a type that is used to represent the cost of taking transitions in an FST. The following basic weight templates are defined in the <nop>OpenFst library: | *Semiring* | *Name* | *Set* | *⊕ %BR% (Plus)* | *⊗ %BR% (Times)* | *<span style="text-decoration:overline">0</span> %BR% (Zero)* | *<span style="text-decoration:overline">1</span> %BR% (One)* | *Notes* | ** | | Expectation | =ExpectationWeight<W1, W2>= | W1 X W2 | ⊕<sub>W1</sub> X ⊕<sub>W2</sub> | [[ExpectationPlus][⊗<sub>expectation</sub>]] | (<span style="text-decoration:overline">0</span><sub>W1</sub>,<span style="text-decoration:overline">0</span><sub>W2</sub>) | (<span style="text-decoration:overline">1</span><sub>W1</sub>,<span style="text-decoration:overline">0</span><sub>W2</sub>) | | %DOX{fst::ProductWeight[%H%]}% | | Lexicographic | =LexicographicWeight<W1, W2>= | W1 X W2 | min | ⊗<sub>W1</sub> X ⊗<sub>W2</sub> | (<span style="text-decoration:overline">0</span><sub>W1</sub>,<span style="text-decoration:overline">0</span><sub>W2</sub>) | (<span style="text-decoration:overline">1</span><sub>W1</sub>,<span style="text-decoration:overline">1</span><sub>W2</sub>) | min: lexicographic order w.r.t. %BR% =W1= and =W2= [[FstAdvancedUsage#NaturalOrders][natural orders]] | %DOX{fst::LexicographicWeight[%H%]}% | | Log | =LogWeightTpl<T>= | [-∞, ∞] | -log(e<sup>-x</sup> + e<sup>-y</sup>) | + | ∞ | 0 | =T=: floating point | %DOX{fst::LogWeightTpl[%H%]}% | | <nop>MinMax | =MinMaxWeightTpl<T>= | [-∞, ∞] | min | max | ∞ | -∞ | =T=: floating point | %DOX{fst::MinMaxWeightTpl[%H%]}% | | Power | =PowerWeight<W, n>= | W<sup>n</sup> | ⊕<sub>W</sub><sup>n</sup> | ⊗<sub>W</sub><sup>n</sup> | <span style="text-decoration:overline">0</span><sub>W</sub><sup>n</sup> | <span style="text-decoration:overline">1</span><sub>W</sub><sup>n</sup> | | %DOX{fst::PowerWeight[%H%]}% | | Product | =ProductWeight<W1, W2>= | W1 X W2 | ⊕<sub>W1</sub> X ⊕<sub>W2</sub> | ⊗<sub>W1</sub> X ⊗<sub>W2</sub> | (<span style="text-decoration:overline">0</span><sub>W1</sub>,<span style="text-decoration:overline">0</span><sub>W2</sub>) | (<span style="text-decoration:overline">1</span><sub>W1</sub>,<span style="text-decoration:overline">1</span><sub>W2</sub>) | | %DOX{fst::ProductWeight[%H%]}% | | SignedLog | =SignedLogWeightTpl<T>= | {-1,1} X [-∞, ∞] | [[SignedLogPlus][⊕<sub>signed_log</sub>]] | (*,+) | (1, ∞) | (1, 0) | =T=: floating point | %DOX{fst::SignedLogWeightTpl[%H%]}% | | SparsePower | =SparsePowerWeight<W>= | W<sup>n</sup> | ⊕<sub>W</sub><sup>n</sup> | ⊗<sub>W</sub><sup>n</sup> | <span style="text-decoration:overline">0</span><sub>W</sub><sup>n</sup> | <span style="text-decoration:overline">1</span><sub>W</sub><sup>n</sup> | =n=: arbitrary | %DOX{fst::SparsePowerWeight[%H%]}% | | String | =StringWeight<L, STRING_LEFT>= | L<sup>*</sup> ∪ {∞} | longest com. prefix | ⋅ | ∞ | ε | =L=: signed integral | %DOX{fst::StringWeight[%H%]}% | | | =StringWeight<L, STRING_RIGHT>= | L<sup>*</sup> ∪ {∞} | longest com. suffix | ⋅ | ∞ | ε | =L=: signed integral | %DOX{fst::StringWeight[%H%]}% | | Tropical |=TropicalWeightTpl<T>= | [-∞, ∞] | min | + | ∞ | 0 | =T=: floating point | %DOX{fst::TropicalWeightTpl[%H%]}% | The following weight types have been defined in the <nop>OpenFst library in terms of the above: | *Name* | *Type* | | =GallicWeight<L, W, S>= | =ProductWeight<StringWeight<L, S>, W>= | | =LogWeight= | =LogWeightTpl<float>= | | =Log64Weight= | =LogWeightTpl<double>= | | =MinMaxWeight= | =MinMaxWeightTpl<float>= | | =SignedLogWeight= | =SignedLogWeightTpl<float>= | | =SignedLog64Weight= | =SignedLogWeightTpl<double>= | | =TropicalWeight= | =TropicalWeightTpl<float>= | Composite weights, such as =ProductWeight= and =LexicographicWeight=, can use [[FstAdvancedUsage#CommandLineOptions][command line flags]] to control their textual formatting. =FLAGS_fst_weight_separator= is printed between the weights (default: =","=). =FLAGS_fst_weight_parentheses= (default: =""=) brackets the weight; if you create nested composite weights, they need to be printed with non-empty brackets (e.g. ="()"=) to ensure correct parsing if read back in. These affect only textual (not binary) I/O. Additional weight information: * [[FstAdvancedUsage#FstArcs][Corresponding arc types]] * [[FstQuickTour#FstWeights][Elementary weight information]] * [[FstAdvancedUsage#NewArcs][User-defined weights]] * [[FstWeightRequirements][Weight type requirements]]
Edit
|
Attach
|
Watch
|
P
rint version
|
H
istory
:
r73
|
r69
<
r68
<
r67
<
r66
|
B
acklinks
|
V
iew topic
|
Raw edit
|
More topic actions...
Topic revision: r67 - 2018-01-05
-
KyleGorman
FST
Log In
or
Register
FST Web
Create New Topic
Index
Search
Changes
Notifications
Statistics
Preferences
Webs
Contrib
FST
Forum
GRM
Kernel
Main
Sandbox
TWiki
Main
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki?
Send feedback