OpenGrm SFst Available Operations

The following operations are provided for SFSTs. Care must be taken that the input FSTs meet the specified requirements (e.g. canonical, backoff-complete or normalized). The binary commands typically check their input requirements are satisfied or raise an error but the C++ versions may not check for efficiency (see the source code documentation for specific cases).

Operation Usage Description Complexity
Approx Approx(ifst, &backoff_fst, phi_label, delta) approximates a normalized stochastic FST wrt a provided backoff-complete FST same as ShortestDistance on the intersection of the input and output FSTs
  sfstapprox[--phi_label=$l][--delta=$d] in.fst backoff.fst out.fst    
Count    
  sfstcount    
IsCanonical IsCanonical(fst, phi_label) checks the second property here holds for a weighted FST Time, Space: O(V + E)
IsNormalized IsNormalized(fst, phi_label, delta) checks the two properties here hold for a weighted FST Time, Space: O(V + E)
GlobalNormalize GlobalNormalize(&fst, phi_label, delta) globally normalizes, when possible1, a canonical weighted FST preserving total path weights (up to a global constant) same as ShortestDistance
  sfstnormalize [--method=global] [--phi_label=$l][--delta=$d] in.fst out.fst    
LocalNormalize LocalNormalize(&fst) locally normalizes, when possible, a canonical weighted FST preserving each state's out-going arc weights up to a state-specific constant Time, Space: O(V + E)
  sfstnormalize -method=local in.fst out.fst    
NGramApprox NGramApprox(ifst, &ofst, order, phi_label, delta) approximates a normalized stochastic FST as an n-gram model (having phi_labels in OpenGrm NGram format) same as ShortestDistance on the intersection of the input and output FSTs
  sfstngramapprox [--order=$o][--phi_label=$l][--delta=$d] in.fst out.fst    
Perplexity Perplexity(fst, phi_label, unknown_label, unknown_class_size) computes perplexity for a stochastic FST  
  sfstperplexity [--phi_label=$l] [-unknown_label=$u][--unknown_class_size=$s] in.fst test.far (test sentences are in FST archive format)  
PhiNormalize PhiNormalize(&fst, phi_label) normalizes, when possible, a canonical weighted FST by only modifying the failure transitions Time, Space: O(V + E)
  sfstnormalize --method=phi [-phi_label=$l][--delta=$d] in.fst out.fst    
RandGen fst::RandGen(ifst, &ofst, fst::RandGenOptions<SFstArcSelector>(...)) randomly generates paths in a stochastic FST (correctly dealing with failure transitions) see RandGen
  sfstrandgen [--phi_label=$l] [--max_length=$l] [--npath=$n] [--seed=$s] in.fst out.fst    
Trim Trim(&fst, phi_label) removes useless states and transitions in stochastic automata (irrespective of weights) Time, Space: O(V + E * max-phi-order)
  sfsttrim -phi_label=$l in.fst out.fst    


1Possible when the sum of weight of all successful paths from the initial state is finite (and the input is trim).

Michael Riley - 2018-07-16

Edit | Attach | Watch | Print version | History: r8 | r6 < r5 < r4 < r3 | Backlinks | Raw View | Raw edit | More topic actions...
Topic revision: r4 - 2019-07-18 - MichaelRiley
 
This site is powered by the TWiki collaboration platform Powered by PerlCopyright © 2008-2020 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback