GRM Web>SFstLibrary>SFstAvailableOperations (2020-07-06, MichaelRiley)

OpenGrm SFst Available Operations

The following operations are provided for SFSTs. Care must be taken that the input FSTs meet the specified requirements (e.g. canonical, backoff-complete or normalized). The binary commands typically check their input requirements are satisfied or raise an error but the C++ versions may not check for efficiency (see the source code documentation for specific cases).

Operation	Usage	Description	Complexity
Approx	Approx(ifst, &backoff_fst, phi_label, delta)	approximates a normalized stochastic FST wrt a provided backoff-complete FST	same as ShortestDistance on the intersection of the input and output FSTs
	sfstapprox[--phi_label=$l][--delta=$d] in.fst backoff.fst out.fst
Count	Counter counter(phi_label, delta, &topfst) counter.Count(infst)	counts from stochastic FST wrt to a provided backoff-complete FST	same as ShortestDistance on the intersection of the input and output FSTs
	sfstcount [--phi_label=$l] in.fst top.fst out.fst
CountNormalize	CountNormalize(&fst)	normalizes a count FST (e.g. as output by Count())	Time: O(sE) where s is max iterations per state, Space: O(V)
	sfstnormalize -method={kl_min,summed} [--phi_label=$l] in.fst out.fst
GlobalNormalize	GlobalNormalize(&fst, phi_label, delta)	globally normalizes, when possible¹, a canonical weighted FST preserving total path weights (up to a global constant)	same as ShortestDistance
	sfstnormalize [--method=global] [--phi_label=$l][--delta=$d] in.fst out.fst
Info	sfstinfo [--phi_label=$l] in.fst	prints out information about a stochastic FST	Time, Space: O(V + E * max-phi-order)
Intersect	Intersect(ifst1, ifst2, &ofst, phi_label, trim)	intersects FSAs in the presence of failure transitions	Time²: O(E₁V₂(max-label-multiplicity₂ + max-phi-order₂ log(max-out-degree₂))
	sfstintersect [--trim] [--phi_label=$l] in1.fst in2.fst out.fst
IsCanonical	IsCanonical(fst, phi_label)	checks the second property here holds for a weighted FST	Time, Space: O(V + E)
IsNormalized	IsNormalized(fst, phi_label, delta)	checks the two properties here hold for a weighted FST	Time, Space: O(V + E)
LocalNormalize	LocalNormalize(&fst)	locally normalizes, when possible, a canonical weighted FST preserving each state's out-going arc weights up to a state-specific constant	Time, Space: O(V + E)
	sfstnormalize -method=local [--phi_label=$l] in.fst out.fst
NGramApprox	NGramApprox(ifst, &ofst, order, phi_label, delta)	approximates a normalized stochastic FST as an n-gram model (having `phi_labels` in OpenGrm NGram format)	same as ShortestDistance on the intersection of the input and output FSTs
	sfstngramapprox [--order=$o][--phi_label=$l][--delta=$d] in.fst out.fst
Perplexity	Perplexity perp(qfst, phi_label, unknown_label) [perp.SetTarget(pfst)] perp.GetPerplexity()	computes self/cross perplexity for stochastic FSTs	same as ShortestDistance on the intersection of the source and target FSTs
	sfstperplexity [--phi_label=$l] [--unknown_label=$u] q.fst [p.{fst,far}]	(p.far is in FST archive format)
PhiNormalize	PhiNormalize(&fst, phi_label)	normalizes, when possible, a canonical weighted FST by only modifying the failure transitions	Time, Space: O(V + E)
	sfstnormalize --method=phi [--phi_label=$l][--delta=$d] in.fst out.fst
RandGen	fst::RandGen(ifst, &ofst, fst::RandGenOptions<SFstArcSelector>(...))	randomly generates paths in a stochastic FST (correctly dealing with failure transitions)	see RandGen
	sfstrandgen [--phi_label=$l] [--max_length=$l] [--npath=$n] [--seed=$s] in.fst out.fst
ShortestDistance	ShortestDistance(fst, &distance, phi_label, reverse, delta)	computes the shortest distance in the presence of failure transitions	same as ShortestDistance
	sfstshorttestdistance [--phi_label=$l][--reversse][--delta=$d] in.fst
Topology	sfstopology [--method=ngram] [--phi_label=$l] in.fst out.fst	algorithms for constructing specific FST topologies	Time, Space: O(V + E)
Trim	Trim(&fst, phi_label)	removes useless states and transitions in stochastic automata (irrespective of weights)	Time, Space: O(V + E max-phi-order)*
	sfsttrim[- -phi_label=$l] in.fst out.fst

¹Possible when the sum of weight of all successful paths from the initial state is finite (and the input is trim

²Assumes for each state (s1, s2) in the output, the out-degree of state s1 in FST1 is less than state s2 in FST2; otherwise the term for that state's contribution swaps s1 and s2.

Topic revision: r8 - 2020-07-06 - MichaelRiley