TWiki
>
GRM Web
>
PyniniOperatorsDoc
(2021-05-28,
KyleGorman
)
(raw view)
E
dit
A
ttach
---+ Specialty operators This describes specialty FST functions for grammar compilation. ---++ =fst::CrossProduct= The _cross-product_ operation generates a transducer from two acceptors _A_, _B_. The resulting transducer represents a relation mapping from any string in _A_ to any string in _B_ (hence the name "cross-product"). In the case that _A_ is a transducer, it is implicitly reinterpreted as the acceptor given by its input projection; similarly, in the case that _B_ is a transducer, it is implicitly reintrepreted as the acceptor given by its output projection. The user may also specify a final weight; this is multiplied by the weights of all final states. The algorithm is as follows: 1 Replace all output labels of _A_ with epsilon. 1 Replace all input labels of _B_ with epsilon. 1 Compose (1) and (2). 1 If a final weight (other than semiring One, the default), is specified, multiple the weight of each final state by it. ---++ =fst::LenientlyCompose= The _lenient composition_ (Karttunen 1998) of two FSTs _A_, _B_ is simply the _priority union_ (to be defined) of (_A_ ∘ _B_) and _A_. The priority union of two FSTs _C_, _D_ is similar to the [[http://www.openfst.org/twiki/bin/view/FST/UnionDoc][union]] of _C_ and _D_ except that the relations in _C_ have "priority" over those in _D_. For example, let it be the case that: _C(a)_ → _b_ _D(a)_ → _c_ The ("vanilla") union of _C_, _D_ is then the relation _a_ → { _b, c_ }, whereas the priority union of _C_, _D_ is the narrower relation _a_ → _c_. We can implement this in Thrax as follows: <verbatim> func PriorityUnion[C, D, sigma_star] { input = Determinize[RmEpsilon[Project[C, 'input']]]; return C | ((sigma_star - input) @ D); }</verbatim> where sigma_star represents the closure over the alphabet. The lenient composition of FSTs _A_, _B_ is simply the priority union of (_A_ ∘ _B_), _A_. We can implement this in Thrax as follows: <verbatim> func LenientlyCompose[A, B, sigma_start] { return PriorityUnion[A @ B, A, sigma_star]; }</verbatim> where sigma_star represents the closure over the alphabet. ---++ =fst::ConcatRange= The _range-based concatenation operation_ is a generalization of [[http://www.openfst.org/twiki/bin/view/FST/ClosureDoc][closure]]. A path through the closure of an FST _A_ is valid if it can be generated by taking zero or more paths through _A_. The first argument to fst::ConcatRange represents a lower bound on the number of paths through the input FST required to form a valid path in the output FST. The second argument represents the upper bound on the number of paths through the input FST; by convention, 0 is used to indicate an infinite upper bound. Naturally, the lower bound must be less than or equal to the upper bound. We can then derive the following familiar operations as special cases: * To compute "star"-closure, use arguments 0, 0. * To compute "plus"-closure, use arguments 1, 0. * To generate exactly _k_ concatenations of an FST, use arguments k, k. ---++ References Karttunen, L. 1998. The proper treatment of Optimality Theory in computational phonology. In _Finite State Methods in Natural Language Processing_, pages 1-12.
E
dit
|
A
ttach
|
Watch
|
P
rint version
|
H
istory
: r8
<
r7
<
r6
<
r5
<
r4
|
B
acklinks
|
V
iew topic
|
WYSIWYG
|
M
ore topic actions
Topic revision: r8 - 2021-05-28
-
KyleGorman
GRM
Log In
or
Register
GRM Web
Create New Topic
Index
Search
Changes
Notifications
Statistics
Preferences
Webs
Contrib
FST
Forum
GRM
Kernel
Main
Sandbox
TWiki
Main
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki?
Send feedback