Hi,
I am currently using Thrax to extend some features of an alignment tool I wrote for my g2p system.
The basic idea is that the user can specify some alignment correspondence rules and optional default penalties, and then these can be incorporated into the EM training process.
At present I have kind of hacked the functionality of the thraxcompiler command-line tool to read in the grammar and then return the desired FST plus symbol table to the alignment program.
I force the use of a specific symbol table via .my_syms for each element in my grammar, then I grab the FstMap from the managerspec, extract the FST that I'm looking for, and associate the symbol table manually.
The downsides to this are that: a.) I have to add .my_syms to every single element in the grammar, and b.) I'm fairly sure that my approach to obtaining the FST resulting from the compilation is not the best way to be doing this.
Is there a way to cause my_syms to be used instead of 'bytes' by default?
Is there a recommended way to parse a grammar and return the resulting FST object(s) for downstream use in a larger C++ program?
I went through the FAQ but did not notice any answers to these questions.
Thanks for your time.
Also/alternatively, is there a way to grab the symbol table generated when 'bytes' is used, or do I need to build that myself based on the byte-integer values? It seems that if I use a mixture of bytes/utf8, then the .far archive symbolfst (first in the archive) contains a symbol table with the generated symbols, but I cannot seem to find the generated-by-default 'bytes' table (which also includes 'epsilon' in this case).
So far I find thrax a very neat piece of software but I have two questions...
Can I somehow use the probability semiring for weights? It seems Thrax only allows specifying the log and tropical semirings. How about the other ones? Or should I postprocess the generated far file somehow?
Another question: I tried to use "fstdraw" on a far file, but got: ERROR: FstHeader::Read: Bad FST header: example.far
Is this a version mismatch?
Sorry, I missed the earlier comment -- for some reason I didn't get email about it.
Unfortunately, the restriction to Log and Tropical is due to a similar restriction in the fst library: the real (probability) semiring does not come predefined. The best suggestion would be to use Tropical and then convert each weight afterwards with the obvious p = e^(-cost) conversion.
-- CyrilAllauzen - 13 Aug 2012
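For readers who want to see the e^(-cost) conversion from the reply above spelled out, here is a minimal sketch in plain C++. It assumes the weights are negative natural-log probabilities. The function names `CostToProbability` and `ProbabilityToCost` are hypothetical helpers made up for this illustration, not part of Thrax or the fst library.

```cpp
#include <cmath>

// Hypothetical helper (not a Thrax or fst library function).
// Converts a cost to a probability, assuming cost = -ln(p).
// An infinite cost maps to probability 0.
inline double CostToProbability(double cost) {
  return std::exp(-cost);
}

// Hypothetical inverse helper: probability back to a cost, cost = -ln(p).
// A probability of 0 yields +infinity.
inline double ProbabilityToCost(double probability) {
  return -std::log(probability);
}
```

This only converts individual weight values. How the values are read out of the compiled FSTs, and whether converting path totals is meaningful for a given grammar, is left open here.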
Topic revision: r6 - 2013-06-10 - JosefNovak