
OpenGrm Thrax Forum


Recommended way to obtain FST+symbols for use

JosefNovak - 10 Jun 2013 - 09:46


I am currently using Thrax to extend some features of an alignment tool I wrote for my g2p system.

The basic idea is that the user can specify some alignment correspondence rules and optional default penalties, and then these can be incorporated into the EM training process.

At present I have more or less hacked the thraxcompiler command-line tool to read in the grammar and then return the desired FST+symbol table to the alignment program.

I force the use of a specific symbol table via .my_syms for each element in my grammar, then I grab the FstMap from the managerspec, extract the FST that I'm looking for, and associate the symbol table manually.

The downsides to this are that: a.) I have to add .my_syms to every single element in the grammar, and b.) I'm fairly sure that my approach to obtaining the FST resulting from the compilation is not the best way to be doing this.

Is there a way to cause my_syms to be used instead of 'bytes' by default?

Is there a recommended way to parse a grammar and return the FST object(s) for downstream use in a larger C++ program?

I went through the FAQ but did not notice any answers to these questions.

Thanks for your time.

JosefNovak - 10 Jun 2013 - 09:52

Also/alternatively, is there a way to grab the symbol table generated when 'bytes' is used, or do I need to build that myself based on the byte-integer values? It seems that if I use a mixture of bytes/utf8 then the .far archive symbolfst (first in the archive) contains a symbol table with the generated symbols, but I cannot seem to find the generated-by-default 'bytes' table (which also includes 'epsilon' in this case).
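If the table does have to be built by hand from the byte-integer values, a minimal sketch might look like the following. The helper names `MakeByteSymbols` and `ToSymbolText`, the `<epsilon>` name, and the `<0x..>` naming for non-printable bytes are all assumptions made for illustration; they are not taken from Thrax and may not match what it generates internally. The only convention relied on is that label 0 is reserved for epsilon in OpenFst.

```cpp
#include <cctype>
#include <cstdio>
#include <map>
#include <string>

// Hypothetical helper (not part of Thrax): maps each byte label 1..255 to a
// printable name, and label 0 to epsilon.
std::map<int, std::string> MakeByteSymbols() {
  std::map<int, std::string> symbols;
  symbols[0] = "<epsilon>";  // Label 0 is reserved for epsilon in OpenFst.
  for (int byte = 1; byte < 256; ++byte) {
    if (byte < 128 && std::isgraph(byte)) {
      // Visible ASCII characters are named by the character itself.
      symbols[byte] = std::string(1, static_cast<char>(byte));
    } else {
      // Assumed naming scheme for whitespace, control and high bytes.
      char name[8];
      std::snprintf(name, sizeof(name), "<0x%02x>", byte);
      symbols[byte] = name;
    }
  }
  return symbols;
}

// Renders the table as "symbol<TAB>label" lines, one entry per line, in
// increasing label order.
std::string ToSymbolText(const std::map<int, std::string>& symbols) {
  std::string text;
  for (const auto& entry : symbols) {
    text += entry.second + "\t" + std::to_string(entry.first) + "\n";
  }
  return text;
}
```

Writing the output of `ToSymbolText(MakeByteSymbols())` to a file should give a text symbol table in the usual one-pair-per-line layout, which could then be attached to the extracted FST in the same way as a hand-written .my_syms table.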


Need some help, New to "Thrax"

GoudjilKamel - 03 Jan 2013 - 17:29

Compiling under Ubuntu 12.04 LTS, I got the message below at link time:

libtool: link: g++ -g -O2 -o .libs/thraxcompiler compiler.o -L/usr/local/lib/fst -lm -ldl -lfst /usr/local/lib/fst/ ../lib/.libs/ -Wl,-rpath -Wl,/usr/local/lib/fst -Wl,-rpath -Wl,/usr/local/lib
../lib/.libs/ undefined reference to `fst::IsSTList(std::basic_string<char, std::char_traits, std::allocator > const&)'
../lib/.libs/ undefined reference to `fst::IsSTTable(std::basic_string<char, std::char_traits, std::allocator > const&)'
collect2: ld returned 1 exit status


Weight semiring

LauriLyly - 21 Nov 2012 - 00:34

So far I find thrax a very neat piece of software but I have two questions...

Can I somehow use the probability semiring for weights? It seems Thrax only allows specifying the log and tropical semirings. How about the other ones? Or should I somehow postprocess the generated far file?

Another question: I tried to use "fstdraw" on a far file, but got: ERROR: FstHeader::Read: Bad FST header: example.far

Is this a version mismatch?

LauriLyly - 29 Nov 2012 - 07:34

Sorry, obviously my bad, as it's a far and not an fst file. Still not too familiar. But the weight question still applies.

RichardSproat - 29 Nov 2012 - 10:07

Sorry, I missed the earlier comment -- for some reason I didn't get email about it.

Unfortunately the restriction to Log and Tropical is due to a similar restriction in the fst library: the real semiring does not come predefined. The best suggestion would be to use Tropical and then just do the obvious e^-cost conversion.
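The e^-cost conversion mentioned above can be sketched as follows. This is only an illustration operating on plain doubles; the function names `CostToProbability` and `ProbabilityToCost` are made up for the example and are not part of Thrax or the fst library.

```cpp
#include <cmath>

// Converts a tropical (or log) semiring cost, i.e. a negative log
// probability, back to a real-valued probability: p = e^(-cost).
double CostToProbability(double cost) {
  return std::exp(-cost);
}

// Inverse direction: cost = -ln(p). A probability of 0 maps to +infinity,
// which is the zero element of the tropical and log semirings.
double ProbabilityToCost(double probability) {
  return -std::log(probability);
}
```

One caveat: the tropical semiring combines alternative paths with min rather than by summing probabilities, so converting a tropical shortest-path cost gives the probability of the single best path only. If probabilities need to be summed over paths, the log semiring is the closer match.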


-- CyrilAllauzen - 13 Aug 2012

Topic revision: r6 - 2013-06-10 - JosefNovak