OpenFst Forum 2013 Archive

Infinity weight in compiled fst

HuanliangWang - 30 Dec 2013 - 23:45

I get Infinity weight in final state when I compile a text fsm by the command "fstcompile --arc_type=standard --fst_type=const infsm | fstprint > outfsm". The file "infsm" is very large, about 20G. I find a line "3 Infinity" in the file "outfsm". Is this an error? How can I solve it?

Log In

Why do I get Infinity weight in final state?

HuanliangWang - 30 Dec 2013 - 23:36

I get a Infinity weight in final state when I compile a text fst into binary format by the command "fstcompile --arc_type=standard --fst_type=const infsm > outfst". The "infsm" file is large fsm. Then I convert the "outfst" into text format by command "fstprint". I find there is a line "3 Infinity" in the converted fsm file. Is the case a error? How can I solve it?

PaulDixon - 31 Dec 2013 - 00:41

What does fstinfo report? Sounds like you have coaccessible states (states not reachable from a final). If the machine is connected the number of accessible and coaccessible states reported by fstinfo should be the same. You can remove the dead end states with fstconnect.

HuanliangWang - 01 Jan 2014 - 21:45

Hi, Paul, I get the fst information by "fstinfo". It is as following: fst type const arc type standard input symbol table none output symbol table none # of states 425041220 # of arcs 1230392842 initial state 0 # of final states 0 # of input/output epsilons 425041217 # of input epsilons 425041217 # of output epsilons 425041217 # of accessible states 425041220 # of coaccessible states 0 # of connected states 0 # of connected components 1 # of strongly conn components 43672597

The number of accessible and coaccessible states is not same.

HuanliangWang - 01 Jan 2014 - 21:51

sorry! It is like as: fst type const

arc type standard

input symbol table none

output symbol table none

# of states 425041220

# of arcs 1230392842

initial state 0

# of final states 0

# of input/output epsilons 425041217

# of input epsilons 425041217

# of output epsilons 425041217

# of accessible states 425041220

# of coaccessible states 0

# of connected states 0

# of connected components 1

# of strongly conn components 43672597

PaulDixon - 02 Jan 2014 - 01:50

Your machine doesn't have final states at at all and is probabaly useless. You could try generating a tiny FST with the same procedure and viewing it using fstdraw to check the structure is what you expect.

HuanliangWang - 06 Jan 2014 - 06:50

thank you!

I find there are two end states in input fsm "infsm". It is unexpected. I will check the error firstly.

Log In

error in ubuntu 13.10

YanWang - 19 Dec 2013 - 22:02

../script/.libs/ undefined reference to `dlopen' ../script/.libs/ undefined reference to `dlerror' collect2: ld returned 1 exit status

the openfst-1.3.4 compile error in ubuntu.How can I solve it?

PaulDixon - 20 Dec 2013 - 04:44

Seems the -ldl linker switch is missing or need the order changing. This has been discussed in earlier posts.

FalouuFalouu - 20 Jan 2014 - 06:10

I found a solution (Ubuntu 13.10): use gcc-4.4 and g++-4.4 instead of 4.8.
1. sudo apt-get install gcc-4.4 g++-4.4
2. Temporarily link gcc to gcc-4.4 (and g++ ...) or replace all occurences of gcc and g++ in Makefiles to 4.4 version. Maybe it is not all necessary, but I did it and it worked.

DavidLandan - 22 Jan 2014 - 15:26

I'm seeing the error as well, and various orderings of the -lm & -ldl tags don't seem to change things. Unfortunately, while using gcc/g++ v4.4 is a workaround, you may run into issues later on with other tools that make use of the openfst library (as I discovered).

My src/bin/ only has one line with "LD" in it: LDADD = ../script/ ../lib/ -lm -ldl

putting the -lm & -ldl switches in the middle or end doesn't matter:

libtool: link: g++ -O2 -o .libs/fstarcsort fstarcsort.o -lm -ldl ../script/.libs/ ../lib/.libs/ ../script/.libs/ undefined reference to `dlopen' ../script/.libs/ undefined reference to `dlerror' collect2: error: ld returned 1 exit status

libtool: link: g++ -O2 -o .libs/fstarcsort fstarcsort.o ../script/.libs/ ../lib/.libs/ -lm -ldl ../script/.libs/ undefined reference to `dlopen' ../script/.libs/ undefined reference to `dlerror' collect2: error: ld returned 1 exit status

Has anyone else found a way to work through this using gcc/++ tools of v4.7 or later?

YendaTrmal - 31 Jan 2014 - 19:35

Ok, I was able to figure it out, perhaps... Not really sure if that is the correct solution either. I don't really have experience with autotools and such, so I had to mess with everything and I'm not sure in which sequence the things have to done, but everything boils down to changing the files containing the LDADD to -LDADD = ../script/ ../lib/ -lm -ldl +LDADD = ../script/ ../lib/ +AM_LDFLAGS = -Wl,-no-as-needed -lm -ldl

YendaTrmal - 31 Jan 2014 - 19:36

Sorry, wrong formatting

-LDADD = ../script/ ../lib/ -lm -ldl

+LDADD = ../script/ ../lib/

+AM_LDFLAGS = -Wl,-no-as-needed -lm -ldl

StevenBedrick - 2014-02-16 - 16:12

First, I think there's a slight formatting bug above- from what I can tell, "no-as-needed" is a two-dash argument (i.e., it should be "-Wl,--no-as-needed" (note the two dashes before no-as-needed)).

Second, editing the relevant files and making this change does in fact result in a successful compilation on modern Ubuntu/GCC combinations, but the resulting binaries ("fstinfo", etc.) still aren't being linked properly against libfst and libfstscript. So, the result of running "make install" is a bunch of broken binaries.

I've poked at it for a while and not had any luck fixing the problem; somebody with stronger automaker-fu than myself might have more luck.

StevenBedrick - 2014-02-17 - 16:01

OK, never mind- I was correct that the issue was with my automake-fu. The problem was twofold: first, I wasn't actually telling automake to regenerate the files from the modified files, and second, once I figured out that that's why my changes to the files weren't having an effect, I next discovered that modern versions of Ubuntu come with a different version of automake/autoconf than the OpenFST distribution was created with. So, to get OpenFST built and installed correctly, here's what you'll need to do:

First you have to modify the relevant files as described above by YendaTrmal (taking into account the typo I identified in my previous comment vis-a-vis double-dashed arguments).

Second, you need to re-generate all of the automake-related files. I did this by running "autoreconf --force --install", which might be more than was needed- see my earlier comments about my lack of automake-fu. Before you can do this, however, you'll need to modify in the top-level distribution directory; the AM_INIT_AUTOMAKE directive has a flag that treats all warnings as errors, and the differences in the automake/autoconf versions causes several warnings that will prevent autoreconf from running correctly. The "solution" was to get rid of the "-Werror" flag in the AM_INIT_AUTOMAKE directive. None of the errors seemed particularly dire, but take that observation with a grain of salt (see earlier comments about my lack of automake-fu). Removing "-Werror" caused all the relevant files to be properly regenerated, with the corrected LDADD/AM_LDFLAGS variables.

Third, "./configure"/"make"/"make install" as normal.

That seems to do the trick- the resulting binaries are all properly linked now.

StevenBedrick - 2014-02-17 - 19:55

One more update- apologies to YendaTrmal, "no-as-needed" seems to work just fine as a single-dashed argument. I must have had some other typo going on when I was having trouble with it earlier. disregard that part of my comment.

MathieuMorey - 2014-02-25 - 11:53

The solutions above did not really work for me, but they were a great starting point (thanks a lot, YendaTrmal and StevenBedrick !). I managed to get openFST to compile, install and work as a shared library on Ubuntu 13.10 using the following steps:

1. in "src/lib/", append: libfst_la_LIBADD = -ldl

2. in "", replace: AM_INIT_AUTOMAKE([foreign nostdinc -Wall -Werror]) with: AM_INIT_AUTOMAKE([foreign nostdinc -Wall -Wno-extra-portability -Werror])"

3. autoreconf --force --install

then the usual ./configure && make && make install .

You can check it works by running "make check".

Log In

fstequal command

LahiruSamarakoon - 19 Dec 2013 - 07:30

Hi all, What is the current status of the "fstequal" command? Is it working properly. My attempts to use it was unsuccessful. Why this command is not listed in the website?


PaulDixon - 20 Dec 2013 - 04:34

It goes through two machines and does a state by state and and an arc comparison, setting --v to 1 should give detailed output. There seems to be typo in line 97 of the code

FalouuFalouu - 20 Jan 2014 - 06:09

I found a solution (Ubuntu 13.10): use gcc-4.4 and g++-4.4 instead of 4.8. 1. sudo apt-get install gcc-4.4 g++-4.4 2. Temporarily link gcc to gcc-4.4 (and g++ ...) or replace all occurences of gcc and g++ in Makefiles to 4.4 version. Maybe it is not all necessary, but I did it and it worked.

FalouuFalouu - 20 Jan 2014 - 06:12

Sorry, i accidentally put response in wrong place. Please delete.
Log In

Visual Studio OpenFST 1.3.4

RolandSchwarz - 02 Dec 2013 - 05:03

Does anyone have a working solution for compiling OpenFST 1.3.4 on Visual Studio? I found this project here

but the latest version there is 1.3.1.

Any help would be appreciated.



Log In

Compiling errors on Mac Mavericks 10.9

CesineC - 22 Nov 2013 - 11:23

Has anyone gotten OpenFST to compile on OS 10.9? Here are the errors I'm getting:

$ ./configure --prefix=$OPENFST_HOME --enable-far --enable-pdt && 
   make clean
   make &&
   make install || {
      echo "Error building openfst.";
      exit 1;

In file included from
In file included from ./../include/fst/fst.h:34:
In file included from ./../include/fst/arc.h:28:
In file included from ./../include/fst/expectation-weight.h:36:
In file included from ./../include/fst/pair-weight.h:29:
In file included from ./../include/fst/weight.h:82:
./../include/fst/util.h:24:10: fatal error: 'tr1/unordered_map' file not found
#include <tr1/unordered_map>
1 error generated.
make[3]: *** [fst.lo] Error 1
make[2]: *** [all-recursive] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2
Error building openfst.

$ g++ --version
Configured with: --prefix=/Applications/ --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn)
Target: x86_64-apple-darwin13.0.2
Thread model: posix

KyleGorman - 22 Nov 2013 - 13:25

I have.

My understanding is that tr1 isn't part of any standard, and Apple moved it to a place where autoconf isn't looking for it. You need to set CPPFLAGS to point to that header and LIBS to point to that shared library. See:

CesineC - 26 Nov 2013 - 14:15

Hi Kyle, I actually found your brew example which helped me get it to work, thats why i posted on your PR for brew also so that maybe they would accept that the current brew formula wasn't working, and that we might have to use libstdc++...

I was hoping that we weren't the only ones using mavericks, but it looks like we are.

For those who run into this later, here is some bash which (so far) appears to work well on mavericks:

./configure --prefix=$OPENFST_HOME  --enable-compact-fsts --enable-const-fsts --enable-far=yes --enable-lookahead-fsts --enable-pdt CXX="clang++ -std=c++11 -stdlib=libstdc++" && 
    make clean
    make &&
    make install || {
        echo "Error building openfst.";
        exit 1;
Log In

Availability of linear classifier/tagger FST extension

DavidHugginsDaines - 19 Nov 2013 - 13:25

I notice that there is a section on "Linear FSTs" in the FstExtensions page, however in the 1.3.4 release the referenced header files and fstlinear program aren't present. Is this available elsewhere, or likely to be released soon?


Log In

How to compile on 64 bit machine

ErrorNotFound - 22 Oct 2013 - 22:34

There seems to be only one OpenFST package for download. In other places I have seen separate packages for 32 bit and 64 bit. If I have a 64 bit machine, how do I install OpenFST there?

Thanks a lot.

Log In

kPosInfinity in old version of fst

ScottLawton - 11 Oct 2013 - 19:32

I'm trying to compile some code ( that apparently depends on an older (unstated) version of fst. It references kPosInfinity:

return LogLatticeWeight(fst::FloatLimits::kPosInfinity, fst::FloatLimits::kPosInfinity);

But, openfst-1.3.3/src/include/fst/float-weight.h just defines PosInfinity (no 'k' prefix).

I tried a superficial fix: deleting the 'k' in srtk -- but that yields "no matching function for call to ‘srtk::LogLatticeWeight::LogLatticeWeight(const float (&)(), const float (&)())’".

I'm a Python guy not a C++ guy ... any suggestions? I assume there's a way to cast it to (float, float), or a wrapper of some sort.


PaulDixon - 12 Oct 2013 - 01:58

You need to add the template parameter and parentheses


Log In

Probably, bug: arc-map.h, ArcMap(), case MAP_REQUIRE_SUPERFINAL

NickolayMerkin - 29 Aug 2013 - 05:01

I've done a static analysis of source code with PVS-Studio, and it has reported a probable bug.

V640 The code's operational logic does not correspond with its formatting. The statement is indented to the right, but it is always executed. It is possible that curly brackets are missing. arc-map.h 178

if (final_arc.ilabel = 0 || final_arc.olabel = 0 || final_arc.weight = Weight::Zero()) fst->AddArc(s, A(final_arc.ilabel, final_arc.olabel, final_arc.weight, superfinal)); fst->SetFinal(s, Weight::Zero());

fst-AddArc is conditional (then-branch of the preceded if) fst-SetFinal is unconditional, though it is indented as if it is conditional too.

Log In

Optimization issue: functions Times and Divide on logartithmic and tropical weights

NickolayMerkin - 29 Aug 2013 - 04:54

Hello, I've run my application (LV CSR decoder) that uses OTF WFST composition under the AMD profiler and found that much time is spent in two functions: inline TropicalWeightTpl Times and Divide. They do costy checks for NaN and +/-Inf before the arithmetic operation (+ for times, - for divide). But it's obvious that nan+anything is nan, inf+anything is inf, and inf-inf is nan. Thus, these pre-checks are redundant.

When I've eliminated these checks, the speed of my application has been increased for 5%.

PaulDixon - 29 Aug 2013 - 22:32

Just out of interest are using the TropicalWeight type every where in your application? Or just as part of the composition and ArcIterator interface and then float and + operator everywhere else? Thanks.

AlexZatv - 11 Sep 2013 - 15:15

It's almost everywhere in application. Looks like most decoders use log scores, but we with Nickolay use weights.
Log In

Which optimization is applied to OpenFst 1.3.3 ?

JihoonKim - 18 Aug 2013 - 21:28


OpenFst 1.3.3 is much faster than OpenFst 1.3.1. While fstdeterminize of 1.3.1 takes about 1 hour, that of 1.3.3 takes only less than 1 minute! How can 1.3.3 be this faster? Could anyone explain the reason?


AlexZatv - 19 Aug 2013 - 09:56

Hello! Are you talking about Windows version, or linux? There was some problems in windows standard library (std::hash_map is too slow,boost::hash_map is much faster)

JihoonKim - 20 Aug 2013 - 04:16

Yes, I meant Windows version. It's because of windows standard library ! Thanks for the quick response.
Log In

Can I get OpenFst 1.3.3 Visual Studio Project

JihoonKim - 15 Aug 2013 - 04:47


I have to build OpenFst in Windows. In the download page, Visual Studio project is just for 1.1 and 1.3.1. It seems that latest version 1.3.3 is much faster than 1.3.1 so I need 1.3.3 to be built in Windows.

Is there anyone who has 1.3.3 for windows ?

PaulDixon - 15 Aug 2013 - 19:12

PaulDixon - 15 Aug 2013 - 19:14

The memory map FSTs compile but aren't tested. The ngram FSTs aren't finished that why I never uploaded to the contrib section. Also includes VS2012 solutions.

JihoonKim - 16 Aug 2013 - 01:15

Thanks for your quick response. It'd be great help for me. wink
Log In

Merging symbol tables

PrashantMathur - 26 Jul 2013 - 10:28


I am a newbie to openfst, so this can sound stupid.

Is there a function which can merge two SymbolTables? I was using A->AddTable(SymbolTable* B) , where I want to merge B in A. function to merge two symbol tables. But I believe AddTable function just appends table B with A and starts indexing with the maximum key. ( I don't know if I am correct).

A 0 a 1 b 2 c 3

B 0 d 1 a 2 b 3

C - Resulting Table 0 a 1 b 2 c 3 d 4

My code needs to do this operation several times, so I thought I better ask first.


PrashantMathur - 26 Jul 2013 - 10:30

sorry the formatting didn't turn out as I planned to .. let me try again A 0 a 1 b 2 c 3

B 0 d 1 a 2 b 3

C 0 a 1 b 2 c 3 d 4

PrashantMathur - 26 Jul 2013 - 11:10

Is there a function which can merge two SymbolTables?
I was using A->AddTable(SymbolTable* B) , where I want to merge B in A.
But I believe AddTable function just appends table B with A and starts indexing with the maximum key. ( I don't know if I am correct).
eps 0
a 1
b 2
c 3

eps 0
d 1
a 2
b 3

C - Resulting Table
eps 0
a 1
b 2
c 3
d 4

My code needs to do this operation several times, so I thought I better ask first.

PaulDixon - 27 Jul 2013 - 05:15

If table A has holes, you can use CompactSymbols after merging the table to give a contiguous table. MergeSymbolTable will merge symbol tables and has optimizations for certain cases. Both are in symbol-table-ops.h

Log In

I/O error for large FSTs

SamS - 08 Jul 2013 - 14:49

I'm working on an application where I'd like to store a large FST in binary format, but I'm getting read/write errors. I'm convinced my large FST is well formed and that there might be a problem with big read/write operations. Here's a toy example that produces the same error:

   SymbolTable * syms = new SymbolTable;

   StdVectorFst res;
   for (int s = 0; s < big_number; s++) {
        TropicalWeight final = -log(0.5);
        res.SetFinal(s, final);
        for (int dest = 0; dest < big_number; dest++) {
            res.AddArc(s, StdArc(1, 1, TropicalWeight(-log(0.2)), dest));


    StdVectorFst* loaded = StdVectorFst::Read("badfile.fst");

    return 0;

For me, the big number seems to be too big at around 923. The printed error message is:

ERROR: VectorFst::Read: read failed: badfile.fst

Any insight into what's going on and how to get around this issue would be appreciated. Thanks!

SamS - 08 Jul 2013 - 16:39

And this turned out to be a problem with the local file system. Sorry!

MikySu - 12 Jul 2013 - 12:54

I also had the same error message. i command "fstcompile --arc_type=log --fst_type=const GramFst > GramFst.out.bfsm",but the fst type of GramFst.out.bfsm didn't transform from vector to const. could someone give me some advice?

MikySu - 12 Jul 2013 - 13:16

my file GramFst.out.bfsm and LexFst.out.bfsm both of fst type were vector. i commanded "fstcompose LexFst.out.bfsm GramFst.out.bfsm >a1.bfsm". i checked the information with fstinfo and then the error message came out "ERROR: VectorFst::Read: read failed:a1.bfsm". file GramFst.out.bfsm and LexFst.out.bfsm were fine.

PaulDixon - 13 Jul 2013 - 04:23

What is the format of GramFst? Hoe many states and arcs do the bfsm machine have? Did this work for you fstconvert --fst_type=const < some.vector.fst > new.const.fst
Log In

Making an FST with C++: How to define strings as labels?

DennisRitter - 19 Jun 2013 - 10:30

In the shell you have the possibility to set up strings as ilabels and olabels (with isymbols.txt and osymbols.txt) But when I followed the QuickTour and wanted to do it all in the C++-Editor I couldn't find out, how to give the strings to an vectorfst. I can only give them integers as labels, because of the definition of stdarc.

Would be glad if someone could point out that to me.

PS Is it even effective to make it all in the C++-editor (I am using Eclipse)?

PaulDixon - 20 Jun 2013 - 02:11

Use the SymbolTable to assign/get integer ids for the string labels

DennisRitter - 20 Jun 2013 - 10:26

Is that the way you meant it?
SymbolTable* isyms;
   isyms = new SymbolTable("isyms.txt");
   SymbolTable* osyms;
   osyms = new SymbolTable("osyms.txt");




   // Adds state 0 to the initially empty FST and make it the start state.
   fst.AddState(); // 1st state will be state 0 (returned by AddState)
   fst.SetStart(0); // arg is state ID

   // Adds two arcs exiting state 0.
   // Arc constructor args: ilabel, olabel, weight, dest state ID.
   fst.AddArc(0, StdArc(isyms->Find("a"), osyms->Find("x"), 0.5, 1)); // 1st arg is src state ID
   fst.AddArc(0, StdArc(isyms->Find("b"), osyms->Find("y"), 1.5, 1));

   // Adds state 1 and its arc.
   fst.AddArc(1, StdArc(isyms->Find("c"), osyms->Find("z"), 2.5, 2));

   // Adds state 2 and set its final weight.
   fst.SetFinal(2, 3.5); // 1st arg is state ID, 2nd arg weight


PaulDixon - 20 Jun 2013 - 19:59

Yes, by convention symbol 0 is the for the epsilon label so call oyms->AddSymbol("\<eps\>"); first and make sure the return value is 0.

AddSymbol returns the id if the key already exists. So you can remove all the AddSymbol lines (except for the epsilon) and do this instead

fst.AddArc(0, StdArc(isyms->AddSymbol("a"), osyms- >AddSymbol("x"), 0.5, 1));

From memory in this case I think you would need to add the Symbol tables after adding all the arcs. Internally I think it will make a copy when you use SetInputSymbols method so might also need to delete the SymbolTables also.

DennisRitter - 21 Jun 2013 - 05:01

Thanks a lot Paul!

PS You now helped me the third time, first time as Edobashira on your old blog :). I was DreeDrunk at that time. Shall I mark my thread with something like a SOLVED-tag?

PaulDixon - 23 Jun 2013 - 16:50

Glad I could help. Not sure if the thread should be marked as solved.
Log In

GenericRegister::GetEntry : lookup failed in shared object

MarkusD - 12 May 2013 - 20:01

I'm trying to register my expectation arc, so that it can be used in fstcompile, etc.

But I'm getting the runtime error:

ERROR: GenericRegister::GetEntry : lookup failed in shared object: FATAL: No operation found for "CompileFst" on arc type expectation

How can I fix this?

The register call in my code looks like this (my arc is called MDExpectationArc):

#include <fst/const-fst.h> #include <fst/edit-fst.h> #include <fst/vector-fst.h> #include <fst/script/register.h> #include <fst/script/fstscript.h> namespace fst { namespace script { REGISTER_FST(VectorFst, MDExpectationArc); REGISTER_FST(ConstFst, MDExpectationArc); REGISTER_FST(EditFst, MDExpectationArc); REGISTER_FST_CLASSES(MDExpectationArc); REGISTER_FST_OPERATIONS(MDExpectationArc); }}

Here is how I compile: /usr/bin/c++ -fPIC -O2 -shared -o /c01_data/mdreyer/software/openfst-1.3.3/lib/

or this, gives the same error: /usr/bin/c++ -fPIC -O2 -shared -Wl,-soname, -o openfst-1.3.3/lib/ -ldl -Wl,-rpath,openfst-1.3.3/lib

It used to work for me with an old openfst version, but I'm getting this error now with openfst 1.3.3.

MarkusD - 12 May 2013 - 20:07

For clarity, I'm getting the runtime error when I run this, for example:

fstcompile --arc-type=expectation

where it loads the file that I created using the c++ commands above.

PaulDixon - 13 May 2013 - 08:10

Does the underlying weight type of the MDExpectationArc return the string "expectation" from the Type() method? I think the Type() method returns value need to match the name of shared object.
Log In

how to use fstreplace tool

HuanliangWang - 09 May 2013 - 21:54

Hi, I want to use fstreplace tool to generate a extended fst by replacing a nonternimal label in root fst by a subfst. But I got error when using the tool. I know it is necessary to preprocess two fsts, such as symbol table. But I don't know the detail. Could anybody tell me the detailed usage procedure?


Huanliang Wang

Log In

Regex to FST

RaihanRasool - 07 May 2013 - 12:56


Using OpenFst library we can create FST for a given regular expression. But is there any way I could develop a generic RegEx to FST converter that takes 'any' regular expression and build FST ?

PaulDixon - 09 May 2013 - 00:09

Maybe the answer to this question is what you are looking for

MarkusD - 12 May 2013 - 19:55

Also, I think Thrax does something like that:

KennethRBeesley - 2017-01-26 - 22:58

I would also suggest taking a look at the Kleene language:
Log In

Coordinate Descent Solution for Rational Kernels

AbielRoche - 01 May 2013 - 13:59

Is there any available implementation of Coordinate Descent Solution for SVM using Rational Kernels? As it is mentioned in the paper: "Large-Scale Training of SVMs with Automata Kernels". Thank you for your help,

Log In

“FATAL: StringWeight::Plus: unequal arguments” when running fstdeterminize.

JiaP - 31 Mar 2013 - 21:51

The transducer in question simply describes a flat list dictionary without any explicit weight. It is very similar to the one given in the “Tokenization” example, except being much larger (about 664,000 states).

When I ran fstdeterminize on this transducer, the command line tool exited with following error message: FATAL: StringWeight::Plus: unequal arguments (non-functional FST?) w1 = 1 w2 = 2

I will try to construct such transducer directly using C++ API. Mean while, I’d appreciate any suggestion.


PaulDixon - 01 Apr 2013 - 01:59

Are you adding disambiguation symbols like the ones described for the lexicon construction in this paper

EricLebigot - 03 Apr 2013 - 22:37

@PaulDixon: I had a look at the paper you link to. As far as I understand, for determinization to be possible, each input string should only produce a single output string, right?
Log In

Composition result too large: how to make smaller?

EricLebigot - 25 Mar 2013 - 22:27


I am trying to calculate a kernel of the form T∘T^-1 from a transducer T, but the calculation uses up a lot of memory and I have to stop it before it converges (I stopped at 25 GB of swap). What could I try to do in order to calculate the composition result? (My ultimate goal is to calculate a kernel between acceptors A and B through the A∘T∘T^-1∘B composition and getting the (log) total weight between the starting and final states. A difficulty is that A∘T can produce a huge number of paths, and I was hoping that calculating T∘T^-1 could help solve this difficulty.)

I tried to first optimize transducer T, but this failed (a first epsilon removal yields a fatal error about StringWeight::Plus being given unequal arguments, as does determinization).

Here are some characteristics of transducer T, if they matter:

- About 15 nodes and 200k arcs.

- Accommodates input strings of any size through the use of internal loops.

If necessary, I can remove all loops, or some loops (because I know in advance how big the input strings can be); this would multiply the number of nodes and arcs by maybe 100 or 600 (depending on which loops I remove). Could removing loops significantly with the "too much memory used for composition" problem?

I could also limit the number of arcs, but this would make the transducer less expressive, so I would like to avoid this. However, if this is the best way of speeding up the calculation of T∘T^-1 and limiting its memory requirement, I am ready to do that.

What would you suggest, so that T∘T^-1 can really be calculated?

PaulDixon - 26 Mar 2013 - 05:35

In the transducers do you have to traverse many epsilon transitions before reaching a label? This can delay the matching and lead to the generation of many dead end states. These states are normally removed after composition.

I don't think epsilon removal uses the StringWeight are your weights in the log/tropical semiring? Another problem could be negative weighted epsilon cycles.

EricLebigot - 26 Mar 2013 - 10:02

(I'm not fully familiar with FST, so please let me know if I'm not clear.)

Transducer T can produce many epsilons (in about half of the 200k arcs) before reaching a final state (and T^-1 therefore can consume many epsilons). This produces a huge number of possible outputs, and I was hoping that OpenFST could somehow "magically" calculate T∘T^-1 despite this. This epsilon production cannot be removed, because it is one of the main purposes of transducer T to remove parts of the input strings.

The weights are in the log semiring. There are cycles that produce epsilon (in T), with a weight of LogWeight(0.69…) (corresponding to 0.5 in real space).

With these precisions, what do you think is the best way to speed up the composition? reducing the number of arcs (I can do this by reducing the number of recognized "letters")?

PaulDixon - 01 Apr 2013 - 07:10

Do have an example transducer you can share online?

EricLebigot - 01 Apr 2013 - 23:56

@PaulDixon: I put a simplified version at The real version has many more arcs (20k arcs where there are ~40 between two states in the simplified version, 1k arcs where there are ~4 arcs). Note: I will have a look at the disambiguation paper that you mentioned yesterday: maybe determinization will speed up the calculations?

DoganCan - 02 Apr 2013 - 06:09

Hi Eric,

The problem is that transducer T can produce infinitely many consecutive epsilons (when it is not restricted by input A) and similarly transducer T^-1 can consume infinitely many consecutive epsilons (when it is not restricted by output B). Naturally, composition does not terminate. You will have to avoid "all" epsilon loops on the output side of T if you want to compute ToT^-1, i.e. limiting the number of output-epsilon arcs won't help. Unfortunately there is no such magic hidden in openfst smile

Cheers, Dogan

EricLebigot - 02 Apr 2013 - 09:32

@DoganCan: Thank you for your input: very interesting! I can certainly remove infinite epsilon-generating loops (thanks to the structure of the input strings). However, a strange thing is that T∘T^-1 does terminate, with OpenFST (when I restrict the number of arcs, like in the simplified version—the change is only quantitative, not qualitative). Does this mean that the composition obtained by OpenFST is likely incorrect? I am a little confused.

EricLebigot - 02 Apr 2013 - 09:46

PS: I put the transducer from the PDF above in files transducer.fst, transducer.ssym (state symbols) and transducer.sym (labels) at, if this can help. OpenFST happily calculates T∘T^-1 after arc-sorting the outputs of T.

DoganCan - 03 Apr 2013 - 04:19

Hi Eric,

It looks like my initial assessment was incorrect. I thought composition would somehow force enumeration of all possible matches between the epsilon cycles on the output side of T and the epsilon cycles on the input of T^-1. Instead, what happens is composition simply creates paths which first generate possibly infinitely many epsilons and then consume possibly infinitely many epsilons. The composition output for the small transducer is correct as far as I can tell.

I think the blow-up in composition is caused by the self-loops on states A_CAT_V_KEPT and B_CAT_V_KEPT. More explicitly, since T maps each input symbol on these arcs to "Wild" and T^-1 maps "Wild" back to these symbols, composition output has a mapping between each input symbol pair. If the big transducer is following the same pattern, there will be around 10K x 10K = 100M self loops in the output of composition for each of these two states.

Cheers, Dogan

EricLebigot - 03 Apr 2013 - 05:31

Hi Dogan,

Thank you for following up! Your remarks are very useful. smile The structure of the input data is such that the Wild labels can be specialized, so that the number of possible matches is reduced (PCP_123:Wild is now PCP_123:PCP_Wild, etc., so that only similar categories match through Wild). The result is that calculating kernel elements is much faster, thank you!

Maybe will the calculation of T∘T^-1 require less memory, too…

Best, EOL

Log In

Double delete bug in fst far reader

DoganCan - 05 Mar 2013 - 04:14


I came across a double delete bug in the implementation of FstFarReader class. ReadFst method deletes the current fst before reading a new one, however if there is no more fst to read then the current fst pointer is left dangling. That leads to a segfault when the class destructor is called. You might trigger the error with

farequal fst1 fst2

I fixed the problem by moving the pointer deletion after the check for whether there is more fst to read.

This error is not triggered by other far utilities reading from an fst archive (farprintstrings, farinfo and farextract) since these tools, unlike farequal, do not delete the far reader pointer after they are done with it. It might be a good idea to fix that as well.

Cheers, Dogan

Log In

Get the Arcs whose nextstate is equal to a given q

JuanMiguelCejuela - 04 Mar 2013 - 08:31

Is it possible to efficiently get all the arcs which have as nextstate a given state q?

DoganCan - 05 Mar 2013 - 04:50

I don't think there is builtin functionality for efficient retrieval of incoming arcs. Your best option is to implement a new Fst class similar to VectorFst which has this functionality. You will first need to modify the definition of VectorState and add a new vector of arcs for incoming arcs. Then you will need to modify the methods that has to do with arcs such as AddArc, RemoveArcs, DeleteArcs, etc. to handle the incoming arcs as well as outgoing arcs. Finally you will need to define a new arc iterator for visiting incoming arcs.

Cheers, Dogan

AlexZatv - 05 Mar 2013 - 07:10

May be, you can revert the Fst? If you will revert the fst but keep the state id's the same, you can simply get revert(F).out_arcs(q).
Log In

Testing an FST for epsilon cycles

KennethRBeesley - 24 Jan 2013 - 00:41

If I have an FST, how can I best test to see if it has input epsilon cycles?

KennethRBeesley - 22 Feb 2013 - 12:16

Here's my own best effort, which seems to be working. Comments welcome. C++ is definitely not my native language.

bool isInputEpsilonCycleFree(StdVectorFst * fstp) {

   set<StateId> stateSet ;   // reuse this set of "visited states"
   // if the machine has no cycles at all, then it's epsilon-cycle-free
   if (fstp->Properties(kAcyclic, true))
      return true ;

   // loop through the states, looking for input epsilon cycles
   for (StateIterator<StdVectorFst> siter(*fstp) ;
      !siter.Done() ;
      siter.Next()) {
      stateSet.clear() ;
      if (findInputEpsilonCycle(fstp, siter.Value(), stateSet))
         return false ;
   // no input epsilon cycles found
   return true ;

bool findInputEpsilonCycle(StdVectorFst *fstp, StateId s, set<StateId> &stateSet) {
   // try to find an input epsilon cycle from state s back
   // to itself, or any other input epsilon cycle found during the search
        // stateSet is the set of states previously visited during the search from s
   if (stateSet.find(s) != stateSet.end()) 
      // found a loop
      return true ;
   // remember state s as "visited"
   stateSet.insert(s) ;
   for (ArcIterator<StdVectorFst> aiter(*fstp, s) ;
      !aiter.Done() ;
      aiter.Next()) {
      StdArc arc = aiter.Value() ;
      // deal only with arcs with epsilon on the input side
      if (arc.ilabel != 0)     // 0 (zero) is wired in as epsilon in OpenFst
         continue ;
      // here the ilabel is 0 (epsilon)
      // recursively look for an input epsilon loop from arc.nextstate
      if (findInputEpsilonCycle(fstp, arc.nextstate, stateSet))
            return true ;
   return false ;

PaulDixon - 23 Feb 2013 - 05:48

Using the OpenFst visitor to find a topological order on the epsilon input subgraph should work.

template<class Arc>
bool HasInputEpsilonCycle(const Fst<Arc>& fst) {
  vector<typename Arc::StateId> order;
  bool acyclic = false;
  TopOrderVisitor<Arc> top_order_visitor(&order, &acyclic);
  DfsVisit(fst, &top_order_visitor, InputEpsilonArcFilter<Arc>());
  return !acyclic;

KennethRBeesley - 21 Mar 2013 - 15:30

Paul, Many thanks. Very elegant and seems to work perfectly.
Log In

Example of k-closed semiring that is non-idempotent

KonstantinSelivanov - 10 Jan 2013 - 04:30


could you give an example k-close semiring that is not idempotent.


CyrilAllauzen - 18 Jan 2013 - 20:46

Hi, The k -tropical semiring defined by (Mohri, 2002) is (k-1) -closed and not idempotent when k ≥ 2 .

Log In

-- Michael Riley - 2018-05-30


Edit | Attach | Watch | Print version | History: r1 | Backlinks | Raw View | WYSIWYG | More topic actions
Topic revision: r1 - 2018-05-30 - MichaelRiley
This site is powered by the TWiki collaboration platform Powered by PerlCopyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback