Difference: GrmNGramForum (113 vs. 114)

Revision 114 - 2016-07-13 - DanielRenshaw

Line: 1 to 1

OpenGrm NGram Forum

Line: 75 to 75
  Strangely, the kneser_ney method doesn't generate an error.

DanielRenshaw - 2016-07-13 - 07:33

This problem occurs only when the counts are printed to a text format and read back in (following the answer to the "Can ngramcount ignore OOVs?" question); it occurs even if the <unk>s are not removed.

Given counts1.grm, an FST produced by ngramcount, I would have expected "ngramprint counts1.grm | ngramread - counts2.grm" to produce a counts2.grm identical to counts1.grm, but this isn't true in general. With the earnest.cnts example, the round-tripped file differs only slightly in size. For part of my own corpus, the difference in file size is much more substantial. Could this difference be due to symbol table changes only, or could the procedure change the FST in other ways?
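
One way to narrow this down might be to compare the text dumps of the two files rather than their sizes: if the n-grams and values match, the difference is probably confined to the binary representation (for example the symbol table). The Python sketch below is only an illustration. It assumes each dump has one n-gram per line, followed by a tab and a numeric value; that layout is an assumption about the ngramprint output, and the helper names parse_dump and diff_dumps are hypothetical, not part of OpenGrm.

```python
import math


def parse_dump(text):
    """Parse lines of the assumed form '<n-gram words><TAB><value>' into a dict."""
    table = {}
    for line in text.splitlines():
        fields = line.strip().split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        table[fields[0]] = float(fields[1])
    return table


def diff_dumps(before, after, tol=1e-6):
    """Return (missing, added, changed) n-grams between two text dumps.

    missing: n-grams only in `before`
    added:   n-grams only in `after`
    changed: n-grams in both whose values differ by more than `tol`
    """
    a, b = parse_dump(before), parse_dump(after)
    missing = sorted(set(a) - set(b))
    added = sorted(set(b) - set(a))
    changed = sorted(
        k for k in set(a) & set(b)
        if not math.isclose(a[k], b[k], abs_tol=tol)
    )
    return missing, added, changed
```

To use it, save the text produced by running ngramprint on counts1.grm and on counts2.grm, read both files as strings, and pass them to diff_dumps. Three empty lists would suggest the round trip kept the n-grams and their values and changed only how they are stored; line order is ignored because the comparison is keyed by n-gram.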
