Difference: GrmNGramForum (111 vs. 112)

Revision 112 - 2016-07-12 - DanielRenshaw


OpenGrm NGram Forum


NegLogDiff fatal error (not due to precision)

DanielRenshaw - 2016-07-12 - 10:41

I have a problem using ngrammake (OpenGrm v1.2.2 with OpenFst v1.5.1). Many of the smoothing methods fail, and so does the unsmoothed method. Every run ends with a fatal NegLogDiff error that does not appear to be caused by floating-point precision, in some cases after intermediate warnings. My n-gram counts are in the range [1, 1803827]. Does anyone have any idea what might be causing these problems?

<verbatim>
ngrammake --method=katz -v=100 counts.grm model.grm
INFO: FstImpl::ReadHeader: source: counts.grm, fst_type: vector, arc_type: standard, version: 2, flags: 3
INFO: Histograms violating Good-Turing assumptions
INFO: Histograms violating Good-Turing assumptions
INFO: Histograms violating Good-Turing assumptions
INFO: Histograms violating Good-Turing assumptions
Count bin   Katz discounts Counts (1-grams/2-grams/3-grams)
Count = 1   0.321904/0.338472/0.226025
Count = 2   0.999/0.586256/0.501859
Count = 3   0.999/0.710776/0.654064
Count = 4   0.999/0.776459/0.737903
Count = 5   0.999/0.823323/0.795443
Count > 5   0.999/0.999/0.999
FATAL: NegLogDiff: undefined inf -0.996128

ngrammake --method=witten_bell -v=100 counts.grm model.grm
INFO: FstImpl::ReadHeader: source: counts.grm, fst_type: vector, arc_type: standard, version: 2, flags: 3
INFO: lower order sum less than zero: 21 -0.314675
FATAL: NegLogDiff: undefined 0 -3.14155

ngrammake --method=unsmoothed -v=100 counts.grm model.grm
INFO: FstImpl::ReadHeader: source: counts.grm, fst_type: vector, arc_type: standard, version: 2, flags: 3
INFO: lower order sum less than zero: 3 -inf
INFO: new lower order sum: 3 nan
INFO: lower order sum less than zero: 4 -inf
INFO: new lower order sum: 4 nan
INFO: lower order sum less than zero: 5 -inf
INFO: new lower order sum: 5 nan
<snip: more of similar>
INFO: new lower order sum: 80 nan
INFO: lower order sum less than zero: 88 -inf
INFO: new lower order sum: 88 nan
INFO: lower order sum less than zero: 90 -inf
INFO: new lower order sum: 90 nan
INFO: lower order sum less than zero: 92 -inf
FATAL: NegLogDiff: undefined 0 -inf

ngrammake --method=absolute -v=100 counts.grm model.grm
INFO: FstImpl::ReadHeader: source: counts.grm, fst_type: vector, arc_type: standard, version: 2, flags: 3
Count bin   Absolute discounts Counts (1-grams/2-grams/3-grams)
Count = 1   0.374582/0.697101/0.786205
Count > 1   0.374582/0.697101/0.786205
FATAL: NegLogDiff: undefined inf -0.885087
</verbatim>
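To make the FATAL lines above easier to read, here is a minimal Python sketch of what a negative-log difference plausibly computes. It assumes NegLogDiff(a, b) is meant to return -log(exp(-a) - exp(-b)) for two values held as negative logs; that reading is inferred from the name and the messages, not taken from the OpenGrm source, and `neg_log_diff` is a hypothetical stand-in rather than the library function. Under that assumption the result is undefined whenever a > b, because the quantity being subtracted is the larger one.

```python
import math


def neg_log_diff(a: float, b: float) -> float:
    """Return -log(exp(-a) - exp(-b)) for negative-log values a and b.

    Hypothetical stand-in for NegLogDiff (assumed semantics, not the
    OpenGrm implementation). Raises ValueError when the difference would
    be negative (a > b) or when either argument is NaN.
    """
    if math.isnan(a) or math.isnan(b):
        raise ValueError(f"NegLogDiff: undefined {a} {b}")
    if b == math.inf:
        # exp(-inf) == 0, so nothing is subtracted.
        return a
    if a == b:
        # Equal quantities: the difference is 0, whose negative log is +inf.
        return math.inf
    if a > b:
        # exp(-a) < exp(-b): the log of a negative number is undefined.
        raise ValueError(f"NegLogDiff: undefined {a} {b}")
    # a < b: factor out exp(-a) for numerical stability.
    return a - math.log1p(-math.exp(a - b))


# The argument pairs from the four FATAL lines in the logs above.
logged = [(math.inf, -0.996128), (0.0, -3.14155),
          (0.0, -math.inf), (math.inf, -0.885087)]
for a, b in logged:
    try:
        neg_log_diff(a, b)
    except ValueError as err:
        print(err)
```

In all four logged failures the second argument is negative. In negative-log space a negative value corresponds to a quantity greater than 1, so if these values are meant to be probabilities, the value being subtracted is already out of range before the subtraction happens. That would be consistent with the error not being a floating-point precision issue, although this sketch cannot say where the out-of-range values come from.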


 

Can ngramcount ignore OOVs?

DanielRenshaw - 2016-06-29 - 08:15

 