> > | ## NegLogDiff fatal error (not due to precision)##
DanielRenshaw - 2016-07-12 - 10:41

I have a problem using ngrammake (OpenGRM v1.2.2 with OpenFST v1.5.1). Many of the smoothing methods produce errors, including the unsmoothed method. They all report a NegLogDiff fatal error which doesn't appear to be due to floating point precision, sometimes after some intermediate warnings or errors. My ngram counts are in the range [1, 1803827]. Does anyone have any idea what might be causing these problems?

<verbatim>
ngrammake --method=katz -v=100 counts.grm model.grm
INFO: FstImpl::ReadHeader: source: counts.grm, fst_type: vector, arc_type: standard, version: 2, flags: 3
INFO: Histograms violating Good-Turing assumptions
INFO: Histograms violating Good-Turing assumptions
INFO: Histograms violating Good-Turing assumptions
INFO: Histograms violating Good-Turing assumptions
Count bin   Katz discounts Counts (1-grams/2-grams/3-grams)
Count = 1   0.321904/0.338472/0.226025
Count = 2   0.999/0.586256/0.501859
Count = 3   0.999/0.710776/0.654064
Count = 4   0.999/0.776459/0.737903
Count = 5   0.999/0.823323/0.795443
Count > 5   0.999/0.999/0.999
FATAL: NegLogDiff: undefined inf -0.996128

ngrammake --method=witten_bell -v=100 counts.grm model.grm
INFO: FstImpl::ReadHeader: source: counts.grm, fst_type: vector, arc_type: standard, version: 2, flags: 3
INFO: lower order sum less than zero: 21 -0.314675
FATAL: NegLogDiff: undefined 0 -3.14155

ngrammake --method=unsmoothed -v=100 counts.grm model.grm
INFO: FstImpl::ReadHeader: source: counts.grm, fst_type: vector, arc_type: standard, version: 2, flags: 3
INFO: lower order sum less than zero: 3 -inf
INFO: new lower order sum: 3 nan
INFO: lower order sum less than zero: 4 -inf
INFO: new lower order sum: 4 nan
INFO: lower order sum less than zero: 5 -inf
INFO: new lower order sum: 5 nan
<snip: more of similar>
INFO: new lower order sum: 80 nan
INFO: lower order sum less than zero: 88 -inf
INFO: new lower order sum: 88 nan
INFO: lower order sum less than zero: 90 -inf
INFO: new lower order sum: 90 nan
INFO: lower order sum less than zero: 92 -inf
FATAL: NegLogDiff: undefined 0 -inf

ngrammake --method=absolute -v=100 counts.grm model.grm
INFO: FstImpl::ReadHeader: source: counts.grm, fst_type: vector, arc_type: standard, version: 2, flags: 3
Count bin   Absolute discounts Counts (1-grams/2-grams/3-grams)
Count = 1   0.374582/0.697101/0.786205
Count > 1   0.374582/0.697101/0.786205
FATAL: NegLogDiff: undefined inf -0.885087
</verbatim>
## Can ngramcount ignore OOVs?##
DanielRenshaw - 2016-06-29 - 08:15 |

