Description
(Available in versions 1.1.0 and higher.)
This operation re-estimates smoothed n-gram models by imposing marginalization constraints similar to those used for Kneser-Ney modeling on Absolute Discounting models. Specifically, the algorithm modifies lower-order distributions so that the expected frequencies of lower-order n-grams within the model are equal to the smoothed relative frequency estimates of the baseline smoothing method. Unlike Kneser-Ney, this algorithm may require multiple iterations to converge, due to changes in the state probabilities.
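As a rough illustration of the constraint described above (not the library's implementation), the following C++ sketch covers only the simplest case: a bigram model whose lower order is the unigram distribution. The function names, the dense-matrix representation, and the idea of reporting a single "gap" value are assumptions made for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: expected unigram frequency implied by a bigram model,
//   implied[w] = sum over h of pi[h] * cond[h][w],
// where pi[h] is the probability of being in history h and
// cond[h][w] is the model's P(w | h).
std::vector<double> ImpliedMarginal(
    const std::vector<double> &pi,
    const std::vector<std::vector<double>> &cond) {
  assert(pi.size() == cond.size());
  const std::size_t num_words = cond.empty() ? 0 : cond[0].size();
  std::vector<double> implied(num_words, 0.0);
  for (std::size_t h = 0; h < pi.size(); ++h) {
    assert(cond[h].size() == num_words);
    for (std::size_t w = 0; w < num_words; ++w) {
      implied[w] += pi[h] * cond[h][w];
    }
  }
  return implied;
}

// Hypothetical helper: largest absolute difference between the implied
// marginal and the baseline's smoothed estimate. A gap of zero means the
// marginalization constraint holds for every word.
double MaxConstraintGap(const std::vector<double> &implied,
                        const std::vector<double> &target) {
  assert(implied.size() == target.size());
  double gap = 0.0;
  for (std::size_t w = 0; w < implied.size(); ++w) {
    gap = std::max(gap, std::fabs(implied[w] - target[w]));
  }
  return gap;
}
```

For instance, with history probabilities (0.75, 0.25) and conditional rows (0.9, 0.1) and (0.3, 0.7), the implied marginal is (0.75, 0.25). Against a baseline estimate of (0.7, 0.3) the gap is 0.05, which is the kind of mismatch the re-estimation is meant to remove.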
Usage
ngrammarginalize [--opts] [in.mod [out.mod]]
--iterations: type = int, default = 1, number of iterations of steady state probability calculation
--max_bo_updates: type = int, default = 10, maximum number of within-iteration updates to backoff weights
--output_each_iteration: type = bool, default = false, whether to output a model after each iteration in addition to the final model
--steady_state_file: type = string, default = "", name of a separate file from which to derive steady state probabilities
class NGramMarginal(StdMutableFst *model);
Examples
ngrammarginalize --iterations=5 earnest.mod >earnest.marg.mod
int total_iterations = 5;
vector<double> weights;  // declared outside the loop, so it persists across iterations
for (int iteration = 1; iteration <= total_iterations; ++iteration) {
  // Re-read the input model at the start of each iteration.
  StdMutableFst *model = StdMutableFst::Read("in.mod", true);
  NGramMarginal ngrammarg(model);
  ngrammarg.MarginalizeNGramModel(&weights, iteration, total_iterations);
  // Write only the model produced by the final iteration.
  if (iteration == total_iterations)
    ngrammarg.GetFst().Write("out.mod");
  delete model;
}
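The iteration loop above exists because, as noted in the Description, the state probabilities change as the model is re-estimated. The library's own steady state computation is not shown on this page. Purely as an illustration of what such a calculation involves, the sketch below runs power iteration on a small row-stochastic transition matrix. `SteadyState`, its parameters, and the matrix representation are assumptions for this example and are not part of the NGram library API.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: approximates the stationary distribution of a Markov
// chain by repeatedly applying pi <- pi * trans, starting from uniform.
// trans[i][j] is the probability of moving from state i to state j, so each
// row is assumed to sum to 1. Iteration stops early once no entry of pi
// changes by more than `tolerance`.
std::vector<double> SteadyState(const std::vector<std::vector<double>> &trans,
                                int max_iterations,
                                double tolerance = 1e-12) {
  const std::size_t n = trans.size();
  assert(n > 0);
  std::vector<double> pi(n, 1.0 / n);
  for (int it = 0; it < max_iterations; ++it) {
    std::vector<double> next(n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
      assert(trans[i].size() == n);
      for (std::size_t j = 0; j < n; ++j) {
        next[j] += pi[i] * trans[i][j];
      }
    }
    double change = 0.0;
    for (std::size_t j = 0; j < n; ++j) {
      change = std::max(change, std::fabs(next[j] - pi[j]));
    }
    pi = next;
    if (change < tolerance) break;
  }
  return pi;
}
```

For the two-state chain with rows (0.9, 0.1) and (0.3, 0.7), the result approaches (0.75, 0.25). If the transition probabilities are then modified, as marginalization does to a model, the stationary distribution shifts too, which is why a single pass may not be enough.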
Caveats
Note that this method assumes that the baseline smoothed model provides smoothed relative frequency estimates for all n-grams in the model. Thus the method is not generally applicable to models trained using Kneser-Ney smoothing, since lower-order n-gram weights resulting from that method do not represent relative frequency estimates. See reference below for further information on the algorithm.
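To make the caveat concrete, here is a small sketch contrasting a plain relative frequency estimate for a unigram with the continuation-style estimate that Kneser-Ney uses at its lowest order (the number of distinct left contexts a word follows, normalized by the number of distinct bigram types). Discounting is omitted for brevity, and the `Bigram` struct and both function names are hypothetical, not taken from the NGram library.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record: how often `word` was observed right after `prev`.
struct Bigram {
  std::string prev;
  std::string word;
  int count;
};

// Relative frequency of `word`: its total count divided by the total count
// of all bigram tokens.
double RelativeFrequency(const std::vector<Bigram> &bigrams,
                         const std::string &word) {
  long total = 0;
  long matching = 0;
  for (const Bigram &b : bigrams) {
    assert(b.count >= 0);
    total += b.count;
    if (b.word == word) matching += b.count;
  }
  return total > 0 ? static_cast<double>(matching) / total : 0.0;
}

// Kneser-Ney style continuation estimate (without discounting): the number
// of distinct contexts that `word` follows, divided by the number of
// distinct observed bigram types.
double ContinuationProbability(const std::vector<Bigram> &bigrams,
                               const std::string &word) {
  std::set<std::pair<std::string, std::string>> types;
  std::set<std::string> contexts;
  for (const Bigram &b : bigrams) {
    if (b.count <= 0) continue;  // skip unobserved entries
    types.insert({b.prev, b.word});
    if (b.word == word) contexts.insert(b.prev);
  }
  return types.empty()
             ? 0.0
             : static_cast<double>(contexts.size()) / types.size();
}
```

With toy counts of "san francisco" seen 5 times and "new york" and "old york" seen once each, the relative frequencies are 5/7 for "francisco" and 2/7 for "york", while the continuation estimates are 1/3 and 2/3 respectively. The two estimates rank the words in opposite order, so the Kneser-Ney lower-order weights cannot be read as relative frequency estimates.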
References
B. Roark, C. Allauzen and M. Riley. "Smoothed marginal distribution constraints for language modeling". To appear in Proceedings of the Association for Computational Linguistics (ACL). Sofia, Bulgaria, August 2013.
(A link to this paper will be provided as soon as the ACL posts the final version in the anthology.)