TY - GEN

T1 - Estimating marginal probabilities of n-grams for recurrent neural language models

AU - Noraset, Thanapon

AU - Downey, Doug

AU - Bing, Lidong

N1 - Funding Information:
This work was supported in part by NSF grant IIS-1351029, the Tencent AI Lab Rhino-Bird Gift Fund, and the Faculty of ICT, Mahidol University. We thank the reviewers for their valuable input.
Publisher Copyright:
© 2018 Association for Computational Linguistics

PY - 2018/1/1

Y1 - 2018/1/1

N2 - Recurrent neural network language models (RNNLMs) are the current standard-bearer for statistical language modeling. However, RNNLMs only estimate probabilities for complete sequences of text, whereas some applications require context-independent phrase probabilities instead. In this paper, we study how to compute an RNNLM's marginal probability: the probability that the model assigns to a short sequence of text when the preceding context is not known. We introduce a simple method of altering the RNNLM training to make the model more accurate at marginal estimation. Our experiments demonstrate that the technique is effective compared to baselines including the traditional RNNLM probability and an importance sampling approach. Finally, we show how we can use the marginal estimation to improve an RNNLM by training the marginals to match n-gram probabilities from a larger corpus.

AB - Recurrent neural network language models (RNNLMs) are the current standard-bearer for statistical language modeling. However, RNNLMs only estimate probabilities for complete sequences of text, whereas some applications require context-independent phrase probabilities instead. In this paper, we study how to compute an RNNLM's marginal probability: the probability that the model assigns to a short sequence of text when the preceding context is not known. We introduce a simple method of altering the RNNLM training to make the model more accurate at marginal estimation. Our experiments demonstrate that the technique is effective compared to baselines including the traditional RNNLM probability and an importance sampling approach. Finally, we show how we can use the marginal estimation to improve an RNNLM by training the marginals to match n-gram probabilities from a larger corpus.

UR - http://www.scopus.com/inward/record.url?scp=85081737682&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85081737682&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85081737682

T3 - Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018

SP - 2930

EP - 2935

BT - Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018

A2 - Riloff, Ellen

A2 - Chiang, David

A2 - Hockenmaier, Julia

A2 - Tsujii, Jun'ichi

PB - Association for Computational Linguistics

T2 - 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018

Y2 - 31 October 2018 through 4 November 2018

ER -