TY - GEN

T1 - Stolen probability

T2 - 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020

AU - Demeter, David

AU - Kimmel, Gregory

AU - Downey, Doug

N1 - Funding Information:
This work was supported in part by NSF Grant IIS-1351029. We thank the anonymous reviewers and Northwestern’s Theoretical Computer Science group for their insightful comments and guidance.
Publisher Copyright:
© 2020 Association for Computational Linguistics

PY - 2020

Y1 - 2020

N2 - Neural Network Language Models (NNLMs) generate probability distributions by applying a softmax function to a distance metric formed by taking the dot product of a prediction vector with all word vectors in a high-dimensional embedding space. The dot-product distance metric forms part of the inductive bias of NNLMs. Although NNLMs optimize well with this inductive bias, we show that this results in a sub-optimal ordering of the embedding space that structurally impoverishes some words at the expense of others when assigning probability. We present numerical, theoretical and empirical analyses showing that words on the interior of the convex hull in the embedding space have their probability bounded by the probabilities of the words on the hull.

UR - http://www.scopus.com/inward/record.url?scp=85098411680&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85098411680&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85098411680

T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics

SP - 2191

EP - 2197

BT - ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference

PB - Association for Computational Linguistics (ACL)

Y2 - 5 July 2020 through 10 July 2020

ER -