Generating formal-language represented by relational tuples, such as Lisp programs or mathematical operations, from natural-language input is a challenging task because it requires explicitly capturing discrete symbolic structural information implicit in the input. Most state-of-the-art neural sequence models do not explicitly capture such structural information, limiting their performance on these tasks. In this paper we propose a new encoder-decoder model based on Tensor Product Representations (TPRs) for Natural- to Formal-language generation, called TP-N2F. The encoder of TP-N2F employs TPR 'binding' to encode natural-language symbolic structure in vector space and the decoder uses TPR 'unbinding' to generate, in symbolic space, a sequence of relational tuples, each consisting of a relation (or operation) and a number of arguments. On two benchmarks, TP-N2F considerably outperforms LSTM-based seq2seq models, creating new state-of-the-art results: the MathQA dataset for math problem solving, and the AlgoLisp dataset for program synthesis. Ablation studies show that improvements can be attributed to the use of TPRs in both the encoder and decoder to explicitly capture relational structure to support reasoning.
|Original language||English (US)|
|State||Published - Oct 5 2019|
ASJC Scopus subject areas