Rocktäschel, Tim (2018) Combining Representation Learning with Logic for Language Processing. Doctoral thesis (Ph.D), UCL (University College London).
Abstract
The current state of the art in many natural language processing and automated knowledge base completion tasks is held by representation learning methods, which learn distributed vector representations of symbols via gradient-based optimization. They require few or no hand-crafted features, thus avoiding the need for most preprocessing steps and task-specific assumptions. However, in many cases representation learning requires a large amount of annotated training data to generalize well to unseen data. Such labeled training data is provided by human annotators, who often use formal logic as the language for specifying annotations. This thesis investigates different combinations of representation learning methods with logic for reducing the need for annotated training data and for improving generalization. We introduce a mapping of function-free first-order logic rules to loss functions that we combine with neural link prediction models. Using this method, logical prior knowledge is directly embedded in vector representations of predicates and constants. We find that this method learns accurate representations for predicates for which little or no training data is available, while at the same time generalizing to other predicates not explicitly stated in rules. However, this method relies on grounding first-order logic rules, which does not scale to large rule sets. To overcome this limitation, we propose a scalable method for embedding implications in a vector space by regularizing only the predicate representations. Subsequently, we explore a tighter integration of representation learning and logical deduction. We introduce an end-to-end differentiable prover: a neural network that is recursively constructed from Prolog's backward chaining algorithm. The constructed network allows us to calculate the gradient of proofs with respect to symbol representations and to learn these representations from proving facts in a knowledge base. In addition to incorporating complex first-order rules, it induces interpretable logic programs via gradient descent. Lastly, we propose recurrent neural networks with conditional encoding and a neural attention mechanism for determining the logical relationship between two natural language sentences.
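To make the first contribution concrete, here is a minimal sketch (not the thesis code) of mapping a grounded implication rule body(X, Y) ⇒ head(X, Y) to a differentiable loss term on top of a dot-product neural link prediction model. The predicate names, the embedding dimension, and the product fuzzy logic formulation [A ⇒ B] = [A]([B] − 1) + 1 are illustrative assumptions, with PyTorch standing in for whatever framework one prefers.

```python
# Hedged sketch: one grounded implication rule as an extra loss term,
# trained jointly with an observed fact via gradient descent.
import torch

dim = 8
torch.manual_seed(0)

# Trainable representations: one vector per predicate and per entity pair.
predicates = {"profession_at": torch.randn(dim, requires_grad=True),
              "employee_of":   torch.randn(dim, requires_grad=True)}
entity_pairs = {("smith", "ucl"): torch.randn(dim, requires_grad=True)}

def prob(pred, pair):
    """Marginal probability of a fact under the link prediction model."""
    return torch.sigmoid(predicates[pred] @ entity_pairs[pair])

def implication_loss(body, head, pair):
    """Negative log truth of body => head under product fuzzy logic,
    using [A => B] = [A]([B] - 1) + 1 (illustrative choice)."""
    a, b = prob(body, pair), prob(head, pair)
    return -torch.log(a * (b - 1.0) + 1.0)

params = list(predicates.values()) + list(entity_pairs.values())
opt = torch.optim.Adam(params, lr=0.1)
for _ in range(100):
    opt.zero_grad()
    # An observed fact and the logical prior share a single objective,
    # so the rule is embedded directly in the learned representations.
    loss = -torch.log(prob("profession_at", ("smith", "ucl"))) \
           + implication_loss("profession_at", "employee_of", ("smith", "ucl"))
    loss.backward()
    opt.step()
```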
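The scalable alternative avoids grounding altogether: an implication between two predicates is enforced by a regularizer on the two predicate vectors alone. Below is a hedged sketch of that idea, assuming non-negative entity-pair embeddings so that the componentwise constraint on predicate vectors carries over to scores for every pair.

```python
# Hedged sketch: a lifted implication regularizer over predicate vectors.
import torch

def lifted_implication_penalty(v_body, v_head):
    """Zero exactly when v_body <= v_head componentwise; differentiable
    otherwise, so it can be added to the link prediction training loss."""
    return torch.relu(v_body - v_head).sum()

# Why this suffices: with a non-negative entity-pair embedding t, the
# componentwise constraint implies v_body @ t <= v_head @ t for *every*
# pair t, so the rule holds without enumerating any groundings.
v_body = torch.tensor([0.2, 0.5, 0.1])
v_head = torch.tensor([0.4, 0.5, 0.3])
t = torch.rand(3)  # non-negativity enforced e.g. via a ReLU/exp transform
assert v_body @ t <= v_head @ t
print(lifted_implication_penalty(v_body, v_head))  # tensor(0.)
```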
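At the heart of the end-to-end differentiable prover is a soft version of symbolic unification: instead of requiring two symbols to be identical, their embeddings are compared with a differentiable kernel, and a proof succeeds to the degree of its weakest unification. The RBF kernel, the symbols, and the toy proof below are illustrative assumptions, not the exact construction.

```python
# Hedged sketch: soft unification, the differentiable core of the prover.
import torch

def soft_unify(emb_a, emb_b, mu=1.0):
    """Differentiable unification score in (0, 1]; identical symbols
    unify with score 1, dissimilar ones with a score near 0."""
    return torch.exp(-torch.norm(emb_a - emb_b) ** 2 / (2 * mu ** 2))

sym = {s: torch.randn(4, requires_grad=True)
       for s in ["fatherOf", "parentOf", "abe", "homer"]}

# Proof score of the goal fatherOf(abe, homer) against the stored fact
# parentOf(abe, homer): the minimum over its pairwise unifications.
proof_score = torch.min(torch.stack([
    soft_unify(sym["fatherOf"], sym["parentOf"]),
    soft_unify(sym["abe"], sym["abe"]),       # exp(0) = 1
    soft_unify(sym["homer"], sym["homer"]),   # exp(0) = 1
]))
# Maximizing proof_score by gradient descent pulls the embedding of
# fatherOf toward parentOf: symbol representations are learned from
# proving facts, exactly the property the abstract describes.
```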
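Finally, a sketch of conditional encoding with attention for natural language inference: the hypothesis LSTM is initialized with the final state of the premise LSTM, and the last hypothesis state attends over the premise states. The dimensions, the dot-product attention form, and the final combination are simplifying assumptions rather than the thesis architecture.

```python
# Hedged sketch: conditional encoding plus attention over the premise.
import torch
import torch.nn as nn

d = 16
premise_lstm = nn.LSTM(input_size=d, hidden_size=d, batch_first=True)
hypothesis_lstm = nn.LSTM(input_size=d, hidden_size=d, batch_first=True)

premise = torch.randn(1, 5, d)     # (batch, premise_len, d) word vectors
hypothesis = torch.randn(1, 4, d)  # (batch, hypothesis_len, d)

prem_states, (h, c) = premise_lstm(premise)
# Conditional encoding: hypothesis reading starts from the premise state.
hyp_states, _ = hypothesis_lstm(hypothesis, (h, c))

# Dot-product attention of the last hypothesis state over premise states.
query = hyp_states[:, -1]                                          # (1, d)
weights = torch.softmax(prem_states @ query.unsqueeze(-1), dim=1)  # (1, 5, 1)
context = (weights * prem_states).sum(dim=1)                       # (1, d)
representation = torch.tanh(context + query)  # fed to a 3-way classifier
```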
| Type: | Thesis (Doctoral) |
| --- | --- |
| Qualification: | Ph.D |
| Title: | Combining Representation Learning with Logic for Language Processing |
| Event: | UCL (University College London) |
| Open access status: | An open access version is available from UCL Discovery |
| Language: | English |
| UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
| URI: | https://discovery.ucl.ac.uk/id/eprint/10040845 |