Entity Information Extraction for Knowledge Base Completion
Information about named entities (real-world objects such as persons and organizations) is usually harvested from different sources and organized as a multi-relational directed graph in a Knowledge Base (KB). KBs are very large but still incomplete: many entities and many relations between entities are missing. As a result, much recent work has been devoted to KB completion (link prediction), which aims to infer missing facts. The inference mechanism can rely on the observed facts alone, or it can draw on large amounts of raw, unstructured data from external domains (such as tweets, captions, and news). Because references to entities in external domains are highly ambiguous, Named Entity Disambiguation (NED) and Relation Extraction (RE) have been studied for years as two essential tasks in Natural Language Processing (NLP). In this proposal, we focus on understanding and improving the role of entities in NLP. We propose: i) a global NED model that formulates NED as a structured prediction problem in the Limited Discrepancy Search framework, where, given an input document, the model starts with a complete solution constructed by a local model and searches the space of a few possible corrections to improve the local solution from a global point of view; ii) a novel local NED system in which words are associated with candidate entities in a language model, a mechanism that makes better use of contextual clues for NED; iii) a novel approach that improves the accuracy of relation extraction models trained with distant supervision by encouraging them to make predictions for the right reasons, leading to improved transparency and robustness; iv) a new mechanism to improve the explainability of link prediction models, together with a novel model that improves the performance of linear link prediction models by learning the latent structures captured by non-linear multi-hop models.
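The search idea behind contribution i) can be illustrated with a minimal sketch. It is not the proposed model: the function name `lds_disambiguate`, the additive scoring (local scores plus pairwise coherence), the exhaustive enumeration of corrections, and the toy data are all assumptions made for illustration only.

```python
from itertools import combinations, product


def lds_disambiguate(candidates, local_score, coherence, max_discrepancies=2):
    """Toy limited-discrepancy search for entity disambiguation.

    candidates:  one list of candidate entities per mention
    local_score: one dict per mention, mapping entity -> local model score
    coherence:   function (entity_a, entity_b) -> pairwise coherence score
    All three are hypothetical inputs chosen for this sketch.
    """
    n = len(candidates)
    # Complete initial solution: the local model's top choice for each mention.
    initial = [max(c, key=local_score[i].get) for i, c in enumerate(candidates)]

    def global_score(assignment):
        local = sum(local_score[i][e] for i, e in enumerate(assignment))
        pairwise = sum(coherence(a, b) for a, b in combinations(assignment, 2))
        return local + pairwise

    best, best_score = initial, global_score(initial)
    # Try solutions that differ from the local one in at most k positions.
    for k in range(1, max_discrepancies + 1):
        for positions in combinations(range(n), k):
            alternatives = [
                [e for e in candidates[i] if e != initial[i]] for i in positions
            ]
            for choice in product(*alternatives):
                trial = list(initial)
                for i, e in zip(positions, choice):
                    trial[i] = e
                score = global_score(trial)
                if score > best_score:
                    best, best_score = trial, score
    return best


# Toy document with two mentions: "Jordan" and "Bulls" (made-up scores).
candidates = [["Jordan_(country)", "Michael_Jordan"], ["Chicago_Bulls"]]
local_score = [
    {"Jordan_(country)": 0.6, "Michael_Jordan": 0.4},
    {"Chicago_Bulls": 1.0},
]
related = {frozenset(("Michael_Jordan", "Chicago_Bulls")): 1.0}


def coherence(a, b):
    return related.get(frozenset((a, b)), 0.0)


result = lds_disambiguate(candidates, local_score, coherence)
```

In this toy case the local model prefers the country reading of "Jordan", and a single correction guided by coherence with "Chicago_Bulls" switches it to the basketball player. Bounding the number of corrections keeps the search small relative to enumerating all joint assignments.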
Major Advisor: Xiaoli Fern
Committee: Prasad Tadepalli
Committee: Xiao Fu
Committee: Stefan Lee
GCR: Cory M. Simon
Monday, August 24, 2020 at 3:00pm to 5:00pm
Virtual Event