2461 SW Campus Way, Corvallis, OR 97331

Free Event

Learning with Limited Labeled Data in Natural Language Processing

The advent of deep learning models has led to substantial improvements in a wide range of NLP tasks, achieving state-of-the-art performance without any hand-crafted features. However, training deep models requires a massive amount of labeled data. Labeling new data whenever a new task or domain emerges is time-consuming and requires domain expertise. As a result, approaches that address data scarcity have received increasing attention in recent years, including, but not limited to, transfer learning, zero-shot learning, and weak supervision. In this report, we summarize our two prior works on learning from limited data. In the first work, we present a transfer learning method that transfers knowledge between two domains (source and target) with disparate labels. Our approach exploits the relationship between the source and target labels to enhance the transfer of the learned knowledge. We apply our method to two NLP tasks: Event Typing and Text Classification. In our second work, we address the problem of modeling tasks with evolving type ontologies. We present a Zero-Shot Fine-Grained Entity Typing (ZS-FGET) approach that uses the Wikipedia description of a type to construct the representation of that type. The type can then be recognized without any training examples. Since FGET deals with a large number of types organized into a hierarchy, Distant Supervision is employed to collect training data automatically, which introduces significant label noise. In our final work, we focus on the hierarchical nature of fine-grained entity types and propose an FGET framework with a ranking-based weak supervision objective that ranks the types on the best path among the candidate paths higher than the incorrect types. We further leverage additional type connections that are not present in the type hierarchy to improve training and type inference, and we present some preliminary results.

Major Advisor: Xiaoli Fern
Committee: Prasad Tadepalli
Committee: Stephen A Ramsey
Committee: Liang Huang
GCR: Brett Tyler