Low-Resource Name Tagging Learned with Weakly Labeled Data

Citation Data: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Pages: 261-270

Publication Year: 2019

Citations: 1
Usage: 25
Captures: 157
Mentions: 0
Social Media: 0


Metrics Details

Citations: 1
- Citation Indexes: 1
Usage: 25
- Downloads: 22
- Abstract Views: 3
Captures: 157
- Readers: 157

Conference Paper Description

Name tagging in low-resource languages or domains suffers from inadequate training data. Existing work relies heavily on additional information, while leaving unexplored the noisy annotations that exist extensively on the web. In this paper, we propose a novel neural model for name tagging based solely on weakly labeled (WL) data, so that it can be applied in any low-resource setting. To take the best advantage of all WL sentences, we split them into high-quality and noisy portions for two modules, respectively: (1) a classification module focusing on the large portion of noisy data, which can efficiently and robustly pretrain the tag classifier by capturing textual context semantics; and (2) a costly sequence labeling module focusing on high-quality data, which utilizes Partial-CRFs with non-entity sampling to achieve a global optimum. The two modules are combined via shared parameters. Extensive experiments involving five low-resource languages and a fine-grained food domain demonstrate our superior performance (6% and 7.8% F1 gains on average) as well as efficiency.
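
The description mentions Partial-CRFs for training on incompletely annotated sentences. As a rough illustration only, the sketch below shows the generic partial-CRF objective: the negative log of the total probability of all tag sequences that agree with the labels that are present. It is not the authors' implementation; the function names, the toy scores, and the convention of marking unlabeled tokens with None are all assumptions made for this example, and non-entity sampling and the neural encoder are left out.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def constrained_log_z(emissions, transitions, allowed):
    """Forward algorithm restricted to tag sequences with tag[t] in allowed[t].

    emissions[t][k]   : score of tag k at position t (hypothetical model output)
    transitions[j][k] : score of moving from tag j to tag k
    allowed[t]        : set of tag ids permitted at position t
    """
    n_tags = len(transitions)
    neg_inf = float("-inf")
    alpha = [emissions[0][k] if k in allowed[0] else neg_inf for k in range(n_tags)]
    for t in range(1, len(emissions)):
        alpha = [
            logsumexp([alpha[j] + transitions[j][k] for j in range(n_tags)])
            + emissions[t][k]
            if k in allowed[t]
            else neg_inf
            for k in range(n_tags)
        ]
    return logsumexp(alpha)


def partial_crf_nll(emissions, transitions, partial_tags):
    """Negative log marginal likelihood of a partially labeled sentence.

    partial_tags[t] is a tag id, or None when the token is unlabeled
    (an assumed convention for this sketch).
    """
    every_tag = set(range(len(transitions)))
    constrained = [every_tag if tag is None else {tag} for tag in partial_tags]
    unconstrained = [every_tag] * len(emissions)
    return constrained_log_z(emissions, transitions, unconstrained) - constrained_log_z(
        emissions, transitions, constrained
    )


# Toy usage: three tokens, three tags (say 0 = O, 1 = B, 2 = I), middle token unlabeled.
toy_emissions = [[0.5, -1.0, 0.2], [1.5, 0.3, -0.7], [0.0, 0.8, 0.4]]
toy_transitions = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.4], [-0.3, 0.2, 0.6]]
print(partial_crf_nll(toy_emissions, toy_transitions, [0, None, 1]))
```

When every token is labeled, this reduces to the ordinary CRF negative log-likelihood; when no token is labeled, the loss is zero because every sequence is consistent with the (empty) annotation.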

Bibliographic Details

DOI: 10.18653/v1/d19-1025

REPOSITORY URL: https://ink.library.smu.edu.sg/sis_research/7457

URL ID: https://www.aclweb.org/anthology/D19-1025; http://dx.doi.org/10.18653/v1/d19-1025; https://ink.library.smu.edu.sg/sis_research/7457; https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=8460&context=sis_research; https://dx.doi.org/10.18653/v1/d19-1025; https://aclanthology.org/D19-1025/

AUTHOR(S)

Yixin Cao; Zikun Hu; Tat-seng Chua; Zhiyuan Liu; Heng Ji

PUBLISHER(S)

Association for Computational Linguistics (ACL)
