Linking entities through an ontology using word embeddings and syntactic re-ranking

Citation Data: BMC Bioinformatics, ISSN: 1471-2105, Vol: 20, Issue: 1, Page: 156

Publication Year: 2019

Citations: 32
Usage: 0
Captures: 94
Mentions: 1
Social Media: 0


Metrics Details

Citations
32
- Citation Indexes
  32
Captures
94
- Readers
  94
Mentions
1
- News Mentions
  1

Most Recent News

Who’s Who and What’s What: Advances in Biomedical Named Entity Recognition (BioNER)

May 5, 2021
Towards Data Science

[1] V. Yadav and S. Bethard, A Survey on Recent Advances in Named Entity Recognition from Deep Learning models (2018), Proceedings of the 27th International

Article Description

Background: Although there is an enormous number of textual resources in the biomedical domain, manually curated resources currently cover only a small part of the existing knowledge. The vast majority of this information is in unstructured form and contains nonstandard naming conventions. The task of named entity recognition, which is the identification of entity names in text, is not adequate without a standardization step. Linking each identified entity mention in text to an ontology/dictionary concept is essential for making sense of the identified entities. This paper presents an unsupervised approach for linking named entities to concepts in an ontology/dictionary. We propose an approach for the normalization of biomedical entities through an ontology/dictionary that uses word embeddings to represent semantic spaces and a syntactic parser to give higher weight to the most informative word in each named entity mention.

Results: We applied the proposed method to two different normalization tasks: the normalization of bacteria biotope entities through the Onto-Biotope ontology and the normalization of adverse drug reaction entities through the Medical Dictionary for Regulatory Activities (MedDRA). The proposed method achieved a precision score of 65.9%, which is 2.9 percentage points above the state-of-the-art result on the BioNLP Shared Task 2016 Bacteria Biotope test data, and a macro-averaged precision score of 68.7% on the Text Analysis Conference 2017 Adverse Drug Reaction test data.

Conclusions: The core contribution of this paper is a syntax-based way of combining the individual word vectors to form vectors for the named entity mentions and ontology concepts, which can then be used to measure the similarity between them. The proposed approach is unsupervised and does not require labeled data, making it easily applicable to different domains.
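The idea summarized in the abstract, combining word vectors with extra weight on the most informative word and then comparing mentions to concepts by similarity, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy 3-dimensional embeddings, the head weight of 2.0, the use of cosine similarity, the concept identifiers, and the function names `phrase_vector`, `cosine`, and `link` are all assumptions made for illustration, and the head-word position is supplied by hand rather than by a syntactic parser.

```python
import math

# Toy 3-dimensional embeddings, invented purely for illustration.
# A real system would load pre-trained word vectors instead.
EMBEDDINGS = {
    "human": [0.9, 0.1, 0.0],
    "gut": [0.1, 0.9, 0.1],
    "intestine": [0.2, 0.8, 0.2],
    "skin": [0.1, 0.1, 0.9],
    "animal": [0.8, 0.2, 0.1],
}


def phrase_vector(tokens, head_index, head_weight=2.0):
    """Weighted average of word vectors.

    The token at head_index (assumed to be the syntactic head, which a
    parser would normally provide) is weighted by head_weight; every
    other token has weight 1. The value 2.0 is a hypothetical choice.
    """
    dim = len(next(iter(EMBEDDINGS.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for i, token in enumerate(tokens):
        vec = EMBEDDINGS.get(token.lower())
        if vec is None:
            continue  # skip out-of-vocabulary words
        weight = head_weight if i == head_index else 1.0
        total = [t + weight * v for t, v in zip(total, vec)]
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # all-zero vector when no token is known
    return [t / weight_sum for t in total]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def link(mention, concepts):
    """Return the id of the concept whose vector is most similar to the mention.

    mention is a (tokens, head_index) pair; concepts maps a concept id
    to a (tokens, head_index) pair for the concept name.
    """
    mention_vec = phrase_vector(*mention)
    return max(
        concepts,
        key=lambda cid: cosine(mention_vec, phrase_vector(*concepts[cid])),
    )


# Hypothetical mention and concept names; the ids are placeholders,
# not real ontology identifiers.
mention = (["human", "gut"], 1)
concepts = {
    "CONCEPT_A": (["animal", "intestine"], 1),
    "CONCEPT_B": (["human", "skin"], 1),
}
print(link(mention, concepts))
```

In this toy setup the mention "human gut" is linked to the concept named "animal intestine" rather than "human skin", because the head words "gut" and "intestine" dominate their respective phrase vectors. No labeled training data is involved, which matches the unsupervised setting the abstract describes.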

Bibliographic Details

DOI: 10.1186/s12859-019-2678-8

PMID: 30917789

URL ID: http://www.scopus.com/inward/record.url?partnerID=HzOxMe3b&scp=85063581742&origin=inward; http://dx.doi.org/10.1186/s12859-019-2678-8; http://www.ncbi.nlm.nih.gov/pubmed/30917789; https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-2678-8; https://dx.doi.org/10.1186/s12859-019-2678-8

AUTHOR(S)

Karadeniz, İlknur; Özgür, Arzucan

PUBLISHER(S)

Springer Science and Business Media LLC

TAG(S)

Biochemistry, Genetics and Molecular Biology; Computer Science; Mathematics
