A Comparative Analysis of Transformer-based Protein Language Models for Remote Homology Prediction

Citation Data: ACM-BCB 2023 - 14th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, Pages: 1-9

Publication Year: 2023

Citations: 1
Usage: 0
Captures: 3
Mentions: 0
Social Media: 0

Metrics Details

Citations: 1
- Citation Indexes: 1
Captures: 3
- Readers: 3

Conference Paper Description

Protein language models based on the transformer architecture are increasingly shown to learn rich representations from protein sequences that improve performance on a variety of downstream protein prediction tasks. These tasks encompass a wide range of predictions, including prediction of secondary structure, subcellular localization, evolutionary relationships within protein families, as well as superfamily and family membership. There is recent evidence that such models also implicitly learn structural information. In this paper we put this to the test on a hallmark problem in computational biology, remote homology prediction. We employ a rigorous setting, where, by lowering sequence identity, we clarify whether the problem of remote homology prediction has been solved. Among various interesting findings, we report that current state-of-the-art, large models are still underperforming in the "twilight zone" of very low sequence identity.

Bibliographic Details

DOI: 10.1145/3584371.3612942

URL ID: http://www.scopus.com/inward/record.url?partnerID=HzOxMe3b&scp=85175833806&origin=inward; http://dx.doi.org/10.1145/3584371.3612942; https://dl.acm.org/doi/10.1145/3584371.3612942; https://dx.doi.org/10.1145/3584371.3612942

AUTHOR(S)

Anowarul Kabir; Asher Moldwin; Amarda Shehu

PUBLISHER(S)

Association for Computing Machinery (ACM)

TAG(S)

Computer Science; Engineering; Medicine
