A Semi-supervised Graph Deep Neural Network for Automatic Protein Function Annotation

Citation Data: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), ISSN: 1611-3349, Vol: 13347 LNBI, Page: 153-166

Publication Year: 2022

Citations: 2; Usage: 0; Captures: 0; Mentions: 0; Social Media: 0

Conference Paper Description

Protein function annotation based on functional properties such as Enzyme Commission (EC) numbers is a challenging task that aims to understand life at the molecular level. In particular, the number of features per protein is very large while the number of labeled samples is limited, which can significantly reduce annotation accuracy. To address these issues, we propose a novel semi-supervised graph deep learning model that learns better latent representations for each protein (node) by taking neighborhood information into account, in order to improve annotation. First, we extract a set of features from raw protein data. Each protein is associated with a one-dimensional feature vector of length D that represents its InterPro domain composition. Because D, the number of possible InterPro domains, is very high (>11,000), we design a deep autoencoder (DAE) that seeks an efficient representation of the domain composition of proteins in a lower-dimensional latent space. Then, we construct a protein graph in which each node is a protein associated with its latent representation vector and each edge is weighted by the Euclidean distance between the two nodes it connects. Finally, we train a semi-supervised graph neural network (SGNN) for automatic protein function annotation on the constructed protein graph. Experiments are conducted on four reference proteomes in UniProtKB/Swiss-Prot: Human, Arabidopsis thaliana, Mouse, and Rat. Experimental results show that the proposed model is competitive with existing methods for protein function annotation.
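
The graph-construction step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the description does not say how edges are selected, so the k-nearest-neighbour rule, the function name `build_protein_graph`, the parameter `k`, and the toy latent vectors are all assumptions made for illustration. Only the edge weighting (Euclidean distance between latent vectors) comes from the description.

```python
import numpy as np


def build_protein_graph(latent, k=1):
    """Return {(i, j): distance} for an undirected graph with i < j.

    Assumption: each protein is linked to its k nearest neighbours in
    latent space. The source only states that edges are weighted by the
    Euclidean distance between the two nodes they connect.
    """
    latent = np.asarray(latent, dtype=float)
    n = latent.shape[0]
    # Pairwise Euclidean distances between latent vectors (n x n matrix).
    diff = latent[:, None, :] - latent[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    edges = {}
    for i in range(n):
        # Sort candidates by distance and skip the node itself.
        order = [int(j) for j in np.argsort(dist[i]) if j != i]
        for j in order[:k]:
            edges[(min(i, j), max(i, j))] = float(dist[i, j])
    return edges


# Toy latent vectors standing in for the autoencoder output (hypothetical).
latent = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 1.0], [6.0, 8.0]])
edges = build_protein_graph(latent, k=1)
for (i, j), w in sorted(edges.items()):
    print(f"protein {i} -- protein {j}: weight {w:.3f}")
```

Storing edges in a dictionary rather than a dense weight matrix avoids confusing "no edge" with a genuine zero distance between two proteins that share the same latent vector.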

Bibliographic Details

DOI: 10.1007/978-3-031-07802-6_14

URL ID: http://www.scopus.com/inward/record.url?partnerID=HzOxMe3b&scp=85133157477&origin=inward; http://dx.doi.org/10.1007/978-3-031-07802-6_14; https://link.springer.com/10.1007/978-3-031-07802-6_14; https://dx.doi.org/10.1007/978-3-031-07802-6_14; https://link.springer.com/chapter/10.1007/978-3-031-07802-6_14

AUTHOR(S)

Akrem Sellami; Salvatore Tabbone; Marie Dominique Devignes; Sabeur Aridhi; Bishnu Sarker

PUBLISHER(S)

Springer Science and Business Media LLC

TAG(S)

Mathematics; Computer Science
