Dynamics of domain coverage of the protein sequence universe.
- Citation data: BMC Genomics, ISSN: 1471-2164, Vol: 13, Issue: 1, Page: 634
- PMC3557196; 3557196
- Biochemistry, Genetics and Molecular Biology; Microbiology
The currently known protein sequence space consists of millions of sequences in public databases and is rapidly expanding. Assigning sequences to families leads to a better understanding of protein function and the nature of the protein universe. However, a large portion of the current protein space remains unassigned and is referred to as its "dark matter".