InSilicoSeq 2.0: Simulating realistic amplicon-based sequence reads

Citation Data: bioRxiv, ISSN: 2692-8205

Publication Year: 2024

Article Description

Motivation: Simulating high-throughput sequencing reads that mimic empirical sequence data is of major importance for designing and validating sequencing experiments, as well as for benchmarking bioinformatic workflows and tools.

Results: Here, we present InSilicoSeq 2.0, a software package that simulates realistic Illumina-like sequencing reads for a variety of sequencing machines and assay types. InSilicoSeq now supports amplicon-based sequencing and comes with premade error models of various quality levels for the Illumina MiSeq, HiSeq, NovaSeq and NextSeq platforms. It also provides the flexibility to generate custom error models for any short-read sequencing platform from a BAM file. We demonstrated the novel amplicon sequencing algorithm by simulating Adaptive Immune Receptor Repertoire (AIRR) reads. Our benchmark revealed that the Phred scores of reads simulated by InSilicoSeq 2.0 closely resemble those of actual Illumina MiSeq, HiSeq, NovaSeq and NextSeq sequencing data. InSilicoSeq 2.0 generated 15 million amplicon-based paired-end reads in under an hour at a total cost of €4.3e per million bases, which argues for testing experimental designs through simulation prior to actual sequencing.

Availability and implementation: InSilicoSeq 2.0 is implemented in Python and is freely available under the MIT licence at https://github.com/HadrienG/InSilicoSeq.

Bibliographic Details

DOI: 10.1101/2024.02.16.580469

URL ID: http://www.scopus.com/inward/record.url?partnerID=HzOxMe3b&scp=85189211145&origin=inward; http://dx.doi.org/10.1101/2024.02.16.580469; https://dx.doi.org/10.1101/2024.02.16.580469; https://www.biorxiv.org/content/10.1101/2024.02.16.580469v2

AUTHOR(S)

Stefan H. Lelieveld; Thijs Maas; Henk Jan van den Ham; Tessa C.X. Duk; Hadrien Gourlé

PUBLISHER(S)

Cold Spring Harbor Laboratory

TAG(S)

Biochemistry, Genetics and Molecular Biology; Agricultural and Biological Sciences; Immunology and Microbiology; Neuroscience; Pharmacology, Toxicology and Pharmaceutics
