Natrix: a Snakemake-based workflow for processing, clustering, and taxonomically assigning amplicon sequencing reads

Welzel, Marius; Lange, Anja; Heider, Dominik; Schwarz, Michael; Freisleben, Bernd; Jensen, Manfred; Boenigk, Jens; Beisser, Daniela

doi:10.1186/s12859-020-03852-4

Article, Mon., 16 Nov. 2020, CC BY 4.0

Published

Background: Sequencing of marker genes amplified from environmental samples, known as amplicon sequencing, allows us to resolve some of the hidden diversity and elucidate evolutionary relationships and ecological processes among complex microbial communities. The analysis of large numbers of samples at high sequencing depths generated by high throughput sequencing technologies requires efficient, flexible, and reproducible bioinformatics pipelines. Only a few existing workflows can be run in a user-friendly, scalable, and reproducible manner on different computing devices using an efficient workflow management system.

Results: We present Natrix, an open-source bioinformatics workflow for preprocessing raw amplicon sequencing data. The workflow covers all analysis steps, from quality assessment, read assembly, dereplication, chimera detection, split-sample merging, and sequence representative assignment (OTUs or ASVs) to the taxonomic assignment of sequence representatives. It is written using Snakemake, a workflow management engine for developing data analysis workflows, which ensures reproducibility, while Conda provides version control of the utilized programs. The encapsulation of rules and their dependencies supports hassle-free sharing of rules between workflows and easy adaptation and extension of existing workflows. Natrix is freely available on GitHub (https://github.com/MW55/Natrix) or as a Docker container on DockerHub (https://hub.docker.com/r/mw55/natrix).
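As a rough illustration of one step named above, dereplication collapses identical reads into unique sequences with abundance counts. The sketch below is not taken from Natrix; the function name `dereplicate` and the sample reads are hypothetical, and it only shows the general idea in plain Python.

```python
from collections import Counter


def dereplicate(reads):
    """Collapse identical sequences and return (sequence, count) pairs.

    Hypothetical helper for illustration only; not part of Natrix.
    """
    # Normalize case so that "acgt" and "ACGT" count as the same read.
    counts = Counter(seq.upper() for seq in reads)
    # Sort by decreasing abundance; break ties alphabetically for determinism.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up example reads (assumption, not real data).
reads = ["ACGT", "acgt", "ACGA", "ACGT", "TTGA"]
unique = dereplicate(reads)

for sequence, abundance in unique:
    print(f"{sequence}\t{abundance}")
```

Keeping the abundance next to each unique sequence lets later steps, such as chimera detection or clustering, work on far fewer sequences without losing the read counts.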

Conclusion: Natrix is a user-friendly and highly extensible workflow for processing Illumina amplicon data.

Classification

Publication date:

16 November 2020

URN:

urn:nbn:de:hbz:465-20231214-162457-7

DOI:

10.1186/s12859-020-03852-4

Language:

English

Resource type:

Text

Keywords:

Amplicon sequencing; Operational Taxonomic Units; Amplicon Sequence Variants; Snakemake; Pipeline; Illumina

Collection:

E-publications

Subject groups of the German National Bibliography:

570 Life sciences, biology

Institution:

Faculty of Biology, Bioinformatics and Computational Biophysics

Institution:

Faculty of Biology, Biodiversity

Funding:

The publication of this article was supported by the Publication Fund of the University of Duisburg-Essen.

Information on first publication:

Welzel, M., Lange, A., Heider, D. et al. Natrix: a Snakemake-based workflow for processing, clustering, and taxonomically assigning amplicon sequencing reads. BMC Bioinformatics 21, 526 (2020). https://doi.org/10.1186/s12859-020-03852-4

Published: 16 November 2020

Version:

Version of Record (publisher's version)

Cite

Citation:

Welzel, Marius et al. (2020): Natrix: a Snakemake-based workflow for processing, clustering, and taxonomically assigning amplicon sequencing reads. Available online at: https://nbn-resolving.org/urn:nbn:de:hbz:465-20231214-162457-7.

Rights

Rights holder:

Use and reproduction:

This work may be used under a Creative Commons Attribution 4.0 License (CC BY 4.0).

DuEPublico 2

Duisburg-Essen Publications online
