IMAP Snakemake workflow series
General Overview
IMAP stands for Integrated Microbiome Analysis Pipelines. It is organized into parts, each maintained as a standalone GitHub repository. Used in sequence, the parts provide a systematic microbiome data analysis that goes beyond traditional methods.
IMAP tentative parts: each part is a standalone Git repository with a similar project structure.
IMAP approach
- We use the Snakemake workflow management system[1,2] for:
  - Maintaining reproducibility in technical validation and regeneration of results.
  - Scaling data analysis to a server, grid, or cloud environment.
  - Fostering sustainable improvement of the microbiome data analysis.
- We break complex workflows into small, contiguous but related chunks, where each major step forms a separate executable Snakemake rule.
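The idea of chaining small rules through their input and output files can be sketched in plain Python. This is only a toy model of the concept, not Snakemake's implementation or API; the `Rule` class, the `execution_order` helper, the rule names, and every file path below are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Toy stand-in for a workflow rule: named inputs produce named outputs."""
    name: str
    inputs: tuple
    outputs: tuple


# Hypothetical rules and file names, loosely following the IMAP parts.
# They are deliberately listed out of order.
RULES = [
    Rule("classify_reads", ("results/qc/reads.fastq",), ("results/taxonomy.tsv",)),
    Rule("download_reads", ("data/metadata/metadata.csv",), ("data/reads/reads.fastq",)),
    Rule("quality_control", ("data/reads/reads.fastq",), ("results/qc/reads.fastq",)),
]


def execution_order(target, rules, existing):
    """Return the rule names needed to build `target`, upstream rules first."""
    producers = {out: rule for rule in rules for out in rule.outputs}
    order = []

    def visit(path):
        if path in existing:
            return  # file is already available, nothing to run
        rule = producers.get(path)
        if rule is None:
            raise ValueError(f"no rule produces {path!r}")
        for dep in rule.inputs:
            visit(dep)  # resolve upstream files before this rule
        if rule.name not in order:
            order.append(rule.name)

    visit(target)
    return order


plan = execution_order(
    "results/taxonomy.tsv", RULES, existing={"data/metadata/metadata.csv"}
)
print(plan)  # ['download_reads', 'quality_control', 'classify_reads']
```

Because each rule only declares what it reads and writes, the order follows from the files themselves, which is why small single-purpose rules are easy to rerun or replace independently.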
Mission and Vision
We envision fostering continuous integration and improvement of highly reproducible and sustainable workflows for microbiome data analysis.
IMAP Project Structure
Note: This structure shows the basic folders and their contents. Folders or files may be removed or added as needed.
IMAP_Project_Directories
├── LICENSE.md
├── README.md
├── config
│   ├── config.yaml
│   ├── samples.tsv
│   └── units.tsv
├── data
│   ├── metadata
│   │   └── metadata.csv
│   └── reads
├── figures
│   ├── fig.pdf
│   ├── fig.png
│   └── fig.svg
├── images
│   ├── img.pdf
│   ├── img.png
│   └── img.svg
├── index.Rmd
├── library
│   ├── apa.csl
│   ├── imap.bib
│   └── references.bib
├── resources
├── results
└── workflow
    ├── Snakefile
    ├── envs
    │   ├── pipeline.yml
    │   └── tool.yml
    ├── notebooks
    │   ├── jnb.py.ipynb
    │   └── jnb.r.ipynb
    ├── reports
    │   ├── plot1.rst
    │   └── plot2.rst
    ├── rules
    │   ├── rule1.smk
    │   └── rule2.smk
    ├── schemas
    │   ├── schm1.yml
    │   └── schm2.yml
    └── scripts
        ├── Rmd.Rmd
        ├── bash.sh
        ├── python.py
        └── rscript.R
16 directories, 31 files
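A minimal sketch of how this skeleton could be scaffolded with the Python standard library, assuming you want empty placeholder files to start from. The `LAYOUT` mapping and the `scaffold` helper are hypothetical names introduced here; the folder and file names are copied from the tree above.

```python
from pathlib import Path
import tempfile

# Folders (relative to the project root) and the placeholder files in each,
# copied from the project tree. "." is the project root itself.
LAYOUT = {
    ".": ["LICENSE.md", "README.md", "index.Rmd"],
    "config": ["config.yaml", "samples.tsv", "units.tsv"],
    "data/metadata": ["metadata.csv"],
    "data/reads": [],
    "figures": ["fig.pdf", "fig.png", "fig.svg"],
    "images": ["img.pdf", "img.png", "img.svg"],
    "library": ["apa.csl", "imap.bib", "references.bib"],
    "resources": [],
    "results": [],
    "workflow": ["Snakefile"],
    "workflow/envs": ["pipeline.yml", "tool.yml"],
    "workflow/notebooks": ["jnb.py.ipynb", "jnb.r.ipynb"],
    "workflow/reports": ["plot1.rst", "plot2.rst"],
    "workflow/rules": ["rule1.smk", "rule2.smk"],
    "workflow/schemas": ["schm1.yml", "schm2.yml"],
    "workflow/scripts": ["Rmd.Rmd", "bash.sh", "python.py", "rscript.R"],
}


def scaffold(root):
    """Create the folder skeleton under `root` with empty placeholder files."""
    root = Path(root)
    for folder, files in LAYOUT.items():
        directory = root / folder
        directory.mkdir(parents=True, exist_ok=True)
        for name in files:
            (directory / name).touch()
    return root


# Build the skeleton in a throwaway directory and count what was created.
with tempfile.TemporaryDirectory() as tmp:
    project = scaffold(Path(tmp) / "IMAP_Project_Directories")
    entries = list(project.rglob("*"))
    n_dirs = sum(1 for p in entries if p.is_dir())
    n_files = sum(1 for p in entries if p.is_file())

print(n_dirs, "directories,", n_files, "files")  # 16 directories, 31 files
```

To create a real project, call `scaffold` with a path of your choice instead of the temporary directory.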
Potential Workflows
Repo | Description | Status |
---|---|---|
IMAP-GLIMPSE | IMAP project overview | In-progress |
IMAP-PART 01 | Software requirement for microbiome data analysis with Snakemake workflows | In-progress |
IMAP-PART 02 | Downloading and exploring microbiome sample metadata from SRA Database | In-progress |
IMAP-PART 03 | Downloading and filtering microbiome sequencing data from SRA database | In-progress |
IMAP-PART 04 | Quality Control of Microbiome Next Generation Sequencing Reads | In-progress |
IMAP-PART 05 | Bioinformatics & classification of preprocessed microbiome sequencing data | In-progress |
IMAP-PART 06 | | In-progress |
IMAP-PART 07 | | In-progress |
IMAP-PART 08 | | In-progress |
Citation
Please consider citing the iMAP article[3] if you find any part of the IMAP practical user guides helpful in your microbiome data analysis.