Day 1 · Module 4 · 11:30 – 13:00 · 90 min

Read Alignment to Reference Genome

BWA-MEM2, SAMtools, and the mechanics of mapping short reads to a reference.

What this module covers

  • BWA-MEM2: index building, alignment, SAM output
  • SAMtools: sort, index, flagstat, view, idxstats
  • Alignment QC: coverage depth, mapping rate, insert size
  • Visualization in IGV
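
As a rough orientation, the tools listed above are commonly chained as sketched below. This is a minimal sketch, not the course's own pipeline script: the file names, the thread count, and the helper names build_alignment_commands, run_pipeline, and summarize_idxstats are all hypothetical. The depth figure is a back-of-envelope estimate (mapped reads × read length ÷ total reference length) that assumes a fixed read length.

```python
import subprocess


def build_alignment_commands(reference, reads_1, reads_2, out_bam, threads=4):
    """Return the command lines for index -> align -> sort -> index/QC.

    All paths are placeholders supplied by the caller (hypothetical names).
    """
    index_cmd = ["bwa-mem2", "index", reference]
    # bwa-mem2 mem writes SAM to stdout; it is piped into samtools sort below.
    align_cmd = ["bwa-mem2", "mem", "-t", str(threads), reference, reads_1, reads_2]
    # "-" tells samtools sort to read the SAM stream from stdin.
    sort_cmd = ["samtools", "sort", "-@", str(threads), "-o", out_bam, "-"]
    post_cmds = [
        ["samtools", "index", out_bam],
        ["samtools", "flagstat", out_bam],
        ["samtools", "idxstats", out_bam],
    ]
    return index_cmd, align_cmd, sort_cmd, post_cmds


def run_pipeline(reference, reads_1, reads_2, out_bam, threads=4):
    """Run the commands; requires bwa-mem2 and samtools on PATH."""
    index_cmd, align_cmd, sort_cmd, post_cmds = build_alignment_commands(
        reference, reads_1, reads_2, out_bam, threads
    )
    subprocess.run(index_cmd, check=True)
    aligner = subprocess.Popen(align_cmd, stdout=subprocess.PIPE)
    subprocess.run(sort_cmd, stdin=aligner.stdout, check=True)
    aligner.stdout.close()
    if aligner.wait() != 0:
        raise subprocess.CalledProcessError(aligner.returncode, align_cmd)
    for cmd in post_cmds:
        subprocess.run(cmd, check=True)


def summarize_idxstats(idxstats_text, read_length):
    """Estimate mapping rate and mean depth from `samtools idxstats` output.

    idxstats rows are tab-separated: name, length, mapped, unmapped.
    The final "*" row holds reads with no reference assigned.
    """
    total_len = mapped = unmapped = 0
    for line in idxstats_text.strip().splitlines():
        _name, length, n_mapped, n_unmapped = line.split("\t")
        total_len += int(length)
        mapped += int(n_mapped)
        unmapped += int(n_unmapped)
    total_reads = mapped + unmapped
    mapping_rate = mapped / total_reads if total_reads else 0.0
    mean_depth = mapped * read_length / total_len if total_len else 0.0
    return mapping_rate, mean_depth
```

Building the command lists separately from running them keeps the sketch inspectable without the tools installed; run_pipeline only works where bwa-mem2 and samtools are available.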
Downloads: .ipynb · Alignment pipeline (bash) · Alignment engine (Python script)

Start here — the data journey

Watch the data move through the pipeline below, then read on — each section has its own interactive explorer embedded right where the code builds that figure, so you can turn the knobs as you go.

From clean reads to a pileup

Clean reads (from Module 3)

The trimmed, adapter-free FASTQ we finished Module 3 with. Each read is a short string with no idea where on the genome it came from — yet.

Steps 2–4 are what bwa-mem2 does for you in one command; the rest is samtools. This module opens up the Extend step so the aligner stops being a black box.
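
To make the Extend idea concrete, here is a toy seed-and-extend sketch. It is deliberately not what bwa-mem2 does internally: the real aligner finds seeds with an FM-index and extends them with banded, gap-aware Smith-Waterman, whereas this sketch swaps in a plain k-mer dictionary for seeding and an ungapped, mismatch-only score for extension. The function names, the example sequences, and the choice of k are hypothetical; the match/mismatch scores of +1/−4 mirror bwa mem's default -A and -B values.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map each k-mer to its start positions (toy stand-in for an FM-index)."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def extend_ungapped(reference, read, ref_start, match=1, mismatch=-4):
    """Score the whole read laid against the reference at ref_start, no gaps.

    Returns (score, mismatches), or None if the read would run off either end.
    """
    if ref_start < 0 or ref_start + len(read) > len(reference):
        return None
    window = reference[ref_start:ref_start + len(read)]
    mismatches = sum(1 for a, b in zip(read, window) if a != b)
    score = match * (len(read) - mismatches) + mismatch * mismatches
    return score, mismatches


def align_read(reference, read, index, k):
    """Seed with exact k-mer hits, extend each candidate, keep the best.

    Returns (ref_start, score, mismatches), or None if no seed hits.
    """
    best = None
    tried = set()
    for offset in range(len(read) - k + 1):
        for hit in index.get(read[offset:offset + k], ()):
            ref_start = hit - offset  # where the read would begin
            if ref_start in tried:
                continue
            tried.add(ref_start)
            result = extend_ungapped(reference, read, ref_start)
            if result is None:
                continue
            score, mismatches = result
            if best is None or score > best[1]:
                best = (ref_start, score, mismatches)
    return best


# Hypothetical example: a 12-base read copied from position 8 with one
# substitution introduced in the middle.
reference = "ACGTTGCATGCCGATAGCTAGGCTTAACG"
read = reference[8:14] + "C" + reference[15:20]
index = build_kmer_index(reference, k=4)
print(align_read(reference, read, index, k=4))
```

Because several seeds from the same read usually point at the same starting position, the sketch records which positions it has already tried so each candidate is extended only once.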

The notebook — live & editable

Every section's code is already filled in below. Press the ▶ next to any cell (or Shift+Enter) to run it, edit it and run again, or hit Run all to execute the whole notebook top to bottom. No Python or Jupyter install needed — the kernel boots right here in your browser.
