Day 1 · Module 3 · 10:30 – 11:30 · 60 min

Quality Control + Read Trimming

FastQC, MultiQC, Trimmomatic, fastp — diagnose and clean raw reads.

What this module covers

  • FastQC: per-base quality, GC content, adapter contamination
  • MultiQC: aggregating reports across samples
  • Trimmomatic: adapter removal, quality sliding window
  • fastp: faster alternative, hands-on comparison
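
The quality sliding window listed above can be illustrated with a short Python sketch. This is a simplified illustration, not Trimmomatic's actual implementation: it assumes Phred+33 quality encoding, cuts the read at the start of the first window whose mean quality drops below the threshold, and uses a made-up toy read. The function name `sliding_window_trim` and its parameters are hypothetical; the defaults loosely mirror the commonly used `SLIDINGWINDOW:4:15` setting.

```python
def sliding_window_trim(seq, qual, window=4, min_mean_q=15, offset=33):
    """Cut a read at the first window whose mean Phred quality is too low.

    Simplified sketch (assumption: Phred+33 encoding). Reads shorter than
    the window are returned unchanged.
    """
    # Decode the ASCII quality string into integer Phred scores.
    scores = [ord(ch) - offset for ch in qual]
    # Slide the window from the 5' end toward the 3' end.
    for start in range(len(scores) - window + 1):
        mean_q = sum(scores[start:start + window]) / window
        if mean_q < min_mean_q:
            # Keep everything before the failing window.
            return seq[:start], qual[:start]
    return seq, qual


# Toy read (hypothetical data): high quality that collapses at the 3' end.
toy_seq = "ACGTACGTAC"
toy_qual = "IIIIII+#+#"
trimmed_seq, trimmed_qual = sliding_window_trim(toy_seq, toy_qual)
print(trimmed_seq, trimmed_qual)
```

Averaging over a window rather than testing single bases means one isolated low-quality call does not truncate an otherwise good read.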

Start here — the data journey


Watch the data move through the pipeline below, then read on — each section has its own interactive explorer embedded right where the code builds that figure, so you can turn the knobs as you go.

The quality-control pipeline

Raw FASTQ · off the sequencer

Millions of reads whose per-base quality sags toward the 3′ end, plus some adapter read-through.
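
To make the 3′ quality sag concrete, here is a minimal Python sketch of the kind of per-position summary a QC tool plots. It assumes Phred+33 encoding and strict four-line FASTQ records; the two reads in `TOY_FASTQ` are invented for illustration, and the helper names `parse_fastq` and `mean_quality_per_position` are hypothetical, not part of any tool named in this module.

```python
from io import StringIO


def parse_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            break  # end of file (or a blank line) stops parsing
        seq = handle.readline().rstrip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip()
        yield header[1:], seq, qual


def mean_quality_per_position(records, offset=33):
    """Return the mean Phred score at each read position (assumes Phred+33)."""
    sums, counts = [], []
    for _name, _seq, qual in records:
        for i, ch in enumerate(qual):
            if i == len(sums):
                sums.append(0)
                counts.append(0)
            sums[i] += ord(ch) - offset
            counts[i] += 1
    return [s / c for s, c in zip(sums, counts)]


# Toy data (hypothetical): quality drops toward the 3' end of both reads.
TOY_FASTQ = """@read1
ACGTACGT
+
IIIIIII5
@read2
ACGTACGT
+
IIIII5+#
"""

profile = mean_quality_per_position(parse_fastq(StringIO(TOY_FASTQ)))
print(profile)
```

On real data the same calculation runs over millions of reads, and the falling tail of the profile is what motivates trimming.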

The notebook — live & editable


Every section's code is already filled in below. Press the ▶ next to any cell (or Shift+Enter) to run it, edit it and run again, or hit Run all to execute the whole notebook top to bottom. No Python or Jupyter install needed — the kernel boots right here in your browser.

Heads up: this module's pipeline uses command-line tools (e.g. FastQC, Trimmomatic, fastp) that aren't available in the browser kernel. The Python cells run here; tool and shell lines print a note instead of executing.