# Bioinformatics Workshop: From Raw Reads to Biological Insight

2 days · 4 hours each · Hands-on, tutorial-oriented

## Overview

A practical introduction to bioinformatics workflows — from raw sequencing data to
meaningful biological results. Every session is built around real tools, real data,
and exercises you run yourself.

## Audience

Beginner to intermediate. Comfort with the command line helps, but Module 1
covers everything needed.

## Prerequisites

- Laptop with terminal access (Linux/macOS native; Windows via WSL2)
- Conda or Docker installed (see `environment/setup.sh`)
- ~10 GB free disk space
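
The prerequisites above can be checked before the workshop with a short pre-flight script. This is a minimal sketch and not part of the repository: the helper names `have` and `free_gb` are hypothetical, and it assumes a POSIX-style `df` (Linux, macOS, or WSL2).

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check; helper names are illustrative only.

# Print "found" or "missing" for a command name.
have() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found"
  else
    echo "missing"
  fi
}

# Print free space, in whole GB, on the filesystem holding the given directory.
# df -Pk reports 1024-byte blocks in POSIX format; column 4 is "Available".
free_gb() {
  df -Pk "$1" | awk 'NR==2 {print int($4 / 1024 / 1024)}'
}

echo "conda:  $(have conda)"
echo "docker: $(have docker)"
echo "free disk space: $(free_gb .) GB (workshop needs about 10 GB)"
```

Only one of Conda or Docker needs to be reported as found. Run the script from the directory where you plan to clone the workshop so the disk figure refers to the right filesystem.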

## Schedule

| Day | Module | Topic | Duration |
|-----|--------|-------|----------|
| 1 | 1 | Linux CLI for Bioinformatics | 45 min |
| 1 | 2 | Sequence Data Formats (FASTA/FASTQ/SAM/BAM/VCF) | 45 min |
| 1 | 3 | Quality Control + Read Trimming | 60 min |
| 1 | 4 | Read Alignment to a Reference Genome | 90 min |
| 2 | 5 | BAM Processing + Variant Calling | 70 min |
| 2 | 6 | RNA-seq: Quantification + Differential Expression | 70 min |
| 2 | 7 | Visualization: IGV, Python plots, pathway analysis | 50 min |
| 2 | 8 | Capstone Project | 50 min |

## Quick Start

```bash
cd bioinformatics-workshop
bash environment/setup.sh          # installs conda env + tools
bash data/reference/download_reference.sh   # pulls chr22 reference
```
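
After setup finishes, it can help to confirm that the command-line tools are actually on your `PATH`. The sketch below is an assumption-laden example rather than part of the repository: `check_tools` is a hypothetical helper, and the executable names are the ones these tools typically install under (for example via Bioconda), which may differ from what `environment/setup.sh` provides. If the setup script creates a conda environment, activate it first.

```bash
# Sanity check after setup: report which command-line tools are on PATH.
# Executable names are assumptions; adjust them to match environment/setup.sh.
check_tools() {
  local missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok       $tool"
    else
      echo "MISSING  $tool"
      missing=$((missing + 1))
    fi
  done
  # Exit status is the number of missing tools (0 means all were found).
  return "$missing"
}

check_tools fastqc trimmomatic bwa samtools gatk hisat2 featureCounts \
  || echo "Some tools were not found; activate the workshop environment and rerun."
```

DESeq2, IGV, and the Python packages are not covered by this check, since they are not standalone command-line executables in the same sense.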

## Directory Structure

```
bioinformatics-workshop/
├── README.md
├── SCHEDULE.md
├── environment/       conda env, requirements, Dockerfile
├── data/              raw reads, reference genome, example files
├── day1/              modules 1-4
├── day2/              modules 5-8
├── slides/            lecture slides (Markdown → reveal.js)
└── instructor/        setup checklist, notes, troubleshooting
```

## Tools Used

FastQC, Trimmomatic, BWA, SAMtools, GATK4, HISAT2, featureCounts,
DESeq2 (R), IGV, Python (pandas, matplotlib, seaborn, Biopython)
