| Step | Functionality |
|---|---|
| Input Data | Accepts raw Nanopore FASTQ reads and per-sample metadata via a CSV file. |
| Raw Read Quality Control | Uses NanoStat to evaluate raw read quality metrics such as length, quality score, and yield. |
| Adapter Trimming & Quality Control | Removes adapter sequences and performs post-trimming QC to assess read data integrity and retention. |
| ITR Masking | Detects and masks variable Inverted Terminal Repeat (ITR) regions in the transgene cassette to improve alignment accuracy. |
| Combined Reference Preparation | Builds a combined reference by concatenating the host genome, masked transgene, Rep-Cap, and helper plasmids. |
| Reference Mapping | Aligns reads to the combined reference using minimap2. |
| Read Classification | Identifies the genomic source of each read (e.g., transgene, helper, host). |
| Coverage & Truncation Analyses | Calculates depth of coverage from ITR to ITR and detects common truncation hotspots. |
| Subgenome Typing | Categorizes reads into rAAV subgenome types (e.g., complete, partial, snapback, plasmid backbone). |
| Variant Detection | Identifies sequence variants in the transgene plasmid relative to the reference. |
| Transgene Consensus | Generates a polished consensus sequence of the transgene insert from all mapped reads. |
| BAM Tagging | Adds metadata tags to each read indicating genome source and subgenome classification. |
| Summary Report & Output | Produces interactive QC reports, read alignment statistics, variant summaries, tagged BAMs, and a consensus FASTA. |

For a detailed breakdown of the workflow and visual diagrams of rAAV subgenome types, please visit the official EPI2ME AAV Workflow GitHub repository.
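
As a rough illustration of the Read Classification step in the table above, the sketch below labels each read by the contig it aligned to in the combined reference. It is not the workflow's own code: the contig names, the `classify_read` and `summarize` helpers, and the rule that any unlisted contig counts as host are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical contig names for the non-host parts of a combined reference.
# These are illustrative assumptions, not the workflow's real identifiers.
CONTIG_SOURCE = {
    "transgene_plasmid": "transgene",
    "rep_cap_plasmid": "rep_cap",
    "helper_plasmid": "helper",
}


def classify_read(contig):
    """Return the source label for the contig a read aligned to.

    `contig` is None for an unmapped read. Any contig not listed in
    CONTIG_SOURCE is assumed (for this sketch) to belong to the host genome.
    """
    if contig is None:
        return "unmapped"
    return CONTIG_SOURCE.get(contig, "host")


def summarize(alignments):
    """Count reads per source from (read_id, contig) pairs."""
    return Counter(classify_read(contig) for _, contig in alignments)


# Toy input: one (read_id, contig) pair per primary alignment.
alignments = [
    ("read1", "transgene_plasmid"),
    ("read2", "chr1"),
    ("read3", "helper_plasmid"),
    ("read4", None),
    ("read5", "transgene_plasmid"),
]
counts = summarize(alignments)
for source, n in sorted(counts.items()):
    print(source, n)
```

In practice the (read, contig) pairs would come from the minimap2 alignments; a plain list is used here so the example runs without extra dependencies.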

Delivered Data Files

Nine data files will be generated per sample. These files are prefixed with the TUBE_ID provided in the sample submission form.
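
One way to check a delivery against this description is to group the received filenames by TUBE_ID prefix and confirm that each sample has nine files. The sketch below assumes only that every filename starts with its TUBE_ID; the example IDs, the filenames, and the helper names `group_by_tube_id` and `incomplete_samples` are hypothetical.

```python
EXPECTED_FILES_PER_SAMPLE = 9  # count stated in the text above


def group_by_tube_id(filenames, tube_ids):
    """Map each TUBE_ID to the filenames that start with it."""
    # Try longer IDs first so that "S10" files are not captured by "S1".
    ordered = sorted(tube_ids, key=len, reverse=True)
    groups = {tube_id: [] for tube_id in tube_ids}
    for name in filenames:
        for tube_id in ordered:
            if name.startswith(tube_id):
                groups[tube_id].append(name)
                break
    return groups


def incomplete_samples(groups):
    """Return {TUBE_ID: file_count} for samples without the expected count."""
    return {
        tube_id: len(files)
        for tube_id, files in groups.items()
        if len(files) != EXPECTED_FILES_PER_SAMPLE
    }


# Toy delivery: sample "S1" is complete, sample "S10" is missing one file.
# The filename pattern is made up for the example.
delivered = [f"S1_file{i}.dat" for i in range(9)]
delivered += [f"S10_file{i}.dat" for i in range(8)]

groups = group_by_tube_id(delivered, ["S1", "S10"])
print(incomplete_samples(groups))
```

Matching the longest TUBE_ID first avoids misassigning files when one ID is a prefix of another.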