Quality trimming
[Figure: quality score by read position (bp)]
Source: https://training.galaxyproject.org/training-material/topics/sequence-analysis/tutorials/quality-control/tutorial.html
fastp
Chen, et al. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics, 34:i884–90 (2018).
fastp: automatic adapter trimming (Chen et al., 2018)
fastp: base correction (Chen et al., 2018)
- Looks for overlaps between paired-end reads
- If mismatches are found within the overlap:
  - Corrects only if the quality scores are imbalanced (one base high quality, the other low)
  - Corrects only if the total number of mismatches is below a threshold (default: 5), which reduces false corrections
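The correction rule can be sketched in a few lines of Python. This is a simplified illustration, not fastp's implementation: the function name `correct_overlap` is hypothetical, the two overlap segments are assumed to be already aligned (read 2 reverse-complemented), and the `high_q` / `low_q` cut-offs are illustrative assumptions for "imbalanced" quality.

```python
def correct_overlap(bases1, quals1, bases2, quals2,
                    max_mismatches=5, high_q=30, low_q=15):
    """Correct mismatches inside an already-aligned read overlap.

    Hypothetical helper: bases2/quals2 are assumed to be the reverse
    complement of read 2, aligned position by position to bases1.
    Returns (corrected_bases1, corrected_bases2, n_corrected).
    """
    mismatches = [i for i, (a, b) in enumerate(zip(bases1, bases2)) if a != b]

    # Too many mismatches: the overlap is not trusted, so nothing is corrected.
    if len(mismatches) > max_mismatches:
        return bases1, bases2, 0

    out1, out2 = list(bases1), list(bases2)
    corrected = 0
    for i in mismatches:
        # Correct only when one base is confident and the other is not.
        if quals1[i] >= high_q and quals2[i] <= low_q:
            out2[i] = out1[i]
            corrected += 1
        elif quals2[i] >= high_q and quals1[i] <= low_q:
            out1[i] = out2[i]
            corrected += 1
    return "".join(out1), "".join(out2), corrected
```

A mismatch where both bases have similar quality is left untouched, since there is no basis for choosing one base over the other.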
fastp: sliding-window quality trimming (Chen et al., 2018)
- The window can slide from either end of the read
- Evaluates the average quality score within the window
- If the average is below the threshold, the bases are discarded and the window moves forward
- If the average is above the threshold, trimming ends
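The sliding-window rule can be sketched as follows. This is a minimal illustration under stated assumptions rather than fastp's code: the function names are hypothetical, the window advances one base at a time, trimming stops once fewer than `window` bases remain, and the `window` and `threshold` defaults are illustrative values.

```python
def mean_quality(quals):
    """Average of a non-empty list of Phred scores."""
    return sum(quals) / len(quals)


def cut_front(seq, quals, window=4, threshold=20):
    """Trim from the 5' end while the window's mean quality is below threshold."""
    start = 0
    while (start + window <= len(quals)
           and mean_quality(quals[start:start + window]) < threshold):
        start += 1  # drop one base and slide the window forward
    return seq[start:], quals[start:]


def cut_tail(seq, quals, window=4, threshold=20):
    """Trim from the 3' end while the window's mean quality is below threshold."""
    end = len(quals)
    while (end - window >= 0
           and mean_quality(quals[end - window:end]) < threshold):
        end -= 1  # drop one base and slide the window backward
    return seq[:end], quals[:end]


seq = "ACGTACGTACG"
quals = [2, 2, 2, 30, 30, 30, 30, 30, 30, 2, 2]
front_seq, front_quals = cut_front(seq, quals)
tail_seq, tail_quals = cut_tail(seq, quals)
```

Because the decision is based on a window average, a single low-quality base can survive at the edge of the trimmed read when its neighbours are high quality.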
fastp: polyG/polyX tail trimming
- PolyG tails are common with 2-colour sequencing chemistry (NextSeq/NovaSeq), but not with HiSeq (4-colour)
Source: https://sequencing.qcfail.com/articles/illumina-2-colour-chemistry-can-overcall-high-confidence-g-bases/
fastp: polyG correction example (Chen et al., 2018)
fastp: polyX (Chen et al., 2018)
- Can be enabled to trim low-complexity (A/T/G/C) tails at the 3′ end of a read
- Enable via: -x / --trim_poly_x
- Minimum tail length can be set with: --poly_x_min_len
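The idea behind tail trimming can be sketched as below. This is a deliberately simplified illustration: the function name `trim_poly_x` is hypothetical, the `min_len` value is illustrative, and the sketch requires a perfect homopolymer run, whereas a real trimmer may tolerate a few mismatches inside the tail.

```python
def trim_poly_x(seq, min_len=10):
    """Remove a homopolymer run at the 3' end if it is at least min_len long.

    Simplified assumption: the run must be a perfect repeat of one base.
    """
    if not seq:
        return seq
    last = seq[-1]
    # Length of the trailing run of the final base.
    run = len(seq) - len(seq.rstrip(last))
    return seq[:-run] if run >= min_len else seq


poly_a = trim_poly_x("ACGT" + "A" * 12)
poly_g = trim_poly_x("ACGT" + "G" * 10)
short_tail = trim_poly_x("ACGTAAA")
```

Restricting the same check to G gives the polyG case relevant to 2-colour instruments.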
fastp: UMIs (Chen et al., 2018)
- -U / --umi
- --umi_loc index1/index2/read1/read2/per_index/per_read
- --umi_len
- --umi_prefix (e.g., UMI_AATTCG)
- --umi_skip
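To illustrate what these options control when the UMI sits at the start of a read, here is a minimal sketch. The function `extract_umi` is hypothetical, and the exact placement of the UMI in the read name (appended with a colon to the first whitespace-delimited field) is an assumption made for illustration.

```python
def extract_umi(name, seq, qual, umi_len, umi_skip=0, umi_prefix=""):
    """Move a UMI from the start of a read into the read name.

    Assumptions (illustrative only): the UMI is the first umi_len bases,
    umi_skip further bases are discarded after it, and the tag is appended
    with ':' to the first field of the read name.
    """
    umi = seq[:umi_len]
    tag = f"{umi_prefix}_{umi}" if umi_prefix else umi

    # Split the name at the first space so the comment field is preserved.
    head, sep, rest = name.partition(" ")
    new_name = f"{head}:{tag}{sep}{rest}"

    cut = umi_len + umi_skip
    return new_name, seq[cut:], qual[cut:]


new_name, new_seq, new_qual = extract_umi(
    "@read1 1:N:0", "AATTCGTTACGTACGT", "I" * 16,
    umi_len=6, umi_skip=2, umi_prefix="UMI",
)
```

Keeping the UMI in the read name lets downstream tools group reads by molecule without the UMI bases affecting alignment.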
fastp: additional features (Chen et al., 2018)
- Output splitting by number of files or number of lines
- Duplication evaluation
- Overrepresented sequence analysis (-p)
  - FastQC only tracks the first 1M reads
  - fastp performs uniform sampling
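The difference between tracking only the first reads and sampling uniformly can be shown with a toy example. The function names and the sampling interval are hypothetical and chosen for illustration; this is not how either tool is implemented.

```python
from collections import Counter
from itertools import islice


def count_first_n(reads, n):
    """Count sequences among the first n reads only."""
    return Counter(islice(reads, n))


def count_every_kth(reads, k):
    """Count sequences among every k-th read across the whole input."""
    return Counter(islice(reads, 0, None, k))


# Toy input: the second half of the file is dominated by a different sequence.
reads = ["AAAA"] * 100 + ["CCCC"] * 100

head_counts = count_first_n(reads, 50)
uniform_counts = count_every_kth(reads, 20)
```

In this toy input the head-only count never sees `CCCC`, while uniform sampling observes both sequences in proportion to their abundance.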
fastp: duplication example (Chen et al., 2018)
fastp: overrepresented sequence example (Chen et al., 2018)
fastp is very fast: C++, multi-threaded (Chen et al., 2018)
fastp trims well (Chen et al., 2018)
fastp does not convert qualities
- Supports phred64 input (converted to phred33) via -6 / --phred64
Source: https://training.galaxyproject.org/training-material/topics/sequence-analysis/tutorials/quality-control/tutorial.html
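For reference, the phred64-to-phred33 conversion is a fixed shift of each quality character by 31 ASCII positions (64 minus 33). The sketch below shows the arithmetic only; the function names are hypothetical and this is not fastp's code.

```python
def phred64_to_phred33(qual):
    """Re-encode a phred64 quality string as phred33 (shift by 64 - 33 = 31)."""
    return "".join(chr(ord(c) - 31) for c in qual)


def decode_phred33(qual):
    """Convert a phred33 quality string to a list of integer scores."""
    return [ord(c) - 33 for c in qual]


# 'h' is Q40 and '@' is Q0 in phred64; they become 'I' and '!' in phred33.
converted = phred64_to_phred33("h@")
scores = decode_phred33(converted)
```

The numeric scores are unchanged by the conversion; only the ASCII offset used to encode them differs.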