Developing bioinformatic programs for computational epigenomics is a large part of my research effort. Most of my postdoctoral work has focused on low-pass whole-genome bisulfite sequencing, which has led to two programs: CpG_Me and DMRichR.
CpG_Me is an optimized and comprehensive whole-genome bisulfite sequencing (WGBS) alignment pipeline for high-performance computing clusters that use the SLURM job scheduler. The pipeline consists of Trim Galore!, Bismark (Bowtie 2), FastQ Screen, Picard Tools, and MultiQC. CpG_Me takes you from raw FASTQ files to CpG methylation count matrices (Bismark cytosine reports), processing the data to remove biases and providing ample QC/QA along the way. Scripts are available for both paired-end (PE) and single-end (SE) sequencing approaches. One command call will align all your samples. The extracted CpG methylation count matrices can then be used to identify differentially methylated regions (DMRs) through the accompanying DMRichR workflow.
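To make the pipeline's end product concrete, here is a minimal Python sketch of reading a Bismark cytosine report into per-CpG counts. It assumes the standard tab-separated report layout (chromosome, position, strand, methylated count, unmethylated count, C-context, trinucleotide context). The function names and the example rows are hypothetical and for illustration only; this is not code from CpG_Me or DMRichR.

```python
import csv
import io
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical example rows in the assumed Bismark cytosine report layout:
# chrom, position, strand, count methylated, count unmethylated, context, trinucleotide
EXAMPLE_REPORT = (
    "chr1\t10469\t+\t3\t1\tCG\tCGC\n"
    "chr1\t10470\t-\t0\t2\tCG\tCGG\n"
    "chr1\t10472\t+\t0\t0\tCHH\tCAT\n"
)

CpGKey = Tuple[str, int, str]   # (chromosome, position, strand)
CpGCounts = Tuple[int, int]     # (methylated, unmethylated)


def read_cpg_counts(lines: Iterable[str]) -> Dict[CpGKey, CpGCounts]:
    """Collect (methylated, unmethylated) counts for rows in CG context."""
    counts: Dict[CpGKey, CpGCounts] = {}
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 7:
            continue  # skip blank or malformed lines
        chrom, pos, strand, meth, unmeth, context = row[:6]
        if context != "CG":
            continue  # keep CpG sites only
        counts[(chrom, int(pos), strand)] = (int(meth), int(unmeth))
    return counts


def methylation_fraction(meth: int, unmeth: int) -> Optional[float]:
    """Return methylated / total reads, or None when the site has no coverage."""
    total = meth + unmeth
    return meth / total if total else None


if __name__ == "__main__":
    for key, (meth, unmeth) in read_cpg_counts(io.StringIO(EXAMPLE_REPORT)).items():
        print(key, meth, unmeth, methylation_fraction(meth, unmeth))
```

In low-pass data many sites have only a handful of reads, so the sketch keeps the raw counts alongside the fraction rather than reducing each site to a single percentage.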
DMRichR is an R package and accompanying executable R script for the statistical inference and downstream analysis of differentially methylated regions (DMRs) from aligned WGBS, RRBS, and EM-seq data. The overarching theme of DMRichR is the synthesis of popular Bioconductor R packages for the analysis of genomic data with the tidyverse philosophy of R programming. DMRichR leverages the statistical algorithms from dmrseq and bsseq, which enable the inference of DMRs from low-pass WGBS, and provides a number of novel upstream and downstream functions. One command-line call produces almost all the publication-quality figures you need from your aligned data. DMRichR also offers a number of functions for enrichment testing, data visualization, and machine learning.