Current Topics in Genome Analysis: Final Lecture Summary

Jul 4, 2024

Introduction

  • Speaker: Dr. Elaine Mardis
  • Hosts: Dr. Andy Baxevanis & Dr. Tyra Wolfsberg
  • Affiliations: Washington University in St. Louis
  • Background: BS in Zoology and PhD in Chemistry and Biochemistry, both from the University of Oklahoma

Dr. Mardis' Contributions

  • Key player in sequencing methods and automation for the Human Genome Project
  • Involved in The Cancer Genome Atlas, the Human Microbiome Project, and the 1000 Genomes Project
  • Sequenced genomes of several species: mouse, chicken, platypus, etc.
  • Recognized as one of the most influential scientific minds by Thomson Reuters

Next-Generation Sequencing (NGS) Technologies

  • Basics: Common core principles, library construction, amplification on a solid surface
  • Library Construction: Simple molecular biology steps including amplification or ligation with custom linkers/adapters
  • Amplification: Increases signal for accurate DNA sequencing readouts
  • Integrated Data Production: Sequencing and detection occur in lockstep in NGS
  • Massively Parallel Sequencing: Hundreds of thousands to millions of reactions run simultaneously
  • Digital Read Type: Each read is a discrete observation, so DNA/RNA abundance can be quantified by counting reads
  • Short Read Length: NGS produces shorter reads (100-400 bp) than Sanger sequencing (600-800 bp)
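
The "digital read type" point above can be illustrated with a small sketch. Because each read is a discrete observation, abundance can be estimated by counting the reads assigned to each feature and scaling by the total. The function name, the counts-per-million scaling, and the toy gene labels are assumptions chosen for illustration, not something specified in the lecture.

```python
from collections import Counter

def counts_per_million(read_assignments):
    """Convert per-read feature labels into counts per million (CPM)."""
    counts = Counter(read_assignments)
    total = sum(counts.values())
    return {feature: n * 1_000_000 / total for feature, n in counts.items()}

# Toy input: each entry is the (hypothetical) feature one read was assigned to.
reads = ["geneA"] * 6 + ["geneB"] * 3 + ["geneC"] * 1
cpm = counts_per_million(reads)
```

Scaling by total read count makes samples sequenced to different depths comparable. Real quantitation pipelines typically also account for feature length and other biases.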

Detailed Library Construction Steps

  • DNA fragmentation by sonication (sound waves)
  • Ligation of synthetic DNA adapters
  • Size selection to obtain a narrow range of fragment sizes
  • Quantitation of libraries before amplification
  • Enzymatic (PCR) amplification, which introduces some bias

Subgenome Approaches

  • Hybrid capture for exome sequencing
  • Biotinylated probes hybridize to specific genome regions, which are then pulled down with streptavidin-coated magnetic beads
  • Multiplex PCR for small targeted genome regions

Illumina Sequencing

  • Cluster amplification on a flow cell
  • Stepwise sequencing by detecting labeled nucleotides as they are incorporated
  • Signal-to-noise issues caused by imperfect chemistry
  • Ongoing improvements in capacity, throughput, and data-analysis software

Ion Torrent Sequencing

  • Label-free sequencing using native nucleotides
  • Hydrogen ions released upon nucleotide incorporation are detected as a pH change (effectively a miniature pH meter)
  • Bead-based emulsion PCR amplification
  • Short reads, with higher error rates in homopolymer runs
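
The homopolymer problem noted above can be sketched in a few lines. In flow-based sequencing, one nucleotide is supplied per flow and the signal scales with the number of bases incorporated, so a run of identical bases is read as a single, larger signal that must be rounded to an integer length. The code below is a simplified, noise-free model. The repeating "TACG" flow order, the function names, and the example signal values are assumptions for illustration, and the template is treated as the strand being synthesized.

```python
def ideal_flowgram(sequence, flow_order="TACG", max_flows=1000):
    """Return a list of (base, run_length) per flow for a noise-free run."""
    flows = []
    pos = 0
    for i in range(max_flows):
        if pos >= len(sequence):
            break
        base = flow_order[i % len(flow_order)]
        run = 0
        # All consecutive copies of the flowed base incorporate in one flow.
        while pos < len(sequence) and sequence[pos] == base:
            run += 1
            pos += 1
        flows.append((base, run))
    return flows

def call_bases(signals):
    """Round each (base, signal) pair to the nearest integer run length."""
    return "".join(base * round(signal) for base, signal in signals)

ideal = ideal_flowgram("TAAAAAC")
# A modest measurement error on the 5-base run changes the called length.
noisy_call = call_bases([("T", 1.1), ("A", 4.4), ("C", 0.9)])
```

In this toy case the five-base A run measured as 4.4 is called as four bases, producing a deletion, whereas the same absolute error on a single-base flow would not change the call.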

Data Analysis Challenges

  • Accurate alignment to the reference genome is critical
  • Identifying duplicate reads and correcting local misalignments
  • Checking coverage and evaluating SNP calls
  • Visual examination of data using the Integrative Genomics Viewer (IGV)
  • Bulk tools like RefCov used for assessing large datasets
  • Pipeline for somatic variant discovery involving various analytical steps
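
Two of the steps listed above, duplicate identification and coverage assessment, can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline described in the lecture: reads are reduced to hypothetical (chrom, start, length, strand) records with 0-based coordinates, and a duplicate is defined simply as a read sharing chromosome, start, and strand with an earlier read. Production tools such as Picard MarkDuplicates or samtools markdup use more information, for example mate positions and base qualities.

```python
from collections import namedtuple

# Hypothetical minimal read record; real data would come from a BAM file.
Read = namedtuple("Read", "chrom start length strand")

def remove_duplicates(reads):
    """Keep the first read seen for each (chrom, start, strand) key."""
    seen = set()
    unique = []
    for r in reads:
        key = (r.chrom, r.start, r.strand)
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique

def depth(reads, chrom, region_start, region_end):
    """Per-base depth over the half-open interval [region_start, region_end)."""
    cov = [0] * (region_end - region_start)
    for r in reads:
        if r.chrom != chrom:
            continue
        lo = max(r.start, region_start)
        hi = min(r.start + r.length, region_end)
        for pos in range(lo, hi):
            cov[pos - region_start] += 1
    return cov

reads = [
    Read("chr1", 0, 4, "+"),
    Read("chr1", 0, 4, "+"),  # same start and strand: treated as a duplicate
    Read("chr1", 2, 4, "+"),
    Read("chr1", 2, 4, "-"),  # opposite strand: kept
]
unique = remove_duplicates(reads)
coverage = depth(unique, "chr1", 0, 6)
```

Removing duplicates before computing depth matters because PCR duplicates inflate apparent coverage without adding independent evidence for a variant.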

Transition to Clinical Applications

  • Integration of whole genome, exome, and transcriptome sequencing for cancer genomics
  • RNA-seq used to confirm expression and to detect gene fusions
  • Identifying gene-drug interactions by annotating results against drug-gene interaction databases

Future Directions and Innovations

  • Third-generation sequencing (Pacific Biosciences, PacBio)
  • Single molecule real-time (SMRT) sequencing for longer reads
  • Error correction through coverage: random errors are outvoted when many reads cover the same position
  • Applications in improving the human reference genome and in cancer genomics
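
The idea of error correction through coverage can be shown with a per-position majority vote. The sketch below assumes the reads are already aligned, have equal length, and contain only substitution errors. That is a deliberate simplification, since single-molecule reads also contain insertions and deletions that require a real alignment step. The function name and toy reads are hypothetical.

```python
from collections import Counter

def consensus(aligned_reads):
    """Majority-vote consensus over equal-length, pre-aligned reads."""
    columns = zip(*aligned_reads)
    # most_common(1) returns the most frequent base in each column;
    # ties resolve to the base seen first in that column.
    return "".join(Counter(col).most_common(1)[0][0] for col in columns)

reads = [
    "ACGTTGCA",
    "ACGTAGCA",  # substitution at position 4
    "ACCTTGCA",  # substitution at position 2
    "ACGTTGCA",
    "AGGTTGCA",  # substitution at position 1
]
corrected = consensus(reads)
```

Three of the five reads carry an error, yet the consensus recovers the underlying sequence because the errors fall at different positions. The approach works when errors are random rather than systematic.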

Nanopore Sequencing

  • Emerging technology for small, portable sequencing devices
  • Error rates still high, but potential for field applications

Case Studies

  • Diagnostic and therapeutic success stories using comprehensive genomic approaches
  • Potential for personalized vaccine development using T-cell and peptide strategies

Conclusion

  • Importance of integrating multiple sequencing technologies for comprehensive genomic analysis
  • Emphasis on future potential and translational applications in medicine and research

Q&A Highlights

  • Discussion of historical sequencing biases and clinical sample handling
  • Potential application of nanopore technology in diverse fields such as forensics