Overview
This webinar introduced RNA-Seq, covering its background, workflow, and applications in gene expression analysis using next-generation sequencing (NGS). It closed with a Q&A on common experimental and technical considerations.
Gene Expression Analysis Background
- Gene expression can be studied at the DNA, RNA, or protein level.
- Earlier methods included Northern blot (RNA detection), RT-qPCR (RNA quantification), and microarrays (simultaneous gene expression profiling).
- Each earlier method required prior knowledge of target sequences and had limitations for discovery of novel genes.
Introduction to RNA-Seq and NGS
- RNA-Seq uses NGS to quantify and discover all expressed genes in a sample, including novel transcripts.
- Advantages of RNA-Seq: high throughput, single-nucleotide resolution, no need for prior sequence data.
- Widespread adoption of RNA-Seq followed advances in NGS technologies, such as Illumina's sequencing platforms.
NGS Workflow and Key Considerations
- Basic NGS workflow: input material (DNA/RNA) → fragmentation → adapter ligation → sequencing.
- A “read” is the nucleotide sequence generated from a DNA or RNA fragment; reads can be single-end (sequenced from one end of the fragment) or paired-end (sequenced from both ends).
- Read length (e.g., 75, 150, 300 nucleotides) affects downstream analysis and transcriptome assembly.
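The difference between single-end and paired-end reads can be illustrated with a toy sketch. The fragment sequence and the function names below are made up for illustration and are not from the webinar. The sketch assumes that the second read of a pair comes from the opposite strand, so it appears as the reverse complement of the fragment's far end.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def single_end_read(fragment: str, read_length: int) -> str:
    """Read `read_length` bases from one end of the fragment only."""
    return fragment[:read_length]


def paired_end_reads(fragment: str, read_length: int) -> tuple[str, str]:
    """Read `read_length` bases inward from both ends of the fragment.

    Read 2 is taken from the opposite strand, so it is the reverse
    complement of the fragment's last `read_length` bases.
    """
    read1 = fragment[:read_length]
    read2 = reverse_complement(fragment[-read_length:])
    return read1, read2


# Hypothetical 31-nt fragment, used only to show the two read layouts.
fragment = "ATGGCGTACGTTAGCCTAGGATCCGATTACA"
print("single-end:", single_end_read(fragment, 10))
print("paired-end:", paired_end_reads(fragment, 10))
```

With a 10-nt read length, the single-end read covers only the first 10 bases, while the paired-end layout also samples the last 10 bases and leaves the middle of the fragment unsequenced. Longer reads or shorter fragments narrow that gap.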
RNA-Seq Experimental Design
- Most cellular RNA is ribosomal (rRNA) or transfer RNA (tRNA); mRNA, typically the target, comprises only 2-3% of total RNA.
- Enrichment methods: poly(A) enrichment (for eukaryotic mRNA) and rRNA depletion (for both eukaryotes and prokaryotes).
- Project goals determine sequencing depth, read length, and single- vs paired-end reads.
RNA-Seq Workflow and Quality Control
- Four main steps: library preparation, bridge PCR (cluster generation), sequencing by synthesis, and data analysis.
- Quality control at each step: check RNA integrity (gel), DNA concentration (Qubit), fragment size (Bioanalyzer), and sequenceable library fraction (qPCR).
- Avoid over- or under-clustering to ensure optimal sequencing results.
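Loading the right amount of library is one way to avoid over- or under-clustering, and that usually means converting a mass concentration (for example from Qubit) and a mean fragment size (for example from Bioanalyzer) into a molar concentration. The webinar did not describe this calculation; the sketch below uses the commonly quoted approximation of 660 g/mol per base pair of double-stranded DNA, and the function name and example values are hypothetical.

```python
# Approximate average mass of one double-stranded DNA base pair (g/mol).
AVG_BP_MASS_G_PER_MOL = 660.0


def library_molarity_nm(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA library concentration to nanomolar (nM).

    nM = (ng/uL * 1e6) / (660 g/mol/bp * mean fragment length in bp)
    """
    if conc_ng_per_ul < 0 or mean_fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and fragment size > 0")
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS_G_PER_MOL * mean_fragment_bp)


# Hypothetical library: 2.0 ng/uL with a 400 bp mean fragment size.
print(round(library_molarity_nm(2.0, 400.0), 2), "nM")
```

This estimate counts every DNA fragment in the tube, including fragments without adapters on both ends, which is why the qPCR check for the sequenceable fraction is still needed.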
Data Analysis and Interpretation
- Raw sequencing data is converted to FASTQ files, which store each read's sequence together with a quality score for every base.
- Reads must be aligned to a reference genome/transcriptome for downstream analysis.
- Normalization methods (e.g., FPKM: Fragments Per Kilobase of transcript per Million mapped reads) adjust for differences in gene length and sequencing depth.
- Analytical outputs: heatmaps, principal component analysis, and functional annotation of differentially expressed genes.
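To make the FASTQ step above more concrete, here is a minimal parsing sketch. It assumes the common four-lines-per-read layout and Phred+33 quality encoding; other encodings exist, so the offset is a parameter. The record type, function name, and example read are illustrative only, and a real pipeline would normally rely on an established parser.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    read_id: str
    sequence: str
    qualities: list[int]  # one Phred score per base


def parse_fastq(handle: TextIO, phred_offset: int = 33) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream with four lines per read.

    Parsing stops at end of input or at the first empty header line.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        quality_line = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality_line):
            raise ValueError(f"Malformed FASTQ record: {header!r}")
        # Each quality character encodes a Phred score as ASCII code minus offset.
        qualities = [ord(ch) - phred_offset for ch in quality_line]
        yield FastqRecord(header[1:], sequence, qualities)


# Made-up single-read example.
example = "@read1\nGATTACA\n+\nIIII5+!\n"
for record in parse_fastq(io.StringIO(example)):
    mean_quality = sum(record.qualities) / len(record.qualities)
    print(record.read_id, record.sequence, record.qualities, round(mean_quality, 2))
```

A Phred score Q corresponds to an estimated base-call error probability of 10^(-Q/10), so the falling scores at the end of this example read would mark its last bases as less reliable.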
Applications and Example Projects
- ENCODE and modENCODE projects: mapped genomic regulatory elements.
- The Cancer Genome Atlas (TCGA): used RNA-Seq to profile cancer transcriptomes.
- RNA-Seq supports advances in personalized medicine for genetic diseases.
Q&A Highlights
- Typical turnaround time from sample receipt to data: 4-6 weeks.
- ABM offers bioinformatics services for analysis of user-provided sequencing data.
- If samples fail QC, clients are contacted for resubmission or alternative solutions.
- RNA-Seq and microRNA-Seq require separate processing, so running both on the same sample means submitting separate samples or double the sample amount.
Key Terms & Definitions
- RNA-Seq — sequencing technique to quantify and discover RNA transcripts in a sample.
- Next-Generation Sequencing (NGS) — high-throughput sequencing technologies for DNA/RNA.
- Read — sequence of nucleotides generated from a DNA/RNA fragment during sequencing.
- FPKM — normalization metric: Fragments Per Kilobase of transcript per Million mapped reads.
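The FPKM definition above can be written as a short calculation: the fragments mapped to a gene, divided by the gene length in kilobases and by the total mapped fragments in millions. The sketch below is only an illustration; the gene names, counts, and lengths are invented, and by default the library total is assumed to be the sum of the counts supplied, whereas a real analysis would use the total mapped fragments for the whole library.

```python
from typing import Optional


def fpkm(
    fragment_counts: dict[str, int],
    gene_lengths_bp: dict[str, int],
    total_mapped: Optional[int] = None,
) -> dict[str, float]:
    """Compute FPKM per gene.

    FPKM = fragments * 1e9 / (gene length in bp * total mapped fragments)
    The factor 1e9 combines "per kilobase" (1e3) and "per million" (1e6).
    """
    if total_mapped is None:
        # Assumption for this sketch: the supplied genes are the whole library.
        total_mapped = sum(fragment_counts.values())
    if total_mapped <= 0:
        raise ValueError("total mapped fragments must be positive")
    return {
        gene: count * 1e9 / (gene_lengths_bp[gene] * total_mapped)
        for gene, count in fragment_counts.items()
    }


# Invented toy data: geneA and geneC have equal counts but different lengths.
counts = {"geneA": 500, "geneB": 1000, "geneC": 500}
lengths = {"geneA": 1000, "geneB": 4000, "geneC": 500}
for gene, value in fpkm(counts, lengths).items():
    print(gene, value)
```

In this toy data geneB has the most fragments but the lowest FPKM because it is the longest gene, while geneC scores twice as high as geneA with the same count because it is half the length.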
Action Items / Next Steps
- Expect webinar slides and answers to Q&A via email.
- Visit ABM’s website for educational resources and technical support.
- Watch for an invitation to the upcoming webinar on whole genome sequencing.