Automated TE Annotation with Earl Grey

Apr 24, 2025

Earl Grey: A Fully Automated TE Curation and Annotation Pipeline

Overview

  • Earl Grey is an automated transposable element (TE) annotation pipeline.
  • It combines widely used TE annotation tools with a consensus elongation process to better define de novo consensus sequences when annotating new genome assemblies.

Features

  • Automates TE curation and annotation.
  • Combines established TE annotation tools with a consensus elongation process.
  • Designed for annotating newly assembled genomes.

Latest Release

  • v6.0.1: Bug fixes for RepeatMasker Libraries verification.
  • Docker container available.

Previous Changes

  • v6.0.0: Updates to use Dfam v3.9, RepeatMasker v4.1.8, famdb v2.0.1.
  • Following this update, RepeatMasker must be configured with the Dfam database before use.
  • v5.1.1: Compatibility improvements with genome sequencing data, new readable summary tables.
  • v5.0.0: Subroutines added for de novo TE detection and annotation.

Example Run

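  • A single run works through the steps needed to identify, curate, and annotate TEs.
  • Running inside a tmux or screen session is recommended, so that a long run is not lost if the terminal connection drops.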
  • Parameters:
    • Required: -g (genome FASTA file), -s (species name), -o (output directory).
    • Optional: -t (number of threads), -r (RepeatMasker search term), -l (starting repeat library), -i (number of iterations), etc.
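
The parameters above can be combined into a single command line. The following is a minimal sketch, assuming hypothetical input names (genome.fasta, myspecies, earlgrey_out) and a thread count of 8. It only prints the command so it can be reviewed before being launched inside tmux.

```shell
# Hypothetical inputs: replace with your own genome, species label, and output directory.
genome="genome.fasta"
species="myspecies"
outdir="earlgrey_out"
threads=8   # do not request more threads than the machine has

# Assemble the call from the required flags (-g, -s, -o) plus the optional -t.
cmd="earlGrey -g $genome -s $species -o $outdir -t $threads"

# Dry run: print the command for review instead of executing it.
echo "$cmd"

# To start it in a detached tmux session (assumes tmux is installed and the
# earlgrey environment is available in the shell that tmux starts):
#   tmux new-session -d -s earlgrey "$cmd"
#   tmux attach -t earlgrey
```

Printing the command first is a deliberate choice here: a mistyped path is cheaper to catch before a long run begins.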

Outputs

  • Results are written to the summaryFiles directory, including:
    • High-level and family-level TE quantification tables.
    • Repeat landscapes and pie charts of genome repeat content.
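
After a run finishes, the summary directory can be located with a short helper. This is a sketch only: the function name find_summaries and the output directory earlgrey_out are hypothetical, and the search matches on the directory name rather than a fixed path because the exact nesting inside the output directory is an assumption that may differ between versions.

```shell
# Print every directory under the given path whose name contains "summaryFiles".
# Assumption: the nesting may vary, so match on the name, not a fixed path.
find_summaries() {
    find "$1" -type d -name '*summaryFiles*'
}

# Example: list the contents of each summary directory, if the
# hypothetical output directory earlgrey_out exists.
outdir="earlgrey_out"
if [ -d "$outdir" ]; then
    find_summaries "$outdir" | while IFS= read -r dir; do
        echo "== $dir"
        ls -lh "$dir"
    done
fi
```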

Installation

  • Conda/Mamba:
    • conda create -n earlgrey -c conda-forge -c bioconda earlgrey=6.0.1
    • mamba create -n earlgrey -c conda-forge -c bioconda earlgrey=6.0.1
  • Docker: Images are available with Dfam v3.9 and v3.7.
  • Singularity: Preconfigured with Dfam v3.7.
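
Once the environment has been created, it can help to confirm that the pipeline and its core dependencies are visible on PATH before starting a run. The sketch below uses a hypothetical helper, check_tools. The executable names earlGrey, RepeatMasker, and RepeatModeler are the names these tools are usually installed under, and should be treated as assumptions for a given installation.

```shell
# Report whether each named executable can be found on PATH.
# Returns non-zero if any of them is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# After `conda activate earlgrey`, check the pipeline and its dependencies:
#   check_tools earlGrey RepeatMasker RepeatModeler
```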

Usage Without Installation

  • Use via Gitpod for in-browser execution.

Acknowledgements

  • Cites Baril et al. (2024) for the Earl Grey pipeline.
  • Utilizes scripts from RepeatCraft and other open-source software.

Additional Resources

  • Important: Keep RepeatMasker and RepeatModeler updated for compatibility.
  • Note: Running time depends on the repeat content of the input genome, particularly during the RepeatModeler2 step.
  • Manual configuration might be needed for advanced usage or specific installations.