TRANSPOSABLE ELEMENT ANALYSIS PIPELINE

Welcome to the companion site for the Tea repeat analysis pipeline. Tea stands for Transposable Element Analyzer. This site serves both as a source of additional data related to the paper and as a repository for the Tea pipeline. The paper describing Tea is:

LANDSCAPE OF SOMATIC RETROTRANSPOSITION IN HUMAN CANCERS

Eunjung Lee¹,², Rebecca Iskow³, Lixing Yang¹, Omer Gokcumen³, Psalm Haseley¹,², Lovelace J. Luquette III¹, Jens G. Lohr⁴,⁵, Christopher C. Harris⁶, Li Ding⁶, Richard K. Wilson⁶, David A. Wheeler⁷, Richard A. Gibbs⁷, Raju Kucherlapati²,⁸, Charles Lee³, Peter Kharchenko¹,⁹,*, Peter J. Park¹,²,⁹,*, and The Cancer Genome Atlas Research Network

¹Center for Biomedical Informatics, Harvard Medical School, Boston, MA
²Division of Genetics, Brigham and Women's Hospital, Boston, MA
³Department of Pathology, Brigham and Women's Hospital, Boston, MA
⁴The Eli and Edythe Broad Institute, Cambridge, MA
⁵Dana-Farber Cancer Institute, Boston, MA
⁶The Genome Institute, Washington University, School of Medicine, St. Louis, MO
⁷Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX
⁸Department of Genetics, Harvard Medical School, Boston, MA
⁹Informatics Program, Children's Hospital, Boston, MA

Science [DOI:10.1126/science.1222077]

DOWNLOAD ADDITIONAL DATA FILES

Colorectal mutation calls (.maf format, 22MB)
Compilation of known polymorphic TE insertion sites (.txt format, 530KB)
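
As a starting point for exploring the mutation calls, the sketch below tallies rows per gene in a tab-delimited MAF file. It assumes the file follows the common MAF layout (optional comment lines beginning with "#", then a header row containing a Hugo_Symbol column); the function name count_by_column and the in-memory sample rows are illustrative and are not taken from the downloadable file.

```python
import csv
import io
from collections import Counter


def count_by_column(handle, column="Hugo_Symbol"):
    """Count rows per value of `column` in a tab-delimited MAF stream."""
    # Skip comment lines (e.g. "#version 2.2") that may precede the header.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    return Counter(row[column] for row in reader)


# Tiny in-memory stand-in for a real MAF file; these rows are made up
# for illustration only.
sample = io.StringIO(
    "#version 2.2\n"
    "Hugo_Symbol\tChromosome\tVariant_Classification\n"
    "APC\t5\tNonsense_Mutation\n"
    "TP53\t17\tMissense_Mutation\n"
    "APC\t5\tFrame_Shift_Del\n"
)
print(count_by_column(sample).most_common())
```

To run this on the downloaded file, pass an open text handle instead of the sample, and check the header row first in case the column names differ.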

Tea on Github

-Latest Release-

DOWNLOAD Tea PIPELINE

The pipeline can be downloaded using the links provided below. Please see the included README file for instructions on running the pipeline. Installation requires the following software to be present on the system: the CAP3 assembler, Perl, R, Samtools and the BWA aligner. The following R packages are also required: Bioconductor, Rsamtools, IRanges and spp.
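
Before installing, it can help to confirm that the required command-line tools are reachable. The following is a minimal sketch, not part of the Tea distribution; the executable names (cap3, perl, R, samtools, bwa) are assumptions about how the tools are installed on your system, and the R packages are not checked here.

```python
import shutil

# Assumed executable names for the software listed above; adjust them
# to match your installation.
REQUIRED_TOOLS = ["cap3", "perl", "R", "samtools", "bwa"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All required command-line tools were found.")
```

A tool that is installed but not on PATH is reported as missing, so the result depends on the shell environment the script runs in.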

Version 0.6.2
.tar.gz format (5.0GB) md5sum: 39df86512c441a8503e2c597c4fbecd6
Changes: Enabled short read analysis support in the demo scripts. The results in the na18506_35bp demo should now be valid. Additional code cleanup. Updated the README for clarity.
Version 0.6.2 noidx
.tar.gz format (1.2GB) md5sum: 2aa5e25426aaada4af4a46b03db04928
Identical to the above version, but without a BWA hg18 index. The user must supply the index instead.
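
If you use the noidx version, you may want to confirm that your own index is complete before running the pipeline. The sketch below assumes an index built with `bwa index`, which writes files named after the reference prefix with the suffixes .amb, .ann, .bwt, .pac and .sa; some older BWA releases write additional files, and where Tea expects the index to live is described in the README, not here. The function name and the prefix hg18.fa are hypothetical.

```python
from pathlib import Path

# Suffixes written by `bwa index` in current BWA releases (assumption:
# your BWA version follows this layout).
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_index_files(prefix):
    """Return the expected index files that do not exist for `prefix`."""
    return [
        prefix + suffix
        for suffix in BWA_INDEX_SUFFIXES
        if not Path(prefix + suffix).is_file()
    ]


# Hypothetical prefix; replace it with the path to your hg18 index.
for path in missing_index_files("hg18.fa"):
    print("missing:", path)
```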

For the purposes of testing the pipeline we have included demo data files, containing chromosomes 21 and 22 from the three HapMap normal samples analyzed in the paper:

Demo Files
.tar format (6.9GB) md5sum: fcec584aeb39f16ee8e3ce809f23f4e0
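
Because the archives are large, it is worth checking each download against the md5sum listed above before unpacking it. The following is a minimal sketch; the local file names in EXPECTED_MD5 are hypothetical, so substitute the names of the files as you saved them. The file is read in chunks so that a multi-gigabyte archive does not need to fit in memory.

```python
import hashlib

# md5sums as listed on this page; the file names are hypothetical.
EXPECTED_MD5 = {
    "tea_0.6.2.tar.gz": "39df86512c441a8503e2c597c4fbecd6",
    "tea_0.6.2_noidx.tar.gz": "2aa5e25426aaada4af4a46b03db04928",
    "tea_demo.tar": "fcec584aeb39f16ee8e3ce809f23f4e0",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 digest matches `expected_md5`."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `verify_download("tea_demo.tar", EXPECTED_MD5["tea_demo.tar"])` returns True only if the demo archive downloaded intact. The equivalent command-line check is `md5sum <file>`.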

If you have any other questions about Tea, please contact Alice Lee (

GET INVOLVED

A postdoctoral position in bioinformatics is available at the Kharchenko Lab. The candidate will work on development of algorithms and statistical methods for analyzing genomic and functional sequencing data, in close collaboration with experimental labs.
