TRANSPOSABLE ELEMENT ANALYSIS PIPELINE
Welcome to the companion site for the Tea repeat analysis pipeline. Tea stands for transposable element analyzer. This site serves both as a resource for downloading additional data related to the paper and as a repository for the Tea pipeline. The paper describing Tea is:
1 Center for Biomedical Informatics, Harvard Medical School, Boston, MA
2 Division of Genetics, Brigham and Women's Hospital, Boston, MA
3 Department of Pathology, Brigham and Women's Hospital, Boston, MA
4 The Eli and Edythe Broad Institute, Cambridge, MA
5 Dana-Farber Cancer Institute, Boston, MA
6 The Genome Institute, Washington University School of Medicine, St. Louis, MO
7 Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX
8 Department of Genetics, Harvard Medical School, Boston, MA
9 Informatics Program, Children's Hospital, Boston, MA
DOWNLOAD ADDITIONAL DATA FILES
Compilation of known polymorphic TE insertion sites (
Tea on GitHub
DOWNLOAD Tea PIPELINE
The pipeline can be downloaded using the links provided below. Please see the included README for instructions on running the pipeline. Installation requires the following software to be present on the system: the CAP3 assembler, Perl, R, Samtools, and the BWA aligner. The following R packages are also required: Bioconductor, Rsamtools, IRanges, and spp.
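Before installing, it can help to confirm that the command-line tools listed above are reachable on the PATH. The following is a minimal sketch, not part of the Tea distribution: the executable names (`cap3`, `perl`, `Rscript`, `samtools`, `bwa`) are assumptions about how these tools are typically installed, and the helper `find_missing_tools` is a hypothetical name. It does not check the R packages, which need to be verified from within R.

```python
import shutil

# Assumed executable names; adjust them to match your installation.
REQUIRED_TOOLS = ["cap3", "perl", "Rscript", "samtools", "bwa"]


def find_missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that cannot be located on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required command-line tools were found.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is sufficient.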
.tar.gz format (5.0 GB), md5sum: 39df86512c441a8503e2c597c4fbecd6
Changes: Enabled short read analysis support in demo scripts. The results in na18506_35bp demo should now be valid. Additional code cleanup. Updates to the README for clarity.
Version 0.6.2 noidx
.tar.gz format (1.2 GB), md5sum: 2aa5e25426aaada4af4a46b03db04928
Identical to the above version, but without a BWA hg18 index. The user must supply the index instead.
For the purposes of testing the pipeline we have included demo data files, containing chromosomes 21 and 22 from the three HapMap normal samples analyzed in the paper:
.tar format (6.9 GB), md5sum: