Whole-genome sequencing data
Chromothripsis is a class of genomic alteration characterized by massive de novo rearrangements, often generated in a single catastrophic event in which the DNA is shattered into fragments that are subsequently stitched together in random order and orientation.
In our recent publication (Cortes-Ciriano et al., 2018), we characterized the rates of chromothripsis across 37 cancer types using whole-genome sequencing data from ~2,600 tumors that were uniformly collected and processed under the auspices of the Pan-Cancer Analysis of Whole Genomes (PCAWG) project.
The main focus of this site is the exploration of the chromothripsis events we detected in this tumor cohort
using ShatterSeek, the chromothripsis detection tool we developed for this purpose.
Please cite our paper if you find this site or the associated data useful (Cortes-Ciriano et al., 2018).
Please open the "Supplementary Information" tab in the navigation bar
to download the Supplementary Information and the R package ShatterSeek.
Chromothripsis Explorer
Chromothripsis Explorer lets you explore and visualize the tumors in the PCAWG cohort.
It provides properties of the tumors (e.g. purity and ploidy),
as well as interactive circos plots for all tumors that report the SNV, indel, and structural variation profiles and the
total and minor copy number profiles for chromosomes 1-22 and X.
These plots make it easy to visualize complex mutational patterns (e.g. chromothripsis and kataegis),
deletions of chromosome arms, LOH regions, etc.
Please see below for specific instructions for each visualization tool.
To facilitate integration with the results of other PCAWG studies,
this site and our publication use the cancer type abbreviations and the colour schemes
agreed upon by the PCAWG consortium.
Chromothripsis detection
To identify chromothripsis-like patterns in whole-genome sequencing data, we developed the R package ShatterSeek,
which combines a custom algorithm to detect clusters of SVs
with a set of statistical criteria partly based on the work of Korbel and Campbell (Cell, 2013).
Given that chromothripsis events generate clusters
of interleaved rearrangements (i.e., the genomic regions bridged by their breakpoints overlap but are not nested),
ShatterSeek first scans each chromosome in each cancer genome for the presence of clusters of this type.
To find clusters, ShatterSeek constructs an undirected graph whose nodes correspond to SVs and whose
edges connect interleaved SVs.
Thus, clusters of SVs are detected by finding the connected components in the graph.
The connected component in each chromosome with the highest number of SVs is considered for further analysis.
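The clustering step described above can be illustrated with a minimal Python sketch. It is not the ShatterSeek implementation (ShatterSeek is an R package): the function names, the representation of each intrachromosomal SV as a `(start, end)` tuple of breakpoint positions on one chromosome, and the handling of shared breakpoints are all assumptions made for illustration.

```python
from collections import defaultdict


def interleaved(sv_a, sv_b):
    """True if the two SV intervals overlap without one being nested in the other."""
    (a1, b1), (a2, b2) = sorted([sv_a, sv_b])
    # After sorting, sv_a starts first; interleaving means sv_b starts inside
    # sv_a and ends beyond it. Shared breakpoints are treated as not interleaved
    # (an assumption of this sketch).
    return a1 < a2 < b1 < b2


def sv_clusters(svs):
    """Connected components of the graph whose edges join interleaved SVs."""
    adjacency = defaultdict(set)
    for i in range(len(svs)):
        for j in range(i + 1, len(svs)):
            if interleaved(svs[i], svs[j]):
                adjacency[i].add(j)
                adjacency[j].add(i)

    seen = set()
    clusters = []
    for start in range(len(svs)):
        if start in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(sorted(component))
    return clusters


def largest_cluster(svs):
    """Indices of the SVs in the component with the most SVs (empty if no SVs)."""
    return max(sv_clusters(svs), key=len, default=[])
```

In this sketch, the pairwise comparison is quadratic in the number of SVs per chromosome, which keeps the example short; a sweep over sorted breakpoints would scale better for very rearranged genomes.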
Once the SV clusters are detected, several statistical criteria are evaluated.
In addition, the genomic regions delimited by the distal breakpoints composing
the SV clusters are further examined for the presence of contiguous genomic segments oscillating between two CN states
(a widely acknowledged hallmark of chromothripsis).
To tune the parameters in our method, we used statistical thresholds and visual inspection.
For the minimum number of oscillating CN segments, we used two thresholds:
- High-confidence calls display uninterrupted oscillations between two states in at least 7 adjacent segments.
- Low-confidence calls display between 4 and 6 uninterrupted CN oscillations.
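As a rough illustration of the two thresholds above, the following Python sketch finds the longest run of adjacent segments that alternate between exactly two CN states and maps its length to a confidence tier. The function names and the input format (a list of integer copy numbers, one per adjacent segment, with neighbouring equal-CN segments already merged) are assumptions, as is the choice to count segments rather than transitions. The other statistical criteria are not modelled.

```python
def longest_oscillating_run(cn):
    """Length of the longest run of adjacent segments alternating between two CN states."""
    if len(cn) < 2:
        return len(cn)
    best = run = 1
    for i in range(1, len(cn)):
        if cn[i] == cn[i - 1]:
            # No change in copy number: the oscillation is interrupted.
            run = 1
        elif run >= 2 and cn[i] != cn[i - 2]:
            # A third state appears: a new run starts with the previous segment.
            run = 2
        else:
            run += 1
        best = max(best, run)
    return best


def confidence_tier(cn, high_min=7, low_min=4):
    """Map the longest oscillating run to a tier (thresholds taken from the text)."""
    run = longest_oscillating_run(cn)
    if run >= high_min:
        return "high confidence"
    if run >= low_min:
        return "low confidence"
    return None
```

For example, under these assumptions the profile `[2, 1, 2, 1, 2, 1, 2]` contains a run of 7 oscillating segments, whereas `[2, 1, 2, 1, 3]` contains a run of only 4.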
Our chromothripsis calls were further classified as "canonical" if at least 60% of the CN segments in the complex rearrangement oscillate between
two states, and as "with other complex events" in cases where chromothripsis co-localizes with other genomic alterations.
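The 60% rule can be sketched in Python as follows. This is one possible reading rather than the published procedure: it assumes that the "two states" are the two most frequent copy-number values in the region, and that any call falling below the threshold belongs to the "with other complex events" class. The function names and input format (a list of integer copy numbers, one per segment) are hypothetical.

```python
from collections import Counter


def fraction_in_two_states(cn):
    """Fraction of segments whose CN equals one of the two most frequent states."""
    if not cn:
        return 0.0
    top_two = Counter(cn).most_common(2)
    return sum(count for _, count in top_two) / len(cn)


def classify_event(cn, min_fraction=0.6):
    """Label a call using the 60% threshold from the text (assumed complement rule)."""
    if fraction_in_two_states(cn) >= min_fraction:
        return "canonical"
    return "with other complex events"
```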
In polyploid tumors we inferred whether canonical chromothripsis events occurred before or after polyploidization.
For example, if the CN oscillation occurs between 2 and 4 copies in a tetraploid tumor,
we infer that polyploidization occurred after chromothripsis.
On the other hand, if the oscillation occurs between 3 and 4 copies, we infer that polyploidization occurred first.
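The reasoning behind these two examples can be written down as a small Python sketch for the tetraploid case. A one-copy loss that predates genome doubling is itself doubled, so the oscillation spans two copies, while a loss that follows doubling spans a single copy. The function name and the "undetermined" fallback are assumptions. Only the two cases given in the text (2 vs 4 copies and 3 vs 4 copies) come from the source.

```python
def infer_timing_tetraploid(low_cn, high_cn):
    """Relative timing of chromothripsis and polyploidization in a tetraploid tumor.

    Assumes the oscillation is between `low_cn` and `high_cn` copies and that
    the unaffected state sits at 4 copies, as in the examples in the text.
    """
    if high_cn != 4 or low_cn >= high_cn:
        return "undetermined"
    step = high_cn - low_cn
    if step == 2:
        # A single-copy difference was doubled: chromothripsis came first.
        return "chromothripsis before polyploidization"
    if step == 1:
        # Single-copy difference on an already doubled genome.
        return "polyploidization before chromothripsis"
    return "undetermined"
```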
Interchromosomal SVs were used to detect chromothripsis events involving multiple chromosomes.
Regions where chromothripsis was not detected directly (i.e. we did not find a cluster of at least 6 interleaved SVs,
or the SVs did not satisfy the statistical criteria mentioned above) but that are linked to chromothripsis regions
are labelled "linked to high confidence" or "linked to low confidence",
depending on the confidence assigned to the region to which they are linked.
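The labelling of linked regions can be sketched in Python under several simplifying assumptions: regions are reduced to whole chromosomes, each interchromosomal SV is given as a pair of chromosome names, a single SV is enough to establish a link, and a region linked to both tiers takes the higher one. None of these details is specified in the text, and the names `label_linked_regions` and `direct_calls` are hypothetical.

```python
RANK = {"low confidence": 1, "high confidence": 2}


def label_linked_regions(direct_calls, interchromosomal_svs):
    """Add "linked to ..." labels to chromosomes without a direct call.

    direct_calls: dict mapping chromosome -> "high confidence",
                  "low confidence", or None (no direct detection).
    interchromosomal_svs: iterable of (chromosome_a, chromosome_b) pairs.
    """
    best_link = {}
    for chrom_a, chrom_b in interchromosomal_svs:
        # Check the link in both directions.
        for source, target in ((chrom_a, chrom_b), (chrom_b, chrom_a)):
            call = direct_calls.get(source)
            if call is None or direct_calls.get(target) is not None:
                continue  # source has no call, or target already has its own
            if RANK[call] > RANK.get(best_link.get(target), 0):
                best_link[target] = call

    labels = dict(direct_calls)
    for chrom, call in best_link.items():
        labels[chrom] = "linked to " + call
    return labels
```

Links are not propagated transitively in this sketch: a chromosome connected only to another "linked" chromosome keeps no label.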
The ShatterSeek code, a tutorial and the documentation are publicly available (please see the "Supplementary Information" tab).
Please read our publication or the ShatterSeek tutorial for more details on the methodology and its validation.