In this week’s post we’re taking a closer look at large (>50bp) structural variants, including copy number variants (CNVs). Previously, we touched on CNVs and their prevalence in neurodevelopmental disorders (read the post here). We noted how whole genome sequencing (WGS) technology provides unique opportunities for detection of CNVs. But we’re often asked what specific types of structural variants are detected by WGS. Read on to learn more, then download the one-page reference guide which includes a graphic representation of each variant type.
Let’s start with a definition of the different types of structural variants:
With few exceptions, large structural variants like these are not detectable by exome sequencing. They can, however, be detected by WGS, in large part because of the consistent read depth that WGS generates. But taking advantage of that data requires the right algorithms. In general terms, our algorithms are centered on two distinct analysis strategies: breakpoint analysis and read depth analysis.
Breakpoint analysis
Breakpoint analysis takes advantage of two types of reads: split reads and discordant reads. Under normal circumstances, the two reads of a pair align to a single region of the genome, at the expected distance and orientation. But for split and discordant reads, the pair aligns to two distinct regions of the genome with little or no overlap. In the case of split reads, the breakpoint occurs within one of the reads and can be identified to the resolution of a single base pair. In the case of discordant reads, the breakpoint occurs in the insert between the reads, resulting in an unexpected span size or inconsistent orientation. Both are indicative of structural variation.
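To make the distinction concrete, here is a minimal Python sketch of how a read pair might be labeled as split, discordant, or concordant. It is an illustration only, not the Genomic Unity™ pipeline: the `ReadPair` record, the `classify_pair` function, and the span thresholds are all hypothetical, and a real caller would work from aligner output rather than hand-built records.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Simplified, hypothetical record of one aligned read pair."""
    chrom1: str
    pos1: int        # leftmost aligned position of read 1
    forward1: bool   # True if read 1 aligns to the forward strand
    chrom2: str
    pos2: int        # leftmost aligned position of read 2
    forward2: bool   # True if read 2 aligns to the forward strand
    split: bool = False  # True if either read aligns in two separate pieces


def classify_pair(pair, min_span=200, max_span=600):
    """Label a pair as 'split', 'discordant', or 'concordant'.

    The span thresholds are illustrative assumptions, not values from this post.
    """
    # Split read: the breakpoint falls inside one of the reads.
    if pair.split:
        return "split"
    # Mates on different chromosomes cannot come from one contiguous fragment.
    if pair.chrom1 != pair.chrom2:
        return "discordant"
    # Order the mates by position so orientation can be checked left to right.
    (left_pos, left_fwd), (right_pos, right_fwd) = sorted(
        [(pair.pos1, pair.forward1), (pair.pos2, pair.forward2)]
    )
    # Expected orientation: left mate forward, right mate reverse.
    if not (left_fwd and not right_fwd):
        return "discordant"
    # Approximate the span from the two leftmost positions.
    span = right_pos - left_pos
    if span < min_span or span > max_span:
        return "discordant"
    return "concordant"


if __name__ == "__main__":
    normal = ReadPair("chr1", 1000, True, "chr1", 1350, False)
    too_far = ReadPair("chr1", 1000, True, "chr1", 9000, False)
    print(classify_pair(normal), classify_pair(too_far))
```

In this sketch a pair is checked for the split signal first, since it gives the most precise breakpoint; an unexpected span or orientation is only considered afterwards.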
Read depth analysis
Read depth analysis takes advantage of the expectation of consistent coverage across the genome. Regions with unexpected levels of coverage – both significantly higher (≥2X) and significantly lower (≤0.5X) – are indicative of structural variation.
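As a rough illustration of the idea, the Python sketch below flags windows whose depth is at least 2X or at most 0.5X of a baseline. It rests on several assumptions that are not from this post: the genome has already been divided into fixed windows with one mean depth each, the baseline is simply the median window depth, and the function name `flag_depth_outliers` is hypothetical. Production callers typically also correct for factors such as GC content and mappability, which this sketch ignores.

```python
from statistics import median


def flag_depth_outliers(window_depths, high=2.0, low=0.5):
    """Return (window_index, label) for windows with unexpected coverage.

    window_depths: mean read depth per fixed-size window (assumed input).
    high / low: ratio thresholds relative to the median window depth.
    """
    baseline = median(window_depths)
    if baseline <= 0:
        raise ValueError("baseline depth must be positive")

    calls = []
    for index, depth in enumerate(window_depths):
        ratio = depth / baseline
        if ratio >= high:
            calls.append((index, "high"))  # possible gain, e.g. a duplication
        elif ratio <= low:
            calls.append((index, "low"))   # possible loss, e.g. a deletion
    return calls


if __name__ == "__main__":
    # Mostly ~30X coverage, with one elevated and one depleted window.
    depths = [30, 31, 29, 62, 30, 14, 30]
    print(flag_depth_outliers(depths))  # [(3, 'high'), (5, 'low')]
```

Using the median rather than the mean as the baseline keeps a few extreme windows from shifting the expected depth, which is one reasonable choice among several.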
Considering these three signals (split reads, discordant reads, and read depth) alongside additional lines of evidence makes it possible to detect nearly all of the above variant types as part of the Genomic Unity™ test. Only balanced translocations are not yet detected; they will be addressed in a future release. For quick reference, download our one-page reference guide for structural variants, which includes a graphic representation of each variant type.
In our next post we’ll address another common question: why a lower mean sequencing depth for WGS (30X) produces better coverage than a higher mean sequencing depth for exomes (100X).