NEXT GENERATION SEQUENCING OF MULTIPLE GENES IN … · 2013. 5. 22. · CHALLENGES • Sequencing...
Transcript of NEXT GENERATION SEQUENCING OF MULTIPLE GENES IN … · 2013. 5. 22. · CHALLENGES • Sequencing...
NEXT GENERATION SEQUENCING OF MULTIPLE GENES IN PARALLEL IN GENETICALLY HETEROGENEOUS DISORDERS AFTER
ENRICHMENT
HETEROGENEITY IN MONOGENIC DISORDERS
Disease # (known) Genes
Breast cancer 2
Recessive ataxias >20
Blindness >200
Mental retardation ~1000
From: Cover of Nature human genome issue, published on 15 February 2001
CHALLENGES
• Sequencing of multiple genes in parallel for heterogeneous hereditary disorders
• Multiplexing of samples for sequencing to increase throughput and reduce costs
• Automated data-analysis, including
• Interpretation of potential pathogenicity of identified variants
• Validation (in collaboration with EUGT) )
• Cost effectiveness (Katherine Payne, Manchester)
• Ethical issues (Anne Cambon-Thomsen, Toulouse)
• Dissemination of knowledge (Milan Macek, Prague)
• Management
TECHGENE - OTHER ACTIVITIES
TECHGENE - WHO ARE WE?PARTNER CENTRE COUNTRYScheffer/Veltman Radboud University Nijmegen Medical
CentreThe Netherlands
Matthijs/Cuppens KU Leuven BelgiumMacek Charles University - 2nd School of
Medicine, PragueCzech Republic
Estivill Center for Genomic Regulation (CRG), Barcelona
Spain
Gasparini University of Trieste ItalyRiess University of Tuebingen GermanyBakker/Den Dunnen Leiden University Medical Center The NetherlandsBanfi Telethon Foundation, Naples ItalyNigro Second University of Naples ItalyCambon-Thomsen University Paul Sabatier, INSERM
U558, ToulouseFrance
Payne University of Manchester United KingdomSak Asper Biotech, Tartu Estonia
TECHGENE
• Is a network of ISO15189 accredited clinical molecular genetic laboratories and renowned genome research laboratories
• Has been positioned as a technological innovation platform connected with the EUGT Network of Excellence dealing with Quality Standards in Clinical (Laboratory) Genetics
TECHGENE - SCIENTIFIC ADVISORY BOARD
Name Affiliation Expertise
Prof. Dr. Jean-Jacques Cassiman
Katholieke Universiteit Leuven, Eurogentest, ESHG
Molecular Genetics, Quality Assurance, EU Regulatory Affairs
Prof. Dr. David Barton
National Centre for Medical Genetics, University College Dublin
Clinical Molecular Genetics
Dr. Cor Oosterwijk
European Genetic Alliances Network
Patient Organisations, Commercial contacts
PLATFORMS
COMPANY TECHNOLOGY CURRENT READ LENGTH
CURRENT THROUGHPUT
Roche/454(GS FLX Titanium)
Sequencing by synthesis
300-500 bases 500 Mb/10h
Illumina/Solexa(Illumina Genome Analyzer System)
Sequencing by synthesis
35 bases 4.5-6.0 Gb/2 days
Life Technologies(SOLiD3)
Sequencing by ligation
50 bases 60 Gb/5 days
Validation
WP1
Hemoglobinopathies Breast Cancer
Few Genes with Multiple Mutations
Long PCR Enrichment
WP6
Economic Issues
Read-out & Data Anal.
WP7
Dissemination & Training
WP5
Ethical Issues
WP8 Management of TECHGENE
Array Enrichment
WP4
Mental Retardation
Rare Mutations in Rare Genes
WP2
Sensory Disorders
1 Major Gene & Multiple Minor Genes
WP3
Ataxia & Paraplegia
Multiple Equally Important Genes
� Next Generation Sequencing in Diagnostics
TECHGENE - HOW? Overview of Plan of Investigation
ARRAY-BASED GENE CAPTURE: 7 KNOWN DISEASE GENES (AR ATAXIA)
Target region, i.e. genomic gene sequence +/- 5kb
Unique sequence represented on array
7 disease genes:• 278 exons• 1,110,372 targeted bp • 886,569 tiled bp
APTXSYNE1
TDP1
SACS
SETX
ATM
FXN
SYNE 2: EXONS ONLY
SYNE2:• 116 exons• 58,331 targeted bp • 63,040 tiled bp (represented on array)
NOVEL CANDIDATE LOCUS (HOMOZYGOSITY MAPPING): 1.3MB
Candidate locus:• 8 genes (out of 80)• 118 exons• 1,337,276 targeted bp • 1,012,745 tiled bp (represented on array)
AUTOSOMAL RECESSIVE ATAXIA: ARRAY SUMMARY
• 16 genes • 512 exons• 2,505,979 bases targeted• 1,962,354 bases tiled (represented on array)
KEY QUESTIONS
• Does array-enrichment work?– Coverage of targeted vs. near-targeted regions– How much non-targeted sequence is still present?
• Is mutation detection possible?– Can known mutations be identified?– Identification of known SNPs
AVERAGE MAPPING DATA
* * *
% mapped 98.80%Mapped bases in regions 76.49%
Mapped bases near region (500bp) 3.82%Mapped bases outside region 19.68%
Target bases covered 98.05%Target bases with >5x coverage 91.26%
Target bases with >10x coverage 80.89%Target bases with >30x coverage 33.49%Target region bases not covered 1.95%
ENRICHMENT –FULL GENOMIC SEQUENCES, E.G. SYNE1
Ø 39-fold enrichment for all genes
ZOOM IN SYNE1 - ENRICHMENT AND SEQUENCING IS REPRODUCIBLE
ARE THERE REGIONS NOT COVERED?
~3.6-8.9% of targeted regions are not covered (with the given coverage) All intronic sequences
CAN WE DETECT MUTATIONS?
ARSACS patient:
A 5 bp heterozygous deletion in exon 5 of the SACS gene
Chr13:22810013-22810017CTTTT deletion
Found in 63.0% of 27 reads(11/13 forward, 6/16 reverse)
Leads to frame shift
CAN KNOWN MUTATIONS BE IDENTIFIED?
CAN PATHOGENIC MUTATIONS BE IDENTIFIED AGAINST A BACKGROUND OF APPR. 1,750 HIGH CONFIDENTIALITY (HC) DIFFERENCES ?
ARSACS patient: compound heterozygous for SACS c.5559-5563 delCTTTT / c. 11719C>T (p.Q3907X)
Roche Mapping software (HC differences) output:* exclude variants outside coding regions* exclude known SNPs (dbSNP)* sort for conservation (phastCons & phyloP)���� Top 2 on the list are the known mutations!
������
���� ���� ����� ������ �� ���� ����
����� �
����
� �
�����
������ ��
���
���� �
�� �����
���� ��
����
�� ��
��
���
�� � ������
����� ���������� ����������� � � �� � � � �� � �� � ����
����� ����������� ������������ ����� � �� �� �� �� � �� � �����
����� ����������� ������������ � � � � � � � �� ����� � �����
����� ������������ ������������� � � � � � � � ����� � ����
����� ������������ ������������� � � � �� � � � ����� � ������
����� ���������� ����������� � �� � �� � � � ����� � �����
AT patient: c.7875-7576 TG>GC (homozygous mutation ATM gene)
CAN KNOWN MUTATIONS BE IDENTIFIED?
CAN PATHOGENIC MUTATIONS BE IDENTIFIED AGAINST A BACKGROUND OF APPR. 1,750 HIGH CONFIDENTIALITY (HC) DIFFERENCES ?
������
���� ���� ����� ������ �� ���� ����
����� �
����
� �
�����
����
�� ��
���
���� �
�� ��
���
���� ��
����
�� ��
��
���
�� � ������
����� ��������� ���������� �� �� �� �� � � �� � � � �����
����� �������������� �������������� � � �� � �� ! � �� � �� � ������
���� ���������� ���������� � � �� � � � �� ����� � �����
����� �������������� �������������� � �� � �� ! � ����� ���� �����
���� ����������� ���������� �� � � � � ! � �� ����� ���� ������
AT Patient: c.7875-7576 TG>GC (homozygous mutation ATM gene)
Roche Mapping software (HCdifferences) output:���� Top 1 on the list is the known mutation!
MUTATION DETECTION IN EIGHT SAMPLES
CONCLUSIONS
Nimblegen Sequence Capture & Roche 454 FLX Titanium Sequencing
• Enrichment works• Enrichment is specific• Known mutations can be identified by sequence capture
• Concept is proven• Diagnostic potential for many genetically heterogeneous diseases!
• Next phase is design of seq cap tools for parallel sequencing of appr. 100 genes in hereditary blindness, and hereditary movement disorders, respectively; comparison of array-based seq capture, seq capture in solution, Raindance technology, and others)
• Establishment of a diagnostic NGS knowledge network (academic labs offering diagnostic NGS service, commercial companies involved in diagnostic NGS development)
•Regular meetings, workshops, teleconferences, etc. , open to interested researchers and laboratory scientists
• Facilitating open scientific discussions and collaboration
• TECHGENE Scientific Meeting 18 March 2010
DIAGNOSTIC NGS KNOWLEDGE NETWORK
Alexander Hoischen Sascha Vermeer
Walter van der Vliet Nine van Slobbe-Knoers
Peer Arts Berry Kremer
Christian Gilissen Jorge Sequeiros
Nienke Wieskamp
Rowdy Meijer
Michael Buckley
Joris Veltman
Hans Scheffer www.techgene.eu
www.techgene.org
ACKNOWLEDGEMENTS