Anna De Grassi, Convegno Mitocon 2015
Trio Exome Sequencing: Improving the Chance of a Genetic Diagnosis in Rare Disorders
Anna De Grassi - Dept. of Biosciences, Biotechnologies and Biopharmaceutics, University of Bari
Mitocon Conference - June 2015 - Bologna
Exome and Mutations
Personal exome cost: $1,500; time required: 6 weeks
Genome -> Exon Capture and Sequencing
Exome repositories host the exomes of tens of thousands of individuals
Technically, mutations are DNA sequence differences between each person and a “reference”
[Figure: the reference exon aligned with the exon of Person 1 and the exon of Person 2; a base that differs from the reference is labelled “mutation”]
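As a minimal sketch of the definition above, a mutation can be found by comparing a person's exon with the reference exon base by base. The function name `find_mutations` and the three sequences are invented for illustration; real pipelines work on aligned sequencing reads, not on ready-made strings.

```python
def find_mutations(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base
    ]


# Toy sequences, invented for illustration only (not taken from the slides).
reference_exon = "ATGGCATTC"
person_1_exon = "ATGGCATTC"
person_2_exon = "ATGACATTC"

print(find_mutations(reference_exon, person_1_exon))  # []
print(find_mutations(reference_exon, person_2_exon))  # [(4, 'G', 'A')]
```

Person 1 is identical to the reference here, while Person 2 carries a single G-to-A difference at position 4.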
The Exome is the set of ALL the DNA portions (exons) encoding PROTEINS:
EXOME: 70M bases for ~30,000 proteins
Number and Class of Mutations
There are around 120,000 mutations between EACH person and the reference (~40,000 mutations within the protein coding exome and the others in the flanking sequence)
Which mutation(s) make the phenotypic difference between people? Why?
Mutations are only data and not information!
[Figure: the reference exon compared with both copies of the exon of Person 1; each mutation is labelled hom. (present on both copies) or het. (present on one copy)]
Exome sequencing also reveals the class of each mutation
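One way to picture how the class of a mutation can be read from sequencing data is to count the reads that carry the reference base and the reads that carry the mutated base at one position. The sketch below is only an illustration: the function `genotype_class` and the thresholds 0.2 and 0.8 are assumptions, not the method used in the talk, and real variant callers rely on statistical models.

```python
def genotype_class(ref_reads: int, alt_reads: int,
                   het_min: float = 0.2, hom_min: float = 0.8) -> str:
    """Classify one position from read counts (thresholds are illustrative assumptions)."""
    total = ref_reads + alt_reads
    if total == 0:
        return "no coverage"
    alt_fraction = alt_reads / total
    if alt_fraction >= hom_min:
        return "hom"        # nearly all reads carry the mutation
    if alt_fraction >= het_min:
        return "het"        # roughly half of the reads carry the mutation
    return "reference"      # too few mutated reads to call a mutation


print(genotype_class(ref_reads=0, alt_reads=30))   # hom
print(genotype_class(ref_reads=14, alt_reads=16))  # het
print(genotype_class(ref_reads=29, alt_reads=1))   # reference
```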
Detecting Rare Disease Causing Mutations
To reduce the number of candidate mutations in a patient, we must filter them:
Exome + flanking sequence, total mutations: 120,000 (40,000 + 80,000)
Discard mutations that are common in human populations: 8,000 retained
- Het. mutations: 7,500
- Hom. mutations: 500
- Genes with putative compound het. mutations: 2,300
“Brute” filtering:
- Remove exon flanking sequences
- Remove non protein coding exons
- Remove synonymous mutations
- Remove “benign” protein changes
- Literature check
- Mutation confirmation
- Verify that the protein does not work properly
Genetic diagnosis in < 9-26% of patients
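The four “Remove” steps of the “brute” filtering can be sketched as a chain of simple tests on annotated mutations. The `Variant` class, its field names, the region labels and the gene names below are hypothetical and chosen only for this example; they are not the annotation format used in the talk.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    region: str             # hypothetical labels: "coding_exon", "noncoding_exon", "flanking"
    consequence: str        # e.g. "missense", "synonymous", "frameshift", "intron"
    predicted_benign: bool  # outcome of a prediction algorithm (assumed to be given)


def brute_filter(variants: list[Variant]) -> list[Variant]:
    """Keep only protein-changing, non-benign mutations in protein coding exons."""
    kept = []
    for v in variants:
        if v.region != "coding_exon":     # removes flanking sequences and non protein coding exons
            continue
        if v.consequence == "synonymous":  # removes synonymous mutations
            continue
        if v.predicted_benign:             # removes "benign" protein changes
            continue
        kept.append(v)
    return kept


# Invented example records.
variants = [
    Variant("GENE1", "coding_exon", "missense", False),
    Variant("GENE2", "coding_exon", "synonymous", False),
    Variant("GENE3", "flanking", "intron", False),
    Variant("GENE4", "coding_exon", "missense", True),
]
print([v.gene for v in brute_filter(variants)])  # ['GENE1']
```

The sketch shows what this kind of filter throws away: every synonymous, flanking or predicted-benign mutation is dropped, whatever its real effect.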
Improving Detection: Is It Possible?
In principle, 85% of disease-causing mutations should be in protein coding exons
Rabbani, Journal of Human Genetics (2014) 59, 5–15
Disease-causing mutations may be in the exome (or in the flanking regions), but they were discarded by “brute” filtering
Can we discard mutations more accurately?
Probably YES, if we also sequence the exome of both parents (TRIO):
1) Parents are genetically similar to the child
2) Parents are healthy
[Figure: Kid / Mom / Dad mutation combinations for hom. mutations, het. mutations (de novo) and het. mutations (compound), each combination marked “ok” or “discard”, and so on for the other combinations]
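The trio idea can be sketched as a small set of rules on the number of mutated copies (0, 1 or 2) carried by the kid, the mom and the dad. The rules below are an assumption based on standard Mendelian reasoning with healthy parents, not a transcription of the slide; sex chromosomes are ignored, and all function and gene names are invented.

```python
from collections import defaultdict


def classify_site(kid: int, mom: int, dad: int) -> str:
    """Classify one mutation from the mutated-copy counts of kid, mom and dad."""
    if kid == 2 and mom == 1 and dad == 1:
        return "hom"       # one copy inherited from each healthy carrier parent
    if kid == 1 and mom == 0 and dad == 0:
        return "de_novo"   # absent in both parents
    if kid == 1 and mom == 1 and dad == 0:
        return "from_mom"
    if kid == 1 and mom == 0 and dad == 1:
        return "from_dad"
    return "discard"       # e.g. a healthy parent is hom. for the same mutation


def trio_candidates(sites):
    """sites: iterable of (gene, kid, mom, dad). Returns candidate genes per class."""
    hom, de_novo = set(), set()
    origins = defaultdict(set)
    for gene, kid, mom, dad in sites:
        label = classify_site(kid, mom, dad)
        if label == "hom":
            hom.add(gene)
        elif label == "de_novo":
            de_novo.add(gene)
        elif label in ("from_mom", "from_dad"):
            origins[gene].add(label)
    # Compound het.: at least one mutation from each parent in the same gene.
    compound = {g for g, o in origins.items() if o == {"from_mom", "from_dad"}}
    return {"hom": hom, "de_novo": de_novo, "compound_het": compound}


# Invented example sites.
example_sites = [
    ("GENE_A", 2, 1, 1),
    ("GENE_B", 1, 0, 0),
    ("GENE_C", 1, 1, 0), ("GENE_C", 1, 0, 1),
    ("GENE_D", 1, 1, 0), ("GENE_D", 1, 1, 0),  # both from mom: discarded
    ("GENE_E", 2, 2, 1),                       # mom is hom. and healthy: discarded
]
result = trio_candidates(example_sites)
print({k: sorted(v) for k, v in result.items()})
# {'hom': ['GENE_A'], 'de_novo': ['GENE_B'], 'compound_het': ['GENE_C']}
```

GENE_D would count as a compound het. candidate in a single exome, because the kid carries two het. mutations in it, but the parents show that both come from the healthy mom.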
Project and Literature Timing
- August 2014: Mitocon funding (5 trios)
- December 2014: Delivery of DNA from 3 trios (Besta->USA)
- February 2015: Raw data delivery (USA->Uniba)
- Today: Pipeline Development & Candidate Mutations
Literature: 800 patients. The idea is becoming a diagnostic standard
Our Three Trios
Children with suspected mitochondrial disease without genetic diagnosis
Kindly provided by Valeria Tiranti Group

Trio A (Exome Seq: Monaco)
- A1: Mother, F
- A2: Father, M
- A3: Child, F
  Age at onset: 5 mo
  Onset: Psycho-motor delay
  Clinical Features: Epilepsy, hypotonia, ophthalmoplegia, microcrania, psychomotor delay, failure to thrive, disorder of swallowing and breathing, hypercalciuria, lactic acidosis
  Neuroradiology: involvement of the basal ganglia, thalamus, subthalamus
  Muscle: CI=2%
  Fibros: CR Neg

Trio B (Exome Seq: BGI)
- B1: Mother, F
- B2: Father, M
- B3: Child, F
  Age at onset: Birth
  Onset: Respiratory failure, epilepsy
  Clinical Features: Respiratory failure, epilepsy, anemia, dysphagia, hypotonia. Deceased
  Neuroradiology: thin corpus callosum
  Muscle: CIII=28%, CIV=33%
  Fibros: CR Neg

Trio C (Exome Seq: Monaco)
- C1: Mother, F
- C2: Father, M
- C3: Child, M
  Age at onset: First months
  Onset: Epilepsy
  Clinical Features: Hypotonia, epilepsy, deafness, myoclonus, hypospadias
  Neuroradiology: not done
  Muscle: CI=27%
  Fibros: CR Neg
No disease causing mutations detected in the mitochondrial genome and in the exome
Bioinfo Pipeline for Trio Analysis
Bioinfo pipeline written ad hoc and applied to each trio independently
Raw data (10 Gb x 3)
- Quality checking
- Mapping to reference
- Calling of mutations: hom. or het. in >= 1 trio member
- Annotation of mutations: location, consequence, frequency in human populations
- Selection of rare mutations (< 0.5% or unknown)
- Detection of kid/parents combinations compatible with a monogenic disease: hom., de novo het., compound het.
- Protein coding exons and splice sites: no filter
- Regulatory exome and flanking regions: filter for mitochondrial location OR effect on splicing
Few Gene Candidates
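The rare-mutation selection and the region-specific filter of the pipeline can be sketched as two small predicates. Only the 0.5% threshold, the “unknown” case and the “mitochondrial location OR effect on splicing” rule come from the slide; the dictionary keys and region labels are hypothetical, and the annotations themselves are assumed to be already computed.

```python
from typing import Optional

RARE_THRESHOLD = 0.005  # 0.5%, as stated on the slide


def is_rare(frequency: Optional[float]) -> bool:
    """A mutation is kept if its population frequency is < 0.5% or unknown (None)."""
    return frequency is None or frequency < RARE_THRESHOLD


def passes_region_filter(region: str, mito_gene: bool, affects_splicing: bool) -> bool:
    """No filter on coding exons and splice sites; elsewhere require mito location OR splicing effect."""
    if region in ("coding_exon", "splice_site"):  # hypothetical region labels
        return True
    return mito_gene or affects_splicing


def keep(variant: dict) -> bool:
    return is_rare(variant["frequency"]) and passes_region_filter(
        variant["region"], variant["mito_gene"], variant["affects_splicing"]
    )


# Invented example annotations.
examples = [
    {"region": "intron", "frequency": None, "mito_gene": True, "affects_splicing": False},
    {"region": "utr", "frequency": 0.001, "mito_gene": False, "affects_splicing": False},
    {"region": "coding_exon", "frequency": 0.2, "mito_gene": False, "affects_splicing": False},
    {"region": "coding_exon", "frequency": 0.004, "mito_gene": False, "affects_splicing": False},
]
print([keep(v) for v in examples])  # [True, False, False, True]
```

Unlike the “brute” filtering, this selection keeps synonymous and flanking mutations as long as they are rare and have a plausible link to mitochondria or to splicing.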
Candidate Genes for Trio A
Protein coding exons and splice sites
Mutations | Single Exome | Trio Exome | Candidate Genes
De novo heterozygous | 429 | 0 | -
Homozygous | 12 | 0 | -
Compound heterozygous (n. genes) | 63 | 3 | CRAT (m+m), DOLK (m+m), PCSK4 (m+s)
Flanking regions (UTRs, introns, intergenic)
Mutations | Single Exome | Trio Exome | Mito genes | Splicing Effect | Candidate Genes
De novo heterozygous | 7003 | 8 | 1 | 1 | TIMM22 (i)
Homozygous | 429 | 71 | 0 | 1 | PAX3 (u)
Compound heterozygous (n. genes) | 2252 | 257 | 0 | 2 | RPA1 (u+u)
Legend: m = missense, s = synonymous, i = intron, u = UTR
Candidate genes based on literature and DB: known disease, protein function, tissue expression, KO mouse, HW-equilibrium
The First Candidate Gene for Trio A
[Figure: CRAT in mitochondrial metabolism. Acyl-CoA + Car <-> Acyl-Car + CoA and Acetyl-CoA + Car <-> Acetyl-Car + CoA, linked to β oxidation, glycolysis, pyruvate, lactic acid, the Krebs cycle, citrate, oxalacetate, and to fatty acid, cholesterol and ketone body synthesis]
Putative effect of CRAT deficiency:
- Excess of Acetyl-CoA
- Lactic acidosis?
- Excess of Acetyl-CoA or Acetyl-Car in blood?
CRAT physiological role:
- Remove excess of Acetyl-CoA from mitochondria (fasting->feeding)
- Avoid PDH inhibition
- Avoid CoA synt. inhibition
- Free CoA
CRAT Mutations Do Not Look “Benign”
“Brute” filtering: CRAT mutations are predicted to be “benign” by two algorithms out of three.
Crystallized structure of CRAT (~600 aa)
Mutated aa are <= 4 Å from carnitine
Wild type: Tyr 89, Val 548
Mutations: Cys 89, Met 548
Kindly provided by Ciro L. Pierri
Candidate Genes for Trio B
Protein coding exons and splice sites
Mutations | Single Exome | Trio Exome | Candidate Genes
De novo | 758 | 2 | TSFM (s), ADAMTS17 (m)
Homozygous | 14 | 0 | -
Compound heterozygous (n. genes) | 69 | 4 | DHX38 (m+m), CD36 (m+m), CLSTN3 (m+s), EHBP1L1 (m+s)
Flanking regions (UTRs, introns, intergenic)
Mutations | Single Exome | Trio Exome | Mito genes | Splicing Effect | Candidate Genes
De novo | 7594 | 3 | 0 | 0 | -
Homozygous | 426 | 80 | 0 | 0 | -
Compound heterozygous (n. genes) | 2440 | 284 | 0 | 4 | PTPRD (u+u), CCND1 (u+u), HKDC1 (m+u), PCSK6 (u+i)
The First Candidate Gene for Trio B
TSFM (Translation elongation factor) - de novo synonymous mutation
- Main cellular location: mitochondria
- Main function: help translating proteins encoded by the mitochondrial genome
- Cell type: almost ubiquitous
- Known disease: combined oxidative phosphorylation deficiency-3
An intronic mutation in the other allele?
Candidate Genes for Trio C
Protein coding exons and splice sites
Mutations | Single Exome | Trio Exome | Candidate Genes
De novo | 675 | 3 | HSDL2 (s), PHACTR1 (m), ZNF28 (f)
Homo/hemizygous | 21 | 1 | HTR2C (m-chrX)
Compound heterozygous (n. genes) | 68 | 2 | H2AFY2 (s+s), GPR125 (s+s)
Flanking regions (UTRs, introns, intergenic)
Mutations | Single Exome | Trio Exome | Mito genes | Splicing Effect | Candidate Genes
De novo | 6671 | 1 | 0 | 0 | -
Homozygous | 461 | 80 | 0 | 0 | -
Compound heterozygous (n. genes) | 2470 | 273 | 3 | 0 | SLC25A10 (f+i), TSFM (i+u), SLC25A2 (m+u)
The First Candidate Gene for Trio C
SLC25A10 - (frameshift + intron)
- Main cellular location: mitochondria
- Main function: transport of dicarboxylic acids across the mitochondrial membrane (Krebs cycle substrates)
- Cell type: almost ubiquitous
- Known disease: na
Intron: new donor splice site
Wt:  CCCCTTCGGGCT
Mut: CCCCTTCAGGCT
Conclusions and Perspectives
Trio vs Proband Exome:
- The number of candidate mutations is lowered without losing information
- “Brute” filtering and control of parent DNA can be avoided
- New candidate mutations are pointed out for lab validation
Validation on patient cells:
- Trio A: functional assay for CRAT
- Trio B: check transcripts for TSFM
- Trio C: check transcripts for SLC25A10
Bioinfo Pipeline Optimization
Acknowledgments
- Paola Desideri, Piero Santantonio, Cristina Rebagliati
- Valeria Tiranti, Eleonora Lamantea, Daniele Ghezzi
- Ciro Leonardo Pierri, Angelo Vozza, Luigi Palmieri, Cesare Indiveri, Vito Porcelli, Giovanni Parisi, Giuseppe Punzi
Other candidate genes
Trio B
CLSTN3 (calsyntenin 3) - (missense + synonymous)
- Main cellular location: membranes
- Cell type: neuronal cells, peripheral nerves, glandular cells, myocytes
- Known disease: na
- Main function: help development of excitatory and inhibitory synapses
- Rare missense + new donor splice site: check transcript
Trio C
TSFM - (intron + 3’utr)
- New acceptor splice site + new branch point?: check transcript
GPR125 - (synonymous + synonymous)
- Main cellular location: membrane
- Main function: unknown, receptors of hormones and neurotransmitters
- Cell type: neuronal cells, myocytes
- Known disease: na
- ESE sites broken + ESE sites broken / new ESS sites: check transcript
The Genetic Diagnosis
Top->down: known genetic disease
- Sequencing of few DNA “pieces” (mitochondrial genome sequencing)
- Known disease causing mutation
- Genetic diagnosis
Bottom->up: suspected genetic basis, but unknown genes
- Sequencing as many “pieces” of DNA as possible
- Finding a small “piece” of DNA responsible for the disease
- Putative disease causing mutation
- Experiments
- Genetic diagnosis
The Human Reference Genome
Cas Kramer et al., Leicester University (UK)
- Around 3,000M characters
- 130 volumes
- Pages printed on both sides
- 43,000 characters per page (mitochondrial genome in ½ page)
DNA handbook to “build a human being and let it work”
A SINGLE (hybrid) genome sequenced in 10 years (2000) for $3,000M
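The figures of the printed genome can be checked with a little arithmetic. The genome size, the characters per page and the number of volumes are taken from the slide; the length of the mitochondrial genome (16,569 bases) is not on the slide and is added here as an assumption from the standard human mitochondrial reference sequence.

```python
import math

GENOME_CHARACTERS = 3_000_000_000  # around 3,000M characters (slide)
CHARACTERS_PER_PAGE = 43_000       # slide
VOLUMES = 130                      # slide
MITO_GENOME_BASES = 16_569         # assumption: human mitochondrial reference length

pages = math.ceil(GENOME_CHARACTERS / CHARACTERS_PER_PAGE)
pages_per_volume = pages / VOLUMES
mito_pages = MITO_GENOME_BASES / CHARACTERS_PER_PAGE

print(pages)                    # 69768
print(round(pages_per_volume))  # 537
print(round(mito_pages, 2))     # 0.39
```

About 70,000 pages, or roughly 540 pages per volume, are needed for the whole genome, while the mitochondrial genome fits in less than half a page, in line with the “½ page” on the slide.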
The Reference Genome Sequence
4 different characters (bases), no spaces, no words, no sentences
Exons (2%)
From Exons to Protein Sequence
Gene (DNA sequence) -> Transcript -> Protein (amino acid sequence) -> Protein structure and function
Having the sequence of ALL the EXONS means having the sequence of ALL the PROTEINS:
EXOME: 70M bases for ~30,000 proteins
Class and Number of Mutations
There are around 120,000 mutations between EACH individual and the reference exome (~40,000 mutations within the protein coding exome and the others in the flanking sequence)
Which mutation(s) make the phenotypic difference between individuals? Why?
Mutations are only data and not information!
[Figure: sequencing reads of Person 1 aligned to the reference exon; each mutation is labelled as hom. alleles or het. alleles]
Exome sequencing lets us detect the class of each mutation