Identification and characterization of copy number variation in Indian population
and its association with disease
Pankaj Kumar
CAS-MPG Presentation
07 May 2012
Introduction
CNVs are
- variations in the number of copies of genomic regions
- insertions, deletions, or duplications
- larger than 1 kb, ranging up to several Mb
CNV vs. SNP
Measure | CNV | SNP
Total number | 38,406 | 14,708,752
% of reference genome | 29.74% | <1%
Introduction contd.
[Figure: CNV types shown on a reference segment A B C D E (deletion, duplication), described by origin, type, occurrence, and frequency (mutation vs. polymorphism), with phenotypic variability and disease susceptibility as outcomes. Scherer et al., Nature Reviews Genetics 2006]
Introduction contd.: Consequences of CNVs
- Unmask recessive alleles
- Disrupt genes
- Alter regulation
- Cumulative effects
Objectives:
1. To identify CNVs in diverse Indian populations
2. To map CNV regions with disease susceptibility
3. To study consequence of CNV in disease
4. To explore the role of CNV in Spinocerebellar Ataxia
CNV & Diseases
Proof-of-concept study
APOBEC3B: insertion/deletion polymorphism
Cytidine deaminase family of proteins
29 kb insertion/deletion polymorphism
Kidd et al. PLoS Genetics, 2007
Spectrum of APOBEC3B deletion frequency in Indian populations studied
APOBEC3B insertion/deletion polymorphism & malaria endemicity
[Figure legend: white = insertion, dark = deletion]
Significant association of APOBEC3B with falciparum malaria
Malaria cohort | Comparison (Fisher's test) | Genotypes | Odds ratio (95% CI) | P value
Endemic | Non-severe vs. control | AB & AA | 7.11 (3.20 to 15.97) | 1 x 10^-7
Endemic | Severe vs. control | AB & AA | 8.13 (2.62 to 26.59) | 1.7 x 10^-5
Endemic | Severe vs. non-severe | AB & AA | 1.14 (0.37 to 3.81) | 0.8
Non-endemic | Severe vs. control | AB & AA | 0.39 (0.16 to 0.93) | 0.0211
Non-endemic | Severe vs. control | BB & AB | 6.44 (1.76 to 24.99) | 0.0012
Non-endemic | Severe vs. control | BB & (AA+AB) | 3.17 (1.10 to 10.32) | 0.0177
A = insertion allele, B = deletion allele
Insertion allele of APOBEC3B seems to be protective for malaria
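The odds ratios and Fisher's test p-values reported for the malaria cohorts come from 2x2 tables of genotype group by disease status. A minimal sketch of that kind of calculation, using only the Python standard library, is given here. The counts in the usage lines are hypothetical and are not the study data. The Wald interval on the log odds ratio is an assumption chosen for simplicity; the slides do not say how the 95% CIs were computed.

```python
from math import comb, exp, log, sqrt


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald 95% CI; all four cells must be non-zero."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: rows = cases, controls; columns = genotype group, other.
print(odds_ratio_wald_ci(30, 10, 15, 35))
print(fisher_exact_two_sided(30, 10, 15, 35))
```

An exact conditional interval would usually be wider than the Wald interval for small cell counts, so this sketch should not be expected to reproduce the intervals in the table.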
Positive selection for the APOBEC3B locus in malaria?
EHH and haplotype analysis
[Figure: markers spanning 500 kb upstream (5') and 500 kb downstream (3') of APOBEC3B, tested for positive selection. Panels: endemic case, endemic control, non-endemic case, non-endemic control]
Haplotype-based analysis for long-range linkage disequilibrium
Selection for the APOBEC3B region has not been observed in malaria
Schematic representation of APOBEC gene cluster and segmental duplication region
Segmental duplication regions
Because of the large number of segmental duplication regions in this locus, selection for APOBEC3B was not observed
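For readers unfamiliar with EHH (extended haplotype homozygosity): it is the probability that two randomly chosen chromosomes carrying the same core allele are identical at every marker out to a given distance. The sketch below computes that decay on one side of the core. The haplotypes are hypothetical toy data, the function name is illustrative, and this is not the software used in the study.

```python
from collections import Counter
from math import comb


def ehh_decay(haplotypes, core_index, core_allele):
    """EHH of one core allele at each marker from the core outward (3' side).

    haplotypes: equal-length strings, one character per phased marker.
    Returns a list whose i-th value is the EHH over markers
    core_index .. core_index + i.
    """
    carriers = [h for h in haplotypes if h[core_index] == core_allele]
    pairs = comb(len(carriers), 2)
    if pairs == 0:
        raise ValueError("need at least two chromosomes carrying the core allele")
    values = []
    for end in range(core_index, len(carriers[0])):
        # Group carriers by their extended haplotype up to this marker.
        groups = Counter(h[core_index:end + 1] for h in carriers)
        identical_pairs = sum(comb(n, 2) for n in groups.values())
        values.append(identical_pairs / pairs)
    return values


# Hypothetical phased haplotypes; marker 0 is the core ("1" = insertion allele).
toy = ["111", "111", "110", "011"]
print(ehh_decay(toy, core_index=0, core_allele="1"))
```

A core allele under recent positive selection is expected to keep EHH high over a longer distance than other alleles of similar frequency. Dense segmental duplications make reliable marker genotyping and phasing difficult, which is one reason such a signal can be missed.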
Conclusions
• Insertion allele of APOBEC3B seems to be protective for malaria
• The APOBEC3B locus has not shown a signature of positive selection by conventional methods, possibly because of frequent recombination events
• Since this gene is expressed in liver and spleen, this might provide a new mechanism of host protective response
Identification of CNVs in the Indian population
A basal database
Identification of large CNVs (>100 kb) in the Indian population: Methodology
Sampling of IGV populations
477 samples, 26 populations
[Figure: map of sampled IGV populations grouped into Clusters 1 to 5. Population labels as extracted: IE-W-IP2, IE-E-LP2, IE-N-LP1, IE-N-LP9, IE-N-LP18, TB-N-IP1, TB-N-SP1, IE-W-LP3, IE-W-LP1, IE-W-LP2, IE-E-IP1, IE-NE-IP1, AA-NE-IP1, TB-NE-LP1, IE-N-IP2, IE-N-LP10, IE-N-SP4, AA-E-IP3, AA-C-IP5, DR-S-LP, IE-W-LP4, OG-W-IP, DR-S-LP3, DR-S-LP, IE-N-LP5, IE-E-LP4, IE-NE-LP1, DR-C-IP2]
Affymetrix 50K array (~58,000 SNPs, average inter-marker distance 50 kb)
Raw intensity files
Retrieve segments >100 kb in length with a minimum of 10 probes using Genotyping Console
CNV calling and QC (Genotyping Console + SVS7)
Validation using Sequenom MassARRAY QGE assay
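The segment filter in this workflow (length above 100 kb, at least 10 probes) can be sketched as a simple predicate. The `Segment` record and its field names are hypothetical; they do not reflect the export format of Genotyping Console or SVS7, and the calls shown are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    chrom: str
    start: int      # bp
    end: int        # bp
    n_probes: int
    state: str      # "deletion" or "duplication"


def passes_size_filter(seg, min_length_bp=100_000, min_probes=10):
    """Keep segments longer than 100 kb that are supported by at least 10 probes."""
    return (seg.end - seg.start) > min_length_bp and seg.n_probes >= min_probes


# Hypothetical calls for illustration only.
calls = [
    Segment("chr1", 1_000_000, 1_250_000, 12, "deletion"),     # kept
    Segment("chr1", 5_000_000, 5_050_000, 15, "duplication"),  # too short
    Segment("chr2", 2_000_000, 2_400_000, 6, "deletion"),      # too few probes
]
kept = [s for s in calls if passes_size_filter(s)]
print(kept)
```

With markers about 50 kb apart on average, a 10-probe minimum already implies segments of several hundred kb in most regions, so the probe threshold is usually the stricter of the two conditions.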
Results
Instances of genomic segments prone to CNVs
Raw CNV deletions = 70,174 (<1 Mb segment size) and 212 (>1 Mb segment size)
Raw CNV duplications = 73,580 (<1 Mb segment size) and 60 (>1 Mb segment size)
Total CNVR deletions = 1,425
Total CNVR duplications = 1,337
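Tens of thousands of raw calls collapse into roughly 1,400 CNV regions (CNVRs) of each type. One common way to get from calls to regions is to merge overlapping calls across samples. The sketch below shows that approach under the assumption that any overlap is enough to merge; the slides do not state the rule actually used, and the function name and input tuples are illustrative.

```python
from collections import defaultdict


def merge_into_cnvrs(calls):
    """Merge overlapping (chrom, start, end) calls into CNV regions (CNVRs)."""
    by_chrom = defaultdict(list)
    for chrom, start, end in calls:
        by_chrom[chrom].append((start, end))

    regions = []
    for chrom in sorted(by_chrom):
        current_start, current_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if current_start is None:
                current_start, current_end = start, end
            elif start <= current_end:
                # Overlaps the open region: extend it.
                current_end = max(current_end, end)
            else:
                regions.append((chrom, current_start, current_end))
                current_start, current_end = start, end
        regions.append((chrom, current_start, current_end))
    return regions


# Hypothetical deletion calls pooled across samples.
deletions = [("chr1", 100, 300), ("chr1", 250, 500), ("chr1", 900, 1000), ("chr2", 50, 80)]
print(merge_into_cnvrs(deletions))
```

Deletions and duplications would be merged separately to obtain the two CNVR totals.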
Results contd.: Extent of CNVs in IGV populations
Chromosomal landscape of common CNV regions in all the populations pooled together
[Figure: deletion and duplication region counts per algorithm, values as extracted. GTC 3.0.2: 5750 (65%), 2048 (23%), 1006 (11%). SVS 7: 2986 (50%), 1461 (25%), 1515 (25%)]
Results contd.
Concordance of the dataset using two independent algorithms
~60% of copy number variable regions showed both deletion and duplication
Comparison of the two software packages showed ~50% concordance in the regions called as prone to CNVs
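Concordance between two call sets can be measured in several ways. The sketch below uses the simplest one: the fraction of regions from one algorithm that overlap any region from the other. This criterion is an assumption, since the slides do not say how the ~50% figure was derived, and the region lists are hypothetical.

```python
def overlaps(a, b):
    """True if two half-open (chrom, start, end) regions share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def concordance(regions_a, regions_b):
    """Fraction of regions in A that overlap any region in B."""
    if not regions_a:
        return 0.0
    hits = sum(1 for a in regions_a if any(overlaps(a, b) for b in regions_b))
    return hits / len(regions_a)


# Hypothetical CNV regions from two callers.
gtc_regions = [("chr1", 100, 500), ("chr1", 900, 1000), ("chr2", 50, 80), ("chr3", 10, 20)]
svs_regions = [("chr1", 400, 600), ("chr2", 60, 70)]
print(concordance(gtc_regions, svs_regions))
```

A stricter variant would require reciprocal overlap above a threshold such as 50% of each region's length, which lowers the concordance when boundaries differ between callers.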
Results contd.: CNV validation and heterogeneity
Validation using Sequenom MassARRAY QGE
[Figure panels: Amplification, Deletion]
Lower validation rate due to heterogeneity in CNV boundaries
Selection of the probe for validation is also a key factor
Results contd.: CNVs and population structure
[Figure: population clustering. Groups: TB populations and isolated Himalayan populations; AA and DR isolated populations; IE large populations]
Populations clustered according to genetic and linguistic affinity
SN | Gene symbol | Disorder name | Class
1 | KDR | Hemangioma, capillary infantile, somatic | Cancer
2 | IRF4 | Multiple myeloma | Cancer
3 | BRAF | Adenocarcinoma of lung, somatic | Cancer
4 | KCNE2 | Atrial fibrillation, familial; Long QT syndrome-6 | Cardiovascular
5 | AGT, AGTR1 | Hypertension, essential; Renal tubular dysgenesis | Cardiovascular
6 | ADRB1 | Congestive heart failure, susceptibility to; Resting heart rate | Cardiovascular
7 | KRT6A | Pachyonychia congenita, Jadassohn-Lewandowsky type | Dermatological
8 | GTF2H5 | Trichothiodystrophy, complementation group A | Dermatological
9 | PRSS2 | Pancreatitis, chronic | Gastrointestinal
10 | IL23R | Crohn disease | Gastrointestinal
11 | ABCG5 | Sitosterolemia | Metabolic
12 | HGD | Alkaptonuria | Metabolic
13 | PPM2C | Pyruvate dehydrogenase phosphatase deficiency | Metabolic
14 | A2M, APP | Alzheimer disease, susceptibility to; Emphysema due to alpha-2-macroglobulin deficiency | Neurological
15 | ATXN8OS | Spinocerebellar ataxia 8 | Neurological
16 | ATXN1 | Spinocerebellar ataxia-1 | Neurological
17 | PRKCH | Cerebral infarction | Neurological
18 | BFSP1 | Cataract, cortical, juvenile-onset | Ophthalmological
19 | HTRA1 | Macular degeneration, age-related, 7; Macular degeneration, age-related, neovascular type | Ophthalmological
20 | HMCN1 | Macular degeneration, age-related, 1; Posterior column ataxia with retinitis pigmentosa | Ophthalmological
21 | PTGDR, IL12B, HNMT, PTGER2 | Asthma | Respiratory
CNVs present in IGV map to genes that are associated with diseases
Conclusions
• CNVs covered 0.05% to 1.46% of the genome per individual.
• A set of genes encompassed by CNVRs are novel and not reported in the Database of Genomic Variants (DGV).
• Validation of individual CNVs showed substantial heterogeneity in CNV boundaries within a gene.
• CNVs can be shared between genetically related populations.
• Basal data for genomic regions prone to CNVs in the Indian population.
• CNV regions overlap genes associated with many diseases and may predispose Indian populations to them.
Role of CNVs as a genetic modifier in SCA12 phenotype
Investigating the involvement of CNV in sub-phenotypes of SCA12
SCA12
- Neurodegenerative disorder
- CAG repeat expansion in the 5' UTR of the PPP2R2B gene
- Two distinct sub-phenotypes have been observed: tremor dominant and gait dominant
Could CNVs be involved?
Workflow of CNV Identification
SCA12 (CAG repeat in PPP2R2B): 10 index cases of gait, 14 index cases of tremor
Affymetrix 6.0 SNP array
CNV calling (PennCNV)
Gene Annotation
Validation (RealTime method)
Data QC
Functional annotation clustering
IE large populations
Copy number state distribution in SCA12 and IE population
CN state | Count in SCA12 | Count in IE
0 | 987 | 389
1 | 2697 | 1226
3 | 257 | 465
4 | 158 | 257
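The copy number state counts for SCA12 and IE are easier to compare as proportions. A short sketch, using the counts listed in the table and treating copy number 0 and 1 as losses and 3 and 4 as gains (the usual convention for a diploid baseline of 2):

```python
# Counts per copy number state, as listed in the table.
counts = {
    "SCA12": {0: 987, 1: 2697, 3: 257, 4: 158},
    "IE": {0: 389, 1: 1226, 3: 465, 4: 257},
}


def loss_fraction(state_counts):
    """Fraction of CNV calls that are losses (copy number 0 or 1)."""
    total = sum(state_counts.values())
    losses = sum(n for state, n in state_counts.items() if state < 2)
    return losses / total


for group, state_counts in counts.items():
    print(group, round(loss_fraction(state_counts), 3))
```

Losses make up about 90% of calls in SCA12 and about 69% in IE. The two groups were not necessarily typed and called identically in the sources shown here, so part of that difference may be technical rather than biological.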
Case control association analysis between gait and tremor groups
Chr | CNV start | CNV end | Size in kb | Genes | Gait Del | Gait Dup | HT Del | HT Dup | p value | Odds ratio (OR)
chr1 | 105820728 | 105823898 | 3.17 | Non genic | 1 | 4 | 2 | 0 | 0.0172 | Inf
chr14 | 105609468 | 105641621 | 32.1 | Non genic | 6 | 1 | 1 | 1 | 0.0044 | 25.1442
chr5 | 32142841 | 32208250 | 51 | GOLPH3 | 0 | 5 | 0 | 0 | 0.0048 | Inf
Amplification of chr5p13.3 region in Gait Ataxia
5/8 of gait samples; 0/14 of HT samples
GOLPH3 amplification Real Time validation
GOLPH3 (golgi phosphoprotein 3 (coat-protein))
A Golgi-localized protein
Has a regulatory role in Golgi trafficking
Identified as a potent oncogene
Modulates mTOR signaling
Inhibition of mTOR induces autophagy and reduces toxicity of polyglutamine expansions in fly and mouse models of Huntington disease
Brinda Ravikumar et al. Nature Genetics (2004)
Autophagy induction reduces mutant ataxin-3 levels and toxicity in a mouse model of spinocerebellar ataxia type 3
Fiona M. Menzies et al. Brain (2009)
Functional annotation clustering of genes under CNV specific to SCA12
Term | Count | % | P value | Bonferroni | Benjamini | Fold enrichment
GO:0005216~ion channel activity | 18 | 6.593 | 3.74E-05 | 0.0172 | 0.0172 | 3.2549
GO:0022838~substrate specific channel activity | 18 | 6.593 | 5.48E-05 | 0.0252 | 0.0084 | 3.1568
GO:0015267~channel activity | 18 | 6.593 | 8.39E-05 | 0.0383 | 0.0097 | 3.0495
GO:0022803~passive transmembrane transporter activity | 18 | 6.593 | 8.64E-05 | 0.0394 | 0.0080 | 3.0421
Significant enrichment of ion channel activity processes in SCA12
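The enrichment columns (fold enrichment, Bonferroni, Benjamini) follow standard definitions, sketched below. Several inputs are assumptions and are labeled as such in the code: the number of tests (460), the gene list size (273, which is what a count of 18 at 6.593% would imply), and the background counts. The annotation tool's own corrections may be computed differently, so these functions are not expected to reproduce every value in the table.

```python
def fold_enrichment(list_hits, list_total, background_hits, background_total):
    """Ratio of the term's share in the gene list to its share in the background."""
    return (list_hits / list_total) / (background_hits / background_total)


def bonferroni(p_values, n_tests=None):
    """Bonferroni-adjusted p-values, capped at 1."""
    m = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# p-values from the table; 460 tests is an assumed value, not from the slides.
p_values = [3.74e-05, 5.48e-05, 8.39e-05, 8.64e-05]
print(bonferroni(p_values, n_tests=460))
# Benjamini-Hochberg over these four values only, for illustration.
print(benjamini_hochberg(p_values))
# Assumed list size (273) and hypothetical background (200 of 10,000 genes).
print(fold_enrichment(18, 273, 200, 10_000))
```

Because the four terms are nested GO categories covering the same 18 genes, they represent one signal rather than four independent findings.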
A multigene enrichment analysis for dissection of biological system
[Figure panels: Biological process, Molecular functions, Cellular components]
CNVs in ion channel genes and their involvement in different biological, molecular, and cellular functions suggest physiological impairment in SCA12
Future direction
Conclusions
• Although SCA12 is a monogenic disorder, phenotypic variability could be due to other genetic factors.
• Amplification of the GOLPH3 gene could act as a modifier that leads to the gait ataxia feature.
• The autophagy pathway, which ultimately leads to autophagic lysis of inclusion bodies, is influenced by GOLPH3 through the mTOR pathway.
• GOLPH3 could be a good molecule for intervention in SCA12 pathogenesis.
• The implication of ion channel genes in different neurological diseases suggests physiochemical abnormalities in SCA12.
Conclusion of my PhD work
“Any two individual genomes taken from nature, in any species, will have dozens to hundreds of differences in their total number of functional genes.”
[Daniel R. Schrider and Matthew W. Hahn, Proc. R. Soc. B; 2010]
In conclusion, our genome is less static than it may appear. CNVs could play an important role in genome dynamics, facilitating evolution, adaptation, and selection in populations, and could contribute to disease through dosage effects of functional genes or regions.
Publications
Jha P, Sinha S, Kanchan K, Qidwai T, Narang A, Singh PK, Pati SS, Mohanty S, Mishra SK, Sharma SK, Awasthi S, Venkatesh V, Jain S, Basu A, Xu S; Indian Genome Variation Consortium, Mukerji M, Habib S. Deletion of the APOBEC3B gene strongly impacts susceptibility to falciparum malaria. Infect Genet Evol. 2012 Jan;12(1):142-8.
Datta S, Chowdhury A, Ghosh M, Das K, Jha P, Colah R, Mukerji M, Majumder PP. A Genome-Wide Search for Non-UGT1A1 Markers Associated with Unconjugated Bilirubin Level Reveals Significant Association with a Polymorphic Marker Near a Gene of the Nucleoporin Family. Ann Hum Genet. 2012 Jan;76(1):33-41.
Abhimanyu, Indian Genome Variation Consortium, Jha P and Mridula Bose. Footprints of genetic susceptibility to pulmonary tuberculosis: Cytokine gene variants in north Indians. Indian J Med Res., 2011 (accepted)
Lall M, Thakur S, Puri R, Verma I, Mukerji M, Jha P. A 54 Mb 11qter duplication and 0.9 Mb 1q44 deletion in a child with laryngomalacia and agenesis of corpus callosum. Mol Cytogenet. 2011 Sep 21;4:19.
Gautam P*, Jha P*, Kumar D, Tyagi S, Varma B, Dash D, Mukhopadhyay A; Indian Genome Variation Consortium, Mukerji M. Spectrum of large copy number variations in 26 diverse Indian populations: potential involvement in phenotypic diversity. Hum Genet. 2011 Jul 9. * Equal contributing authors.
Ankita Narang*, Jha P*, Vimal Rawat, Arijit Mukhopadhayay, Debasis Dash, Analabha Basu, Mitali Mukerji. Recent admixture in an Indian population of African ancestry. Am. J. Hum. Genet. 2011 Jul 5. * Equal contributing authors.
Jha P, Suri V, Sharma V, Singh G, Sharma MC, Pathak P, Chosdol K, Jha P, Suri A, Mahapatra AK, Kale SS, Sarkar C. IDH1 mutations in gliomas: First series from a tertiary care centre in India with comprehensive review of literature. Exp Mol Pathol. 2011 May 3;91(1):385-393.
Abhimanyu, Jha P, Jain A, Arora K, Bose M. Genetic association study suggests a role for SP110 variants in lymph node tuberculosis but not pulmonary tuberculosis in north Indians. Hum Immunol. 2011 Apr 20.
Abhimanyu, Mangangcha IR, Jha P, Arora K, Mukerji M, Banavaliker JN, Consortium IG, Brahmachari V, Bose M. Differential serum cytokine levels are associated with cytokine gene polymorphisms in north Indian populations with active pulmonary tuberculosis. Infect Genet Evol. 2011 Apr 1.
Jha P, Suri V, Jain A, Sharma MC, Pathak P, Jha P, Srivastava A, Suri A, Gupta D, Chosdol K, Chattopadhyay P, Sarkar C. O6-methylguanine DNA methyltransferase gene promoter methylation status in gliomas and its correlation with other molecular alterations: first Indian report with review of challenges for use in customized treatment. Neurosurgery. 2010 Dec;67(6):1681-91.
Jha P, Jha P, Pathak P, Chosdol K, Suri V, Sharma MC, Kumar G, Singh M, Mahapatra AK, Sarkar C. TP53 polymorphisms in gliomas from Indian patients: Study of codon 72 genotype, rs1642785, rs1800370, and 16 base pair insertion in intron-3. Exp Mol Pathol. 2011 Apr;90(2):167-72. (2010) Nov 27.
Aggarwal S, Negi S, Jha P, Singh PK, Stobdan T, Pasha MA, Ghosh S, Agrawal A; Indian Genome Variation Consortium, Prasher B, Mukerji M. EGLN1 involvement in high-altitude adaptation revealed through genetic analysis of extreme constitution types defined in Ayurveda. Proc Natl Acad Sci U S A. (2010) Nov 2;107(44):18961-6.
HUGO Pan-Asian SNP Consortium. Mapping human genetic diversity in Asia. Science. (2009) Dec 11;326(5959):1541-5.
Indian Genome Variation Consortium. Genetic landscape of the people of India: a canvas for disease gene exploration. J Genet. (2008) Apr;87(1):3-20.
Acknowledgements
TCGA for Genotyping Facility
Indian Genome Variation Consortium
CSIR
Thank you
Extra slides
Copy Number Variation in Indian Population
547 healthy individuals from 26 reference populations of the Indian Genome Variation Consortium
Affymetrix 50K Xba 240 array (raw intensity file)
CNV calling and QC (Genotyping Console + SVS7)
≥ 10 probes, ≥ 100 kb segment
Reference samples (30), test samples (447)
Common CNV (> 5% of samples)
Rare CNV (< 5% of samples)
Validation using Sequenom MassARRAY QGE assay (a subset of 12 genes)
Functional Enrichment Analysis
Mapping with Disease Associated regions
Genotype QC
Test for HWE
Group | Ins homozygote | Heterozygote | Del homozygote | HWE test p-value | Note
Endemic case | 29 | 41 | 3 | 0.018 | Too many heterozygotes
Endemic control | 64 | 18 | 0 | 0.586 |
Non-endemic case | 56 | 11 | 17 | 7.95 x 10^-9 | Loss of too many heterozygotes
Non-endemic control | 51 | 25 | 5 | 0.508 |
After data quality control for genotyping error and population stratification, Hardy-Weinberg disequilibrium (HWD) generally indicates some kind of natural selection
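The direction of a Hardy-Weinberg departure (too many or too few heterozygotes) can be checked from the genotype counts alone. The sketch below uses a chi-square goodness-of-fit test with one degree of freedom, which is an assumption: the slides do not state which HWE test was used, so the p-values here need not match those in the table.

```python
from math import erfc, sqrt


def hwe_chi_square(n_ins_homo, n_het, n_del_homo):
    """Chi-square test (1 df) of Hardy-Weinberg proportions for a biallelic locus.

    Returns (expected heterozygote count, chi-square statistic, p-value).
    """
    n = n_ins_homo + n_het + n_del_homo
    p = (2 * n_ins_homo + n_het) / (2 * n)   # insertion allele frequency
    q = 1 - p
    observed = (n_ins_homo, n_het, n_del_homo)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return expected[1], chi2, p_value


# Genotype counts from the table (endemic cases, non-endemic cases).
for label, genotypes in [("Endemic case", (29, 41, 3)), ("Non-endemic case", (56, 11, 17))]:
    expected_het, chi2, p_value = hwe_chi_square(*genotypes)
    direction = "excess" if genotypes[1] > expected_het else "deficit"
    print(label, direction, "of heterozygotes, p =", p_value)
```

For small groups or rare genotypes, an exact HWE test is generally preferred over the chi-square approximation.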
Future direction
[Diagram: GOLPH3 (amplification) -> mTOR pathway (induction) -> autophagy (inhibition) -> aggregate formation]
SCA12 modifier genes