Church ngs

Understanding the human reference genome, Deanna M. Church, Staff Scientist, NCBI, 26 Mar 2013, Assembly

description

My talk on how the human reference assembly is produced. Given at http://www.palmettogba.com/ngssummit

Transcript of Church ngs

Page 1: Church ngs

Understanding the human reference genome

Deanna M. Church, Staff Scientist, NCBI

26 Mar 2013

Assembly

Page 2: Church ngs
Page 3: Church ngs
Page 4: Church ngs

http://genomereference.org

Valerie Schneider

Page 5: Church ngs

The Reference Assembly is NOT Static

NCBI35 (hg17)
NCBI36 (hg18)
GRCh37 (hg19)
GRCh37.p10

Page 6: Church ngs
Page 7: Church ngs
Page 8: Church ngs
Page 9: Church ngs
Page 10: Church ngs

An assembly is a MODEL of the genome

Page 11: Church ngs

chr1:g.158324425A>G
CD1E:c.317A>G

CD1E

Page 12: Church ngs

NC_000012.11:g.22066016delA

Missed in ARUP Exome, but not covered by capture probes

GeT-RM
http://www.ncbi.nlm.nih.gov/projects/variation/get-rm

Page 13: Church ngs
Page 14: Church ngs
Page 15: Church ngs

Kidd et al., 2007
APOBEC cluster

Black: Deletion
White: Insertion

Page 16: Church ngs

Clones

Clones

IHGSC, Nature 2004

Page 17: Church ngs

Build sequence contigs based on contigs defined in TPF (Tiling Path File).

Check for orientation consistencies
Select switch points
Instantiate sequence for further analysis

Switch point

Representative chromosome sequence
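The steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the GRC pipeline: the `TilingPathEntry` fields, the idea that a switch point is stored as a start/end slice per clone, and the toy sequences are all assumptions made for the example. The real TPF format is not reproduced here.

```python
from dataclasses import dataclass

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass
class TilingPathEntry:
    """One clone in a tiling path (hypothetical fields, not the real TPF columns)."""
    clone: str      # clone name
    sequence: str   # clone sequence as submitted
    strand: str     # "+" or "-" relative to the chromosome
    start: int      # 0-based first base used, after orientation (switch point in)
    end: int        # 0-based exclusive last base used (switch point out)


def build_contig(path: list[TilingPathEntry]) -> str:
    """Orient each clone, cut it at its switch points, and join the pieces."""
    pieces = []
    for entry in path:
        if entry.strand not in ("+", "-"):
            raise ValueError(f"{entry.clone}: strand must be '+' or '-'")
        oriented = (
            entry.sequence if entry.strand == "+"
            else reverse_complement(entry.sequence)
        )
        if not 0 <= entry.start <= entry.end <= len(oriented):
            raise ValueError(f"{entry.clone}: switch points outside the clone")
        pieces.append(oriented[entry.start:entry.end])
    return "".join(pieces)


# Toy example: two clones, the second one submitted on the opposite strand.
path = [
    TilingPathEntry("cloneA", "AAAACCCC", "+", 0, 6),
    TilingPathEntry("cloneB", "TTTTGGGG", "-", 2, 8),
]
contig = build_contig(path)
```

Because each clone contributes only the slice between its switch points, the resulting sequence is a mosaic of the clones in the path rather than the sequence of any single one.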

Page 18: Church ngs

RP11-34P13 64E8 RP4-669L17 RP5-857K21 RP11-206L10 RP11-54O7

Gaps

Page 19: Church ngs

CXorf2 TKTL1

NCBI35 (Assembly described in last HGP paper)

chrX: 153,019,779 153,044,285 153,054,417 153,079,546

chrX:g.153054447G>A
TKTL1:c.31G>A

GRCh37 (current reference assembly)

chrX: 153,498,930 153,523,564 153,524,027 153,558,713

chrX:g.153533600G>A
TKTL1:c.135-74G>A
TKTL1:c.-90G>A
TKTL1:c.135-56G>A

Page 20: Church ngs

Data tracking

ABC14-1065514J1   Gaps   Length   Date

FP565796.1        1               21-Oct-2009

FP565796.2        0               14-Oct-2010

FP565796.3        0               07-Nov-2010
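Tracking a clone record across updates comes down to treating the accession and the version as separate fields. A minimal sketch, assuming identifiers of the form `ACCESSION.VERSION` as on this slide; the function names are invented for the example and are not part of any NCBI tool.

```python
def split_accession_version(identifier: str) -> tuple[str, int]:
    """Split 'FP565796.3' into ('FP565796', 3); reject unversioned identifiers."""
    accession, sep, version = identifier.rpartition(".")
    if not sep or not accession or not version.isdigit():
        raise ValueError(f"not an accession.version: {identifier!r}")
    return accession, int(version)


def latest_versions(identifiers: list[str]) -> dict[str, str]:
    """Map each accession to its highest-versioned identifier."""
    latest: dict[str, int] = {}
    for identifier in identifiers:
        accession, version = split_accession_version(identifier)
        if version > latest.get(accession, 0):
            latest[accession] = version
    return {acc: f"{acc}.{ver}" for acc, ver in latest.items()}


# The three versions of the clone record listed on the slide.
records = ["FP565796.1", "FP565796.3", "FP565796.2"]
current = latest_versions(records)
```

Comparing versions as integers rather than strings matters once a record passes version 9, since "10" sorts before "9" as text.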

Page 21: Church ngs

NCBI35 (Assembly described in last HGP paper)
chrX

chrX:g.153054447G>A
NC_000023.8:g.153054447G>A

GRCh37 (current reference assembly)
chrX

chrX:g.153533600G>A
NC_000023.10:g.153533600G>A
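The point of this slide is that "chrX" alone does not say which assembly a coordinate belongs to, while the accession.version does. A small sketch of that substitution follows. The lookup table holds only the two chrX accessions shown on the slide; the function name and the table layout are assumptions made for the example.

```python
# Sequence accession.version for chrX in each assembly, as shown on the slide.
CHROMOSOME_ACCESSIONS = {
    ("NCBI35", "chrX"): "NC_000023.8",
    ("GRCh37", "chrX"): "NC_000023.10",
}


def qualify_hgvs(assembly: str, expression: str) -> str:
    """Replace a bare chromosome name with the assembly-specific accession.version."""
    chromosome, sep, change = expression.partition(":")
    if not sep:
        raise ValueError(f"expected 'chromosome:change', got {expression!r}")
    try:
        accession = CHROMOSOME_ACCESSIONS[(assembly, chromosome)]
    except KeyError:
        raise ValueError(f"no accession known for {chromosome} in {assembly}") from None
    return f"{accession}:{change}"


old = qualify_hgvs("NCBI35", "chrX:g.153054447G>A")
new = qualify_hgvs("GRCh37", "chrX:g.153533600G>A")
```

Once the expression carries the accession.version, the assembly no longer has to be tracked as separate metadata alongside the variant.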

Page 22: Church ngs

NM_012253.3
NM_001145933.1
NM_001145934.1

TKTL1

NM_001145933.1:c.135-74G>A
NM_001145934.1:c.-90G>A
NM_012253.3:c.135-56G>A

Page 23: Church ngs

Preview of GRCh38 (scheduled Fall 2013)

TEX28 TKTL1

LOC101060233 (opsin related)

LOC101060234 (TEX28 related)

GRCh37 (current reference assembly)
chrX

Page 24: Church ngs

http://genomereference.org

Page 25: Church ngs

The human reference assembly is a COMPOSITE of many individuals

The human reference assembly is NOT static

When the reference assembly updates:

Accession.versions are KEY to data management

Your favorite region may have the same SEQUENCE but different COORDINATES
Your favorite region may CHANGE significantly

We have the TOOLS to help!

http://www.ncbi.nlm.nih.gov/variation