iDASH National Center for Biomedical Computing Day 2 LOM.pdf · iDASH National Center for...

iDASH National Center for Biomedical Computing Sharing and Protecting Human Subjects Data 9/30/15 pSCANNER/iDASH all-hands NIH U54 HL108460 Lucila Ohno-Machado, MD, MBA, PhD Biomedical Informatics, University of California San Diego


Page 1: iDASH National Center for Biomedical Computing Day 2 LOM.pdf · iDASH National Center for Biomedical Computing ... UCSF. Davis. Irvine. UCLA. Healthcare Clinical Data. Clinical Data

iDASH National Center for Biomedical Computing: Sharing and Protecting Human Subjects Data

9/30/15 pSCANNER/iDASH all-hands

NIH U54 HL108460
Lucila Ohno-Machado, MD, MBA, PhD
Biomedical Informatics, University of California San Diego

Page 2

Patient Interaction

Data Analysis: Statistics, Machine Learning

Data Structuring: Natural Language Processing, Data Modeling

Predictive Modeling: Evaluation Methods

Decision Support Tools: Guidelines, Alerts & Reminders

Data Collection Tools: Clinical Data Warehouse

Data Integration: Genomics, Proteomics, Sensors

Data De-Identification: Privacy Technology

Communication Strategies: Consumer Health Informatics, Medical Education

Page 3

Knowledge & Tools

Privacy

Consent

Data

Our Goals

• Share access to data and computation

• Train the new generation of data scientists

• Provide innovative software, platform, and infrastructure

• Protect privacy

Develop
» Algorithms
» Tools
» Infrastructure
» Policies

[Diagram: iDASH. Labels: Knowledge & Tools; Services; Platform; Data; Sensors; Genomic; Clinical; Service; WWW; Apps; Exec.; Aggreg.; Hosting; Sharing; Policies; Platform; Research; Develop.; Federation]

Page 4

Postdocs: Meng, Luca, Wenrui

Exchange agreements with universities in Austria, Brazil, China

PhD and master’s students: Eric, Wei, Zhanglong

Yuan Wu, Asst Prof, Duke

Myoung La, Software Engineer, Microsoft

Shuang Wang, Asst Prof, UCSD, K99/R00

Adela Grando, Asst Prof, ASU

Elizabeth Bell, Research Asst, UCSD

Xiaoqian Jiang, Asst Prof, UCSD, K99/R00

Past Postdocs, Past Interns

Trainees

Mike Conway, Asst Prof, U Utah, K99/R00

Undergrad Students

Tyler Bath, Programmer, UCSD

Alex Hsieh, PhD DBMI student, Columbia University

2011

Kaushik Sinha, Asst Prof, Wichita St

NLM Training Grant started 2012 (9 pre- and 6 postdoc slots)

Graduates from the postdoc program
Mindy (Asst Prof UCLA)
Augustine (PhD program)
Dyvia (Fellowship in Resp Med UCSD)
Edna (Residency in Surgery)

Page 5

iDASH internship 5 years in a row

[Bar chart: number of interns per year, 2011 to 2015. Data labels as extracted: 16, 13, 8, 13, 13.]

Page 6

Internship Symposium 2015

10/15/2015 Supported by the NIH Grant U54 HL108460 to the University of California, San Diego

Page 7

2015 Cohort

Daniel Garcia Ulloa, Emory University, 4th year graduate student
Pallavi Rao, UC Davis, 2nd year graduate student
Lei Yang, University of Oklahoma, 3rd year graduate student
Dima Aref, New Jersey Institute of Technology, Senior, undergraduate
Suyash Rathi, Syracuse University, 2nd year graduate student
Lu Wang, Oregon State University, 4th year graduate student
Haoyi Shi, Syracuse University, 1st year graduate student
Dong Han, University of Oklahoma, 2nd year graduate student
Chao Jian, University of Oklahoma, 1st year graduate student
Ko Dokmai, University of Virginia, will join UMD as a graduate
Michele Dow, Boston University, will join UCSD as a PhD
Rodrigo Gama Baptista, Federal University of Parana, Brazil, Junior, undergraduate
Yerlan Idelbayev, UCSD, 2nd year graduate student

Page 8

Workshops, Symposia, Webinars

• 12 Workshops (https://idash.ucsd.edu/events/workshops)
» 4 Privacy
» 2 NLP
» 2 Imaging Informatics
» 4 Others (High Performance Computing, Biomedical Data Sharing, IEEE HISB, Mobile Data)

• 10 Symposia (https://idash.ucsd.edu/news-and-events)
» 5 All-Hands
» 5 Internship

• 87 Webinars

Page 9

Publications

• Published Articles and Book Chapters: 138
• Presentations: 244
• Posters: 72

Topic                                   # Published
Cell Biology                            2
Cloud Computing and Architecture        1
Data Analysis and Compression           5
Data Modeling and Integration           4
Data Sharing                            5
Genomics                                28
Imaging Informatics                     4
Infrastructure                          4
Kawasaki Disease (DBP 1 & 4)            13
Natural Language Processing             7
Patient Centered Research               9
Physical Activity Monitoring (DBP 3)    2
Privacy Technology                      41
Statistics                              13
Total                                   138

https://idash.ucsd.edu/publications

As of 6/4/15

Page 10

Integrating Different Types of Data

[Diagram: Genotype (genome), via transcription, to RNA (transcriptome), via translation, to Protein (proteome); Metabolites; Physiology (laboratory tests); Phenotype (physical exam, imaging, monitoring systems)]

Page 11

● Predictive modeling and adjustment for confounders require lots of data

● Some institutions cannot move data outside their firewalls; we can bring computation to the data

User requests data for Quality Improvement or Research

• Identity & Trust Management
• Policy enforcement

Trusted Broker(s)

Security Entity

Diverse Healthcare Entities in 3 different states (federal, state, private)

Analysis: Distributed computing

Scalable National Network for Comparative Effectiveness Research

Wu Y et al. Grid Binary LOgistic REgression (GLORE): Building Shared Models Without Sharing Data. JAMIA, 2012
Wang S et al. EXpectation Propagation LOgistic REgRession (EXPLORER): Distributed Privacy-Preserving Online Model Learning. J Biomed Inf 2013
Jiang W et al. WebGLORE: A Webservice for Grid Logistic Regression. Bioinformatics 2014
Wu Y et al. Grid Multi-Category Response Logistic Models. BMC Med Inform Dec Making 2015
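
The idea of bringing computation to the data can be illustrated with a minimal sketch of distributed logistic regression in the spirit of GLORE. This is not the published implementation: it assumes each site holds rows with the same columns, that only aggregated gradient and Hessian terms leave a site, and that a coordinator runs Newton-Raphson on their sums. The function names (`local_stats`, `solve`, `fit_distributed`) are hypothetical.

```python
import math

def local_stats(X, y, beta):
    """Run at each site: gradient and Hessian terms of the log-likelihood.
    Only these aggregates are returned, never patient-level rows."""
    d = len(beta)
    grad = [0.0] * d
    hess = [[0.0] * d for _ in range(d)]
    for xi, yi in zip(X, y):
        z = sum(b * v for b, v in zip(beta, xi))
        p = 1.0 / (1.0 + math.exp(-z))
        w = p * (1.0 - p)
        for j in range(d):
            grad[j] += (yi - p) * xi[j]
            for k in range(d):
                hess[j][k] += w * xi[j] * xi[k]
    return grad, hess

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_distributed(sites, d, iters=25, tol=1e-10):
    """Coordinator: sum per-site statistics and take Newton-Raphson steps."""
    beta = [0.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        hess = [[0.0] * d for _ in range(d)]
        for X, y in sites:
            g, h = local_stats(X, y, beta)
            grad = [a + b for a, b in zip(grad, g)]
            hess = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(hess, h)]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta
```

Because the gradient and Hessian are sums over patients, adding the per-site terms gives the same Newton step as pooling all rows in one place, so the shared model should match the centralized fit up to floating-point error.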

Page 12

Horizontal and Vertical Partitions

Patient  Age  Insurance
A1       45   X
A2       32   Y

Patient  Age  Insurance
B1       45   Y
B2       32   Y

Patient  Age  Insurance
A1       45   X
A2       32   Y

Li Y, Jiang X, Wang S, Xiong L, Ohno-Machado L. VERTIcal Grid lOgistic regression (VERTIGO) – accepted in the J Am Med Inf Assoc.
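
As a small illustration of the two partition types, the sketch below splits the slide's example table by rows (horizontal: each site holds different patients with the same attributes) and by columns (vertical: each site holds different attributes for the same patients). Assigning patients to sites by the first letter of the ID and the site names `clinic` and `payer` are assumptions made for the example.

```python
records = [
    {"patient": "A1", "age": 45, "insurance": "X"},
    {"patient": "A2", "age": 32, "insurance": "Y"},
    {"patient": "B1", "age": 45, "insurance": "Y"},
    {"patient": "B2", "age": 32, "insurance": "Y"},
]

def horizontal_partition(rows, site_of):
    """Split by rows: every site keeps all columns for its own patients."""
    parts = {}
    for row in rows:
        parts.setdefault(site_of(row), []).append(row)
    return parts

def vertical_partition(rows, key, column_groups):
    """Split by columns: every site keeps the key plus its own attributes."""
    return {
        site: [{key: r[key], **{c: r[c] for c in cols}} for r in rows]
        for site, cols in column_groups.items()
    }

def join_vertical(parts, key):
    """Rebuild full rows by matching the shared key across sites."""
    merged = {}
    for rows in parts.values():
        for r in rows:
            merged.setdefault(r[key], {}).update(r)
    return list(merged.values())

by_rows = horizontal_partition(records, lambda r: r["patient"][0])
by_cols = vertical_partition(
    records, "patient", {"clinic": ["age"], "payer": ["insurance"]}
)
```

In the vertical case no single site can fit a model on all attributes alone, which is the setting the VERTIGO reference addresses.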

Page 13

iDASH 2014 First Privacy Protection Challenge

• Task 1: Privacy-preserving SNP Data Sharing

• Task 2: Privacy-preserving release of top K most significant SNPs

Evaluate solutions that guarantee privacy protection for the output of genomic data analysis
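
The slide does not name a mechanism. One common way to obtain a formal guarantee on released outputs is differential privacy, which this deck lists among its privacy technologies, so the sketch below assumes it. It shows the Laplace mechanism applied to a single count; the function names and the sensitivity of 1 are assumptions for the example, and this is not a challenge entry.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Release a count with Laplace noise of scale sensitivity / epsilon."""
    rng = rng or random.Random()
    return true_count + laplace_noise(sensitivity / epsilon, rng)
```

A smaller `epsilon` means more noise and stronger protection; releasing many SNP statistics at once would require splitting the privacy budget across them.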

Page 14

2015 Privacy Protection Challenge

• Task 1: Homomorphic encryption (HME) based secure genomic data analysis

• Task 2: Secure comparison between genomic data in a distributed setting

• Focus on secure outsourcing and secure data analysis in a distributed setting (humangenomeprivacy.org)
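
To make the idea of computing on encrypted data concrete, here is a toy sketch of an additively homomorphic scheme (textbook Paillier). It is an illustration only: the primes are tiny and insecure, the function names are hypothetical, and the challenge entries may have used different (for example, fully or somewhat homomorphic) schemes.

```python
import math
import random

def keygen(p=61, q=53):
    """Toy key generation with tiny primes (insecure, for illustration)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse, Python 3.8+
    return n, (lam, mu)

def encrypt(n, m, rng=random):
    """Encrypt 0 <= m < n using generator g = n + 1."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) % n2 * pow(r, n, n2) % n2

def decrypt(n, priv, c):
    lam, mu = priv
    return (pow(c, lam, n * n) - 1) // n * mu % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return c1 * c2 % (n * n)
```

For example, a server could combine encrypted allele counts from two sources with `add_encrypted` without ever seeing either count; only the key holder can decrypt the total.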

Page 15

Genome Privacy Challenge 2015

Winners for Homomorphic Encryption

• Stanford/MIT
• IBM
• Microsoft

Page 16

Consent Management System

Do I wish to disclose data D to U?

Sharing Look-up

Yes

Patient I

Patient Interface

I can check that U looked at my data D

• Data use agreements

• Study registry

Trusted broker

Healthcare Institutions

User U requests Data D on individual I

Sharing
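
The flow on this slide (user U requests data D on individual I, the broker looks up I's sharing preference, and I can later check who looked) could be sketched as follows. This is a hypothetical in-memory model, not the iDASH consent system; the class and method names are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class ConsentBroker:
    # (patient, data_type) -> set of users the patient agreed to share with
    preferences: dict = field(default_factory=dict)
    # every request is recorded so the patient can audit access later
    access_log: list = field(default_factory=list)

    def allow(self, patient, data_type, user):
        """Patient answers 'Yes' to: do I wish to disclose data D to U?"""
        self.preferences.setdefault((patient, data_type), set()).add(user)

    def request(self, user, data_type, patient):
        """Sharing look-up: grant only if the patient has consented."""
        granted = user in self.preferences.get((patient, data_type), set())
        self.access_log.append((patient, data_type, user, granted))
        return granted

    def who_looked(self, patient, data_type):
        """Patient view: users whose requests for this data were granted."""
        return [
            u for p, d, u, ok in self.access_log
            if p == patient and d == data_type and ok
        ]
```

Logging denied requests as well as granted ones is a design choice here; it lets an auditor see attempted access, not just successful access.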

Page 17

Ohno-Machado L. To Share or Not To Share: That Is Not the Question. Science Translational Medicine, 2012 4(165)

homomorphic encryption

secure multiparty computation

iDASH “commons”

Sharing Data, Tools, Systems

differential privacy

indexing

Page 18

Research Data, Clinical Data, Applications, Integration

2008-2009, 2010-2011, 2012-2013, 2014-2015

Electronic Health Record System: Epic & Clarity

Other Systems: PACS, lab, etc.

Personnel Systems: Active Directory

Query Tools: UC-ReX Explorer

Privacy Technology

Clinical Research Data: RedCAP, Velos, Other DBs

iDASH HIPAA SHADE: Images, human genomes, etc.

Analytical Tools

Recruitment, Consent tools, Custom Apps

VA LA Clinics

UCSF

Davis

Irvine

UCLA

Healthcare Clinical Data

Clinical Data Warehouse for Research

Scalable Network (Distributed Analytics Tools)

HIPAA

External data (patient reported data, sensors)

pSCANNER: PCORI CDRN

iDASH HIPAA/FISMA OVERCAST: iDASH, CTRI, School of Medicine

De-ID Tools

UCSD Health Sciences: Building Protected Health Information Networks

SCANNER

BRIGHT

iDASH

PhenDISCO NLM Training Grant

K22, K99s

PCORI contracts

Private Cloud

iCONCUR

UC-ReX

pSCANNER

Accrual for Clinical Trials

CTSA renewal

bioCADDIE

R21, subcontracts

Health System Department

USC/LAC Cedars Sinai

San Mateo

Epic, CDDS, New modules

Intermountain

Page 19

iDASH On-Demand Resources

Safe, HIPAA-compliant annotated data deposit box environment

On-demand, virtualized, elastic, resilient compute and storage technology

HIPAA and non-public data

public data, tools, recipes

Powered by MIDAS

Data, Tools, Recipes

upload & download data

compute request, direct upload & download of proprietary data, tool, recipe

middleware and HIPAA security developed by iDASH

Compute nodes, Memory, Disk storage, Networking

Powered by VMware

AUTOMATED

Page 20

Clinical Research Informatics CTRI

Clinical Trial Management System, RedCAP

Data Concierge Service

Management of iDASH HIPAA cloud

Page 21

iDASH SHADE Repositories

• Based on Kitware MIDAS open-source technology

• File-level access control

• Separate PHI and Non-PHI repositories

• Two Factor Auth (PHI)

https://idash-data.ucsd.edu/

Page 22

Institutions with Signed Agreements

DCA

• National
» UCSD
» Children’s Health Care of Atlanta (GA)
» Long Beach Veterans Affairs Medical Center
» Ortho Kinematics (TX)

• International
» Mahidol University (Thailand)

DUA

• National
» UCSD
» Databetes (NY)
» Tin Man Labs, LLC (TX)
» UMass Dartmouth
» Georgia Institute of Technology
» University of Utah
» The Ola Grimsby Institute (CA)
» The Methodist Hospital Research Institute (TX)
» Wake Forest University Health Systems (NC)

• International
» North West London Hospitals NHS Trust (UK)
» The University Hospital of Leuven (Belgium)
» INRIA (France)
» Newton Circus Pte. Ltd. (Singapore)

Page 23

Repeatable Results

Workflow

Short reads

Index reference

Align to reference

Call variants

Annotate variants

Pick high impact

Deleterious SNPs

Blueprint

Workflow: Short reads, Index reference, Align to reference, Call variants, Annotate variants, Pick high impact, Deleterious SNPs

Context: Reference DB, Test data, Configuration, Helper tools, OS

Instance

Workflow: Short reads, Index reference, Align to reference, Call variants, Annotate variants, Pick high impact, Deleterious SNPs

Context: Reference DB, Test data, Configuration, Helper tools, OS

iDASH On-Demand Resources

Bookshelf

MyDATA

Input

Results

Instance

External Data
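
One way to read this slide is that a blueprint bundles a workflow with its context, and an instance is a blueprint that has been run on some input. The sketch below follows that reading. It is hypothetical: the `Blueprint` class, the `fingerprint` method, and the `instantiate` function are illustrative names, not part of the iDASH platform.

```python
import hashlib
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class Blueprint:
    workflow: tuple   # ordered step names
    context: dict     # reference DB, configuration, helper tools, OS, ...

    def fingerprint(self):
        """Stable hash of workflow plus context, to tell blueprints apart."""
        payload = json.dumps(
            {"workflow": list(self.workflow), "context": self.context},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode()).hexdigest()

def instantiate(blueprint, steps, data):
    """Run the blueprint's steps in order; each step sees the same context."""
    for name in blueprint.workflow:
        data = steps[name](data, blueprint.context)
    return data
```

If every step is deterministic, two instances created from blueprints with the same fingerprint and the same input should give the same results, which is the sense of "repeatable" assumed here.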

Page 24

Collaborative Projects

Linked R01s

• Cardiac Atlas Project (R01HL121754)
» Goal: Develop accurate new methods for analyzing cardiac shape, mechanics and blood flow in CHD patients

• CYCORE: Cyberinfrastructure for Cancer Comparative Effectiveness Research (R01CA177996)
» Goal: Develop a system that improves the capture of patient-reported and objectively measured data from patients in cancer clinical trials

• Privacy-Preserved Sharing and Analysis of Human Genomic Data (R01HG007078)
» Goal: Study and develop a suite of innovative and transformative techniques aimed at achieving practical and cost-effective genomic data protection

• SHARE: Statistical Health Information Release with Differential Privacy (R01GM114612)
» Goal: Develop a toolkit for enabling privacy-preserving health information release to cover different data modalities and study needs

PCORI-funded methods grant to collaborator Li Xiong from Emory
NSF-funded infrastructure grant to collaborator Kevin Patrick
R21 on cloud privacy to Xiaoqian Jiang

Page 25

The Near Future

• Ethics technology
» Instrument policy makers with algorithms and tools to support ethics (including privacy)

• Serve HIPAA-storage and compute needs of a larger community
» Data Discovery Index prototype environment
» Private cloud for protected health information

• Hub infrastructure for large HIPAA-data networks
» FISMA ATO
» Distributed computing