Populating the Common Biorepository Model with caTissue Suite 2.0 Data - Demo Dave Mulvihill January 2012

Populating the Common Biorepository Model with caTissue Suite 2.0 Data - Demo

Dave Mulvihill
January 2012

Background

• The goal for a Common Biorepository Model (CBM) is to reduce the time and effort required by researchers to locate a biobank that has the specimens they need

• ETL scripts have been provided with caTissue Suite v2.0 to populate the CBM database

• Summary-level data

Procedure

• https://wiki.nci.nih.gov/display/caTissuedoc/Loading+caTissue+Data+into+a+Common+Biorepository+Model+Database

• Many transformations called by 4 ETL jobs
  1. CPAvailability
     a) Requires user/data owner intervention
  2. Organization
  3. CollectionProtocol Data
  4. Collection Summary Data
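The job list above can be sketched as a small driver. This is only an illustration: the job names are taken from the slide, but the assumption that the jobs run in the numbered order, the `runners` callables, and the `confirm_review` callback are all hypothetical stand-ins for however the ETL jobs are actually launched and reviewed.

```python
from typing import Callable, Dict, List

# Job names as listed on the slide. Running them in this order is an assumption.
JOB_ORDER: List[str] = [
    "CPAvailability",
    "Organization",
    "CollectionProtocol Data",
    "Collection Summary Data",
]


def run_jobs(
    runners: Dict[str, Callable[[], None]],
    confirm_review: Callable[[], bool],
) -> List[str]:
    """Run the jobs in order and return the names of the jobs that ran.

    `runners` maps each job name to a hypothetical launcher callable.
    `confirm_review` is a hypothetical check that the user/data owner has
    finished the manual step that CPAvailability requires.
    """
    completed: List[str] = []
    for name in JOB_ORDER:
        runners[name]()
        completed.append(name)
        # Stop after CPAvailability until the data owner has intervened.
        if name == "CPAvailability" and not confirm_review():
            break
    return completed
```

Stopping after the first job keeps the manual step explicit, so the remaining jobs are not run against unreviewed availability data.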

CP Availability Process

1. Run the CPAvailability ETL job

2. Set the CP mask flag, specimen and annotation values in the Excel file

3. Run the CPAvailability ETL job again

4. The final transformation is used in downstream ETL jobs

Note: Subsequent runs of the ETL job compare previous profile responses to the current Collection Protocols

Flowchart labels: Start, caTissue Updated
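The comparison of previous profile responses against the current Collection Protocols could look roughly like the sketch below. It is not the ETL job's actual logic: the function name, the dictionary layout, and the `mask` field are hypothetical, and the sketch only shows the general idea of carrying earlier responses forward and flagging protocols that still need one.

```python
from typing import Dict, List, Tuple


def reconcile_profiles(
    previous: Dict[str, dict],
    current_cps: List[str],
) -> Tuple[Dict[str, dict], List[str], List[str]]:
    """Compare earlier responses with the current list of Collection Protocols.

    Returns (kept, new, retired):
      kept    - earlier responses for protocols that still exist
      new     - current protocols with no earlier response
      retired - protocols with an earlier response that no longer exist
    """
    kept = {cp: previous[cp] for cp in current_cps if cp in previous}
    new = [cp for cp in current_cps if cp not in previous]
    current = set(current_cps)
    retired = [cp for cp in previous if cp not in current]
    return kept, new, retired


# Hypothetical example data: protocol names and the "mask" field are made up.
previous_responses = {"CP-A": {"mask": False}, "CP-B": {"mask": True}}
kept, new, retired = reconcile_profiles(previous_responses, ["CP-A", "CP-C"])
```

In this example `CP-A` keeps its earlier response, `CP-C` is new and needs values set in the Excel file, and `CP-B` no longer appears in the current list.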

Testing Scenarios

• Verify data in preCPAvailabilityMapping.xls

• Verify data in CPAvailAnnotationMapping.xls
  – Annotation Availability Profile ID
  – Specimen Availability Summary Profile ID

• Verify all tables populated by ETL
  – Tables listed in documentation
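One way to check that tables were populated is to count the rows in each one. The sketch below is an assumption-laden illustration: it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for the real CBM database, and the table names are placeholders, not the tables listed in the documentation.

```python
import sqlite3
from typing import Dict, Iterable, List


def count_rows(conn: sqlite3.Connection, tables: Iterable[str]) -> Dict[str, int]:
    """Return the row count for each table name in `tables`."""
    counts: Dict[str, int] = {}
    for table in tables:
        # Names come from a fixed, trusted list, so quoting them is sufficient here.
        row = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
        counts[table] = row[0]
    return counts


def empty_tables(counts: Dict[str, int]) -> List[str]:
    """Return the names of tables that have no rows, sorted alphabetically."""
    return sorted(name for name, n in counts.items() if n == 0)


# Stand-in database with placeholder table names (not the real CBM schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE organization (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE collection_protocol (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO organization (name) VALUES ('Example Biobank')")
conn.commit()

counts = count_rows(conn, ["organization", "collection_protocol"])
missing = empty_tables(counts)
```

Against the real database, the table list would come from the documentation, and any name returned by `empty_tables` would point to an ETL job worth re-checking.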

Known Issues

• CBM team resolving “Not Specified” diagnosis, specimen type and anatomic location.

• SQL Join for caTissue Oracle database will need to be set manually (this will be added to the documentation)

• Issue with caTissue -> CBM diagnosis mapping (this will be corrected soon)

• Preservation not mapped

Questions

• Questions before demo?