Bringing OpenClinica Data into SAS
![Page 2: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/2.jpg)
CRIC and OpenClinica
- CRIC supports a wide variety of studies
  ◦ ‘Regulatory’ clinical trials
  ◦ Many different types of academic study
  ◦ Variable size and complexity
- Investigators design their own CRFs
  ◦ CRIC has limited control over design strategies and CRF consistency.
- Analysis requirements and data formats vary
  ◦ SPSS, Stata, SAS, Excel.
- CRIC’s preferred data handling tool is SAS
![Page 3: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/3.jpg)
OpenClinica Export
- OpenClinica exports seem difficult for our users to work with.
- Data structures vary depending on the data content.
  ◦ CRF versions (repeat as extra columns)
  ◦ Group contents (number of repeats)
- Multi-select objects are difficult to handle.
  ◦ Must be ‘broken’ into separate variables for analysis.
- Null values are represented as text in otherwise numeric variables.
![Page 4: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/4.jpg)
The Challenge
We wanted to:
- Produce consistently usable data for minimal up-front effort.
- Get data that could easily be transferred into different formats.
- Produce tall, thin, de-normalized data sets suitable for data management purposes.
- Leverage CRF metadata to add value:
  ◦ Dataset labels
  ◦ Variable labels
  ◦ SAS formats and informats
  ◦ SAS special missing values
![Page 5: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/5.jpg)
The Solution
- Create ‘SAS friendly’ XML to be read by the XML LIBNAME engine.
- Create a SAS XMLMap file to assign labels, data types, informats and formats.
- Generate a CNTLIN data set in the XML suitable for use by PROC FORMAT.
- Note: The XML file can also be imported directly into MS Access.
![Page 6: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/6.jpg)
Development Approach
- SAS macros or external utility?
  ◦ High complexity
    - Ensure OpenClinica metadata is translated into legal SAS names.
    - Map the OC hierarchy to SAS data sets: CRFs, sections, groups and data items to tables, rows and columns.
    - De-duplicate object names.
  ◦ No resource to develop complex macros
![Page 7: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/7.jpg)
The Choice
- Command Line Java Utility
  ◦ Programmer available (I would have to write SAS code myself!)
  ◦ Capable development environment
  ◦ Portable (Windows / Linux)
  ◦ Callable from within SAS
![Page 8: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/8.jpg)
Data Processing
- Enter connection parameters and study identifier (interactively or on the command line)
- Connect to Postgres via JDBC
- Read study metadata
- Manipulate the metadata
- Write map file
- Read study data
- Write data file
![Page 9: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/9.jpg)
Metadata Manipulations
- Legalize names
  ◦ SAS names <= 32 characters
  ◦ Must start with a letter or underscore
  ◦ Format names cannot end in a number
- De-duplicate names
  ◦ Multiple CRFs may contain the same section and response option names.
  ◦ Duplicate names have numbers and underscores appended.
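The naming rules above can be sketched in a few lines. This is an illustrative Python sketch, not the utility's Java code: the function names `legalize` and `deduplicate`, the choice of underscore as the replacement character, and the exact suffix pattern (`_1`, `_2`, ...) are assumptions made for the example.

```python
import re

MAX_LEN = 32  # SAS name length limit quoted on the slide


def legalize(name, is_format=False):
    """Turn an arbitrary label into a name that follows the slide's rules."""
    # Assumption: any character other than a letter, digit or underscore
    # is replaced by an underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name.strip())
    # Names must start with a letter or underscore.
    if not cleaned or not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "_" + cleaned
    cleaned = cleaned[:MAX_LEN]
    # Format names cannot end in a digit, so close them with an underscore.
    if is_format and cleaned[-1].isdigit():
        cleaned = cleaned[:MAX_LEN - 1] + "_"
    return cleaned


def deduplicate(names, is_format=False):
    """Legalize each name and append a numbered suffix to repeats."""
    seen = set()  # SAS names are case-insensitive, so compare in upper case
    result = []
    for raw in names:
        base = legalize(raw, is_format)
        candidate = base
        counter = 1
        while candidate.upper() in seen:
            suffix = f"_{counter}_" if is_format else f"_{counter}"
            candidate = base[:MAX_LEN - len(suffix)] + suffix
            counter += 1
        seen.add(candidate.upper())
        result.append(candidate)
    return result
```

Names are compared case-insensitively because SAS treats `Visit_Date` and `visit_date` as the same name; the suffix is trimmed into the 32-character limit rather than appended past it.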
![Page 10: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/10.jpg)
Metadata Manipulations
- CRFs
  ◦ No ‘top level’ mapping between CRFs and data sets.
- CRF Section -> SAS data set
  ◦ CRF sections contain logically grouped data – CRFs may not!
  ◦ CRFs containing multiple sections result in multiple output data sets.
  ◦ Every data item contained within a section is output to the same data set.
  ◦ Section label -> dataset name
  ◦ Section title -> dataset label
![Page 11: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/11.jpg)
Metadata Manipulations
- Groups -> Rows
  ◦ Ungrouped section data is repeated in each row.
  ◦ Each repeat becomes a separate row in the data set.
  ◦ Rows are numbered to provide a unique key based on their order within the group.
  ◦ Multiple groups contained within the same section are merged based on order within the groups.
  ◦ Where groups contain unequal numbers of rows, missing values result.
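A small sketch may make the group-to-row rule concrete. It is written in Python for illustration only; the function name `section_rows`, the `row_number` key and the dictionary-based data layout are hypothetical and are not taken from the utility.

```python
from itertools import zip_longest


def section_rows(ungrouped, groups):
    """Build the output rows for one CRF section.

    ungrouped: dict of item name -> value (repeated on every row)
    groups:    dict of group name -> list of dicts, one dict per repeat
    """
    if not groups:
        # A section without repeating groups yields a single row.
        return [dict(ungrouped, row_number=1)]

    # Remember each group's columns so padded rows still carry them.
    columns = {
        name: sorted({key for repeat in repeats for key in repeat})
        for name, repeats in groups.items()
    }

    rows = []
    # zip_longest pairs the 1st repeat of every group, then the 2nd, and so
    # on, filling with None once a shorter group has run out of repeats.
    for ordinal, combo in enumerate(zip_longest(*groups.values()), start=1):
        row = dict(ungrouped)
        row["row_number"] = ordinal  # unique key within the section
        for name, repeat in zip(groups, combo):
            for column in columns[name]:
                row[column] = None if repeat is None else repeat.get(column)
        rows.append(row)
    return rows
```

With two medication repeats and one adverse-event repeat, the second output row carries the second medication and a missing value (`None`) in the adverse-event column.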
![Page 12: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/12.jpg)
Metadata Manipulations
- CRF items -> dataset variables
  ◦ Item_name -> variable name
  ◦ Description_label -> variable label
- Calculate length of character variables
  ◦ SAS has no support for VARCHARs. Explicitly specifying variable length saves considerable space on disk.
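The length calculation amounts to finding the longest value held in each character column. The Python sketch below illustrates the idea; the function name `character_lengths` is hypothetical, and measuring length in UTF-8 bytes is an assumption (SAS lengths are byte counts, but the session encoding is not stated in the slides).

```python
def character_lengths(rows, minimum=1):
    """Return the longest observed byte length for each character column."""
    lengths = {}
    for row in rows:
        for column, value in row.items():
            if isinstance(value, str):
                # Assumption: data is stored as UTF-8, so count bytes.
                size = len(value.encode("utf-8"))
                lengths[column] = max(lengths.get(column, minimum), size)
    return lengths
```

Numeric columns are skipped, so only character variables receive an explicit length.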
![Page 13: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/13.jpg)
Multi-select and Checkbox items
- A new column is created for each response value.
  ◦ Column names are based on item_name.
  ◦ Columns are labeled based on item_label and response option value.
  ◦ Columns contain 1 or 0 to indicate selected or unselected.
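As an illustration of the splitting rule, the Python sketch below expands one multi-select item into 1/0 indicator columns. The function name `expand_multiselect`, the `item_value` column-naming pattern, the label wording and the comma-separated input format are all assumptions made for the example, not details confirmed by the slides.

```python
def expand_multiselect(item_name, item_label, options, selected):
    """Split one multi-select answer into indicator columns.

    options:  list of (value, text) pairs from the response option list
    selected: the stored answer, assumed here to be comma-separated values
    """
    chosen = {value.strip() for value in selected.split(",") if value.strip()}
    columns = {}
    labels = {}
    for value, text in options:
        column = f"{item_name}_{value}"          # hypothetical naming pattern
        columns[column] = 1 if value in chosen else 0
        labels[column] = f"{item_label}: {text}"  # hypothetical label pattern
    return columns, labels
```

Every option gets a column even when it was not selected, so the data set has the same shape for every subject.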
![Page 14: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/14.jpg)
Response Options
- Response option lists become SAS formats and informats.
- Format names are created from the CRF item’s response_label.
  ◦ Format names are legalized and de-duplicated.
  ◦ If separate CRFs contain identical response option lists, only one format results.
- Formats and informats are written to the XML as a new data table.
  ◦ This is used as a CNTLIN data set for PROC FORMAT.
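The sketch below shows, in Python, how response option lists might be turned into CNTLIN-style rows while collapsing identical lists into a single format. `FMTNAME`, `TYPE`, `START` and `LABEL` are standard CNTLIN variables (`TYPE` of `N` marks a numeric format); the function name `build_cntlin` and the input layout are hypothetical, and format names are assumed to be already legalized.

```python
def build_cntlin(response_sets):
    """Create CNTLIN-style rows from response option lists.

    response_sets: list of (item_name, format_name, options) where
                   options is a list of (value, text) pairs.
    Returns the rows plus a mapping of item name -> format to apply.
    """
    by_options = {}  # option list -> format name that already covers it
    assigned = {}    # item name -> format name
    rows = []
    for item_name, format_name, options in response_sets:
        key = tuple(options)
        if key in by_options:
            # Identical option list seen before: reuse the existing format.
            assigned[item_name] = by_options[key]
            continue
        by_options[key] = format_name
        assigned[item_name] = format_name
        for value, text in options:
            rows.append(
                {"FMTNAME": format_name, "TYPE": "N", "START": value, "LABEL": text}
            )
    return rows, assigned
```

Two items that share the same yes/no list therefore point at one format, and only one set of rows is written for it.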
![Page 15: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/15.jpg)
Missing Values
Informats are created to read numeric data and handle OpenClinica null values.
CRF Dates

    proc format;
      invalue crfdate
        'ASKU' = .k
        'NA'   = .a
        'NASK' = .d
        'NI'   = .i
        'NP'   = .p
        'OTH'  = .o
        'UNK'  = .u
        other  = [mmddyy10.];
    run;
![Page 16: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/16.jpg)
Missing Values
Numeric Response Options

    proc format;
      invalue bestnull
        'ASKU' = .k
        'NA'   = .a
        'NASK' = .d
        'NI'   = .i
        'NP'   = .p
        'OTH'  = .o
        'UNK'  = .u
        other  = [best10.];
    run;
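For readers less familiar with SAS informats, the lookup performed by `bestnull` can be mimicked in Python. This is only an analogy: the function name `read_numeric` and the tuple return shape are invented for the example, and the code mapping simply repeats the one in the informat.

```python
# OpenClinica null codes and the SAS special missing value each maps to,
# copied from the informat definition.
NULL_CODES = {
    "ASKU": ".K",
    "NA": ".A",
    "NASK": ".D",
    "NI": ".I",
    "NP": ".P",
    "OTH": ".O",
    "UNK": ".U",
}


def read_numeric(raw):
    """Interpret a raw exported value the way the informat would."""
    text = raw.strip()
    if text == "":
        return ("missing", ".")  # ordinary SAS missing value
    if text in NULL_CODES:       # exact, case-sensitive match
        return ("missing", NULL_CODES[text])
    return ("value", float(text))  # everything else is read as a number
```

The point of the special missing values is that the reason for missingness survives the import while the variable itself stays numeric.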
![Page 17: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/17.jpg)
Missing Values
Formats are created for CRF data.
Response options

    proc format;
      value yesno
        0  = 'No'
        1  = 'Yes'
        .k = 'ASKU'
        .a = 'NA'
        .d = 'NASK'
        .i = 'NI'
        .p = 'NP'
        .o = 'OTH'
        .u = 'UNK';
    run;
![Page 18: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/18.jpg)
Missing Values
Dates

    proc format;
      value crfdate
        .k    = 'ASKU'
        .a    = 'NA'
        .d    = 'NASK'
        .i    = 'NI'
        .p    = 'NP'
        .o    = 'OTH'
        .u    = 'UNK'
        other = [date9.];
    run;
![Page 19: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/19.jpg)
Missing Values
Numeric Data

    proc format;
      value bestnull
        .k    = 'ASKU'
        .a    = 'NA'
        .d    = 'NASK'
        .i    = 'NI'
        .p    = 'NP'
        .o    = 'OTH'
        .u    = 'UNK'
        other = [best10.];
    run;
![Page 20: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/20.jpg)
Data Set Output
- CRF Data
  ◦ One data set per CRF section
- Each row contains:
  ◦ Study ID
  ◦ Site ID
  ◦ Subject ID
  ◦ Study event name
  ◦ Event start and end date
  ◦ CRF Name
  ◦ CRF Version
![Page 21: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/21.jpg)
Output Data Sets
- Subject Data
  ◦ List of subjects including site, secondary ID, group, etc.
- Event Data
  ◦ List of subjects’ study events including start date, end date and status.
- CRF Status
  ◦ List of subject CRFs including event details, CRF version, creation date, completion date and status.
- Discrepancies
![Page 22: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/22.jpg)
Output Data Sets
- Data for removed subjects is not exported.
- PHI data remains encrypted.
![Page 23: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/23.jpg)
Interactive Execution

    C:> java -jar export.jar
    ----------------------------------------
     Export Output:
    ----------------------------------------
     MAP FILE: export.map.xml
     EXPORT FILE: export.xml
    ----------------------------------------
    Postgresql driver loaded
    Enter Database url (default: localhost):
    Database port (default: 5432):
    Database name (default: openclinica):
    username (default: clinica):
    password:
    Enter Export file name (default: derived from study):
    Enter Map file name (default: derived from study):
![Page 24: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/24.jpg)
Interactive Execution

    Successful connection to database openclinica on jdbc:postgresql://localhost:5432/
    Please choose a study:
    ----------------------
     1) Study1
     2) Study2
     3) Study3
     4) Study4
    ==> 1
    Retrieving study metadata
    Creating subject table
    Writing formats to .xml file
    Writing subjects to .xml file
    Retrieving study item data
    Writing study item data to file
    Complete
    Files generated:
     study1.map.xml
     Study1.xml
![Page 25: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/25.jpg)
Command Line Options
- Command line options may be used rather than prompts. Options include:
  ◦ Host, database, ID and password
  ◦ Study OID
  ◦ File names
  ◦ Suppression of map file
  ◦ Creation of ‘SPSS friendly’ SAS data sets
- Minimal formatting allows data sets to be exported to SPSS using PROC EXPORT.
- Command line options allow the utility to be executed from within SAS.
![Page 26: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/26.jpg)
SAS Code
Define libraries

    libname ocdata92 xml92 "data_file.xml" xmlmap="map_file.map" access=readonly;
    libname library "c:\project\fmt";
    libname studylib "c:\project\data";
![Page 27: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/27.jpg)
SAS Code
Execute the Import

    %let scommand  = java -Xmx256m -jar c:\export\export.jar;
    %let shost     = -h 10.11.12.13;
    %let sport     = -p 5432;
    %let sstudy    = -soid S_STDY1234;
    %let sdatabase = -D openclinica;
    %let suser     = -U dbuserid;
    %let spswd     = -P password;
    %let spss      = ;
    /* File-name options are left blank so the utility derives the names from the study. */
    %let smapFile  = ;
    %let sdataFile = ;
    X "&scommand &shost &sport &sstudy &sdatabase &suser &spswd &smapFile &sdataFile &spss";
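The same command line can be assembled outside SAS. The Python sketch below only builds the argument list and does not run anything; the option flags (`-h`, `-p`, `-soid`, `-D`, `-U`, `-P`) are the ones shown in the macro variables above, while the function name `build_export_command` and its parameters are hypothetical.

```python
def build_export_command(jar, host, port, study_oid, database, user, password,
                         java="java", max_heap="256m"):
    """Return the argument list for one run of the export utility."""
    return [
        java, f"-Xmx{max_heap}", "-jar", jar,
        "-h", host,
        "-p", str(port),
        "-soid", study_oid,
        "-D", database,
        "-U", user,
        "-P", password,
    ]
```

Keeping the arguments as a list (rather than one string) avoids quoting problems if the list is later handed to something like `subprocess.run`.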
![Page 28: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/28.jpg)
SAS Code
Create the Format Catalog from the XML

    proc sort data=ocdata92.fmtlib out=work.fmtlib;
      by fmtname type start;
    run;

    proc format cntlin=work.fmtlib library=library fmtlib;
    run;
![Page 29: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/29.jpg)
SAS Code
Copy the Data Sets

    proc datasets library=ocdata92;
      copy out=studylib;
      exclude fmtlib;
    quit;
![Page 30: Bringing OpenClinica Data into SAS](https://reader033.fdocuments.in/reader033/viewer/2022061515/557cbff8d8b42ab37c8b5379/html5/thumbnails/30.jpg)
Do It!
- Import into SAS
- If we have time:
  ◦ XML Structures
  ◦ Import into Access
  ◦ Import into Excel