Machine Learning: How to Implement Operational Predictions and Why these Insights are Key to Business Success
Elvin Thalund
Director, Industry Strategy
Oracle Health Sciences
© 2020 Oracle
Safe harbor statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, timing, and pricing of any features or functionality described for Oracle’s products may change and remains at the sole discretion of Oracle Corporation.
Program agenda
1. How to Implement Machine Learning based Operational Predictions
2. Why Machine Learning Insights are Key to Business Success
How to Implement Machine Learning based Operational Predictions
What is Machine Learning?
1 - Daniel Bourke - 2020 Machine Learning Roadmap - Jul 12, 2020. https://www.youtube.com/watch?v=pHiMN_gy9mk
What are these components in Clinical Operations?
Predictions in study startup
• What we start with is
  • An industry standard metric to normalize data collection and analytics. MCC is critical in defining the industry standard.
  • As much current quality data as possible. In our case, a list of normalized and anonymized historical study site data.
  • This dataset has
    • Inputs – Leading indicators/features: study and site features
    • Outputs – Lagging indicators/targets: duration from start to target milestone
• Figures out
  • A prediction model, which in our case is many gradient boosted decision trees
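The split between inputs (leading indicators) and outputs (lagging indicators) can be sketched as a small record type. This is only an illustration: the field names, the dates, and the `SiteRecord` class are hypothetical, since the talk does not show the actual schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SiteRecord:
    # Leading indicators (inputs). Field names are hypothetical.
    country_code: str
    therapeutic_area: str
    phase: str
    irb_ec_type: str
    # Milestone dates used to derive the lagging indicator (output).
    start_milestone: date
    target_milestone: date

    def features(self) -> dict:
        """Inputs: study and site features known at the start milestone."""
        return {
            "country_code": self.country_code,
            "therapeutic_area": self.therapeutic_area,
            "phase": self.phase,
            "irb_ec_type": self.irb_ec_type,
            "start_month": self.start_milestone.month,
        }

    def target(self) -> int:
        """Output: duration in days from start to target milestone."""
        return (self.target_milestone - self.start_milestone).days


# Made-up example record, not data from the talk.
record = SiteRecord("USA", "Oncology", "III", "Central",
                    date(2019, 1, 10), date(2019, 4, 20))
print(record.features())
print(record.target())  # 100
```

A training set would then be a list of such records, with `features()` as the model input and `target()` as the value to predict.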
Predicting milestones using machine learning is not new!
• Gives you multiple options, in this case 3.
• One is suggested as the best route.
• You can choose a different route if you have better information.
• Gives you information about current bottlenecks.
• So you could get there by 5.40!
Standardized milestone model for study site startup
[Figure: Study Site Startup map. Counties = Functional areas, Cities = Milestones, Roads = Cycle times.]
Functional areas shown: Identification, Selection, Qualification, Sponsor infrastructure, Activation, Enrollment
Milestones and infrastructure items shown: Site ID, CDA Signed, Selected, Not Selected, Country Selected, Site Qualified, Country Qualified, Study Package Available, Ethical Approved, Essential Documentation, Contracted/Indemnity, Investigator Meeting, Site Initiation Visit, Study/Country Technical Infrastructure, Site Infrastructure, All technical Infrastructure, System 1 Infrastructure, System n Infrastructure, Non-technical Infrastructure, Country IP Ready, IP Release, Activated/Enrollment Ready, Enrolled, Study Enrolled
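The map analogy (cities = milestones, roads = cycle times) can be sketched as a small directed graph. The connections and the durations below are assumptions made up for illustration; the slide does not state which milestones are linked or how long each cycle takes.

```python
# Hypothetical roads (cycle times in days) between milestones (cities).
# Neither the connections nor the numbers come from the talk.
CYCLE_TIMES = {
    ("Site ID", "CDA Signed"): 14,
    ("CDA Signed", "Site Qualified"): 30,
    ("Site Qualified", "Selected"): 10,
    ("Selected", "Ethical Approved"): 45,
    ("Selected", "Contracted/Indemnity"): 60,
    ("Ethical Approved", "IP Release"): 20,
    ("Contracted/Indemnity", "IP Release"): 15,
}


def route_duration(route):
    """Sum the cycle times along a route given as a list of milestones."""
    return sum(CYCLE_TIMES[(a, b)] for a, b in zip(route, route[1:]))


via_ethics = ["Selected", "Ethical Approved", "IP Release"]
via_contract = ["Selected", "Contracted/Indemnity", "IP Release"]
print(route_duration(via_ethics))    # 65
print(route_duration(via_contract))  # 75
```

In this toy map the contracting route is the slower one, which is the kind of bottleneck a navigation-style view is meant to expose.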
Leading indicators/features – Location, location and location!
[Figure: Study country]
Leading indicators/features – Seasonality/resource availability
[Figure: When you start matters – Study package sent to site]
Leading indicators/features – Complexity
[Figure: Number of countries in study]
Leading indicators/features – Therapeutic Area!
[Figure: Study]
Leading indicators/features – Others
• Study features
  • Size/complexity: Phase
• Site features
  • Red tape: IRB/EC type (local vs central IRB)
• Investigator features
  • Experience: number of studies
Define the machine learning technique that fits your problem
We use gradient boosted decision trees for milestone prediction
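To make the idea of gradient boosted decision trees concrete, here is a minimal from-scratch sketch for squared error, using one-split trees (stumps). It is not the model from the talk: the two features (local IRB flag, number of countries), the toy durations, and all function names are assumptions for illustration, and a production system would normally rely on an established library.

```python
def fit_stump(X, residuals):
    """Find the single-feature threshold split that best fits the residuals."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = (sum((r - left_mean) ** 2 for r in left)
                     + sum((r - right_mean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, j, threshold, left_mean, right_mean)
    return best[1:]  # (feature index, threshold, left value, right value)


def stump_predict(stump, row):
    j, threshold, left_value, right_value = stump
    return left_value if row[j] <= threshold else right_value


def fit_gbdt(X, y, n_trees=50, learning_rate=0.1):
    """Boosting loop: each new stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [t - p for t, p in zip(y, predictions)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump_predict(stump, row)
                       for p, row in zip(predictions, X)]
    return base, stumps, learning_rate


def predict(model, row):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)


# Toy data: [is_local_irb, number_of_countries] -> duration in days (made up).
X = [[0, 5], [0, 15], [1, 5], [1, 15], [0, 10], [1, 10]]
y = [80, 100, 140, 160, 90, 150]
model = fit_gbdt(X, y)
print(round(predict(model, [0, 10])), round(predict(model, [1, 10])))
```

Each round adds a small correction in the direction that reduces the remaining error, which is why many shallow trees together can fit a complex relationship.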
What is the process to build and implement machine learning?
• Customers have individual configurations, but these are linked to common milestones
• A volume of clean data goes into a normalized and anonymized model
• The data is fed to machine learning
• Resulting in a prediction model
• Supporting customer comparisons and predictions
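The normalization and anonymization step can be sketched as follows. Everything here is hypothetical: the customer labels, the mapping table, and the use of a salted hash as a stand-in for whatever anonymization the real pipeline applies.

```python
import hashlib

# Hypothetical mapping from customer-specific milestone labels to common ones.
MILESTONE_MAP = {
    "customer_a": {"Site Green Light": "Activated/Enrollment Ready",
                   "EC OK": "Ethical Approved"},
    "customer_b": {"Ready to Enroll": "Activated/Enrollment Ready",
                   "IRB Approval": "Ethical Approved"},
}


def anonymize(identifier: str, salt: str) -> str:
    """Replace an identifier with a truncated salted SHA-256 digest."""
    return hashlib.sha256((salt + identifier).encode("utf-8")).hexdigest()[:12]


def normalize(customer: str, row: dict, salt: str = "example-salt") -> dict:
    """Map a customer row onto common milestone names and hide identities."""
    return {
        "customer": anonymize(customer, salt),
        "site": anonymize(row["site"], salt),
        "milestone": MILESTONE_MAP[customer][row["milestone"]],
        "date": row["date"],
    }


row_a = normalize("customer_a", {"site": "Site 001", "milestone": "EC OK",
                                 "date": "2019-03-01"})
row_b = normalize("customer_b", {"site": "Clinic 17", "milestone": "IRB Approval",
                                 "date": "2019-05-12"})
print(row_a["milestone"], row_b["milestone"])
```

Once both customers' rows use the same milestone names, they can be pooled into one training set and compared against each other.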
[Figure: data from multiple customers is normalized and anonymized, then fed to machine learning; leading indicators go into the resulting model, which returns a prediction to each customer.]
What determines when we will get there? – Predicting from machine learning!
[Figure: route on the Study Site Startup map from a start milestone to a destination milestone. Counties = Functional areas, Cities = Milestones, Roads = Cycle times.]
Map labels: Activation, Qualification, Country Contract Template, Contract Sent to Site, Site Contracting, Study Package Available, Country Study Package Available, Site Study Package Available, Study Package Sent to Site, First Approved Protocol, First Ethical Submission, Site Qualified, Country Qualified, Selected, Contracted/Indemnity, Ethical Approved, Essential Documentation
Leading features/indicators
Number of countries: 15
Sites in study: 141
Sites in country: 29
Start month: 11
PI in study counts: 12
Therapeutic area: Oncology
Phase: III
Country code: USA
IRB/EC type: Central
Prediction
Duration (days): 101
Nov 22, 2019 + 101 days = Mar 2, 2020
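The date arithmetic in this example can be checked directly. The variable names are illustrative; the start date and the 101-day prediction are the values shown on the slide.

```python
from datetime import date, timedelta

start_milestone = date(2019, 11, 22)   # start date from the slide
predicted_duration_days = 101          # model output from the slide

destination = start_milestone + timedelta(days=predicted_duration_days)
print(destination.isoformat())  # 2020-03-02
```

Note that 2020 is a leap year, so the route crosses a 29-day February.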
It is not just a theory
While currently more of a proof of concept, it is real!
Why Machine Learning Insights are Key to Business Success
Using SHAP to graphically reverse-engineer the output
Summary plot of site activation (IP Release)
Feature importance for study sites, using a data set of approximately 40,000 sites. Each study site is represented as a single dot for each indicator under investigation. The horizontal position of the dot is the impact of that indicator on the model’s prediction for the study site. The color of the dot (e.g., red for local IRB and blue for central IRB) represents the value of that indicator for the study site.
A negative SHAP value for cycle times is desirable, because that represents a decrease in cycle time.
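SHAP values are based on Shapley values, which split a prediction into one contribution per indicator. The sketch below computes exact Shapley values by brute force for a tiny hypothetical model, relative to a single reference site. It is not the SHAP library and not the model from the talk; the `model` function, its numbers, and the reference site are all made up.

```python
from itertools import permutations


def model(site):
    """Hypothetical cycle-time model in days; not the model from the talk."""
    days = 90
    days += 40 if site["irb_ec_type"] == "Local" else 0
    days += 10 if site["country"] != "USA" else 0
    days -= 5 if site["pi_studies"] > 10 else 0
    return days


def shapley_values(model, site, reference):
    """Exact Shapley values of `site` relative to a single reference site."""
    features = list(site)
    orderings = list(permutations(features))
    totals = dict.fromkeys(features, 0)
    for order in orderings:
        current = dict(reference)
        previous = model(current)
        for name in order:
            current[name] = site[name]  # switch this feature to the site's value
            output = model(current)
            totals[name] += output - previous
            previous = output
    # Average the marginal contributions over all feature orderings.
    return {name: total / len(orderings) for name, total in totals.items()}


reference = {"irb_ec_type": "Central", "country": "USA", "pi_studies": 3}
site = {"irb_ec_type": "Local", "country": "USA", "pi_studies": 12}
values = shapley_values(model, site, reference)
print(values)
```

In this toy case the local IRB adds days (positive value) while the experienced investigator removes a few (negative value), and the contributions add up to the difference between the two predictions.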
1 – Oracle - ChromoReport - Spring 2020. https://www.oracle.com/a/ocom/docs/chromo-report-spring-2020.pdf
Understanding the actual feature importance
Summary bar of site activation (IP Release)
The most important indicators in predicting site activation cycle times are
• IRB/EC type
• Country
IRB/EC type is about two times more important to prediction than the next indicators, which are country and therapeutic area.
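A summary bar of this kind is usually built by averaging the absolute SHAP value of each indicator over all sites. The sketch below assumes that convention; the per-site values are invented so that the ratio comes out at roughly two, and they are not the data behind the slide.

```python
# Hypothetical per-site SHAP values in days (one dict per site).
shap_rows = [
    {"irb_ec_type": 30.0, "country": -12.0, "therapeutic_area": 9.0},
    {"irb_ec_type": -20.0, "country": 8.0, "therapeutic_area": -11.0},
    {"irb_ec_type": 25.0, "country": -16.0, "therapeutic_area": 13.0},
    {"irb_ec_type": -25.0, "country": 12.0, "therapeutic_area": -15.0},
]


def mean_abs_importance(rows):
    """Mean absolute SHAP value per indicator across all sites."""
    return {name: sum(abs(row[name]) for row in rows) / len(rows)
            for name in rows[0]}


importance = mean_abs_importance(shap_rows)
ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
for name, value in ranked:
    print(name, value)

ratio = importance["irb_ec_type"] / importance["country"]
print(round(ratio, 2))
```

Taking absolute values matters: an indicator that speeds up some sites and slows down others would otherwise average out to zero and look unimportant.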
Focus on importance of IRB/EC type in the US
Summary plot of site activation (IP Release) in the US
There is now a clear separation of central IRB/EC type, showing it is clearly preferable to local IRB/EC type when looking at activation timelines.
The dot clustering for central IRB/EC types in the SHAP plot can be attributed to the consistency in operations of central IRB/ECs.
[Figure legend: Central, Local]
Feature importance when focusing on the US
Summary bar of site activation (IP Release) in the US
Now IRB/EC type is about three times more important to prediction than the next indicator, which is therapeutic area.
It is interesting that, when focusing on speed in site activation, experience cannot overcome red tape!
Importance dependencies between red tape and experience
Dependency plot of PI counts vs IRB/EC type in site activation (IP Release) in the US
This plot shows a negative SHAP value for experienced principal investigators (study count over 10) using local IRB/ECs, but does that mean that these investigators also activate faster?
The answer is “no.”
The mean SHAP impact of IRB/EC type is over five-and-a-half times that of PI counts, so less experienced investigators using central IRB/EC will, in most cases, activate faster than more experienced investigators using local IRB/EC.
Takeaways
• Machine learning is
  • Simpler than traditional programming
  • Good at solving complex problems
• Requires
  • Industry metric standardization (MCC)
  • Enough normalized, anonymized quality data
• Will become key to business success
  • Provides accurate values for feature importance
  • Informs process improvements
  • Really effective at scale
Thank You - Next machine learning webinar
Is Artificial Intelligence Critical to Improving Efficiencies and Outcomes in Clinical Trials?
• Elvin Thalund
– Director, Industry Strategy
– Oracle Health Sciences