W07 The Discovery Challenge
on Thrombosis Data
Data Provider: Katsuhiko TAKABAYASHI, MD, Chiba University Hospital, Japan
Anti-Phospholipid antibody Syndrome (APS)
Anti-cardiolipin antibodies (aCL) Lupus anticoagulant (LAC)
Induces thrombotic events (such as AMI, stroke, deep venous thrombosis, miscarriage, pulmonary hypertension, etc.)
Sometimes positive in other collagen diseases (lupus, Sjoegren syndrome)
[Figure: blood coagulation cascade. Factors shown: TF, VII/VIIa, XII/XIIa, XI/XIa, IX/IXa, VIII/VIIIa, X/Xa, V/Va, prothrombin/thrombin, fibrinogen/fibrin, XIII/XIIIa, Protein C; cofactors PL, Ca++, Mg++.]
Collagen Diseases
Autoimmune disease Rheumatic disease Connective tissue disease
Thrombosis: vessel stasis by blood clots (myocardial infarction, stroke, etc.)
autoantibodies
APS
Thrombosis
Who in APS?
When?
Which laboratory data, besides anti-cardiolipin antibodies, are related to thrombosis?
The Goal of this trial (1)
To see whether a data mining technique can correctly point out, from among the many variables we provided, the key factors (aCL, LAC, PT, APTT) that are already known to be related to thrombosis.
Assessment of validity of each study
The Goal of this trial (2)
The results to expect
1) To identify high-risk patients who have no history of thrombosis so far.
2) To predict the time of thrombosis, or to detect changes in some variables in the course of thrombosis, from the series of temporal data.
Evaluation of the results
From the current medical point of view:
Common sense results (positive control)
Probable results
Possible results
Unclear results, difficult to evaluate
Nonsense results (negative control)
We cannot judge what we do not know!
Assessment in domain field
A study most of whose results accord well with current knowledge: domain researchers cannot say that the other, unclear results are also true.
A study most of whose results accord poorly with current knowledge: domain researchers cannot believe the rest of the unclear results!
Medical Data Set
The medical data set here comes from 1,241 patients with collagen diseases; 7 basic laboratory items for aCL were provided for 806 cases.
As temporal laboratory data, 41 items from a total of 57,543 tests over 17 years were prepared.
Seventy-six cases had some thrombotic events in their clinical course.
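To give a rough sense of scale, the event rate implied by these counts can be worked out directly. The sketch below assumes the 76 thrombotic cases are counted among all 1,241 patients (the slide does not state the denominator); the variable names are illustrative only.

```python
# Counts taken from the slide; the choice of denominator is an assumption.
n_patients = 1241     # patients with collagen diseases
n_thrombosis = 76     # cases with some thrombotic event

event_rate = n_thrombosis / n_patients

# A trivial model that always answers "no thrombosis" would be right this often.
majority_baseline = 1 - event_rate

print(f"event rate: {event_rate:.1%}")
print(f"majority-class baseline: {majority_baseline:.1%}")
```

Under that assumption the event rate is about 6%, so an overall accuracy figure alone says little about how well thrombosis cases themselves are detected.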
It can predict patients' health state from special examinations (spe-exams) and laboratory examinations (lab-exams) in 99.28% of cases.
CNS lupus is related to the anti-DNA Ab level and IgM-type aCL.
aCL IgM and anti-DNA Ab levels are independently related to future thrombosis.
Evaluations from medical aspects
Coursac I et al.: the bridge theory; genetic programming
A lot of rules with 100% confidence, but most of them were not useful.
One rule: aCL > 2.4, aCL IgM in the range 1.9 to 2.7, and KCT (-) implies SLE.
Another rule: sex is M and ANA is 0 implies Behcet.
We would like to look at the other rules not written here to find attractive ones.
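One common reason a rule with 100% confidence can still be of little use is that it covers only a handful of patients. The minimal sketch below computes support and confidence for a rule on a made-up toy table; the records, the field names and the helper `rule_stats` are hypothetical and are not taken from the challenge data or from the authors' method.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]


def rule_stats(records: List[Record],
               antecedent: Callable[[Record], bool],
               consequent: Callable[[Record], bool]) -> Tuple[int, float]:
    """Return (support count, confidence) for the rule antecedent => consequent."""
    covered = [r for r in records if antecedent(r)]
    if not covered:
        return 0, 0.0
    hits = sum(1 for r in covered if consequent(r))
    return hits, hits / len(covered)


# Synthetic toy records (NOT real patient data).
toy = [
    {"sex": "M", "ANA": 0, "diagnosis": "Behcet"},
    {"sex": "F", "ANA": 256, "diagnosis": "SLE"},
    {"sex": "F", "ANA": 64, "diagnosis": "SLE"},
    {"sex": "F", "ANA": 0, "diagnosis": "RA"},
    {"sex": "M", "ANA": 16, "diagnosis": "SLE"},
]

support, confidence = rule_stats(
    toy,
    antecedent=lambda r: r["sex"] == "M" and r["ANA"] == 0,
    consequent=lambda r: r["diagnosis"] == "Behcet",
)
print(support, confidence)
```

In this toy table the rule holds every time it applies, yet it applies to a single record, which is why confidence is usually read together with support.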
Evaluations from medical aspects
Boulicaut et al.: δ-strong classification rules
LAC, ANA, U-pro, centromere-type, SSA, SSB, RNP, SM and SCl-70 were strong contributors to predicting the presence of thrombosis.
Other possibilities of thrombosis without aCL antibodies.
Evaluations from medical aspects
Jensen S et al.: CRISP (cross-industry standard process)
Sequential analysis of the temporal data did not show interesting results.
It might be difficult to predict the time of thrombosis. One possibility is that the data have been modified by treatment or prophylaxis.
Evaluations from medical aspects
Jensen S et al
Determined a discriminant function that separates occurrences of thrombosis with very few false negatives.
However, is it possible to translate its meaning so that we can understand it?
Weighting?
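To make the idea of a discriminant function with few false negatives concrete, here is a minimal sketch of a linear score whose threshold is chosen so that no positive case is missed. It is not the function Jensen et al. derived; the weights, the feature vectors and all function names are hypothetical, and the data are synthetic.

```python
from typing import List, Sequence, Tuple


def score(weights: Sequence[float], x: Sequence[float]) -> float:
    """Linear discriminant score: weighted sum of the features."""
    return sum(w * xi for w, xi in zip(weights, x))


def threshold_for_no_false_negatives(scores: List[float], labels: List[int]) -> float:
    """Largest threshold that still flags every positive case (score >= threshold)."""
    return min(s for s, y in zip(scores, labels) if y == 1)


def confusion(scores: List[float], labels: List[int], thr: float) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN) when predicting positive for score >= thr."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 0)
    return tp, fp, fn, tn


# Hypothetical weights and synthetic feature vectors (not fitted to real data).
weights = [0.8, 0.5]
X = [[2.0, 1.0], [0.5, 0.2], [1.5, 2.0], [0.3, 0.1], [2.0, 1.5]]
y = [1, 0, 1, 0, 0]  # 1 = thrombosis, 0 = no thrombosis

scores = [score(weights, x) for x in X]
thr = threshold_for_no_false_negatives(scores, y)
tp, fp, fn, tn = confusion(scores, y, thr)
print(thr, (tp, fp, fn, tn))
```

Lowering the threshold until no positive case is missed typically admits some false positives, and the weights are the part a domain expert would need to be able to interpret.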
Evaluations from medical aspects
Werner J and Fogarty T genetic programming
When the results are beyond the expert's knowledge
Complicated relations may be difficult to explain. No drug trials covering relations among three items have been carried out.
Results that come out of a black box may be ignored by experts simply because they cannot be understood!
Evaluations from medical aspects
Zytkow J and Gupta S: SQL; cross contingency classification
Reasonable results, as with InfoZoom.
ANA pattern analysis
Patients with severe attacks have more possibilities of other attacks.
Thrombosis is related to the level of aCLs.
Alveolar hemorrhage and CNS attacks are not associated with milder attacks.
Evaluations from medical aspects
Beilken and Spenke (InfoZoom): thanks to a user-friendly interface, their test results were easy to understand. They could choose the reasonable and interesting rules.
Levin: WizWhy produced 7,356 rules. Complicated rules are difficult to comment on because of their complexity.
Taylor: in the temporal data, missing data disturbed the analysis. Only common sense findings were selected.
To obtain the good results efficiently
(1) Cleaning of data
Preprocessing of the data by domain researchers who are familiar with the database is essential to minimize noise.
Definition, classification, adjustment, etc.
Recognition of modification by treatment or prophylaxis.
Indication of how to treat missing data.
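As one small, concrete piece of such cleaning, the share of missing values per laboratory item can be tabulated before any mining is attempted. This is only a sketch under assumed conventions: each row is a dictionary of item values in which an absent key or `None` means "not measured", and the item names and values are synthetic.

```python
from typing import Dict, List, Optional


def missing_rate(rows: List[Dict[str, Optional[float]]]) -> Dict[str, float]:
    """Fraction of rows in which each item is absent or None."""
    if not rows:
        return {}
    items = sorted({key for row in rows for key in row})
    return {
        item: sum(1 for row in rows if row.get(item) is None) / len(rows)
        for item in items
    }


# Synthetic rows (NOT real patient data); item names are illustrative.
rows = [
    {"PT": 12.1, "APTT": 30.0, "aCL_IgG": 1.3},
    {"PT": None, "APTT": 28.5},
    {"PT": 11.8, "APTT": None, "aCL_IgG": None},
    {"PT": 12.4, "APTT": 31.2, "aCL_IgG": 0.7},
]

rates = missing_rate(rows)
print(rates)
```

Items with a high missing rate are natural candidates for the domain researchers to comment on: whether the test was simply not ordered, or whether its absence is itself clinically meaningful.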
To obtain the good results efficiently
(2) Introduction of the domain knowledge
To include as much medical knowledge as possible with the data set from the beginning.
To cooperate with domain researchers to obtain domain knowledge during data mining.
Causal relation
Misjudgment of temporal meaning
Bacteria invade → pneumonia occurs
Bacteria have invaded → pneumonia occurs
Bacteria will invade → pneumonia occurs
Backward and non-objective relationships
To obtain the good results efficiently
(3) Cooperation with domain researchers
An interactive technique will spare users the frustration of a black box and help steer the analysis in the right direction.
A hypothetico-deductive method will be easily accepted by physicians.
Causal relation
Misjudgment of temporal meaning
It rains → the road is wet.
It rains → the road is wet.
It will rain → the road is wet.
Backward and non-objective relationships
Data mining
Retrospective approach: the data are not arranged and contain much noise.
Data: a more genuine and adequate data set must be prepared. Terms, definitions and background must be introduced beforehand.
Rules: complicated rules (relations among more than 3 items) found by this analysis can be neither explained nor proved true from a medical standpoint.
There are no clinical trials of three drugs combined.