AIME'05 1
Learning rules from multisource data
for cardiac monitoring
Elisa Fromont*, René Quiniou, Marie-Odile Cordier
DREAM, IRISA, France
*Work supported by the French National Network for Health Technologies as a member of the Cepica project
AIME'05 2
Context (ICCU) [Calicot03]: Calicot, Cardiac Arrhythmias Learning for Intelligent Classification of On-line Tracks
[Architecture diagram. Off-line: a signal database [Moody96] feeds a symbolic transformation followed by inductive learning, producing a rule base and a chronicle base. On-line: the raw ECG signal is abstracted into symbolic descriptions (P(normal), QRS(normal), QRS(abnormal) events at times t0..t4); chronicle recognition over these events detects the arrhythmia.]
AIME'05 3
Motivations
• Why learn rules?
A knowledge acquisition module can relieve experts of this time-consuming task [Morik00]
• Why use Inductive Logic Programming (ILP)?
– First-order rules are easily understandable by doctors
– Relational learning can take temporal constraints into account (chronicles)
• Why use multiple sources?
Information from a single source is not always sufficient for a precise diagnosis (noise, complementary information, etc.)
⇒ Update Calicot for multisource management
AIME'05 4
Multisource data
2 ECG channels, 1 hemodynamical channel: 3 views of the same phenomenon
– Sensor 1, ECG channel II: (P, QRS)
– Sensor 2, ECG channel V: (QRS)
– Sensor 3, ABP channel
AIME'05 5
Monosource learning with ILP
• From
– A set of examples E defined on LE, labeled by a class c ∈ C
– For each class c, E+ = {(ek, c) | k = 1..m} are the positive examples
and E- = {(ek, c') | k = 1..m, c ≠ c'} are the negative examples
– A bias B that defines the language LH of the hypotheses looked for
– Background knowledge BK defined on L = LH ∪ LE
• Find, for each class, a set of hypotheses H such that:
1. H ∧ BK ⊨ E+ (H covers all the positive examples)
2. H ∧ BK ⊭ E- (H covers no negative example)*
* in practice this property is relaxed
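The covering criterion above can be sketched in a few lines. This is a minimal propositional simplification, not the Calicot/ICL implementation: examples and hypotheses are represented as sets of ground facts, whereas a real ILP system works with first-order clauses. All names and toy facts below are illustrative.

```python
# Propositional sketch of the ILP covering criterion on the slide.
def covers(hypothesis, example, background):
    """H together with BK covers an example if every literal of H
    holds among the example's facts plus the background knowledge."""
    return hypothesis <= (example | background)

def consistent(hypothesis, positives, negatives, background):
    """Criterion from the slide: cover all E+ and no E- (the second
    condition is relaxed in practice to tolerate a few negatives)."""
    return (all(covers(hypothesis, e, background) for e in positives)
            and not any(covers(hypothesis, e, background) for e in negatives))

# Toy data: bigeminy examples share an abnormal QRS followed by a short RR.
BK = {"suc(r1,r0)"}
E_pos = [{"qrs(r0,abnormal)", "rr1(r0,r1,short)"},
         {"qrs(r0,abnormal)", "rr1(r0,r1,short)", "p_wave(p0,normal)"}]
E_neg = [{"qrs(r0,normal)", "rr1(r0,r1,long)"}]

H = {"qrs(r0,abnormal)", "rr1(r0,r1,short)"}
print(consistent(H, E_pos, E_neg, BK))  # -> True
```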
AIME'05 6
Declarative bias [Bias96]
• A grammar that defines:
– the language (the vocabulary to use)
– the length of the hypotheses looked for
– the order in which to consider literals
• Mandatory for ILP systems such as ICL [ICL95]
AIME'05 7
Example of learned monosource rule
rule(bigeminy) :-
    qrs(R0, anormal),
    p_wave(P1, normal), suc(P1, R0),
    qrs(R1, normal), suc(R1, P1),
    qrs(R2, anormal), suc(R2, R1),
    rr1(R1, R2, short).
[ECG plot with the waves R0, P1, R1, R2 labeled, followed by the induction scheme: examples e1,1 ... en,1 of class bigeminy and of other classes X..Z (with X..Z ≠ bigeminy), together with the bias B and the background knowledge BK, are given to the induction step.]
AIME'05 8
Multisource learning: 2 approaches (example on two sources for one class)
Consistency: ∀i, j  (ek,i, c) ∧ (ek,j, c') ⇒ c = c'
[Diagram. Top, monosource learning on each source: induction with bias B1 and background knowledge BK1 on the source-1 examples e1,1 ... e1,n yields H1; induction with B2 and BK2 on the source-2 examples e2,1 ... e2,n yields H2; the combined classifier H could be a vote between H1 and H2. Bottom, naive multisource learning: the per-source descriptions of each cardiac cycle are merged into aggregated examples, and a single induction step with bias B and background knowledge BK is run on them.]
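The two strategies in the diagram can be sketched as follows. This is an illustrative simplification, not the system described in the slides: `vote` and `aggregate` are hypothetical helper names, and examples are plain sets of ground facts.

```python
# Sketch of the two multisource strategies: (a) combine per-source
# classifiers by vote, (b) merge each cycle's per-source descriptions
# into one aggregated example before learning.
from collections import Counter

def vote(predictions):
    """Majority vote between per-source predictions (ties broken arbitrarily)."""
    return Counter(predictions).most_common(1)[0][0]

def aggregate(example_by_source):
    """Naive approach: merge the descriptions e_k,1 ... e_k,s of the same
    cardiac cycle k, observed by s sources, into one larger example."""
    merged = set()
    for facts in example_by_source:
        merged |= facts
    return merged

# Source 1 (ECG) and source 2 (ABP) views of the same cycle:
e_ecg = {"qrs(r0,abnormal)", "rr1(r0,r1,short)"}
e_abp = {"systole(s0,normal)", "ss1(s0,s1,short)"}

print(vote(["bigeminy", "bigeminy", "normal"]))  # -> bigeminy
print(len(aggregate([e_ecg, e_abp])))            # -> 4 facts in the merged example
```

Aggregation illustrates the data-volume problem discussed on the next slide: every example grows with the number of sources.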
AIME'05 9
Naive multisource learning problems
When the number of sources increases:
– the volume of data increases (aggregation of examples)
– the expressiveness of the language increases: the hypothesis search space defined by B is bigger than both search spaces defined by B1 and B2
• too much computation time
• bad results due to heavy pruning when looking for hypotheses in the search space
AIME'05 10
Idea: biased multisource learning
• Bias the multisource learning efficiently by using:
– monosource learned rules
– aggregated examples
• Such a bias is difficult to define without background knowledge on the problem
⇒ create the multisource bias automatically!
AIME'05 11
Algorithm (on two sources)
[Diagram of the resulting search space: the biased multisource language Lb, built from bias terms bt1 ... bt5, lies between the monosource languages L1 (containing H1) and L2 (containing H2) and the naive multisource language L.]
L: naive multisource language
Lb: biased multisource language
Li: language of source i
AIME'05 12
How to construct the bti? (toy example)
• H1: class(x) :-
    p_wave(P0,normal), qrs(R0,normal), pr1(P0,R0,normal), suc(R0,P0).
• H2: class(x) :-
    diastole(D0,normal), systole(S0,normal), suc(S0,D0).
…
class(x) :-
    p_wave(P0,normal), diastole(D0,normal), suci(D0,P0),
    qrs(R0,normal), systole(S0,normal), suci(S0,R0),
    pr1(P0,R0,normal), suc(R0,P0), suc(S0,D0).
class(x) :-
    p_wave(P0,normal), qrs(R0,normal), pr1(P0,R0,normal), suc(R0,P0),
    diastole(D0,normal), suci(D0,R0),
    systole(S0,normal), suc(S0,D0).
⇒ Rule fusion + new relational literals (suci) linking events across sources
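The combinatorial core of the fusion step above can be sketched as the enumeration of interleavings of the two rule bodies, each body keeping its own literal order. This is an illustrative sketch only: the actual method also inserts the inter-source `suci` literals, which is omitted here, and the function name is hypothetical.

```python
# Sketch of rule fusion: enumerate order-preserving merges of two
# monosource rule bodies (the 'suci' linking literals are omitted).
from itertools import combinations

def interleavings(body1, body2):
    """All merges of two literal sequences that keep each sequence's order."""
    n, m = len(body1), len(body2)
    result = []
    for positions in combinations(range(n + m), n):
        merged, i1, i2 = [], iter(body1), iter(body2)
        pos = set(positions)
        for k in range(n + m):
            merged.append(next(i1) if k in pos else next(i2))
        result.append(merged)
    return result

h1 = ["p_wave(P0,normal)", "qrs(R0,normal)", "pr1(P0,R0,normal)"]
h2 = ["diastole(D0,normal)", "systole(S0,normal)"]
merges = interleavings(h1, h2)
print(len(merges))  # -> 10 candidate fused bodies (C(5,3))
```

The count grows combinatorially with body length, which is why restricting fusion to the already-learned monosource rules, rather than fusing arbitrary clauses of L, keeps the biased search space small.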
AIME'05 13
Properties of the biased multisource search space
1. Rules learned with the biased multisource method have an accuracy equal to or higher than that of the monosource rules learned for the same class (worst case: a vote)
2. The biased multisource search space is smaller than the naive multisource search space (cf. DLAB [DLAB97])
3. There is no guarantee of finding the best multisource solution with biased multisource learning
AIME'05 14
Examples of learned rules
class(svt) :- % ECG
    qrs(R0), qrs(R1), suc(R1,R0),
    qrs(R2), suc(R2,R1), rr1(R1,R2,short),
    rythm(R,R1,R2,regular),
    qrs(R3), suc(R3,R2), rr1(R2,R3,short),
    qrs(R4), suc(R4,R3), rr1(R3,R4,short).
(covers 2 neg)

class(svt) :- % ABP
    systole(S0), systole(S1), suc(S1,S0),
    amp_ss(S0,S1,normal),
    systole(S2), suc(S2,S1),
    amp_ss(S1,S2,normal), ss1(S1,S2,short).
(covers 1 neg, does not cover 1 pos)

class(svt) :- % biased multisource
    qrs(R0), qrs(R1), suc(R1,R0),
    qrs(R2), suc(R2,R1), rr1(R1,R2,short),
    rythm(R,R1,R2,regular),
    qrs(R3), suc(R3,R2), rr1(R2,R3,short),
    systole(S0), suci(S0,R3),
    qrs(R4), suci(R4,S0), suc(R4,R3),
    systole(S1), suc(S1,S0), suci(S1,R4),
    amp_ss(S0,S1,normal).

class(svt) :- % naive multisource
    qrs(R0), systole(S0), suc(S0,R0),
    qrs(R1), suc(R1,S0),
    systole(S1), suc(S1,R1),
    suc(R1,R0), rr1(R1,R2,short).
(covers 12 neg)
AIME'05 15
Results on the whole database
(classes: bigeminy, arrhyt1)

                 Mono source             Multi source
                 Source 1    Source 2    Naive       Biased
                 (ECG)       (ABP)
ACC              1           0.998       0.916       1
Test ACC         1           0.84        0.7         1
Nb rules         1           2           2           1
Nb nodes         1063        1023        22735       657
CPU time         26.99       14.27       3100        363.86*

Cardiac cycles: 54/23/25
* includes monosource computation times

⇒ Biased multisource is much more efficient than naive multisource
⇒ No significant improvement from monosource to biased multisource
Database:
• small (50)
• not noisy
• sources are redundant for the studied arrhythmias
AIME'05 16
Less informative database
(new results without multisource cross-validation problems and a new constraint on ABP monosource learning)

Classes: ves, arrhyt1
                 Mono source             Multi source
                 Source 1    Source 2    Naive       Biased
                 (ECG)       (ABP)
ACC              0.44        0.94        0.945       0.98
Test ACC         0.4         0.86        0.64        0.9
Rules (H)        1           1           3           2
Cardiac cycles: 8/54/4/654

Classes: svt, arrhyt2
                 Mono source             Multi source
                 Source 1    Source 2    Naive       Biased
                 (ECG)       (ABP)
ACC              0.96        0.962       0.76        0.99
Test ACC         0.94        0.84        0.76        0.92
Rules (H)        1           1           1           1
Cardiac cycles: 5235
AIME'05 17
Conclusion
• Biased multisource vs monosource: better or equal accuracy; less complex rules (fewer rules or fewer literals)
• Biased multisource vs naive method: better accuracy; narrower search space; reduced computation time
• Multisource learning can improve the reliability of diagnosis (particularly on complementary data)
• The biased method allows scalability
AIME'05 18
References
[Calicot03] Temporal abstraction and inductive logic programming for arrhythmia recognition from ECG. G. Carrault, M.-O. Cordier, R. Quiniou, F. Wang. Artificial Intelligence in Medicine, 2003.
[Moody96] A database to support development and evaluation of intensive care monitoring. G.B. Moody et al. Computers in Cardiology, 1996.
[ICL95] Inductive Constraint Logic. L. De Raedt and W. Van Laer. Inductive Logic Programming, 1995.
[Bias96] Declarative bias in ILP. C. Nedellec et al. Advances in ILP, 1996.
[DLAB97] Clausal discovery. L. De Raedt and L. Dehaspe. Machine Learning, 1997.
[Morik00] Knowledge discovery and knowledge validation in intensive care. K. Morik et al. Artificial Intelligence in Medicine, 2000.
AIME'05 19
Property on aggregated examples
Let Hi,c be a hypothesis induced by learning from source i, with i ∈ [1,s] and class c ∈ C.
• For all k ∈ [1,p], if Hi,c covers (ei,k, c) then it also covers the aggregated example (ek, c)
• For all k ∈ [1,n], for all c' ∈ C−{c}, if Hi,c does not cover (ei,k, c') and if, for all j ≠ i, Li ∩ Lj = ∅, then Hi,c does not cover the aggregated negative example (ek, c')
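The first property can be checked concretely under the same propositional simplification as earlier sketches: since aggregation only adds facts from other sources, a hypothesis over source i's vocabulary that covers the source-i view of a cycle also covers the merged example. The data below is illustrative.

```python
# Tiny check of the positive-coverage property on aggregated examples.
def covers(hypothesis, example):
    """Propositional simplification: H covers e if all its literals hold in e."""
    return hypothesis <= example

e_ecg = {"qrs(r0,abnormal)", "rr1(r0,r1,short)"}   # source-1 view of cycle k
e_abp = {"systole(s0,normal)"}                     # source-2 view of cycle k
aggregated = e_ecg | e_abp                         # merged example (e_k, c)

h1 = {"qrs(r0,abnormal)"}                          # hypothesis learned from source 1
print(covers(h1, e_ecg), covers(h1, aggregated))   # -> True True
```

The negative-coverage direction needs the extra disjointness condition Li ∩ Lj = ∅: if another source could supply literals from source i's vocabulary, aggregation might make a previously uncovered negative example covered.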