Business Process Deviance Mining


  • 8/19/2019 Business Process Deviance Mining


    Mining Business Process Deviance:

    A Quest for Accuracy

    Hoang Nguyen1, Marlon Dumas2, Marcello La Rosa1,3, Fabrizio Maria Maggi2, and Suriadi Suriadi1

    1 Queensland University of Technology, Australia {huanghuy.nguyen@student.,m.larosa@,s.suriadi@}qut.edu.au

    2 University of Tartu, Estonia {marlon.dumas,f.m.maggi}@ut.ee

    3 NICTA Queensland Lab, Australia

    Abstract. This paper evaluates the suitability of sequence classification techniques for analyzing deviant business process executions based on event logs. Deviant process executions are those that deviate in a negative or positive way with respect to normative or desirable outcomes, such as executions that undershoot or exceed performance targets. We evaluate a range of features and classification methods based on their ability to accurately discriminate between normal and deviant executions. We also analyze the ability of the discovered rules to explain potential causes of observed deviances. The evaluation shows that feature types extracted using pattern mining techniques only slightly outperform those based on individual activity frequency. It also suggests that more complex feature types ought to be explored to achieve higher levels of accuracy.

    1 Introduction

    Process mining is a family of techniques to extract knowledge of business processes from event logs [13]. It encompasses, among others, techniques for automated discovery of process models from logs, techniques for checking conformance between a given process model and an event log, as well as techniques for analyzing and predicting performance of business processes based on event logs.

    This paper deals with business process deviance mining, a family of techniques aimed at analyzing event logs in order to explain the reasons why a business process deviates from its normal or expected execution. Such deviations may be of a negative or of a positive nature. Positive deviance corresponds to executions that lead to high process performance, such as achieving positive outcomes with low execution times, low resource usage or low costs. Negative deviance refers to the executions of the process with low process performance or with negative outcomes (e.g. customer complaints) or compliance violations.

    The input of business process deviance mining is a set of labelled traces. Each trace represents one execution (case) of the process under analysis. Each trace is a sequence of events, wherein each event records the execution of an activity. The label associated with a trace indicates whether it is normal or deviant. Given this input, the problem of deviance mining is to compute a function (called a classifier) that takes as input a trace and outputs its class (normal or deviant). Such a function must produce accurate labels, i.e. it should guess the correct class of a trace both for traces in the input (training) set and for other unseen traces. In addition, as the purpose of deviance mining is explanatory, the function must be captured in terms of patterns or rules interpretable by an analyst.


    A family of techniques applicable to deviance mining is sequence classification [14], where the goal is to build classifiers that discriminate between two or more classes of sequences. One key step in sequence classification is to extract features from sequences that can be given as input to standard classification techniques such as decision trees. Such features are typically extracted using sequence mining techniques. Various such techniques have been independently tested in the context of deviance mining as discussed later. However, no comparative study has been conducted to assess their relative merits in this setting.

    This paper presents a comparative evaluation of sequence mining techniques for business process deviance mining. The paper compares two families of techniques: frequent pattern mining and discriminative mining. Techniques are benchmarked using a battery of event logs covering situations where deviance is frequent (balanced datasets) and others where deviance is rare (unbalanced).

    The paper is structured as follows. Section 2 discusses existing deviance mining methods. Section 3 outlines the methods for feature extraction and classification evaluated in this study. Next, Section 4 presents the comparative evaluation. Section 5 summarizes the contribution and discusses directions for future work.

    2 Related Work

    Business process deviance mining has been the subject of many case studies where a variety of techniques have been applied. For example, in [11] we report on a case study in an insurance company aimed at explaining why some simple claims took too long to be resolved. We applied a technique called delta-analysis, which consists in using automated process discovery to extract two process models: one for “normal” cases and one for “deviant” cases, and manually comparing these two models. Delta analysis has also been applied to explain deviance in healthcare processes [8]. In this paper, we do not consider delta analysis because of its manual nature. Instead we focus on automated techniques.

    In [10], the authors analyzed a log of a software defect handling process to discriminate between defect reports leading to correct resolution (normal) vs. those leading to complaints (deviant). They applied a discriminative pattern mining algorithm to identify patterns of the form “activity B occurs N times after activity A has occurred M times”, that are frequent in deviant cases but not in normal cases or vice-versa. A decision tree is built based on these features.

    In [2], the authors sought to discriminate between traces leading to malfunctioning (versus normal functioning) of X-ray machines. Unlike [10] where discriminative sequence mining was employed, [2] employs frequent pattern mining to extract so-called “tandem repeats”, “maximal repeats” and “alphabet repeats” [3]. A tandem repeat is a sequence of events that is repeated; a maximal repeat in a log is a sequence of events that is repeated and not included in a longer repeated sequence; a repeat alphabet is any non-empty intersection of the set of events contained in different tandem/maximal repeats. In control-flow terms, tandem repeats correspond to loops, maximal repeats to subprocesses, and alphabet repeats to parallelism. In [2], tandem repeats, maximal repeats and alphabet repeats are extracted for all traces combined (normal and deviant). The patterns with highest support are then used to build a decision tree.
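
    To make the notion of a tandem repeat concrete, the sketch below scans a single trace for blocks of events that are immediately followed by a copy of themselves. It is a simplified illustration, not the ProM implementation used in [2,3]: the function name, the max_len bound and the example activity names are assumptions made here.

```python
from typing import List, Set, Tuple


def tandem_repeats(trace: List[str], max_len: int = 5) -> Set[Tuple[str, ...]]:
    """Return every block of events that is directly followed by a copy of itself.

    Simplified reading of "a sequence of events that is repeated";
    max_len (hypothetical parameter) bounds the block length that is tried.
    """
    found: Set[Tuple[str, ...]] = set()
    n = len(trace)
    for length in range(1, max_len + 1):
        # A block of this length needs room for its repetition right after it.
        for start in range(0, n - 2 * length + 1):
            block = tuple(trace[start:start + length])
            if block == tuple(trace[start + length:start + 2 * length]):
                found.add(block)
    return found


# Hypothetical trace containing a "check, fix" loop.
example_trace = ["register", "check", "fix", "check", "fix", "close"]
print(tandem_repeats(example_trace))
```

    In control-flow terms, the block found for the example trace corresponds to a loop over the two middle activities.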

    Similarly, in [4] the authors applied frequent pattern mining to discriminate between cases with positive clinical outcomes (vs. negative ones) in a process for congestive heart failure treatment. The authors extracted frequent patterns


    of the form “B occurs after A” from positive cases and from negative cases separately. The extracted patterns were used together with manual delta-analysis to extract pathways characteristic of either positive or negative cases.

    [12] presents a case study where analysts in a company sought to identify causes for non-compliant cases in a procurement process. The authors applied association rule mining to extract frequent patterns for normal and for deviant cases separately. The patterns were used to derive rules characterizing deviance.
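
    Both [4] and [12] mine frequent patterns from each class of cases separately. A minimal sketch of this idea for patterns of the form “B occurs after A” is given below. The support definition (fraction of traces containing the pair), the min_support threshold and the toy traces are assumptions made for illustration; they are not taken from [4] or [12].

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]


def follows_pairs(trace: List[str]) -> Set[Pair]:
    """All pairs (a, b) such that b occurs somewhere after a in the trace."""
    pairs: Set[Pair] = set()
    seen: Set[str] = set()
    for event in trace:
        for earlier in seen:
            pairs.add((earlier, event))
        seen.add(event)
    return pairs


def frequent_pairs(traces: List[List[str]], min_support: float) -> Dict[Pair, float]:
    """Pairs whose support (share of traces containing them) reaches min_support."""
    counts: Counter = Counter()
    for trace in traces:
        counts.update(follows_pairs(trace))  # each pair counted once per trace
    return {
        pair: count / len(traces)
        for pair, count in counts.items()
        if count / len(traces) >= min_support
    }


# Hypothetical logs, mined separately per class.
positive_cases = [["a", "b", "c"], ["a", "c"]]
negative_cases = [["a", "b"], ["b", "a"]]
print(frequent_pairs(positive_cases, min_support=0.6))
print(frequent_pairs(negative_cases, min_support=0.6))
```

    Patterns that are frequent in one class but not in the other are then candidates for characterizing that class.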

    In the above studies, the input consists of sequences of activity occurrences without payload. Sometimes, events in the log carry a payload such as attributes representing resources, or attributes provided as input to the process (e.g. customer type, age) or produced in the process. Preliminary studies have attempted to use such data for deviance mining [6,9]. In this paper, we focus on logs consisting of sequences of events without payload. This means we make minimal assumptions on the log and study how much accuracy can be achieved in this setting.

    3 Model construction

    All automated deviance mining techniques reviewed above involve two phases: (i) a pattern extraction phase where patterns are extracted from traces seen as sequences of symbols representing activity occurrences; and (ii) a classification phase, where each trace is abstracted as a vector of features, and these vectors are used to produce a classifier. The reviewed techniques differ on the employed pattern extraction technique and/or the employed classification method. From the feature extraction perspective, we evaluate the following methods:

    1. Occurrence count of individual activities in a trace [11]. In this feature extraction method, each activity type (e.g. “Issue Invoice”, “Invoice Paid”) becomes a numerical feature. For a given trace t, the value of an activity’s feature for t is the number of times the activity occurs in t.

    2. Tandem repeats (TR), alphabet tandem repeats (ATR), maximal repeats (MR) and alphabet maximal repeats (AMR) [2]. Here, each pattern (tandem repeat, maximal repeat, etc.) is a boolean feature. For a given trace, the feature corresponding to a repeat pattern is true iff the pattern occurs in the trace. We select for evaluation the repeat patterns defined in [3] as they capture common control-flow relations (loops, parallelism, subprocesses) and have been shown to be suitable for deviance mining [2]. Given the potentially large number of TR/MR patterns that can be extracted from a log, we select the N patterns with the highest support, where N is a parameter of the method.

    3. Discriminative (iterative) patterns (DP) [5]. Here, the features are iterative patterns (sequences of consecutive events) that appear many times within a trace but also across traces. The N iterative patterns with the highest discriminative power (measured via the Fisher score) become boolean features. Among existing discriminative sequence mining techniques, we select the one in [5] because it falls in the same category as the one in [10] and it is comparable to the technique of [2], as tandem repeats are iterative patterns that occur frequently within and across traces.
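
    As an illustration of how discriminative power can be scored, the sketch below computes one common formulation of the Fisher score for a single feature: the class-size-weighted squared distance of each class mean from the overall mean, divided by the class-size-weighted within-class variance. Whether [5] uses exactly this variant is not stated in this paper, so the formula and the function name should be read as assumptions.

```python
from typing import Dict, List


def fisher_score(values: List[float], labels: List[str]) -> float:
    """Fisher score of one feature: between-class scatter / within-class scatter.

    Assumed formulation: sum_c n_c * (mean_c - mean)^2 / sum_c n_c * var_c.
    """
    overall_mean = sum(values) / len(values)
    by_class: Dict[str, List[float]] = {}
    for value, label in zip(values, labels):
        by_class.setdefault(label, []).append(value)

    between = 0.0
    within = 0.0
    for class_values in by_class.values():
        n = len(class_values)
        mean = sum(class_values) / n
        variance = sum((v - mean) ** 2 for v in class_values) / n
        between += n * (mean - overall_mean) ** 2
        within += n * variance

    if within == 0.0:
        # Feature is constant inside every class: perfectly separating or useless.
        return float("inf") if between > 0.0 else 0.0
    return between / within


# Hypothetical boolean pattern feature (1 = pattern occurs) for six traces.
feature = [1, 1, 0, 0, 0, 1]
classes = ["deviant", "deviant", "deviant", "normal", "normal", "normal"]
print(fisher_score(feature, classes))
```

    Ranking all candidate patterns by such a score and keeping the top N yields the boolean features described in item 3.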

    The first method is a baseline. Our aim is to evaluate the added value of other feature extraction methods (frequent and discriminative patterns) over individual activities. Thus, we study 6 feature sets: (i) individual activities (IA); (ii) IA + TR; (iii) IA + ATR; (iv) IA + MR; (v) IA + AMR; and (vi) IA + DP.
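
    The combined feature sets can be pictured as follows: each trace is mapped to a vector whose first entries are activity occurrence counts (IA) and whose remaining entries are 0/1 flags, one per selected pattern. The sketch assumes that a pattern “occurs” when its events appear consecutively in the trace; the helper names and the example activities are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Pattern = Tuple[str, ...]


def occurs(trace: Sequence[str], pattern: Pattern) -> bool:
    """True if the pattern appears in the trace as a run of consecutive events."""
    k = len(pattern)
    return any(tuple(trace[i:i + k]) == pattern for i in range(len(trace) - k + 1))


def feature_vector(trace: Sequence[str],
                   activities: List[str],
                   patterns: List[Pattern]) -> List[int]:
    """IA counts followed by one boolean (0/1) feature per pattern."""
    counts = Counter(trace)
    ia_part = [counts[activity] for activity in activities]
    pattern_part = [int(occurs(trace, pattern)) for pattern in patterns]
    return ia_part + pattern_part


# Hypothetical activity alphabet, selected patterns and trace.
activities = ["New Claim", "Request Info", "Close File"]
patterns = [("Request Info", "Request Info"), ("New Claim", "Close File")]
trace = ["New Claim", "Request Info", "Request Info", "Close File"]
print(feature_vector(trace, activities, patterns))
```

    Passing an empty pattern list gives the IA baseline on its own.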

  • 8/19/2019 Business Process Deviance Mining

    4/8

    Dataset    Normal  Deviant  Total  Avg. length  Avg. length  Event classes  Event classes  Balanced?  Deviance
               cases   cases    cases  (norm.)      (dev.)       (norm.)        (dev.)                    criterion
    Hospital   448     363      811    16           20           25             23             Yes        temp.
    Insurance  1,921   3,195    5,116  13           24           13             12             No         temp.
    BPIdCC     917     225      1,142  109          85           100            100            No         non-temp.
    BPIdM13    832     310      1,142  144          74           99             99             No         non-temp.
    BPIdM16    926     216      1,142  127          113          99             94             No         non-temp.
    BPIt101    683     459      1,142  195          25           107            103            Yes        non-temp.

    Table 1: Descriptive statistics for the six datasets, including whether the distribution of normal (norm.) and deviant (dev.) cases is balanced, and whether the deviance criterion used is temporal (temp.) or non-temporal (non-temp.)

    Given a set of features and a labelled log (i.e. a set of traces labelled as normal/deviant), the reviewed techniques extract a labelled sample $(f_1, \ldots, f_n, l)$ for each trace $t$, where $f_i$ is the value of the $i$-th feature for $t$ and $l$ is the label (normal/deviant). A range of methods can be used to construct classifiers from labelled samples. Since we seek interpretable classifiers, a natural choice is decision trees as in [2,10,11]. We specifically use the C4.5 method in RapidMiner. We also include k-NN (k-Nearest Neighbors) in the evaluation. In this method, a sample is assigned the most common label among the sample’s k nearest neighbors in the training set. We set k = 8 after trial-and-error to find a value yielding highest accuracy. Although no rules can be extracted from a k-NN classifier, its output is explainable since given a trace t, one can show which similar traces have been used to classify t. Finally, we included neural networks in the evaluation as representative of a method that can adjust itself to the data and handle large feature sets [15] although it does not produce interpretable rules.
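
    The k-NN step can be sketched in a few lines. The version below is an illustration only (the study used RapidMiner); it assumes Euclidean distance between feature vectors, which the paper does not specify, and it returns the neighbours alongside the predicted label to reflect the point that a k-NN decision can be explained by showing the similar traces.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, label)


def knn_classify(train: List[Sample], query: Sequence[float], k: int = 8):
    """Return (majority label among the k nearest samples, those samples)."""
    # Euclidean distance is an assumption; other metrics could be substituted.
    neighbours = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0], neighbours


# Hypothetical two-feature training samples.
training = [
    ([1, 0], "normal"), ([1, 1], "normal"), ([2, 0], "normal"),
    ([1, 4], "deviant"), ([2, 5], "deviant"), ([1, 6], "deviant"), ([3, 5], "deviant"),
]
label, used = knn_classify(training, [2, 4], k=3)
print(label, used)
```

    The default k = 8 mirrors the value chosen in the study; the example call uses k = 3 only because the toy training set is small.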

    4 Evaluation

    4.1 Datasets

    We selected six real-life datasets in order to cover a range of process deviance types. Specifically, these datasets contain cases labelled as “normal” and “deviant” based either on temporal or non-temporal criteria. The former differentiates between normal and deviant cases based on process duration w.r.t. a threshold (e.g. slow cases if above 180 min); the latter criterion differentiates cases based on data attributes (e.g. patient suffered from a given cancer or not). A further dimension for the selection of the datasets was the distribution of deviant cases vs. normal cases (balanced or unbalanced).
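
    A temporal deviance criterion amounts to comparing case duration with a threshold. The sketch below labels a case from its first and last timestamps; the function name and the example timestamps are hypothetical, and a case lasting exactly the threshold is treated as normal, which is an assumption.

```python
from datetime import datetime, timedelta


def label_by_duration(start: datetime, end: datetime,
                      threshold: timedelta = timedelta(minutes=180)) -> str:
    """Label a case 'deviant' if it ran longer than the threshold, else 'normal'."""
    return "deviant" if end - start > threshold else "normal"


# Hypothetical case timestamps.
arrival = datetime(2012, 3, 1, 9, 0)
print(label_by_duration(arrival, arrival + timedelta(minutes=150)))
print(label_by_duration(arrival, arrival + timedelta(minutes=240)))
```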

    The first dataset (called Hospital in Table 1) records the flow of chest pain patients in an emergency department of an Australian hospital. Each case is labelled as “quick” if completed within 180 min and “slow” if completed over 180 min. Thus, we used a temporal criterion to classify this log. The second log (called Insurance) comes from a large Australian insurance company and records an extract of the instances of a commercial insurance claims handling process executed in 2012 [11]. The temporal criterion is 30 days. The remaining four datasets were extracted from the BPI challenge 2011 log [1]. This log records the executions of a process related to the treatment of patients diagnosed with cancer in a Dutch hospital. The log contains domain specific case attributes, e.g. Age, Diagnosis, Diagnosis code, and Treatment code. We extracted four logs according to four deviance criteria: (i) Deviant cases if Diagnosis is “cervix cancer” (BPIdCC dataset in Table 1); (ii) Deviant cases if Diagnosis code is “M13”


    (BPIdM13); (iii) Deviant cases if Diagnosis code is “M16” (BPIdM16); (iv) Deviant cases if Treatment code is “101” (BPIt101).

    4.2 Classification accuracy

    We measured classification accuracy using the standard notion of accuracy, defined as $\frac{tp + tn}{tp + tn + fp + fn}$, where $tp$ and $tn$ are the numbers of traces correctly classified as deviant and normal (true positives and true negatives), $fp$ is the number of false positives and $fn$ is the number of false negatives. We also report on the Area Under the ROC Curve (AUC) of each classifier. This corresponds to the probability that a random positive (deviant) sample is ranked higher than a random negative (normal) sample in the list of samples ranked from most to least likely to belong to the deviant class.
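
    Both measures can be computed directly from their definitions. The sketch below takes accuracy from the four confusion-matrix counts and estimates AUC by pairwise comparison: the fraction of (deviant, normal) pairs in which the deviant sample receives the higher score, with ties counted as one half. The function names and the example scores are made up for illustration.

```python
from typing import List


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """(tp + tn) / (tp + tn + fp + fn)."""
    return (tp + tn) / (tp + tn + fp + fn)


def auc(deviant_scores: List[float], normal_scores: List[float]) -> float:
    """Probability that a random deviant sample outranks a random normal one.

    Scores express how likely a sample is to be deviant; ties count as 0.5.
    """
    wins = 0.0
    for d in deviant_scores:
        for n in normal_scores:
            if d > n:
                wins += 1.0
            elif d == n:
                wins += 0.5
    return wins / (len(deviant_scores) * len(normal_scores))


print(accuracy(tp=40, tn=45, fp=5, fn=10))
print(auc([0.9, 0.8, 0.4], [0.7, 0.3]))
```

    The pairwise form is quadratic in the number of samples, which is acceptable for a sketch; rank-based formulas give the same value more efficiently.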

    In order to test all combinations of feature types and classification methods for all six datasets under examination, we created a scientific workflow in RapidMiner v6.0. The vector spaces containing the extracted features were stored in a MS Access database. Tandem/maximal repeats and their alphabet variants were extracted using the ProM plugin described in [2], while discriminative patterns were extracted using the tool implementation in [5].¹

    Tables 2–7 show the results of the measurements of the classification accuracy for the six datasets using the feature types and classification methods discussed in Section 3. The tables report mean accuracy and AUC obtained for each classification method based on five-fold cross-validation, meaning that each dataset is split five times into 80% of the dataset for training and 20% for testing and accuracy/AUC is calculated for each such “fold” and aggregated across all five folds. Next to accuracy we also show the interval of accuracy/AUC values across all folds in the form of a ±δ bracket from the mean (standard deviation).
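
    The evaluation protocol can be sketched as follows: shuffle the samples, split them into five disjoint folds, train on four folds and test on the fifth, then report the mean and standard deviation of the per-fold accuracy. This is an illustrative re-creation rather than the RapidMiner workflow; the fit/predict callables, the fixed seed and the use of the sample standard deviation are assumptions.

```python
import random
import statistics
from typing import Callable, List, Sequence, Tuple


def fold_indices(n: int, folds: int = 5, seed: int = 0) -> List[List[int]]:
    """Shuffle 0..n-1 and deal the indices into `folds` disjoint test sets."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::folds] for i in range(folds)]


def cross_validate(samples: Sequence, labels: Sequence[str],
                   fit: Callable, predict: Callable,
                   folds: int = 5, seed: int = 0) -> Tuple[float, float]:
    """Mean and standard deviation of accuracy over the folds (needs n >= folds)."""
    accuracies = []
    for test_idx in fold_indices(len(samples), folds, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(samples)) if i not in held_out]
        model = fit([samples[i] for i in train_idx], [labels[i] for i in train_idx])
        correct = sum(predict(model, samples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return statistics.mean(accuracies), statistics.stdev(accuracies)


# Hypothetical baseline: always predict the most common training label.
def fit_majority(xs, ys):
    return max(set(ys), key=list(ys).count)


def predict_majority(model, x):
    return model


toy_samples = [[i] for i in range(10)]
toy_labels = ["normal"] * 10
mean_acc, std_acc = cross_validate(toy_samples, toy_labels, fit_majority, predict_majority)
print(f"{100 * mean_acc:.2f}±{100 * std_acc:.2f}")
```

    The printed format mirrors the mean±δ entries reported in Tables 2–7.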

    Feature type  Decision Tree             k-NN                      Neural Net
                  AC(%)        AUC          AC(%)        AUC          AC(%)        AUC
    IA            66.09±1.53   0.683±0.008  70.66±3.25   0.761±0.038  69.55±2.78   0.751±0.039
    IA+TR         65.35±2.45   0.639±0.034  69.67±3.16   0.769±0.031  66.22±3.74   0.714±0.049
    IA+ATR        68.19±3.69   0.689±0.043  70.04±2.99   0.766±0.029  67.69±1.20   0.741±0.020
    IA+MR         64.37±3.38   0.627±0.033  70.16±1.74   0.764±0.019  66.59±3.65   0.735±0.036
    IA+AMR        65.72±1.66   0.645±0.027  69.54±3.23   0.766±0.036  66.95±2.93   0.736±0.036
    IA+DP         66.70±2.73   0.645±0.038  70.53±2.66   0.762±0.026  65.60±2.65   0.712±0.045

    Table 2: Classification Results for Hospital dataset (temporal – balanced)

    Feature type  Decision Tree             k-NN                      Neural Net
                  AC(%)        AUC          AC(%)        AUC          AC(%)        AUC
    IA            83.17±1.77   0.857±0.017  83.87±1.00   0.908±0.011  86.49±1.18   0.937±0.006
    IA+TR         83.33±0.71   0.838±0.007  83.64±0.82   0.912±0.009  83.13±1.84   0.894±0.016
    IA+ATR        82.60±1.49   0.832±0.027  83.80±1.01   0.912±0.010  83.97±2.62   0.895±0.028
    IA+MR         82.62±0.84   0.825±0.021  83.91±1.30   0.912±0.010  84.46±0.98   0.903±0.011
    IA+AMR        82.66±0.92   0.813±0.023  84.28±0.92   0.912±0.012  83.11±0.89   0.900±0.010
    IA+DP         83.50±1.16   0.840±0.017  85.13±0.72   0.918±0.010  83.48±1.21   0.916±0.003

    Table 3: Classification Results for Insurance dataset (temporal – unbalanced)

    From the results, we can draw the following observations. First, when the deviance criterion is temporal, i.e. based on the duration of the process (Hospital and Insurance datasets), IA alone tends to achieve the highest accuracy levels, though the difference w.r.t. other feature types is minimal. For example, in the Hospital dataset, accuracy is ∼70% with k-NN across all feature types; in the Insurance dataset, Neural Networks achieve the highest value (86.5%) via IA,

    ¹ The BPI log extracts, the scientific workflow and the results of the tests can be downloaded from http://tinyurl.com/kvqtepy.


    Feature type  Decision Tree             k-NN                      Neural Net
                  AC(%)        AUC          AC(%)        AUC          AC(%)        AUC
    IA            78.81±2.21   0.752±0.026  79.95±1.15   0.751±0.040  78.37±3.22   0.771±0.057
    IA+TR         76.97±4.14   0.736±0.061  81.17±2.71   0.761±0.039  72.78±12.82  0.724±0.030
    IA+ATR        78.19±2.54   0.738±0.061  80.56±1.90   0.760±0.025  79.16±1.22   0.684±0.035
    IA+MR         76.35±1.93   0.682±0.050  80.38±2.10   0.773±0.020  75.32±6.93   0.659±0.112
    IA+AMR        75.66±1.78   0.687±0.038  80.21±2.00   0.771±0.022  79.25±1.34   0.675±0.076
    IA+DP         78.98±2.15   0.744±0.033  80.65±1.12   0.771±0.019  78.98±1.20   0.681±0.061

    Table 4: Classification Results for BPIdCC dataset (non-temporal – unbalanced)

    Feature type  Decision Tree             k-NN                      Neural Net
                  AC(%)        AUC          AC(%)        AUC          AC(%)        AUC
    IA            71.63±2.01   0.721±0.033  72.59±0.92   0.700±0.011  71.98±2.49   0.751±0.026
    IA+TR         71.19±2.00   0.710±0.042  72.07±1.36   0.705±0.016  72.42±1.15   0.671±0.033
    IA+ATR        72.33±1.60   0.691±0.012  71.72±0.78   0.698±0.007  69.08±9.15   0.641±0.101
    IA+MR         71.98±2.59   0.722±0.026  71.28±0.91   0.692±0.013  72.24±2.21   0.693±0.050
    IA+AMR        72.33±2.97   0.728±0.029  71.19±1.09   0.692±0.013  71.29±5.79   0.662±0.099
    IA+DP         73.99±3.33   0.727±0.064  72.85±2.27   0.728±0.045  71.36±2.24   0.694±0.055

    Table 5: Classification Results for BPIdM13 dataset (non-temporal – unbalanced)

    with the other feature types/classifiers ranging from 82% to 85%. In the Insurance dataset, we also get the highest AUC with Neural Networks on top of IA, while in the Hospital dataset, the highest AUC is obtained by k-NN, which is essentially the same across all feature types (∼0.76). These results suggest that IA already carries most of the signal when the labeling of deviant/normal cases is based on a temporal criterion. This is attributable to the fact that process duration is directly correlated with the number of activities being performed, so the more activities are repeated, the longer a process case will take. More precisely, a repeated activity indicates a loop in the process, which is typically symptomatic of process delays. For example, in the case of the Insurance dataset this relates to the repetition, among others, of activity “Request additional information”, indicating that there is insufficient information to progress the handling of the claim (e.g. further evidence of an accident is needed).

    Feature type  Decision Tree             k-NN                      Neural Net
                  AC(%)        AUC          AC(%)        AUC          AC(%)        AUC
    IA            82.49±1.06   0.759±0.058  83.27±1.81   0.774±0.031  83.45±2.30   0.832±0.058
    IA+TR         83.19±0.91   0.763±0.028  83.19±1.17   0.771±0.022  82.75±0.74   0.799±0.030
    IA+ATR        83.10±1.35   0.749±0.069  83.10±1.22   0.767±0.025  72.87±14.48  0.759±0.049
    IA+MR         82.57±0.94   0.766±0.049  82.84±1.44   0.776±0.026  81.53±1.85   0.773±0.047
    IA+AMR        82.57±0.94   0.766±0.051  82.84±1.44   0.776±0.027  79.78±4.27   0.799±0.054
    IA+DP         84.06±0.79   0.736±0.076  84.68±0.61   0.803±0.008  82.84±2.24   0.834±0.025

    Table 6: Classification Results for BPIdM16 dataset (non-temporal – unbalanced)

    Second, when the deviance criterion is not temporal but based on a data attribute (this is the case in all BPI datasets), we observe a marginal increase of accuracy with sequence mining techniques. In particular, we obtain the highest accuracy with tandem repeats in the BPIdCC (81.2%) and BPIt101 (87.6%) datasets, and with discriminative patterns in BPIdM13 (74%) and BPIdM16 (84.7%). These results tend to be confirmed by the AUC, whose highest values are obtained by a sequence mining technique, though not necessarily the same one, e.g. in the BPIdCC dataset the highest AUC (0.77) is achieved by IA + MR whereas the highest accuracy is achieved by IA + TR. An exception is the BPIdM13 dataset, where the highest AUC is achieved by IA alone (0.75). These results suggest that in the case of a labeling not dependent on process duration, the total number of activities alone is not enough to explain why certain deviances occur. That said, once again, the increase of accuracy achieved by the sequence mining


    Feature type  Decision Tree             k-NN                      Neural Net
                  AC(%)        AUC          AC(%)        AUC          AC(%)        AUC
    IA            84.94±1.76   0.860±0.017  85.99±2.09   0.910±0.026  83.10±4.25   0.893±0.030
    IA+TR         84.94±1.66   0.850±0.009  87.57±2.37   0.913±0.026  75.57±8.64   0.728±0.262
    IA+ATR        85.38±1.47   0.861±0.013  86.69±2.73   0.911±0.027  69.61±11.63  0.766±0.233
    IA+MR         84.76±1.60   0.859±0.017  86.43±3.28   0.912±0.026  67.34±8.97   0.564±0.273
    IA+AMR        84.77±1.61   0.855±0.011  86.17±3.39   0.913±0.028  71.19±9.60   0.625±0.297
    IA+DP         84.77±2.75   0.864±0.037  87.48±2.03   0.920±0.006  62.08±10.56  0.510±0.255

    Table 7: Classification Results for BPIt101 dataset (non-temporal – balanced)

    techniques is marginal compared to IA alone, indicating that probably other feature types such as inter-arrival rate, resources and input/output data, have to be extracted to be able to better discriminate between normal and deviant cases. We also remark that none of the techniques indeed achieves an accuracy of 95% or above, with values ranging from 64.4 to 87.6%.

    Third, the three classification methods under analysis all produce stable results in terms of standard deviation (relatively low). An exception is Neural Networks, which have high standard deviation in all BPI datasets. This is probably due to the type of classification based on a specific data attribute rather than on a temporal criterion. Further, out of all feature types, discriminative patterns tend to have the most stable results (lowest standard deviation) across all datasets and feature types, with an exception being the BPIt101 dataset.

    4.3 Rules interestingness

    We analyzed the capability of the rules extracted from decision trees to explain as many deviant process executions as possible using the fewest rules. For each rule, we calculated its coverage as the ratio between the number of deviant traces that satisfy the rule and the total number of deviant traces (i.e. the recall), and its precision as the ratio between the number of traces that satisfy the rule and are classified correctly, and the total number of traces that satisfy the rule. For example, using IA+DP on the Insurance log, we obtain the following rule: if the patterns (New Claim, Authorise Payment, Close File) and (New Claim, Close File) do not occur, the trace is classified as “slow”. The coverage of this rule is 45.23% while its precision is 98.84%.
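
    Coverage and precision of a single rule can be computed as sketched below, taking precision in its usual sense (the share of traces matched by the rule that are actually deviant). The rule mirrors the example above: it fires when neither of the two patterns occurs. The toy log, the extra activity names “Request Info” and “Assess”, and the assumption that patterns must occur as consecutive events are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def occurs(trace: Sequence[str], pattern: Tuple[str, ...]) -> bool:
    """True if the pattern appears in the trace as consecutive events."""
    k = len(pattern)
    return any(tuple(trace[i:i + k]) == pattern for i in range(len(trace) - k + 1))


def rule_metrics(traces: List[List[str]], labels: List[str],
                 rule: Callable[[List[str]], bool],
                 target: str = "deviant") -> Tuple[float, float]:
    """Return (coverage, precision) of a rule that predicts the target class."""
    matched = [label for trace, label in zip(traces, labels) if rule(trace)]
    hits = sum(1 for label in matched if label == target)
    total_target = sum(1 for label in labels if label == target)
    coverage = hits / total_target if total_target else 0.0
    precision = hits / len(matched) if matched else 0.0
    return coverage, precision


def slow_rule(trace: List[str]) -> bool:
    """Fires when neither pattern from the example rule occurs in the trace."""
    return (not occurs(trace, ("New Claim", "Authorise Payment", "Close File"))
            and not occurs(trace, ("New Claim", "Close File")))


# Hypothetical labelled log ("deviant" stands for "slow").
log = [
    ["New Claim", "Request Info", "Request Info", "Authorise Payment", "Close File"],
    ["New Claim", "Authorise Payment", "Close File"],
    ["New Claim", "Close File"],
    ["New Claim", "Assess", "Close File"],
    ["New Claim", "Authorise Payment", "Close File"],
    ["New Claim", "Assess", "Authorise Payment", "Close File"],
]
log_labels = ["deviant", "normal", "normal", "deviant", "deviant", "normal"]
print(rule_metrics(log, log_labels, slow_rule))
```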

    To compare the overall interestingness of rules through the use of different feature types, we use the ratio between the average coverage of all rules derived from the use of a particular feature type, and the total number of rules. The avg coverage/#rules ratio reflects the strength and simplicity of a ruleset (the higher the more powerful, as more deviant cases can be explained with fewer rules). From the results we observe that the rulesets mined from IA+AMR and IA+DP have the highest coverage/#rules ratio. However, neither the coverage nor the #rules depend on the chosen feature type; rather, they depend on the dataset characteristics. For detailed results we refer to the technical report [7].

    5 Conclusion

    Existing methods for business process deviance mining extract patterns from event logs based on frequent or discriminative pattern mining. We have empirically observed that in processes with high variability, (discriminative) pattern mining approaches may slightly outperform those based purely on activity occurrence counts. However, in all cases, the accuracy obtained with such approaches is limited (rarely above 80%). Underlying this limitation is the fact that the


    reviewed methods treat the input as sequences of simple symbols representing activity occurrences. Oftentimes, including all six datasets in this study, business process logs consist instead of temporal complex symbolic sequences, i.e. sequences of timestamped events with payloads consisting of attribute-value pairs. A direction for future work is to develop and apply techniques for extracting (discriminative) patterns from complex symbolic sequences – a non-trivial and open problem [14]. While tackling the problem of complex symbolic sequence mining in the general case is very challenging, it may be possible to reduce the problem of business process deviance mining to well-scoped subsets of this problem, for example by taking advantage of information contained in available process models in order to prune the pattern search space.

    Acknowledgments. Work funded by the Estonian Research Council, ERDF via the Estonian Centre of Excellence Programme and by National ICT Australia.

    References

    1. 3TU Data Center. BPI Challenge 2011 Event Log, 2011. doi:10.4121/uuid:d9769f3d-0ab0-4fb8-803b-0d1120ffcf54.

    2. R.P.J.C. Bose and W.M.P. van der Aalst. Discovering signature patterns from event logs. In Proceedings of CIDM, pages 111–118. IEEE, 2013.

    3. R.P.J.C. Bose and W.M.P. van der Aalst. Trace clustering based on conserved patterns: Towards achieving better process models. In Proc. of BPM Workshops. Springer, 2010.

    4. G.T. Lakshmanan, S. Rozsnyai, and F. Wang. Investigating clinical care pathways correlated with outcomes. In Proc. of BPM, pages 323–338. Springer, 2013.

    5. D. Lo, H. Cheng, J. Han, S.-C. Khoo, and C. Sun. Classification of software behaviours for failure detection: A discriminative pattern mining approach. In Proc. of KDD, pages 557–566. ACM, 2009.

    6. J. Nakatumba and W.M.P. van der Aalst. Analyzing resource behaviour using process mining. In Proc. of BPM Workshops, pages 69–80. Springer, 2010.

    7. H. Nguyen, M. Dumas, M. La Rosa, F.M. Maggi, and S. Suriadi. Mining business process deviance: A quest for accuracy. ePrint 75279, QUT, 2014. http://eprints.qut.edu.au/75279/.

    8. A. Partington, M.T. Wynn, S. Suriadi, C. Ouyang, and J. Karnon. Process mining of clinical processes: Comparative analysis of four Australian hospitals. ACM Trans. in Management Information System, 2014. In press.

    9. J. Poelmans, G. Dedene, G. Verheyden, H. Van der Mussele, S. Viaene, and E. Peters. Combining business process and data discovery techniques for analyzing and improving integrated care pathways. In Proc. of Industrial ICDM Conference, pages 505–517. Springer, 2010.

    10. C. Sun, J. Du, N. Chen, S.-C. Khoo, and Y. Yang. Mining explicit rules for software process evaluation. In Proc. of ICSSP, pages 118–125. ACM, 2013.

    11. S. Suriadi, M.T. Wynn, C. Ouyang, A.H.M. ter Hofstede, and N.J. van Dijk. Understanding process behaviours in a large insurance company in Australia: A case study. In Proc. of CAiSE, pages 449–464. Springer, 2013.

    12. J. Swinnen, B. Depaire, M.J. Jans, and K. Vanhoof. A process deviation analysis – a case study. In Proc. of the BPM’2011 Workshops, pages 87–98. Springer, 2012.

    13. W.M.P. van der Aalst. Process Mining - Discovery, Conformance and Enhancement of Business Processes. Springer, 2011.

    14. Z. Xing, J. Pei, and E.J. Keogh. A brief survey on sequence classification. SIGKDD Explorations, 12(1):40–48, 2010.

    15. G. P. Zhang. Neural networks for classification: A survey. IEEE Trans. on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 30(4):451–462, 2000.