
• Research Article: Multiple Naïve Bayes Classifiers Ensemble for Traffic Incident Detection

    Qingchao Liu,1,2 Jian Lu,1,2 Shuyan Chen,1,2 and Kangjia Zhao3

1 Jiangsu Key Laboratory of Urban ITS, Southeast University, Nanjing 210096, China
2 Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Nanjing 210096, China
3 Department of Civil & Environmental Engineering, National University of Singapore, Singapore 119078

    Correspondence should be addressed to Jian Lu; lujian [email protected]

    Received 16 January 2014; Revised 26 March 2014; Accepted 27 March 2014; Published 28 April 2014

    Academic Editor: Erik Cuevas

Copyright © 2014 Qingchao Liu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This study presents the applicability of the Naïve Bayes classifier ensemble for traffic incident detection. The standard Naive Bayes (NB) has been applied to traffic incident detection and has achieved good results. However, the detection result of the practically implemented NB depends on the choice of the optimal threshold, which is determined mathematically by using Bayesian concepts in the incident-detection process. To avoid the burden of choosing the optimal threshold and tuning the parameters and, furthermore, to improve the limited classification performance of the NB and to enhance the detection performance, we propose an NB classifier ensemble for incident detection. In addition, we also propose to combine the Naïve Bayes and decision tree (NBTree) to detect incidents. In this paper, we discuss extensive experiments that were performed to evaluate the performances of three algorithms: standard NB, NB ensemble, and NBTree. The experimental results indicate that the performances of five rules of the NB classifier ensemble are significantly better than those of standard NB and slightly better than those of NBTree in terms of some indicators. More importantly, the performances of the NB classifier ensemble are very stable.

    1. Introduction

The functionality of automatically detecting incidents on freeways is a primary objective of advanced traffic management systems (ATMS), an integral component of the Nation's Intelligent Transportation Systems (ITS) [1]. Traffic incidents are defined as nonrecurring events such as accidents, disabled vehicles, spilled loads, temporary maintenance and construction activities, signal and detector malfunctions, and other special and unusual events that disrupt the normal flow of traffic and cause motorist delay [2, 3]. If an incident cannot be handled in a timely manner, it will increase traffic delay, reduce road capacity, and often cause secondary traffic accidents. Timely detection of incidents is critical to the successful implementation of an incident management system on freeways [4].

Incident detection is essentially a pattern classification problem, where the incident and nonincident traffic patterns are to be recognized or classified [5]. That is to say, incident detection can be viewed as a pattern recognition problem that classifies traffic patterns into one of two classes: the nonincident and incident classes. The classification is normally based on spatial and temporal traffic pattern changes during an incident. The spatial pattern changes refer to the traffic pattern alterations over a stretch of a freeway. The temporal traffic pattern changes refer to the traffic pattern alterations over consecutive time intervals. Typically, traffic flow maintains a consistent pattern at the upstream and downstream detector stations. When an incident occurs, however, traffic flow upstream of the incident scene tends to be congested while that at the downstream station tends to be light due to the blockage at the incident site. These changes in the traffic flow are reflected in the detector data obtained from both the upstream and downstream stations [5]. Therefore, an AID problem is essentially a classification problem, and any good classifier is a potential tool for the incident detection problem. Based on this idea, we treat incident detection as a pattern classification problem. Under normal traffic operation, traffic parameters (speeds,



occupancies, and volumes) both upstream and downstream from a given freeway site are expected to have more or less similar patterns in time (except under bottleneck situations). In the case of an incident, this normal pattern is disrupted; during an incident, for instance, occupancies increase and speeds drop.

Automated incident detection (AID) systems, which employ an incident detection algorithm to detect incidents from traffic data, aim to improve the accuracy and efficiency of incident detection over a large traffic network. Early AID algorithm development focused on simple comparison methods using raw traffic data [6, 7]. To enhance algorithm performance and to achieve real-time incident detection, advanced methods have been suggested, which include image processing [8], artificial neural networks [9], support vector machines [4], and data fusion [10]. Although these newly published methods represent significant improvements, performance stability and transferability are still major issues concerning the existing AID algorithms. To enhance freeway incident detection and to fulfill the universality expectations for AID algorithms, classic Bayesian theory has attracted many scientists' attention. Because of the simplicity and easy interpretation of the Naïve Bayes ensemble and of the combined Naïve Bayes and decision tree (NBTree) algorithm, applications of this approach can be found in an abundant literature. However, reports of its application to traffic engineering are rare. This provides ample motivation to investigate the performance of these models on incident detection.

There are drawbacks which limit its applications; the optimal threshold and parameters of Naïve Bayes (NB) have a great effect on the generalization performance, and setting the parameters of the NB classifier is a challenging task. At present, there is no structured method to choose them. Typically, the optimal threshold and parameters have to be chosen and tuned by trial and error. Some studies have applied search techniques to this problem; however, a large amount of computation time is still involved in such search techniques, which are themselves computationally demanding.

A natural and reasonable question is whether we can increase or at least maintain NB performance while, at the same time, avoiding the burden of choosing the optimal threshold and tuning the parameters. Some researchers have proposed classifier ensembles to address this problem. The performance of classifier ensembles has been investigated experimentally, and they appear to consistently give better results [11-13]. Kittler et al. [12] used different combining schemes based on multiple classifier ensembles, and six rules were introduced into ensemble learning to search for accurate and diverse classifiers to construct a good ensemble. The presented method works on a higher level and is more direct than other search-based methods of ensemble learning. Previous research illustrated that ensemble techniques are able to increase the classification accuracy by combining their individual outputs [14, 15]. While these general methods are preexisting, their application to this specific problem and their integration into the proposed model for the detection of traffic incidents are new. It is expected that the NB classifier ensemble has more generalization ability, but the method has

not yet been discussed in the context of incident detection. Coupling this expectation of reasonably accurate detection with the attractive implementation characteristics of the NB classifier ensemble, we propose to apply the NB classifier ensemble, achieved by the combination of different rules, to detect incidents. The NB classifier ensemble algorithm trains many individual NB classifiers to construct the classifier ensemble and then uses this classifier ensemble to detect traffic incidents, so it needs to train many times. Taking this into account, we also propose NBTree [16] for incident detection. The NBTree splits the dataset by applying an entropy-based algorithm and uses standard NB classifiers at the leaf nodes to handle attributes; that is, it applies a decision-tree construction strategy and replaces the leaf nodes with NB classifiers. We have performed experiments to evaluate the performances of the standard NB, NB ensemble, and NBTree algorithms. The experimental results indicate that the performances of five rules of the NB classifier ensemble are significantly better than those of standard NB and slightly better than those of NBTree in terms of some indicators. More importantly, the performances of the NB classifier ensemble are very stable.

The remaining part of the paper is structured as follows. Section 2 introduces the combination schemes of the NB classifier ensemble. The general performance criteria of AID are presented in Section 3. Section 4 is devoted to empirical results; in this section, we evaluate the performances of the three algorithms: standard NB, NB ensemble, and NBTree. Finally, the conclusions are drawn in Section 5.

2. Naïve Bayes Classifier and Combining Schemes

2.1. Classification and Classifier. A dataset generally consists of feature vectors, where each feature vector is a description of an object by using a set of features. For example, take a look at the synthetic dataset shown in Figure 1. Here, each object is a data point described by the features x-coordinate, y-coordinate, and color, and a feature vector looks like (0.8, 0.9, yellow) or (0.9, 0.1, red). Features are also called attributes, a feature vector is also called an instance, and sometimes a dataset is called a sample.

A Naïve Bayes classifier ensemble is a predictive model that we want to construct or discover from the dataset. The process of generating models from data is called learning or training, which is accomplished by a learning algorithm. In supervised learning, the goal is to predict the value of a target feature on unseen instances, and the learned model is also called a predictor. For example, if we want to predict the color of the synthetic data points, we call "yellow" and "red" labels, and the predictor should be able to predict the label of an instance for which the label information is unknown, for example, (0.7, 0.7). If the label is categorical, such as color, the task is called classification and the learner is also called a classifier.

2.2. Naïve Bayes Classifier. To classify a test instance x, one approach is to formulate a probabilistic model to estimate the posterior probability P(y | x) of the different labels y and to predict


Figure 1: The synthetic dataset. (Scatter plot of instances (X, Y; color) in the unit square; the color label is assigned by simple threshold rules on the X and Y coordinates, e.g., rule (1) applies when X < 0.25 and Y > 0.75 or X > 0.75 and Y < 0.25.)

the one with the largest posterior probability; this is the maximum a posteriori (MAP) rule. By Bayes' theorem, we have

P(y \mid x) = \frac{P(x \mid y)\, P(y)}{P(x)}, \quad (1)

where P(y) can be estimated by counting the proportion of class y in the training set and P(x) can be ignored since we are comparing different y's on the same x. Thus we only need to consider P(x | y). If we can get an accurate estimate of P(x | y), we will get the best classifier in theory from the given training data, that is, the Bayes optimal classifier with the Bayes error rate, the smallest error rate in theory. However, estimating P(x | y) is not straightforward, since it involves the estimation of exponential numbers of joint probabilities of the features. To make the estimation tractable, some assumptions are needed. The naive Bayes classifier assumes that, given the class label, the features are independent of each other within each class. Thus, we have

P(x \mid y) = \prod_{i=1}^{n} P(x_i \mid y), \quad (2)

which implies that we only need to estimate each feature value in each class in order to estimate the conditional probability, and therefore the calculation of joint probabilities is avoided. In the training stage, the naive Bayes classifier estimates the probabilities P(y) for all classes and P(x_i | y) for all features i = 1, 2, ..., n and all feature values from the training set. In the test stage, a test instance x will be predicted with the label y that leads to the largest value, among all class labels, of

P(y \mid x) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y). \quad (3)

As demonstrated in paper [17], for a threshold level of 0.0006, 65.4% of incidents were identified. However, if the threshold cannot be chosen appropriately, fewer incidents will be identified correctly. As known from machine learning, the naive Bayesian classifier provides a simple approach, with clear semantics, for representing, using, and learning probabilistic knowledge. The method is designed for use in supervised induction tasks, in which the performance goal is to accurately predict the class of test instances and in which the training instances include class information. In this way, it avoids choosing the optimal threshold and tuning the parameters manually.
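For illustration, the following is a minimal sketch (not the implementation used in this study) of a categorical naive Bayes classifier corresponding to (1)-(3): the priors P(y) and likelihoods P(x_i | y) are estimated by frequency counting, and the MAP rule picks the most probable class. The toy feature values, the label coding, and the Laplace smoothing constant alpha are illustrative assumptions.

```python
from collections import Counter, defaultdict

def train_nb(X, y, alpha=1.0):
    """Estimate the class priors P(y) and per-feature likelihoods P(x_i | y)
    by frequency counting, with Laplace smoothing (alpha is an assumption)."""
    n = len(y)
    priors = {c: cnt / n for c, cnt in Counter(y).items()}
    # class -> feature index -> counts of observed feature values
    likelihoods = defaultdict(lambda: defaultdict(Counter))
    for xs, c in zip(X, y):
        for i, v in enumerate(xs):
            likelihoods[c][i][v] += 1
    return priors, likelihoods, alpha

def predict_nb(x, priors, likelihoods, alpha):
    """MAP rule (3): pick the class maximizing P(y) * prod_i P(x_i | y)."""
    best_class, best_score = None, -1.0
    for c, prior in priors.items():
        n_c = sum(likelihoods[c][0].values())      # number of training instances in class c
        score = prior
        for i, v in enumerate(x):
            vocab = len(likelihoods[c][i])          # distinct values seen for feature i in class c
            score *= (likelihoods[c][i][v] + alpha) / (n_c + alpha * vocab)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Toy usage with discretized traffic features (values invented).
X = [("low", "high"), ("high", "low"), ("low", "high"), ("high", "high")]
y = [1, -1, 1, -1]   # 1 = nonincident, -1 = incident (coding assumed)
model = train_nb(X, y)
print(predict_nb(("low", "high"), *model))
```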

    2.3. Combining Schemes

2.3.1. Five Rules. The most widely used probability combination rules [12] are the product rule, the sum rule, the min rule, and the max rule. Given L classifiers with inputs x_1, ..., x_L and classes c_1, ..., c_m, these are defined as follows.

Product rule:

P(c_j \mid x_1, \ldots, x_L) = \frac{1}{P(c_j)^{L-1}} \prod_{i=1}^{L} P(c_j \mid x_i), \quad (4)

where x_i is the input to the ith classifier and P(c_j) is the a priori probability of class c_j.

Sum rule:

P(c_j \mid x_1, \ldots, x_L) = \frac{1}{L} \sum_{i=1}^{L} P(c_j \mid x_i). \quad (5)

Min rule:

P(c_j \mid x_1, \ldots, x_L) = \frac{\min_{i} P(c_j \mid x_i)}{\sum_{k=1}^{m} \min_{i} P(c_k \mid x_i)}. \quad (6)

Max rule:

P(c_j \mid x_1, \ldots, x_L) = \frac{\max_{i} P(c_j \mid x_i)}{\sum_{k=1}^{m} \max_{i} P(c_k \mid x_i)}. \quad (7)

Majority vote rule:

\Delta_{ji} = \begin{cases} 1, & \text{if } P(c_j \mid x_i) = \max_{k=1,\ldots,m} P(c_k \mid x_i), \\ 0, & \text{otherwise}, \end{cases} \qquad \sum_{i=1}^{L} \Delta_{j^{*} i} = \max_{j=1,\ldots,m} \sum_{i=1}^{L} \Delta_{ji}. \quad (8)

Note that, for each class c_j, the sum on the right-hand side of (8) simply counts the votes received for this hypothesis from the individual classifiers. The class j* that receives the largest number of votes is then selected as the majority decision.
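As a concrete reading of (4)-(8), the sketch below combines the posterior vectors produced by several base classifiers under each of the five rules; it assumes each classifier already outputs a posterior over the classes, and the function and variable names are our own rather than taken from [12].

```python
import numpy as np

def combine(posteriors, priors, rule="sum"):
    """Combine per-classifier posteriors P(c_j | x_i).

    posteriors: array of shape (L, m) -- L classifiers, m classes.
    priors:     array of shape (m,)   -- class priors P(c_j).
    Returns a score vector over the m classes (argmax gives the decision).
    """
    P = np.asarray(posteriors, dtype=float)
    L = P.shape[0]
    if rule == "product":        # (4): prod_i P(c_j|x_i) / P(c_j)^(L-1)
        scores = P.prod(axis=0) / np.asarray(priors, dtype=float) ** (L - 1)
    elif rule == "sum":          # (5): average of the posteriors
        scores = P.mean(axis=0)
    elif rule == "min":          # (6): normalized minimum over classifiers
        m = P.min(axis=0)
        scores = m / m.sum()
    elif rule == "max":          # (7): normalized maximum over classifiers
        m = P.max(axis=0)
        scores = m / m.sum()
    elif rule == "majority":     # (8): each classifier votes for its top class
        votes = np.zeros(P.shape[1])
        for row in P:
            votes[np.argmax(row)] += 1
        scores = votes
    else:
        raise ValueError(rule)
    return scores

# Three classifiers, two classes (nonincident, incident); the numbers are invented.
posteriors = [[0.7, 0.3], [0.6, 0.4], [0.2, 0.8]]
priors = [0.95, 0.05]
for rule in ("product", "sum", "min", "max", "majority"):
    print(rule, np.argmax(combine(posteriors, priors, rule)))
```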


2.3.2. Combine Decision Tree and Naïve Bayes. The Naïve Bayesian tree (NBTree) algorithm is similar to the classical recursive partitioning schemes, except that the leaf nodes created are naive Bayesian classifiers instead of nodes predicting a single class [16]. First, define a measure called entropy that characterizes the purity of an arbitrary collection of instances. Given a collection S, if the target attribute can take on c different values, then the entropy of S relative to this c-wise classification is defined as

\mathrm{Entropy}(S) = -\sum_{i=1}^{c} p_i \log_2 p_i, \quad (9)

where p_i is the proportion of S belonging to class i. The information gain, Gain(S, A), of an attribute A, the expected reduction in entropy caused by partitioning the examples according to this attribute relative to S, is defined as

\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|} \mathrm{Entropy}(S_v), \quad (10)

where Values(A) is the set of all possible values for attribute A and S_v is the subset of S for which attribute A has value v. NBTree is a hybrid approach that attempts to utilize the advantages of both decision trees and naive Bayesian classifiers. It splits the dataset by applying an entropy-based algorithm and uses standard naive Bayesian classifiers at the leaf nodes to handle attributes; that is, it applies a decision-tree construction strategy and replaces the leaf nodes with NB classifiers.
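The entropy and information-gain computations in (9) and (10) can be sketched as follows; this is only an illustration of the split criterion, not the NBTree implementation of [16], and the toy attribute values are invented.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy(S) = -sum_i p_i * log2(p_i), formula (9)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(instances, labels, attr_index):
    """Gain(S, A) = Entropy(S) - sum_v |S_v|/|S| * Entropy(S_v), formula (10)."""
    n = len(labels)
    by_value = {}
    for x, y in zip(instances, labels):
        by_value.setdefault(x[attr_index], []).append(y)
    remainder = sum(len(sub) / n * entropy(sub) for sub in by_value.values())
    return entropy(labels) - remainder

# Toy data: attribute 0 is an (assumed) discretized occupancy level.
X = [("high",), ("high",), ("low",), ("low",)]
y = ["incident", "incident", "clear", "incident"]
print(entropy(y), information_gain(X, y, 0))
```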

3. Performance Criteria of AID

3.1. Definition of DR, FAR, MTTD, and CR. Four primary measures of performance, namely, detection rate (DR), false alarm rate (FAR), mean time to detection (MTTD), and classification rate (CR), are used to evaluate traffic incident detection algorithms. We quote the definitions from [18, 19].

DR is defined as the number of incidents correctly detected by the traffic incident detection algorithm divided by the total number of incidents known to have occurred during the observation period:

DR = (number of incident cases detected / total number of incident cases) × 100%. (11)

FAR is defined as the proportion of instances that were incorrectly classified as incident instances among the nonincident instances in the testing set. Out of the total number of applications of the model to the dataset, FAR is calculated to determine how many incident alarms were falsely set. In order to decrease FAR, a persistence test is often used: an incident alarm is triggered only when several consecutive outputs of the model exceed the threshold, which is called a persistence check.

FAR = (number of false alarm cases / total number of nonincident cases) × 100%. (12)

MTTD is computed as the average length of time between the start of an incident and the time the alarm is initiated. When multiple alarms are declared for a single incident, only the first correct alarm is used for computing the detection rate and the mean time to detect:

\mathrm{MTTD} = \frac{t_1 + t_2 + \cdots + t_n}{n}, \quad (13)

where t_i is the time to detect the ith detected incident and n is the number of detected incidents.

Besides these three measures, we also use the classification rate (CR) as an index to test traffic incident detection algorithms. Of the total number of applications of cycle-length data or input instances, the percentage of correctly classified instances (including both incident and nonincident instances) by the model is computed as CR:

CR = (number of instances correctly classified / total number of instances) × 100%. (14)
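Under our own assumptions about how the predictions and incident cases are stored, the four measures in (11)-(14) could be computed as in the sketch below; the label coding and the example numbers are illustrative.

```python
def detection_measures(y_true, y_pred, incident_detect_delays):
    """y_true / y_pred: per-interval labels, 1 = incident, 0 = nonincident (coding assumed).
    incident_detect_delays: for each incident case, the delay (minutes) of the first
    correct alarm, or None if the case was missed."""
    n = len(y_true)
    detected = [d for d in incident_detect_delays if d is not None]
    dr = 100.0 * len(detected) / len(incident_detect_delays)                 # (11)
    false_alarms = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    nonincident = sum(1 for t in y_true if t == 0)
    far = 100.0 * false_alarms / nonincident                                 # (12)
    mttd = sum(detected) / len(detected) if detected else float("nan")       # (13)
    cr = 100.0 * sum(1 for t, p in zip(y_true, y_pred) if t == p) / n        # (14)
    return dr, far, mttd, cr

# Invented example: five intervals, two incident cases (one detected after 1.5 min).
print(detection_measures([0, 0, 1, 1, 0], [0, 1, 1, 1, 0], [1.5, None]))
```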

3.2. Area under the ROC Curve (AUC). Receiver operating characteristic (ROC) curves illustrate the relationship between the DR and the FAR from 0 to 1. Often the comparison of two or more ROC curves consists of either looking at the area under the ROC curve (AUC) or focusing on a particular part of the curves and identifying which curve dominates the other in order to select the best-performing algorithm. AUC, when using normalized units, is equal to the probability that a classifier will rank a randomly chosen positive instance higher than a randomly chosen negative one (assuming "positive" ranks higher than "negative") [20]. It can be shown that the area under the ROC curve is closely related to the Mann-Whitney statistic, which tests whether positives are ranked higher than negatives. It is also equivalent to the Wilcoxon test of ranks [21]. The AUC is related to the Gini coefficient G_1 by the formula below [22]. In this way, it is possible to calculate the AUC by using an average of a number of trapezoidal approximations:

\mathrm{AUC} = \frac{1 + G_1}{2}, \quad (15)

where G_1 = 1 - \sum_{k=1}^{n} (X_k - X_{k-1})(Y_k + Y_{k-1}).
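A small sketch of the trapezoidal approximation of the AUC follows; the ROC sample points are invented and assumed to be sorted by increasing FAR, and the helper name is our own.

```python
def auc_trapezoid(far, dr):
    """Area under the ROC curve by the trapezoidal rule.
    far, dr: ROC points (false alarm rate, detection rate) sorted by increasing far,
    including the (0, 0) and (1, 1) endpoints."""
    return sum((far[k] - far[k - 1]) * (dr[k] + dr[k - 1]) / 2.0
               for k in range(1, len(far)))

# Illustrative ROC sample points (values invented).
far = [0.0, 0.03, 0.10, 1.0]
dr = [0.0, 0.60, 0.85, 1.0]
print(auc_trapezoid(far, dr))
```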

3.3. Statistics Indicators. In statistics, the mean absolute error (MAE) is a quantity used to measure how close forecasts or predictions are to the eventual outcomes. The mean absolute error is given by

\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| f_i - y_i \right|, \quad (16)

where f_i is the predicted value and y_i is the observed value.

The root-mean-square error (RMSE) is a frequently used measure of the differences between values predicted by a model or an estimator and the values actually observed. These individual differences are called residuals when the calculations are performed over the data sample that was used for estimation and are called prediction errors when computed out of sample. The RMSE serves to aggregate the magnitudes of the errors in predictions for various times into a single measure of predictive power:

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (f_i - y_i)^2}. \quad (17)

The equality coefficient (EC) is useful for comparing different forecast methods, for example, whether a fancy forecast is in fact any better than a naive forecast repeating the last observed value. The closer the value of EC is to 1, the better the forecast method. A value of zero means the forecast is no better than a naive guess:

\mathrm{EC} = 1 - \frac{\sqrt{\sum_{i=1}^{n} (f_i - y_i)^2}}{\sqrt{\sum_{i=1}^{n} f_i^2} + \sqrt{\sum_{i=1}^{n} y_i^2}}. \quad (18)
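The three statistics in (16)-(18) translate directly into code; the following sketch uses invented forecast and observation vectors purely for illustration.

```python
import math

def mae(pred, obs):
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)                 # (16)

def rmse(pred, obs):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))    # (17)

def equality_coefficient(pred, obs):
    num = math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)))
    den = math.sqrt(sum(p ** 2 for p in pred)) + math.sqrt(sum(o ** 2 for o in obs))
    return 1.0 - num / den                                                       # (18)

pred, obs = [1, -1, 1, 1], [1, -1, -1, 1]   # invented +/-1 predictions and labels
print(mae(pred, obs), rmse(pred, obs), equality_coefficient(pred, obs))
```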

Kappa measures the agreement between two raters who each classify items into mutually exclusive categories. Kappa is computed as in formula (19), where P(a) is the observed agreement among the raters and P(e) is the expected agreement; that is, P(e) represents the probability that the raters agree by chance. The values of Kappa are constrained to the interval [-1, 1]. Kappa = 1 means perfect agreement, Kappa = 0 means that agreement is equal to chance, and Kappa = -1 means perfect disagreement:

\mathrm{Kappa} = \frac{P(a) - P(e)}{1 - P(e)}. \quad (19)
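For the two-class case used here, Kappa in (19) can be sketched as below, with the observed agreement P(a) and chance agreement P(e) computed from the label frequencies; the example labels are invented.

```python
def kappa(y_true, y_pred):
    """Kappa = (P(a) - P(e)) / (1 - P(e)), formula (19)."""
    n = len(y_true)
    p_a = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n   # observed agreement
    classes = set(y_true) | set(y_pred)
    # expected chance agreement from the marginal label frequencies
    p_e = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in classes)
    return (p_a - p_e) / (1.0 - p_e)

print(kappa([1, 1, -1, -1, 1], [1, -1, -1, -1, 1]))
```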

    4. Experiments on Traffic Incident Detection

4.1. Parameters and Procedures of Experiments. To describe the experiments clearly, we first present the definitions of all parameters and symbols used in the experiments. Then, we describe the experiment procedures in detail.

4.1.1. Parameters of Experiments. Some parameters are adopted to make the procedures of the experiments more automatic and optimized. In addition, some symbols are used to denote specific concepts. For clarity, we present the definition of each parameter and symbol in Table 1.

4.1.2. Construction of Datasets for Training and Testing. The traffic data mentioned refer to three basic traffic flow parameters, namely, volume, speed, and occupancy. An incident is detected based on a section, which means that the traffic data collected from the upstream and the downstream detection stations are usually used as model inputs in AID systems. In Figure 2, A1 is the first attribute, and the number of x-variables (predictor variables) is 6. This means that the matrix used in training the model has six columns and one row per training instance; the test data form a matrix of the same width, with one row per test instance. The formal description of the matrices X and Y can be written as shown in Figure 2. One instance consists of at least the following items:

(i) speed, volume, and occupancy of the upstream detector,

(ii) speed, volume, and occupancy of the downstream detector,

    (iii) traffic state (incident or nonincident),

where the item "traffic state" is a label. The label takes one of two values (coded as 1 and -1), one for the nonincident state and the other for the incident state, as determined by the incident dataset. Typically, the model is fit on part of the data (the training set), and the quality of the fit is judged by how well it predicts the other part of the data (the test set). The entire dataset was divided into two parts: a training set that was used to build the model and a test set that was used to test the model's detection ability. Each row of the matrices is composed of one observation, and the label of each instance belongs to {-1, 1}. The data analysis problem is to relate the matrix Y, as some function of the matrix X, so as to predict Y (e.g., the traffic state) using the data of X, Y = f(X). The training set was used to develop a Naïve Bayes classifier ensemble that was, in turn, used to detect incidents for the test set samples. The output values of the detection models were then compared with the actual ones for each of the calibration samples, and the performance criteria were calculated and compared.
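The dataset layout of Figure 2 can be illustrated with the following sketch, which assumes each raw record already carries the upstream and downstream speed, volume, and occupancy plus a traffic-state label; the field names are our own.

```python
import numpy as np

def build_dataset(records):
    """records: iterable of dicts with upstream/downstream measurements and a label.
    Returns (X, Y): X has one six-feature row per instance, Y holds the state labels."""
    X, Y = [], []
    for r in records:
        X.append([r["speed_up"], r["volume_up"], r["occupancy_up"],
                  r["speed_dn"], r["volume_dn"], r["occupancy_dn"]])
        Y.append(r["state"])   # +1 / -1 traffic-state label (coding assumed)
    return np.array(X, dtype=float), np.array(Y, dtype=int)

# One illustrative 30 s record (values invented).
records = [{"speed_up": 35.0, "volume_up": 12, "occupancy_up": 0.31,
            "speed_dn": 68.0, "volume_dn": 5, "occupancy_dn": 0.06, "state": -1}]
X, Y = build_dataset(records)
print(X.shape, Y)
```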

4.1.3. Experiment Procedures. The experiments were performed according to the procedures shown in Figure 3 (a compact sketch of Steps 1-4 is given after Step 4 below).

Step 1. Divide the whole dataset into a training set and a test set. We take part of the whole dataset as the training set and the rest as the test set, use a parameter to control the size of the training set, and obtain the training set by taking samples from the front to the back of the whole dataset.

Step 2. Perform sampling with replacement from the training set a number of times equal to the total number of training subsets (Table 1) and thereby obtain the training subsets. We use a ratio parameter to control the ratio of the number of samples in each training subset to the number of samples in the training set; that is, the number of samples in a training subset equals this ratio times the size of the training set. The range of the ratio is [0, 1].

Step 3. For each subset index i = 1, 2, 3, ..., perform the following:

(1) use the ith training subset to train the ith individual NB classifier INBC_i;

(2) use the ith training subset to train the ith NBTree classifier INBTreeC_i;

(3) use all the existing individual NB classifiers to construct the ith ensemble NB classifier ENBC_i.

Step 4. For each index i = 1, 2, 3, ..., perform the following:

(1) test the performance of the ith individual NB classifier INBC_i on the test set;

(2) test the performance of the ith NBTree classifier INBTreeC_i on the test set;

(3) test the performance of the ith ensemble NB classifier ENBC_i on the test set.
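As promised above, here is a compact sketch of Steps 1-4 under stated assumptions: subsets are drawn with replacement from the training set, one NB classifier is trained per subset, and the ith ensemble combines the first i classifiers with the sum rule from Section 2.3. scikit-learn's GaussianNB stands in for the NB learner actually used in the experiments, and the synthetic data at the end are invented.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB   # stand-in for the paper's NB learner

def run_procedure(X_train, y_train, X_test, n_subsets=20, ratio=0.05, seed=0):
    """Steps 2-4: draw training subsets with replacement, train one individual NB
    classifier per subset, and form the ith ensemble from the first i classifiers
    using the sum rule (average of posteriors). Assumes every subset contains both classes."""
    rng = np.random.default_rng(seed)
    n_sub = int(ratio * len(X_train))
    classifiers, ensemble_preds = [], []
    for _ in range(n_subsets):
        idx = rng.choice(len(X_train), size=n_sub, replace=True)   # sampling with replacement
        classifiers.append(GaussianNB().fit(X_train[idx], y_train[idx]))
        avg_posterior = np.mean([c.predict_proba(X_test) for c in classifiers], axis=0)
        ensemble_preds.append(classifiers[0].classes_[np.argmax(avg_posterior, axis=1)])
    return classifiers, ensemble_preds

# Tiny synthetic usage (data invented): six features, two classes.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 6))
y = (X[:, 0] > 0).astype(int)
clfs, preds = run_procedure(X[:300], y[:300], X[300:], n_subsets=5, ratio=0.5)
print(preds[-1][:10])
```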


Figure 2: Construction of traffic datasets. (Each instance is one row: the six attributes A1-A6 hold the upstream speed, volume, and occupancy and the downstream speed, volume, and occupancy and form matrix X, while the class column holds the traffic state and forms matrix Y; rows run from the first instance to the last.)

Figure 3: A freeway incident detection model based on the Naïve Bayes classifier ensemble. (Detectors 1 to n send speed, volume, and occupancy to a signal processing unit that builds a traffic flow database; historical data train the Naïve Bayes classifier ensemble, each new real-time instance is classified, and an alarm is raised when the predicted state is the incident state, otherwise the state is reported as incident-free.)


Table 1: Definition of Symbols and Parameters.

Symbols:
The whole dataset
Training set
Test set
The ith subset of the training set
INBC_i: the ith individual Naïve Bayes classifier
ENBC_i: the ith ensemble Naïve Bayes classifier
INBTreeC_i: the ith individual NBTree classifier

Parameters:
The total number of the training subsets
The ratio of the training-subset size to the training-set size
The number of samples in the test set
The number of samples in the training set
The total number of samples in the whole dataset
The number of samples in each training subset (each training subset has the same number of samples)
The number of incident samples in the whole dataset
The number of nonincident samples in the whole dataset

Table 2: Parameter Setting of Experiments.

Parameter setting for the I-880 dataset: total number of training subsets = 20; ratio of subset size to training-set size = 0.05; samples in the test set = 45138; samples in the training set = 45518; samples per training subset = 2275; total samples = 90656; incident samples = 4136; nonincident samples = 86520.

Parameter setting for the AYE dataset: total number of training subsets = 20; ratio of subset size to training-set size = 0.05; samples in the test set = 16000; samples in the training set = 13500; samples per training subset = 675; total samples = 29500; incident samples = 6000; nonincident samples = 23500.

    4.2. Experiments on I-880 Dataset

4.2.1. Data Description for I-880 Dataset. We proceeded to real-world data. These data were collected by Petty et al. from the I-880 Freeway in the San Francisco Bay area, California, USA. This is the most recent and probably the most well-known freeway incident dataset collected, and these data have been used in many studies related to incident detection. Loop detector data, with and without incidents, were collected from a 9.2 mile (14.8 km) segment of the I-880 Freeway between the Marina and Whipple exits, in both directions. There were 18 loop detector stations in the northbound direction and 17 stations in the southbound direction. The data collected included traffic volume, occupancy, and speed, averaged across all lanes in 30 s intervals at the same station. In summary, the training dataset has 45,518 training instances, of which 2100 are incident instances (from 22 incident cases). The testing dataset has 45,138 instances in all, including 2036 incident instances (from 23 incident cases). Thus, incident examples are very rare in this dataset, as only approximately 4.6% and 4.5% incident examples are contained in the training set and the testing set, respectively. Each instance has seven features. In addition to the measurements of speed, volume, and occupancy collected at both the upstream detector station and the downstream detector station, the last one is the class label, with one value (1 or -1) for the nonincident state and the other for the incident state.

4.2.2. Parameter Setting of Experiments for I-880 Dataset. To divide the training set into training subsets properly, there are some parameters that need to be set. We set values for each parameter and present them in Table 2.

In our experiments, we constructed 20 individual NB classifiers, 20 NB ensemble classifiers, and 20 NBTree classifiers. We tested the performances of the five rules of each classifier on the I-880 dataset. Then, we calculated the averages and variances of the performances of the 20 individual NB classifiers, 20 NB ensemble classifiers, and 20 NBTree classifiers. The summarized results are in Table 3. In Table 3, the results are presented in the form average ± variance, and the best results are highlighted in bold. To make a visual comparison of the performances of all classifiers, we plotted them in Figures 4 and 5.
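The "average ± variance" entries reported in Tables 3 and 4 can be aggregated from the per-classifier results as in the short sketch below; the metric values shown are invented.

```python
import numpy as np

def summarize(metric_values):
    """Return the (mean, variance) pair used in the "average +/- variance" table entries."""
    v = np.asarray(metric_values, dtype=float)
    return v.mean(), v.var()

# Invented DR values for a handful of classifiers, purely for illustration.
mean_dr, var_dr = summarize([0.89, 0.90, 0.88, 0.91])
print("DR = %.4f +/- %.2e" % (mean_dr, var_dr))
```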

    4.3. Experiments on AYE Dataset with Noisy Data

4.3.1. Data Description for AYE Dataset. The traffic data used in this study for the development of the incident detection models was produced from a simulated traffic system. A 5.8 km section of the Ayer Rajah Expressway (AYE) in Singapore was selected to simulate incident and nonincident conditions. This site was selected for the incident detection study because of its diverse geometric configurations that can cover a variety of incident patterns [23, 24].


Table 3: Experimental Results of NB, Five Rules, and NBTree as Applied to the I-880 Dataset (the performances are presented in the form average ± variance).

Algorithm | DR | FAR | MTTD | CR | AUC | Kappa | MAE | RMSE | EC
Naïve Bayes classifier | 0.8228 ± 2.5005 | 0.0398 ± 6.32E-07 | 1.2991 ± 0.01727 | 0.9540 ± 6.43E-07 | 0.8915 ± 6.50E-06 | 0.59479 ± 2.63E-05 | 0.01619 ± 3.11E-07 | 0.17995 ± 9.47E-06 | 0.99184 ± 8.03E-08
Product rule ensemble | 0.8404 ± 2.91E-04 | 0.0380 ± 3.03E-06 | 1.7615 ± 1.04744 | 0.9557 ± 2.22E-06 | 0.9012 ± 7.78E-05 | 0.61383 ± 2.28E-04 | 0.01602 ± 5.33E-07 | 0.17898 ± 1.60E-05 | 0.99192 ± 1.38E-07
Sum rule ensemble | 0.8963 ± 5.47E-06 | 0.0297 ± 5.87E-07 | 1.6906 ± 0.89473 | 0.9636 ± 3.66E-07 | 0.9333 ± 1.06E-06 | 0.6932 ± 1.91E-05 | 0.01599 ± 2.04E-07 | 0.17881 ± 6.34E-06 | 0.99194 ± 5.26E-08
Max rule ensemble | 0.8960 ± 5.58E-06 | 0.0297 ± 1.62E-07 | 1.6236 ± 0.84537 | 0.9636 ± 1.63E-07 | 0.9331 ± 1.33E-06 | 0.69268 ± 6.81E-06 | 0.01606 ± 8.09E-08 | 0.17924 ± 2.52E-06 | 0.99190 ± 2.09E-08
Min rule ensemble | 0.8962 ± 2.29E-06 | 0.0304 ± 2.99E-06 | 1.6193 ± 0.85452 | 0.9629 ± 3.13E-06 | 0.9329 ± 2.53E-06 | 0.68845 ± 1.28E-04 | 0.01603 ± 1.06E-07 | 0.17902 ± 3.28E-06 | 0.99192 ± 2.73E-08
MV rule ensemble | 0.8193 ± 6.25E-05 | 0.0410 ± 2.72E-06 | 1.7802 ± 1.15158 | 0.9527 ± 2.02E-06 | 0.8892 ± 1.29E-05 | 0.58639 ± 5.16E-05 | 0.01630 ± 5.09E-07 | 0.18052 ± 1.52E-05 | 0.99178 ± 1.32E-07
NBTree classifier | 0.8143 ± 0.00112 | 0.0089 ± 1.463E-05 | 1.3622 ± 0.00798 | 0.9831 ± 1.474E-05 | 0.9027 ± 2.7953E-04 | 0.8052 ± 0.00147 | 0.02626 ± 6.3487E-06 | 0.22894 ± 1.202E-04 | 0.98669 ± 1.67E-06


The simulation system generated volume, occupancy, and speed data at upstream and downstream sites for both incident and nonincident traffic conditions. The traffic dataset consisted of 300 incident cases that had been simulated based on AYE traffic. The simulation of each incident case consisted of three parts. The first part was the nonincident period that lasted for 5 min. This was after a simulation of a 5 min warm-up time. During the warm-up time, the data contain noise. The second part was the 10 min incident period. This was followed by a 30 min postincident period. Each input pattern included traffic volume, speed, and lane occupancy accumulated at 30 s intervals, averaged across all the lanes, as well as the traffic state. The traffic state label takes one of two values (coded as 1 and -1), one for the nonincident state and the other for the incident state.

4.3.2. Parameter Setting of Experiments for the AYE Dataset. As we used a new dataset to perform the experiments, the parameter values of the experiments needed to be updated. The updated parameter values can be seen in Table 2.

4.3.3. Experimental Results with the AYE Dataset. As mentioned in Section 4.3.1, the AYE dataset includes noisy data, which seriously reduces the quality of the detection. Therefore, the experimental results obtained with the AYE dataset are much worse overall than the experimental results from the I-880 dataset. The AYE dataset experimental results are summarized in Figures 6 and 7 and Table 4. In Table 4, for each algorithm of standard NB, NB ensemble, and NBTree, we calculate the averages and variances of the performances of the 20 individual classifiers or ensemble classifiers. The results are presented as average ± variance, and the best results are highlighted in bold.

4.4. Performance Evaluation. In this subsection, we have evaluated the performances of all three algorithms, standard NB, NB ensemble, and NBTree, using the I-880 dataset and the AYE dataset with noisy data. In Figures 4 and 6, the subfigures (a)-(f) evaluate the performances of the five rules on the indicators DR, FAR, MTTD, CR, AUC, and Kappa. In Figures 5 and 7, the subfigures evaluate the performances of MAE, RMSE, and EC.

4.4.1. Performance Evaluation for the I-880 Dataset. From Figures 4 and 5, we can see that the performances of the sum rule, the max rule, and the min rule are stable and significantly better than the performances of the product rule and the majority vote rule. The performances of the product rule and the majority vote rule fluctuate strongly, which demonstrates that their performances are unstable. The reason for this is that the results of the Naïve Bayes algorithm depend on an appropriate threshold and parameters, and the algorithm is not resilient to estimation errors. For example, in Figure 4(a), the DR of the product rule and the majority vote rule reaches 88% and 82% or even higher; in contrast, the DR can also fall to 78% or even lower. In Figure 5, concerning the product rule, when the number of classifiers is less than 10, the RMSE reaches as low as 0.17 or even lower; in contrast, the RMSE can reach 0.19 or even higher. These results indicate that when the threshold and parameters are chosen appropriately, the detection performance can improve. We need to select an appropriate threshold and parameters to make the NB ensemble achieve optimal performance, but the procedures for choosing an appropriate threshold and parameters rely on trial and error; until now, there has not been a structured way to choose these values. There is another significant phenomenon that can be found in Figures 4 and 5. When the number of classifiers increases, the fluctuations of the five rules diminish, and they tend to reach a stable value. The reason for this is that the multiple-classifier ensemble can compensate for the defects of a single classifier to some extent. From Figure 4, we can also see that the performances of the sum rule, the max rule, and the min rule on each indicator are very close to each other. As for the indicators DR, FAR, AUC, and Kappa, the performance of the sum rule is slightly better than that of the max rule and the min rule. As mentioned above, AUC and Kappa can evaluate the performances more comprehensively than the other four indicators. Making an overall evaluation, the performances of the NB ensemble are slightly better than those of the Naïve Bayes.

In Table 3, the average DR value of Naïve Bayes is 82.28%, whereas the average DR values of the five rules range from 81.93% to 89.63%. This indicates that the NB ensemble algorithm is more sensitive to traffic incidents and can detect more traffic incidents than the standard Naïve Bayes algorithm. The average MTTD value of NBTree is 1.36 min, which indicates that the NBTree algorithm can detect incidents more quickly than the NB ensemble algorithm. The average CR value of NBTree is 98.31%, which indicates that NBTree can achieve a higher classification accuracy on the incident instances than the Naïve Bayes and NB ensemble algorithms. The average Kappa value of NBTree is 0.8052, which indicates that the performance of the NBTree classifier is significantly better than those of the other classifiers. In addition, this would suggest that it can detect more incidents than the NB ensemble, but the experimental results are not the same. The reason for this is that the FAR of NBTree is the lowest, and this low FAR improves the overall performance of the NBTree classifier. The MAE, RMSE, and EC of the product rule are best, which indicates that the predicted values of the product rule are the closest approximation.

4.4.2. Performance Evaluation Using the AYE Dataset with Noisy Data. From Figures 6 and 7, we can see that when the dataset includes noisy data, the performances of the sum rule, the max rule, and the min rule are still stable and significantly better than the performances of the product rule and the majority vote rule. In addition, we also find that the performances of the sum rule, the max rule, and the min rule are much better than those of the product rule and the majority vote rule on the indicators DR, FAR, CR, and Kappa. Among the five rules, it appears that the min rule yields the highest average MAE, whereas the other rules are similar. As far as EC is concerned, the five rules perform at a similar level that is better than that of the NBTree. The opposite case


Table 4: Experimental Results of NB, Five Rules, and NBTree as Applied to the AYE Dataset (the performances are presented in the form average ± variance).

Algorithm | DR | FAR | MTTD | CR | AUC | Kappa | MAE | RMSE | EC
Naïve Bayes classifier | 0.7112 ± 0.00348 | 0.0499 ± 8.16E-04 | 1.394 ± 0.08869 | 0.9094 ± 0.00111 | 0.8663 ± 4.39E-04 | 0.6247 ± 9.14E-04 | 0.16769 ± 3.07E-04 | 0.57895 ± 5.31E-04 | 0.90854 ± 1.83E-04
Product rule ensemble | 0.6457 ± 0.00127 | 0.0557 ± 2.40E-06 | 1.8685 ± 1.24139 | 0.8732 ± 1.64E-06 | 0.8619 ± 4.85E-07 | 0.7479 ± 3.58E-05 | 0.16690 ± 1.04E-06 | 0.57775 ± 3.11E-06 | 0.90895 ± 3.68E-07
Sum rule ensemble | 0.7923 ± 5.57E-05 | 0.0500 ± 6.73E-07 | 1.4384 ± 0.63868 | 0.8774 ± 3.55E-07 | 0.8667 ± 3.75E-07 | 0.7439 ± 1.29E-04 | 0.16739 ± 4.94E-07 | 0.57861 ± 1.47E-06 | 0.90866 ± 1.75E-07
Max rule ensemble | 0.7869 ± 1.79E-04 | 0.0500 ± 2.21E-06 | 1.4571 ± 0.72272 | 0.8771 ± 1.25E-06 | 0.8663 ± 2.02E-07 | 0.7511 ± 4.12E-05 | 0.16803 ± 4.96E-06 | 0.57969 ± 1.46E-05 | 0.90828 ± 1.77E-06
Min rule ensemble | 0.7960 ± 1.25E-04 | 0.0498 ± 2.76E-06 | 1.4646 ± 0.73539 | 0.8781 ± 1.10E-06 | 0.8667 ± 1.14E-06 | 0.6052 ± 1.86E-06 | 0.16638 ± 3.25E-06 | 0.57684 ± 9.83E-06 | 0.90926 ± 1.15E-06
MV rule ensemble | 0.6246 ± 1.15E-05 | 0.0572 ± 2.29E-06 | 1.8358 ± 1.19124 | 0.8720 ± 3.84E-07 | 0.7837 ± 1.18E-06 | 0.7341 ± 5.31E-05 | 0.16687 ± 2.27E-06 | 0.57769 ± 6.87E-06 | 0.90897 ± 8.03E-07
NBTree classifier | 0.7275 ± 3.25E-04 | 0.0250 ± 1.85E-05 | 1.2103 ± 0.10383 | 0.9031 ± 3.52E-05 | 0.8512 ± 1.04E-04 | 0.7085 ± 3.26E-04 | 0.1711 ± 6.42E-05 | 0.49189 ± 2.62E-04 | 0.90553 ± 2.06E-05


Figure 4: Experimental results of five rules for the Naïve Bayes ensembles as applied to the I-880 dataset: (a) performance with DR; (b) performance with FAR; (c) performance with MTTD; (d) performance with CR; (e) performance with AUC; (f) performance with Kappa. (Each panel plots the indicator for the product, sum, max, min, and majority vote rules against the number of Naïve Bayes classifier ensembles, from 0 to 20.)


Figure 5: Bar chart comparison of five rules for Naïve Bayes classifier ensembles as applied to the I-880 dataset. The red bar is MAE, the green bar is RMSE, and the blue bar is EC. (For each rule, the three indicators are plotted against the number of NB classifier ensembles, from 0 to 20.)


Figure 6: Experimental results of five rules for the Naïve Bayes ensemble as applied to the AYE dataset: (a) performance with DR; (b) performance with FAR; (c) performance with MTTD; (d) performance with CR; (e) performance with AUC; (f) performance with Kappa. (Each panel plots the indicator for the product, sum, max, min, and majority vote rules against the number of Naïve Bayes classifier ensembles, from 0 to 20.)


Figure 7: Bar chart comparison of five rules for Naïve Bayes classifier ensembles as applied to the AYE dataset. The red bar is MAE, the green bar is RMSE, and the blue bar is EC. (For each rule, the three indicators are plotted against the number of NB classifier ensembles, from 0 to 20.)


is observed with RMSE. Comparing Figure 4 with Figure 6, we find that the performances of all five rules are reduced on the AYE dataset. The reason for this is that the noisy data involved in the process of training each single classifier make the final output of the NB ensemble worse. However, the performances of the sum rule, the max rule, and the min rule are reduced less, which indicates that, among the five rules, the sum rule, the max rule, and the min rule have a better ability to tolerate the noisy data.

In Tables 3 and 4, the dataset we used is extremely unbalanced. A very high CR, even exceeding 90%, does not by itself indicate good detection performance. In Table 4, the average DR value of the Naïve Bayes algorithm is 71.12% and the average DR value of the NBTree algorithm is 72.75%. Both values are less than 75%, which is unsatisfactory for practical applications. The DR values of the product rule and the majority rule are even lower (64.57% and 62.46%, resp.). These results indicate that if the average accuracy of the individual NB classifiers is lower, the average accuracy of the ensemble classifiers, which are constructed using these individual classifiers, can become even lower than the average accuracy of the individual NB classifiers under certain combination rules.

Therefore, we should avoid drawing noisy data into the NB ensemble. The standard Naïve Bayes and NBTree are both individual classifiers, and they only need to be trained once. In contrast to standard Naïve Bayes and NBTree, the NB ensemble needs to train many individual NB classifiers to construct the ensemble, so its training time is relatively long. From Figures 4(a)-4(f), we can see that, in order to obtain relatively better performance, the NB ensemble needs approximately 15 individual NB classifiers to construct the ensemble; that is, the NB ensemble algorithm needs to train 15 times. Thus, compared with the NB ensemble, the NBTree algorithm saves a large amount of training time.

    5. Conclusions

The Naïve Bayes classifier ensemble is a type of ensemble classifier based on Naïve Bayes for AID. In contrast to Naïve Bayes, the NB classifier ensemble algorithm trains many individual NB classifiers to construct the classifier ensemble and then uses this classifier ensemble to detect traffic incidents, and it avoids the burden of choosing the optimal threshold and tuning the parameters. In our research, we treat the traffic incident detection problem as a binary classification problem based on the ILD data and use the NB ensemble to divide the traffic patterns into two groups: an incident traffic pattern and a nonincident traffic pattern. In this paper, we have performed two groups of experiments to evaluate the performances of the three algorithms: standard Naïve Bayes, NB ensemble, and NBTree. In the first group of experiments, we applied all three algorithms to the I-880 dataset without noisy data. The results indicate that the performances of the five rules of the NB ensemble are significantly better than those of standard Naïve Bayes and slightly better than those of NBTree in terms of some indicators. More importantly, the NB ensemble performance is very stable. To further test the stability of the three algorithms, we applied the three algorithms to the AYE dataset with noisy data in the second group of experiments. The experimental results indicate that the NB ensemble has the best ability to tolerate the noisy data among the three algorithms. After analyzing the experimental results, we found that if the average accuracy of the individual NB classifiers is lower, the average accuracy of the ensemble classifiers constructed from these individual classifiers can become even lower than the average accuracy of the individual NB classifiers. To obtain good results for the NB ensemble classifier, we should avoid drawing noisy data into the ensemble. NBTree is an individual classifier that needs to be trained only once, whereas the NB ensemble needs to train many individual NB classifiers to construct the ensemble. As a result, compared with the NB ensemble algorithm, the NBTree algorithm reduces the time cost. The contribution of this paper is that it presents the development of a freeway incident detection model based on the Naïve Bayes classifier ensemble algorithm. The NB ensemble not only improves the performance of traffic incident detection but also enhances the stability of the performance as the number of classifiers increases. The advantage of NBTree is that its MTTD value is better than that of the NB ensemble algorithm. We believe that the NB ensemble algorithm and NBTree can be successfully utilized in traffic incident detection and other classification problems. In a future study, we will concentrate on constructing an NB ensemble and scaling up the accuracy of NBTree to detect traffic incidents.

    Conflict of Interests

    The authors declare that there is no conflict of interests regarding the publication of this paper.

    Acknowledgments

    This work is part of an ongoing research supported by the National High Technology Research and Development Program of China under Grant no. 2012AA112304 and the Research and Innovation Project for College Graduates of Jiangsu Province no. CXZZ13 0119. Qingchao Liu would like to make a grateful acknowledgment to all the members of the Traffic Engineering Lab in Southeast University, China, for their help and many useful suggestions in this study.

    References

    [1] Y.-S. Jeong, M. Castro-Neto, M. K. Jeong, and L. D. Han, "A wavelet-based freeway incident detection algorithm with adapting threshold parameters," Transportation Research C: Emerging Technologies, vol. 19, no. 1, pp. 1–19, 2011.

    [2] I. Eleni and G. Matthew, "Fuzzy-entropy neural network freeway incident duration modeling with single and competing uncertainties," Computer-Aided Civil and Infrastructure Engineering, vol. 28, no. 6, pp. 420–433, 2013.

    [3] F. Yuan and R. L. Cheu, "Incident detection using support vector machines," Transportation Research C: Emerging Technologies, vol. 11, no. 3-4, pp. 309–328, 2003.

    [4] J. Xiao and Y. Liu, "Traffic incident detection using multiple kernel support vector machine," Transportation Research Record, vol. 2324, pp. 44–52, 2012.

    [5] X. Jin, D. Srinivasan, and R. L. Cheu, "Classification of freeway traffic patterns for incident detection using constructive probabilistic neural networks," IEEE Transactions on Neural Networks, vol. 12, no. 5, pp. 1173–1187, 2001.

    [6] D. L. Han and A. D. May, Automatic Detection of Traffic Operational Problems on Urban Arterials, Institute of Transportation Studies, University of California, Berkeley, Calif, USA, 1989.

    [7] Y. J. Stephanedes and G. Vassilakis, "Intersection incident detection for IVHS," in Proceedings of the 74th Annual Meeting of the Transportation Research Board, Washington, DC, USA, 1994.

    [8] N. Hoose, M. A. Vicencio, and X. Zhang, "Incident detection in urban roads using computer image processing," Traffic Engineering and Control, vol. 33, no. 4, pp. 236–244, 1992.

    [9] J. Lu, S. Chen, W. Wang, and H. van Zuylen, "A hybrid model of partial least squares and neural network for traffic incident detection," Expert Systems with Applications, vol. 39, no. 5, pp. 4775–4784, 2012.

    [10] N.-E. E. Faouzi, H. Leung, and A. Kurian, "Data fusion in intelligent transportation systems: progress and challenges – a survey," Information Fusion, vol. 12, no. 1, pp. 4–10, 2011.

    [11] B. Twala, "Multiple classifier application to credit risk assessment," Expert Systems with Applications, vol. 37, no. 4, pp. 3326–3336, 2010.

    [12] J. Kittler, M. Hatef, R. P. W. Duin, and J. Matas, "On combining classifiers," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 3, pp. 226–239, 1998.

    [13] Y. Bi, J. Guan, and D. Bell, "The combination of multiple classifiers using an evidential reasoning approach," Artificial Intelligence, vol. 172, no. 15, pp. 1731–1751, 2008.

    [14] T. G. Dietterich, "Machine-learning research: four current directions," The AI Magazine, vol. 18, no. 4, pp. 97–136, 1997.

    [15] L. K. Hansen and P. Salamon, "Neural network ensembles," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 10, pp. 993–1001, 1990.

    [16] R. Kohavi, "Scaling up the accuracy of Naive-Bayes classifiers: a decision tree hybrid," in Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, pp. 202–207, Portland, Ore, USA, 1996.

    [17] G. H. John and P. Langley, "Estimating continuous distributions in Bayesian classifiers," in Proceedings of the 11th Conference on Uncertainty in Artificial Intelligence (UAI '95), pp. 338–345, Montreal, Canada, 1995.

    [18] E. Parkany and C. Xie, "A complete review of incident detection algorithms and their deployment: what works and what doesn't," Tech. Rep. NETCR37, Transportation Center, University of Massachusetts.

    [19] S. Chen and W. Wang, "Decision tree learning for freeway automatic incident detection," Expert Systems with Applications, vol. 36, no. 2, pp. 4101–4105, 2009.

    [20] T. Fawcett, "An introduction to ROC analysis," Pattern Recognition Letters, vol. 27, no. 8, pp. 861–874, 2006.

    [21] J. A. Hanley and B. J. McNeil, "The meaning and use of the area under a receiver operating characteristic (ROC) curve," Radiology, vol. 143, no. 1, pp. 29–36, 1982.

    [22] D. J. Hand and R. J. Till, "A simple generalisation of the area under the ROC curve for multiple class classification problems," Machine Learning, vol. 45, no. 2, pp. 171–186, 2001.

    [23] R. L. Cheu, D. Srinivasan, and W. H. Loo, "Training neural networks to detect freeway incidents by using particle swarm optimization," Transportation Research Record, vol. 1867, pp. 11–18, 2004.

    [24] D. Srinivasan, X. Jin, and R. L. Cheu, "Adaptive neural network models for automatic incident detection on freeways," Neurocomputing, vol. 64, pp. 473–496, 2005.
