Research Article
Fuzzy Clustering-Based Ensemble Approach to Predicting Indian Monsoon

Moumita Saha,1 Pabitra Mitra,1 and Arun Chakraborty2

1 Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, Paschim Medinipur, West Bengal 721302, India
2 Centre for Oceans, Rivers, Atmosphere and Land Sciences, Indian Institute of Technology Kharagpur, Kharagpur, Paschim Medinipur, West Bengal 721302, India

Correspondence should be addressed to Moumita Saha; moumitasaha2012@gmail.com

Received 2 January 2015; Revised 31 March 2015; Accepted 3 April 2015

Academic Editor: Xiaolong Jia

Hindawi Publishing Corporation, Advances in Meteorology, Volume 2015, Article ID 329835, 12 pages, http://dx.doi.org/10.1155/2015/329835

Copyright © 2015 Moumita Saha et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Indian monsoon is an important climatic phenomenon and a global climatic marker. Both statistical and numerical prediction schemes for Indian monsoon have been widely studied in the literature. Statistical schemes are mainly based on regression or neural networks. However, the variability of monsoon is significant over the years, and a single model is often inadequate. Meteorologists revise their models for different years based on prevailing global climatic incidents like El-Niño. These incidents often have a degree of severity associated with them. In this paper, we cluster the monsoon years based on their fuzzy degree of associativity to these climatic event patterns. Next, we develop individual prediction models for the year clusters. A weighted ensemble of these individual models is used to obtain the final forecast. The proposed method performs competitively with existing forecast models.

1. Introduction

Monsoon is a complex phenomenon of the climatic system. It is influenced by multiple climatic parameters and sea-atmosphere interactions. Prediction of monsoon is challenging due to the large variability present in its patterns. The Indian Meteorological Department (IMD) has issued forecasts of Indian summer monsoon rainfall (ISMR) since 1886. Indian monsoon forecasting was initiated by Blanford [1] as early as 1882. The success of the forecasts in the span 1882–1885 encouraged Blanford to design an operational long-range forecast model for monsoon in 1886. Subsequently, Walker [2] developed models studying the statistical correlations between rainfall and different global climate parameters. Thapliyal and Kulshrestha [3] introduced a regression model for predicting south-west Indian monsoon rainfall. Gowariker et al. [4] proposed a power regression model for long-term forecast of monsoon, which provided accurate forecasts for a long period but failed to predict the extreme condition of 2002.

In 2004, Rajeevan et al. [5] reassessed different climatic parameters and introduced four new parameters to design a statistical model for issuing long-range forecasts of Indian monsoon. Subsequently, in 2007, Rajeevan et al. [6] built models using ensemble multiple regression and projection pursuit regression to forecast Indian rainfall, which proved to be superior to past IMD models. Schewe and Levermann [7] explain the change in the distribution of Indian rainfall and also the reasons behind the failure of monsoon in certain years. Wu et al. [8] propose a linear Markov model to predict short-term climate variability of the East Asian monsoon. Fan et al. [9] develop two statistical prediction schemes for seasonal forecast of the East Asian summer monsoon. The schemes take the direct outputs of existing models and give better prediction of the summer monsoon.

Artificial neural networks (ANN) [10] are widely used in modelling the nonlinearity present in the monsoon process. Sahai et al. [11] use ANN techniques with error backpropagation to forecast Indian summer monsoon rainfall. Hong [12] predicts Indian summer monsoon utilizing a recurrent neural network and also demonstrates successful employment of support vector machines in solving nonlinear regression and time series problems. Three different backpropagation neural learning rules, namely, momentum learning, conjugate gradient descent learning, and Levenberg-Marquardt learning, are used by S. Chattopadhyay and G. Chattopadhyay [13] to perform a comparative study of different neural network methods for predicting rainfall time series.

The large variability in monsoon patterns makes it difficult for a single model to predict its distribution. A number of uncertainties, including boundary condition, parameter, and structural uncertainties, are involved in the construction of these models. Thus, it remains fundamentally challenging to rely on a single model for prediction. Multimodel ensembles, which combine the outcomes of different models to produce efficient results, are proposed to overcome the weakness of a single model [14, 15]. In addition, monsoon shows different characteristics over the years. There exist groups of years in which the variation of climatic parameters and the pattern of rainfall are similar. We use fuzzy clustering to cluster similar years together and model them separately. The motivation behind using fuzzy clustering is that each year manifests a mixture of physical climatic events. We cannot hard-cluster a year into one specific group; each year has a membership of belongingness to every cluster. Fuzzy clustering is used to capture the characteristics of the different events related to a year of study. We use the same set of climatic parameters as the predictor set for every cluster but frame a different model for each cluster.

A number of prediction models, namely, multiple regression (MR), multilayer perceptron (MLP), recurrent neural network (RNN), and generalized regression neural network (GRNN) models, are used for prediction of Indian monsoon for the year clusters. There are sound reasons for using neural networks like MLP, RNN, and GRNN for modelling: (i) Indian monsoon is a complex process which cannot be adequately modelled by linear models; (ii) nonlinearity in the time-series pattern can be well captured by neural network learning; (iii) climatic events are much more closely related to disturbances of the parameters in nearby years than in distant years, and a neural network enables attaching weights to the yearly parameters in an appropriate manner.

In this work, climatic parameters that are strongly correlated with Indian monsoon are identified at the outset, followed by fuzzy clustering of years into groups with a degree of belongingness of each year to the clusters. Then, we model each cluster with four types of models, namely, MR, MLP, RNN, and GRNN, to forecast rainfall. The weighted ensemble of the forecasts given by the respective models for each cluster is taken as the final predicted rainfall. Analysis and comparisons are performed on aggregate Indian rainfall, and finally a meteorological interpretation of the obtained clusters is presented.

The paper is organised as follows. Details of the data and the predictor climatic parameters are discussed in Sections 2 and 3. The proposed clustering-based approach, prediction models, and ensemble technique are presented in Section 4, with experimental results in Section 5. Meteorological significance is discussed in Section 6, and finally conclusions are provided in Section 7.

2. Data Sets Used

We consider the annual Indian summer monsoon rainfall (ISMR) occurring in the four months of June, July, August, and September. Annual ISMR for the period 1948–2013 is considered in our study. The long period average (LPA) (1948–2013) of ISMR is 891.8 mm. ISMR is expressed as a percentage of the LPA value. The data is obtained from the Indian Institute of Tropical Meteorology, Pune (http://www.imdpune.gov.in/research/ncc/longrange/data/data.html) [16].

Data for the predictor parameters sea level pressure (SLP) (http://www.esrl.noaa.gov/psd/gcos_wgsp/Gridded/data.noaa.erslp.html) and sea surface temperature (SST) (http://www.esrl.noaa.gov/psd/data/gridded/data.noaa.ersst.html) are provided by the NOAA/OAR/ESRL PSD at a spatial resolution of 2° × 2° [17]. Surface pressure (SP) and zonal wind velocity (WV) data are collected from the NCEP Reanalysis Derived data provided by the NOAA/OAR/ESRL PSD (http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanalysis.derived.surface.html) [18], available at a resolution of 2.5° × 2.5°. Finally, Niño 3.4 data, which is the sea surface temperature anomaly for the spatial coverage of 5°S to 5°N and 170°W to 120°W in the Pacific Ocean region, is acquired from the National Center for Atmospheric Research (http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/ensoyears.shtml) [19]. All the above monthly data are considered for the period 1948–2013 in our study and analysis.

3. Global Climatic Parameters Influencing Indian Monsoon

Indian monsoon is strongly influenced by several global climatic parameters occurring at places distant from the Indian subcontinent. Identification of predictor parameters relies on a physical understanding of the monsoon event and of wind flow patterns. We have selected the climatic parameters based on those used in the Indian Meteorological Department's models [5, 6], studying their correlation with Indian summer monsoon rainfall (ISMR) during our period of study (1948–2013). In the data preprocessing phase, climatic anomaly data are evaluated by calculating the deviation of each parameter value from the long-term average of that parameter, separately for each month. This is followed by a correlation study between ISMR and the climatic parameters for lags of zero to twelve months. For each parameter, we take as predictor the lagged month having the highest correlation with ISMR. The predictor climatic parameters and their correlation values with Indian monsoon are shown in Table 1. Figure 1 shows the geographic locations of the climatic parameters influencing Indian monsoon.

Predictor Sets of Climatic Parameters. Based on the correlation with Indian monsoon, we have built five predictor sets for forecasting. Different combinations of the identified climatic parameters (Table 1) form the predictor sets. The predictor sets are shown in Table 2.
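The anomaly and lag-correlation preprocessing described in this section can be sketched as follows. This is only an illustrative sketch, not the authors' code: the function names are hypothetical, the input is assumed to be a (years × 12) array of monthly values, June is assumed as the reference month from which lags are counted, and "best" is taken to mean the largest absolute Pearson correlation.

```python
import numpy as np

def monthly_anomaly(values):
    """Deviation of each monthly value from the long-term mean of the
    same calendar month. `values` has shape (n_years, 12)."""
    values = np.asarray(values, dtype=float)
    return values - values.mean(axis=0, keepdims=True)

def best_lag(anomaly, rainfall, max_lag=12):
    """Correlate the seasonal rainfall of each year with the anomaly observed
    `lag` months before June of that year (June is an assumed reference month).
    Returns the (lag, correlation) pair with the largest absolute correlation."""
    anomaly = np.asarray(anomaly, dtype=float)
    rainfall = np.asarray(rainfall, dtype=float)
    series = anomaly.reshape(-1)              # flatten to one monthly time series
    n_years = anomaly.shape[0]
    best = (None, 0.0)
    for lag in range(max_lag + 1):
        idx = np.arange(n_years) * 12 + 5 - lag   # month index 5 is June
        valid = idx >= 0                          # drop years with no earlier data
        if valid.sum() < 3:
            continue
        r = np.corrcoef(series[idx[valid]], rainfall[valid])[0, 1]
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best
```

Under these assumptions, a lag of 1 corresponds to May of the same year and a lag of 9 to September of the previous year, which mirrors the "(0)" and "(−1)" notation used for the correlated months.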


Table 1: Climatic parameters (CP) influencing Indian monsoon with geographical location, correlation values, and correlated month (0 signifies the same year and −1 signifies the previous year).

CP     CP name                                             Location                  Correlation value   Correlated month
CP1    North Atlantic Ocean SST anomaly                    20°N–30°N, 100°W–80°W     0.242               Jan (0)
CP2    North Atlantic Ocean surface pressure anomaly       20°N–30°N, 100°W–80°W     0.256               April (0)
CP3    East Asia SLP anomaly                               35°N–45°N, 120°E–130°E    0.337               May (0)
CP4    East Asia surface pressure anomaly                  35°N–45°N, 120°E–130°E    0.341               Mar (0)
CP5    Equatorial South Eastern Indian Ocean SST anomaly   20°S–10°S, 100°E–120°E    0.200               Sept (−1)
CP6    Pressure gradient between Madagascar and Tibet      -                         0.253               May (0)
CP7    Niño 3.4 SST anomaly                                5°S–5°N, 170°W–120°W      0.311               Sept (−1)
CP8    Equatorial Pacific Ocean SLP anomaly                5°S–5°N, 120°E–80°W       0.272               Aug (−1)
CP9    North West Europe surface pressure anomaly          55°N–65°N, 20°E–40°E      0.183               Jan (0)
CP10   North Central Pacific zonal wind anomaly            5°N–15°N, 180°E–150°W     0.457               May (0)

Figure 1: Climatic parameters over the globe governing Indian monsoon (purple patches signify the locations of the climatic parameters taken, and the blue patch represents the Indian region). CP$i$ represents parameter $i$ in Table 1.

Table 2: Predictor sets with climatic parameters.

Predictor set   Climatic parameters
PredSet1        CP1, CP4, CP5, CP6
PredSet2        CP4, CP5, CP6, CP7
PredSet3        CP2, CP4, CP10
PredSet4        CP2, CP4, CP7, CP10
PredSet5        CP3, CP7, CP8, CP9

4. Methodology

We propose fuzzy clustering of monsoon years into groups, followed by building models for each group separately, and finally predicting Indian summer monsoon rainfall (ISMR) as a weighted ensemble of the forecasts provided by the cluster models. The block diagram of the proposed fuzzy clustering-based approach to prediction of ISMR is shown in Figure 2. Detailed steps are described in the following subsections.

4.1. Motivation: Variability of Monsoon Patterns. Trends and distributions of monsoon vary to a large extent over the years. It is thus necessary to group the years into clusters which have similar patterns of the predictor climatic parameters affecting monsoon. Clustering the years is effective because we can build a separate model for each cluster. These cluster models are expected to be more accurate, as the variation within a cluster is smaller. Finally, an ensemble of the forecasts of these cluster models results in better prediction of Indian monsoon. As an example, consider two clusters of years corresponding to strong El-Niño and North Atlantic Oscillation, respectively. A drought year has correlation with both events and hence might have a significant degree of belongingness to both clusters.

4.2. Fuzzy Clustering of Monsoon Years. Fuzzy $c$-means clustering is used for grouping similar years together. Fuzzy $c$-means (FCM) is a clustering method which allows one input instance to belong to more than one cluster, each with some membership of belongingness. FCM attempts to partition a set of $N$ elements $Y = \{y_1, \ldots, y_N\}$ into a collection of $c$ fuzzy clusters with centers $C = \{\mathrm{cen}_1, \ldots, \mathrm{cen}_c\}$ and a partition matrix $W = [w_{ij}]$, $w_{ij} \in [0, 1]$, $i = 1, \ldots, N$, $j = 1, \ldots, c$, where $w_{ij}$ gives the degree of belongingness of element $y_i$ to the cluster with center $\mathrm{cen}_j$. FCM aims to minimize the objective function (1). The partition matrix and the centers are updated in accordance with (2) and (3), respectively:

$$J_m = \sum_{i=1}^{N} \sum_{j=1}^{c} w_{ij}^{m} \left\| y_i - \mathrm{cen}_j \right\|^2, \quad 1 \le m \le \infty, \tag{1}$$

$$w_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \dfrac{\left\| y_i - \mathrm{cen}_j \right\|}{\left\| y_i - \mathrm{cen}_k \right\|} \right)^{2/(m-1)}}, \tag{2}$$

$$\mathrm{cen}_j = \frac{\sum_{i=1}^{N} w_{ij}^{m} \cdot y_i}{\sum_{i=1}^{N} w_{ij}^{m}}, \tag{3}$$

where $m$ denotes the level of cluster fuzziness (the update (2) requires $m > 1$).
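The alternating updates of equations (2) and (3) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function name, the random initialisation, the default $m = 2$, and the stopping tolerance are all assumptions made for the example.

```python
import numpy as np

def fuzzy_c_means(Y, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Fuzzy c-means on the rows of Y (shape N x d).
    Returns (centers, W), where W[i, j] is the membership of y_i in cluster j."""
    Y = np.asarray(Y, dtype=float)
    N = Y.shape[0]
    rng = np.random.default_rng(seed)
    W = rng.random((N, c))
    W /= W.sum(axis=1, keepdims=True)          # each row of W sums to 1
    for _ in range(n_iter):
        Wm = W ** m
        # Equation (3): centers are membership-weighted means of the data.
        centers = (Wm.T @ Y) / Wm.sum(axis=0)[:, None]
        # Distances ||y_i - cen_j||, clipped to avoid division by zero.
        dist = np.linalg.norm(Y[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)
        # Equation (2): w_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1)).
        ratio = (dist[:, :, None] / dist[:, None, :]) ** (2.0 / (m - 1.0))
        W_new = 1.0 / ratio.sum(axis=2)
        converged = np.abs(W_new - W).max() < tol
        W = W_new
        if converged:
            break
    return centers, W
```

In the setting of this paper, each row of `Y` would hold the predictor-set values of one year, and `c` would be 3.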


Figure 2: Proposed fuzzy clustering-based ensemble approach for prediction of Indian summer monsoon rainfall. [Block diagram. Input: climate parameters (CP) and rainfall. Data preprocessing: climate anomaly data evaluation; correlation study of CP and rainfall; identification of the month of each CP highly correlated to rainfall; building predictor sets with different CP combinations. Clustering years: clustering the predictor set using fuzzy clustering; applying the alpha-cut to obtain clusters and membership values. Ensemble neural network model: forecast by each model (MR, MLP, RNN, GRNN) for every cluster; ensemble of the outputs of all clusters. Validation: error calculation; comparison with existing models; comparison with the nonclustered method. Cluster interpretation: physical interpretation of clusters (support, confidence, threshold). Output: forecast of rainfall; error in forecasting; model comparison results; climatic events associated with each cluster.]

4.3. Prediction Models. Multiple regression and three models of artificial neural networks (ANN), namely, multilayer perceptron, recurrent neural network, and generalized regression neural network, are used to design prediction models for each cluster exclusively. A forecast of annual ISMR is provided by each cluster model separately and also by the ensemble of the forecasts of all the cluster models. We describe the models used below.

Table 3: Model parameter settings for MLP models.

Parameter set   Hidden layers   Training years   Training method
ParSet1         [3, 5]          20               BFGS quasi-Newton backpropagation
ParSet2         [3, 5, 10]      15               Conjugate gradient backpropagation with Powell-Beale restarts
ParSet3         [5, 10]         10               Scaled conjugate gradient backpropagation
ParSet4         [3, 5]          15               Resilient backpropagation

4.3.1. Multiple Regression (MR). A multiple regression model is used to learn the relationship between several independent predictor variables ($X_i$s) and a dependent variable ($Y$). A multiple regression model having $p$ independent variables is shown in

$$y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i, \tag{4}$$

where $x_{ij}$ is the $i$th observation of the $j$th independent variable, the first independent variable takes the value 1 for all $i$, and $\varepsilon_i$ represents the residual.
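As an illustration of equation (4), the coefficients $\beta$ can be estimated by ordinary least squares. The sketch below assumes that estimator (the text does not state how the regression is fitted) and uses hypothetical function names; the constant first variable is added as a column of ones.

```python
import numpy as np

def fit_mr(X, y):
    """Least-squares fit of equation (4). X has one row per training year and
    one column per climatic predictor; a column of ones is prepended so that
    the first independent variable equals 1 for every observation."""
    X = np.asarray(X, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return beta

def predict_mr(beta, X):
    """Apply the fitted coefficients to new predictor rows."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    return np.column_stack([np.ones(len(X)), X]) @ beta
```

In the clustered setting, one such model would be fitted per cluster, using only the training years assigned to that cluster.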

4.3.2. Multilayer Perceptron Neural Network (MLP). A multilayer perceptron neural network is a class of ANN in which connections between the neurons do not form a directed cycle. In this network, information propagates in only one direction, from the input nodes through the hidden nodes to the output nodes. The independent and dependent variables constitute the input and output layers, respectively. The number of hidden layers and the corresponding number of nodes must be determined empirically for each prediction task. Four different parameter sets, chosen empirically for the models designed to forecast ISMR, are shown in Table 3.

4.3.3. Recurrent Neural Network (RNN). A recurrent neural network is a class of ANN which maintains an internal state of the network to exhibit dynamic temporal behaviour. Climatic changes or events occurring in the same or nearby time periods are highly correlated. Similarly, rainfall patterns are more strongly correlated with influencing factors in nearby years than in distant years. This phenomenon is well captured by an RNN, which during training gives weights in decreasing order to the values from near to distant years. Thus, it assists in modelling the system dynamics in a more natural manner. The same parameter sets as for the MLP network (Table 3) are considered, with a delay span of 2 units.

4.3.4. Generalized Regression Neural Network (GRNN). A generalized regression neural network is a variant of the radial basis function network. A GRNN has three layers of artificial neurons: input, hidden, and output. The hidden layer has radial basis neurons, while the neurons in the output layer have a linear transfer function. The output of the radial basis neurons is computed from the input scaled by the spread factor $\sigma$. Given $p$ input-output pairs $(x_i, y_i) \in \mathbb{R}^n \times \mathbb{R}^1$ with $n$ input variables and $i = 1, 2, \ldots, p$, $y_i$ represents the output from each hidden unit. The GRNN output for a test point $x \in \mathbb{R}^n$ is described by

$$y(x) = \sum_{i=1}^{p} W_i y_i, \tag{5}$$

where

$$W_i = \frac{\exp\left( -\left\| x - x_i \right\|^2 / 2\sigma^2 \right)}{\sum_{k=1}^{p} \exp\left( -\left\| x - x_k \right\|^2 / 2\sigma^2 \right)}. \tag{6}$$

The reasons behind modelling using GRNN are (i) only one tunable design parameter (the spread factor), (ii) a one-pass algorithm (less time consuming), and (iii) accurate approximation of functions from sparse data.
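Equations (5) and (6) amount to a kernel-weighted average of the training outputs, which the following sketch makes explicit. The function name is hypothetical, and the subtraction of the minimum squared distance is a numerical-stability detail added here; it cancels in the ratio of equation (6) and does not change the weights.

```python
import numpy as np

def grnn_predict(x, X_train, y_train, sigma):
    """GRNN output for a single test point x, following equations (5) and (6)."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    # Squared distances ||x - x_i||^2 to every training input.
    d2 = np.sum((X_train - np.asarray(x, dtype=float)) ** 2, axis=1)
    # Gaussian kernel; shifting by d2.min() avoids underflow for small sigma.
    k = np.exp(-(d2 - d2.min()) / (2.0 * sigma ** 2))
    W = k / k.sum()                      # equation (6): weights sum to 1
    return float(W @ y_train)            # equation (5)
```

A small `sigma` makes the prediction follow the nearest training year, while a large `sigma` pulls it toward the mean of all training outputs; this is the single tunable parameter mentioned in reason (i).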

The optimal number of training years is ascertained for the MR and GRNN models by varying the number of training years from 5 to 30 and selecting the value with the least absolute prediction error during the validation period (1984–1993). A training span of $m$ years specifies that, for predicting the rainfall of the $r$th year, those of the preceding $m$ years $r - 1, r - 2, \ldots, r - m$ that are present in a particular cluster are considered for training.
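One possible reading of this training-window rule is sketched below. It assumes that the window covers the calendar years $r - m$ to $r - 1$ and that only the cluster members among them are used; both helper names are hypothetical, and `validation_mae` stands for any user-supplied function that returns the validation error for a given span.

```python
from typing import Callable, Iterable, List

def training_years(r: int, m: int, cluster_years: Iterable[int]) -> List[int]:
    """Years among r-m, ..., r-1 that belong to the given cluster."""
    members = set(cluster_years)
    return [year for year in range(r - m, r) if year in members]

def best_training_span(validation_mae: Callable[[int], float],
                       spans: Iterable[int] = range(5, 31)) -> int:
    """Span (number of training years, 5 to 30) with the smallest validation error."""
    return min(spans, key=validation_mae)
```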

4.4. Ensemble of Predictors. The complexity of the monsoon process makes it difficult for a single model to predict rainfall accurately. We design separate models for each cluster of years obtained by fuzzy clustering, using the four predictors described in Section 4.3. Finally, annual ISMR is presented as a weighted ensemble of the forecasts of the models designed for each cluster. The weight is taken as the fuzzy membership of belongingness of the test year to the different clusters:

$$\text{Ensemble prediction}_t = \sum_{i=1}^{c} W_i^t \cdot P_i, \tag{7}$$

where $P_i$ represents the prediction given by the model for cluster $i$, $W_i^t$ is the fuzzy membership of the $t$th test year to cluster $i$, and $c$ is the total number of clusters.
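Equation (7) is a membership-weighted sum of the cluster forecasts, as the short sketch below shows. The function name and the numbers in the usage note are illustrative assumptions, not values from the paper.

```python
import numpy as np

def ensemble_forecast(memberships, cluster_predictions):
    """Equation (7): sum over clusters of W_i^t * P_i for one test year."""
    w = np.asarray(memberships, dtype=float)
    p = np.asarray(cluster_predictions, dtype=float)
    if w.shape != p.shape:
        raise ValueError("one membership value is needed per cluster prediction")
    return float(np.dot(w, p))
```

For example, with hypothetical memberships (0.2, 0.5, 0.3) and cluster forecasts of 95, 102, and 88 percent of LPA, the ensemble forecast is about 96.4 percent of LPA. Because fuzzy $c$-means memberships of a year sum to 1, the result is a convex combination of the cluster forecasts.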

4.5. Validation of Proposed Approach. The study is performed on data for the period 1948–2013. Fuzzy clustering is performed over this period to cluster the years into three groups. The number of clusters is decided based on cluster quality. Separate prediction models are designed for all three clusters, and the ensemble of the forecasts of these models is provided as the predicted Indian summer monsoon rainfall. The test period 2001–2013 is considered to evaluate the forecasting skill of our proposed approach.

The forecast models for annual ISMR are chiefly evaluated in terms of mean absolute error. Other error statistics, namely, root mean square error, prediction yields, Pearson correlation, and Willmott index of agreement, are also evaluated to judge the efficacy of our proposed approach for prediction. They are described below.


(i) Mean Absolute Error (MAE). Mean absolute error for prediction of annual ISMR is calculated in the following way:

$$\mathrm{MAE} = \frac{\sum_{i=1}^{N} \left| Y_i - X_i \right|}{N}, \tag{8}$$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $N$ denotes the total number of test years.

(ii) Root Mean Square Error (RMSE). Root mean square error aggregates the squared differences between the model-predicted output and the actual values. It is a good measure for comparing the forecasting errors of various models:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} \left( Y_i - X_i \right)^2}{N}}. \tag{9}$$

(iii) Prediction Yield (PY). Prediction yields are evaluated at three different error categories (5%, 10%, and 15% errors) to assess the overall prediction results by judging the percentage of predicted years within each allowed range of error.

(iv) Pearson Correlation Coefficient (PC). The Pearson correlation coefficient measures the strength of the linear association between actual and predicted values, where a value of 1 means a perfect positive correlation and a value of −1 means a perfect negative correlation:

$$\mathrm{PC} = \frac{\sum_{i=1}^{N} \left( X_i - \bar{X} \right) \left( Y_i - \bar{Y} \right)}{\sqrt{\sum_{i=1}^{N} \left( X_i - \bar{X} \right)^2} \sqrt{\sum_{i=1}^{N} \left( Y_i - \bar{Y} \right)^2}}, \tag{10}$$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $\bar{X}$ and $\bar{Y}$ are their corresponding means.

(v) Willmott Index of Agreement (WI). The Willmott index of agreement is a standardized measure of the degree of model prediction error. It varies between 0 and 1, with higher values indicating a better fit of the model for prediction:

$$\text{Index of agreement} = 1 - \frac{\sum_{i=1}^{N} \left| X_i - Y_i \right|^2}{\sum_{i=1}^{N} \left( \left| Y_i - \bar{X} \right| + \left| X_i - \bar{X} \right| \right)^2}. \tag{11}$$
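The five evaluation measures above can be computed as in the following sketch. The function names are hypothetical; for the prediction yield, the error of a year is assumed to be the absolute difference between predicted and actual ISMR in the units of the series (percent of LPA), which is one plausible reading of the "allowed range of errors".

```python
import numpy as np

def mae(x, y):
    """Equation (8): mean absolute error between actual x and predicted y."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return float(np.mean(np.abs(y - x)))

def rmse(x, y):
    """Equation (9): root mean square error."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return float(np.sqrt(np.mean((y - x) ** 2)))

def prediction_yield(x, y, tol):
    """Percentage of years whose absolute error is at most tol (assumed definition)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return float(100.0 * np.mean(np.abs(y - x) <= tol))

def pearson(x, y):
    """Equation (10): Pearson correlation coefficient."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    dx, dy = x - x.mean(), y - y.mean()
    return float(np.sum(dx * dy) / (np.sqrt(np.sum(dx ** 2)) * np.sqrt(np.sum(dy ** 2))))

def willmott(x, y):
    """Equation (11): Willmott index of agreement, centred on the actual mean."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xbar = x.mean()
    num = np.sum(np.abs(x - y) ** 2)
    den = np.sum((np.abs(y - xbar) + np.abs(x - xbar)) ** 2)
    return float(1.0 - num / den)
```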

5. Experimental Results and Analysis

In this section, we present the evaluation of our proposed fuzzy clustering-based approach. We first present the results of fuzzy clustering of the monsoon years for the different predictor sets. Forecasting skill is evaluated for all cluster models and the ensemble model in terms of mean absolute error for the test period 2001–2013. In addition, other measures, namely, root mean square error in prediction, correlation between predicted and actual rainfall, prediction yields, and agreement index between actual and predicted rainfall, are also estimated to establish the efficiency of our proposed approach to prediction of Indian summer monsoon rainfall.

Table 4: Cluster size (number of years) by fuzzy $c$-means clustering with $\alpha$-cut of 0.3 over the period 1948–2013.

Predictor set   Cluster1   Cluster2   Cluster3
PredSet1        16         38         30
PredSet2        30         17         40
PredSet3        32         14         38
PredSet4        42         31         21
PredSet5        15         37         26

5.1. Clustering of Monsoon Years. Fuzzy clustering is performed over the period 1948–2013 to cluster the data into three clusters. We have performed an $\alpha$-cut with value $\alpha = 0.3$ to assign the data instances to the clusters. The value is ascertained empirically such that the distribution of elements among the clusters is regular. A data instance can be assigned to more than one cluster simultaneously, which is why the cluster sizes for a predictor set can sum to more than the 66 years in the period. The cluster sizes for the various predictor sets are shown in Table 4.
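The $\alpha$-cut assignment can be sketched as follows. The function name and the membership values in the usage note are hypothetical, and membership "at least $\alpha$" is assumed as the cut criterion.

```python
import numpy as np

def alpha_cut(W, years, alpha=0.3):
    """Assign each year to every cluster in which its fuzzy membership is at
    least alpha. W has one row per year and one column per cluster.
    Returns one list of years per cluster."""
    W = np.asarray(W, dtype=float)
    return [[year for year, w in zip(years, W[:, j]) if w >= alpha]
            for j in range(W.shape[1])]
```

With hypothetical memberships (0.6, 0.3, 0.1), (0.35, 0.35, 0.30), and (0.1, 0.2, 0.7) for three years, the second year passes the cut in all three clusters, which illustrates how a single year can be counted in several clusters.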

5.2. Prediction Accuracy. We predict annual rainfall for all five predictor sets (Table 2) separately using four models, namely, MR, MLP, RNN, and GRNN. The test period is 2001–2013.

5.2.1. Multiple Regression Model (MR). Multiple regression models are built for every cluster after ascertaining the optimal training period for each predictor set. The optimal training period is found by varying the number of training years and selecting the value with the least absolute prediction error during the validation period (1984–1993). Individual cluster-based models as well as the weighted ensemble model are considered for prediction. Table 5 gives the mean absolute error of the individual cluster-based models and the ensemble model for the test period 2001–2013. The ensemble model attains its lowest mean absolute error of 6.2% for PredSet4 (Table 2). The ensemble model outperforms all the single cluster models for PredSet1 and PredSet4 and stays within one percentage point of the best single cluster model for the other predictor sets. Figure 3 shows the interannual variability of actual and ensemble-predicted rainfall as a percentage of the long period average (LPA).

5.2.2. Multilayer Perceptron Neural Network Model (MLP). The multilayer perceptron neural network model is designed with the four different sets of parameters described in Table 3. Mean absolute errors of all the cluster models and the ensemble model are shown in Table 6. The MLP model reports an error of 4.0% for PredSet4 (Table 2) with MLP parameters ParSet1 (Table 3). The actual rainfall and the rainfall predicted by the models built for the clusters and by the ensemble model are shown in Figure 4. The ensemble-predicted rainfall closely follows the actual rainfall.

Table 5: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MR cluster models and the ensemble model for test period 2001–2013. Reports minimum error of 6.2%.

Predictor set   Training years   Cluster1 error (%)   Cluster2 error (%)   Cluster3 error (%)   Ensemble error (%)
PredSet1        20               9.4                  9.3                  10.9                 8.6
PredSet2        20               11.0                 7.5                  9.4                  8.3
PredSet3        15               10.9                 6.5                  9.2                  6.7
PredSet4        15               10.4                 10.1                 6.8                  6.2
PredSet5        15               7.6                  8.5                  8.4                  7.9

Table 6: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MLP cluster models and the ensemble model for test period 2001–2013. Reports minimum error of 4.0%.

Predictor set   Parameter set   Cluster1 error (%)   Cluster2 error (%)   Cluster3 error (%)   Ensemble error (%)
PredSet1        ParSet4         13.8                 18.1                 16.9                 8.2
PredSet2        ParSet3         16.0                 7.9                  11.0                 5.2
PredSet3        ParSet1         8.0                  7.8                  6.5                  6.5
PredSet4        ParSet1         9.3                  10.7                 4.5                  4.0
PredSet5        ParSet1         8.5                  15.3                 13.7                 11.0

Figure 3: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its respective three cluster models by MR for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by the individual cluster models. The results are shown for test period 2001–2013. [Bar chart. Horizontal axis: test years, 2001 to 2013. Vertical axis: rainfall as % of LPA. Legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.]

5.2.3. Recurrent Neural Network Model (RNN). Mean absolute errors for prediction of annual rainfall by the recurrent neural network model for the test period 2001–2013 are presented in Table 7. PredSet3 (Table 2) with RNN parameters ParSet1 (Table 3) gives an error of 5.1%. The RNN assigns weights to the training years in decreasing order of their distance from the test year. The pattern of actual and ensemble-predicted rainfall in terms of percentage of LPA is shown in Figure 5.

5.2.4. Generalized Regression Neural Network Model (GRNN). Generalized regression neural network ensemble and

Figure 4: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three cluster models by MLP for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by the individual cluster models. The results are shown for test period 2001–2013. (Axes: test years 2001–2013 versus rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.)

individual cluster models' errors in terms of mean absolute errors are presented in Table 8. The model reports an error of 6.1% for PredSet3 (Table 2). Figure 6 shows the interannual variations of the rainfall forecast by the GRNN ensemble model along with the actual rainfall pattern in terms of percentage of LPA for the period 2001–2013. It is observed that the predicted values are close to the actual rainfall patterns. Predictions by the models designed for the clusters are shown by different symbols.


Table 7: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual RNN cluster models and the ensemble model for test period 2001–2013. The minimum error reported is 5.1%.

| Predictor set | Parameter set | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | ParSet1 | 11.3 | 7.1 | 16.8 | 7.0 |
| PredSet2 | ParSet1 | 13.2 | 13.5 | 12.6 | 8.5 |
| PredSet3 | ParSet1 | 12.9 | 5.4 | 6.0 | 5.1 |
| PredSet4 | ParSet1 | 12.3 | 6.4 | 4.7 | 5.9 |
| PredSet5 | ParSet2 | 15.1 | 16.1 | 13.4 | 8.8 |

Table 8: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual GRNN cluster models and the ensemble model for test period 2001–2013. The minimum error reported is 6.1%.

| Predictor set | Training years | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | 20 | 10.0 | 7.6 | 7.6 | 6.4 |
| PredSet2 | 30 | 7.1 | 8.9 | 7.6 | 6.4 |
| PredSet3 | 20 | 5.8 | 9.2 | 6.0 | 6.1 |
| PredSet4 | 20 | 6.3 | 6.6 | 7.2 | 6.3 |
| PredSet5 | 25 | 7.1 | 9.4 | 11.9 | 6.6 |

Figure 5: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three cluster models by RNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by the individual cluster models. The results are shown for test period 2001–2013. (Axes: test years 2001–2013 versus rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.)

5.3. Statistical Measures for Validation of Proposed Approach. Next, we validate the models in terms of accuracy measures other than mean absolute error. Table 9 shows different forecast verification statistics for the ensemble models during test period 2001–2013. We summarize the observations below.

(i) Root Mean Square Error (RMSE). The MLP ensemble model gives an RMSE of 5.3%, followed by the RNN ensemble model with 6.4%. The GRNN and MR models give RMSEs of 7.4% and 8.4%, respectively.

Figure 6: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three cluster models by GRNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by the individual cluster models. The results are shown for test period 2001–2013. (Axes: test years 2001–2013 versus rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.)

(ii) Prediction Yield (PY). PY for the 5% error category of the MR, MLP, RNN, and GRNN ensemble models is 46%, 69%, 53%, and 46%, respectively. They give prediction yields of 76%, 92%, 92%, and 84% for an allowed error of 10%. Finally, at the 15% error category, the MR, MLP, RNN, and GRNN ensemble models give yields of 92%, 100%, 92%, and 100%, respectively. Thus, none of the predicted years shows an abrupt deviation from the corresponding actual rainfall pattern.


Table 9: Prediction evaluation statistics for ensemble models during test period 2001–2013 (Section 4.5).

| Verification measure | MR | MLP | RNN | GRNN |
|---|---|---|---|---|
| RMSE for forecast (%) | 8.4 | 5.3 | 6.4 | 7.4 |
| PY (%) at allowed error 5% | 46 | 69 | 53 | 46 |
| PY (%) at allowed error 10% | 76 | 92 | 92 | 84 |
| PY (%) at allowed error 15% | 92 | 100 | 92 | 100 |
| PC between actual and predicted rainfall | 0.61 | 0.81 | 0.71 | 0.49 |
| WI between actual and predicted rainfall | 0.71 | 0.89 | 0.81 | 0.62 |

Table 10: Comparison of absolute errors for rainfall prediction by the proposed ensemble models (Ensml) with clustering (WC) against the standard method using the same models without clustering (NC).

| Predictor set | MR: Tot error (NC) (%) | MR: Ensml error (WC) (%) | MLP: Tot error (NC) (%) | MLP: Ensml error (WC) (%) | RNN: Tot error (NC) (%) | RNN: Ensml error (WC) (%) | GRNN: Tot error (NC) (%) | GRNN: Ensml error (WC) (%) |
|---|---|---|---|---|---|---|---|---|
| PredSet1 | 8.9 | 8.6 | 10.0 | 8.2 | 11.7 | 7.0 | 6.9 | 6.4 |
| PredSet2 | 9.2 | 8.2 | 12.8 | 5.2 | 10.7 | 8.5 | 7.2 | 6.4 |
| PredSet3 | 7.4 | 6.7 | 6.7 | 6.5 | 6.2 | 5.1 | 6.1 | 6.1 |
| PredSet4 | 6.7 | 6.2 | 5.8 | 4.0 | 6.0 | 5.5 | 6.3 | 6.3 |
| PredSet5 | 8.2 | 7.9 | 9.7 | 11.0 | 8.9 | 8.8 | 9.0 | 6.7 |

(iii) Pearson Correlation (PC). PCs of 0.61, 0.81, 0.71, and 0.49 are observed for prediction by the MR, MLP, RNN, and GRNN ensemble models, respectively. The rainfall predicted by the MLP ensemble model is highly correlated with the actual values, while the correlation for the GRNN forecast is the lowest.

(iv) Willmott Index of Agreement (WI). WI for the MR, MLP, RNN, and GRNN ensemble models is 0.71, 0.89, 0.81, and 0.62, respectively. The index shows that the agreement between actual and predicted rainfall is high for the MLP and RNN ensemble models.

All of the statistical measures mentioned above (Table 9), as well as the mean absolute error (Table 6), establish the MLP model as the best among the four proposed models.
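The four verification measures summarized above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the formulas of Section 4.5 are not reproduced in this part of the paper, so the sketch assumes the standard definitions, namely prediction yield as the percentage of test years whose absolute error lies within the allowed error, and the Willmott index of agreement in its common form 1 - sum((P - O)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2). All function names are hypothetical.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean square error between actual and predicted rainfall."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((a - p) ** 2)))

def prediction_yield(actual, predicted, allowed_error):
    """Assumed definition: percentage of years with |error| <= allowed_error."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return float(100.0 * np.mean(np.abs(a - p) <= allowed_error))

def pearson_correlation(actual, predicted):
    """Pearson correlation coefficient between the two series."""
    return float(np.corrcoef(actual, predicted)[0, 1])

def willmott_index(actual, predicted):
    """Willmott index of agreement (assumed standard form)."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    mean_a = a.mean()
    denom = np.sum((np.abs(p - mean_a) + np.abs(a - mean_a)) ** 2)
    return float(1.0 - np.sum((a - p) ** 2) / denom)
```

With rainfall expressed as a percentage of LPA, `prediction_yield(actual, predicted, 5)` would correspond to the 5% error category under the assumed definition.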

5.4. Comparison of Results

5.4.1. Comparison with State-of-the-Art Methods. The proposed fuzzy clustering-based ensemble prediction models are compared with the models used by the Indian Meteorological Department (IMD): the existing 16-parameter power regression model [4] and the 8- and 10-parameter models of Rajeevan et al. [5]. A test period of seven years from 1996 to 2002 is considered. The IMD models give root mean square errors of 10.8%, 7.6%, and 6.4%, respectively. The MR, MLP, RNN, and GRNN ensemble models give root mean square errors of 6.0%, 3.4%, 4.4%, and 5.5%, respectively, outperforming all three IMD models. The results are shown as a bar graph in Figure 7.

5.4.2. Improvement of Cluster-Based Models over Conventional Models. The ensemble model error obtained by combining the outputs of all cluster models is compared with the error obtained by the

Figure 7: Comparison of MR (grey), MLP (purple), RNN (light purple), and GRNN (deep purple) models with the existing IMD 16-param (deep blue), 10-param (blue), and 8-param (light blue) models for the period 1996–2002 [4, 5]. Striped bars represent errors by our proposed models. (Axes: different models versus root mean square error as % of LPA.)

same model (parameter) trained on the whole dataset without clustering. The mean absolute errors for the various combinations of models and predictor sets are shown in Table 10. The results show that the clustering and ensemble method improves prediction over the nonclustered conventional method for most combinations of model and predictor set.


Table 11: Physical climatic events under study.

| Climatic event | Number of years | Years associated with the event |
|---|---|---|
| Drought | 13 | 1951, 1965, 1966, 1968, 1972, 1974, 1979, 1982, 1986, 1987, 2002, 2004, 2009 |
| Flood | 11 | 1953, 1956, 1958, 1959, 1961, 1964, 1970, 1975, 1983, 1988, 1994 |
| El-Niño | 23 | 1951, 1953, 1957, 1958, 1963, 1965, 1966, 1968, 1969, 1972, 1977, 1982, 1983, 1986, 1987, 1991, 1992, 1994, 1997, 2002, 2004, 2006, 2009 |
| La-Niña | 22 | 1950, 1954, 1955, 1956, 1964, 1970, 1971, 1973, 1974, 1975, 1984, 1985, 1988, 1989, 1995, 1998, 1999, 2000, 2007, 2008, 2010, 2011 |
| Positive IOD | 12 | 1957, 1961, 1963, 1967, 1972, 1977, 1982, 1983, 1994, 1997, 2006, 2007 |
| Negative IOD | 10 | 1958, 1960, 1964, 1971, 1974, 1975, 1989, 1992, 1993, 1996 |

Table 12: Thresholds of the support and confidence measures for associating the obtained clusters with physical climatic events.

| Predictor set | Support threshold | Confidence threshold |
|---|---|---|
| PredSet1 | 0.37 | 0.30 |
| PredSet2 | 0.25 | 0.46 |
| PredSet3 | 0.21 | 0.43 |
| PredSet4 | 0.29 | 0.61 |
| PredSet5 | 0.21 | 0.54 |

5.5. Prediction of the Year 2014. The annual Indian summer monsoon rainfall for the year 2014 is 781.7 mm, which is 87.8% of the LPA value. The proposed clustering-based ensemble MR, MLP, RNN, and GRNN models predict the rainfall of 2014 as 96.1%, 80.3%, 80.0%, and 95.3% of LPA, respectively. Thus, the proposed models show an absolute error of 7.0% for forecasting the rainfall of 2014.

6. Meteorological Analysis

Next, we interpret each cluster in terms of physical climatic events. The clusters obtained by fuzzy clustering are physically interpreted as being characterized by some global climatic events. The climatic events considered and studied during the period 1948 to 2013 (the period considered for clustering in our work) are El-Niño and La-Niña (http://ggweather.com/enso/oni.htm), positive and negative Indian Ocean dipole (http://bom.gov.au/climate/IOD), drought, and flood, as shown in Table 11.

Figure 8 shows the El-Niño and La-Niña years associated with drought, normal, and excess rainfall years during 1948–2013. Years with rainfall more than 10% above LPA are excess rainfall years, and years with rainfall more than 10% below LPA are drought years. The El-Niño and La-Niña years are shown by color codes (light green and green) in the figure. The chart helps to visualize the co-occurrence of El-Niño and La-Niña events with extremes of ISMR.

6.1. Measuring Association between Climatic Events and ISMR. Support and confidence measures are considered to relate physical climatic events to the clusters generated by fuzzy clustering. They are defined below.

Figure 8: Association of El-Niño (light green) and La-Niña (green) years with drought (years below −10% of LPA rainfall), normal (years between +10% and −10% of LPA rainfall), and excess (years above +10% of LPA rainfall) years during the period 1948–2013. (Axes: years 1945–2015 versus deviation of rainfall from LPA in %; legend: normal years, El-Niño years, La-Niña years.)

(i) Support. Support is defined as the percentage of the total number of years in the cluster that correspond to the climatic event:

$$\text{Support} = \frac{x_{ce}}{N}, \tag{12}$$

where $x_{ce}$ denotes the number of years associated with a specific climatic event in the cluster and $N$ is the total count of years in the cluster.

(ii) Confidence. Confidence is defined as the percentage of years associated with the climatic event in the cluster relative to the total number of such event years:

$$\text{Confidence} = \frac{x_{ce}}{T_{ce}}, \tag{13}$$

where $T_{ce}$ is the number of years associated with the climatic event during the period 1948–2013.
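A short sketch of how (12) and (13) might be computed for one cluster and one climatic event is given below. The function name and the example cluster are hypothetical and chosen only for illustration; the drought years are those listed in Table 11.

```python
def support_confidence(cluster_years, event_years):
    """Return (support, confidence) of a climatic event for one cluster.

    support    = x_ce / N     (share of the cluster's years that are event years)
    confidence = x_ce / T_ce  (share of all event years that fall in the cluster)
    """
    cluster = set(cluster_years)
    event = set(event_years)
    x_ce = len(cluster & event)  # event years that belong to the cluster
    return x_ce / len(cluster), x_ce / len(event)

# Drought years from Table 11; the cluster below is a made-up example.
drought_years = [1951, 1965, 1966, 1968, 1972, 1974, 1979,
                 1982, 1986, 1987, 2002, 2004, 2009]
example_cluster = [1951, 1965, 1972, 1982, 1990, 1995, 2002, 2009, 2012, 2013]

support, confidence = support_confidence(example_cluster, drought_years)
```

Both values are returned as fractions, which matches the scale of the thresholds reported in Table 12.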


Figure 9: Histogram of the confidence and support measures as bins of year-count before (a) and after (b) thresholding for PredSet1. (Axes: confidence, support, and year-count.)

Table 13: Identified physical climatic events associated with the clusters obtained by fuzzy clustering. Each row lists the events in the column order Cluster1, Cluster2, Cluster3.

Predictor Cluster1 Cluster2 Cluster3
PredSet1 Drought El-Niño La-Niña La-Niña
PredSet2 Flood La-Niña Drought Drought El-Niño La-Niña
PredSet3 El-Niño positive IOD Drought Drought El-Niño
PredSet4 La-Niña Flood La-Niña Drought
PredSet5 - Drought El-Niño Flood

We relate a cluster to a physical climatic event described in Table 11 if both the support and confidence measures attain the corresponding thresholds. The thresholds are chosen such that 50% of the years of study remain under consideration. A low threshold compromises the importance of a climatic event being related to a particular cluster; on the other hand, if fewer years are to be retained, the threshold values must be high, which in turn leaves out most of the clusters. Therefore, as a compromise between the extremes, 50% of the years are considered. Figure 9 shows histograms with confidence and support as bins of year-count before and after the thresholding process, respectively, for predictor set PredSet1 (Table 2). The threshold values obtained for the predictor sets are presented in Table 12. For each predictor set, we associate the clusters with physical climatic events if they satisfy both the support and confidence thresholds. The climatic events corresponding to each cluster are shown in Table 13. The results establish the coexistence of La-Niña and flood events. They also shed light on the high probability of El-Niño, drought, and positive IOD events occurring simultaneously.

7. Conclusion

Monsoon is an important phenomenon for the economic development of an agricultural country like India. The large variability of monsoon over the years makes prediction of rainfall a challenging task. The paper attempts to address this problem by clustering the years into similar groups and finally providing a multimodel ensemble forecast for Indian summer monsoon rainfall.

Different climatic parameters with their best correlated month values are identified, and five different predictor sets are built for prediction of Indian monsoon. Four different models, namely, MR, MLP, RNN, and GRNN, are designed for each cluster exclusively. The final forecast is provided by a weighted ensemble of the forecasts by each cluster's model, where the weight is the fuzzy membership of belongingness to each cluster. The multilayer perceptron ensemble model provides a mean absolute error of 4.0% for prediction of annual rainfall, which is appreciable for forecasting the complex monsoon process. The proposed fuzzy clustering-based ensemble approach surpasses the conventional approach. The performance of the proposed clustering-based ensemble models is superior to that of the existing IMD models [4, 5]. The error statistics also ascertain the superiority of the multilayer perceptron model over the other three proposed models. Lastly, in the meteorological context, the clusters are linked with global climatic events.

In the future, a larger number of climatic parameters influencing Indian monsoon can be explored, and different predictor sets can be used for different clusters of years to provide even better forecasting accuracy.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.


Acknowledgment

This work is supported by the RBU project through the RESPOND program of ISRO through KCSTC, IIT Kharagpur.

References

[1] H. F. Blanford, "On the connexion of the Himalaya snowfall with dry winds and seasons of drought in India," Proceedings of the Royal Society of London, vol. 37, no. 232–234, pp. 3–22, 1884.

[2] G. T. Walker, "Correlation in seasonal variations of weather, IV: a further study of world weather," Memoirs of the India Meteorological Department, vol. 24, pp. 275–332, 1924.

[3] V. Thapliyal and S. M. Kulshrestha, "Recent models for long range forecasting of South-West monsoon rainfall in India," Mausam, vol. 43, no. 3, pp. 239–248, 1992.

[4] V. Gowariker, V. Thapliyal, S. M. Kulshrestha, G. S. Mandal, N. Sen Roy, and D. R. Sikka, "A power regression model for long range forecast of southwest monsoon rainfall over India," Mausam, vol. 42, no. 2, pp. 125–130, 1991.

[5] M. Rajeevan, D. S. Pai, S. K. Dikshit, and R. R. Kelkar, "IMD's new operational models for long-range forecast of southwest monsoon rainfall over India and their verification for 2003," Current Science, vol. 86, no. 3, pp. 422–431, 2004.

[6] M. Rajeevan, D. S. Pai, R. A. Kumar, and B. Lal, "New statistical models for long-range forecasting of southwest monsoon rainfall over India," Climate Dynamics, vol. 28, no. 7-8, pp. 813–828, 2007.

[7] J. Schewe and A. Levermann, "A statistically predictive model for future monsoon failure in India," Environmental Research Letters, vol. 7, no. 4, Article ID 044023, 2012.

[8] Q. Wu, Y. Yan, and D. Chen, "A linear Markov model for East Asian monsoon seasonal forecast," Journal of Climate, vol. 26, no. 14, pp. 5183–5195, 2013.

[9] K. Fan, Y. Liu, and H. Chen, "Improving the prediction of the East Asian summer monsoon: new approaches," Weather & Forecasting, vol. 27, no. 4, pp. 1017–1030, 2012.

[10] F. Mekanik, M. A. Imteaz, S. Gato-Trinidad, and A. Elmahdi, "Multiple regression and artificial neural network for long-term rainfall forecasting using large scale climate modes," Journal of Hydrology, vol. 503, pp. 11–21, 2013.

[11] A. K. Sahai, M. K. Soman, and V. Satyan, "All India summer monsoon rainfall prediction using an artificial neural network," Climate Dynamics, vol. 16, no. 4, pp. 291–302, 2000.

[12] W.-C. Hong, "Rainfall forecasting by technological machine learning models," Applied Mathematics and Computation, vol. 200, no. 1, pp. 41–57, 2008.

[13] S. Chattopadhyay and G. Chattopadhyay, "Comparative study among different neural net learning algorithms applied to rainfall time series," Meteorological Applications, vol. 15, no. 2, pp. 273–280, 2008.

[14] N. Acharya, S. C. Kar, M. A. Kulkarni, U. C. Mohanty, and L. N. Sahoo, "Multi-model ensemble schemes for predicting northeast monsoon rainfall over peninsular India," Journal of Earth System Science, vol. 120, no. 5, pp. 795–805, 2011.

[15] V. R. Durai and R. Bhardwaj, "Improving precipitation forecasts skill over India using a multi-model ensemble technique," Geofizika, vol. 30, no. 2, pp. 119–141, 2013.

[16] B. Parthasarathy, A. A. Munot, and D. R. Kothawale, "Monthly and seasonal rainfall series for All-India homogeneous regions and meteorological subdivisions: 1871–1994," Tech. Rep. RR-065, Indian Institute of Tropical Meteorology, 1995.

[17] G. P. Compo, J. S. Whitaker, P. D. Sardeshmukh et al., "The twentieth century reanalysis project," Quarterly Journal of the Royal Meteorological Society, vol. 137, no. 654, pp. 1–28, 2011.

[18] E. Kalnay, M. Kanamitsu, R. Kistler et al., "The NCEP/NCAR 40-year reanalysis project," Bulletin of the American Meteorological Society, vol. 77, no. 3, pp. 437–471, 1996.

[19] E. M. Rasmusson and T. H. Carpenter, "Variations in tropical sea surface temperature and surface wind fields associated with the Southern Oscillation/El Niño," Monthly Weather Review, vol. 110, no. 5, pp. 354–384, 1982.


predicts Indian summer monsoon utilizing a recurrent neural network and also demonstrates successful employment of the support vector machine in solving nonlinear regression and time series problems. Three different backpropagation neural learning rules, namely, momentum learning, conjugate gradient descent learning, and Levenberg-Marquardt learning, are used by S. Chattopadhyay and G. Chattopadhyay [13] to perform a comparative study of different neural network methods for predicting rainfall time series.

The presence of large variability in monsoon patterns makes it difficult for a single model to predict its distribution. A number of uncertainties, including boundary condition, parameter, and structural uncertainties, are involved in the construction of these models. Thus, it remains fundamentally challenging to have a single model for prediction. Multimodel ensembles, which combine the outcomes of different models to produce efficient results, are proposed to overcome the weakness of a single model [14, 15]. In addition, monsoon shows different characteristics over the years. There exist groups of years where the variation of climatic parameters and the pattern of rainfall are similar. We use fuzzy clustering to cluster the similar years together and model them separately. The motivation behind using fuzzy clustering is that each year manifests a mixture of physical climatic events. We cannot hard cluster a year into a specific group; years have a membership of belongingness to every cluster. Fuzzy clustering is used to capture the characteristics of the different events related to a year of study. We use the same set of climatic parameters as the predictor set for every cluster but frame different models for each cluster.

A number of prediction models, namely, multiple regression (MR), multilayer perceptron (MLP), recurrent neural network (RNN), and generalized regression neural network (GRNN) models, are used for prediction of Indian monsoon for the year clusters. There exist viable reasons for using neural networks like MLP, RNN, and GRNN for modelling: (i) Indian monsoon is a complex process which cannot be adequately modelled by linear models; (ii) nonlinearity in the time-series pattern can be well captured by neural network learning; (iii) climatic events are much more closely related to disturbances in the parameters of nearby years than to those of distant years, and a neural network enables attaching weights to the yearly parameters in an appropriate manner.

In this work, climatic parameters that are strongly correlated with Indian monsoon are identified at the outset, followed by fuzzy clustering of the years into groups with a degree of belongingness of each year to the clusters. Then, we model each cluster with four types of models, namely, MR, MLP, RNN, and GRNN, to forecast rainfall. The weighted ensemble of the forecasts given by the respective models for each cluster is considered as the final predicted rainfall. Analysis and comparisons are performed on aggregate Indian rainfall, and finally a meteorological interpretation of the obtained clusters is presented.

The paper is organised in the following manner. We discuss the details of the data and the predictor climatic parameters in Sections 2 and 3. The proposed clustering-based approach, prediction models, and ensemble technique are presented in Section 4, with experimental results in Section 5. Meteorological significance is discussed in Section 6, and finally conclusions are provided in Section 7.

2. Data Sets Used

We consider the annual Indian summer monsoon rainfall (ISMR) occurring in the four months of June, July, August, and September. Annual ISMR is considered during the period 1948–2013 for our study. The long period average (LPA) (1948–2013) of ISMR is 891.8 mm. ISMR is expressed as a percentage of the LPA value. The data is obtained from the Indian Institute of Tropical Meteorology, Pune (http://www.imdpune.gov.in/research/ncc/longrange/data/data.html) [16].

Predictor parameters sea level pressure (SLP) (http://www.esrl.noaa.gov/psd/gcos_wgsp/Gridded/data.noaa.erslp.html) and sea surface temperature (SST) (http://www.esrl.noaa.gov/psd/data/gridded/data.noaa.ersst.html) data are provided by the NOAA/OAR/ESRL PSD at a spatial resolution of 2° × 2° [17]. Surface pressure (SP) and zonal wind velocity (WV) data are collected from the NCEP Reanalysis Derived data provided by the NOAA/OAR/ESRL PSD (http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanalysis.derived.surface.html) [18], available at a resolution of 2.5° × 2.5°. Finally, Niño 3.4 data, which is the sea surface temperature anomaly for the spatial coverage of 5°S to 5°N and 170°W to 120°W in the Pacific Ocean region, is acquired from the National Center for Atmospheric Research (http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/ensoyears.shtml) [19]. All the above monthly data are considered for the period 1948–2013 in our study and analysis.

3. Global Climatic Parameters Influencing Indian Monsoon

Indian monsoon is strongly influenced by several global climatic parameters occurring at places distant from the Indian subcontinent. Identification of predictor parameters relies on physical understanding of the monsoon event and wind pattern flow. We have selected the climatic parameters based on the parameters used by the Indian Meteorological Department's models [5, 6], studying their correlation with Indian summer monsoon rainfall (ISMR) during our period of study (1948–2013). In the data preprocessing phase, climatic anomaly data are evaluated by calculating the deviation of each parameter value from the long-term average value of the parameter, separately for each month, followed by a correlation study between ISMR and the climatic parameters for lags of zero to twelve months. We consider the best lagged predictor month, that is, the one having the highest correlation with ISMR. The predictor climatic parameters and their correlation values with Indian monsoon are shown in Table 1. Figure 1 shows the geographic locations of the climatic parameters influencing Indian monsoon.

Predictor Sets of Climatic Parameters. Based on the correlation with Indian monsoon, we have built five predictor sets for forecasting. Different combinations of the identified climatic parameters (Table 1) form the predictor sets. The predictor sets are shown in Table 2.
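The anomaly evaluation and lagged-correlation study described in this section can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' procedure: the monthly data are assumed to be arranged as one row per year and one column per calendar month, and the lag is assumed to be counted back in months from June of the monsoon year (an assumption that is consistent with the months listed in Table 1 but is not stated in the text). The function names are hypothetical.

```python
import numpy as np

def monthly_anomaly(values):
    """values: array of shape (n_years, 12).
    Subtract each calendar month's long-term mean from that month's values."""
    values = np.asarray(values, float)
    return values - values.mean(axis=0, keepdims=True)

def best_lagged_month(anomaly, rainfall, max_lag=12, ref_month=5):
    """Return (lag, correlation) with the largest absolute Pearson correlation.

    ref_month is 0-based (5 = June, an assumption); lag L pairs the ISMR of
    year y with the anomaly L months before ref_month of year y.
    """
    anomaly = np.asarray(anomaly, float)
    rainfall = np.asarray(rainfall, float)
    series = anomaly.reshape(-1)            # year-major monthly series
    years = np.arange(1, anomaly.shape[0])  # skip year 0 so every lag has data
    best = None
    for lag in range(max_lag + 1):
        idx = years * 12 + ref_month - lag
        r = np.corrcoef(series[idx], rainfall[years])[0, 1]
        if best is None or abs(r) > abs(best[1]):
            best = (lag, float(r))
    return best
```

Under this convention, a lag of 5 corresponds to January of the same year and a lag of 9 to September of the previous year.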


Table 1: Climatic parameters (CP) influencing Indian monsoon with geographical location, correlation values, and correlated month (0 signifies the same year and −1 signifies the previous year).

| CP | CP name | Location | Correlation value | Correlated month |
|---|---|---|---|---|
| CP1 | North Atlantic Ocean SST anomaly | 20°N–30°N, 100°W–80°W | 0.242 | Jan (0) |
| CP2 | North Atlantic Ocean surface pressure anomaly | 20°N–30°N, 100°W–80°W | 0.256 | April (0) |
| CP3 | East Asia SLP anomaly | 35°N–45°N, 120°E–130°E | 0.337 | May (0) |
| CP4 | East Asia surface pressure anomaly | 35°N–45°N, 120°E–130°E | 0.341 | Mar (0) |
| CP5 | Equatorial South Eastern Indian Ocean SST anomaly | 20°S–10°S, 100°E–120°E | 0.200 | Sept (−1) |
| CP6 | Pressure gradient between Madagascar and Tibet | - | 0.253 | May (0) |
| CP7 | Niño 3.4 SST anomaly | 5°S–5°N, 170°W–120°W | 0.311 | Sept (−1) |
| CP8 | Equatorial Pacific Ocean SLP anomaly | 5°S–5°N, 120°E–80°W | 0.272 | Aug (−1) |
| CP9 | North West Europe surface pressure anomaly | 55°N–65°N, 20°E–40°E | 0.183 | Jan (0) |
| CP10 | North Central Pacific zonal wind anomaly | 5°N–15°N, 180°E–150°W | 0.457 | May (0) |

Figure 1: Climatic parameters over the globe governing Indian monsoon (purple patches signify the locations of the climatic parameters taken and the blue patch represents the Indian region). CP$_i$ represents parameter $i$ in Table 1.

Table 2: Predictor sets with climatic parameters.

| Predictor set | Climatic parameters |
|---|---|
| PredSet1 | CP1, CP4, CP5, CP6 |
| PredSet2 | CP4, CP5, CP6, CP7 |
| PredSet3 | CP2, CP4, CP10 |
| PredSet4 | CP2, CP4, CP7, CP10 |
| PredSet5 | CP3, CP7, CP8, CP9 |

4. Methodology

We propose fuzzy clustering of monsoon years into groups, followed by building models for each group separately, and finally predicting Indian summer monsoon rainfall (ISMR) as a weighted ensemble of the forecasts provided by the cluster models. The block diagram of the proposed fuzzy clustering-based approach to prediction of ISMR is shown in Figure 2. Detailed steps are described in the following subsections.
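The final step, a membership-weighted ensemble of the cluster forecasts, might be sketched as below. The exact ensemble formula is not reproduced in this part of the paper, so the normalisation by the sum of the memberships is an assumption, and the function name and numbers are purely illustrative.

```python
import numpy as np

def ensemble_forecast(cluster_forecasts, memberships):
    """Weighted average of per-cluster forecasts for one test year.

    cluster_forecasts: forecast of each cluster model (e.g. in % of LPA).
    memberships: fuzzy membership of the test year in each cluster.
    Dividing by the membership sum (assumed here) keeps the result on the
    same scale as the individual forecasts.
    """
    f = np.asarray(cluster_forecasts, float)
    w = np.asarray(memberships, float)
    return float(np.sum(w * f) / np.sum(w))

# Hypothetical values for a single test year with three clusters.
forecast = ensemble_forecast([96.0, 88.0, 104.0], [0.2, 0.7, 0.1])
```

A year that belongs almost entirely to one cluster is thus forecast mostly by that cluster's model, while a year with mixed memberships blends the models.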

4.1. Motivation: Variability of Monsoon Patterns. Trends and distributions of monsoon vary to a large extent over the years. It is thus necessary to group the years into clusters which have similar patterns of the predictor climatic parameters affecting monsoon. Clustering the years is effective because we can build separate models for each cluster. These cluster models will be more accurate as the variation within a cluster is smaller. Finally, an ensemble of the forecasts of these cluster models results in better prediction of Indian monsoon. As an example, consider two clusters of years corresponding to strong El-Niño and North Atlantic Oscillation, respectively. A drought year has correlation with both events and hence might have a significant degree of belongingness to both clusters.

4.2. Fuzzy Clustering of Monsoon Years. Fuzzy $c$-means clustering is used for grouping similar years together. Fuzzy $c$-means (FCM) is a method of clustering which allows one input instance to belong to more than one cluster with some membership of belongingness. FCM attempts to partition a set of $N$ elements $Y = \{y_1, \ldots, y_N\}$ into a collection of $c$ fuzzy clusters with centers $C = \{\mathrm{cen}_1, \ldots, \mathrm{cen}_c\}$ and a partition matrix $W = [w_{ij}]$, $w_{ij} \in [0, 1]$, $i = 1, \ldots, N$, $j = 1, \ldots, c$, where $w_{ij}$ gives the degree of belongingness of element $y_i$ to the cluster with center $\mathrm{cen}_j$. FCM aims to minimize the objective function in (1). The updates of the partition matrix and the centers occur in accordance with (2) and (3), respectively:

$$J_m = \sum_{i=1}^{N} \sum_{j=1}^{c} w_{ij}^{m} \left\| y_i - \mathrm{cen}_j \right\|^2, \quad 1 \le m < \infty, \tag{1}$$

$$w_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \left\| y_i - \mathrm{cen}_j \right\| / \left\| y_i - \mathrm{cen}_k \right\| \right)^{2/(m-1)}}, \tag{2}$$

$$\mathrm{cen}_j = \frac{\sum_{i=1}^{N} w_{ij}^{m} \, y_i}{\sum_{i=1}^{N} w_{ij}^{m}}, \tag{3}$$

where $m$ denotes the level of cluster fuzziness.
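The alternating updates of the centers and the partition matrix can be sketched with NumPy as follows. This is a minimal illustration rather than the implementation used in the paper: the random initialisation, the stopping tolerance, the default fuzziness of 2, and the function name are all assumptions.

```python
import numpy as np

def fuzzy_c_means(Y, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Return (centers, W) for data Y of shape (N, d) and c clusters (m > 1)."""
    Y = np.asarray(Y, float)
    rng = np.random.default_rng(seed)
    # Random initial partition matrix whose rows sum to 1 (assumed init).
    W = rng.random((Y.shape[0], c))
    W /= W.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Wm = W ** m
        # Center update: membership-weighted mean of the data points.
        centers = (Wm.T @ Y) / Wm.sum(axis=0)[:, None]
        # dist[i, j] = || y_i - cen_j ||, floored to avoid division by zero.
        dist = np.linalg.norm(Y[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)
        # Membership update: ratio[i, j, k] = dist[i, j] / dist[i, k].
        ratio = dist[:, :, None] / dist[:, None, :]
        W_new = 1.0 / np.sum(ratio ** (2.0 / (m - 1.0)), axis=2)
        converged = np.max(np.abs(W_new - W)) < tol
        W = W_new
        if converged:
            break
    return centers, W
```

Each row of the returned `W` sums to one, so it can serve directly as the set of membership weights of a year across the clusters.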


Figure 2: Proposed fuzzy clustering-based ensemble approach for prediction of Indian summer monsoon rainfall. (Block diagram stages: input of climate parameters (CP) and rainfall; data preprocessing, comprising climate anomaly data evaluation, correlation study of CP and rainfall, identification of the month of each CP most highly correlated with rainfall, and building predictor sets from different CP combinations; clustering of years, comprising fuzzy clustering of the predictor set and application of an alpha-cut to obtain clusters and membership values; ensemble neural network model, comprising a forecast for every cluster by each model (MR, MLP, RNN, GRNN) and an ensemble of the outputs of all clusters; validation by error calculation and by comparison with existing models and with the nonclustered method; physical interpretation of clusters through support, confidence, and thresholds; output of the rainfall forecast, the forecasting error, model comparison results, and the climatic events associated with each cluster.)

4.3. Prediction Models. Multiple regression and three models of artificial neural networks (ANN), namely, multilayer perceptron, recurrent neural network, and generalized regression neural network, are used to design prediction models for each cluster exclusively. Forecast of annual ISMR is provided by each cluster model separately and also by the ensemble of all the clusters' model forecasts. We describe below the models used.

4.3.1. Multiple Regression (MR). The multiple regression model is used to learn the relationship between several independent predictor variables ($X_i$s) and a dependent variable ($Y$).


Table 3: Model parameter setting for MLP models.

| Parameter set | Hidden layers | Training years | Training method |
|---|---|---|---|
| ParSet1 | [3, 5] | 20 | BFGS quasi-Newton backpropagation |
| ParSet2 | [3, 5, 10] | 15 | Conjugate gradient backpropagation with Powell-Beale restarts |
| ParSet3 | [5, 10] | 10 | Scaled conjugate gradient backpropagation |
| ParSet4 | [3, 5] | 15 | Resilient backpropagation |

A multiple regression model with $p$ independent variables is shown in

$$y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i, \tag{4}$$

where $x_{ij}$ is the $i$th observation of the $j$th independent variable, the first independent variable takes the value 1 for all $i$, and $\varepsilon$ represents the residual.
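As an illustration of Equation (4), the coefficients can be estimated by ordinary least squares. The sketch below assumes a least-squares fit (the text does not state the estimation method), and the function names are hypothetical. The column of ones plays the role of the first independent variable, which equals 1 for all $i$.

```python
import numpy as np

def fit_multiple_regression(predictors, target):
    """Estimate beta in Eq. (4) by ordinary least squares (assumed estimator)."""
    # Prepend a column of ones so that beta[0] multiplies x_i1 = 1 (the intercept).
    design = np.column_stack([np.ones(len(predictors)), predictors])
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    return beta

def predict_multiple_regression(beta, predictors):
    """Apply the fitted coefficients to new predictor rows."""
    design = np.column_stack([np.ones(len(predictors)), predictors])
    return design @ beta
```

In the setting of this paper, each row of `predictors` would hold the climatic parameter values of one training year from a given cluster, and `target` the corresponding rainfall.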

4.3.2. Multilayer Perceptron Neural Network (MLP). The multilayer perceptron neural network is a class of ANN in which connections between the neurons do not form a directed cycle. In this network, information propagates in only one direction: from the input nodes, through the hidden nodes, to the output nodes. The independent and dependent variables constitute the input and output layers, respectively. The number of hidden layers and their corresponding nodes must be determined empirically for each prediction task. Four different parameter sets, chosen empirically, are considered for the model designed to forecast ISMR; they are shown in Table 3.

4.3.3. Recurrent Neural Network (RNN). The recurrent neural network is a class of ANN which creates an internal state of the network to exhibit dynamic temporal behaviour. Climatic changes or events occurring in the same or a nearby time period are highly correlated. Similarly, rainfall patterns are more correlated with influencing factors in nearby years than with those in distant years. This phenomenon is well captured by the RNN, which gives weights in decreasing order to the values from near to distant years during training of the network. Thus, it assists in modelling the system dynamics in a more natural manner. The same parameter sets as for the MLP network (Table 3) are considered, with a delay span of 2 units.

4.3.4. Generalized Regression Neural Network (GRNN). The generalized regression neural network is a variant of the radial basis function network. GRNN has three layers of artificial neurons: input, hidden, and output. The hidden layer has radial basis neurons, while neurons in the output layer have a linear transfer function. The output of the radial basis neurons is the input scaled by the spread factor. Given $p$ input-output pairs $(x_i, y_i) \in \mathbb{R}^n \times \mathbb{R}^1$ with $n$ input variables and $i = 1, 2, \ldots, p$, $y_i$ represents the output from each hidden unit. The GRNN output for a test point $x \in \mathbb{R}^n$ is described by

$$y(x) = \sum_{i=1}^{p} W_i y_i, \tag{5}$$

where

$$W_i = \frac{\exp\left(-\lVert x - x_i\rVert^2 / 2\sigma^2\right)}{\sum_{k=1}^{p} \exp\left(-\lVert x - x_k\rVert^2 / 2\sigma^2\right)}. \tag{6}$$

The reasons for modelling with GRNN are that (i) it has only one tunable design parameter (the spread factor), (ii) it is a one-pass algorithm (less time consuming), and (iii) it accurately approximates functions from sparse data.
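Equations (5) and (6) amount to a kernel-weighted average of the training outputs, which is why no iterative training is needed. A minimal sketch follows; the function name is hypothetical, and `sigma` is assumed to be the spread factor $\sigma$ of Equation (6).

```python
import numpy as np

def grnn_predict(train_x, train_y, query, sigma=1.0):
    """GRNN output for one query point, following Eqs. (5) and (6)."""
    # Squared Euclidean distance from the query to every training input
    sq_dist = np.sum((train_x - query) ** 2, axis=1)
    kernel = np.exp(-sq_dist / (2.0 * sigma ** 2))
    weights = kernel / kernel.sum()   # Eq. (6): normalised radial basis weights
    return float(weights @ train_y)   # Eq. (5): weighted sum of training outputs
```

A small `sigma` makes the prediction follow the nearest training year closely, while a large `sigma` pulls it towards the mean of all training outputs; this single value is the only quantity to tune.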

The optimal number of training years is ascertained for the MR and GRNN models by varying the training years from 5 to 30 and selecting the value that gives the least absolute prediction error during the validation period (1984–1993). A training of $m$ years specifies that, for predicting the rainfall of the $r$th year, the available preceding $m$ years $r-1, r-2, \ldots, r-m$ present in a particular cluster are considered for training.

4.4. Ensemble of Predictors. The complexity of the monsoon process makes it difficult for a single model to predict rainfall accurately. We design separate models for each cluster of years obtained by fuzzy clustering, using the four predictors described in Section 4.3. Finally, annual ISMR is presented as a weighted ensemble of the forecasts of the models designed for each cluster. The weight is taken as the fuzzy membership of the test year in the different clusters:

$$\text{Ensemble prediction}_t = \sum_{i=1}^{c} W_i^t \cdot P_i, \tag{7}$$

where $P_i$ represents the prediction given by a model for cluster $i$, $W_i^t$ is the fuzzy membership of the $t$th test year to cluster $i$, and $c$ is the total number of clusters.
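Equation (7) is a plain membership-weighted sum. The sketch below uses a hypothetical function name and made-up numbers purely for illustration; they are not values from the paper.

```python
def ensemble_forecast(memberships, cluster_forecasts):
    """Weighted ensemble of per-cluster forecasts, as in Eq. (7)."""
    if len(memberships) != len(cluster_forecasts):
        raise ValueError("one membership value is needed per cluster forecast")
    return sum(w * p for w, p in zip(memberships, cluster_forecasts))

# Hypothetical test year: memberships in three clusters and the three
# cluster-model forecasts expressed as percent of LPA.
example = ensemble_forecast([0.2, 0.5, 0.3], [90.0, 100.0, 110.0])
```

Because fuzzy memberships of a year sum to one, the ensemble value always lies between the smallest and largest cluster forecast.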

4.5. Validation of Proposed Approach. The study is performed on data for the period 1948–2013. Fuzzy clustering is performed over this period to cluster it into three groups. The number of clusters is decided based on cluster quality. Separate prediction models are designed for all three clusters, and the ensemble of the forecasts of these models is provided as the predicted Indian summer monsoon rainfall. The test period 2001–2013 is considered to evaluate the forecasting skill of our proposed approach.

The forecast models for annual ISMR are chiefly evaluated in terms of mean absolute error. Other error statistics, namely, root mean square error, prediction yields, Pearson correlation, and Willmott index of agreement, are also evaluated to judge the efficacy of our proposed approach for prediction. They are described below.


(i) Mean Absolute Error (MAE). The mean absolute error for prediction of annual ISMR is calculated in the following way:

$$\mathrm{MAE} = \frac{\sum_{i=1}^{N} \lvert Y_i - X_i \rvert}{N}, \tag{8}$$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $N$ denotes the total number of test years.

(ii) Root Mean Square Error (RMSE). The root mean square error measures the differences between the model-predicted output and the actual values. It is a good measure for comparing the forecasting errors of various models:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} (Y_i - X_i)^2}{N}}. \tag{9}$$

(iii) Prediction Yield (PY). Prediction yields are evaluated at three different error categories (5%, 10%, and 15% errors) to assess the overall prediction results by judging the percentage of predicted years within each allowed range of errors.

(iv) Pearson Correlation Coefficient (PC). The Pearson correlation coefficient measures the strength of linear association between actual and predicted values, where a value of 1 means a perfect positive correlation and a value of −1 means a perfect negative correlation:

$$\mathrm{PC} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{N} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}}, \tag{10}$$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $\bar{X}$ and $\bar{Y}$ are their corresponding means.

(v) Willmott Index of Agreement (WI). The Willmott index of agreement is a standardized measure of the degree of model prediction error. It varies between 0 and 1, with higher values indicating a better fit of the model for prediction:

$$\text{Index of agreement} = 1 - \frac{\sum_{i=1}^{N} \lvert X_i - Y_i \rvert^2}{\sum_{i=1}^{N} \left( \lvert Y_i - \bar{X} \rvert + \lvert X_i - \bar{X} \rvert \right)^2}. \tag{11}$$
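The five verification measures can be computed as in the sketch below. The function names are hypothetical, and the prediction yield is implemented under the assumption that both series are expressed as percent of LPA and that a year counts as a hit when its absolute error does not exceed the allowed error.

```python
import numpy as np

def mae(actual, predicted):
    """Mean absolute error, Eq. (8)."""
    return float(np.mean(np.abs(predicted - actual)))

def rmse(actual, predicted):
    """Root mean square error, Eq. (9)."""
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

def prediction_yield(actual, predicted, allowed_error):
    """Percent of years whose absolute error is within the allowed error (assumed definition)."""
    hits = np.abs(predicted - actual) <= allowed_error
    return float(100.0 * np.mean(hits))

def pearson(actual, predicted):
    """Pearson correlation coefficient, Eq. (10)."""
    dx = actual - actual.mean()
    dy = predicted - predicted.mean()
    return float(np.sum(dx * dy) / (np.sqrt(np.sum(dx ** 2)) * np.sqrt(np.sum(dy ** 2))))

def willmott(actual, predicted):
    """Willmott index of agreement, Eq. (11)."""
    mean_actual = actual.mean()
    numerator = np.sum((actual - predicted) ** 2)
    denominator = np.sum((np.abs(predicted - mean_actual) + np.abs(actual - mean_actual)) ** 2)
    return float(1.0 - numerator / denominator)
```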

5. Experimental Results and Analysis

In this section, we present the evaluation of our proposed fuzzy clustering-based approach. We first present the results of fuzzy clustering of the monsoon years for different predictor sets. Forecasting skills are evaluated for all cluster models and the ensemble model in terms of mean absolute errors for the test period 2001–2013. In addition, other measures, such as root mean square error in prediction, correlation between predicted and actual rainfall, prediction yields, and agreement index between actual and predicted rainfall, are also estimated to establish the efficiency of our proposed approach to prediction of Indian summer monsoon rainfall.

Table 4: Cluster size (number of years) by fuzzy $c$-means clustering with $\alpha$-cut of 0.3 over the period 1948–2013.

| Predictor set | Cluster1 | Cluster2 | Cluster3 |
|---|---|---|---|
| PredSet1 | 16 | 38 | 30 |
| PredSet2 | 30 | 17 | 40 |
| PredSet3 | 32 | 14 | 38 |
| PredSet4 | 42 | 31 | 21 |
| PredSet5 | 15 | 37 | 26 |

5.1. Clustering of Monsoon Years. Fuzzy clustering is performed over the period 1948–2013 to group the data into three clusters. We perform an $\alpha$-cut with $\alpha = 0.3$ to assign the data instances to the clusters. The value is ascertained empirically such that the distribution of elements within clusters is regular. A data instance can be assigned to more than one cluster simultaneously. The cluster sizes for the various predictor sets are shown in Table 4.
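The $\alpha$-cut step can be sketched as a simple threshold on the membership matrix. The function name and the example memberships are hypothetical, and the comparison is assumed to be inclusive (membership of at least $\alpha$).

```python
import numpy as np

def alpha_cut(memberships, alpha=0.3):
    """Return, for each cluster, the indices of the years whose membership is at least alpha."""
    return [np.flatnonzero(memberships[:, j] >= alpha)
            for j in range(memberships.shape[1])]

# Hypothetical memberships of three years in three clusters (each row sums to 1).
example_memberships = np.array([
    [0.50, 0.40, 0.10],
    [0.10, 0.10, 0.80],
    [0.35, 0.33, 0.32],
])
example_clusters = alpha_cut(example_memberships, alpha=0.3)
```

In this example the third year passes the threshold in all three clusters, which illustrates why the cluster sizes in Table 4 add up to more than the 66 years of the study period.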

5.2. Prediction Accuracy. We predict annual rainfall for all five predictor sets (Table 2) separately using four models, namely, MR, MLP, RNN, and GRNN. The test period is 2001 to 2013.

5.2.1. Multiple Regression Model (MR). Multiple regression models are built for every cluster by ascertaining the optimal training period for each predictor set. The optimal training period is evaluated by varying the training years and validating them for least absolute error in prediction during the validation period (1984–1993). Individual cluster-based as well as weighted ensemble models are considered for prediction. Table 5 gives the mean absolute error for the individual cluster-based and ensemble models for the test period 2001–2013. The ensemble model provides a mean absolute error of 6.2% for PredSet4 (Table 2). The ensemble model outperforms all the single cluster models for PredSet1 and PredSet4 and stays within 0.8 percentage points of the best single cluster model for the remaining predictor sets (Table 5). Figure 3 shows the interannual variability of actual and ensemble-predicted rainfall as percent of long period average (LPA).

5.2.2. Multilayer Perceptron Neural Network Model (MLP). The multilayer perceptron neural network model is designed with the four different sets of parameters described in Table 3. Mean absolute errors of all cluster and ensemble models are shown in Table 6. The MLP ensemble model reports an error of 4.0% for PredSet4 (Table 2) with MLP parameters ParSet1 (Table 3). The actual rainfall and the rainfall predicted by the cluster models and the ensemble model are shown in Figure 4. The ensemble-predicted rainfall closely follows the actual rainfall.

5.2.3. Recurrent Neural Network Model (RNN). Mean absolute errors for prediction of annual rainfall by the recurrent neural network model for the test period 2001–2013 are presented in Table 7. PredSet3 (Table 2) with RNN parameters ParSet1 (Table 3) gives an error of 5.1%. RNN gives the training years weights that decrease with their distance from the test year. The pattern of actual and ensemble-predicted rainfall in terms of percentage of LPA is shown in Figure 5.

Table 5: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MR cluster models and the ensemble model for test period 2001–2013. The minimum error is 6.2%.

| Predictor set | Training years | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | 20 | 9.4 | 9.3 | 10.9 | 8.6 |
| PredSet2 | 20 | 11.0 | 7.5 | 9.4 | 8.3 |
| PredSet3 | 15 | 10.9 | 6.5 | 9.2 | 6.7 |
| PredSet4 | 15 | 10.4 | 10.1 | 6.8 | 6.2 |
| PredSet5 | 15 | 7.6 | 8.5 | 8.4 | 7.9 |

Table 6: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MLP cluster models and the ensemble model for test period 2001–2013. The minimum error is 4.0%.

| Predictor set | Parameter set | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | ParSet4 | 13.8 | 18.1 | 16.9 | 8.2 |
| PredSet2 | ParSet3 | 16.0 | 7.9 | 11.0 | 5.2 |
| PredSet3 | ParSet1 | 8.0 | 7.8 | 6.5 | 6.5 |
| PredSet4 | ParSet1 | 9.3 | 10.7 | 4.5 | 4.0 |
| PredSet5 | ParSet1 | 8.5 | 15.3 | 13.7 | 11.0 |

Figure 3: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its respective three cluster models by MR for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. (Axes: test years versus rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.)

5.2.4. Generalized Regression Neural Network Model (GRNN). The errors of the generalized regression neural network ensemble and individual cluster models, in terms of mean absolute error, are presented in Table 8. The model reports an error of 6.1% for PredSet3 (Table 2). Figure 6 shows the interannual variations of the ensemble forecast of rainfall by the GRNN ensemble model along with the actual rainfall pattern in terms of percentage of LPA for the period 2001–2013. It is observed that the predicted values are close to the actual rainfall patterns. Predictions by the models designed for the individual clusters are shown by different symbols.

Figure 4: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its respective three cluster models by MLP for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. (Axes: test years versus rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.)


Table 7: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual RNN cluster models and the ensemble model for test period 2001–2013. The minimum error is 5.1%.

| Predictor set | Parameter set | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | ParSet1 | 11.3 | 7.1 | 16.8 | 7.0 |
| PredSet2 | ParSet1 | 13.2 | 13.5 | 12.6 | 8.5 |
| PredSet3 | ParSet1 | 12.9 | 5.4 | 6.0 | 5.1 |
| PredSet4 | ParSet1 | 12.3 | 6.4 | 4.7 | 5.9 |
| PredSet5 | ParSet2 | 15.1 | 16.1 | 13.4 | 8.8 |

Table 8: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual GRNN cluster models and the ensemble model for test period 2001–2013. The minimum error is 6.1%.

| Predictor set | Training years | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | 20 | 10.0 | 7.6 | 7.6 | 6.4 |
| PredSet2 | 30 | 7.1 | 8.9 | 7.6 | 6.4 |
| PredSet3 | 20 | 5.8 | 9.2 | 6.0 | 6.1 |
| PredSet4 | 20 | 6.3 | 6.6 | 7.2 | 6.3 |
| PredSet5 | 25 | 7.1 | 9.4 | 11.9 | 6.6 |

Figure 5: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its respective three cluster models by RNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. (Axes: test years versus rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.)

5.3. Statistical Measures for Validation of Proposed Approach. Next, we validate the models in terms of other accuracy measures besides mean absolute error. Table 9 shows different forecast verification statistics for the ensemble models during the test period 2001–2013. We summarize the observations below.

(i) Root Mean Square Error (RMSE). The MLP ensemble model gives an RMSE of 5.3%, followed by the RNN ensemble model with 6.4%. The GRNN and MR models give RMSEs of 7.4% and 8.4%, respectively.

Figure 6: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its respective three cluster models by GRNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. (Axes: test years versus rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.)

(ii) Prediction Yield (PY). PY for the 5% error category of the MR, MLP, RNN, and GRNN ensemble models is 46%, 69%, 53%, and 46%, respectively. They give prediction yields of 76%, 92%, 92%, and 84% for the allowed error of 10%. Finally, at the 15% error category, the MR, MLP, RNN, and GRNN ensemble models give yields of 92%, 100%, 92%, and 100%, respectively. Thus, almost none of the predicted years shows an abrupt deviation from the corresponding actual rainfall pattern.


Table 9: Prediction evaluation statistics for ensemble models during test period 2001–2013 (Section 4.5).

| Verification measures | MR | MLP | RNN | GRNN |
|---|---|---|---|---|
| RMSE for forecast (%) | 8.4 | 5.3 | 6.4 | 7.4 |
| PY (%) at allowed error 5% | 46 | 69 | 53 | 46 |
| PY (%) at allowed error 10% | 76 | 92 | 92 | 84 |
| PY (%) at allowed error 15% | 92 | 100 | 92 | 100 |
| PC between actual and predicted rainfall | 0.61 | 0.81 | 0.71 | 0.49 |
| WI between actual and predicted rainfall | 0.71 | 0.89 | 0.81 | 0.62 |

Table 10: Comparison of absolute errors for rainfall prediction by the proposed ensemble models (Ensml) with clustering (WC) against the same models trained by the standard method without clustering (NC). Each NC column gives the total error (%) and each WC column the ensemble error (%).

| Predictor set | MR (NC) | MR (WC) | MLP (NC) | MLP (WC) | RNN (NC) | RNN (WC) | GRNN (NC) | GRNN (WC) |
|---|---|---|---|---|---|---|---|---|
| PredSet1 | 8.9 | 8.6 | 10.0 | 8.2 | 11.7 | 7.0 | 6.9 | 6.4 |
| PredSet2 | 9.2 | 8.2 | 12.8 | 5.2 | 10.7 | 8.5 | 7.2 | 6.4 |
| PredSet3 | 7.4 | 6.7 | 6.7 | 6.5 | 6.2 | 5.1 | 6.1 | 6.1 |
| PredSet4 | 6.7 | 6.2 | 5.8 | 4.0 | 6.0 | 5.5 | 6.3 | 6.3 |
| PredSet5 | 8.2 | 7.9 | 9.7 | 11.0 | 8.9 | 8.8 | 9.0 | 6.7 |

(iii) Pearson Correlation (PC). PCs of 0.61, 0.81, 0.71, and 0.49 are observed for prediction by the MR, MLP, RNN, and GRNN ensemble models, respectively. The rainfall predicted by the MLP ensemble model is highly correlated with the actual values, while the correlation for the GRNN forecast is the lowest.

(iv) Willmott Index of Agreement (WI). WI for the MR, MLP, RNN, and GRNN ensemble models is 0.71, 0.89, 0.81, and 0.62, respectively. The index shows that the agreement between actual and predicted rainfall is high for the MLP and RNN ensemble models.

All of the mentioned statistical measures (Table 9), as well as the mean absolute error (Table 6) in prediction of monsoon, ascertain the MLP model to be the best among all four proposed models.

5.4. Comparison of Results

5.4.1. Comparison with State-of-the-Art Methods. The proposed fuzzy clustering-based ensemble prediction models are compared with the models used by the Indian Meteorological Department (IMD), namely, the existing 16-parameter power regression model [4] and the 8- and 10-parameter models of Rajeevan et al. [5]. A test period of seven years, from 1996 to 2002, is considered. The IMD models give root mean square errors of 10.8%, 7.6%, and 6.4%, respectively. The MR, MLP, RNN, and GRNN ensemble models give root mean square errors of 6.0%, 3.4%, 4.4%, and 5.5%, respectively, outperforming all three IMD models. The results are shown as a bar graph in Figure 7.

5.4.2. Improvement of Cluster-Based Models over Conventional Models. The ensemble model error obtained by combining all clusters' model outputs is compared with the error obtained by the same model (with the same parameters) trained on the whole dataset without clustering. The mean absolute errors for the various combinations of models and predictor sets are shown in Table 10. The results show that the clustering and ensemble method improves on or matches the nonclustered conventional method for every combination except MLP with PredSet5.

Figure 7: Comparison of MR (grey), MLP (purple), RNN (light purple), and GRNN (deep purple) models with the existing IMD 16-param (deep blue), 10-param (blue), and 8-param (light blue) models for the time period 1996–2002 [4, 5]. Striped bars represent errors by our proposed models. (Axes: different models versus root mean square error as % of LPA.)


Table 11: Physical climatic events under study.

| Climatic event | Number of years | Years associated with the event |
|---|---|---|
| Drought | 13 | 1951, 1965, 1966, 1968, 1972, 1974, 1979, 1982, 1986, 1987, 2002, 2004, 2009 |
| Flood | 11 | 1953, 1956, 1958, 1959, 1961, 1964, 1970, 1975, 1983, 1988, 1994 |
| El-Nino | 23 | 1951, 1953, 1957, 1958, 1963, 1965, 1966, 1968, 1969, 1972, 1977, 1982, 1983, 1986, 1987, 1991, 1992, 1994, 1997, 2002, 2004, 2006, 2009 |
| La-Nina | 22 | 1950, 1954, 1955, 1956, 1964, 1970, 1971, 1973, 1974, 1975, 1984, 1985, 1988, 1989, 1995, 1998, 1999, 2000, 2007, 2008, 2010, 2011 |
| Positive IOD | 12 | 1957, 1961, 1963, 1967, 1972, 1977, 1982, 1983, 1994, 1997, 2006, 2007 |
| Negative IOD | 10 | 1958, 1960, 1964, 1971, 1974, 1975, 1989, 1992, 1993, 1996 |

Table 12: Threshold of support and confidence measures for associating obtained clusters with physical climatic events.

| Predictor set | Support threshold | Confidence threshold |
|---|---|---|
| PredSet1 | 0.37 | 0.30 |
| PredSet2 | 0.25 | 0.46 |
| PredSet3 | 0.21 | 0.43 |
| PredSet4 | 0.29 | 0.61 |
| PredSet5 | 0.21 | 0.54 |

5.5. Prediction of the Year 2014. The annual Indian summer monsoon rainfall for the year 2014 is 781.7 mm, which is 87.8% of the LPA value. The proposed clustering-based ensemble MR, MLP, RNN, and GRNN models predict the rainfall of 2014 as 96.1%, 80.3%, 80.0%, and 95.3% of LPA, respectively. Thus, the proposed models show an absolute error of 7.0% for forecasting the rainfall of 2014.

6. Meteorological Analysis

Next, we try to visualize each cluster in terms of physical climatic events. The clusters obtained by fuzzy clustering are physically interpreted as being characterized by some global climatic events. The climatic events considered and studied during the period 1948 to 2013 (the period considered for clustering in our work) are El-Nino, La-Nina (http://ggweather.com/enso/oni.htm), positive and negative Indian Ocean dipole (http://bom.gov.au/climate/IOD), drought, and flood, as shown in Table 11.

Figure 8 shows the El-Nino and La-Nina years associated with drought, normal, and excess rainfall years during 1948–2013. Years with rainfall more than 10% above LPA are excess rainfall years, and years with rainfall more than 10% below LPA are drought years. The El-Nino and La-Nina years are shown by color codes (light green and green) in the figure. The chart helps to visualize the co-occurrence of El-Nino and La-Nina events with the extremes of ISMR.

6.1. Measuring Association between Climatic Events and ISMR. Support and confidence measures are considered to relate physical climatic events to the clusters generated by fuzzy clustering. They are defined below.

Figure 8: El-Nino (light green) and La-Nina (green) years' association with drought (years more than 10% below LPA rainfall), normal (years between +10% and −10% of LPA rainfall), and excess (years more than 10% above LPA rainfall) years during the period 1948–2013. (Axes: years versus deviation of rainfall from LPA in %; legend: normal years, El-Nino years, La-Nina years.)

(i) Support. Support is defined as the fraction of the total number of years in the cluster that correspond to the climatic event:

$$\text{Support} = \frac{x_{\mathrm{ce}}}{N}, \tag{12}$$

where $x_{\mathrm{ce}}$ denotes the number of years associated with a specific climatic event in the cluster and $N$ is the total count of years in the cluster.

(ii) Confidence. Confidence is defined as the ratio of the number of years associated with the climatic event in the cluster to the total number of such event years:

$$\text{Confidence} = \frac{x_{\mathrm{ce}}}{T_{\mathrm{ce}}}, \tag{13}$$

where $T_{\mathrm{ce}}$ is the number of years associated with the climatic event during the period 1948–2013.
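Both measures in Equations (12) and (13) reduce to set intersections. In the sketch below the function name and the cluster membership are hypothetical; only the three drought years are taken from Table 11.

```python
def support_and_confidence(cluster_years, event_years):
    """Support (Eq. 12) and confidence (Eq. 13) of a climatic event for one cluster."""
    cluster = set(cluster_years)
    event = set(event_years)
    x_ce = len(cluster & event)  # event years that fall inside the cluster
    return x_ce / len(cluster), x_ce / len(event)

# Hypothetical cluster of four years checked against three drought years from Table 11.
example_support, example_confidence = support_and_confidence(
    [1951, 1965, 1970, 1975], [1951, 1965, 1966]
)
```

Here two of the four cluster years are drought years (support 0.5), and two of the three listed drought years fall in the cluster (confidence of about 0.67).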


Figure 9: Histogram of the confidence and support measures as bins of year-count before (a) and after (b) thresholding for PredSet1. (Axes in both panels: confidence, support, and year-count.)

Table 13: Identified physical climatic events associated with clusters obtained by fuzzy clustering.

Predictor Cluster1 Cluster2 Cluster3
PredSet1 Drought El-Nino La-Nina La-Nina
PredSet2 Flood La-Nina Drought Drought El-Nino La-Nina
PredSet3 El-Nino positive IOD Drought Drought El-Nino
PredSet4 La-Nina Flood La-Nina Drought
PredSet5 - Drought El-Nino Flood

We relate a cluster to a physical climatic event described in Table 11 if both the support and confidence measures attain the corresponding thresholds. The thresholds are chosen in such a way that 50% of the years of study are under consideration. A low threshold compromises the importance of a climatic event being related to a particular cluster; on the other hand, if fewer years are taken, the threshold values must be high, which in turn leaves out most of the clusters. Therefore, as a compromise between the extremes, 50% of the years are considered. Figure 9 shows histograms with confidence and support as bins of year-count before and after the thresholding process, respectively, for predictor set PredSet1 (Table 2). The threshold values obtained for the predictor sets are presented in Table 12. For each predictor set, we associate the clusters with physical climatic events if they satisfy both the support and confidence thresholds. The climatic events corresponding to each cluster are shown in Table 13. The results establish the coexistence of La-Nina and flood events. They also shed light on the high probability of El-Nino, drought, and positive IOD events occurring simultaneously.

7. Conclusion

Monsoon is an important phenomenon for the economic development of an agricultural land such as India. The large variability of monsoon over the years makes prediction of rainfall a challenging task. This paper attempts to address the problem by clustering the years into similar groups; finally, a multimodel ensemble forecast is provided for Indian summer monsoon rainfall.

Different climatic parameters, each with its best correlated month value, are identified, and five different predictor sets are built for prediction of Indian monsoon. Four different models, namely, MR, MLP, RNN, and GRNN, are designed for each cluster exclusively. The final forecast is provided by a weighted ensemble of the forecasts of each cluster's model, where the weight is the fuzzy membership of the test year in each cluster. The multilayer perceptron ensemble model provides a mean absolute error of 4.0% for prediction of annual rainfall, which is appreciable for forecasting the complex monsoon process. The proposed fuzzy clustering-based ensemble approach surpasses the conventional approach. The performance of the proposed clustering-based ensemble models is superior to that of the existing IMD models [4, 5]. The error statistics also ascertain the superiority of the multilayer perceptron model over the other three proposed models. Lastly, in the meteorological context, the clusters are linked with global climatic events.

In the future, a larger number of climatic parameters influencing Indian monsoon can be explored, and different predictor sets can be used for different clusters of years to provide even better forecasting accuracy.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.


Acknowledgment

This work is supported by the RBU project through the RESPOND program of ISRO through KCSTC, IIT Kharagpur.

References

[1] H. F. Blanford, “On the connexion of the Himalaya snowfall with dry winds and seasons of drought in India,” Proceedings of the Royal Society of London, vol. 37, no. 232–234, pp. 3–22, 1884.

[2] G. T. Walker, “Correlation in seasonal variations of weather, IV: a further study of world weather,” Memoirs of the India Meteorological Department, vol. 24, pp. 275–332, 1924.

[3] V. Thapliyal and S. M. Kulshrestha, “Recent models for long range forecasting of South-West monsoon rainfall in India,” Mausam, vol. 43, no. 3, pp. 239–248, 1992.

[4] V. Gowariker, V. Thapliyal, S. M. Kulshrestha, G. S. Mandal, N. Sen Roy, and D. R. Sikka, “A power regression model for long range forecast of southwest monsoon rainfall over India,” Mausam, vol. 42, no. 2, pp. 125–130, 1991.

[5] M. Rajeevan, D. S. Pai, S. K. Dikshit, and R. R. Kelkar, “IMD's new operational models for long-range forecast of southwest monsoon rainfall over India and their verification for 2003,” Current Science, vol. 86, no. 3, pp. 422–431, 2004.

[6] M. Rajeevan, D. S. Pai, R. A. Kumar, and B. Lal, “New statistical models for long-range forecasting of southwest monsoon rainfall over India,” Climate Dynamics, vol. 28, no. 7-8, pp. 813–828, 2007.

[7] J. Schewe and A. Levermann, “A statistically predictive model for future monsoon failure in India,” Environmental Research Letters, vol. 7, no. 4, Article ID 044023, 2012.

[8] Q. Wu, Y. Yan, and D. Chen, “A linear markov model for east asian monsoon seasonal forecast,” Journal of Climate, vol. 26, no. 14, pp. 5183–5195, 2013.

[9] K. Fan, Y. Liu, and H. Chen, “Improving the prediction of the east asian summer monsoon: new approaches,” Weather & Forecasting, vol. 27, no. 4, pp. 1017–1030, 2012.

[10] F. Mekanik, M. A. Imteaz, S. Gato-Trinidad, and A. Elmahdi, “Multiple regression and artificial neural network for long-term rainfall forecasting using large scale climate modes,” Journal of Hydrology, vol. 503, pp. 11–21, 2013.

[11] A. K. Sahai, M. K. Soman, and V. Satyan, “All India summer monsoon rainfall prediction using an artificial neural network,” Climate Dynamics, vol. 16, no. 4, pp. 291–302, 2000.

[12] W.-C. Hong, “Rainfall forecasting by technological machine learning models,” Applied Mathematics and Computation, vol. 200, no. 1, pp. 41–57, 2008.

[13] S. Chattopadhyay and G. Chattopadhyay, “Comparative study among different neural net learning algorithms applied to rainfall time series,” Meteorological Applications, vol. 15, no. 2, pp. 273–280, 2008.

[14] N. Acharya, S. C. Kar, M. A. Kulkarni, U. C. Mohanty, and L. N. Sahoo, “Multi-model ensemble schemes for predicting northeast monsoon rainfall over peninsular India,” Journal of Earth System Science, vol. 120, no. 5, pp. 795–805, 2011.

[15] V. R. Durai and R. Bhardwaj, “Improving precipitation forecasts skill over India using a multi-model ensemble technique,” Geofizika, vol. 30, no. 2, pp. 119–141, 2013.

[16] B. Parthasarathy, A. A. Munot, and D. R. Kothawale, “Monthly and seasonal rainfall series for All-India homogeneous regions and meteorological subdivisions: 1871–1994,” Tech. Rep. RR-065, Indian Institute of Tropical Meteorology, 1995.

[17] G. P. Compo, J. S. Whitaker, P. D. Sardeshmukh et al., “The twentieth century reanalysis project,” Quarterly Journal of the Royal Meteorological Society, vol. 137, no. 654, pp. 1–28, 2011.

[18] E. Kalnay, M. Kanamitsu, R. Kistler et al., “The NCEP/NCAR 40-year reanalysis project,” Bulletin of the American Meteorological Society, vol. 77, no. 3, pp. 437–471, 1996.

[19] E. M. Rasmusson and T. H. Carpenter, “Variations in tropical sea surface temperature and surface wind fields associated with the Southern Oscillation/El Niño,” Monthly Weather Review, vol. 110, no. 5, pp. 354–384, 1982.


Table 1: Climatic parameters (CP) influencing Indian monsoon, with geographical location, correlation values, and correlated month (0 signifies the same year and −1 signifies the previous year).

| CP | CP name | Location | Correlation value | Correlated month |
|---|---|---|---|---|
| CP1 | North Atlantic Ocean SST anomaly | 20°N–30°N, 100°W–80°W | 0.242 | Jan (0) |
| CP2 | North Atlantic Ocean surface pressure anomaly | 20°N–30°N, 100°W–80°W | 0.256 | April (0) |
| CP3 | East Asia SLP anomaly | 35°N–45°N, 120°E–130°E | 0.337 | May (0) |
| CP4 | East Asia surface pressure anomaly | 35°N–45°N, 120°E–130°E | 0.341 | Mar (0) |
| CP5 | Equatorial South Eastern Indian Ocean SST anomaly | 20°S–10°S, 100°E–120°E | 0.200 | Sept (−1) |
| CP6 | Pressure gradient between Madagascar and Tibet | - | 0.253 | May (0) |
| CP7 | Nino 3.4 SST anomaly | 5°S–5°N, 170°W–120°W | 0.311 | Sept (−1) |
| CP8 | Equatorial Pacific Ocean SLP anomaly | 5°S–5°N, 120°E–80°W | 0.272 | Aug (−1) |
| CP9 | North West Europe surface pressure anomaly | 55°N–65°N, 20°E–40°E | 0.183 | Jan (0) |
| CP10 | North Central Pacific zonal wind anomaly | 5°N–15°N, 180°E–150°W | 0.457 | May (0) |

Figure 1: Climatic parameters over the globe governing Indian monsoon (purple patches signify the locations of the climatic parameters taken, and the blue patch represents the Indian region). CP_i represents parameter i in Table 1.

Table 2: Predictor sets with climatic parameters.

| Predictor set | Climatic parameters |
|---|---|
| PredSet1 | CP1, CP4, CP5, CP6 |
| PredSet2 | CP4, CP5, CP6, CP7 |
| PredSet3 | CP2, CP4, CP10 |
| PredSet4 | CP2, CP4, CP7, CP10 |
| PredSet5 | CP3, CP7, CP8, CP9 |

4. Methodology

We propose fuzzy clustering of monsoon years into groups, followed by building a separate model for each group, and finally predicting Indian summer monsoon rainfall (ISMR) as a weighted ensemble of the forecasts provided by the cluster models. The block diagram of the proposed fuzzy clustering-based approach to prediction of ISMR is shown in Figure 2. Detailed steps are described in the following subsections.

4.1. Motivation: Variability of Monsoon Patterns. Trends and distributions of monsoon vary to a large extent over the years. It is thus necessary to group the years into clusters that have similar patterns of the predictor climatic parameters affecting monsoon. Clustering the years is effective because a separate model can be built for each cluster. These cluster models will be more accurate, as the variation within a cluster is smaller. Finally, an ensemble of the forecasts of these cluster models results in better prediction of Indian monsoon. As an example, consider two clusters of years corresponding to strong El-Nino and North Atlantic Oscillation, respectively. A drought year has correlation with both events and hence might have a significant degree of belongingness to both clusters.

4.2. Fuzzy Clustering of Monsoon Years. Fuzzy c-means clustering is used for grouping similar years together. Fuzzy c-means (FCM) is a clustering method that allows one input instance to belong to more than one cluster, each with some degree of membership. FCM attempts to partition a set of N elements $Y = \{y_1, \ldots, y_N\}$ into a collection of c fuzzy clusters with centers $C = \{\mathrm{cen}_1, \ldots, \mathrm{cen}_c\}$ and a partition matrix $W = [w_{ij}]$, $w_{ij} \in [0, 1]$, $i = 1, \ldots, N$, $j = 1, \ldots, c$, where $w_{ij}$ gives the degree of belongingness of element $y_i$ to the cluster with center $\mathrm{cen}_j$. FCM aims to minimize the objective function in (1). The partition matrix and the centers are updated in accordance with (2) and (3), respectively:

$$J_m = \sum_{i=1}^{N} \sum_{j=1}^{c} w_{ij}^{m} \left\| y_i - \mathrm{cen}_j \right\|^{2}, \quad 1 \le m < \infty, \tag{1}$$

$$w_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \left\| y_i - \mathrm{cen}_j \right\| / \left\| y_i - \mathrm{cen}_k \right\| \right)^{2/(m-1)}}, \tag{2}$$

$$\mathrm{cen}_j = \frac{\sum_{i=1}^{N} w_{ij}^{m} \cdot y_i}{\sum_{i=1}^{N} w_{ij}^{m}}, \tag{3}$$

where m denotes the level of cluster fuzziness.
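The alternating updates in (2) and (3) can be illustrated with a minimal sketch. It is not the implementation used in this study: the function name `fuzzy_c_means`, the random initialisation, the stopping tolerance, and the default m = 2 are assumptions made for illustration, and m must be greater than 1 for the exponent in (2) to be defined.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Return (centers, W); W[i][j] is the membership of points[i] in cluster j."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Assumed initialisation: random memberships, each row normalised to sum to 1.
    W = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        W.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Equation (3): each center is the w^m-weighted mean of the points.
        centers = []
        for j in range(c):
            weights = [W[i][j] ** m for i in range(n)]
            denom = sum(weights)
            centers.append([
                sum(weights[i] * points[i][d] for i in range(n)) / denom
                for d in range(dim)
            ])
        # Equation (2): memberships from the ratios of distances to the centers.
        new_W = []
        for y in points:
            # Distances are floored to avoid division by zero at a center.
            dist = [max(math.dist(y, cen), 1e-12) for cen in centers]
            new_W.append([
                1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                for j in range(c)
            ])
        shift = max(abs(new_W[i][j] - W[i][j]) for i in range(n) for j in range(c))
        W = new_W
        if shift < tol:
            break
    return centers, W
```

Each row of the returned matrix sums to 1, so a year that resembles two climatic patterns receives appreciable membership in both clusters rather than being forced into one.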

Figure 2: Proposed fuzzy clustering-based ensemble approach for prediction of Indian summer monsoon rainfall. (Blocks shown: input, data preprocessing, clustering of years, ensemble neural network model, validation, cluster interpretation, and output.)

4.3. Prediction Models. Multiple regression and three artificial neural network (ANN) models, namely, multilayer perceptron, recurrent neural network, and generalized regression neural network, are used to design prediction models for each cluster exclusively. The forecast of annual ISMR is provided by each cluster model separately and also by the ensemble of all the cluster models' forecasts. We describe the models used below.

4.3.1. Multiple Regression (MR). A multiple regression model is used to learn the relationship between several independent predictor variables ($X_i$s) and a dependent variable ($Y$).


Table 3: Model parameter settings for MLP models.

| Parameter set | Hidden layers | Training years | Training method |
|---|---|---|---|
| ParSet1 | [3, 5] | 20 | BFGS quasi-Newton backpropagation |
| ParSet2 | [3, 5, 10] | 15 | Conjugate gradient backpropagation with Powell-Beale restarts |
| ParSet3 | [5, 10] | 10 | Scaled conjugate gradient backpropagation |
| ParSet4 | [3, 5] | 15 | Resilient backpropagation |

A multiple regression model with p independent variables is shown in (4):

$$y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i, \tag{4}$$

where $x_{ij}$ is the ith observation of the jth independent variable, the first independent variable takes the value 1 for all i, and $\varepsilon_i$ represents the residual.
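As an illustration of how the coefficients in (4) can be estimated, the sketch below fits them by ordinary least squares through the normal equations. The fitting procedure is not specified in the text, so least squares is an assumption here, as are the function names `fit_multiple_regression` and `predict`. The sketch also assumes that the predictor columns are linearly independent.

```python
def fit_multiple_regression(X, y):
    """Least-squares coefficients for y_i = sum_j beta_j * x_ij.

    Each row of X should start with 1.0 so that beta_1 acts as the intercept.
    """
    n, p = len(X), len(X[0])
    # Augmented normal equations [X^T X | X^T y].
    A = [
        [sum(X[i][r] * X[i][s] for i in range(n)) for s in range(p)]
        + [sum(X[i][r] * y[i] for i in range(n))]
        for r in range(p)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for s in range(col, p + 1):
                A[r][s] -= factor * A[col][s]
    # Back substitution.
    beta = [0.0] * p
    for r in range(p - 1, -1, -1):
        known = sum(A[r][s] * beta[s] for s in range(r + 1, p))
        beta[r] = (A[r][p] - known) / A[r][r]
    return beta


def predict(beta, x_row):
    """Evaluate the fitted model for one row of predictors (leading 1.0 included)."""
    return sum(b * x for b, x in zip(beta, x_row))
```

In the setting of this paper, each row of `X` would hold the leading 1 followed by the climatic parameters of one training year in a cluster, and `y` would hold the corresponding ISMR values.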

4.3.2. Multilayer Perceptron Neural Network (MLP). A multilayer perceptron neural network is a class of ANN in which the connections between neurons do not form a directed cycle. In this network, information propagates in only one direction, from the input nodes through the hidden nodes to the output nodes. The independent and dependent variables constitute the input and output layers, respectively. The number of hidden layers and their corresponding nodes must be determined empirically for each prediction task. Four different parameter sets, chosen empirically, are considered for the models designed to forecast ISMR; they are shown in Table 3.

4.3.3. Recurrent Neural Network (RNN). A recurrent neural network is a class of ANN that maintains an internal state of the network to exhibit dynamic temporal behaviour. Climatic changes or events occurring in the same or nearby time periods are highly correlated. Similarly, rainfall patterns are more correlated with influencing factors in nearby years than with those in distant years. This phenomenon is well captured by an RNN, which gives weights in decreasing order to the values from near to distant years during training of the network. Thus, it assists in modelling the system dynamics in a more natural manner. The same parameter sets as for the MLP network (Table 3) are considered, with a delay span of 2 units.

4.3.4. Generalized Regression Neural Network (GRNN). A generalized regression neural network is a variant of the radial basis function network. GRNN has three layers of artificial neurons: input, hidden, and output. The hidden layer has radial basis neurons, while the neurons in the output layer have a linear transfer function. The output of the radial basis neurons is the input scaled by the spread factor. Consider p input-output pairs $(x_i, y_i) \in \mathbb{R}^n \times \mathbb{R}^1$, $i = 1, 2, \ldots, p$, with n input variables, where $y_i$ represents the output from the ith hidden unit. The GRNN output for a test point $x \in \mathbb{R}^n$ is described by

$$y(x) = \sum_{i=1}^{p} W_i \, y_i, \tag{5}$$

where

$$W_i = \frac{\exp\left(-\left\| x - x_i \right\|^{2} / 2\sigma^{2}\right)}{\sum_{k=1}^{p} \exp\left(-\left\| x - x_k \right\|^{2} / 2\sigma^{2}\right)} \tag{6}$$

and $\sigma$ is the spread factor.

The reasons for modelling with GRNN are that (i) it has only one tunable design parameter (the spread factor), (ii) it is a one-pass algorithm (less time consuming), and (iii) it accurately approximates functions from sparse data.
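The one-pass nature of GRNN can be seen in a minimal sketch of (5) and (6): prediction is a kernel-weighted average of the stored training outputs, and the spread factor is the only value to tune. The function name `grnn_predict` and its argument layout are assumptions made for illustration.

```python
import math


def grnn_predict(train_x, train_y, x, sigma):
    """Equations (5) and (6): kernel-weighted average of the training outputs."""
    kernels = [
        math.exp(-math.dist(x, xi) ** 2 / (2.0 * sigma ** 2))
        for xi in train_x
    ]
    # If x is very far from every training point relative to sigma, the kernels
    # can underflow to zero; this sketch does not guard against that case.
    total = sum(kernels)
    return sum(k * yi for k, yi in zip(kernels, train_y)) / total
```

A small spread factor makes the forecast follow the nearest training year, while a large one pulls it toward the mean of all training outputs.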

The optimal number of training years is ascertained for the MR and GRNN models by varying the training years from 5 to 30 and selecting the value with the least absolute error in prediction during the validation period (1984–1993). A training of m years specifies that, for predicting the rainfall of the rth year, those of the preceding m years r − 1, r − 2, …, r − m that are present in a particular cluster are considered for training.
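One possible reading of this windowing rule is sketched below. It assumes that only cluster members falling inside the preceding m calendar years are used, and that the window length is chosen by the mean absolute validation error. The names `training_years` and `best_window`, and the callback `error_for`, are hypothetical and do not come from the study.

```python
def training_years(cluster_years, target_year, m):
    """Years in [target_year - m, target_year - 1] that belong to the cluster."""
    members = set(cluster_years)
    return [yr for yr in range(target_year - m, target_year) if yr in members]


def best_window(cluster_years, validation_years, error_for, candidates=range(5, 31)):
    """Pick the window length with the lowest mean absolute validation error.

    error_for(train_years, year) is a hypothetical callback that fits a model
    on train_years and returns its absolute error for the given year.
    """
    def mean_error(m):
        errs = [
            error_for(training_years(cluster_years, yr, m), yr)
            for yr in validation_years
        ]
        return sum(errs) / len(errs)

    return min(candidates, key=mean_error)
```

For example, `training_years([1995, 1997, 1998, 2000, 2001], 2001, 5)` returns `[1997, 1998, 2000]`, because 1995 lies outside the five-year window and 2001 is the target year itself.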

4.4. Ensemble of Predictors. The complexity of the monsoon process makes it difficult for a single model to predict rainfall accurately. We design separate models for each cluster of years obtained by fuzzy clustering, using the four predictors described in Section 4.3. Finally, annual ISMR is presented as a weighted ensemble of the forecasts of the models designed for each cluster. The weight is taken as the fuzzy membership of the test year in each cluster:

$$\text{Ensemble prediction}_t = \sum_{i=1}^{c} W_i^{t} \cdot P_i, \tag{7}$$

where $P_i$ represents the prediction given by the model for cluster i, $W_i^{t}$ is the fuzzy membership of the tth test year in cluster i, and c is the total number of clusters.
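Equation (7) amounts to a membership-weighted sum, as the short sketch below shows. The function name `ensemble_forecast` and the numbers in the usage note are illustrative assumptions, not values from the study.

```python
def ensemble_forecast(memberships, cluster_predictions):
    """Equation (7): sum over clusters of W_i^t * P_i for one test year."""
    if len(memberships) != len(cluster_predictions):
        raise ValueError("one membership value is needed per cluster prediction")
    return sum(w * p for w, p in zip(memberships, cluster_predictions))
```

With hypothetical memberships 0.2, 0.5, and 0.3 and cluster forecasts of 90, 100, and 110 (percent of LPA), the ensemble forecast is 0.2 × 90 + 0.5 × 100 + 0.3 × 110 = 101.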

4.5. Validation of Proposed Approach. The study is performed on data for the period 1948–2013. Fuzzy clustering is performed over this period to group the years into three clusters. The number of clusters is decided based on cluster quality. Separate prediction models are designed for all three clusters, and the ensemble of the forecasts of these models is provided as the predicted Indian summer monsoon rainfall. The test period 2001–2013 is considered to evaluate the forecasting skill of our proposed approach.

The forecast models for annual ISMR are chiefly evaluated in terms of mean absolute error. Other error statistics, namely, root mean square error, prediction yield, Pearson correlation, and Willmott index of agreement, are also evaluated to judge the efficacy of our proposed approach. They are described below.


(i) Mean Absolute Error (MAE). The mean absolute error for prediction of annual ISMR is calculated as

$$\mathrm{MAE} = \frac{\sum_{i=1}^{N} \left| Y_i - X_i \right|}{N}, \tag{8}$$

where X and Y are the actual and predicted ISMR series for the test period and N denotes the total number of test years.

(ii) Root Mean Square Error (RMSE). The root mean square error measures the differences between the model-predicted output and the actual values. It is a good measure for comparing the forecasting errors of various models:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} \left( Y_i - X_i \right)^{2}}{N}}. \tag{9}$$

(iii) Prediction Yield (PY). Prediction yields are evaluated at three different error categories (5%, 10%, and 15% errors) to assess the overall prediction results by judging the percentage of predicted years within each allowed range of error.

(iv) Pearson Correlation Coefficient (PC). The Pearson correlation coefficient measures the strength of the linear association between actual and predicted values, where a value of 1 means a perfect positive correlation and a value of −1 means a perfect negative correlation:

$$\mathrm{PC} = \frac{\sum_{i=1}^{N} \left( X_i - \bar{X} \right) \left( Y_i - \bar{Y} \right)}{\sqrt{\sum_{i=1}^{N} \left( X_i - \bar{X} \right)^{2}} \sqrt{\sum_{i=1}^{N} \left( Y_i - \bar{Y} \right)^{2}}}, \tag{10}$$

where X and Y are the actual and predicted ISMR series for the test period and $\bar{X}$ and $\bar{Y}$ are their corresponding means.

(v) Willmott Index of Agreement (WI). The Willmott index of agreement is a standardized measure of the degree of model prediction error. It varies between 0 and 1, with higher values indicating a better fit of the model for prediction:

$$\text{Index of agreement} = 1 - \frac{\sum_{i=1}^{N} \left| X_i - Y_i \right|^{2}}{\sum_{i=1}^{N} \left( \left| Y_i - \bar{X} \right| + \left| X_i - \bar{X} \right| \right)^{2}}. \tag{11}$$
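The five verification measures can be computed directly from (8) to (11), as in the sketch below. The function names are assumptions, and the prediction yield assumes that both series are already expressed as percent of LPA, so that the allowed error is an absolute difference in percentage points; the text does not state this convention explicitly.

```python
import math


def mae(actual, predicted):
    """Equation (8)."""
    return sum(abs(y - x) for x, y in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Equation (9)."""
    return math.sqrt(sum((y - x) ** 2 for x, y in zip(actual, predicted)) / len(actual))


def prediction_yield(actual, predicted, allowed_pct):
    """Percentage of years whose absolute error is within allowed_pct (assumed convention)."""
    hits = sum(1 for x, y in zip(actual, predicted) if abs(y - x) <= allowed_pct)
    return 100.0 * hits / len(actual)


def pearson(actual, predicted):
    """Equation (10)."""
    mx = sum(actual) / len(actual)
    my = sum(predicted) / len(predicted)
    cov = sum((x - mx) * (y - my) for x, y in zip(actual, predicted))
    sx = math.sqrt(sum((x - mx) ** 2 for x in actual))
    sy = math.sqrt(sum((y - my) ** 2 for y in predicted))
    return cov / (sx * sy)


def willmott(actual, predicted):
    """Equation (11); deviations are taken from the mean of the actual series."""
    mx = sum(actual) / len(actual)
    num = sum((x - y) ** 2 for x, y in zip(actual, predicted))
    den = sum((abs(y - mx) + abs(x - mx)) ** 2 for x, y in zip(actual, predicted))
    return 1.0 - num / den
```

For a hypothetical actual series (100, 90, 110, 95) and forecast (98, 94, 104, 95), the absolute errors are 2, 4, 6, and 0, so the MAE is 3.0 and the prediction yield at an allowed error of 5 is 75%.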

5. Experimental Results and Analysis

In this section, we present the evaluation of our proposed fuzzy clustering-based approach. We first present the results of fuzzy clustering of the monsoon years for different predictor sets. Forecasting skill is evaluated for all cluster models and the ensemble model in terms of mean absolute error for the test period 2001–2013. In addition, other measures, such as root mean square error in prediction, correlation between predicted and actual rainfall, prediction yields, and agreement index between actual and predicted rainfall, are also estimated to establish the efficiency of our proposed approach to prediction of Indian summer monsoon rainfall.

Table 4: Cluster size (number of years) by fuzzy c-means clustering with α-cut of 0.3 over the period 1948–2013.

| Predictor set | Cluster1 | Cluster2 | Cluster3 |
|---|---|---|---|
| PredSet1 | 16 | 38 | 30 |
| PredSet2 | 30 | 17 | 40 |
| PredSet3 | 32 | 14 | 38 |
| PredSet4 | 42 | 31 | 21 |
| PredSet5 | 15 | 37 | 26 |

5.1. Clustering of Monsoon Years. Fuzzy clustering is performed over the period 1948–2013 to group the data into three clusters. We apply an α-cut with α = 0.3 to assign the data instances to the clusters. The value is ascertained empirically such that the distribution of elements among the clusters is regular. A data instance can be assigned to more than one cluster simultaneously. The cluster sizes for the various predictor sets are shown in Table 4.
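The α-cut assignment can be sketched as a simple threshold on the membership matrix. The function name `alpha_cut`, the dictionary layout, and the choice of "greater than or equal to" at the threshold are assumptions made for illustration.

```python
def alpha_cut(memberships, alpha=0.3):
    """Map each cluster index to the items whose membership is at least alpha.

    memberships maps an item (for example a year) to its list of fuzzy
    memberships, one value per cluster.
    """
    n_clusters = len(next(iter(memberships.values())))
    clusters = {j: [] for j in range(n_clusters)}
    for item, row in memberships.items():
        for j, w in enumerate(row):
            if w >= alpha:
                clusters[j].append(item)
    return clusters
```

Because a year can pass the threshold for more than one cluster, the cluster sizes need not add up to the number of years. This is consistent with Table 4, where each row sums to more than the 66 years in 1948–2013.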

5.2. Prediction Accuracy. We predict annual rainfall for all five predictor sets (Table 2) separately using four models, namely, MR, MLP, RNN, and GRNN. The test period is 2001 to 2013.

5.2.1. Multiple Regression Model (MR). Multiple regression models are built for every cluster by ascertaining the optimal training period for each predictor set. The optimal training period is evaluated by varying the training years and validating them for the least absolute error in prediction during the validation period (1984–1993). Individual cluster-based models as well as weighted ensemble models are considered for prediction. Table 5 gives the mean absolute errors of the individual cluster-based and ensemble models for the test period 2001–2013. The ensemble model provides a minimum mean absolute error of 6.2% for PredSet4 (Table 2). The ensemble model outperforms all the single cluster models for PredSet1 and PredSet4 and is within one percentage point of the best single cluster model for the remaining predictor sets. Figure 3 shows the interannual variability of the actual and ensemble-predicted rainfall as percent of long period average (LPA).

5.2.2. Multilayer Perceptron Neural Network Model (MLP). The multilayer perceptron neural network model is designed with the four different parameter sets described in Table 3. The mean absolute errors of all cluster and ensemble models are shown in Table 6. The MLP model reports an error of 4.0% for PredSet4 (Table 2) with MLP parameters ParSet1 (Table 3). The actual rainfall and the rainfall predicted by the cluster models and the ensemble model are shown in Figure 4. The ensemble-predicted rainfall closely follows the actual rainfall.

Table 5: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MR cluster models and ensemble model for test period 2001–2013. Reports minimum error of 6.2%.

| Predictor set | Training years | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | 20 | 9.4 | 9.3 | 10.9 | 8.6 |
| PredSet2 | 20 | 11.0 | 7.5 | 9.4 | 8.3 |
| PredSet3 | 15 | 10.9 | 6.5 | 9.2 | 6.7 |
| PredSet4 | 15 | 10.4 | 10.1 | 6.8 | 6.2 |
| PredSet5 | 15 | 7.6 | 8.5 | 8.4 | 7.9 |

Table 6: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MLP cluster models and ensemble model for test period 2001–2013. Reports minimum error of 4.0%.

| Predictor set | Parameter set | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | ParSet4 | 13.8 | 18.1 | 16.9 | 8.2 |
| PredSet2 | ParSet3 | 16.0 | 7.9 | 11.0 | 5.2 |
| PredSet3 | ParSet1 | 8.0 | 7.8 | 6.5 | 6.5 |
| PredSet4 | ParSet1 | 9.3 | 10.7 | 4.5 | 4.0 |
| PredSet5 | ParSet1 | 8.5 | 15.3 | 13.7 | 11.0 |

Figure 3: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three respective cluster models by MR for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by the individual cluster models. The results are shown for the test period 2001–2013. (Axes: test years; rainfall as % of LPA.)

5.2.3. Recurrent Neural Network Model (RNN). The mean absolute errors for prediction of annual rainfall by the recurrent neural network model for the test period 2001–2013 are presented in Table 7. PredSet3 (Table 2) with RNN parameters ParSet1 (Table 3) gives an error of 5.1%. RNN gives the training years weights that decrease with their distance from the test year. The pattern of actual and ensemble-predicted rainfall in terms of percentage of LPA is shown in Figure 5.

Figure 4: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three respective cluster models by MLP for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by the individual cluster models. The results are shown for the test period 2001–2013. (Axes: test years; rainfall as % of LPA.)

5.2.4. Generalized Regression Neural Network Model (GRNN). The errors of the generalized regression neural network ensemble and individual cluster models, in terms of mean absolute error, are presented in Table 8. The model reports an error of 6.1% for PredSet3 (Table 2). Figure 6 shows the interannual variations of the rainfall forecast by the GRNN ensemble model along with the actual rainfall pattern, in terms of percentage of LPA, for the period 2001–2013. It is observed that the predicted values are close to the actual rainfall patterns. Predictions by the models designed for the individual clusters are shown by different symbols.


Table 7: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual RNN cluster models and ensemble model for test period 2001–2013. Reports minimum error of 5.1%.

| Predictor set | Parameter set | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | ParSet1 | 11.3 | 7.1 | 16.8 | 7.0 |
| PredSet2 | ParSet1 | 13.2 | 13.5 | 12.6 | 8.5 |
| PredSet3 | ParSet1 | 12.9 | 5.4 | 6.0 | 5.1 |
| PredSet4 | ParSet1 | 12.3 | 6.4 | 4.7 | 5.9 |
| PredSet5 | ParSet2 | 15.1 | 16.1 | 13.4 | 8.8 |

Table 8: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual GRNN cluster models and ensemble model for test period 2001–2013. Reports minimum error of 6.1%.

| Predictor set | Training years | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | 20 | 10.0 | 7.6 | 7.6 | 6.4 |
| PredSet2 | 30 | 7.1 | 8.9 | 7.6 | 6.4 |
| PredSet3 | 20 | 5.8 | 9.2 | 6.0 | 6.1 |
| PredSet4 | 20 | 6.3 | 6.6 | 7.2 | 6.3 |
| PredSet5 | 25 | 7.1 | 9.4 | 11.9 | 6.6 |

Figure 5: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three respective cluster models by RNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by the individual cluster models. The results are shown for the test period 2001–2013. (Axes: test years; rainfall as % of LPA.)

Figure 6: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three respective cluster models by GRNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by the individual cluster models. The results are shown for the test period 2001–2013. (Axes: test years; rainfall as % of LPA.)

5.3. Statistical Measures for Validation of Proposed Approach. Next, we validate the models in terms of other accuracy measures besides mean absolute error. Table 9 shows different forecast verification statistics for the ensemble models during the test period 2001–2013. We summarize the observations below.

(i) Root Mean Square Error (RMSE). The MLP ensemble model gives an RMSE of 5.3%, followed by the RNN ensemble model with 6.4%. The GRNN and MR models give RMSEs of 7.4% and 8.4%, respectively.

(ii) Prediction Yield (PY). PY for the 5% error category of the MR, MLP, RNN, and GRNN ensemble models is 46%, 69%, 53%, and 46%, respectively. They give prediction yields of 76%, 92%, 92%, and 84% for the allowed error category of 10%. Finally, at the 15% error category, the MR, MLP, RNN, and GRNN ensemble models give yields of 92%, 100%, 92%, and 100%, respectively. Thus, the MLP and GRNN forecasts stay within 15% of the actual rainfall in every test year, and the MR and RNN forecasts do so in all but one of the 13 test years.

(iii) Pearson Correlation (PC). PC of 0.61, 0.81, 0.71, and 0.49 is observed for prediction by the MR, MLP, RNN, and GRNN ensemble models, respectively. The rainfall predicted by the MLP ensemble model is highly correlated with the actual values, while the correlation for the GRNN forecast is the lowest.

(iv) Willmott Index of Agreement (WI). WI for the MR, MLP, RNN, and GRNN ensemble models is 0.71, 0.89, 0.81, and 0.62, respectively. The index shows that the agreement between actual and predicted rainfall is high for the MLP and RNN ensemble models.

All of the mentioned statistical measures (Table 9), as well as the mean absolute error (Table 6) in prediction of monsoon, show the MLP model to be the best among all four proposed models.

Table 9: Prediction evaluation statistics for ensemble models during test period 2001–2013 (Section 4.5).

| Verification measure | MR | MLP | RNN | GRNN |
|---|---|---|---|---|
| RMSE for forecast (%) | 8.4 | 5.3 | 6.4 | 7.4 |
| PY (%) at allowed error 5% | 46 | 69 | 53 | 46 |
| PY (%) at allowed error 10% | 76 | 92 | 92 | 84 |
| PY (%) at allowed error 15% | 92 | 100 | 92 | 100 |
| PC between actual and predicted rainfall | 0.61 | 0.81 | 0.71 | 0.49 |
| WI between actual and predicted rainfall | 0.71 | 0.89 | 0.81 | 0.62 |

Table 10: Comparison of absolute errors for rainfall prediction by the proposed ensemble models (Ensml) with clustering (WC) against the standard method using the same models without clustering (NC).

| Predictor set | MR total error (NC) (%) | MR Ensml error (WC) (%) | MLP total error (NC) (%) | MLP Ensml error (WC) (%) | RNN total error (NC) (%) | RNN Ensml error (WC) (%) | GRNN total error (NC) (%) | GRNN Ensml error (WC) (%) |
|---|---|---|---|---|---|---|---|---|
| PredSet1 | 8.9 | 8.6 | 10.0 | 8.2 | 11.7 | 7.0 | 6.9 | 6.4 |
| PredSet2 | 9.2 | 8.2 | 12.8 | 5.2 | 10.7 | 8.5 | 7.2 | 6.4 |
| PredSet3 | 7.4 | 6.7 | 6.7 | 6.5 | 6.2 | 5.1 | 6.1 | 6.1 |
| PredSet4 | 6.7 | 6.2 | 5.8 | 4.0 | 6.0 | 5.5 | 6.3 | 6.3 |
| PredSet5 | 8.2 | 7.9 | 9.7 | 11.0 | 8.9 | 8.8 | 9.0 | 6.7 |

5.4. Comparison of Results

5.4.1. Comparison with State-of-the-Art Methods. The proposed fuzzy clustering-based ensemble prediction models are compared with the models used by the Indian Meteorological Department (IMD), namely, the existing 16-parameter power regression model [4] and the 8- and 10-parameter models of Rajeevan et al. [5]. A test period of seven years, from 1996 to 2002, is considered. The IMD models give root mean square errors of 10.8%, 7.6%, and 6.4%, respectively. The MR, MLP, RNN, and GRNN ensemble models give root mean square errors of 6.0%, 3.4%, 4.4%, and 5.5%, respectively, outperforming all three IMD models. The results are shown as a bar graph in Figure 7.

Figure 7: Comparison of MR (grey), MLP (purple), RNN (light purple), and GRNN (deep purple) models with the existing IMD 16-param (deep blue), 10-param (blue), and 8-param (light blue) models for the time period 1996–2002 [4, 5]. Striped bars represent errors by our proposed models. (Axes: different models; root mean square error as % of LPA.)

5.4.2. Improvement of Cluster-Based Models over Conventional Models. The ensemble model error obtained by combining the outputs of all cluster models is compared with the error obtained by the same model (with the same parameters) trained on the whole dataset without clustering. The mean absolute errors for the various combinations of models and predictor sets are shown in Table 10. The clustering and ensemble method matches or improves on the nonclustered conventional method for every combination except MLP with PredSet5.


Table 11: Physical climatic events under study.

| Climatic event | Number of years | Years associated with the event |
|---|---|---|
| Drought | 13 | 1951, 1965, 1966, 1968, 1972, 1974, 1979, 1982, 1986, 1987, 2002, 2004, 2009 |
| Flood | 11 | 1953, 1956, 1958, 1959, 1961, 1964, 1970, 1975, 1983, 1988, 1994 |
| El-Nino | 23 | 1951, 1953, 1957, 1958, 1963, 1965, 1966, 1968, 1969, 1972, 1977, 1982, 1983, 1986, 1987, 1991, 1992, 1994, 1997, 2002, 2004, 2006, 2009 |
| La-Nina | 22 | 1950, 1954, 1955, 1956, 1964, 1970, 1971, 1973, 1974, 1975, 1984, 1985, 1988, 1989, 1995, 1998, 1999, 2000, 2007, 2008, 2010, 2011 |
| Positive IOD | 12 | 1957, 1961, 1963, 1967, 1972, 1977, 1982, 1983, 1994, 1997, 2006, 2007 |
| Negative IOD | 10 | 1958, 1960, 1964, 1971, 1974, 1975, 1989, 1992, 1993, 1996 |

Table 12: Thresholds of support and confidence measures for associating the obtained clusters with physical climatic events.

| Predictor set | Support threshold | Confidence threshold |
|---|---|---|
| PredSet1 | 0.37 | 0.30 |
| PredSet2 | 0.25 | 0.46 |
| PredSet3 | 0.21 | 0.43 |
| PredSet4 | 0.29 | 0.61 |
| PredSet5 | 0.21 | 0.54 |

5.5. Prediction of the Year 2014. The annual Indian summer monsoon rainfall for the year 2014 is 781.7 mm, which is 87.8% of the LPA value. The proposed clustering-based ensemble MR, MLP, RNN, and GRNN models predict the rainfall of 2014 as 96.1%, 80.3%, 80.0%, and 95.3% of LPA, respectively. Thus, the proposed models show an absolute error of 7.0% for forecasting the rainfall of 2014.

6. Meteorological Analysis

Next, we try to visualize each cluster in terms of physical climatic events. The clusters obtained by fuzzy clustering are physically interpreted as being characterized by some global climatic events. The climatic events considered and studied during the period 1948 to 2013 (the period considered for clustering in our work) are El-Nino, La-Nina (http://ggweather.com/enso/oni.htm), positive and negative Indian Ocean dipole (http://bom.gov.au/climate/IOD), drought, and flood, shown in Table 11.

Figure 8 shows the El-Nino and La-Nina years associated with drought, normal, and excess rainfall years during 1948–2013. Years with rainfall more than 10% above LPA are excess rainfall years, and years with rainfall more than 10% below LPA are drought years. The El-Nino and La-Nina years are shown by color codes (light green and green) in the figure. The chart helps to visualize the cooccurrence of El-Nino and La-Nina events with extremes of ISMR.

Figure 8: Association of El-Nino (light green) and La-Nina (green) years with drought (years more than 10% below LPA rainfall), normal (years between +10% and −10% of LPA rainfall), and excess (years more than 10% above LPA rainfall) years during the period 1948–2013. (Axes: years; deviation of rainfall from % of LPA.)

6.1. Measuring Association between Climatic Events and ISMR. Support and confidence measures are considered to relate physical climatic events to the clusters generated by fuzzy clustering. They are defined below.

(i) Support. Support is defined as the proportion of the years in the cluster that are associated with the climatic event:

$$\text{Support} = \frac{x_{ce}}{N}, \tag{12}$$

where $x_{ce}$ denotes the number of years in the cluster associated with a specific climatic event and N is the total count of years in the cluster.

(ii) Confidence. Confidence is defined as the proportion of all the years associated with the climatic event that fall in the cluster:

$$\text{Confidence} = \frac{x_{ce}}{T_{ce}}, \tag{13}$$

where $T_{ce}$ is the number of years associated with the climatic event during the period 1948–2013.
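Both measures in (12) and (13) reduce to set intersections, as the following sketch shows. The function name `support_confidence` is an assumption, and the cluster used in the usage note is hypothetical; only the drought years are taken from Table 11.

```python
def support_confidence(cluster_years, event_years):
    """Equations (12) and (13) for one cluster and one climatic event."""
    cluster = set(cluster_years)
    event = set(event_years)
    x_ce = len(cluster & event)          # event years that fall in the cluster
    support = x_ce / len(cluster)        # share of the cluster covered by the event
    confidence = x_ce / len(event)       # share of the event years captured
    return support, confidence
```

For a hypothetical cluster containing 1951, 1965, 1966, 1970, 1975, 1980, 1982, and 1990, four of the eight years are drought years from Table 11, which gives a support of 4/8 = 0.5 and a confidence of 4/13 ≈ 0.31.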

Figure 9: Histogram of the confidence and support measures as bins of year-count before (a) and after (b) thresholding for PredSet1. (Axes: confidence; support; year-count.)

Table 13: Identified physical climatic events associated with the clusters obtained by fuzzy clustering.

Predictor: Cluster1 Cluster2 Cluster3
PredSet1: Drought El-Nino La-Nina La-Nina
PredSet2: Flood La-Nina Drought Drought El-Nino La-Nina
PredSet3: El-Nino positive IOD Drought Drought El-Nino
PredSet4: La-Nina Flood La-Nina Drought
PredSet5: - Drought El-Nino Flood

We relate a cluster to a physical climatic event described in Table 11 if both the support and confidence measures attain the corresponding thresholds. The thresholds are chosen such that 50% of the years of study are under consideration. A low threshold compromises the importance of a climatic event being related to a particular cluster; on the other hand, if fewer years are to be retained, the threshold values must be high, which in turn leaves out most of the clusters. Therefore, as a compromise between these extremes, 50% of the years are considered. Figure 9 shows histograms with confidence and support as bins of year-count before and after the thresholding process, respectively, for predictor set PredSet1 (Table 2). The threshold values obtained for the predictor sets are presented in Table 12. For each predictor set, we associate the clusters with physical climatic events if they satisfy both the support and confidence thresholds. The climatic events corresponding to each cluster are shown in Table 13. The results establish the coexistence of La-Nina and flood events. They also shed light on the high probability of El-Nino, drought, and positive IOD events occurring simultaneously.

7. Conclusion

Monsoon is an important phenomenon for the economic development of an agricultural country like India. The large variability of monsoon over the years makes prediction of rainfall a challenging task. This paper attempts to address the problem by clustering the years into similar groups and then providing a multimodel ensemble forecast of Indian summer monsoon rainfall.

Different climatic parameters, each taken at its best-correlated month, are identified, and five different predictor sets are built for prediction of Indian monsoon. Four different models, namely, MR, MLP, RNN, and GRNN, are designed for each cluster exclusively. The final forecast is provided by a weighted ensemble of the forecasts of each cluster's model, where the weight is the fuzzy membership of the test year in each cluster. The multilayer perceptron ensemble model provides a mean absolute error of 4.0% for prediction of annual rainfall, which is appreciable for forecasting the complex monsoon process. The proposed fuzzy clustering-based ensemble approach surpasses the conventional approach. The performance of the proposed clustering-based ensemble models is superior to that of the existing IMD models [4, 5]. The error statistics also ascertain the superiority of the multilayer perceptron model over the other three proposed models. Lastly, in a meteorological context, the clusters are linked with global climatic events.

In the future large number of climatic parametersinfluencing Indian monsoon can be explored and differentpredictor set can be used for different clusters of years toprovide even better forecasting accuracy

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

12 Advances in Meteorology

Acknowledgment

This work is supported by the RBU project through the RESPOND program of ISRO through KCSTC, IIT Kharagpur.

References

[1] H. F. Blanford, "On the connexion of the Himalaya snowfall with dry winds and seasons of drought in India," Proceedings of the Royal Society of London, vol. 37, no. 232–234, pp. 3–22, 1884.

[2] G. T. Walker, "Correlation in seasonal variations of weather—IV: a further study of world weather," Memoirs of the India Meteorological Department, vol. 24, pp. 275–332, 1924.

[3] V. Thapliyal and S. M. Kulshrestha, "Recent models for long range forecasting of South-West monsoon rainfall in India," Mausam, vol. 43, no. 3, pp. 239–248, 1992.

[4] V. Gowariker, V. Thapliyal, S. M. Kulshrestha, G. S. Mandal, N. Sen Roy, and D. R. Sikka, "A power regression model for long range forecast of southwest monsoon rainfall over India," Mausam, vol. 42, no. 2, pp. 125–130, 1991.

[5] M. Rajeevan, D. S. Pai, S. K. Dikshit, and R. R. Kelkar, "IMD's new operational models for long-range forecast of southwest monsoon rainfall over India and their verification for 2003," Current Science, vol. 86, no. 3, pp. 422–431, 2004.

[6] M. Rajeevan, D. S. Pai, R. A. Kumar, and B. Lal, "New statistical models for long-range forecasting of southwest monsoon rainfall over India," Climate Dynamics, vol. 28, no. 7-8, pp. 813–828, 2007.

[7] J. Schewe and A. Levermann, "A statistically predictive model for future monsoon failure in India," Environmental Research Letters, vol. 7, no. 4, Article ID 044023, 2012.

[8] Q. Wu, Y. Yan, and D. Chen, "A linear Markov model for East Asian monsoon seasonal forecast," Journal of Climate, vol. 26, no. 14, pp. 5183–5195, 2013.

[9] K. Fan, Y. Liu, and H. Chen, "Improving the prediction of the East Asian summer monsoon: new approaches," Weather & Forecasting, vol. 27, no. 4, pp. 1017–1030, 2012.

[10] F. Mekanik, M. A. Imteaz, S. Gato-Trinidad, and A. Elmahdi, "Multiple regression and artificial neural network for long-term rainfall forecasting using large scale climate modes," Journal of Hydrology, vol. 503, pp. 11–21, 2013.

[11] A. K. Sahai, M. K. Soman, and V. Satyan, "All India summer monsoon rainfall prediction using an artificial neural network," Climate Dynamics, vol. 16, no. 4, pp. 291–302, 2000.

[12] W.-C. Hong, "Rainfall forecasting by technological machine learning models," Applied Mathematics and Computation, vol. 200, no. 1, pp. 41–57, 2008.

[13] S. Chattopadhyay and G. Chattopadhyay, "Comparative study among different neural net learning algorithms applied to rainfall time series," Meteorological Applications, vol. 15, no. 2, pp. 273–280, 2008.

[14] N. Acharya, S. C. Kar, M. A. Kulkarni, U. C. Mohanty, and L. N. Sahoo, "Multi-model ensemble schemes for predicting northeast monsoon rainfall over peninsular India," Journal of Earth System Science, vol. 120, no. 5, pp. 795–805, 2011.

[15] V. R. Durai and R. Bhardwaj, "Improving precipitation forecasts skill over India using a multi-model ensemble technique," Geofizika, vol. 30, no. 2, pp. 119–141, 2013.

[16] B. Parthasarathy, A. A. Munot, and D. R. Kothawale, "Monthly and seasonal rainfall series for All-India homogeneous regions and meteorological subdivisions: 1871–1994," Tech. Rep. RR-065, Indian Institute of Tropical Meteorology, 1995.

[17] G. P. Compo, J. S. Whitaker, P. D. Sardeshmukh et al., "The twentieth century reanalysis project," Quarterly Journal of the Royal Meteorological Society, vol. 137, no. 654, pp. 1–28, 2011.

[18] E. Kalnay, M. Kanamitsu, R. Kistler et al., "The NCEP/NCAR 40-year reanalysis project," Bulletin of the American Meteorological Society, vol. 77, no. 3, pp. 437–471, 1996.

[19] E. M. Rasmusson and T. H. Carpenter, "Variations in tropical sea surface temperature and surface wind fields associated with the Southern Oscillation/El Niño," Monthly Weather Review, vol. 110, no. 5, pp. 354–384, 1982.


Figure 2: Proposed fuzzy clustering-based ensemble approach for prediction of Indian summer monsoon rainfall. The flowchart has the following stages: input (climate parameters (CP) and rainfall); data preprocessing (climate anomaly data evaluation, correlation study of CP and rainfall, identification of the month of each CP highly correlated to rainfall, and building predictor sets with different CP combinations); clustering of years (fuzzy clustering of the predictor set, followed by an alpha-cut to obtain clusters and membership values); ensemble neural network model (forecast for every cluster by each of the MR, MLP, RNN, and GRNN models, then an ensemble of the outputs of all clusters); and validation and physical interpretation of clusters (error calculation, comparison with existing models and with the nonclustered method, and cluster interpretation using support and confidence thresholds). The outputs are the forecast of rainfall, the error in forecasting, the model comparison results, and the climatic events associated with each cluster.

4.3. Prediction Models. Multiple regression and three artificial neural network (ANN) models, namely, multilayer perceptron, recurrent neural network, and generalized regression neural network, are used to design prediction models for each cluster exclusively. A forecast of annual ISMR is provided by each cluster model separately and also by the ensemble of all the clusters' model forecasts. We describe the models used below.

4.3.1. Multiple Regression (MR). A multiple regression model is used to learn the relationship between several independent predictor variables ($X_i$s) and a dependent variable ($Y$).


Table 3: Model parameter setting for MLP models.

| Parameter set | Hidden layers | Training years | Training method |
|---|---|---|---|
| ParSet1 | [3, 5] | 20 | BFGS quasi-Newton backpropagation |
| ParSet2 | [3, 5, 10] | 15 | Conjugate gradient backpropagation with Powell-Beale restarts |
| ParSet3 | [5, 10] | 10 | Scaled conjugate gradient backpropagation |
| ParSet4 | [3, 5] | 15 | Resilient backpropagation |

A multiple regression model having $p$ independent variables is shown in

$$ y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i, \tag{4} $$

where $x_{ij}$ is the $i$th observation of the $j$th independent variable, the first independent variable takes the value 1 for all $i$, and $\varepsilon_i$ represents the residual.
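The regression in (4) can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit through the normal equations, which the text does not specify; the function names `fit_multiple_regression` and `predict` and the toy data are hypothetical.

```python
def fit_multiple_regression(rows, targets):
    """Least-squares fit of y_i = b_1*x_i1 + ... + b_p*x_ip + e_i.

    A constant 1.0 is prepended to every predictor row, so the first
    coefficient plays the role of the intercept, as in Eq. (4).
    """
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Augmented normal equations: (X^T X | X^T y).
    A = [[sum(x[j] * x[k] for x in X) for k in range(p)]
         + [sum(x[j] * y for x, y in zip(X, targets))]
         for j in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(p):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[j][p] for j in range(p)]


def predict(beta, row):
    """Forecast for one predictor row (without the leading constant)."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))


# Toy data generated from y = 2 + 3*x1 - x2 (illustrative values only).
beta = fit_multiple_regression([(1, 0), (0, 1), (1, 1), (2, 1)], [5, 1, 4, 7])
forecast = predict(beta, (3, 2))
```

In practice a numerical library would replace the hand-written solver; the explicit version is used here only to keep the sketch self-contained.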

4.3.2. Multilayer Perceptron Neural Network (MLP). Multilayer perceptron neural network is a class of ANN where connections between the neurons do not form a directed cycle. In this network, the information propagates in only one direction, from the input nodes, through the hidden nodes, to the output nodes. The independent and dependent variables constitute the input and output layers, respectively. The number of hidden layers and the corresponding nodes must be determined empirically for each prediction task. Four different parameter sets, shown in Table 3, are considered empirically for the models designed to forecast ISMR.

4.3.3. Recurrent Neural Network (RNN). Recurrent neural network is a class of ANN which maintains an internal state of the network to exhibit dynamic temporal behaviour. Climatic changes or events occurring in the near or same time period are highly correlated. Similarly, rainfall patterns are more correlated to influencing factors in the near years than to those in the distant years. This phenomenon is well captured by RNN, which gives weights in decreasing order to the values from near to distant years during training of the network. Thus, it assists in modelling the system dynamics in a much more natural manner. The same parameter sets as for the MLP network (Table 3) are considered, with a delay span of 2 units.

4.3.4. Generalized Regression Neural Network (GRNN). Generalized regression neural network is a variant of radial basis function network. GRNN has three layers of artificial neurons: input, hidden, and output. The hidden layer has radial basis neurons, while neurons in the output layer have a linear transfer function. The output of the radial basis neurons is the input scaled by the spread factor. Given $p$ input-output pairs $(x_i, y_i) \in \mathbb{R}^n \times \mathbb{R}^1$, with $n$ input variables and $i = 1, 2, \ldots, p$, $y_i$ represents the output from each hidden unit. The GRNN output for a test point $x \in \mathbb{R}^n$ is described by

$$ y(x) = \sum_{i=1}^{p} W_i y_i, \tag{5} $$

where

$$ W_i = \frac{\exp\left(-\|x - x_i\|^2 / 2\sigma^2\right)}{\sum_{k=1}^{p} \exp\left(-\|x - x_k\|^2 / 2\sigma^2\right)} \tag{6} $$

and $\sigma$ is the spread factor.

The reasons behind modelling using GRNN are (i) only one tunable design parameter (the spread factor), (ii) a one-pass algorithm (less time consuming), and (iii) accurate approximation of functions from sparse data.
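Equations (5) and (6) amount to a kernel-weighted average of the training outputs, which can be sketched in a few lines. The function name `grnn_predict` and the toy training pairs are hypothetical, and the sketch assumes a squared Euclidean distance inside the exponential.

```python
import math


def grnn_predict(x, train_x, train_y, sigma):
    """GRNN output for test point x, following Eqs. (5) and (6).

    Each training point contributes a Gaussian kernel value; the
    normalised kernels are the weights W_i applied to the outputs y_i.
    """
    kernels = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2.0 * sigma ** 2))
        for xi in train_x
    ]
    total = sum(kernels)  # denominator of Eq. (6)
    return sum((k / total) * y for k, y in zip(kernels, train_y))


# Illustrative one-dimensional data: the test point lies midway between
# the two training points, so both weights equal 0.5.
midpoint_forecast = grnn_predict((1.0,), [(0.0,), (2.0,)], [10.0, 20.0], sigma=1.0)
```

There is no iterative training: the only design choice is `sigma`. A very small spread makes the output follow the nearest training point, while a large spread pulls it toward the mean of all training outputs.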

The optimal number of training years is ascertained for the MR and GRNN models by varying the training years from 5 to 30 and selecting the value that gives the least absolute error in prediction during the validation period (1984–1993). A training of $m$ years specifies that, for predicting the rainfall of the $r$th year, the available preceding $m$ years $r-1, r-2, \ldots, r-m$ present in a particular cluster are considered for training.
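The training-window rule can be sketched as follows. This is one possible reading, stated as an assumption: the window covers the $m$ calendar years before the target year, and only those window years that belong to the cluster are kept. The function name `training_years` and the example cluster are hypothetical.

```python
def training_years(target_year, m, cluster_years):
    """Years r-m, ..., r-1 that are members of the given cluster."""
    members = set(cluster_years)
    return [y for y in range(target_year - m, target_year) if y in members]


# Hypothetical cluster membership; the window for 2001 with m = 5 is 1996-2000.
selected = training_years(2001, 5, [1995, 1997, 1998, 2000, 2003])
```

Under this reading, a cluster model may be trained on fewer than $m$ years when the cluster contains few of the preceding years.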

4.4. Ensemble of Predictors. The complexity of the monsoon process makes it difficult for a single model to predict rainfall accurately. We design separate models for each cluster of years obtained by fuzzy clustering, using the four prediction models described in Section 4.3. Finally, annual ISMR is presented as a weighted ensemble of the forecasts of the models designed for each cluster. The weight is taken as the fuzzy membership of the test year in the different clusters:

$$ \text{Ensemble prediction}_t = \sum_{i=1}^{c} W_i^t \cdot P_i, \tag{7} $$

where $P_i$ represents the prediction given by the model for cluster $i$, $W_i^t$ is the fuzzy membership of the $t$th test year to cluster $i$, and $c$ is the total number of clusters.
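The weighted ensemble in (7) can be sketched directly. The function name `ensemble_forecast`, the membership values, and the cluster forecasts below are illustrative assumptions, not values from the study.

```python
def ensemble_forecast(memberships, cluster_forecasts):
    """Eq. (7): sum over clusters of membership W_i^t times forecast P_i."""
    if len(memberships) != len(cluster_forecasts):
        raise ValueError("one membership value is needed per cluster forecast")
    return sum(w * p for w, p in zip(memberships, cluster_forecasts))


# Hypothetical test year with memberships 0.2, 0.5, 0.3 in three clusters
# and cluster forecasts expressed as percent of LPA.
forecast = ensemble_forecast([0.2, 0.5, 0.3], [90.0, 100.0, 110.0])
```

Fuzzy c-means memberships of a point sum to 1 across clusters, so the ensemble is a convex combination and always lies between the smallest and largest cluster forecasts.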

4.5. Validation of Proposed Approach. The study is performed on data for the period 1948–2013. Fuzzy clustering is performed over the period to cluster the years into three groups. The number of clusters is decided based on cluster quality. Separate prediction models are designed for all three clusters, and the ensemble of the forecasts of these models is provided as the predicted Indian summer monsoon rainfall. The test period 2001–2013 is considered to evaluate the forecasting skills of our proposed approach.

The forecast models for annual ISMR are chiefly evaluated in terms of mean absolute error. Other error statistics, namely, root mean square error, prediction yields, Pearson correlation, and Willmott index of agreement, are also evaluated to judge the efficacy of our proposed approach for prediction. They are described below.


(i) Mean Absolute Error (MAE). Mean absolute error for prediction of annual ISMR is calculated in the following way:

$$ \text{MAE} = \frac{\sum_{i=1}^{N} |Y_i - X_i|}{N}, \tag{8} $$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $N$ denotes the total number of test years.

(ii) Root Mean Square Error (RMSE). Root mean square error measures the differences between the model predicted output and the actual values. It is a good measure for comparing the forecasting errors of various models:

$$ \text{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} (Y_i - X_i)^2}{N}}. \tag{9} $$

(iii) Prediction Yield (PY). Prediction yields are evaluated at three different error categories (5%, 10%, and 15% errors) to assess the overall prediction results by judging the percent of predicted years within each allowed range of errors.

(iv) Pearson Correlation Coefficient (PC). Pearson correlation coefficient measures the strength of linear association between actual and predicted values, where the value of 1 means a perfect positive correlation and the value of −1 means a perfect negative correlation:

$$ \text{PC} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{N} (X_i - \bar{X})^2} \sqrt{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}}, \tag{10} $$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $\bar{X}$ and $\bar{Y}$ are their corresponding means.

(v) Willmott Index of Agreement (WI). Willmott index of agreement is a standardized measure of the degree of model prediction error. It varies between 0 and 1, with higher values indicating a better fit of the model for prediction:

$$ \text{Index of agreement} = 1 - \frac{\sum_{i=1}^{N} |X_i - Y_i|^2}{\sum_{i=1}^{N} \left(|Y_i - \bar{X}| + |X_i - \bar{X}|\right)^2}. \tag{11} $$

5. Experimental Results and Analysis

In this section, we present the evaluation of our proposed fuzzy clustering-based approach. We first present the results of fuzzy clustering of the monsoon years for different predictor sets. Forecasting skills are evaluated for all cluster models and the ensemble model in terms of mean absolute errors for the test period 2001–2013. In addition, other measures like root mean square error in prediction, correlation between predicted and actual rainfall, prediction yields, and agreement index between actual and predicted rainfall are also estimated to establish the efficiency of our proposed approach to prediction of Indian summer monsoon rainfall.

Table 4: Cluster size (number of years) by fuzzy $c$-means clustering with $\alpha$-cut of 0.3 over the period 1948–2013.

| Predictor set | Cluster1 | Cluster2 | Cluster3 |
|---|---|---|---|
| PredSet1 | 16 | 38 | 30 |
| PredSet2 | 30 | 17 | 40 |
| PredSet3 | 32 | 14 | 38 |
| PredSet4 | 42 | 31 | 21 |
| PredSet5 | 15 | 37 | 26 |

5.1. Clustering of Monsoon Years. Fuzzy clustering is performed over the period 1948–2013 to cluster the data into three clusters. We perform an $\alpha$-cut with $\alpha = 0.3$ to assign the data instances to the clusters. The value is ascertained empirically such that the distribution of elements within clusters is regular. A data instance can be assigned to more than one cluster simultaneously. The cluster sizes for the various predictor sets are shown in Table 4.
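The $\alpha$-cut assignment can be sketched as follows. The sketch assumes that a year joins every cluster in which its membership is at least $\alpha$; the function name `alpha_cut` and the membership values are hypothetical.

```python
def alpha_cut(memberships, alpha=0.3):
    """Assign each year to every cluster where its membership >= alpha.

    memberships maps a year to its list of fuzzy membership degrees,
    one per cluster. Returns one list of years per cluster.
    """
    n_clusters = len(next(iter(memberships.values())))
    clusters = [[] for _ in range(n_clusters)]
    for year, degrees in sorted(memberships.items()):
        for k, u in enumerate(degrees):
            if u >= alpha:
                clusters[k].append(year)
    return clusters


# Hypothetical membership degrees for three years and three clusters.
clusters = alpha_cut({
    2001: [0.50, 0.35, 0.15],
    2002: [0.10, 0.20, 0.70],
    2003: [0.33, 0.33, 0.34],
})
print(clusters)  # [[2001, 2003], [2001, 2003], [2002, 2003]]
```

Because a year can pass the cut in several clusters, the cluster sizes in Table 4 add up to more than the 66 years of the study period.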

5.2. Prediction Accuracy. We predict annual rainfall for all five predictor sets (Table 2) separately, using four models, namely, MR, MLP, RNN, and GRNN. The test period is 2001 to 2013.

5.2.1. Multiple Regression Model (MR). Multiple regression models are built for every cluster by ascertaining the optimal training period for each predictor set. The optimal training period is evaluated by varying the training years and validating them for least absolute error in prediction during the validation period (1984–1993). Individual cluster-based models as well as weighted ensemble models are considered for prediction. Table 5 gives the mean absolute error for the individual cluster-based and ensemble models for the test period 2001–2013. The model provides a mean absolute error of 6.2% for PredSet4 (Table 2). It is observed that the ensemble model outperforms all the single cluster models for every predictor set. Figure 3 shows the interannual variability of actual and ensemble predicted rainfall as percent of long period average (LPA).

5.2.2. Multilayer Perceptron Neural Network Model (MLP). The multilayer perceptron neural network model is designed with the four different sets of parameters described in Table 3. Mean absolute errors of all cluster and ensemble models are shown in Table 6. The MLP model reports an error of 4.0% for PredSet4 (Table 2) with MLP parameters ParSet1 (Table 3). The actual rainfall and the rainfall predicted by the models built for the clusters and by the ensemble model are shown in Figure 4. Ensemble predicted rainfall closely follows actual rainfall.

Table 5: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MR cluster models and ensemble model for test period 2001–2013. The minimum error reported is 6.2%.

| Predictor set | Training years | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | 20 | 9.4 | 9.3 | 10.9 | 8.6 |
| PredSet2 | 20 | 11.0 | 7.5 | 9.4 | 8.3 |
| PredSet3 | 15 | 10.9 | 6.5 | 9.2 | 6.7 |
| PredSet4 | 15 | 10.4 | 10.1 | 6.8 | 6.2 |
| PredSet5 | 15 | 7.6 | 8.5 | 8.4 | 7.9 |

Table 6: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MLP cluster models and ensemble model for test period 2001–2013. The minimum error reported is 4.0%.

| Predictor set | Parameter set | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | ParSet4 | 13.8 | 18.1 | 16.9 | 8.2 |
| PredSet2 | ParSet3 | 16.0 | 7.9 | 11.0 | 5.2 |
| PredSet3 | ParSet1 | 8.0 | 7.8 | 6.5 | 6.5 |
| PredSet4 | ParSet1 | 9.3 | 10.7 | 4.5 | 4.0 |
| PredSet5 | ParSet1 | 8.5 | 15.3 | 13.7 | 11.0 |

Figure 3: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three respective cluster models by MR for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. (Axes: test years; rainfall as % of LPA. Legend: actual, ensemble, Clus1 model, Clus2 model, Clus3 model.)

5.2.3. Recurrent Neural Network Model (RNN). Mean absolute errors for prediction of annual rainfall by the recurrent neural network model for the test period 2001–2013 are presented in Table 7. PredSet3 (Table 2) with RNN parameters ParSet1 (Table 3) gives an error of 5.1%. RNN gives weights to the training years in decreasing order of their distance from the test year. The pattern of actual and ensemble predicted rainfall in terms of percentage of LPA is shown in Figure 5.

Figure 4: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three respective cluster models by MLP for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. (Axes: test years; rainfall as % of LPA. Legend: actual, ensemble, Clus1 model, Clus2 model, Clus3 model.)

5.2.4. Generalized Regression Neural Network Model (GRNN). The errors of the generalized regression neural network ensemble and individual cluster models, in terms of mean absolute errors, are presented in Table 8. The model reports an error of 6.1% for PredSet3 (Table 2). Figure 6 shows the interannual variations of the ensemble forecast of rainfall by the GRNN ensemble model along with the actual rainfall pattern in terms of percentage of LPA for the period 2001–2013. It is observed that the predicted values are close to the actual rainfall patterns. Predictions by the models designed for the clusters are shown by different symbols.

Table 7: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual RNN cluster models and ensemble model for test period 2001–2013. The minimum error reported is 5.1%.

| Predictor set | Parameter set | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | ParSet1 | 11.3 | 7.1 | 16.8 | 7.0 |
| PredSet2 | ParSet1 | 13.2 | 13.5 | 12.6 | 8.5 |
| PredSet3 | ParSet1 | 12.9 | 5.4 | 6.0 | 5.1 |
| PredSet4 | ParSet1 | 12.3 | 6.4 | 4.7 | 5.9 |
| PredSet5 | ParSet2 | 15.1 | 16.1 | 13.4 | 8.8 |

Table 8: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual GRNN cluster models and ensemble model for test period 2001–2013. The minimum error reported is 6.1%.

| Predictor set | Training years | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | 20 | 10.0 | 7.6 | 7.6 | 6.4 |
| PredSet2 | 30 | 7.1 | 8.9 | 7.6 | 6.4 |
| PredSet3 | 20 | 5.8 | 9.2 | 6.0 | 6.1 |
| PredSet4 | 20 | 6.3 | 6.6 | 7.2 | 6.3 |
| PredSet5 | 25 | 7.1 | 9.4 | 11.9 | 6.6 |

Figure 5: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three respective cluster models by RNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. (Axes: test years; rainfall as % of LPA. Legend: actual, ensemble, Clus1 model, Clus2 model, Clus3 model.)

Figure 6: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three respective cluster models by GRNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. (Axes: test years; rainfall as % of LPA. Legend: actual, ensemble, Clus1 model, Clus2 model, Clus3 model.)

5.3. Statistical Measures for Validation of Proposed Approach. Next, we validate the models in terms of other accuracy measures besides mean absolute error. Table 9 shows different forecast verification statistics for the ensemble models during the test period 2001–2013. We summarize the observations below.

(i) Root Mean Square Error (RMSE). The MLP ensemble model gives an RMSE of 5.3%, followed by the RNN ensemble model with 6.4%. The GRNN and MR models give RMSEs of 7.4% and 8.4%, respectively.

(ii) Prediction Yield (PY). PY for the 5% error category of the MR, MLP, RNN, and GRNN ensemble models is 46%, 69%, 53%, and 46%, respectively. They give prediction yields of 76%, 92%, 92%, and 84% for the allowed error category of 10%. Finally, at the error category of 15%, the MR, MLP, RNN, and GRNN ensemble models give yields of 92%, 100%, 92%, and 100%, respectively. Thus, none of the predicted years show abrupt deviation from the corresponding actual rainfall pattern.

(iii) Pearson Correlation (PC). PC of 0.61, 0.81, 0.71, and 0.49 is observed for prediction by the MR, MLP, RNN, and GRNN ensemble models, respectively. It is noticed that the rainfall predicted by the MLP ensemble model is highly correlated to the actual values, while the correlation for the GRNN forecast is the least.

(iv) Willmott Index of Agreement (WI). WI for the MR, MLP, RNN, and GRNN ensemble models is 0.71, 0.89, 0.81, and 0.62, respectively. The index shows that the agreement between actual and predicted rainfall is high for the MLP and RNN ensemble models.

All of the mentioned statistical measures (Table 9), as well as the mean absolute error (Table 6) in prediction of monsoon, ascertain the MLP model to be the best among all four proposed models.

Table 9: Prediction evaluation statistics for ensemble models during test period 2001–2013 (Section 4.5).

| Verification measures | MR | MLP | RNN | GRNN |
|---|---|---|---|---|
| RMSE for forecast (%) | 8.4 | 5.3 | 6.4 | 7.4 |
| PY (%) at allowed error 5% | 46 | 69 | 53 | 46 |
| PY (%) at allowed error 10% | 76 | 92 | 92 | 84 |
| PY (%) at allowed error 15% | 92 | 100 | 92 | 100 |
| PC between actual and predicted rainfall | 0.61 | 0.81 | 0.71 | 0.49 |
| WI between actual and predicted rainfall | 0.71 | 0.89 | 0.81 | 0.62 |

Table 10: Comparison of absolute errors for rainfall prediction by the proposed ensemble models (Ensml) with clustering (WC) approach to the standard method with the same models without clustering (NC) approach.

| Predictor set | MR tot. error (NC) (%) | MR Ensml error (WC) (%) | MLP tot. error (NC) (%) | MLP Ensml error (WC) (%) | RNN tot. error (NC) (%) | RNN Ensml error (WC) (%) | GRNN tot. error (NC) (%) | GRNN Ensml error (WC) (%) |
|---|---|---|---|---|---|---|---|---|
| PredSet1 | 8.9 | 8.6 | 10.0 | 8.2 | 11.7 | 7.0 | 6.9 | 6.4 |
| PredSet2 | 9.2 | 8.2 | 12.8 | 5.2 | 10.7 | 8.5 | 7.2 | 6.4 |
| PredSet3 | 7.4 | 6.7 | 6.7 | 6.5 | 6.2 | 5.1 | 6.1 | 6.1 |
| PredSet4 | 6.7 | 6.2 | 5.8 | 4.0 | 6.0 | 5.5 | 6.3 | 6.3 |
| PredSet5 | 8.2 | 7.9 | 9.7 | 11.0 | 8.9 | 8.8 | 9.0 | 6.7 |

5.4. Comparison of Results

5.4.1. Comparison with State-of-the-Art Methods. The proposed fuzzy clustering-based ensemble prediction models are compared with the models used by the Indian Meteorological Department (IMD), namely, the existing 16-parameter power regression model [4] and the 8- and 10-parameter models of Rajeevan et al. [5]. A test period of seven years, from 1996 to 2002, is considered. The IMD models give root mean square errors of 10.8%, 7.6%, and 6.4%, respectively. The MR, MLP, RNN, and GRNN ensemble models give root mean square errors of 6.0%, 3.4%, 4.4%, and 5.5%, respectively, outperforming all three IMD models. The results are shown as a bar graph in Figure 7.

Figure 7: Comparison of MR (grey), MLP (purple), RNN (light purple), and GRNN (deep purple) models with the existing IMD 16-param (deep blue), 10-param (blue), and 8-param (light blue) models for the time period 1996–2002 [4, 5]. Striped bars represent errors by our proposed models. (Axes: different models; root mean square error as % of LPA.)

5.4.2. Improvement of Cluster-Based Models over Conventional Models. The ensemble model error obtained by combining all clusters' model outputs is compared with the error obtained by the same model (with the same parameters) trained on the whole dataset without clustering. The mean absolute errors for the various model and predictor set combinations are shown in Table 10. The results depict the improvement in prediction by the clustering and ensemble method over the nonclustered conventional method.

Table 11: Physical climatic events under study.

| Climatic event | Number of years | Years associated with the event |
|---|---|---|
| Drought | 13 | 1951, 1965, 1966, 1968, 1972, 1974, 1979, 1982, 1986, 1987, 2002, 2004, 2009 |
| Flood | 11 | 1953, 1956, 1958, 1959, 1961, 1964, 1970, 1975, 1983, 1988, 1994 |
| El-Nino | 23 | 1951, 1953, 1957, 1958, 1963, 1965, 1966, 1968, 1969, 1972, 1977, 1982, 1983, 1986, 1987, 1991, 1992, 1994, 1997, 2002, 2004, 2006, 2009 |
| La-Nina | 22 | 1950, 1954, 1955, 1956, 1964, 1970, 1971, 1973, 1974, 1975, 1984, 1985, 1988, 1989, 1995, 1998, 1999, 2000, 2007, 2008, 2010, 2011 |
| Positive IOD | 12 | 1957, 1961, 1963, 1967, 1972, 1977, 1982, 1983, 1994, 1997, 2006, 2007 |
| Negative IOD | 10 | 1958, 1960, 1964, 1971, 1974, 1975, 1989, 1992, 1993, 1996 |

Table 12: Threshold of support and confidence measures for associating obtained clusters with physical climatic events.

| Predictor set | Support threshold | Confidence threshold |
|---|---|---|
| PredSet1 | 0.37 | 0.30 |
| PredSet2 | 0.25 | 0.46 |
| PredSet3 | 0.21 | 0.43 |
| PredSet4 | 0.29 | 0.61 |
| PredSet5 | 0.21 | 0.54 |

5.5. Prediction of the Year 2014. The annual Indian summer monsoon rainfall for the year 2014 is 781.7 mm, which is 87.8% of the LPA value. The proposed clustering-based ensemble MR, MLP, RNN, and GRNN models predict the rainfall of 2014 as 96.1%, 80.3%, 80.0%, and 95.3% of LPA, respectively. Thus, the proposed models show an absolute error of 7.0% for forecasting the rainfall of 2014.

6. Meteorological Analysis

Next, we try to visualize each cluster in terms of physical climatic events. The clusters obtained by fuzzy clustering are physically interpreted as being characterized by some global climatic events. The climatic events considered and studied during the time period 1948 to 2013 (the period considered for clustering in our work) are El-Nino, La-Nina (http://ggweather.com/enso/oni.htm), positive and negative Indian Ocean dipole (http://bom.gov.au/climate/IOD), drought, and flood, shown in Table 11.

Figure 8 shows the El-Nino and La-Nina years associated with drought, normal, and excess rainfall years during 1948–2013. The years having rainfall 10% above LPA are excess rainfall years, and the years having rainfall 10% below LPA are drought years. The El-Nino and La-Nina years are shown by color codes (light green and green) in the figure. The chart helps to visualize the co-occurrence of El-Nino and La-Nina events with extremities of ISMR.

6.1. Measuring Association between Climatic Events and ISMR. Support and confidence measures are considered to relate physical climatic events to the clusters generated by fuzzy clustering. They are defined below.

Figure 8: Association of El-Nino (light green) and La-Nina (green) years with drought (years below −10% of LPA rainfall), normal (years between +10% and −10% of LPA rainfall), and excess (years above 10% of LPA rainfall) years during the period 1948–2013. (Axes: years; deviation of rainfall from LPA in %. Legend: normal years, El-Nino years, La-Nina years.)

(i) Support. Support is defined as the proportion of the total number of years in the cluster that correspond to the climatic event:

$$ \text{Support} = \frac{x_{ce}}{N}, \tag{12} $$

where $x_{ce}$ denotes the number of years associated with a specific climatic event in the cluster and $N$ is the total count of years in the cluster.

(ii) Confidence. Confidence is defined as the proportion of years associated with the climatic event in the cluster to the total number of such event years:

$$ \text{Confidence} = \frac{x_{ce}}{T_{ce}}, \tag{13} $$

where $T_{ce}$ is the number of years associated with the climatic event during the period 1948–2013.
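The two measures in (12) and (13) can be sketched with set intersections. The drought years are those listed in Table 11 and the thresholds are the PredSet1 values from Table 12; the cluster membership, and the function names `support_confidence` and `is_associated`, are hypothetical.

```python
def support_confidence(cluster_years, event_years):
    """Return (support, confidence) as defined in Eqs. (12) and (13)."""
    x_ce = len(set(cluster_years) & set(event_years))  # event years in the cluster
    return x_ce / len(cluster_years), x_ce / len(event_years)


def is_associated(support, confidence, support_threshold, confidence_threshold):
    """A cluster is related to an event only if both thresholds are attained."""
    return support >= support_threshold and confidence >= confidence_threshold


# Drought years from Table 11.
drought = [1951, 1965, 1966, 1968, 1972, 1974, 1979, 1982, 1986, 1987,
           2002, 2004, 2009]
# Hypothetical cluster of eight years, five of which are drought years.
cluster = [1951, 1965, 1972, 1988, 1994, 2002, 2009, 2010]

support, confidence = support_confidence(cluster, drought)
linked = is_associated(support, confidence, 0.37, 0.30)  # PredSet1 thresholds
```

Here the support is 5/8 and the confidence is 5/13, so this hypothetical cluster would be linked to drought under the PredSet1 thresholds.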

Figure 9: Histogram of the confidence and support measures as bins of year-count before (a) and after (b) thresholding for PredSet1. (Axes: confidence; support; year-count.)

Table 13: Identified physical climatic events associated with the clusters obtained by fuzzy clustering. Each row lists the events for Cluster1, Cluster2, and Cluster3 in that order.

PredSet1: Drought El-Nino La-Nina La-Nina
PredSet2: Flood La-Nina Drought Drought El-Nino La-Nina
PredSet3: El-Nino positive IOD Drought Drought El-Nino
PredSet4: La-Nina Flood La-Nina Drought
PredSet5: - Drought El-Nino Flood

We relate a cluster to a physical climatic event describedin Table 11 if both support and confidence measures attainthe corresponding thresholds The thresholds are chosen in away that 50 of years of study are under consideration A lowthreshold compromises the importance of a climatic eventbeing related to a particular cluster on the other hand if evenless number of years are taken then threshold values shouldbe high which in turn will leave out most of the clustersTherefore as an optimal between the extremes 50 of yearsare considered Figure 9 shows histograms with confidenceand support as bins of year-count for cases before andafter threshold process respectively for predictors PredSet1(Table 2) The threshold values obtained for predictor setsare presented in Table 12 For each predictor set we associatethe clusters with physical climatic events if they satisfy bothsupport and confidence thresholds The climatic events cor-responding to cluster are shown in Table 13 Results establishcoexistence of events of La-Nina and flood It also puts lighton high probability of occurrence of El-Nino drought andpositive IOD events simultaneously

7. Conclusion

Monsoon is an important phenomenon for the economic development of an agricultural land like India. The large variability of monsoon over the years makes prediction of rainfall a challenging task. This paper addresses the problem by clustering the years into similar groups and providing a multimodel ensemble forecast for Indian summer monsoon rainfall.

Different climatic parameters, each with its best correlated month value, are identified, and five different predictor sets are built for prediction of Indian monsoon. Four different models, namely, MR, MLP, RNN, and GRNN, are designed exclusively for each cluster. The final forecast is provided by a weighted ensemble of the forecasts of each cluster's model, where the weight is the fuzzy membership of the test year in each cluster. The multilayer perceptron ensemble model provides a mean absolute error of 4.0% for prediction of annual rainfall, which is appreciable for forecasting the complex monsoon process. The proposed fuzzy clustering-based ensemble approach surpasses the conventional approach. The performance of the proposed clustering-based ensemble models is superior to that of the existing IMD models [4, 5]. The error statistics also ascertain the superiority of the multilayer perceptron model over the other three proposed models. Lastly, in a meteorological context, the clusters are linked with global climatic events.

In the future, a larger number of climatic parameters influencing Indian monsoon can be explored, and different predictor sets can be used for different clusters of years to provide even better forecasting accuracy.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.


Acknowledgment

This work is supported by the RBU project through the RESPOND program of ISRO through KCSTC, IIT Kharagpur.

References

[1] H. F. Blanford, "On the connexion of the Himalaya snowfall with dry winds and seasons of drought in India," Proceedings of the Royal Society of London, vol. 37, no. 232–234, pp. 3–22, 1884.

[2] G. T. Walker, "Correlation in seasonal variations of weather, IV: a further study of world weather," Memoirs of the India Meteorological Department, vol. 24, pp. 275–332, 1924.

[3] V. Thapliyal and S. M. Kulshrestha, "Recent models for long range forecasting of South-West monsoon rainfall in India," Mausam, vol. 43, no. 3, pp. 239–248, 1992.

[4] V. Gowariker, V. Thapliyal, S. M. Kulshrestha, G. S. Mandal, N. Sen Roy, and D. R. Sikka, "A power regression model for long range forecast of southwest monsoon rainfall over India," Mausam, vol. 42, no. 2, pp. 125–130, 1991.

[5] M. Rajeevan, D. S. Pai, S. K. Dikshit, and R. R. Kelkar, "IMD's new operational models for long-range forecast of southwest monsoon rainfall over India and their verification for 2003," Current Science, vol. 86, no. 3, pp. 422–431, 2004.

[6] M. Rajeevan, D. S. Pai, R. A. Kumar, and B. Lal, "New statistical models for long-range forecasting of southwest monsoon rainfall over India," Climate Dynamics, vol. 28, no. 7-8, pp. 813–828, 2007.

[7] J. Schewe and A. Levermann, "A statistically predictive model for future monsoon failure in India," Environmental Research Letters, vol. 7, no. 4, Article ID 044023, 2012.

[8] Q. Wu, Y. Yan, and D. Chen, "A linear markov model for east asian monsoon seasonal forecast," Journal of Climate, vol. 26, no. 14, pp. 5183–5195, 2013.

[9] K. Fan, Y. Liu, and H. Chen, "Improving the prediction of the east asian summer monsoon: new approaches," Weather & Forecasting, vol. 27, no. 4, pp. 1017–1030, 2012.

[10] F. Mekanik, M. A. Imteaz, S. Gato-Trinidad, and A. Elmahdi, "Multiple regression and artificial neural network for long-term rainfall forecasting using large scale climate modes," Journal of Hydrology, vol. 503, pp. 11–21, 2013.

[11] A. K. Sahai, M. K. Soman, and V. Satyan, "All India summer monsoon rainfall prediction using an artificial neural network," Climate Dynamics, vol. 16, no. 4, pp. 291–302, 2000.

[12] W.-C. Hong, "Rainfall forecasting by technological machine learning models," Applied Mathematics and Computation, vol. 200, no. 1, pp. 41–57, 2008.

[13] S. Chattopadhyay and G. Chattopadhyay, "Comparative study among different neural net learning algorithms applied to rainfall time series," Meteorological Applications, vol. 15, no. 2, pp. 273–280, 2008.

[14] N. Acharya, S. C. Kar, M. A. Kulkarni, U. C. Mohanty, and L. N. Sahoo, "Multi-model ensemble schemes for predicting northeast monsoon rainfall over peninsular India," Journal of Earth System Science, vol. 120, no. 5, pp. 795–805, 2011.

[15] V. R. Durai and R. Bhardwaj, "Improving precipitation forecasts skill over India using a multi-model ensemble technique," Geofizika, vol. 30, no. 2, pp. 119–141, 2013.

[16] B. Parthasarathy, A. A. Munot, and D. R. Kothawale, "Monthly and seasonal rainfall series for All-India homogeneous regions and meteorological subdivisions: 1871–1994," Tech. Rep. RR-065, Indian Institute of Tropical Meteorology, 1995.

[17] G. P. Compo, J. S. Whitaker, P. D. Sardeshmukh et al., "The twentieth century reanalysis project," Quarterly Journal of the Royal Meteorological Society, vol. 137, no. 654, pp. 1–28, 2011.

[18] E. Kalnay, M. Kanamitsu, R. Kistler et al., "The NCEP/NCAR 40-year reanalysis project," Bulletin of the American Meteorological Society, vol. 77, no. 3, pp. 437–471, 1996.

[19] E. M. Rasmusson and T. H. Carpenter, "Variations in tropical sea surface temperature and surface wind fields associated with the Southern Oscillation/El Niño," Monthly Weather Review, vol. 110, no. 5, pp. 354–384, 1982.



Table 3: Model parameter settings for MLP models.

Parameter set | Hidden layers | Training years | Training method
ParSet1 | [3 5] | 20 | BFGS quasi-Newton backpropagation
ParSet2 | [3 5 10] | 15 | Conjugate gradient backpropagation with Powell-Beale restarts
ParSet3 | [5 10] | 10 | Scaled conjugate gradient backpropagation
ParSet4 | [3 5] | 15 | Resilient backpropagation

A multiple regression model having $p$ independent variables is shown in

$$y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i, \tag{4}$$

where $x_{ij}$ is the $i$th observation of the $j$th independent variable, the first independent variable takes the value 1 for all $i$, and $\varepsilon_i$ represents the residual.
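As an illustration of equation (4), the following minimal sketch estimates the coefficients by ordinary least squares, solving the normal equations with Gauss-Jordan elimination. The paper does not state which solver was used, so the function names `fit_multiple_regression` and `predict` and the toy data are assumptions made only for this example.

```python
def fit_multiple_regression(X, y):
    """Least-squares estimate of beta for y_i = sum_j beta_j * x_ij + eps_i.

    X is a list of rows; each row starts with 1.0 (the constant first variable).
    Solves the normal equations (X^T X) beta = X^T y by Gauss-Jordan elimination.
    """
    p = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(row[a] * row[b] for row in X) for b in range(p)]
        + [sum(row[a] * yi for row, yi in zip(X, y))]
        for a in range(p)
    ]
    for col in range(p):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * w for v, w in zip(A[r], A[col])]
    return [A[j][p] / A[j][j] for j in range(p)]


def predict(beta, row):
    """Evaluate the fitted regression for one row of predictors."""
    return sum(b * x for b, x in zip(beta, row))


# Toy data generated from y = 2 + 3*x1 - x2 without noise (illustrative only).
X = [[1.0, 0.0, 1.0], [1.0, 1.0, 0.0], [1.0, 2.0, 2.0], [1.0, 3.0, 1.0]]
y = [1.0, 5.0, 6.0, 10.0]
beta = fit_multiple_regression(X, y)
```

Because the toy data are noise-free, the recovered coefficients should be close to 2, 3, and -1; with real predictors the residual term would be nonzero.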

4.3.2. Multilayer Perceptron Neural Network (MLP). A multilayer perceptron neural network is a class of ANN in which the connections between neurons do not form a directed cycle. In this network, information propagates in only one direction: from the input nodes, through the hidden nodes, to the output nodes. The independent and dependent variables constitute the input and output layers, respectively. The number of hidden layers and their node counts must be determined empirically for each prediction task. Four different parameter sets, chosen empirically, are considered for the model designed to forecast ISMR; they are shown in Table 3.

4.3.3. Recurrent Neural Network (RNN). A recurrent neural network is a class of ANN that maintains an internal state of the network to exhibit dynamic temporal behaviour. Climatic changes or events occurring in the same or nearby time periods are highly correlated. Similarly, rainfall patterns are more strongly correlated with influencing factors in nearby years than with those in distant years. This phenomenon is well captured by an RNN, which gives weights in decreasing order to the values from near to distant years during training of the network. Thus, it assists in modelling the system dynamics in a more natural manner. The same parameter settings as for the MLP network (Table 3) are considered, with a delay span of 2 units.

4.3.4. Generalized Regression Neural Network (GRNN). A generalized regression neural network is a variant of the radial basis function network. A GRNN has three layers of artificial neurons: input, hidden, and output. The hidden layer has radial basis neurons, while the neurons in the output layer have a linear transfer function. The output of the radial basis neurons is the input scaled by the spread factor. Given $p$ input-output pairs $(x_i, y_i) \in \mathbb{R}^n \times \mathbb{R}^1$ with $n$ input variables and $i = 1, 2, \ldots, p$, $y_i$ represents the output from each hidden unit. The GRNN output for a test point $x \in \mathbb{R}^n$ is described by

$$y(x) = \sum_{i=1}^{p} W_i y_i, \tag{5}$$

where

$$W_i = \frac{\exp\left(-\lVert x - x_i \rVert^2 / 2\sigma^2\right)}{\sum_{k=1}^{p} \exp\left(-\lVert x - x_k \rVert^2 / 2\sigma^2\right)}. \tag{6}$$

The reasons for modelling with GRNN are that (i) it has only one tunable design parameter (the spread factor), (ii) it is a one-pass algorithm (less time consuming), and (iii) it accurately approximates functions from sparse data.
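Equations (5) and (6) can be sketched in a few lines, which also makes the one-pass property visible: there is no iterative training, only a kernel-weighted average over the stored training pairs. The function name `grnn_predict` and the toy data are hypothetical and are not taken from the paper.

```python
import math


def grnn_predict(x, train_x, train_y, sigma):
    """GRNN output for test point x, following equations (5) and (6).

    train_x: list of training input vectors x_i
    train_y: list of training outputs y_i
    sigma:   spread factor (the single tunable parameter)
    """
    # Unnormalised Gaussian kernel value for each training point.
    kernels = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2.0 * sigma ** 2))
        for xi in train_x
    ]
    total = sum(kernels)  # assumed nonzero, i.e. x is not far from every x_i
    weights = [k / total for k in kernels]  # W_i in equation (6)
    return sum(w * yi for w, yi in zip(weights, train_y))  # equation (5)


# Toy one-dimensional data (illustrative only).
train_x = [[0.0], [1.0], [2.0]]
train_y = [10.0, 20.0, 30.0]
center = grnn_predict([1.0], train_x, train_y, sigma=0.5)
near_first = grnn_predict([0.0], train_x, train_y, sigma=0.1)
```

In this symmetric toy case the prediction at the middle point is the middle output, and with a small spread factor the prediction at a training point approaches that point's own output. A larger spread factor smooths the estimate toward the overall mean.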

The optimal number of training years is ascertained for the MR and GRNN models by varying the training years from 5 to 30 and selecting the value with the least absolute prediction error during the validation period (1984–1993). A training span of $m$ years specifies that, for predicting the rainfall of the $r$th year, the available preceding $m$ years $r-1, r-2, \ldots, r-m$ present in a particular cluster are considered for training.
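The training-window rule can be read in more than one way. The sketch below assumes one reading: of the $m$ calendar years immediately preceding year $r$, only those that belong to the cluster are used for training. The function name `training_years` and the example cluster are hypothetical.

```python
def training_years(r, m, cluster_years):
    """Years among r-m, ..., r-1 that belong to the given cluster.

    This implements one reading of the rule in the text (an assumption):
    the window is the m preceding calendar years, filtered by cluster membership.
    """
    members = set(cluster_years)
    return [year for year in range(r - m, r) if year in members]


# Hypothetical cluster membership, for illustration only.
cluster = [1995, 1997, 1998, 2000, 2003]
selected = training_years(2001, 5, cluster)
```

Under this reading a sparse cluster yields fewer than $m$ training years, which is one reason the optimal span is chosen by validation rather than fixed in advance.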

4.4. Ensemble of Predictors. The complexity of the monsoon process makes it difficult for a single model to predict rainfall accurately. We design separate models for each cluster of years obtained by fuzzy clustering, using the four predictors described in Section 4.3. Finally, annual ISMR is presented as a weighted ensemble of the forecasts of the models designed for each cluster. The weight is taken as the fuzzy membership of the test year in the different clusters.

$$\text{Ensemble prediction}_t = \sum_{i=1}^{c} W_i^t \cdot P_i, \tag{7}$$

where $P_i$ represents the prediction given by the model for cluster $i$, $W_i^t$ is the fuzzy membership of the $t$th test year in cluster $i$, and $c$ is the total number of clusters.
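Equation (7) amounts to a membership-weighted sum of the cluster forecasts. A minimal sketch follows; the function name `ensemble_prediction` and the membership and forecast values are made up for illustration and are not results from the paper.

```python
def ensemble_prediction(memberships, cluster_predictions):
    """Weighted ensemble of equation (7): sum_i W_i^t * P_i.

    memberships:         fuzzy memberships of the test year, one per cluster
    cluster_predictions: forecast of each cluster's model for the same year
    """
    if len(memberships) != len(cluster_predictions):
        raise ValueError("one membership is required per cluster prediction")
    return sum(w * p for w, p in zip(memberships, cluster_predictions))


# Hypothetical memberships and cluster forecasts (as % of LPA) for one test year.
forecast = ensemble_prediction([0.2, 0.5, 0.3], [95.0, 102.0, 88.0])
```

With these values the ensemble forecast is approximately 96.4% of LPA: the cluster with the largest membership contributes most, while the other two still pull the forecast toward their own predictions.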

4.5. Validation of Proposed Approach. The study is performed on data for the period 1948–2013. Fuzzy clustering is performed over this period to group the years into three clusters. The number of clusters is decided based on cluster quality. Separate prediction models are designed for all three clusters, and the ensemble of the forecasts of these models is provided as the predicted Indian summer monsoon rainfall. The test period 2001–2013 is considered to evaluate the forecasting skill of our proposed approach.

The forecast models for annual ISMR are chiefly evaluated in terms of mean absolute error. Other error statistics, namely, root mean square error, prediction yields, Pearson correlation, and Willmott index of agreement, are also evaluated to judge the efficacy of our proposed approach. They are described below.


(i) Mean Absolute Error (MAE). The mean absolute error for prediction of annual ISMR is calculated as

$$\text{MAE} = \frac{\sum_{i=1}^{N} \lvert Y_i - X_i \rvert}{N}, \tag{8}$$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $N$ denotes the total number of test years.

(ii) Root Mean Square Error (RMSE). The root mean square error measures the differences between the model-predicted outputs and the actual values. It is a good measure for comparing the forecasting errors of various models.

$$\text{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} (Y_i - X_i)^2}{N}}. \tag{9}$$

(iii) Prediction Yield (PY). Prediction yields are evaluated at three different error categories (5%, 10%, and 15% errors) to assess the overall prediction results by judging the percentage of predicted years that fall within each allowed range of error.

(iv) Pearson Correlation Coefficient (PC). The Pearson correlation coefficient measures the strength of linear association between actual and predicted values, where a value of 1 means a perfect positive correlation and a value of −1 means a perfect negative correlation.

$$\text{PC} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{N} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}}, \tag{10}$$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $\bar{X}$ and $\bar{Y}$ are their corresponding means.

(v) Willmott Index of Agreement (WI). The Willmott index of agreement is a standardized measure of the degree of model prediction error. It varies between 0 and 1, with higher values indicating a better fit of the model for prediction.

$$\text{Index of agreement} = 1 - \frac{\sum_{i=1}^{N} \lvert X_i - Y_i \rvert^2}{\sum_{i=1}^{N} \left(\lvert Y_i - \bar{X} \rvert + \lvert X_i - \bar{X} \rvert\right)^2}. \tag{11}$$
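The five measures above can be sketched as plain functions, with `actual` playing the role of $X$ and `predicted` the role of $Y$. The function names and the toy series are hypothetical. The prediction yield is computed here from the relative error $|Y_i - X_i| / |X_i|$, which is an assumption, since the text does not spell out how the per-year percentage error is defined.

```python
import math


def mae(actual, predicted):
    """Mean absolute error, equation (8)."""
    return sum(abs(y - x) for x, y in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, equation (9)."""
    return math.sqrt(sum((y - x) ** 2 for x, y in zip(actual, predicted)) / len(actual))


def prediction_yield(actual, predicted, allowed_error_pct):
    """Percentage of years whose relative error is within the allowed range.

    Assumption: per-year error is 100 * |Y_i - X_i| / |X_i|, with X_i nonzero.
    """
    hits = sum(
        1 for x, y in zip(actual, predicted)
        if 100.0 * abs(y - x) / abs(x) <= allowed_error_pct
    )
    return 100.0 * hits / len(actual)


def pearson(actual, predicted):
    """Pearson correlation coefficient, equation (10)."""
    mx = sum(actual) / len(actual)
    my = sum(predicted) / len(predicted)
    num = sum((x - mx) * (y - my) for x, y in zip(actual, predicted))
    den = math.sqrt(sum((x - mx) ** 2 for x in actual)) * math.sqrt(
        sum((y - my) ** 2 for y in predicted)
    )
    return num / den


def willmott(actual, predicted):
    """Willmott index of agreement, equation (11)."""
    mx = sum(actual) / len(actual)
    num = sum((x - y) ** 2 for x, y in zip(actual, predicted))
    den = sum((abs(y - mx) + abs(x - mx)) ** 2 for x, y in zip(actual, predicted))
    return 1.0 - num / den


# Toy series as % of LPA (illustrative only, not data from the paper).
actual = [100.0, 90.0, 110.0, 95.0]
predicted = [102.0, 93.0, 104.0, 95.0]
```

For this toy series the absolute errors are 2, 3, 6, and 0, so the MAE is 2.75 and the RMSE is 3.5; three of the four years fall within a 5% relative error, giving a prediction yield of 75% at that category.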

5. Experimental Results and Analysis

In this section, we present the evaluation of our proposed fuzzy clustering-based approach. We first present the results of fuzzy clustering of the monsoon years for the different predictor sets. Forecasting skill is evaluated for all cluster models and the ensemble model in terms of mean absolute error for the test period 2001–2013. In addition, other measures, namely, root mean square error in prediction, correlation between predicted and actual rainfall, prediction yields, and agreement index between actual and predicted rainfall, are also estimated to establish the efficiency of our proposed approach to prediction of Indian summer monsoon rainfall.

Table 4: Cluster size (number of years) by fuzzy $c$-means clustering with an $\alpha$-cut of 0.3 over the period 1948–2013.

Predictor set | Cluster1 | Cluster2 | Cluster3
PredSet1 | 16 | 38 | 30
PredSet2 | 30 | 17 | 40
PredSet3 | 32 | 14 | 38
PredSet4 | 42 | 31 | 21
PredSet5 | 15 | 37 | 26

5.1. Clustering of Monsoon Years. Fuzzy clustering is performed over the period 1948–2013 to group the data into three clusters. We perform an $\alpha$-cut with $\alpha = 0.3$ to assign the data instances to the clusters. The value is ascertained empirically such that the distribution of elements within the clusters is regular. A data instance can be assigned to more than one cluster simultaneously. The cluster sizes for the various predictor sets are shown in Table 4.
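The $\alpha$-cut assignment can be sketched as follows. A year joins every cluster in which its fuzzy membership reaches $\alpha$, which is why the cluster sizes in Table 4 add up to more than the 66 years of the study period. The function name `alpha_cut`, the use of "greater than or equal to" as the comparison, and the membership values are assumptions made for illustration.

```python
def alpha_cut(memberships, alpha=0.3):
    """Assign each year to every cluster whose membership is at least alpha.

    memberships: dict mapping year -> list of membership degrees, one per cluster
    Returns a list of year lists, one per cluster.
    """
    n_clusters = len(next(iter(memberships.values())))
    clusters = [[] for _ in range(n_clusters)]
    for year, degrees in sorted(memberships.items()):
        for idx, degree in enumerate(degrees):
            if degree >= alpha:  # assumption: the cut is inclusive
                clusters[idx].append(year)
    return clusters


# Hypothetical membership degrees for three years and three clusters.
example = {
    2001: [0.5, 0.4, 0.1],
    2002: [0.1, 0.2, 0.7],
    2003: [0.33, 0.33, 0.34],
}
assigned = alpha_cut(example, alpha=0.3)
```

Here 2001 lands in two clusters, 2002 in one, and 2003 in all three, so three years produce six cluster memberships in total.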

5.2. Prediction Accuracy. We predict annual rainfall for all five predictor sets (Table 2) separately using four models, namely, MR, MLP, RNN, and GRNN. The test period is 2001 to 2013.

5.2.1. Multiple Regression Model (MR). Multiple regression models are built for every cluster by ascertaining the optimal training period for each predictor set. The optimal training period is found by varying the number of training years and selecting the value with the least absolute prediction error during the validation period (1984–1993). Individual cluster-based models as well as weighted ensemble models are considered for prediction. Table 5 gives the mean absolute error of the individual cluster-based and ensemble models for the test period 2001–2013. The ensemble model provides a mean absolute error of 6.2% for PredSet4 (Table 2). The ensemble model outperforms all the single-cluster models for PredSet1 and PredSet4 and stays within 0.8 percentage points of the best single-cluster model for the remaining predictor sets (Table 5). Figure 3 shows the interannual variability of actual and ensemble predicted rainfall as a percentage of the long period average (LPA).

5.2.2. Multilayer Perceptron Neural Network Model (MLP). The multilayer perceptron neural network model is designed with the four different parameter sets described in Table 3. The mean absolute errors of all cluster and ensemble models are shown in Table 6. The MLP ensemble model reports an error of 4.0% for PredSet4 (Table 2) with MLP parameters ParSet1 (Table 3). The actual rainfall and the rainfall predicted by the cluster models and the ensemble model are shown in Figure 4. The ensemble predicted rainfall closely follows the actual rainfall.

Table 5: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MR cluster models and the ensemble model for the test period 2001–2013. The minimum error is 6.2%.

Predictor set | Training years | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%)
PredSet1 | 20 | 9.4 | 9.3 | 10.9 | 8.6
PredSet2 | 20 | 11.0 | 7.5 | 9.4 | 8.3
PredSet3 | 15 | 10.9 | 6.5 | 9.2 | 6.7
PredSet4 | 15 | 10.4 | 10.1 | 6.8 | 6.2
PredSet5 | 15 | 7.6 | 8.5 | 8.4 | 7.9

Table 6: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MLP cluster models and the ensemble model for the test period 2001–2013. The minimum error is 4.0%.

Predictor set | Parameter set | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%)
PredSet1 | ParSet4 | 13.8 | 18.1 | 16.9 | 8.2
PredSet2 | ParSet3 | 16.0 | 7.9 | 11.0 | 5.2
PredSet3 | ParSet1 | 8.0 | 7.8 | 6.5 | 6.5
PredSet4 | ParSet1 | 9.3 | 10.7 | 4.5 | 4.0
PredSet5 | ParSet1 | 8.5 | 15.3 | 13.7 | 11.0

Figure 3: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three respective cluster models by MR for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for the test period 2001–2013. [Legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model; x-axis: Test years; y-axis: Rainfall as % of LPA.]

5.2.3. Recurrent Neural Network Model (RNN). The mean absolute errors for prediction of annual rainfall by the recurrent neural network model for the test period 2001–2013 are presented in Table 7. PredSet3 (Table 2) with RNN parameters ParSet1 (Table 3) gives an error of 5.1%. The RNN gives the training years weights that decrease with their distance from the test year. The pattern of actual and ensemble predicted rainfall in terms of percentage of LPA is shown in Figure 5.

Figure 4: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three respective cluster models by MLP for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for the test period 2001–2013. [Legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model; x-axis: Test years; y-axis: Rainfall as % of LPA.]

5.2.4. Generalized Regression Neural Network Model (GRNN). The errors of the generalized regression neural network ensemble and individual cluster models, in terms of mean absolute error, are presented in Table 8. The ensemble model reports an error of 6.1% for PredSet3 (Table 2). Figure 6 shows the interannual variations of the rainfall forecast by the GRNN ensemble model along with the actual rainfall pattern, in terms of percentage of LPA, for the period 2001–2013. It is observed that the predicted values are close to the actual rainfall pattern. Predictions by the models designed for the clusters are shown by different symbols.

Table 7: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual RNN cluster models and the ensemble model for the test period 2001–2013. The minimum error is 5.1%.

Predictor set | Parameter set | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%)
PredSet1 | ParSet1 | 11.3 | 7.1 | 16.8 | 7.0
PredSet2 | ParSet1 | 13.2 | 13.5 | 12.6 | 8.5
PredSet3 | ParSet1 | 12.9 | 5.4 | 6.0 | 5.1
PredSet4 | ParSet1 | 12.3 | 6.4 | 4.7 | 5.9
PredSet5 | ParSet2 | 15.1 | 16.1 | 13.4 | 8.8

Table 8: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual GRNN cluster models and the ensemble model for the test period 2001–2013. The minimum error is 6.1%.

Predictor set | Training years | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%)
PredSet1 | 20 | 10.0 | 7.6 | 7.6 | 6.4
PredSet2 | 30 | 7.1 | 8.9 | 7.6 | 6.4
PredSet3 | 20 | 5.8 | 9.2 | 6.0 | 6.1
PredSet4 | 20 | 6.3 | 6.6 | 7.2 | 6.3
PredSet5 | 25 | 7.1 | 9.4 | 11.9 | 6.6

Figure 5: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three respective cluster models by RNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for the test period 2001–2013. [Legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model; x-axis: Test years; y-axis: Rainfall as % of LPA.]

5.3. Statistical Measures for Validation of Proposed Approach. Next, we validate the models in terms of accuracy measures other than mean absolute error. Table 9 shows different forecast verification statistics for the ensemble models during the test period 2001–2013. We summarize the observations below.

(i) Root Mean Square Error (RMSE). The MLP ensemble model gives an RMSE of 5.3%, followed by the RNN ensemble model with 6.4%. The GRNN and MR models give RMSEs of 7.4% and 8.4%, respectively.

Figure 6: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three respective cluster models by GRNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for the test period 2001–2013. [Legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model; x-axis: Test years; y-axis: Rainfall as % of LPA.]

(ii) Prediction Yield (PY). PY for the 5% error category of the MR, MLP, RNN, and GRNN ensemble models is 46%, 69%, 53%, and 46%, respectively. They give prediction yields of 76%, 92%, 92%, and 84% for the allowed error category of 10%. Finally, at the error category of 15%, the MR, MLP, RNN, and GRNN ensemble models give yields of 92%, 100%, 92%, and 100%, respectively. Thus, for every model, at most one of the 13 predicted years deviates by more than 15% from the corresponding actual rainfall.

Table 9: Prediction evaluation statistics for ensemble models during the test period 2001–2013 (Section 4.5).

Verification measure | MR | MLP | RNN | GRNN
RMSE for forecast (%) | 8.4 | 5.3 | 6.4 | 7.4
PY (%) at allowed error 5% | 46 | 69 | 53 | 46
PY (%) at allowed error 10% | 76 | 92 | 92 | 84
PY (%) at allowed error 15% | 92 | 100 | 92 | 100
PC between actual and predicted rainfall | 0.61 | 0.81 | 0.71 | 0.49
WI between actual and predicted rainfall | 0.71 | 0.89 | 0.81 | 0.62

Table 10: Comparison of absolute errors for rainfall prediction by the proposed ensemble models (Ensml) with clustering (WC) against the standard method using the same models without clustering (NC).

Predictor set | MR Tot error (NC) (%) | MR Ensml error (WC) (%) | MLP Tot error (NC) (%) | MLP Ensml error (WC) (%) | RNN Tot error (NC) (%) | RNN Ensml error (WC) (%) | GRNN Tot error (NC) (%) | GRNN Ensml error (WC) (%)
PredSet1 | 8.9 | 8.6 | 10.0 | 8.2 | 11.7 | 7.0 | 6.9 | 6.4
PredSet2 | 9.2 | 8.2 | 12.8 | 5.2 | 10.7 | 8.5 | 7.2 | 6.4
PredSet3 | 7.4 | 6.7 | 6.7 | 6.5 | 6.2 | 5.1 | 6.1 | 6.1
PredSet4 | 6.7 | 6.2 | 5.8 | 4.0 | 6.0 | 5.5 | 6.3 | 6.3
PredSet5 | 8.2 | 7.9 | 9.7 | 11.0 | 8.9 | 8.8 | 9.0 | 6.7

(iii) Pearson Correlation (PC). PCs of 0.61, 0.81, 0.71, and 0.49 are observed for prediction by the MR, MLP, RNN, and GRNN ensemble models, respectively. The rainfall predicted by the MLP ensemble model is the most highly correlated with the actual values, while the correlation for the GRNN forecast is the lowest.

(iv) Willmott Index of Agreement (WI). WI for the MR, MLP, RNN, and GRNN ensemble models is 0.71, 0.89, 0.81, and 0.62, respectively. The index shows that the agreement between actual and predicted rainfall is high for the MLP and RNN ensemble models.

All of the mentioned statistical measures (Table 9), as well as the mean absolute error (Table 6) in prediction of monsoon, establish the MLP model as the best among all four proposed models.

5.4. Comparison of Results

5.4.1. Comparison with State-of-the-Art Methods. The proposed fuzzy clustering-based ensemble prediction models are compared with the models used by the Indian Meteorological Department (IMD), namely, the existing 16-parameter power regression model [4] and the 8- and 10-parameter models of Rajeevan et al. [5]. A test period of seven years, from 1996 to 2002, is considered. The IMD models give root mean square errors of 10.8%, 7.6%, and 6.4%, respectively. The MR, MLP, RNN, and GRNN ensemble models give root mean square errors of 6.0%, 3.4%, 4.4%, and 5.5%, respectively, outperforming all three IMD models. The results are shown as a bar graph in Figure 7.

Figure 7: Comparison of MR (grey), MLP (purple), RNN (light purple), and GRNN (deep purple) models with the existing IMD 16-param (deep blue), 10-param (blue), and 8-param (light blue) models for the time period 1996–2002 [4, 5]. Striped bars represent errors by our proposed models. [x-axis: Different models (16-par, 8-par, 10-par, MR, MLP, RNN, GRNN); y-axis: Root mean square error as % of LPA.]

5.4.2. Improvement of Cluster-Based Models over Conventional Models. The ensemble model error obtained by combining the outputs of all clusters' models is compared with the error obtained by the same model (with the same parameters) trained on the whole dataset without clustering. The mean absolute errors for the various combinations of models and predictor sets are shown in Table 10. The results show that the clustering and ensemble method improves on or matches the nonclustered conventional method in every case except MLP with PredSet5.

Table 11: Physical climatic events under study.

Climatic event | Number of years | Years associated with the event
Drought | 13 | 1951, 1965, 1966, 1968, 1972, 1974, 1979, 1982, 1986, 1987, 2002, 2004, 2009
Flood | 11 | 1953, 1956, 1958, 1959, 1961, 1964, 1970, 1975, 1983, 1988, 1994
El-Nino | 23 | 1951, 1953, 1957, 1958, 1963, 1965, 1966, 1968, 1969, 1972, 1977, 1982, 1983, 1986, 1987, 1991, 1992, 1994, 1997, 2002, 2004, 2006, 2009
La-Nina | 22 | 1950, 1954, 1955, 1956, 1964, 1970, 1971, 1973, 1974, 1975, 1984, 1985, 1988, 1989, 1995, 1998, 1999, 2000, 2007, 2008, 2010, 2011
Positive IOD | 12 | 1957, 1961, 1963, 1967, 1972, 1977, 1982, 1983, 1994, 1997, 2006, 2007
Negative IOD | 10 | 1958, 1960, 1964, 1971, 1974, 1975, 1989, 1992, 1993, 1996

Table 12: Thresholds of support and confidence measures for associating the obtained clusters with physical climatic events.

Predictor set | Support threshold | Confidence threshold
PredSet1 | 0.37 | 0.30
PredSet2 | 0.25 | 0.46
PredSet3 | 0.21 | 0.43
PredSet4 | 0.29 | 0.61
PredSet5 | 0.21 | 0.54

5.5. Prediction of the Year 2014. The annual Indian summer monsoon rainfall for the year 2014 is 781.7 mm, which is 87.8% of the LPA value. The proposed clustering-based ensemble MR, MLP, RNN, and GRNN models predict the rainfall of 2014 as 96.1%, 80.3%, 80.0%, and 95.3% of LPA, respectively. Thus, the proposed models show an absolute error of approximately 7.0% for forecasting the rainfall of 2014.

6. Meteorological Analysis

Next, we try to interpret each cluster in terms of physical climatic events. The clusters obtained by fuzzy clustering are physically interpreted as being characterized by some global climatic events. The climatic events considered during the period 1948 to 2013 (the period used for clustering in our work) are El-Nino and La-Nina (http://ggweather.com/enso/oni.htm), positive and negative Indian Ocean dipole (http://bom.gov.au/climate/IOD), drought, and flood, as shown in Table 11.

Figure 8 shows the El-Nino and La-Nina years associated with drought, normal, and excess rainfall years during 1948–2013. Years with rainfall more than 10% above LPA are excess rainfall years, and years with rainfall more than 10% below LPA are drought years. The El-Nino and La-Nina years are shown by color codes (light green and green) in the figure. The chart helps to visualize the co-occurrence of El-Nino and La-Nina events with the extremes of ISMR.

6.1. Measuring Association between Climatic Events and ISMR. Support and confidence measures are used to relate physical climatic events to the clusters generated by fuzzy clustering. They are defined below.

Figure 8: El-Nino (light green) and La-Nina (green) years associated with drought (years below −10% of LPA rainfall), normal (years between +10% and −10% of LPA rainfall), and excess (years above 10% of LPA rainfall) years during the period 1948–2013. [Legend: Normal years, El-Nino years, La-Nina years; x-axis: Years; y-axis: Deviation of rainfall from % of LPA.]

(i) Support. Support is defined as the percentage of the total number of years in the cluster that correspond to the climatic event:

$$\text{Support} = \frac{x_{ce}}{N}, \tag{12}$$

where $x_{ce}$ denotes the number of years associated with a specific climatic event in the cluster and $N$ is the total count of years in the cluster.

(ii) Confidence. Confidence is defined as the percentage of years associated with the climatic event in the cluster relative to the total number of such event years:

$$\text{Confidence} = \frac{x_{ce}}{T_{ce}}, \tag{13}$$

where $T_{ce}$ is the number of years associated with the climatic event during the period 1948–2013.
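The two measures in equations (12) and (13) can be sketched as follows. The flood years are taken from Table 11 and the thresholds are the PredSet1 values from Table 12; the cluster membership list and the function names are hypothetical, chosen only to illustrate the calculation. Both measures are returned as fractions so that they compare directly with the thresholds in Table 12.

```python
def support(cluster_years, event_years):
    """Equation (12): event years in the cluster / total years in the cluster."""
    common = set(cluster_years) & set(event_years)
    return len(common) / len(set(cluster_years))


def confidence(cluster_years, event_years):
    """Equation (13): event years in the cluster / all years of that event."""
    common = set(cluster_years) & set(event_years)
    return len(common) / len(set(event_years))


def is_associated(cluster_years, event_years, support_threshold, confidence_threshold):
    """A cluster is related to an event only if both thresholds are attained."""
    return (
        support(cluster_years, event_years) >= support_threshold
        and confidence(cluster_years, event_years) >= confidence_threshold
    )


# Flood years from Table 11.
flood_years = [1953, 1956, 1958, 1959, 1961, 1964, 1970, 1975, 1983, 1988, 1994]
# Hypothetical cluster membership, for illustration only.
cluster_years = [1953, 1956, 1958, 1960, 1962, 1964, 1970, 1990]

s = support(cluster_years, flood_years)
c = confidence(cluster_years, flood_years)
# Thresholds for PredSet1 from Table 12.
linked = is_associated(cluster_years, flood_years, 0.37, 0.30)
```

In this example five of the eight cluster years are flood years, so the support is 0.625 and the confidence is 5/11, and the hypothetical cluster would be associated with flood under the PredSet1 thresholds.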

Figure 9: Histogram of the confidence and support measures as bins of year-count before (a) and after (b) thresholding for PredSet1. [Axes: Support, Confidence, Year-count.]

Table 13: Identified physical climatic events associated with the clusters obtained by fuzzy clustering.

Predictor | Cluster1 Cluster2 Cluster3
PredSet1 | Drought El-Nino La-Nina La-Nina
PredSet2 | Flood La-Nina Drought Drought El-Nino La-Nina
PredSet3 | El-Nino positive IOD Drought Drought El-Nino
PredSet4 | La-Nina Flood La-Nina Drought
PredSet5 | - Drought El-Nino Flood

We relate a cluster to a physical climatic event described in Table 11 if both the support and confidence measures attain the corresponding thresholds. The thresholds are chosen such that 50% of the years of study remain under consideration. A low threshold weakens the significance of relating a climatic event to a particular cluster; on the other hand, if fewer years are retained, the threshold values must be high, which in turn leaves out most of the clusters. Therefore, as a compromise between the extremes, 50% of the years are considered. Figure 9 shows histograms with confidence and support as bins of year-count before and after the thresholding process, respectively, for predictor set PredSet1 (Table 2). The threshold values obtained for the predictor sets are presented in Table 12. For each predictor set, we associate the clusters with physical climatic events if they satisfy both the support and confidence thresholds. The climatic events corresponding to each cluster are shown in Table 13. The results establish the coexistence of La-Nina and flood events. They also shed light on the high probability of El-Nino, drought, and positive IOD events occurring simultaneously.

7. Conclusion

Monsoon is an important phenomenon for the economic development of an agricultural land like India. The large variability of monsoon over the years makes prediction of rainfall a challenging task. This paper addresses the problem by clustering the years into similar groups and providing a multimodel ensemble forecast for Indian summer monsoon rainfall.

Different climatic parameters with best correlated monthvalue are identified and five different predictor sets are builtfor prediction of Indian monsoon Four different modelsnamely MR MLP RNN and GRNN are designed foreach cluster exclusively The final forecast is provided byweighted ensemble of forecasts by each clusterrsquos model whereweight is considered as fuzzy membership of belonging-ness in each cluster Multilayer perceptron ensemble modelprovides mean absolute error of 40 for prediction ofannual rainfall which is appreciable for forecasting complexmonsoon process Proposed fuzzy clustering-based ensembleapproach surpasses the conventional approach Performanceof proposed clustering-based ensemble models is superior toexisting IMDrsquosmodels [4 5]The error statistics also ascertainthe superiority of multilayer perceptron model over otherthree proposed models Lastly in meteorological context theclusters are linked with global climatic events

In the future large number of climatic parametersinfluencing Indian monsoon can be explored and differentpredictor set can be used for different clusters of years toprovide even better forecasting accuracy

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

12 Advances in Meteorology

Acknowledgment

This work is supported by the RBU project through the RESPOND program of ISRO through KCSTC, IIT Kharagpur.

References

[1] H. F. Blanford, “On the connexion of the Himalaya snowfall with dry winds and seasons of drought in India,” Proceedings of the Royal Society of London, vol. 37, no. 232–234, pp. 3–22, 1884.

[2] G. T. Walker, “Correlation in seasonal variations of weather-IV, a further study of world weather,” Memoirs of the India Meteorological Department, vol. 24, pp. 275–332, 1924.

[3] V. Thapliyal and S. M. Kulshrestha, “Recent models for long range forecasting of South-West monsoon rainfall in India,” Mausam, vol. 43, no. 3, pp. 239–248, 1992.

[4] V. Gowariker, V. Thapliyal, S. M. Kulshrestha, G. S. Mandal, N. Sen Roy, and D. R. Sikka, “A power regression model for long range forecast of southwest monsoon rainfall over India,” Mausam, vol. 42, no. 2, pp. 125–130, 1991.

[5] M. Rajeevan, D. S. Pai, S. K. Dikshit, and R. R. Kelkar, “IMD's new operational models for long-range forecast of southwest monsoon rainfall over India and their verification for 2003,” Current Science, vol. 86, no. 3, pp. 422–431, 2004.

[6] M. Rajeevan, D. S. Pai, R. A. Kumar, and B. Lal, “New statistical models for long-range forecasting of southwest monsoon rainfall over India,” Climate Dynamics, vol. 28, no. 7-8, pp. 813–828, 2007.

[7] J. Schewe and A. Levermann, “A statistically predictive model for future monsoon failure in India,” Environmental Research Letters, vol. 7, no. 4, Article ID 044023, 2012.

[8] Q. Wu, Y. Yan, and D. Chen, “A linear Markov model for East Asian monsoon seasonal forecast,” Journal of Climate, vol. 26, no. 14, pp. 5183–5195, 2013.

[9] K. Fan, Y. Liu, and H. Chen, “Improving the prediction of the East Asian summer monsoon: new approaches,” Weather & Forecasting, vol. 27, no. 4, pp. 1017–1030, 2012.

[10] F. Mekanik, M. A. Imteaz, S. Gato-Trinidad, and A. Elmahdi, “Multiple regression and artificial neural network for long-term rainfall forecasting using large scale climate modes,” Journal of Hydrology, vol. 503, pp. 11–21, 2013.

[11] A. K. Sahai, M. K. Soman, and V. Satyan, “All India summer monsoon rainfall prediction using an artificial neural network,” Climate Dynamics, vol. 16, no. 4, pp. 291–302, 2000.

[12] W.-C. Hong, “Rainfall forecasting by technological machine learning models,” Applied Mathematics and Computation, vol. 200, no. 1, pp. 41–57, 2008.

[13] S. Chattopadhyay and G. Chattopadhyay, “Comparative study among different neural net learning algorithms applied to rainfall time series,” Meteorological Applications, vol. 15, no. 2, pp. 273–280, 2008.

[14] N. Acharya, S. C. Kar, M. A. Kulkarni, U. C. Mohanty, and L. N. Sahoo, “Multi-model ensemble schemes for predicting northeast monsoon rainfall over peninsular India,” Journal of Earth System Science, vol. 120, no. 5, pp. 795–805, 2011.

[15] V. R. Durai and R. Bhardwaj, “Improving precipitation forecasts skill over India using a multi-model ensemble technique,” Geofizika, vol. 30, no. 2, pp. 119–141, 2013.

[16] B. Parthasarathy, A. A. Munot, and D. R. Kothawale, “Monthly and seasonal rainfall series for All-India homogeneous regions and meteorological subdivisions: 1871–1994,” Tech. Rep. RR-065, Indian Institute of Tropical Meteorology, 1995.

[17] G. P. Compo, J. S. Whitaker, P. D. Sardeshmukh et al., “The twentieth century reanalysis project,” Quarterly Journal of the Royal Meteorological Society, vol. 137, no. 654, pp. 1–28, 2011.

[18] E. Kalnay, M. Kanamitsu, R. Kistler et al., “The NCEP/NCAR 40-year reanalysis project,” Bulletin of the American Meteorological Society, vol. 77, no. 3, pp. 437–471, 1996.

[19] E. M. Rasmusson and T. H. Carpenter, “Variations in tropical sea surface temperature and surface wind fields associated with the Southern Oscillation/El Niño,” Monthly Weather Review, vol. 110, no. 5, pp. 354–384, 1982.


(i) Mean Absolute Error (MAE). Mean absolute error for prediction of annual ISMR is calculated in the following way:

$$\mathrm{MAE} = \frac{\sum_{i=1}^{N} |Y_i - X_i|}{N}, \tag{8}$$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $N$ denotes the total number of test years.

(ii) Root Mean Square Error (RMSE). Root mean square error measures the differences between the model-predicted output and the actual values. It is a good measure for comparing the forecasting errors of various models:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} (Y_i - X_i)^2}{N}}. \tag{9}$$

(iii) Prediction Yield (PY). Prediction yields are evaluated at three different error categories (5%, 10%, and 15% errors) to assess the overall prediction results by the percentage of predicted years that fall within each allowed range of error.

(iv) Pearson Correlation Coefficient (PC). Pearson correlation coefficient measures the strength of linear association between actual and predicted values, where a value of 1 means a perfect positive correlation and a value of −1 means a perfect negative correlation:

$$\mathrm{PC} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{N} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}}, \tag{10}$$

where $X$ and $Y$ are the actual and predicted ISMR series for the test period and $\bar{X}$ and $\bar{Y}$ are their corresponding means.

(v) Willmott Index of Agreement (WI). Willmott index of agreement is a standardized measure of the degree of model prediction error. It varies between 0 and 1, with higher values indicating a better fit of the model for prediction:

$$\text{Index of agreement} = 1 - \frac{\sum_{i=1}^{N} |X_i - Y_i|^2}{\sum_{i=1}^{N} \left(|Y_i - \bar{X}| + |X_i - \bar{X}|\right)^2}. \tag{11}$$
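The five measures above can be sketched in a few lines of Python with NumPy. This is a minimal illustration rather than the authors' implementation: the function names and the sample values are made up, rainfall is assumed to be expressed as percent of LPA, and the prediction yield is assumed to count a year as a hit when its absolute error is less than or equal to the allowed error.

```python
import numpy as np


def mae(actual, predicted):
    """Mean absolute error, following equation (8)."""
    x = np.asarray(actual, dtype=float)
    y = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(y - x)))


def rmse(actual, predicted):
    """Root mean square error, following equation (9)."""
    x = np.asarray(actual, dtype=float)
    y = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((y - x) ** 2)))


def prediction_yield(actual, predicted, allowed_error):
    """Percent of years with absolute error <= allowed_error (assumed inclusive)."""
    x = np.asarray(actual, dtype=float)
    y = np.asarray(predicted, dtype=float)
    return float(100.0 * np.mean(np.abs(y - x) <= allowed_error))


def pearson(actual, predicted):
    """Pearson correlation coefficient, following equation (10)."""
    x = np.asarray(actual, dtype=float)
    y = np.asarray(predicted, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return float(np.sum(dx * dy) / (np.sqrt(np.sum(dx ** 2)) * np.sqrt(np.sum(dy ** 2))))


def willmott(actual, predicted):
    """Willmott index of agreement, following equation (11)."""
    x = np.asarray(actual, dtype=float)
    y = np.asarray(predicted, dtype=float)
    x_bar = x.mean()
    numerator = np.sum(np.abs(x - y) ** 2)
    denominator = np.sum((np.abs(y - x_bar) + np.abs(x - x_bar)) ** 2)
    return float(1.0 - numerator / denominator)


# Made-up rainfall values (percent of LPA), for illustration only.
actual = [100.0, 90.0, 110.0, 95.0]
predicted = [98.0, 95.0, 104.0, 95.0]
```

With these sample series, `mae(actual, predicted)` and `prediction_yield(actual, predicted, 5.0)` can be checked by hand against the absolute errors 2, 5, 6, and 0.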

5. Experimental Results and Analysis

In this section, we present the evaluation of our proposed fuzzy clustering-based approach. We first present the results of fuzzy clustering of the monsoon years for the different predictor sets. Forecasting skill is evaluated for all cluster models and the ensemble model in terms of mean absolute error for the test period 2001–2013. In addition, other measures, namely, root mean square error in prediction, correlation between predicted and actual rainfall, prediction yields, and the agreement index between actual and predicted rainfall, are also estimated to establish the efficiency of our proposed approach to prediction of Indian summer monsoon rainfall.

Table 4: Cluster size (number of years) by fuzzy c-means clustering with α-cut of 0.3 over the period 1948–2013.

Predictor set  Cluster1  Cluster2  Cluster3
PredSet1       16        38        30
PredSet2       30        17        40
PredSet3       32        14        38
PredSet4       42        31        21
PredSet5       15        37        26

5.1. Clustering of Monsoon Years. Fuzzy clustering is performed over the period 1948–2013 to group the data into three clusters. We apply an α-cut with α = 0.3 to assign the data instances to the clusters. The value is chosen empirically such that the distribution of elements within clusters is regular. A data instance can be assigned to more than one cluster simultaneously. The cluster sizes for the various predictor sets are shown in Table 4.
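The α-cut assignment described above can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes that a year is assigned to every cluster in which its fuzzy membership is at least α, and the membership matrix shown is made up rather than taken from the paper.

```python
import numpy as np


def alpha_cut(memberships, alpha=0.3):
    """Return a boolean matrix: True where a year belongs to a cluster.

    memberships: array of shape (n_years, n_clusters) holding fuzzy
    membership degrees. A year is assigned to every cluster in which its
    membership is at least alpha (assumed inclusive threshold).
    """
    u = np.asarray(memberships, dtype=float)
    return u >= alpha


# Made-up membership degrees for three years and three clusters.
u = np.array([
    [0.70, 0.20, 0.10],  # belongs to cluster 1 only
    [0.45, 0.35, 0.20],  # belongs to clusters 1 and 2
    [0.33, 0.33, 0.34],  # belongs to all three clusters
])
assigned = alpha_cut(u, alpha=0.3)
cluster_sizes = assigned.sum(axis=0)  # number of years per cluster
```

Because a year can fall into several clusters, the cluster sizes can add up to more than the number of years. This matches Table 4, where each row sums to more than the 66 years of 1948–2013.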

5.2. Prediction Accuracy. We predict annual rainfall for all five predictor sets (Table 2) separately using four models, namely, MR, MLP, RNN, and GRNN. The test period is 2001 to 2013.

5.2.1. Multiple Regression Model (MR). Multiple regression models are built for every cluster by ascertaining the optimal training period for each predictor set. The optimal training period is found by varying the number of training years and selecting the one with the least absolute error in prediction during the validation period (1984–1993). Both individual cluster-based models and the weighted ensemble model are considered for prediction. Table 5 gives the mean absolute errors of the individual cluster-based and ensemble models for the test period 2001–2013. The ensemble model provides a minimum mean absolute error of 6.2%, obtained for PredSet4 (Table 2). The ensemble model outperforms all the single cluster models for PredSet1 and PredSet4 and stays within 0.8 percentage points of the best single cluster model for the other predictor sets (Table 5). Figure 3 shows the interannual variability of actual and ensemble-predicted rainfall as percent of long period average (LPA).
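The weighted ensemble mentioned above can be sketched as a membership-weighted average of the cluster models' forecasts. This is a hedged illustration: the paper states that the weights are the fuzzy memberships, while the normalisation by the sum of memberships, the function name, and the sample numbers are assumptions made here.

```python
import numpy as np


def ensemble_forecast(cluster_forecasts, memberships):
    """Combine per-cluster forecasts for one year, weighted by fuzzy membership.

    cluster_forecasts: forecast of each cluster's model (e.g. percent of LPA).
    memberships: fuzzy membership of the test year in each cluster.
    Dividing by the membership sum is an assumption; it changes nothing
    when the memberships already sum to 1.
    """
    f = np.asarray(cluster_forecasts, dtype=float)
    u = np.asarray(memberships, dtype=float)
    return float(np.sum(u * f) / np.sum(u))


# Made-up forecasts (percent of LPA) and memberships for a single test year.
forecast = ensemble_forecast([96.0, 88.0, 102.0], [0.5, 0.3, 0.2])
```

A weighted average always lies between the smallest and largest cluster forecast, so the ensemble cannot be more extreme than its members for any given year.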

5.2.2. Multilayer Perceptron Neural Network Model (MLP). The multilayer perceptron neural network model is designed with four different sets of parameters, described in Table 3. Mean absolute errors of all cluster and ensemble models are shown in Table 6. The MLP model reports an error of 4.0% for PredSet4 (Table 2) with MLP parameters ParSet1 (Table 3). The actual rainfall and the rainfall predicted by the cluster models and the ensemble model are shown in Figure 4. The ensemble-predicted rainfall closely follows the actual rainfall.

Table 5: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MR cluster models and the ensemble model for test period 2001–2013. Reports minimum error of 6.2%.

Predictor set  Training years  Cluster1 error (%)  Cluster2 error (%)  Cluster3 error (%)  Ensemble error (%)
PredSet1       20              9.4                 9.3                 10.9                8.6
PredSet2       20              11.0                7.5                 9.4                 8.3
PredSet3       15              10.9                6.5                 9.2                 6.7
PredSet4       15              10.4                10.1                6.8                 6.2
PredSet5       15              7.6                 8.5                 8.4                 7.9

Table 6: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MLP cluster models and the ensemble model for test period 2001–2013. Reports minimum error of 4.0%.

Predictor set  Parameter set  Cluster1 error (%)  Cluster2 error (%)  Cluster3 error (%)  Ensemble error (%)
PredSet1       ParSet4        13.8                18.1                16.9                8.2
PredSet2       ParSet3        16.0                7.9                 11.0                5.2
PredSet3       ParSet1        8.0                 7.8                 6.5                 6.5
PredSet4       ParSet1        9.3                 10.7                4.5                 4.0
PredSet5       ParSet1        8.5                 15.3                13.7                11.0

Figure 3: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three cluster models using MR for PredSet4. The deep and light purple bars represent the actual and predicted ISMR as percent of LPA. The symbols represent forecasts given by the individual cluster models (Clus1, Clus2, and Clus3). The results are shown for test period 2001–2013.

5.2.3. Recurrent Neural Network Model (RNN). Mean absolute errors for prediction of annual rainfall by the recurrent neural network model for the test period 2001–2013 are presented in Table 7. PredSet3 (Table 2) with RNN parameters ParSet1 (Table 3) gives an error of 5.1%. The RNN assigns weights to the training years that decrease with their distance from the test year. The pattern of actual and ensemble-predicted rainfall as percent of LPA is shown in Figure 5.

Figure 4: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three cluster models using MLP for PredSet4. The deep and light purple bars represent the actual and predicted ISMR as percent of LPA. The symbols represent forecasts given by the individual cluster models (Clus1, Clus2, and Clus3). The results are shown for test period 2001–2013.

5.2.4. Generalized Regression Neural Network Model (GRNN). The mean absolute errors of the generalized regression neural network ensemble and individual cluster models are presented in Table 8. The model reports an error of 6.1% for PredSet3 (Table 2). Figure 6 shows the interannual variations of the rainfall forecast by the GRNN ensemble model along with the actual rainfall pattern, as percent of LPA, for the period 2001–2013. The predicted values are close to the actual rainfall patterns. Predictions by the models designed for the individual clusters are shown by different symbols.

Table 7: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual RNN cluster models and the ensemble model for test period 2001–2013. Reports minimum error of 5.1%.

Predictor set  Parameter set  Cluster1 error (%)  Cluster2 error (%)  Cluster3 error (%)  Ensemble error (%)
PredSet1       ParSet1        11.3                7.1                 16.8                7.0
PredSet2       ParSet1        13.2                13.5                12.6                8.5
PredSet3       ParSet1        12.9                5.4                 6.0                 5.1
PredSet4       ParSet1        12.3                6.4                 4.7                 5.9
PredSet5       ParSet2        15.1                16.1                13.4                8.8

Table 8: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual GRNN cluster models and the ensemble model for test period 2001–2013. Reports minimum error of 6.1%.

Predictor set  Training years  Cluster1 error (%)  Cluster2 error (%)  Cluster3 error (%)  Ensemble error (%)
PredSet1       20              10.0                7.6                 7.6                 6.4
PredSet2       30              7.1                 8.9                 7.6                 6.4
PredSet3       20              5.8                 9.2                 6.0                 6.1
PredSet4       20              6.3                 6.6                 7.2                 6.3
PredSet5       25              7.1                 9.4                 11.9                6.6

Figure 5: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three cluster models using RNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR as percent of LPA. The symbols represent forecasts given by the individual cluster models (Clus1, Clus2, and Clus3). The results are shown for test period 2001–2013.

5.3. Statistical Measures for Validation of Proposed Approach. Next, we validate the models in terms of accuracy measures other than mean absolute error. Table 9 shows different forecast verification statistics for the ensemble models during test period 2001–2013. We summarize the observations below.

(i) Root Mean Square Error (RMSE). The MLP ensemble model gives an RMSE of 5.3%, followed by the RNN ensemble model with 6.4%. The GRNN and MR models give RMSEs of 7.4% and 8.4%, respectively.

Figure 6: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three cluster models using GRNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR as percent of LPA. The symbols represent forecasts given by the individual cluster models (Clus1, Clus2, and Clus3). The results are shown for test period 2001–2013.

(ii) Prediction Yield (PY). PY for the 5% error category of the MR, MLP, RNN, and GRNN ensemble models is 46%, 69%, 53%, and 46%, respectively. They give prediction yields of 76%, 92%, 92%, and 84% for the 10% allowed-error category. Finally, at the 15% error category, the MR, MLP, RNN, and GRNN ensemble models give yields of 92%, 100%, 92%, and 100%, respectively. Thus, the predicted years rarely show abrupt deviation from the corresponding actual rainfall pattern.

Table 9: Prediction evaluation statistics for ensemble models during test period 2001–2013 (Section 4.5).

Verification measures                     MR    MLP   RNN   GRNN
RMSE for forecast (%)                     8.4   5.3   6.4   7.4
PY (%) at allowed error 5%                46    69    53    46
PY (%) at allowed error 10%               76    92    92    84
PY (%) at allowed error 15%               92    100   92    100
PC between actual and predicted rainfall  0.61  0.81  0.71  0.49
WI between actual and predicted rainfall  0.71  0.89  0.81  0.62

Table 10: Comparison of absolute errors for rainfall prediction by the proposed ensemble models (Ensml) with clustering (WC) against the standard method using the same models without clustering (NC). NC columns give the total error (%); WC columns give the ensemble error (%).

Predictor set  MR NC  MR WC  MLP NC  MLP WC  RNN NC  RNN WC  GRNN NC  GRNN WC
PredSet1       8.9    8.6    10.0    8.2     11.7    7.0     6.9      6.4
PredSet2       9.2    8.2    12.8    5.2     10.7    8.5     7.2      6.4
PredSet3       7.4    6.7    6.7     6.5     6.2     5.1     6.1      6.1
PredSet4       6.7    6.2    5.8     4.0     6.0     5.5     6.3      6.3
PredSet5       8.2    7.9    9.7     11.0    8.9     8.8     9.0      6.7

(iii) Pearson Correlation (PC). PCs of 0.61, 0.81, 0.71, and 0.49 are observed for prediction by the MR, MLP, RNN, and GRNN ensemble models, respectively. The rainfall predicted by the MLP ensemble model is the most highly correlated with the actual values, while the correlation for the GRNN forecast is the lowest.

(iv) Willmott Index of Agreement (WI). WI for the MR, MLP, RNN, and GRNN ensemble models is 0.71, 0.89, 0.81, and 0.62, respectively. The index shows that the agreement between actual and predicted rainfall is high for the MLP and RNN ensemble models.

All of the mentioned statistical measures (Table 9), as well as the mean absolute error (Table 6) in prediction of monsoon, ascertain the MLP model to be the best among all four proposed models.
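As a small worked check on the prediction yields in item (ii), the reported percentages are consistent with whole-year counts out of the 13 test years (2001–2013) when the percentage is truncated to an integer. The sketch below is our own reading of the numbers, not something the text states; the function name and the mapping of yields to year counts are assumptions.

```python
N_TEST_YEARS = 13  # test period 2001-2013, inclusive


def yield_percent(years_within_range, n_years=N_TEST_YEARS):
    """Prediction yield as an integer percent, truncated (assumed convention)."""
    return (100 * years_within_range) // n_years


# Reported yield (percent) -> number of test years that would produce it.
implied_year_counts = {46: 6, 53: 7, 69: 9, 76: 10, 84: 11, 92: 12, 100: 13}

consistent = all(
    yield_percent(count) == reported
    for reported, count in implied_year_counts.items()
)
```

Under this reading, a yield of 92% at the 15% error category corresponds to 12 of the 13 test years falling within the allowed error.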

5.4. Comparison of Results

5.4.1. Comparison with State-of-the-Art Methods. The proposed fuzzy clustering-based ensemble prediction models are compared with the models used by the Indian Meteorological Department (IMD), namely, the existing 16-parameter power regression model [4] and the 8- and 10-parameter models of Rajeevan et al. [5]. A test period of seven years, from 1996 to 2002, is considered. The IMD models give root mean square errors of 10.8%, 7.6%, and 6.4%, respectively. The MR, MLP, RNN, and GRNN ensemble models give root mean square errors of 6.0%, 3.4%, 4.4%, and 5.5%, respectively, outperforming all three IMD models. The results are shown as a bar graph in Figure 7.

Figure 7: Comparison of the MR (grey), MLP (purple), RNN (light purple), and GRNN (deep purple) models with the existing IMD 16-param (deep blue), 10-param (blue), and 8-param (light blue) models for the period 1996–2002 [4, 5]. Striped bars represent errors of our proposed models. The vertical axis shows root mean square error as percent of LPA.

5.4.2. Improvement of Cluster-Based Models over Conventional Models. The ensemble model error obtained by combining the outputs of all clusters' models is compared with the error obtained by the same model (with the same parameters) trained on the whole dataset without clustering. The mean absolute errors for the various model and predictor set combinations are shown in Table 10. The results show that the clustering and ensemble method improves on or matches the nonclustered conventional method for every combination except MLP with PredSet5.

Table 11: Physical climatic events under study.

Climatic event  Number of years  Years associated with the event
Drought         13               1951, 1965, 1966, 1968, 1972, 1974, 1979, 1982, 1986, 1987, 2002, 2004, 2009
Flood           11               1953, 1956, 1958, 1959, 1961, 1964, 1970, 1975, 1983, 1988, 1994
El-Niño         23               1951, 1953, 1957, 1958, 1963, 1965, 1966, 1968, 1969, 1972, 1977, 1982, 1983, 1986, 1987, 1991, 1992, 1994, 1997, 2002, 2004, 2006, 2009
La-Niña         22               1950, 1954, 1955, 1956, 1964, 1970, 1971, 1973, 1974, 1975, 1984, 1985, 1988, 1989, 1995, 1998, 1999, 2000, 2007, 2008, 2010, 2011
Positive IOD    12               1957, 1961, 1963, 1967, 1972, 1977, 1982, 1983, 1994, 1997, 2006, 2007
Negative IOD    10               1958, 1960, 1964, 1971, 1974, 1975, 1989, 1992, 1993, 1996

Table 12: Thresholds of the support and confidence measures for associating the obtained clusters with physical climatic events.

Predictor set  Support threshold  Confidence threshold
PredSet1       0.37               0.30
PredSet2       0.25               0.46
PredSet3       0.21               0.43
PredSet4       0.29               0.61
PredSet5       0.21               0.54

5.5. Prediction of the Year 2014. Annual Indian summer monsoon rainfall for the year 2014 is 781.7 mm, which is 87.8% of the LPA value. The proposed clustering-based ensemble MR, MLP, RNN, and GRNN models predict the rainfall of 2014 as 96.1%, 80.3%, 80.0%, and 95.3% of LPA, respectively. Thus, the proposed models show an absolute error of 7.0% for forecasting the rainfall of 2014.

6. Meteorological Analysis

Next, we interpret each cluster in terms of physical climatic events. The clusters obtained by fuzzy clustering are physically interpreted as being characterized by certain global climatic events. The climatic events studied for the period 1948 to 2013 (the period considered for clustering in our work) are El-Niño and La-Niña (http://ggweather.com/enso/oni.htm), positive and negative Indian Ocean dipole (http://bom.gov.au/climate/IOD), drought, and flood, as shown in Table 11.

Figure 8 shows the El-Niño and La-Niña years associated with drought, normal, and excess rainfall years during 1948–2013. Years with rainfall more than 10% above LPA are excess rainfall years, and years with rainfall more than 10% below LPA are drought years. The El-Niño and La-Niña years are shown by color codes (light green and green) in the figure. The chart helps to visualize the cooccurrence of El-Niño and La-Niña events with extremes of ISMR.

Figure 8: Association of El-Niño (light green) and La-Niña (green) years with drought (years more than 10% below LPA rainfall), normal (years between +10% and −10% of LPA rainfall), and excess (years more than 10% above LPA rainfall) years during the period 1948–2013. The vertical axis shows the deviation of rainfall from LPA in percent; the horizontal axis shows years.

6.1. Measuring Association between Climatic Events and ISMR. Support and confidence measures are used to relate physical climatic events to the clusters generated by fuzzy clustering. They are defined below.

(i) Support. Support is defined as the proportion of the total number of years in the cluster that correspond to the climatic event:

$$\text{Support} = \frac{x_{ce}}{N}, \tag{12}$$

where $x_{ce}$ denotes the number of years associated with a specific climatic event in the cluster and $N$ is the total count of years in the cluster.

(ii) Confidence. Confidence is defined as the proportion of years associated with the climatic event in the cluster to the total number of such event years:

$$\text{Confidence} = \frac{x_{ce}}{T_{ce}}, \tag{13}$$

where $T_{ce}$ is the number of years associated with the climatic event during the period 1948–2013.
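The two measures defined above can be sketched with plain Python sets. The drought years are those listed in Table 11; the cluster membership used here is hypothetical and chosen only to illustrate equations (12) and (13), and the function name is an assumption.

```python
# Drought years from Table 11 (period 1948-2013).
DROUGHT_YEARS = {1951, 1965, 1966, 1968, 1972, 1974, 1979,
                 1982, 1986, 1987, 2002, 2004, 2009}


def support_confidence(cluster_years, event_years):
    """Return (support, confidence) of a climatic event for one cluster.

    support    = x_ce / N     (event years in cluster / years in cluster)
    confidence = x_ce / T_ce  (event years in cluster / all event years)
    """
    cluster_years = set(cluster_years)
    event_years = set(event_years)
    x_ce = len(cluster_years & event_years)
    return x_ce / len(cluster_years), x_ce / len(event_years)


# Hypothetical cluster of ten years, not taken from the paper.
hypothetical_cluster = {1951, 1965, 1972, 1982, 1987,
                        1990, 1995, 2001, 2002, 2009}
support, confidence = support_confidence(hypothetical_cluster, DROUGHT_YEARS)
```

In this made-up case, seven of the ten cluster years are drought years, so the cluster would pass the PredSet1 thresholds of Table 12 (support 0.37, confidence 0.30) and be labelled with the drought event.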

Advances in Meteorology 11

14

12

10

8

6

4

2

060

5040

3020

100 0

2040

6080

100

ConfidenceSupport

Year

-cou

nt

50400

302022

10 2040

60

ConfidenSupp

(a)

14

12

10

8

6

4

2

060

5040

3020

100 0

2040

6080

100

ConfidenceSupport

Year

-cou

nt

5040

3020

10 2040

60

C nfidencSupport

(b)

Figure 9 Histogram of the confidence and support measures as bins of year-count before (a) and after (b) thresholding for PredSet1

Table 13 Identified physical climatic events being associated with clusters obtained by fuzzy clustering

Predictor Cluster1 Cluster2 Cluster3PredSet1 Drought El-Nino La-Nina La-NinaPredSet2 Flood La-Nina Drought Drought El-Nino La-NinaPredSet3 El-Nino positive IOD Drought Drought El-NinoPredSet4 La-Nina Flood La-Nina DroughtPredSet5 mdash Drought El-Nino Flood

We relate a cluster to a physical climatic event describedin Table 11 if both support and confidence measures attainthe corresponding thresholds The thresholds are chosen in away that 50 of years of study are under consideration A lowthreshold compromises the importance of a climatic eventbeing related to a particular cluster on the other hand if evenless number of years are taken then threshold values shouldbe high which in turn will leave out most of the clustersTherefore as an optimal between the extremes 50 of yearsare considered Figure 9 shows histograms with confidenceand support as bins of year-count for cases before andafter threshold process respectively for predictors PredSet1(Table 2) The threshold values obtained for predictor setsare presented in Table 12 For each predictor set we associatethe clusters with physical climatic events if they satisfy bothsupport and confidence thresholds The climatic events cor-responding to cluster are shown in Table 13 Results establishcoexistence of events of La-Nina and flood It also puts lighton high probability of occurrence of El-Nino drought andpositive IOD events simultaneously

7 Conclusion

Monsoon is an important phenomenon for economic devel-opment of agricultural-land like India Large variability ofmonsoon over years makes prediction of rainfall a challeng-ing task The paper attempts to address this problem by clus-tering the years into similar groups and finally multimodel

ensemble forecast is provided for Indian summer monsoonrainfall

Different climatic parameters with best correlated monthvalue are identified and five different predictor sets are builtfor prediction of Indian monsoon Four different modelsnamely MR MLP RNN and GRNN are designed foreach cluster exclusively The final forecast is provided byweighted ensemble of forecasts by each clusterrsquos model whereweight is considered as fuzzy membership of belonging-ness in each cluster Multilayer perceptron ensemble modelprovides mean absolute error of 40 for prediction ofannual rainfall which is appreciable for forecasting complexmonsoon process Proposed fuzzy clustering-based ensembleapproach surpasses the conventional approach Performanceof proposed clustering-based ensemble models is superior toexisting IMDrsquosmodels [4 5]The error statistics also ascertainthe superiority of multilayer perceptron model over otherthree proposed models Lastly in meteorological context theclusters are linked with global climatic events

In the future large number of climatic parametersinfluencing Indian monsoon can be explored and differentpredictor set can be used for different clusters of years toprovide even better forecasting accuracy

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

12 Advances in Meteorology

Acknowledgment

This work is supported by RBU project through RESPONDprogram of ISRO through KCSTC IIT Kharagpur

References

[1] H F Blanford ldquoOn the connexion of the Himalaya snowfallwith dry winds and seasons of drought in Indiardquo Proceedingsof the Royal Society of London vol 37 no 232ndash234 pp 3ndash221884

[2] G T Walker ldquoCorrelation in seasonal variations of weathermdashIV a further study of world weatherrdquo Memoirs of the IndiaMeteorological Department vol 24 pp 275ndash332 1924

[3] V Thapliyal and S M Kulshrestha ldquoRecent models for longrange forecasting of South-West monsoon rainfall in IndiardquoMausam vol 43 no 3 pp 239ndash248 1992

[4] V Gowariker V Thapliyal S M Kulshrestha G S MandalN Sen Roy and D R Sikka ldquoA power regression model forlong range forecast of southwest monsoon rainfall over IndiardquoMausam vol 42 no 2 pp 125ndash130 1991

[5] M Rajeevan D S Pai S K Dikshit and R R Kelkar ldquoIMDrsquosnew operational models for long-range forecast of southwestmonsoon rainfall over India and their verification for 2003rdquoCurrent Science vol 86 no 3 pp 422ndash431 2004

[6] M Rajeevan D S Pai R A Kumar and B Lal ldquoNew statisticalmodels for long-range forecasting of southwest monsoon rain-fall over Indiardquo Climate Dynamics vol 28 no 7-8 pp 813ndash8282007

[7] J Schewe and A Levermann ldquoA statistically predictive modelfor future monsoon failure in Indiardquo Environmental ResearchLetters vol 7 no 4 Article ID 044023 2012

[8] Q Wu Y Yan and D Chen ldquoA linear markov model for eastasian monsoon seasonal forecastrdquo Journal of Climate vol 26no 14 pp 5183ndash5195 2013

[9] K Fan Y Liu and H Chen ldquoImproving the prediction ofthe east asian summer monsoon new approachesrdquo Weather ampForecasting vol 27 no 4 pp 1017ndash1030 2012

[10] F Mekanik M A Imteaz S Gato-Trinidad and A ElmahdildquoMultiple regression and artificial neural network for long-termrainfall forecasting using large scale climate modesrdquo Journal ofHydrology vol 503 pp 11ndash21 2013

[11] A K Sahai M K Soman and V Satyan ldquoAll India summermonsoon rainfall prediction using an artificial neural networkrdquoClimate Dynamics vol 16 no 4 pp 291ndash302 2000

[12] W-C Hong ldquoRainfall forecasting by technological machinelearning modelsrdquo Applied Mathematics and Computation vol200 no 1 pp 41ndash57 2008

[13] S Chattopadhyay and G Chattopadhyay ldquoComparative studyamong different neural net learning algorithms applied torainfall time seriesrdquo Meteorological Applications vol 15 no 2pp 273ndash280 2008

[14] N Acharya S C Kar M A Kulkarni U C Mohanty andL N Sahoo ldquoMulti-model ensemble schemes for predictingnortheast monsoon rainfall over peninsular Indiardquo Journal ofEarth System Science vol 120 no 5 pp 795ndash805 2011

[15] V R Durai and R Bhardwaj ldquoImproving precipitation forecastsskill over India using a multi-model ensemble techniquerdquoGeofizika vol 30 no 2 pp 119ndash141 2013

[16] B Parthasarathy A A Munot and D R Kothawale ldquoMonthlyand seasonal rainfall series for All-India homogeneous regions

and meteorological subdivisions 1871ndash1994rdquo Tech Rep RR-065 Indian Institute of Tropical Meteorology 1995

[17] G P Compo J S Whitaker P D Sardeshmukh et al ldquoThetwentieth century reanalysis projectrdquo Quarterly Journal of theRoyal Meteorological Society vol 137 no 654 pp 1ndash28 2011

[18] E Kalnay M Kanamitsu R Kistler et al ldquoThe NCEPNCAR40-year reanalysis projectrdquo Bulletin of the AmericanMeteorolog-ical Society vol 77 no 3 pp 437ndash471 1996

[19] E M Rasmusson and T H Carpenter ldquoVariations in tropicalsea surface temperature and surface wind fields associated withthe Southern OscillationEl Ninordquo Monthly Weather Reviewvol 110 no 5 pp 354ndash384 1982


Advances in Meteorology 7

Table 5: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MR cluster models and ensemble model for test period 2001–2013. Reports minimum error of 6.2%.

| Predictor set | Training years | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | 20 | 9.4 | 9.3 | 10.9 | 8.6 |
| PredSet2 | 20 | 11.0 | 7.5 | 9.4 | 8.3 |
| PredSet3 | 15 | 10.9 | 6.5 | 9.2 | 6.7 |
| PredSet4 | 15 | 10.4 | 10.1 | 6.8 | 6.2 |
| PredSet5 | 15 | 7.6 | 8.5 | 8.4 | 7.9 |

Table 6: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual MLP cluster models and ensemble model for test period 2001–2013. Reports minimum error of 4.0%.

| Predictor set | Parameter set | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | ParSet4 | 13.8 | 18.1 | 16.9 | 8.2 |
| PredSet2 | ParSet3 | 16.0 | 7.9 | 11.0 | 5.2 |
| PredSet3 | ParSet1 | 8.0 | 7.8 | 6.5 | 6.5 |
| PredSet4 | ParSet1 | 9.3 | 10.7 | 4.5 | 4.0 |
| PredSet5 | ParSet1 | 8.5 | 15.3 | 13.7 | 11.0 |

Figure 3: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three respective cluster models by MR for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. (Axes: test years versus rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.)

presented in Table 7. PredSet3 (Table 2) with RNN parameters ParSet1 (Table 3) gives an error of 5.1%. RNN assigns weights to the training years that decrease with their distance from the test year. The pattern of actual and ensemble-predicted rainfall in terms of percentage of LPA is shown in Figure 5.

5.2.4. Generalized Regression Neural Network Model (GRNN). The mean absolute errors of the generalized regression neural network ensemble and of the individual cluster models are presented in Table 8. The model reports an error of 6.1% for PredSet3 (Table 2). Figure 6 shows the interannual variations of the ensemble rainfall forecast by the GRNN ensemble model along with the actual rainfall pattern in terms of percentage of LPA for the period 2001–2013. It is observed that the predicted values are close to the actual rainfall patterns. Predictions by the models designed for the individual clusters are shown by different symbols.

Figure 4: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three respective cluster models by MLP for PredSet4. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. (Axes: test years versus rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.)


Table 7: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual RNN cluster models and ensemble model for test period 2001–2013. Reports minimum error of 5.1%.

| Predictor set | Parameter set | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | ParSet1 | 11.3 | 7.1 | 16.8 | 7.0 |
| PredSet2 | ParSet1 | 13.2 | 13.5 | 12.6 | 8.5 |
| PredSet3 | ParSet1 | 12.9 | 5.4 | 6.0 | 5.1 |
| PredSet4 | ParSet1 | 12.3 | 6.4 | 4.7 | 5.9 |
| PredSet5 | ParSet2 | 15.1 | 16.1 | 13.4 | 8.8 |

Table 8: Mean absolute errors (%) for annual Indian summer monsoon rainfall prediction by individual GRNN cluster models and ensemble model for test period 2001–2013. Reports minimum error of 6.1%.

| Predictor set | Training years | Cluster1 error (%) | Cluster2 error (%) | Cluster3 error (%) | Ensemble error (%) |
|---|---|---|---|---|---|
| PredSet1 | 20 | 10.0 | 7.6 | 7.6 | 6.4 |
| PredSet2 | 30 | 7.1 | 8.9 | 7.6 | 6.4 |
| PredSet3 | 20 | 5.8 | 9.2 | 6.0 | 6.1 |
| PredSet4 | 20 | 6.3 | 6.6 | 7.2 | 6.3 |
| PredSet5 | 25 | 7.1 | 9.4 | 11.9 | 6.6 |

Figure 5: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three respective cluster models by RNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. (Axes: test years versus rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.)

5.3. Statistical Measures for Validation of Proposed Approach. Next, we validate the models in terms of accuracy measures other than mean absolute error. Table 9 shows different forecast verification statistics for the ensemble models during the test period 2001–2013. We summarize the observations below.

(i) Root Mean Square Error (RMSE). The MLP ensemble model gives an RMSE of 5.3%, followed by the RNN ensemble model with 6.4%. The GRNN and MR models give RMSEs of 7.4% and 8.4%, respectively.

Figure 6: Performance of forecasts by the proposed fuzzy clustering-based ensemble model and its three respective cluster models by GRNN for PredSet3. The deep and light purple bars represent the actual and predicted ISMR in terms of percent of LPA. The symbols represent forecasts given by individual cluster models. The results are shown for test period 2001–2013. (Axes: test years versus rainfall as % of LPA; legend: Actual, Ensemble, Clus1 model, Clus2 model, Clus3 model.)

(ii) Prediction Yield (PY). PY for the 5% error category of the MR, MLP, RNN, and GRNN ensemble models is 46%, 69%, 53%, and 46%, respectively. They give prediction yields of 76%, 92%, 92%, and 84% for an allowed error of 10%. Finally, at the 15% error category, the MR, MLP, RNN, and GRNN ensemble models give yields of 92%, 100%, 92%, and 100%, respectively. Thus, no model misses the actual rainfall by more than 15% in more than one of the 13 test years, and the forecasts show no abrupt deviation from the corresponding actual rainfall pattern.


Table 9: Prediction evaluation statistics for ensemble models during test period 2001–2013 (Section 4.5).

| Verification measure | MR | MLP | RNN | GRNN |
|---|---|---|---|---|
| RMSE for forecast (%) | 8.4 | 5.3 | 6.4 | 7.4 |
| PY (%) at allowed error 5% | 46 | 69 | 53 | 46 |
| PY (%) at allowed error 10% | 76 | 92 | 92 | 84 |
| PY (%) at allowed error 15% | 92 | 100 | 92 | 100 |
| PC between actual and predicted rainfall | 0.61 | 0.81 | 0.71 | 0.49 |
| WI between actual and predicted rainfall | 0.71 | 0.89 | 0.81 | 0.62 |

Table 10: Comparison of absolute errors for rainfall prediction by the proposed ensemble models (Ensml) with clustering (WC) against the standard method using the same models without clustering (NC).

| Predictor set | MR: Tot error (NC) (%) | MR: Ensml error (WC) (%) | MLP: Tot error (NC) (%) | MLP: Ensml error (WC) (%) | RNN: Tot error (NC) (%) | RNN: Ensml error (WC) (%) | GRNN: Tot error (NC) (%) | GRNN: Ensml error (WC) (%) |
|---|---|---|---|---|---|---|---|---|
| PredSet1 | 8.9 | 8.6 | 10.0 | 8.2 | 11.7 | 7.0 | 6.9 | 6.4 |
| PredSet2 | 9.2 | 8.2 | 12.8 | 5.2 | 10.7 | 8.5 | 7.2 | 6.4 |
| PredSet3 | 7.4 | 6.7 | 6.7 | 6.5 | 6.2 | 5.1 | 6.1 | 6.1 |
| PredSet4 | 6.7 | 6.2 | 5.8 | 4.0 | 6.0 | 5.5 | 6.3 | 6.3 |
| PredSet5 | 8.2 | 7.9 | 9.7 | 11.0 | 8.9 | 8.8 | 9.0 | 6.7 |

(iii) Pearson Correlation (PC). PCs of 0.61, 0.81, 0.71, and 0.49 are observed for prediction by the MR, MLP, RNN, and GRNN ensemble models, respectively. The rainfall predicted by the MLP ensemble model is highly correlated with the actual values, while the correlation for the GRNN forecast is the lowest.

(iv) Willmott Index of Agreement (WI). WI for the MR, MLP, RNN, and GRNN ensemble models is 0.71, 0.89, 0.81, and 0.62, respectively. The index shows that the agreement between actual and predicted rainfall is high for the MLP and RNN ensemble models.

All of the statistical measures in Table 9, as well as the mean absolute error (Table 6), identify the MLP model as the best of the four proposed models.
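The four verification measures summarized above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The definitions of Section 4.5 are not reproduced here, so the sketch assumes that prediction yield is the percentage of test years whose absolute error lies within the allowed error, and it uses the standard Willmott form of the index of agreement. The function names and the sample rainfall values (in % of LPA) are hypothetical.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)

def prediction_yield(actual, predicted, allowed_error):
    """Percentage of years whose absolute error is within allowed_error (assumed definition)."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(p - a) <= allowed_error)
    return 100.0 * hits / len(actual)

def pearson(actual, predicted):
    """Pearson correlation coefficient."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

def willmott_index(actual, predicted):
    """Willmott index of agreement: 1 - sum((P-O)^2) / sum((|P-Obar| + |O-Obar|)^2)."""
    mean_a = sum(actual) / len(actual)
    num = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    den = sum((abs(p - mean_a) + abs(a - mean_a)) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - num / den

# Hypothetical rainfall values in % of LPA, for illustration only.
actual = [92.0, 81.0, 102.0, 86.0, 99.0]
predicted = [95.0, 88.0, 100.0, 90.0, 97.0]

print("RMSE:", round(rmse(actual, predicted), 2))
print("PY at 5%:", prediction_yield(actual, predicted, 5.0))
print("PC:", round(pearson(actual, predicted), 2))
print("WI:", round(willmott_index(actual, predicted), 2))
```

A higher PY, PC, and WI and a lower RMSE all indicate a better forecast, which is the sense in which the measures are compared across the four models.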

5.4. Comparison of Results

5.4.1. Comparison with State-of-the-Art Methods. The proposed fuzzy clustering-based ensemble prediction models are compared with the models used by the Indian Meteorological Department (IMD): the existing 16-parameter power regression model [4] and the 8- and 10-parameter models of Rajeevan et al. [5]. A test period of seven years, from 1996 to 2002, is considered. The IMD models give root mean square errors of 10.8%, 7.6%, and 6.4%, respectively. The MR, MLP, RNN, and GRNN ensemble models give root mean square errors of 6.0%, 3.4%, 4.4%, and 5.5%, respectively, outperforming all three IMD models. The results are shown as a bar graph in Figure 7.

5.4.2. Improvement of Cluster-Based Models over Conventional Models. The ensemble model error obtained by combining the outputs of all cluster models is compared with the error obtained by the same model (with the same parameters) trained on the whole dataset without clustering. The mean absolute errors for the various combinations of models and predictor sets are shown in Table 10. The results show that clustering followed by ensembling improves on the nonclustered conventional method for most model and predictor-set combinations.

Figure 7: Comparison of MR (grey), MLP (purple), RNN (light purple), and GRNN (deep purple) models with IMD's existing 16-param (deep blue), 10-param (blue), and 8-param (light blue) models for time period of 1996–2002 [4, 5]. Striped bars represent errors by our proposed models. (Axes: different models versus root mean square error as % of LPA.)
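The comparison in Table 10 can be tallied with a short script. This is only an illustrative sketch: the values are the Table 10 entries read with one decimal place, and the function name `tally` is hypothetical.

```python
# Mean absolute errors (%) from Table 10 as (no clustering, with clustering)
# pairs, one pair per predictor set PredSet1..PredSet5.
TABLE_10 = {
    "MR":   [(8.9, 8.6), (9.2, 8.2), (7.4, 6.7), (6.7, 6.2), (8.2, 7.9)],
    "MLP":  [(10.0, 8.2), (12.8, 5.2), (6.7, 6.5), (5.8, 4.0), (9.7, 11.0)],
    "RNN":  [(11.7, 7.0), (10.7, 8.5), (6.2, 5.1), (6.0, 5.5), (8.9, 8.8)],
    "GRNN": [(6.9, 6.4), (7.2, 6.4), (6.1, 6.1), (6.3, 6.3), (9.0, 6.7)],
}

def tally(table):
    """Count the combinations where clustering lowers, keeps, or raises the error."""
    counts = {"improved": 0, "equal": 0, "worse": 0}
    for pairs in table.values():
        for no_clustering, with_clustering in pairs:
            if with_clustering < no_clustering:
                counts["improved"] += 1
            elif with_clustering == no_clustering:
                counts["equal"] += 1
            else:
                counts["worse"] += 1
    return counts

counts = tally(TABLE_10)
print(counts)
```

Under this reading of Table 10, clustering lowers the error in 17 of the 20 combinations, leaves it unchanged for GRNN with PredSet3 and PredSet4, and raises it for MLP with PredSet5.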


Table 11: Physical climatic events under study.

| Climatic event | Number of years | Years associated with the event |
|---|---|---|
| Drought | 13 | 1951, 1965, 1966, 1968, 1972, 1974, 1979, 1982, 1986, 1987, 2002, 2004, 2009 |
| Flood | 11 | 1953, 1956, 1958, 1959, 1961, 1964, 1970, 1975, 1983, 1988, 1994 |
| El-Niño | 23 | 1951, 1953, 1957, 1958, 1963, 1965, 1966, 1968, 1969, 1972, 1977, 1982, 1983, 1986, 1987, 1991, 1992, 1994, 1997, 2002, 2004, 2006, 2009 |
| La-Niña | 22 | 1950, 1954, 1955, 1956, 1964, 1970, 1971, 1973, 1974, 1975, 1984, 1985, 1988, 1989, 1995, 1998, 1999, 2000, 2007, 2008, 2010, 2011 |
| Positive IOD | 12 | 1957, 1961, 1963, 1967, 1972, 1977, 1982, 1983, 1994, 1997, 2006, 2007 |
| Negative IOD | 10 | 1958, 1960, 1964, 1971, 1974, 1975, 1989, 1992, 1993, 1996 |

Table 12: Thresholds of support and confidence measures for associating the obtained clusters with physical climatic events.

| Predictor set | Support threshold | Confidence threshold |
|---|---|---|
| PredSet1 | 0.37 | 0.30 |
| PredSet2 | 0.25 | 0.46 |
| PredSet3 | 0.21 | 0.43 |
| PredSet4 | 0.29 | 0.61 |
| PredSet5 | 0.21 | 0.54 |

5.5. Prediction of the Year 2014. Annual Indian summer monsoon rainfall for the year 2014 is 781.7 mm, which is 87.8% of the LPA value. The proposed clustering-based ensemble MR, MLP, RNN, and GRNN models predict the rainfall of 2014 as 96.1%, 80.3%, 80.0%, and 95.3% of LPA, respectively. Thus, the proposed models show an absolute error of 7.0% for forecasting the rainfall of 2014.

6. Meteorological Analysis

Next, we try to visualize each cluster in terms of physical climatic events. The clusters obtained by fuzzy clustering are physically interpreted as being characterized by some global climatic events. The climatic events studied over the period 1948 to 2013 (the period considered for clustering in our work) are El-Niño and La-Niña (http://ggweather.com/enso/oni.htm), positive and negative Indian Ocean dipole (http://bom.gov.au/climate/IOD), drought, and flood, as shown in Table 11.

Figure 8 shows the El-Niño and La-Niña years associated with drought, normal, and excess rainfall years during 1948–2013. Years with rainfall more than 10% above LPA are excess rainfall years, and years with rainfall more than 10% below LPA are drought years. The El-Niño and La-Niña years are shown by color codes (light green and green) in the figure. The chart helps to visualize the co-occurrence of El-Niño and La-Niña events with extremes of ISMR.

6.1. Measuring Association between Climatic Events and ISMR. Support and confidence measures are used to relate physical climatic events to the clusters generated by fuzzy clustering. They are defined below.

Figure 8: El-Niño (light green) and La-Niña (green) years' association with drought (years more than 10% below LPA rainfall), normal (years between +10% and −10% of LPA rainfall), and excess (years more than 10% above LPA rainfall) years during period 1948–2013. (Axes: years versus deviation of rainfall from LPA in %; legend: normal years, El-Niño years, La-Niña years.)

(i) Support. Support is defined as the percentage of the total number of years in the cluster that correspond to the climatic event:

$$\text{Support} = \frac{x_{ce}}{N}, \tag{12}$$

where $x_{ce}$ denotes the number of years associated with a specific climatic event in the cluster and $N$ is the total count of years in the cluster.

(ii) Confidence. Confidence is defined as the percentage of years associated with the climatic event in the cluster relative to the total number of such event years:

$$\text{Confidence} = \frac{x_{ce}}{T_{ce}}, \tag{13}$$

where $T_{ce}$ is the number of years associated with the climatic event during the period 1948–2013.
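As a worked illustration of equations (12) and (13), the following Python sketch computes both measures for one cluster and one event. The drought years are taken from Table 11; the cluster membership and the function names are hypothetical and chosen only to make the arithmetic visible. The measures are returned as fractions rather than percentages.

```python
# Drought years from Table 11.
DROUGHT_YEARS = {1951, 1965, 1966, 1968, 1972, 1974, 1979,
                 1982, 1986, 1987, 2002, 2004, 2009}

def support(cluster_years, event_years):
    """Equation (12): event years in the cluster divided by the cluster size N."""
    cluster = set(cluster_years)
    x_ce = len(cluster & set(event_years))
    return x_ce / len(cluster)

def confidence(cluster_years, event_years):
    """Equation (13): event years in the cluster divided by all event years T_ce."""
    events = set(event_years)
    x_ce = len(set(cluster_years) & events)
    return x_ce / len(events)

# Hypothetical cluster of ten years; eight of them are drought years.
cluster = {1951, 1965, 1972, 1979, 1982, 1987, 1990, 1995, 2002, 2009}

print("support:", support(cluster, DROUGHT_YEARS))
print("confidence:", round(confidence(cluster, DROUGHT_YEARS), 3))
```

Support measures how strongly the event dominates the cluster, whereas confidence measures how much of the event the cluster captures, so a small cluster can have high support and low confidence at the same time.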

Figure 9: Histogram of the confidence and support measures as bins of year-count before (a) and after (b) thresholding for PredSet1. (Axes: confidence, support, and year-count.)

Table 13: Identified physical climatic events associated with clusters obtained by fuzzy clustering.

Predictor, Cluster1, Cluster2, Cluster3
PredSet1: Drought, El-Niño, La-Niña, La-Niña
PredSet2: Flood, La-Niña, Drought, Drought, El-Niño, La-Niña
PredSet3: El-Niño, positive IOD, Drought, Drought, El-Niño
PredSet4: La-Niña, Flood, La-Niña, Drought
PredSet5: -, Drought, El-Niño, Flood

We relate a cluster to a physical climatic event described in Table 11 if both the support and the confidence measures attain the corresponding thresholds. The thresholds are chosen so that 50% of the years of study remain under consideration. A low threshold weakens the significance of relating a climatic event to a particular cluster. On the other hand, retaining fewer years requires high threshold values, which in turn leave out most of the clusters. Therefore, as a compromise between the two extremes, 50% of the years are considered. Figure 9 shows histograms with confidence and support as bins of year-count before and after the thresholding process, respectively, for predictor set PredSet1 (Table 2). The threshold values obtained for the predictor sets are presented in Table 12. For each predictor set, we associate the clusters with physical climatic events if they satisfy both the support and the confidence thresholds. The climatic events corresponding to each cluster are shown in Table 13. The results establish the coexistence of La-Niña and flood events. They also point to a high probability of El-Niño, drought, and positive IOD events occurring simultaneously.
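The association rule described above can be sketched as a simple filter. This is a hedged illustration rather than the authors' procedure: it assumes that "attain" means greater than or equal to the threshold, it uses the PredSet1 thresholds of Table 12 (0.37 for support and 0.30 for confidence), and the per-event support and confidence values and the function name are hypothetical.

```python
def associated_events(measures, support_threshold, confidence_threshold):
    """Return the events whose support and confidence both attain the thresholds."""
    return [
        event
        for event, (sup, conf) in measures.items()
        if sup >= support_threshold and conf >= confidence_threshold
    ]

# Hypothetical (support, confidence) pairs for one cluster.
measures = {
    "Drought": (0.45, 0.62),   # passes both thresholds
    "El-Nino": (0.50, 0.26),   # fails the confidence threshold
    "La-Nina": (0.10, 0.09),   # fails both thresholds
    "Flood":   (0.37, 0.30),   # exactly attains both thresholds
}

# PredSet1 thresholds from Table 12.
events = associated_events(measures, support_threshold=0.37, confidence_threshold=0.30)
print(events)
```

Requiring both conditions keeps an event from being attached to a cluster merely because the cluster is small (high support only) or merely because the event is rare (high confidence only).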

7. Conclusion

Monsoon is an important phenomenon for the economic development of an agriculture-based country such as India. The large variability of monsoon over the years makes prediction of rainfall a challenging task. The paper attempts to address this problem by clustering the years into similar groups and then providing a multimodel ensemble forecast for Indian summer monsoon rainfall.

Different climatic parameters with their best correlated month values are identified, and five different predictor sets are built for prediction of Indian monsoon. Four different models, namely, MR, MLP, RNN, and GRNN, are designed for each cluster exclusively. The final forecast is provided by a weighted ensemble of the forecasts of each cluster's model, where the weight is the fuzzy membership of the test year in each cluster. The multilayer perceptron ensemble model provides a mean absolute error of 4.0% for prediction of annual rainfall, which is appreciable for forecasting the complex monsoon process. The proposed fuzzy clustering-based ensemble approach surpasses the conventional approach. The performance of the proposed clustering-based ensemble models is superior to that of the existing IMD models [4, 5]. The error statistics also confirm the superiority of the multilayer perceptron model over the other three proposed models. Lastly, in the meteorological context, the clusters are linked with global climatic events.
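The membership-weighted ensemble summarized above can be written as a short sketch. It is an assumption-laden illustration rather than the authors' code: the forecasts and fuzzy memberships are hypothetical, the function name is invented, and the weights are normalized by their sum in case the memberships do not add up to exactly one.

```python
def ensemble_forecast(cluster_forecasts, memberships):
    """Weighted average of per-cluster forecasts, weighted by fuzzy membership."""
    if len(cluster_forecasts) != len(memberships):
        raise ValueError("one membership value is needed per cluster forecast")
    total = sum(memberships)
    weighted = sum(u * f for u, f in zip(memberships, cluster_forecasts))
    return weighted / total

# Hypothetical forecasts (% of LPA) from three cluster models for one test year,
# and hypothetical fuzzy memberships of that year in the three clusters.
forecasts = [96.0, 88.0, 102.0]
memberships = [0.2, 0.5, 0.3]

forecast = ensemble_forecast(forecasts, memberships)
print(round(forecast, 1))
```

With these numbers the ensemble value lies closest to the forecast of the second cluster model, because the test year has its largest membership in that cluster.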

In the future, a larger number of climatic parameters influencing Indian monsoon can be explored, and different predictor sets can be used for different clusters of years to provide even better forecasting accuracy.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work is supported by RBU project through RESPOND program of ISRO through KCSTC, IIT Kharagpur.

References

[1] H. F. Blanford, “On the connexion of the Himalaya snowfall with dry winds and seasons of drought in India,” Proceedings of the Royal Society of London, vol. 37, no. 232–234, pp. 3–22, 1884.

[2] G. T. Walker, “Correlation in seasonal variations of weather, IV: a further study of world weather,” Memoirs of the India Meteorological Department, vol. 24, pp. 275–332, 1924.

[3] V. Thapliyal and S. M. Kulshrestha, “Recent models for long range forecasting of South-West monsoon rainfall in India,” Mausam, vol. 43, no. 3, pp. 239–248, 1992.

[4] V. Gowariker, V. Thapliyal, S. M. Kulshrestha, G. S. Mandal, N. Sen Roy, and D. R. Sikka, “A power regression model for long range forecast of southwest monsoon rainfall over India,” Mausam, vol. 42, no. 2, pp. 125–130, 1991.

[5] M. Rajeevan, D. S. Pai, S. K. Dikshit, and R. R. Kelkar, “IMD's new operational models for long-range forecast of southwest monsoon rainfall over India and their verification for 2003,” Current Science, vol. 86, no. 3, pp. 422–431, 2004.

[6] M. Rajeevan, D. S. Pai, R. A. Kumar, and B. Lal, “New statistical models for long-range forecasting of southwest monsoon rainfall over India,” Climate Dynamics, vol. 28, no. 7-8, pp. 813–828, 2007.

[7] J. Schewe and A. Levermann, “A statistically predictive model for future monsoon failure in India,” Environmental Research Letters, vol. 7, no. 4, Article ID 044023, 2012.

[8] Q. Wu, Y. Yan, and D. Chen, “A linear Markov model for East Asian monsoon seasonal forecast,” Journal of Climate, vol. 26, no. 14, pp. 5183–5195, 2013.

[9] K. Fan, Y. Liu, and H. Chen, “Improving the prediction of the East Asian summer monsoon: new approaches,” Weather & Forecasting, vol. 27, no. 4, pp. 1017–1030, 2012.

[10] F. Mekanik, M. A. Imteaz, S. Gato-Trinidad, and A. Elmahdi, “Multiple regression and artificial neural network for long-term rainfall forecasting using large scale climate modes,” Journal of Hydrology, vol. 503, pp. 11–21, 2013.

[11] A. K. Sahai, M. K. Soman, and V. Satyan, “All India summer monsoon rainfall prediction using an artificial neural network,” Climate Dynamics, vol. 16, no. 4, pp. 291–302, 2000.

[12] W.-C. Hong, “Rainfall forecasting by technological machine learning models,” Applied Mathematics and Computation, vol. 200, no. 1, pp. 41–57, 2008.

[13] S. Chattopadhyay and G. Chattopadhyay, “Comparative study among different neural net learning algorithms applied to rainfall time series,” Meteorological Applications, vol. 15, no. 2, pp. 273–280, 2008.

[14] N. Acharya, S. C. Kar, M. A. Kulkarni, U. C. Mohanty, and L. N. Sahoo, “Multi-model ensemble schemes for predicting northeast monsoon rainfall over peninsular India,” Journal of Earth System Science, vol. 120, no. 5, pp. 795–805, 2011.

[15] V. R. Durai and R. Bhardwaj, “Improving precipitation forecasts skill over India using a multi-model ensemble technique,” Geofizika, vol. 30, no. 2, pp. 119–141, 2013.

[16] B. Parthasarathy, A. A. Munot, and D. R. Kothawale, “Monthly and seasonal rainfall series for All-India homogeneous regions and meteorological subdivisions 1871–1994,” Tech. Rep. RR-065, Indian Institute of Tropical Meteorology, 1995.

[17] G. P. Compo, J. S. Whitaker, P. D. Sardeshmukh et al., “The twentieth century reanalysis project,” Quarterly Journal of the Royal Meteorological Society, vol. 137, no. 654, pp. 1–28, 2011.

[18] E. Kalnay, M. Kanamitsu, R. Kistler et al., “The NCEP/NCAR 40-year reanalysis project,” Bulletin of the American Meteorological Society, vol. 77, no. 3, pp. 437–471, 1996.

[19] E. M. Rasmusson and T. H. Carpenter, “Variations in tropical sea surface temperature and surface wind fields associated with the Southern Oscillation/El Niño,” Monthly Weather Review, vol. 110, no. 5, pp. 354–384, 1982.


Page 8: Research Article Fuzzy Clustering-Based Ensemble Approach ...

8 Advances in Meteorology

Table 7 Mean absolute errors () for annual Indian summer monsoon rainfall prediction by individual RNN cluster models and ensemblemodel for test period 2001ndash2013 Reports minimum error of 51

Predictor set Parameter set Cluster1 error () Cluster2 error () Cluster3 error () Ensemble error ()PredSet1 ParSet1 113 71 168 70PredSet2 ParSet1 132 135 126 85PredSet3 ParSet1 129 54 60 51PredSet4 ParSet1 123 64 47 59PredSet5 ParSet2 151 161 134 88

Table 8 Mean absolute errors () for annual Indian summermonsoon rainfall prediction by individual GRNN cluster models and ensemblemodel for test period 2001ndash2013 Reports minimum error of 61

Predictor set Training years Cluster1 error () Cluster2 error () Cluster3 error () Ensemble error ()PredSet1 20 100 76 76 64PredSet2 30 71 89 76 64PredSet3 20 58 92 60 61PredSet4 20 63 66 72 63PredSet5 25 71 94 119 66

ActualEnsembleClus1 model

Clus2 modelClus3 model

120

100

80

60

40

20

0

Rain

fall

as

of L

PA

2001

2002

2003

2004

2005

2006

2007

2008

2009

2010

2011

2012

2013

Test years

Figure 5 Performance of forecasts by proposed fuzzy clustering-based ensemble model and its respective three clusters models byRNN for PredSet3 The deep and light purple bars represent theactual and predicted ISMR in terms of percent of LPA The symbolsrepresent forecasts given by individual cluster models The resultsare shown for test period 2001ndash2013

53 Statistical Measures for Validation of Proposed ApproachNext we validate the models in terms of other accuracymeasures besidesmean absolute error Table 9 shows differentforecast verification statistics for ensemblemodels during testperiod 2001ndash2013 We summarize the observations below

(i) Root Mean Square Error (RMSE) MLP ensemblemodel gives RMSE of 53 followed by RNN ensem-ble model with 64 GRNN and MR models giveRMSE of 74 and 84 respectively

ActualEnsembleClus1 model

Clus2 modelClus3 model

120

100

80

60

40

20

0

Rain

fall

as

of L

PA

2001

2002

2003

2004

2005

2006

2007

2008

2009

2010

2011

2012

2013

Test years

Figure 6 Performance of forecasts by proposed fuzzy clustering-based ensemble model and its respective three clusters models byGRNN for PredSet3 The deep and light purple bars represent theactual and predicted ISMR in terms of percent of LPA The symbolsrepresent forecasts given by individual cluster models The resultsare shown for test period 2001ndash2013

(ii) Prediction Yield (PY) PY for 5 error category ofMRMLP RNN andGRNN ensemblemodels is 4669 53 and 46 respectivelyThey give predictionyield of 76 92 92 and 84 for allowed errorof 10 category Finally at error category of 15MRMLP RNN andGRNN ensemblemodels give yield of92 100 92 and 100 respectively Thus noneof the predicted years show abrupt deviation fromcorresponding actual rainfall pattern

Advances in Meteorology 9

Table 9 Prediction evaluation statistics for ensemble models during test period 2001ndash2013 (Section 45)

Verification measures MR MLP RNN GRNNRMSE for forecast () 84 53 64 74PY () at allowed error 5 46 69 53 46PY () at allowed error 10 76 92 92 84PY () at allowed error 15 92 100 92 100PC between actual and predicted rainfall 061 081 071 049WI between actual and predicted rainfall 071 089 081 062

Table 10 Comparison of absolute errors for rainfall prediction by proposed ensemble models (Ensml) with clustering (WC) approach tostandard method with same models without clustering (NC) approach

Predictor setMR MLP RNN GRNN

Tot error(NC) ()

Ensml error(WC) ()

Tot error(NC) ()

Ensml error(WC) ()

Tot error(NC) ()

Ensml error(WC) ()

Tot error(NC) ()

Ensml error(WC) ()

PredSet1 89 86 100 82 117 70 69 64PredSet2 92 82 128 52 107 85 72 64PredSet3 74 67 67 65 62 51 61 61PredSet4 67 62 58 40 60 55 63 63PredSet5 82 79 97 110 89 88 90 67

(iii) Pearson Correlation (PC) PC of 061 081 071and 049 is observed for prediction by MR MLPRNN and GRNN ensemble models respectively Itis noticed that predicted rainfall by MLP ensemblemodel is highly correlated to actual values whilecorrelation for GRNN forecast is least

(iv) Willmott Index of Agreement (WI)WI forMRMLPRNN and GRNN ensemble models is 071 089081 and 062 respectively The index shows that theagreement between actual and predicted rainfall ishigh forMLP and RNN ensemble models

All of the mentioned statistical measures (Table 9) as wellas mean absolute error (Table 6) in prediction of monsoonascertainMLPmodel to be the best among all four proposedmodels

54 Comparison of Results

541 Comparison with State-of-the-Art Methods Proposedfuzzy clustering-based ensemble prediction models are com-paredwith themodels used by IndianMeteorologicalDepart-ment (IMD) It is comparedwith existing 16-parameter powerregression model [4] and Rajeevan et al [5] 8- and 10-parameter models Test period of seven years from 1996 to2002 is considered IMDmodels give rootmean square errorsof 108 76 and 64 respectively The MR MLP RNNandGRNN ensemblemodels give 60 34 44 and 55rootmean square errors respectively outperforming all threeIMDmodelsThe results are shown as a bar graph in Figure 7

542 Improvement of Cluster-BasedModels over ConventionalModels Ensemble model error obtained by combining allclustersrsquo model output is compared with error obtained by

IMD 16-paramIMD 8-paramIMD 10-paramMR

MLPRNNGRNN

16-par 8-par 10-par MR MLP RNN GRNNDifferent models

12

10

8

6

4

2

0

Root

mea

n sq

uare

erro

r as

of L

PA

Figure 7 Comparison of MR (grey) MLP (purple) RNN (lightpurple) and GRNN (deep purple) models with IMD existing 16-param (deep blue) 10-param (blue) and 8-param (light blue)models for time period of 1996ndash2002 [4 5] Striped bars representerrors by our proposed models

same model (parameter) trained on the whole dataset with-out clustering The mean absolute error for various modelsand predictor sets combinations are shown in Table 10 Theresult clearly depicts the improvement in prediction by clus-tering and ensemble method over nonclustered conventionalmethod

10 Advances in Meteorology

Table 11 Physical climatic events under study

Climatic event Numberof years Years associated with the event

Drought 13 1951 1965 1966 1968 1972 1974 1979 1982 1986 1987 2002 2004 2009Flood 11 1953 1956 1958 1959 1961 1964 1970 1975 1983 1988 1994

El-Nino 23 1951 1953 1957 1958 1963 1965 1966 1968 1969 1972 1977 1982 1983 1986 1987 1991 1992 1994 19972002 2004 2006 2009

La-Nina 22 1950 1954 1955 1956 1964 1970 1971 1973 1974 1975 1984 1985 1988 1989 1995 1998 1999 2000 20072008 2010 2011

Positive IOD 12 1957 1961 1963 1967 1972 1977 1982 1983 1994 1997 2006 2007Negative IOD 10 1958 1960 1964 1971 1974 1975 1989 1992 1993 1996

Table 12 Threshold of support and confidence measures forassociating obtained clusters with physical climatic events

Predictor set Support threshold Confidence thresholdPredSet1 037 030PredSet2 025 046PredSet3 021 043PredSet4 029 061PredSet5 021 054

55 Prediction of the Year 2014 Annual Indian summermonsoon rainfall for the year of 2014 is 7817mm which is878 of LPA value Proposed clustering-based ensembleMRMLP RNN and GRNN models predict rainfall of 2014 as961 803 800 and 953 of LPA respectively Thusproposed models show absolute error of 70 for forecastingrainfall of 2014

6 Meteorological Analysis

Next we try to visualize each cluster in terms of physicalclimatic events The clusters obtained by fuzzy clusteringare physically interpreted as being characterized by someglobal climatic events The climatic events considered andstudied during the time period 1948 to 2013 (period con-sidered for clustering in our work) are El-Nino La-Nina(httpggweathercomensoonihtm) positive and nega-tive Indian ocean dipole (httpbomgovauclimateIOD)drought and flood shown in Table 11

Figure 8 shows the El-Nino and La-Nina years associatedwith drought normal and excess rainfall years during 1948ndash2013 The years having rainfall 10 above LPA are excessrainfall years and years having rainfall 10 below LPA aredrought years The El-Nino and La-Nina years are shown bycolor codes (light green and green) in the figure The charthelps to visualize the cooccurrence of El-Nino and La-Ninaevents with extremities of ISMR

61Measuring Association betweenClimatic Events and ISMRSupport and confidence measures are considered to relatephysical climatic event to the clusters generated by fuzzyclustering They are defined below

[Figure 8 plot: x-axis: Years (1945 to 2015); y-axis: Deviation of rainfall from LPA in % (−25 to 25); legend: Normal years, El-Nino years, La-Nina years.]

Figure 8: Association of El-Nino (light green) and La-Nina (green) years with drought (rainfall more than 10% below LPA), normal (rainfall within +10% and −10% of LPA), and excess (rainfall more than 10% above LPA) years during the period 1948–2013.

(i) Support. Support is defined as the percentage of the total number of years in the cluster corresponding to the climatic event:

$$\text{Support} = \frac{x_{ce}}{N}, \tag{12}$$

where $x_{ce}$ denotes the number of years associated with a specific climatic event in the cluster and $N$ is the total count of years in the cluster.

(ii) Confidence. Confidence is defined as the percentage of years associated with the climatic event in the cluster relative to the total number of such event years:

$$\text{Confidence} = \frac{x_{ce}}{T_{ce}}, \tag{13}$$

where $T_{ce}$ is the number of years associated with the climatic event during the period 1948–2013.
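As an illustration of Equations (12) and (13), the following minimal sketch computes both measures for one cluster and one event. The drought years are those listed in Table 11; the cluster membership list `example_cluster` and the function name `support_confidence` are hypothetical and chosen only for the example. The sketch also assumes that each year has already been assigned to a single cluster (for instance, by its highest fuzzy membership), which is an assumption of this example.

```python
def support_confidence(cluster_years, event_years):
    """Return (support, confidence) as fractions, following Eqs. (12) and (13)."""
    cluster = set(cluster_years)
    events = set(event_years)
    x_ce = len(cluster & events)        # event years that fall inside the cluster
    support = x_ce / len(cluster)       # x_ce / N
    confidence = x_ce / len(events)     # x_ce / T_ce
    return support, confidence


# Drought years during 1948-2013, as listed in Table 11.
DROUGHT_YEARS = [1951, 1965, 1966, 1968, 1972, 1974, 1979,
                 1982, 1986, 1987, 2002, 2004, 2009]

# Hypothetical cluster membership, for illustration only.
example_cluster = [1951, 1953, 1965, 1966, 1970, 1972, 1979, 1982, 1990, 2002]

support, confidence = support_confidence(example_cluster, DROUGHT_YEARS)
print(f"support = {support:.2f}, confidence = {confidence:.2f}")
```

Here seven of the ten cluster years are drought years, so the support is 7/10 and the confidence is 7/13.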

Advances in Meteorology 11

[Figure 9 plot: two 3-D histograms, panels (a) and (b); axes: Confidence, Support, and Year-count.]

Figure 9: Histogram of the confidence and support measures as bins of year-count before (a) and after (b) thresholding for PredSet1.

Table 13: Identified physical climatic events associated with clusters obtained by fuzzy clustering.

Predictor Cluster1 Cluster2 Cluster3
PredSet1 Drought El-Nino La-Nina La-Nina
PredSet2 Flood La-Nina Drought Drought El-Nino La-Nina
PredSet3 El-Nino positive IOD Drought Drought El-Nino
PredSet4 La-Nina Flood La-Nina Drought
PredSet5 - Drought El-Nino Flood

We relate a cluster to a physical climatic event described in Table 11 if both support and confidence measures attain the corresponding thresholds. The thresholds are chosen such that 50% of the years of study are under consideration. A low threshold compromises the importance of a climatic event being related to a particular cluster; on the other hand, if even fewer years are taken, the threshold values must be high, which in turn leaves out most of the clusters. Therefore, as a compromise between the extremes, 50% of the years are considered. Figure 9 shows histograms with confidence and support as bins of year-count before and after the thresholding process, respectively, for predictor set PredSet1 (Table 2). The threshold values obtained for the predictor sets are presented in Table 12. For each predictor set, we associate the clusters with physical climatic events if they satisfy both support and confidence thresholds. The climatic events corresponding to each cluster are shown in Table 13. The results establish the coexistence of La-Nina and flood events. They also shed light on the high probability of El-Nino, drought, and positive IOD events occurring simultaneously.
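The association rule described above can be sketched as a simple filter: an event is linked to a cluster only when its support and confidence both reach the thresholds of the predictor set. The threshold pairs below are the values listed in Table 12; the per-event measures in `example_measures` and the function name `associated_events` are hypothetical, and reading "attain" as "greater than or equal to" is an assumption of this sketch.

```python
# (support threshold, confidence threshold) per predictor set, from Table 12.
THRESHOLDS = {
    "PredSet1": (0.37, 0.30),
    "PredSet2": (0.25, 0.46),
    "PredSet3": (0.21, 0.43),
    "PredSet4": (0.29, 0.61),
    "PredSet5": (0.21, 0.54),
}


def associated_events(measures, predictor_set):
    """Return the events whose support AND confidence both reach the thresholds.

    measures maps an event name to a (support, confidence) pair for one cluster.
    """
    s_min, c_min = THRESHOLDS[predictor_set]
    return [event for event, (s, c) in measures.items()
            if s >= s_min and c >= c_min]


# Hypothetical (support, confidence) values for one cluster, illustration only.
example_measures = {
    "Drought": (0.40, 0.46),
    "El-Nino": (0.45, 0.35),
    "La-Nina": (0.38, 0.27),   # support passes, confidence does not
    "Flood":   (0.10, 0.18),
}

linked = associated_events(example_measures, "PredSet1")
print(linked)
```

Requiring both conditions means an event that is frequent within a cluster but mostly occurs elsewhere (high support, low confidence) is not attributed to that cluster.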

7. Conclusion

Monsoon is an important phenomenon for the economic development of an agricultural land like India. The large variability of monsoon over the years makes prediction of rainfall a challenging task. This paper attempts to address the problem by clustering the years into similar groups; finally, a multimodel ensemble forecast is provided for Indian summer monsoon rainfall.

Different climatic parameters with their best correlated month values are identified, and five different predictor sets are built for prediction of Indian monsoon. Four different models, namely, MR, MLP, RNN, and GRNN, are designed for each cluster exclusively. The final forecast is provided by a weighted ensemble of the forecasts of each cluster's model, where the weight is the fuzzy membership of belongingness in each cluster. The multilayer perceptron ensemble model provides a mean absolute error of 4.0% for prediction of annual rainfall, which is appreciable for forecasting the complex monsoon process. The proposed fuzzy clustering-based ensemble approach surpasses the conventional approach. The performance of the proposed clustering-based ensemble models is superior to that of the existing IMD models [4, 5]. The error statistics also ascertain the superiority of the multilayer perceptron model over the other three proposed models. Lastly, in the meteorological context, the clusters are linked with global climatic events.
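The membership-weighted ensemble summarized above can be sketched as a weighted average of the per-cluster forecasts. This is only an illustrative form: the function name `ensemble_forecast`, the membership values, and the per-cluster forecasts are hypothetical, and dividing by the sum of the memberships is an assumption of the sketch (for fuzzy c-means memberships that already sum to 1 it changes nothing).

```python
def ensemble_forecast(memberships, cluster_forecasts):
    """Combine per-cluster forecasts, weighting each by the fuzzy membership
    of the target year in that cluster."""
    if len(memberships) != len(cluster_forecasts):
        raise ValueError("one forecast is required per cluster")
    total = sum(memberships)
    weighted = sum(m * f for m, f in zip(memberships, cluster_forecasts))
    return weighted / total


# Hypothetical values: membership of one year in three clusters and the
# rainfall forecast (as % of LPA) from each cluster's model.
memberships = [0.6, 0.3, 0.1]
cluster_forecasts = [95.0, 102.0, 88.0]

forecast = ensemble_forecast(memberships, cluster_forecasts)
print(f"ensemble forecast = {forecast:.1f}% of LPA")
```

With these values the model of the dominant cluster contributes most of the forecast, while the other clusters still pull the result toward their own predictions in proportion to their membership.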

In the future, a larger number of climatic parameters influencing Indian monsoon can be explored, and different predictor sets can be used for different clusters of years to provide even better forecasting accuracy.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.


Acknowledgment

This work is supported by the RBU project through the RESPOND program of ISRO through KCSTC, IIT Kharagpur.

References

[1] H. F. Blanford, “On the connexion of the Himalaya snowfall with dry winds and seasons of drought in India,” Proceedings of the Royal Society of London, vol. 37, no. 232–234, pp. 3–22, 1884.

[2] G. T. Walker, “Correlation in seasonal variations of weather, IV: a further study of world weather,” Memoirs of the India Meteorological Department, vol. 24, pp. 275–332, 1924.

[3] V. Thapliyal and S. M. Kulshrestha, “Recent models for long range forecasting of South-West monsoon rainfall in India,” Mausam, vol. 43, no. 3, pp. 239–248, 1992.

[4] V. Gowariker, V. Thapliyal, S. M. Kulshrestha, G. S. Mandal, N. Sen Roy, and D. R. Sikka, “A power regression model for long range forecast of southwest monsoon rainfall over India,” Mausam, vol. 42, no. 2, pp. 125–130, 1991.

[5] M. Rajeevan, D. S. Pai, S. K. Dikshit, and R. R. Kelkar, “IMD's new operational models for long-range forecast of southwest monsoon rainfall over India and their verification for 2003,” Current Science, vol. 86, no. 3, pp. 422–431, 2004.

[6] M. Rajeevan, D. S. Pai, R. A. Kumar, and B. Lal, “New statistical models for long-range forecasting of southwest monsoon rainfall over India,” Climate Dynamics, vol. 28, no. 7-8, pp. 813–828, 2007.

[7] J. Schewe and A. Levermann, “A statistically predictive model for future monsoon failure in India,” Environmental Research Letters, vol. 7, no. 4, Article ID 044023, 2012.

[8] Q. Wu, Y. Yan, and D. Chen, “A linear Markov model for East Asian monsoon seasonal forecast,” Journal of Climate, vol. 26, no. 14, pp. 5183–5195, 2013.

[9] K. Fan, Y. Liu, and H. Chen, “Improving the prediction of the East Asian summer monsoon: new approaches,” Weather & Forecasting, vol. 27, no. 4, pp. 1017–1030, 2012.

[10] F. Mekanik, M. A. Imteaz, S. Gato-Trinidad, and A. Elmahdi, “Multiple regression and artificial neural network for long-term rainfall forecasting using large scale climate modes,” Journal of Hydrology, vol. 503, pp. 11–21, 2013.

[11] A. K. Sahai, M. K. Soman, and V. Satyan, “All India summer monsoon rainfall prediction using an artificial neural network,” Climate Dynamics, vol. 16, no. 4, pp. 291–302, 2000.

[12] W.-C. Hong, “Rainfall forecasting by technological machine learning models,” Applied Mathematics and Computation, vol. 200, no. 1, pp. 41–57, 2008.

[13] S. Chattopadhyay and G. Chattopadhyay, “Comparative study among different neural net learning algorithms applied to rainfall time series,” Meteorological Applications, vol. 15, no. 2, pp. 273–280, 2008.

[14] N. Acharya, S. C. Kar, M. A. Kulkarni, U. C. Mohanty, and L. N. Sahoo, “Multi-model ensemble schemes for predicting northeast monsoon rainfall over peninsular India,” Journal of Earth System Science, vol. 120, no. 5, pp. 795–805, 2011.

[15] V. R. Durai and R. Bhardwaj, “Improving precipitation forecasts skill over India using a multi-model ensemble technique,” Geofizika, vol. 30, no. 2, pp. 119–141, 2013.

[16] B. Parthasarathy, A. A. Munot, and D. R. Kothawale, “Monthly and seasonal rainfall series for All-India homogeneous regions and meteorological subdivisions: 1871–1994,” Tech. Rep. RR-065, Indian Institute of Tropical Meteorology, 1995.

[17] G. P. Compo, J. S. Whitaker, P. D. Sardeshmukh et al., “The twentieth century reanalysis project,” Quarterly Journal of the Royal Meteorological Society, vol. 137, no. 654, pp. 1–28, 2011.

[18] E. Kalnay, M. Kanamitsu, R. Kistler et al., “The NCEP/NCAR 40-year reanalysis project,” Bulletin of the American Meteorological Society, vol. 77, no. 3, pp. 437–471, 1996.

[19] E. M. Rasmusson and T. H. Carpenter, “Variations in tropical sea surface temperature and surface wind fields associated with the Southern Oscillation/El Nino,” Monthly Weather Review, vol. 110, no. 5, pp. 354–384, 1982.


Table 9: Prediction evaluation statistics for ensemble models during test period 2001–2013 (Section 4.5).

Verification measures | MR | MLP | RNN | GRNN
RMSE for forecast (%) | 8.4 | 5.3 | 6.4 | 7.4
PY (%) at allowed error 5% | 46 | 69 | 53 | 46
PY (%) at allowed error 10% | 76 | 92 | 92 | 84
PY (%) at allowed error 15% | 92 | 100 | 92 | 100
PC between actual and predicted rainfall | 0.61 | 0.81 | 0.71 | 0.49
WI between actual and predicted rainfall | 0.71 | 0.89 | 0.81 | 0.62

Table 10: Comparison of absolute errors for rainfall prediction by the proposed ensemble models (Ensml) with clustering (WC) approach to the standard method with the same models without clustering (NC) approach.

Predictor set | MR: Tot error (NC) (%) | MR: Ensml error (WC) (%) | MLP: Tot error (NC) (%) | MLP: Ensml error (WC) (%) | RNN: Tot error (NC) (%) | RNN: Ensml error (WC) (%) | GRNN: Tot error (NC) (%) | GRNN: Ensml error (WC) (%)
PredSet1 | 8.9 | 8.6 | 10.0 | 8.2 | 11.7 | 7.0 | 6.9 | 6.4
PredSet2 | 9.2 | 8.2 | 12.8 | 5.2 | 10.7 | 8.5 | 7.2 | 6.4
PredSet3 | 7.4 | 6.7 | 6.7 | 6.5 | 6.2 | 5.1 | 6.1 | 6.1
PredSet4 | 6.7 | 6.2 | 5.8 | 4.0 | 6.0 | 5.5 | 6.3 | 6.3
PredSet5 | 8.2 | 7.9 | 9.7 | 11.0 | 8.9 | 8.8 | 9.0 | 6.7

(iii) Pearson Correlation (PC). PC of 0.61, 0.81, 0.71, and 0.49 is observed for prediction by the MR, MLP, RNN, and GRNN ensemble models, respectively. The rainfall predicted by the MLP ensemble model is highly correlated with the actual values, while the correlation for the GRNN forecast is the lowest.

(iv) Willmott Index of Agreement (WI). WI for the MR, MLP, RNN, and GRNN ensemble models is 0.71, 0.89, 0.81, and 0.62, respectively. The index shows that the agreement between actual and predicted rainfall is high for the MLP and RNN ensemble models.

All of the mentioned statistical measures (Table 9), as well as the mean absolute error (Table 6) in prediction of monsoon, ascertain the MLP model to be the best among all four proposed models.
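For readers who wish to reproduce verification measures of the kind reported in Table 9, the sketch below implements RMSE, the percentage of years within an allowed error (PY), the Pearson correlation, and the Willmott index of agreement in their standard textbook forms. These forms are assumed here and may differ in detail from the definitions in Section 4.5; the function names and the sample series `actual` and `predicted` (rainfall as % of LPA) are hypothetical and are not the data of this study.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long series."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def percent_years_within(actual, predicted, allowed_error):
    """PY: percentage of years whose absolute error is within allowed_error."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(p - a) <= allowed_error)
    return 100.0 * hits / len(actual)


def pearson(actual, predicted):
    """Pearson correlation coefficient."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def willmott_index(actual, predicted):
    """Willmott index of agreement:
    1 - sum((P - O)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2)."""
    mean_a = sum(actual) / len(actual)
    num = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    den = sum((abs(p - mean_a) + abs(a - mean_a)) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - num / den


# Hypothetical rainfall series (% of LPA), for illustration only.
actual = [100.0, 90.0, 110.0, 95.0, 105.0]
predicted = [98.0, 94.0, 104.0, 97.0, 101.0]

print("RMSE:", round(rmse(actual, predicted), 2))
print("PY at 5%:", percent_years_within(actual, predicted, 5.0))
print("PC:", round(pearson(actual, predicted), 2))
print("WI:", round(willmott_index(actual, predicted), 2))
```

PC measures only linear association, so a forecast with a constant bias can still score close to 1; WI also penalizes the magnitude of the errors, which is why the two measures are reported side by side.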

5.4. Comparison of Results

5.4.1. Comparison with State-of-the-Art Methods. The proposed fuzzy clustering-based ensemble prediction models are compared with the models used by the Indian Meteorological Department (IMD), namely, the existing 16-parameter power regression model [4] and the 8- and 10-parameter models of Rajeevan et al. [5]. A test period of seven years, from 1996 to 2002, is considered. The IMD models give root mean square errors of 10.8%, 7.6%, and 6.4%, respectively. The MR, MLP, RNN, and GRNN ensemble models give root mean square errors of 6.0%, 3.4%, 4.4%, and 5.5%, respectively, outperforming all three IMD models. The results are shown as a bar graph in Figure 7.

5.4.2. Improvement of Cluster-Based Models over Conventional Models. The ensemble model error obtained by combining all clusters' model outputs is compared with the error obtained by the same model (parameter) trained on the whole dataset without clustering. The mean absolute errors for the various model and predictor set combinations are shown in Table 10. The results clearly depict the improvement in prediction by the clustering and ensemble method over the nonclustered conventional method.

[Figure 7 plot: bar chart; x-axis: Different models (16-par, 8-par, 10-par, MR, MLP, RNN, GRNN); y-axis: Root mean square error as % of LPA (0 to 12).]

Figure 7: Comparison of MR (grey), MLP (purple), RNN (light purple), and GRNN (deep purple) models with the existing IMD 16-param (deep blue), 10-param (blue), and 8-param (light blue) models for the time period 1996–2002 [4, 5]. Striped bars represent errors by our proposed models.
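The comparison in Table 10 can be tallied with a few lines of code. The sketch below uses the (NC, WC) error pairs from Table 10, in % of LPA, and counts how many model and predictor set combinations improve with clustering; the names `TABLE_10`, `tally`, and `mean_errors` are hypothetical helpers introduced only for this illustration.

```python
MODELS = ["MR", "MLP", "RNN", "GRNN"]

# (error without clustering, ensemble error with clustering) per model,
# in the order of MODELS, taken from Table 10 (% of LPA).
TABLE_10 = {
    "PredSet1": [(8.9, 8.6), (10.0, 8.2), (11.7, 7.0), (6.9, 6.4)],
    "PredSet2": [(9.2, 8.2), (12.8, 5.2), (10.7, 8.5), (7.2, 6.4)],
    "PredSet3": [(7.4, 6.7), (6.7, 6.5), (6.2, 5.1), (6.1, 6.1)],
    "PredSet4": [(6.7, 6.2), (5.8, 4.0), (6.0, 5.5), (6.3, 6.3)],
    "PredSet5": [(8.2, 7.9), (9.7, 11.0), (8.9, 8.8), (9.0, 6.7)],
}


def tally(table):
    """Count combinations where clustering lowers, matches, or raises the error."""
    improved = tied = worse = 0
    for pairs in table.values():
        for nc, wc in pairs:
            if wc < nc:
                improved += 1
            elif wc == nc:
                tied += 1
            else:
                worse += 1
    return improved, tied, worse


def mean_errors(table, model):
    """Average (NC, WC) error of one model over all predictor sets."""
    i = MODELS.index(model)
    nc = [pairs[i][0] for pairs in table.values()]
    wc = [pairs[i][1] for pairs in table.values()]
    return sum(nc) / len(nc), sum(wc) / len(wc)


print(tally(TABLE_10))
for model in MODELS:
    nc_mean, wc_mean = mean_errors(TABLE_10, model)
    print(f"{model}: NC {nc_mean:.2f}%  WC {wc_mean:.2f}%")
```

On these values, clustering lowers the error in 17 of the 20 combinations, leaves it unchanged in two (GRNN with PredSet3 and PredSet4), and raises it in one (MLP with PredSet5).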


Table 11: Physical climatic events under study.

Climatic event | Number of years | Years associated with the event
Drought | 13 | 1951, 1965, 1966, 1968, 1972, 1974, 1979, 1982, 1986, 1987, 2002, 2004, 2009
Flood | 11 | 1953, 1956, 1958, 1959, 1961, 1964, 1970, 1975, 1983, 1988, 1994
El-Nino | 23 | 1951, 1953, 1957, 1958, 1963, 1965, 1966, 1968, 1969, 1972, 1977, 1982, 1983, 1986, 1987, 1991, 1992, 1994, 1997, 2002, 2004, 2006, 2009
La-Nina | 22 | 1950, 1954, 1955, 1956, 1964, 1970, 1971, 1973, 1974, 1975, 1984, 1985, 1988, 1989, 1995, 1998, 1999, 2000, 2007, 2008, 2010, 2011
Positive IOD | 12 | 1957, 1961, 1963, 1967, 1972, 1977, 1982, 1983, 1994, 1997, 2006, 2007
Negative IOD | 10 | 1958, 1960, 1964, 1971, 1974, 1975, 1989, 1992, 1993, 1996


Page 10: Research Article Fuzzy Clustering-Based Ensemble Approach ...

10 Advances in Meteorology

Table 11 Physical climatic events under study

Climatic event Numberof years Years associated with the event

Drought 13 1951 1965 1966 1968 1972 1974 1979 1982 1986 1987 2002 2004 2009Flood 11 1953 1956 1958 1959 1961 1964 1970 1975 1983 1988 1994

El-Nino 23 1951 1953 1957 1958 1963 1965 1966 1968 1969 1972 1977 1982 1983 1986 1987 1991 1992 1994 19972002 2004 2006 2009

La-Nina 22 1950 1954 1955 1956 1964 1970 1971 1973 1974 1975 1984 1985 1988 1989 1995 1998 1999 2000 20072008 2010 2011

Positive IOD 12 1957 1961 1963 1967 1972 1977 1982 1983 1994 1997 2006 2007Negative IOD 10 1958 1960 1964 1971 1974 1975 1989 1992 1993 1996

Table 12 Threshold of support and confidence measures forassociating obtained clusters with physical climatic events

Predictor set Support threshold Confidence thresholdPredSet1 037 030PredSet2 025 046PredSet3 021 043PredSet4 029 061PredSet5 021 054

55 Prediction of the Year 2014 Annual Indian summermonsoon rainfall for the year of 2014 is 7817mm which is878 of LPA value Proposed clustering-based ensembleMRMLP RNN and GRNN models predict rainfall of 2014 as961 803 800 and 953 of LPA respectively Thusproposed models show absolute error of 70 for forecastingrainfall of 2014

6 Meteorological Analysis

Next we try to visualize each cluster in terms of physicalclimatic events The clusters obtained by fuzzy clusteringare physically interpreted as being characterized by someglobal climatic events The climatic events considered andstudied during the time period 1948 to 2013 (period con-sidered for clustering in our work) are El-Nino La-Nina(httpggweathercomensoonihtm) positive and nega-tive Indian ocean dipole (httpbomgovauclimateIOD)drought and flood shown in Table 11

Figure 8 shows the El-Nino and La-Nina years associatedwith drought normal and excess rainfall years during 1948ndash2013 The years having rainfall 10 above LPA are excessrainfall years and years having rainfall 10 below LPA aredrought years The El-Nino and La-Nina years are shown bycolor codes (light green and green) in the figure The charthelps to visualize the cooccurrence of El-Nino and La-Ninaevents with extremities of ISMR

61Measuring Association betweenClimatic Events and ISMRSupport and confidence measures are considered to relatephysical climatic event to the clusters generated by fuzzyclustering They are defined below

25

20

15

10

5

0

minus5

minus10

minus15

minus20

minus25

Dev

iatio

n of

rain

fall

from

o

f LPA

1945

1950

1955

1960

1965

1970

1975

1980

1985

1990

1995

2000

2005

2010

2015

Years

Normal years

El-Nino yearsLa-Nina years

Figure 8 El-Nino (light green) and La-Nina (green) years associa-tion with drought (years below 10 of LPA rainfall) normal (yearsbetween +10 and minus10 of LPA rainfall) and excess (years above10 of LPA rainfall) years during period 1948ndash2013

(i) Support Support is defined as percentage of totalnumber of years in the cluster corresponding to theclimatic event

Support =119909ce119873

(12)

where119909ce denotes the number of years associatedwitha specific climatic event in the cluster and 119873 is thetotal count of years in the cluster

(ii) Confidence Confidence is defined as percentage ofyears associated with the climatic event in the clusterto the total number of such event years

Confidence =119909ce119879ce

(13)

where 119879ce is the number of years associated with theclimatic event during the period 1948ndash2013

Advances in Meteorology 11

14

12

10

8

6

4

2

060

5040

3020

100 0

2040

6080

100

ConfidenceSupport

Year

-cou

nt

50400

302022

10 2040

60

ConfidenSupp

(a)

14

12

10

8

6

4

2

060

5040

3020

100 0

2040

6080

100

ConfidenceSupport

Year

-cou

nt

5040

3020

10 2040

60

C nfidencSupport

(b)

Figure 9 Histogram of the confidence and support measures as bins of year-count before (a) and after (b) thresholding for PredSet1

Table 13 Identified physical climatic events being associated with clusters obtained by fuzzy clustering

Predictor Cluster1 Cluster2 Cluster3PredSet1 Drought El-Nino La-Nina La-NinaPredSet2 Flood La-Nina Drought Drought El-Nino La-NinaPredSet3 El-Nino positive IOD Drought Drought El-NinoPredSet4 La-Nina Flood La-Nina DroughtPredSet5 mdash Drought El-Nino Flood

We relate a cluster to a physical climatic event describedin Table 11 if both support and confidence measures attainthe corresponding thresholds The thresholds are chosen in away that 50 of years of study are under consideration A lowthreshold compromises the importance of a climatic eventbeing related to a particular cluster on the other hand if evenless number of years are taken then threshold values shouldbe high which in turn will leave out most of the clustersTherefore as an optimal between the extremes 50 of yearsare considered Figure 9 shows histograms with confidenceand support as bins of year-count for cases before andafter threshold process respectively for predictors PredSet1(Table 2) The threshold values obtained for predictor setsare presented in Table 12 For each predictor set we associatethe clusters with physical climatic events if they satisfy bothsupport and confidence thresholds The climatic events cor-responding to cluster are shown in Table 13 Results establishcoexistence of events of La-Nina and flood It also puts lighton high probability of occurrence of El-Nino drought andpositive IOD events simultaneously

7 Conclusion

Monsoon is an important phenomenon for economic devel-opment of agricultural-land like India Large variability ofmonsoon over years makes prediction of rainfall a challeng-ing task The paper attempts to address this problem by clus-tering the years into similar groups and finally multimodel

ensemble forecast is provided for Indian summer monsoonrainfall

Different climatic parameters with best correlated monthvalue are identified and five different predictor sets are builtfor prediction of Indian monsoon Four different modelsnamely MR MLP RNN and GRNN are designed foreach cluster exclusively The final forecast is provided byweighted ensemble of forecasts by each clusterrsquos model whereweight is considered as fuzzy membership of belonging-ness in each cluster Multilayer perceptron ensemble modelprovides mean absolute error of 40 for prediction ofannual rainfall which is appreciable for forecasting complexmonsoon process Proposed fuzzy clustering-based ensembleapproach surpasses the conventional approach Performanceof proposed clustering-based ensemble models is superior toexisting IMDrsquosmodels [4 5]The error statistics also ascertainthe superiority of multilayer perceptron model over otherthree proposed models Lastly in meteorological context theclusters are linked with global climatic events

In the future, a larger number of climatic parameters influencing the Indian monsoon can be explored, and different predictor sets can be used for different clusters of years to provide even better forecasting accuracy.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

12 Advances in Meteorology

Acknowledgment

This work is supported by the RBU project through the RESPOND program of ISRO through KCSTC, IIT Kharagpur.

References

[1] H. F. Blanford, “On the connexion of the Himalaya snowfall with dry winds and seasons of drought in India,” Proceedings of the Royal Society of London, vol. 37, no. 232–234, pp. 3–22, 1884.

[2] G. T. Walker, “Correlation in seasonal variations of weather, IV: a further study of world weather,” Memoirs of the India Meteorological Department, vol. 24, pp. 275–332, 1924.

[3] V. Thapliyal and S. M. Kulshrestha, “Recent models for long range forecasting of South-West monsoon rainfall in India,” Mausam, vol. 43, no. 3, pp. 239–248, 1992.

[4] V. Gowariker, V. Thapliyal, S. M. Kulshrestha, G. S. Mandal, N. Sen Roy, and D. R. Sikka, “A power regression model for long range forecast of southwest monsoon rainfall over India,” Mausam, vol. 42, no. 2, pp. 125–130, 1991.

[5] M. Rajeevan, D. S. Pai, S. K. Dikshit, and R. R. Kelkar, “IMD's new operational models for long-range forecast of southwest monsoon rainfall over India and their verification for 2003,” Current Science, vol. 86, no. 3, pp. 422–431, 2004.

[6] M. Rajeevan, D. S. Pai, R. A. Kumar, and B. Lal, “New statistical models for long-range forecasting of southwest monsoon rainfall over India,” Climate Dynamics, vol. 28, no. 7-8, pp. 813–828, 2007.

[7] J. Schewe and A. Levermann, “A statistically predictive model for future monsoon failure in India,” Environmental Research Letters, vol. 7, no. 4, Article ID 044023, 2012.

[8] Q. Wu, Y. Yan, and D. Chen, “A linear Markov model for East Asian monsoon seasonal forecast,” Journal of Climate, vol. 26, no. 14, pp. 5183–5195, 2013.

[9] K. Fan, Y. Liu, and H. Chen, “Improving the prediction of the East Asian summer monsoon: new approaches,” Weather & Forecasting, vol. 27, no. 4, pp. 1017–1030, 2012.

[10] F. Mekanik, M. A. Imteaz, S. Gato-Trinidad, and A. Elmahdi, “Multiple regression and artificial neural network for long-term rainfall forecasting using large scale climate modes,” Journal of Hydrology, vol. 503, pp. 11–21, 2013.

[11] A. K. Sahai, M. K. Soman, and V. Satyan, “All India summer monsoon rainfall prediction using an artificial neural network,” Climate Dynamics, vol. 16, no. 4, pp. 291–302, 2000.

[12] W.-C. Hong, “Rainfall forecasting by technological machine learning models,” Applied Mathematics and Computation, vol. 200, no. 1, pp. 41–57, 2008.

[13] S. Chattopadhyay and G. Chattopadhyay, “Comparative study among different neural net learning algorithms applied to rainfall time series,” Meteorological Applications, vol. 15, no. 2, pp. 273–280, 2008.

[14] N. Acharya, S. C. Kar, M. A. Kulkarni, U. C. Mohanty, and L. N. Sahoo, “Multi-model ensemble schemes for predicting northeast monsoon rainfall over peninsular India,” Journal of Earth System Science, vol. 120, no. 5, pp. 795–805, 2011.

[15] V. R. Durai and R. Bhardwaj, “Improving precipitation forecasts skill over India using a multi-model ensemble technique,” Geofizika, vol. 30, no. 2, pp. 119–141, 2013.

[16] B. Parthasarathy, A. A. Munot, and D. R. Kothawale, “Monthly and seasonal rainfall series for All-India homogeneous regions and meteorological subdivisions: 1871–1994,” Tech. Rep. RR-065, Indian Institute of Tropical Meteorology, 1995.

[17] G. P. Compo, J. S. Whitaker, P. D. Sardeshmukh et al., “The twentieth century reanalysis project,” Quarterly Journal of the Royal Meteorological Society, vol. 137, no. 654, pp. 1–28, 2011.

[18] E. Kalnay, M. Kanamitsu, R. Kistler et al., “The NCEP/NCAR 40-year reanalysis project,” Bulletin of the American Meteorological Society, vol. 77, no. 3, pp. 437–471, 1996.

[19] E. M. Rasmusson and T. H. Carpenter, “Variations in tropical sea surface temperature and surface wind fields associated with the Southern Oscillation/El Niño,” Monthly Weather Review, vol. 110, no. 5, pp. 354–384, 1982.



Figure 9: Histogram of the confidence and support measures as bins of year-count before (a) and after (b) thresholding for PredSet1. Each panel plots year-count against confidence and support.

Table 13: Identified physical climatic events associated with clusters obtained by fuzzy clustering.

Predictor Cluster1 Cluster2 Cluster3
PredSet1 Drought El-Niño La-Niña La-Niña
PredSet2 Flood La-Niña Drought Drought El-Niño La-Niña
PredSet3 El-Niño positive IOD Drought Drought El-Niño
PredSet4 La-Niña Flood La-Niña Drought
PredSet5 - Drought El-Niño Flood
