UNIVERSITI PUTRA MALAYSIApsasir.upm.edu.my/40933/1/FK 2010 56R.pdf · universiti putra malaysia...

14
UNIVERSITI PUTRA MALAYSIA HADI GALAVI FK 2010 56 TIME SERIES MODELING OF WATER LEVEL AT SULAIMAN STATION, KLANG RIVER, MALAYSIA

Transcript of UNIVERSITI PUTRA MALAYSIApsasir.upm.edu.my/40933/1/FK 2010 56R.pdf · universiti putra malaysia...

Page 1: UNIVERSITI PUTRA MALAYSIApsasir.upm.edu.my/40933/1/FK 2010 56R.pdf · universiti putra malaysia hadi galavi fk 2010 56 time series modeling of water level at sulaiman station, klang

UNIVERSITI PUTRA MALAYSIA

HADI GALAVI

FK 2010 56

TIME SERIES MODELING OF WATER LEVEL AT SULAIMAN STATION, KLANG RIVER, MALAYSIA

Page 2: UNIVERSITI PUTRA MALAYSIApsasir.upm.edu.my/40933/1/FK 2010 56R.pdf · universiti putra malaysia hadi galavi fk 2010 56 time series modeling of water level at sulaiman station, klang

© COPYRIG

HT UPM

TIME SERIES MODELING OF WATER LEVEL AT SULAIMAN STATION,

KLANG RIVER, MALAYSIA

By

HADI GALAVI

Thesis submitted to the School of Graduate Studies, Universiti Putra Malaysia,

in Fulfilment of the Requirements for the degree of Master of Science

November 2010A BSTRA CT

Page 3: UNIVERSITI PUTRA MALAYSIApsasir.upm.edu.my/40933/1/FK 2010 56R.pdf · universiti putra malaysia hadi galavi fk 2010 56 time series modeling of water level at sulaiman station, klang

© COPYRIG

HT UPM

ii

DEDICATION

This thesis is dedicated to my parents for their love, endless support and

encouragement.

Page 4: UNIVERSITI PUTRA MALAYSIApsasir.upm.edu.my/40933/1/FK 2010 56R.pdf · universiti putra malaysia hadi galavi fk 2010 56 time series modeling of water level at sulaiman station, klang

© COPYRIG

HT UPM

iii

ABSTRACT

Abstract of thesis presented to the senate of Universiti Putra Malaysia in fulfilment of the requirement for the degree of Master of Science

TIME SERIES MODELING OF WATER LEVEL AT SULAIMAN STATION,

KLANG RIVER, MALAYSIA

By

HADI GALAVI

November 2010

Chairman: Prof Lee Teang Shui, PhD

Faculty: Engineering

The temporal and spatial flow in rivers represents the total response of a watershed and

is affected by every hydrological event occurs in the watershed. The important point in

planning and management of water projects is to achieve accurate estimation of river

stage and flow. Among the two major approaches for modeling and forecasting

hydrologic events, the empirical or black-box models have received more attention than

the physical based models by hydrologists. Among the empirical models the

Autoregressive Integrated Moving Average (ARIMA) models which have been

conventionally used for hydrologic modeling and the Adaptive Neuro-Fuzzy Inference

System (ANFIS) as a data driven model were used in this study to investigate their

effectiveness in forecasting the river flows. The Klang River is one of the important

urban rivers in Malaysia and is the subject of this research. A total of 3012 daily average

water level measurements (2002 – 2010) of the river were used in this study. The

number of inputs for both models is selected based on the Autocorrelation function

(ACF) and the Partial Autocorrelation Function (PACF). The best ARIMA model is

selected based on the minimum corrected Akaike Information Criterion (AICC) statistic

Page 5: UNIVERSITI PUTRA MALAYSIApsasir.upm.edu.my/40933/1/FK 2010 56R.pdf · universiti putra malaysia hadi galavi fk 2010 56 time series modeling of water level at sulaiman station, klang

© COPYRIG

HT UPM

iv

obtained among the several possible ARIMA models. An ARIMA(3,1,3) model was

found to be the appropriate ARIMA model for the modelling of the selected case study.

A Subtractive Clustering Method was used in the fuzzy inference system to determine

the optimal number of Membership Functions (MF) and rules. Using the cross validation

method the best training subset is selected to train the ANFIS model based on that

dataset. The estimation of parameters of the model is accomplished using the hybrid

learning algorithm consisting of standard neural network backpropagation algorithm and

least squares method.

Among the several structures of ANFIS model examined, the model with four

membership functions for the model inputs and output, Guassian membership function

for the fuzzification, four rules for the fuzzy inference engine, and trained using hybrid

algorithm was selected as the best constructed ANFIS model. The performance of the

models, ANFIS and ARIMA models, was compared based on established statistical

performance measures. Results show the superiority of the ANFIS model over the

ARIMA model with interestingly better overall index of 1.514 against 1.108. However,

the database of ARIMA model when became updated with each prediction step showed

that results were dramatically better than the simple ARIMA model with 0.48

improvement in overall index. The updated ARIMA model showed very close results to

those of the ANFIS model with overall index of 1.588 and 1.514, respectively. Overall,

the updated ARIMA model in terms of overall index measure outperforms the ANFIS

model for prediction of average water level, but not in the prediction of an exact time a

day. Both models based on the obtained results (below one percent mean absolute

percentage error) are appropriate models for the modelling and forecasting of Klang

River water level. These models can be applied to the other case studies with model

calibration in terms of the available dataset.

Page 6: UNIVERSITI PUTRA MALAYSIApsasir.upm.edu.my/40933/1/FK 2010 56R.pdf · universiti putra malaysia hadi galavi fk 2010 56 time series modeling of water level at sulaiman station, klang

© COPYRIG

HT UPM

v

ABSTRAK

Abstrak tesis dikemukakan kepada Senat Universiti Putra Malaysia sebagai memenuhi keperluan untuk ijazah Master Sains

TIME SERIES MODELING OF WATER LEVEL AT SULAIMAN STATION,

KLANG RIVER, MALAYSIA

Oleh

HADI GALAVI

November 2010

Pengerusi: Prof Lee Teang Shui, PhD

Fakulti: Kejuruteraan

Aliran di dalam sungai dari segi masa dan reruang membayangkan jumlah sambutan

sebuah kawasan tadahan dan ia dipengaruhi setiap peristiwa hidrologi yang berlaku

di dalam tadahan itu. Perkara terpenting di dalam perancangan dan pengurusan

projek air adalah untuk mencapai penganggaran tetap paras air dan aliran. Di antara

dua pendekatan utama bagi pemodelan dan peramalan peristiwa hidrologi, model

empirik atau kotak-hitam telah mendapa lebih tumpuan daripada juru-hidrologi

dibandingkan model berassakan fizik. Di antara model empirik Autoregressi

Terkamir Purata Bergerak (ARIMA) yang biasa digunakan sebagai model hidrologi

dan Sistem Taabir Kelam Saraf Suai (ANFIS) sebagai model berpandu data telah

digunakan dalam kajian ini demi menyiasat keberkesanan untuk meramalkan aliran

sungai. Sungai Klang ialah sebuah sungai perbandaran di Malaysia dan dipilih untuk

kajian ini. Sebanyak 3012 ukuran harian paras air (2002 – 2010) sungai itu diguna

dalam kajian.

Page 7: UNIVERSITI PUTRA MALAYSIApsasir.upm.edu.my/40933/1/FK 2010 56R.pdf · universiti putra malaysia hadi galavi fk 2010 56 time series modeling of water level at sulaiman station, klang

© COPYRIG

HT UPM

vi

Nombor masukan dipilih berdasarkan Fungsi Autosekaitan (ACF) dan Fungsi

Autosekaitan Separa (PACF) bagi kedua dua model. Model ARIMA terbaik dipilih

berdasarkan statistik Kriterium Matlumat Akaike (AICC) minimum yang didapati di

antara beberapa model ARIMA. Model ARIMA(3,1,3) didapati paling sesuai untuk

memodelkan kes kajian terpilih itu. Kaedah Gugusan Penolakan telad digunakan di

dalam sistem taabir kelam untuk menentukan nombor optima Fungsi Keahlian (MF)

dan peraturan. Dengan menggunakan kaedah pengesahan silang subset latihan

terbaik dipilih untuk melatih model ANFIS berdasarkan set data tersebut. Parameter

penganggaran model dicapai mengunakan algoritma belajar hibrid yang

mengandungi algoritma piawaian rangkaian saraf perambatan-balik dan kaedah

kuasa dua terkecil.

Diantara beberapa struktur model ANFIS yang diuji, model dengan empat fungsi

keahlian bagi masukan dan keluaran model, fungsi keahlian Gaussian untuk

kekelaman, empat peraturan untuk injin taabir kelam dan terlatih dengan

menggunakan algoritma hibrid dipilih sebagai model terbaik dibina. Prestasi model

ANFIS dan ARIMA dibandingkan berdasarkan ukuran prestasi statistik tertubuh.

Keputusan menunjukkan keunggulan model ANFIS dibandingkan model ARIMA.

Akan tetapi, bila database model ARIMA dikemaskini bagi setiap langkah ramalan,

keputusan menunjukkan bahawa ianya lebih baik dibandingkan dengan model

ARIMA dahulu. Model ARIMA terkemaskini menunjukkan keputusan rapat dengan

model ANFIS dan terbukti tertunjuk dengan beberapa ukuran statistik model

ARIMA adalah lebih baik. Pada keseluruhan, model ARIMA terkemaskini didapati

berukuran index keseluruhan lebih berprestasi. Berasaskan kepada keputusan

Page 8: UNIVERSITI PUTRA MALAYSIApsasir.upm.edu.my/40933/1/FK 2010 56R.pdf · universiti putra malaysia hadi galavi fk 2010 56 time series modeling of water level at sulaiman station, klang

© COPYRIG

HT UPM

vii

tercapai, kedua dua model (ARIMA dan ANFIS) adalah sesuai untuk memodelkan

dan meramalkan paras air di dalam sungai Klang.

Page 9: UNIVERSITI PUTRA MALAYSIApsasir.upm.edu.my/40933/1/FK 2010 56R.pdf · universiti putra malaysia hadi galavi fk 2010 56 time series modeling of water level at sulaiman station, klang

© COPYRIG

HT UPM

viii

ACKNOWLEDGEMENTS

First of all, I would like to express my deepest appreciation to my supervisor Prof.

Dr. Lee Teang Shui, for his valuable guidance and suggestions, kindly attitude, and

for believing in me during this work. I have been honored to have the opportunity to

be his student. I would also like to thank Prof. Dr. Mohammad Hamiruce B.

Marhaban, my co-supervisor, for his valuable comments and discussions on this

study.

I attribute a great deal of my success to the positive interactions I have had with my

family. My mother’s emotional support with endless pure love over the years has

been immeasurable. My father gave me encouragement and support, without which

my educational goals could not be realized. I have learnt from him to love the

science, be patient, and work hard.

Additionally, I would like to thank Dr. Mahendran Shitan for always making time to

answer my questions despite his busy schedule at department of mathematics.

Special thanks are due to the Department of Irrigation and Drainage (Kuala Lumpur)

for providing very useful information in order to carry out this study.

Page 10: UNIVERSITI PUTRA MALAYSIApsasir.upm.edu.my/40933/1/FK 2010 56R.pdf · universiti putra malaysia hadi galavi fk 2010 56 time series modeling of water level at sulaiman station, klang

© COPYRIG

HT UPM

ix

APPROVAL I certify that a Thesis Examination Committee has met on 04 November 2010 to conduct the final examination of Hadi Galavi on his thesis entitled “Time Series Modelling of Water Level at Sulaiman Station, Klang River, Malaysia” in accordance with the Universities and University College Act 1971 and the Constitution of the Universiti Putra Malaysia [P.U.(a) 106] 15 March 1998. The Committee recommends that the student be awarded the Master of Science. Members of the Examination Committee were as follows: Helmi Zulhaidi Mohd Shafri, PhD Lecturer Faculty of Engineering Universiti Putra Malaysia (Chairman)

Law Teik Hua, PhD Lecturer Faculty of Engineering Universiti Putra Malaysia (Internal Examiner) Mohd Amin b Mohd Soom, PhD Professor Faculty of Engineering Universiti Putra Malaysia (Internal Examiner)

Othman A. Karim, PhD Associate Professor Information Technology Centre Universiti Kebangsaan Malaysia (External Examiner)

SHAMSUDDIN SULAIMAN, PhD Professor and Deputy Dean School Of Graduate Studies Universiti Putra Malaysia Date: 18 January 2011

Page 11: UNIVERSITI PUTRA MALAYSIApsasir.upm.edu.my/40933/1/FK 2010 56R.pdf · universiti putra malaysia hadi galavi fk 2010 56 time series modeling of water level at sulaiman station, klang

© COPYRIG

HT UPM

x

This thesis was submitted to the senate of Universiti Putra Malaysia and has been accepted as fulfilment of the requirement for the degree of Master of Science. The members of the supervisory committee were as follows:

Lee Teang Shui, PhD Professor Faculty of Engineering Universiti Putra Malaysia (Chairman) Mohammad Hamiruce b. Marhaban, PhD Professor Faculty of Engineering Universiti Putra Malaysia (Member)

HASSANAH MOHD GHAZALI, PhD Professor and Dean School Of Graduate Studies Universiti Putra Malaysia Date:

Page 12: UNIVERSITI PUTRA MALAYSIApsasir.upm.edu.my/40933/1/FK 2010 56R.pdf · universiti putra malaysia hadi galavi fk 2010 56 time series modeling of water level at sulaiman station, klang

© COPYRIG

HT UPM

xi

DECLARATION

I declare that the thesis is my original work except for quotations and citations which have been duly acknowledged. I also declare that it has not been previously and is not concurrently submitted for any other degree at Universiti Putra Malaysia or other institutions.

HADI GALAVI Date:

Page 13: UNIVERSITI PUTRA MALAYSIApsasir.upm.edu.my/40933/1/FK 2010 56R.pdf · universiti putra malaysia hadi galavi fk 2010 56 time series modeling of water level at sulaiman station, klang

© COPYRIG

HT UPM

xii

TABLE OF CONTENTS

Page

DEDICATION ii ABSTRACT iii ABSTRAK v

ACKNOWLEDGEMENTS viii APPROVAL ix

DECLARATION xi LIST OF TABLES xiv

LIST OF FIGURES xv

LIST OF ABBREVIATIONS xvii LIST OF NOTATIONS xviii

CHAPTERS

1 INTRODUCTION 1

1.1 Background 1

1.2 Problem Statement 3

1.3 Research Objectives 6

1.4 Scope of the Research 7

1.5 Significance of the work 7

2 LITERATURE REVIEW 9

2.1 Hydrological Forecast 9

2.2 Hydrological Time Series 10

2.3 Box-Jenkins Models 12

2.4 Soft Computing 17

2.4.1 Artificial Neural Network (ANN) 18

2.4.2 Neuro-Fuzzy Modelling 22

2.5 Summary 31

3 ARIMA AND NEURO-FUZZY MODELLING APPROACHES 33

3.1 General 33

3.2 Box-Jenkins Models 34

3.2.1 Data Preprocessing 36

3.2.2 Identification of the Tentative ARIMA Model 37

3.2.3 Parameter Estimation 41

3.2.4 Diagnostic Checking 45

3.2.5 Forecasting 48

Page 14: UNIVERSITI PUTRA MALAYSIApsasir.upm.edu.my/40933/1/FK 2010 56R.pdf · universiti putra malaysia hadi galavi fk 2010 56 time series modeling of water level at sulaiman station, klang

© COPYRIG

HT UPM

xiii

3.3 Neuro-Fuzzy Modelling Approach 50

3.3.1 Data Pre-Processing 52

3.3.2 Model Inputs and Output 53

3.3.3 ANFIS Architecture and Algorithm 54

3.3.4 Fuzzy Clustering 59

3.3.5 Estimation of Parameters 63

3.3.6 Defuzzification 66

3.3.7 Model Performance Measurements 66

4 MODEL DEVELOPMENT AND TESTING 70

4.1 Study area and Data Acquisition 70

4.1.1 Available Data 71

4.2 ARIMA-Based Model 72

4.2.1 Data Preprocessing 72

4.2.2 Model Identification 76

4.2.3 Parameter Estimation and Diagnostic Check 77

4.3 ANFIS-Based Model 81

4.3.1 Input Selection 81

4.3.2 Data Preparation 82

4.3.3 FIS Construction 84

4.3.4 ANFIS Training 86

5 RESULTS AND DISCUSSIONS 88

5.1 ARIMA Model 88

5.2 ANFIS-Based Model 91

5.2.1 Selection of Best Training/Testing Dataset 91

5.2.2 Selection of Best FIS 93

5.2.3 Sensitivity to the Different MFs 98

5.3 The Finally Selected ANFIS Model 101

5.4 Comparison of ARIMA and ANFIS based Models 103

5.5 Summary 110

6 CONCLUSIONS AND RECOMMENDATIONS 112

6.1 Conclusions 112

6.2 Recommendations for Future Research 114

REFERENCES 116

APPENDIX 123

BIODATA OF STUDENT 129

LIST OF PUBLICATIONS 130