Examining Data Pattern


Transcript of Examining Data Pattern

Page 1: Examining Data Pattern

Exploring data patterns

Page 2: Examining Data Pattern

Overview

• The purpose of the forecast is to reduce the range of uncertainty within which management judgments must be made.

• Key guidelines: the forecast should be
  • accurate enough and justified on a cost-benefit basis
  • effectively presented

Page 3: Examining Data Pattern

Forecasting Process

• Problem formulation and data collection

• Data manipulation and cleaning

• Model building and evaluation

• Model implementation (the actual forecast)

• Forecast evaluation

Page 4: Examining Data Pattern

Problem formulation and data collection

• The problem determines the appropriate data
  • If a quantitative forecasting methodology is being considered, the relevant data must be available and correct

• If appropriate data are not available
  • The problem may have to be redefined or a non-quantitative forecasting methodology employed.

Page 5: Examining Data Pattern

Data manipulation and cleaning

• Some effort is required to get data into the form required by certain forecasting procedures.

• It is possible to have too much data as well as too little in the forecasting process.

• Some data may not be relevant to the problem.

• Some data may have missing values that must be estimated.
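One simple way to estimate missing values is linear interpolation between the nearest known neighbours. The sketch below is only an illustration under that assumption; the function name `fill_missing_linear` and the sample values are hypothetical and are not part of the slides, and other estimation methods may suit a given series better.

```python
def fill_missing_linear(values):
    """Fill None entries by linear interpolation between known neighbours.

    Gaps at the start or end are filled with the nearest known value.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    if not known:
        raise ValueError("at least one observed value is required")

    # Interior gaps: draw a straight line between the two surrounding points.
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)

    # Leading and trailing gaps: carry the nearest observation.
    for i in range(known[0]):
        filled[i] = filled[known[0]]
    for i in range(known[-1] + 1, len(filled)):
        filled[i] = filled[known[-1]]
    return filled


# Made-up series with three missing observations.
series = [10, None, 14, None, None, 20]
completed = fill_missing_linear(series)
print(completed)
```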

Page 6: Examining Data Pattern

Data

• Data should be reliable and accurate

• Data should be relevant

• Data should be consistent

• Data should be timely

Generally there are two types of data:

• Data collected at a single point in time
  • Cross-sectional data: observations collected at a single point in time

• Observations of data made over time
  • Time series: data are collected, recorded, or observed over successive increments of time

Page 7: Examining Data Pattern

Time series data pattern

• Trend: long-term component that represents the growth or decline in the time series over an extended period of time

• Cyclical component: wavelike fluctuation around the trend

• Seasonal component: pattern of change that repeats itself year after year

Page 8: Examining Data Pattern

Data patterns

[Figure: example plots of a trend-cyclical pattern and a seasonal pattern]

Page 9: Examining Data Pattern

Exploring data pattern: autocorrelation analysis

• Autocorrelation: the correlation between a variable lagged one or more periods and itself

• Autocorrelation coefficients for different time lags of a variable are used to identify time series data patterns

Page 10: Examining Data Pattern

Autocorrelation Analysis

Page 11: Examining Data Pattern

Data Patterns: Autocorrelation Analysis

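\[ r_1 = \frac{\sum_{t=2}^{n} (Y_t - \bar{Y})(Y_{t-1} - \bar{Y})}{\sum_{t=1}^{n} (Y_t - \bar{Y})^2} \]

\[ r_2 = \frac{\sum_{t=3}^{n} (Y_t - \bar{Y})(Y_{t-2} - \bar{Y})}{\sum_{t=1}^{n} (Y_t - \bar{Y})^2} \]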
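The lag-1 and lag-2 coefficients generalise to any lag k by summing from t = k+1 to n in the numerator. A minimal Python sketch of that calculation follows; the function name `autocorrelation` and the five-point series are illustrative assumptions, not taken from the slides.

```python
def autocorrelation(series, k):
    """Lag-k autocorrelation coefficient r_k of a sequence of numbers."""
    n = len(series)
    if not 0 < k < n:
        raise ValueError("lag k must satisfy 0 < k < len(series)")
    mean = sum(series) / n
    # Denominator: sum of squared deviations over all n observations.
    den = sum((y - mean) ** 2 for y in series)
    if den == 0:
        raise ValueError("series is constant; r_k is undefined")
    # Numerator: products of deviations that are k periods apart.
    num = sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n))
    return num / den


# Made-up series, small enough to check by hand.
y = [1, 2, 3, 4, 5]
r1 = autocorrelation(y, 1)
r2 = autocorrelation(y, 2)
print(r1, r2)
```

For this series the deviations from the mean are -2, -1, 0, 1, 2, so the denominator is 10, the lag-1 numerator is 4 and the lag-2 numerator is -1.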

Page 12: Examining Data Pattern

Data Patterns

• Random
  • r_k is close to zero for any lag k

• Trend
  • r_k is high at k = 1 and becomes smaller as k increases

• Seasonal/cyclic
  • r_k is high at the lags at which the season or cycle repeats.
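These three signatures can be checked on synthetic data. The sketch below builds one artificial series of each kind and computes a few autocorrelation coefficients; the series, the seed, the period of 12 and the helper name `autocorrelation` are all assumptions chosen for illustration.

```python
import math
import random


def autocorrelation(series, k):
    """Lag-k autocorrelation coefficient r_k."""
    n = len(series)
    mean = sum(series) / n
    den = sum((y - mean) ** 2 for y in series)
    num = sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n))
    return num / den


# Random pattern: independent noise (seed fixed so the run is repeatable).
rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(200)]
noise_r = [autocorrelation(noise, k) for k in (1, 2, 3)]

# Trend pattern: a straight line, 48 observations.
trend = [100 + 2 * t for t in range(48)]
trend_r = [autocorrelation(trend, k) for k in (1, 6, 12)]

# Seasonal pattern: a sine wave that repeats every 12 periods.
seasonal = [10 * math.sin(2 * math.pi * t / 12) for t in range(48)]
seasonal_r = {k: autocorrelation(seasonal, k) for k in (6, 12)}

print("random  :", [round(r, 2) for r in noise_r])    # expected to be small
print("trend   :", [round(r, 2) for r in trend_r])    # high at lag 1, then falling
print("seasonal:", {k: round(r, 2) for k, r in seasonal_r.items()})  # high at lag 12
```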

Page 13: Examining Data Pattern

Example

Page 14: Examining Data Pattern

Example

[Figure: time series plot; vertical axis from 100 to 170, horizontal axis from 0 to 15]

Page 15: Examining Data Pattern

r1

Page 16: Examining Data Pattern

r2

Page 17: Examining Data Pattern

r3

Page 18: Examining Data Pattern

Correlogram

• Correlogram or autocorrelation function is a graph of autocorrelations for various lags of a time series

[Figure: Autocorrelation Function for C1 (with 5% significance limits for the autocorrelations); horizontal axis: Lag (1 to 3), vertical axis: Autocorrelation (-1.0 to 1.0)]
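A correlogram can be approximated in plain text. The sketch below prints one bar per lag and flags coefficients outside approximate 5% limits of ±1.96/√n. That constant limit is a simplifying assumption (it treats the series as white noise); statistical packages may compute limits that vary with the lag. The names `autocorrelation` and `text_correlogram` and the sample data are hypothetical.

```python
import math


def autocorrelation(series, k):
    """Lag-k autocorrelation coefficient r_k."""
    n = len(series)
    mean = sum(series) / n
    den = sum((y - mean) ** 2 for y in series)
    num = sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n))
    return num / den


def text_correlogram(series, max_lag, width=20):
    """Print a text correlogram and return (lag, r_k, flagged) tuples."""
    # Approximate 5% significance limit under a white-noise assumption.
    limit = 1.96 / math.sqrt(len(series))
    rows = []
    for k in range(1, max_lag + 1):
        r = autocorrelation(series, k)
        flagged = abs(r) > limit
        bar = ("-" if r < 0 else "+") * round(abs(r) * width)
        print(f"lag {k:>2}  {r:+.2f} {'*' if flagged else ' '} {bar}")
        rows.append((k, r, flagged))
    return rows


# Made-up trending series: 1, 2, ..., 20.
rows = text_correlogram(list(range(1, 21)), max_lag=3)
```

Lags marked with `*` fall outside the approximate limits and suggest a pattern that is unlikely to be purely random.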

Page 19: Examining Data Pattern

Random

[Figure: Autocorrelation Function for C1 (with 5% significance limits for the autocorrelations); horizontal axis: Lag (1 to 10), vertical axis: Autocorrelation (-1.0 to 1.0)]

Page 20: Examining Data Pattern

Trend

[Figure: Autocorrelation Function for index (with 5% significance limits for the autocorrelations); horizontal axis: Lag (1 to 16), vertical axis: Autocorrelation (-1.0 to 1.0)]

Page 21: Examining Data Pattern

Trend

• A significant relationship exists between successive time series values

• Autocorrelation coefficients are typically large for the first several time lags and then gradually drop toward zero as the number of lags increases

• Non-stationary time series
  • By contrast, a stationary time series has a mean and variance that remain constant over time; it varies about a fixed level (no growth or decline)

Page 22: Examining Data Pattern

Seasonal data

[Figure: Autocorrelation Function for C1 (with 5% significance limits for the autocorrelations); horizontal axis: Lag (2 to 26), vertical axis: Autocorrelation (-1.0 to 1.0)]

Page 23: Examining Data Pattern

Seasonal data

• If a series is seasonal, a pattern related to the calendar repeats itself over a particular interval of time (usually a year)

• Observations in the same position for different seasonal periods tend to be related

Page 24: Examining Data Pattern

Choosing a forecasting technique

• Define the nature of the forecasting problem

• Explain the nature of the data under investigation

• Describe the capabilities and limitations of potentially useful forecasting techniques

• Develop some predetermined criteria on which the selection decision can be made

A major factor influencing the selection of a forecasting technique is the identification and understanding of historical patterns in the data

Page 25: Examining Data Pattern

Forecasting for stationary data

Stationary forecasting techniques are used whenever:

• The forces generating a series have stabilized and the environment in which the series exists is relatively unchanging

• A very simple model is needed because of lack of data or for ease of explanation or implementation

• Stability may be obtained by making simple corrections for factors such as population growth or inflation

• The series may be transformed into a stable one

• The series is a set of forecast errors from a forecasting technique that is considered adequate

Forecasting techniques: naïve, simple average, moving average, ARMA (Box-Jenkins)
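The first three techniques in this list are simple enough to sketch directly. The code below is an illustration only: the function names, the window length and the demand figures are assumptions, and each function produces a one-step-ahead forecast for a series assumed to be stationary.

```python
def naive_forecast(series):
    """Naive method: the next value equals the most recent observation."""
    return series[-1]


def simple_average_forecast(series):
    """Simple average: the next value equals the mean of all observations."""
    return sum(series) / len(series)


def moving_average_forecast(series, window):
    """Moving average: the next value equals the mean of the last `window` values."""
    if not 0 < window <= len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window


# Made-up demand figures that fluctuate around a fixed level.
demand = [20, 22, 21, 19, 23, 21]
f_naive = naive_forecast(demand)
f_mean = simple_average_forecast(demand)
f_ma3 = moving_average_forecast(demand, window=3)
print(f_naive, f_mean, f_ma3)
```

A shorter window makes the moving average react faster to recent changes; a longer one smooths more.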

Page 26: Examining Data Pattern

Forecasting techniques for data with trend

• A time series is said to have a trend if its average value changes over time (increases or decreases)

• Forecasting techniques:
  • Moving averages
  • Holt’s linear exponential smoothing
  • Simple regression
  • Growth curves
  • Exponential models
  • ARIMA (Box-Jenkins)
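As one example from this list, Holt's linear exponential smoothing keeps a smoothed level and a smoothed trend and extrapolates both. The sketch below uses the usual two-equation form; the initialisation (level = first observation, trend = first difference), the smoothing constants and the data are assumptions for illustration, not values from the slides.

```python
def holt_forecast(series, alpha, beta, horizon=1):
    """Holt's linear exponential smoothing; returns forecasts 1..horizon steps ahead.

    alpha smooths the level and beta smooths the trend (both between 0 and 1).
    """
    if len(series) < 2:
        raise ValueError("at least two observations are required")
    # Assumed initialisation: first value as level, first difference as trend.
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        # Update the level from the new observation and the previous projection.
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        # Update the trend from the change in level.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + p * trend for p in range(1, horizon + 1)]


# Made-up series that grows by exactly 2 per period.
sales = [10, 12, 14, 16, 18]
forecasts = holt_forecast(sales, alpha=0.5, beta=0.5, horizon=2)
print(forecasts)
```

On a perfectly linear series the method reproduces the line, so the two forecasts continue the pattern; on real data the choice of alpha and beta matters and is usually tuned against a forecast error measure.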

Page 27: Examining Data Pattern

Forecasting techniques for seasonal data

Forecasting techniques for seasonal data are used whenever:

• Weather influences the variable of interest, e.g. electrical consumption, clothing, agricultural growing season

• The annual calendar influences the variable of interest

Forecasting techniques: classical decomposition, Winters’ exponential smoothing, multiple regression, ARIMA

Page 28: Examining Data Pattern

Forecasting techniques for cyclical series

• Cyclical patterns are difficult to model because their patterns are typically not stable

• The up-down wavelike fluctuations around the trend rarely repeat at fixed intervals of time; the magnitude of the fluctuations also tends to vary

• Because of the irregular behavior of cycles, analyzing the cyclical component of a series often requires finding coincident or leading economic indicators

Page 29: Examining Data Pattern

Forecasting techniques for cyclical series

Forecasting techniques for cyclical data are used whenever:

• The business cycle influences the variable of interest

• Shifts in popular tastes occur

• Shifts in population occur

• Shifts in the product life cycle occur

Forecasting techniques: classical decomposition, economic indicators, econometric models, multiple regression, ARIMA (Box-Jenkins)

Page 30: Examining Data Pattern

Other factors to consider

• Time horizon
  • Short to intermediate term: moving averages, exponential smoothing, Box-Jenkins, classical decomposition, regression model
  • Long term: regression model

• Time constraint
  • Exponential smoothing, trend projection, regression model, classical decomposition

• Representation for the decision-making process
  • Regression model, trend projection, classical decomposition, exponential smoothing

Page 31: Examining Data Pattern

Assignment #1

• Find data:
  • Jakarta IHSG
  • World oil price
  • Gold price
  • Rupiah exchange rate against the dollar
  • Gross Domestic Product
  • Blood demand
  • Number of tourists visiting DIY
  • Consumption of electricity/water/basic necessities (sembako)

• Evaluate the data patterns

• Report:
  • 10 pages maximum
  • Presentation next week

Page 32: Examining Data Pattern

Model building and evaluation

• Fitting the collected data into a forecasting model that is appropriate in terms of minimizing forecasting error.

• The simpler the model, the better it is in terms of gaining acceptance.

• Judgment is involved in this selection process.

Page 33: Examining Data Pattern

Model implementation (the actual forecast)

• Forecasting for recent periods in which the actual historical values are known is often used to check the accuracy of the process.

Page 34: Examining Data Pattern

Forecast evaluation

• Comparing forecast values with the actual historical values.

• Examination of the error patterns often leads the analysis to a modification of the forecasting procedure.

Page 35: Examining Data Pattern

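Measuring forecasting errors

1. Mean Absolute Deviation (MAD)

\[ \mathrm{MAD} = \frac{\sum_{t=1}^{n} |A_t - F_t|}{n} \]

where A_t is the actual value and F_t is the forecast value for period t.

2. Mean Square Error (MSE)

\[ \mathrm{MSE} = \frac{\sum_{t=1}^{n} (A_t - F_t)^2}{n} \]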

Page 36: Examining Data Pattern

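Measuring forecasting errors

3. Mean Percentage Error (MPE)

\[ \mathrm{MPE} = \frac{100}{n} \sum_{t=1}^{n} \frac{A_t - F_t}{A_t} \]

4. Mean Absolute Percentage Error (MAPE)

\[ \mathrm{MAPE} = \frac{100}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right| \]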
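The four measures (MAD, MSE, MPE and MAPE) can be computed together from paired actual and forecast values. The sketch below assumes A_t is the actual value and F_t the forecast; the function name `forecast_error_measures` and the numbers are illustrative only. MPE and MAPE are undefined when an actual value is zero.

```python
def forecast_error_measures(actual, forecast):
    """Return MAD, MSE, MPE and MAPE for paired actual and forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    n = len(actual)
    errors = [a - f for a, f in zip(actual, forecast)]
    return {
        # Average size of the errors, ignoring sign.
        "MAD": sum(abs(e) for e in errors) / n,
        # Average squared error; penalises large errors more heavily.
        "MSE": sum(e ** 2 for e in errors) / n,
        # Average signed percentage error; reveals systematic bias.
        "MPE": 100 * sum(e / a for e, a in zip(errors, actual)) / n,
        # Average absolute percentage error.
        "MAPE": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }


# Made-up values: the forecast is 10 too high, 10 too low, then exact.
actual = [100, 200, 400]
forecast = [110, 190, 400]
measures = forecast_error_measures(actual, forecast)
print(measures)
```

Here the errors are -10, 10 and 0, so MAD is 20/3, MSE is 200/3, MPE is about -1.67 and MAPE is 5.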

Page 37: Examining Data Pattern

Measuring forecasting errors

5. R-squared

6. Tracking Signal

\[ \mathrm{TS} = \frac{\sum (X_t - F_t)}{\mathrm{MAD}} \]
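The tracking signal can be sketched as the running sum of forecast errors divided by the MAD, where X_t is taken to be the actual value and F_t the forecast. This reading of the slide's formula, the function name `tracking_signal` and the data are assumptions made for illustration.

```python
def tracking_signal(actual, forecast):
    """Sum of forecast errors divided by the mean absolute deviation."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    errors = [x - f for x, f in zip(actual, forecast)]
    mad = sum(abs(e) for e in errors) / len(errors)
    if mad == 0:
        raise ValueError("MAD is zero; the tracking signal is undefined")
    return sum(errors) / mad


# Made-up values: the forecast is too low in two of three periods.
actual = [100, 200, 400]
forecast = [90, 190, 400]
ts = tracking_signal(actual, forecast)
print(ts)
```

A value far from zero indicates that the errors are mostly of one sign, which points to a biased forecast.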

Page 38: Examining Data Pattern

Cross-Validation Methods

• Simple cross validation
  • Training set & test set

• Double cross validation
  • Training set & test set, then the two are swapped

• Test-set cross validation
  • A random 30% as the test set & the rest as the training set

Page 39: Examining Data Pattern

Cross-Validation Methods

• k-fold cross validation
  • The data are split into k parts; one part is the test set & the rest is the training set, repeated k times

• Leave-one-out cross validation
  • Similar to k-fold, with k = the number of data points
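The k-fold scheme can be sketched as index bookkeeping. The function below splits n observation indices into k contiguous folds without shuffling and uses each fold once as the test set; the name `k_fold_splits` and the sizes are hypothetical. Setting k equal to n gives the leave-one-out case.

```python
def k_fold_splits(n, k):
    """Return k (train_indices, test_indices) pairs covering indices 0..n-1."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and n")
    indices = list(range(n))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    splits = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        splits.append((train, test))
        start += size
    return splits


# 3-fold split of six observations.
splits = k_fold_splits(6, 3)
for train, test in splits:
    print("train:", train, "test:", test)

# Leave-one-out: k equals the number of observations.
loo = k_fold_splits(4, 4)
```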

Page 40: Examining Data Pattern

Questions?