Traffic Prediction in a Bike-Sharing System Yexin Li, Yu Zheng, Huichu Zhang, Lei Chen The Hong Kong University of Science and Technology Microsoft Research, Beijing, China

Page 1:

Traffic Prediction in a Bike-Sharing System

Yexin Li, Yu Zheng, Huichu Zhang, Lei Chen

The Hong Kong University of Science and Technology

Microsoft Research, Beijing, China

Page 2:

Bike-Sharing System

Bike-sharing systems are widely available.

Check out a bike at the origin station, ride to the destination, and check in the bike at the destination station.

Current Problem

Skewed distributions of bike usage: spatial distribution and temporal distribution.

[Figure: spatial distribution of bike usage; temporal distribution of bike usage by hour of day]

Result: no bikes at some stations, no docks at others.

Page 3:

An Ideal Solution

Predict bike usage at each station.

Reallocate bikes by trucks.

[Figure: stations S1 and S2 over the hours 8am, 9am, 10am, 11am]

Bike usage is chaotic at an individual station!

[Figure: usage of an individual station by day, from the 1st to the 31st]

Page 4:

A Practical Solution

Prediction for each station is unnecessary: users check out/in bikes at a random station, and events affect an area instead of a station.

Observations: bike usage of a cluster is more predictable, and inter-cluster transition is more stable.

[Figure: D) check-out of cluster C1 during 7-8am, by day; E) Transition Var. for stations S1, S2 and clusters C1, C2, by day and hour]

Our solution: cluster stations into groups, predict bike usage of each station cluster, and reallocate bikes between station clusters.

[Figure: reallocation between clusters at 8am, 9am, 10am]

Page 5:

Challenges

Cluster definition: which features are considered when clustering.

Impacted by multiple factors: meteorology, correlation between clusters, and events.

Data imbalance: # sunny hours >> # rainy hours. The temperature and wind speed sample (11.7 °C, 4.6 mph) never happened in NYC during 01/4-31/9, 2014.

[Figure: weather distribution in hours (Time/h) over snowy, sunny, foggy, and rainy]

[Figure: temperature and wind speed samples; axes Temp/°C and WS/mph]

Correlation between clusters: a larger check-out at A goes with a larger check-in at B.

[Figure: correlation between clusters A and B]

Page 6:

Framework of Our Solution

Bipartite station clustering

Hierarchical Prediction (check-out): predict bike usage of the entire city, then predict the check-out proportion of each cluster (e.g., 0.2, 0.1, ...).

Check-in Inference (check-in): learn the transition matrix and the trip duration, then infer check-in from the check-out by probability and expectation.

Page 7:

Motivation of Bipartite Station Clustering

Stations in one cluster should be close to each other.

Stations in one cluster should perform similarly.

Inter-cluster transition is more stable.

Check-out proportion is more stable.

[Figure: transitions over clusters C1 to C5; a vector with a single 1 and four 0s is labeled less stable, a vector with four entries of 0.25 and one 0 is labeled more stable]

Page 8:

Bipartite Station Clustering Procedure

Geo-clustering, i.e., K1 clusters

T-matrix generation

T-clustering, i.e., K2 clusters

T-matrix Generation

[Figure: station Sx with its transition probabilities to geo-clusters 1 to 4 for TS1, TS2, and TS7]

TS1: (0.1, 0.3, 0.2, 0.4)

TS2: (0.1, 0.2, 0.4, 0.3)

TS7: (0.5, 0.2, 0.2, 0.1)

T-matrix of Sx:

$$A_x = \begin{bmatrix} 0.1 & 0.3 & 0.2 & 0.4 \\ 0.1 & 0.2 & 0.4 & 0.3 \\ \vdots & \vdots & \vdots & \vdots \\ 0.5 & 0.2 & 0.2 & 0.1 \end{bmatrix}$$
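
A minimal Python sketch of the T-matrix generation step, under stated assumptions: TS is read as a time slot, trips are available as (origin station, destination station, time slot) records, and geo-clustering has already mapped each station to one of K1 clusters. The function name `t_matrix`, its parameters, and the toy data are illustrative and do not come from the slides.

```python
from typing import Dict, Iterable, List, Tuple

Trip = Tuple[str, str, int]  # (origin station, destination station, time slot index)


def t_matrix(trips: Iterable[Trip], station: str,
             station_cluster: Dict[str, int], k1: int, n_slots: int) -> List[List[float]]:
    """Row s holds the share of `station`'s slot-s check-outs that end in each geo-cluster."""
    counts = [[0] * k1 for _ in range(n_slots)]
    for origin, destination, slot in trips:
        if origin == station:
            counts[slot][station_cluster[destination]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A slot with no check-outs is left as all zeros in this sketch.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical toy data: four stations in two geo-clusters, two time slots.
station_cluster = {"S1": 0, "S2": 0, "S3": 1, "S4": 1}
trips = [("S1", "S2", 0), ("S1", "S3", 0), ("S1", "S4", 0), ("S1", "S4", 0),
         ("S1", "S2", 1), ("S1", "S3", 1), ("S2", "S3", 0)]

a_s1 = t_matrix(trips, "S1", station_cluster, k1=2, n_slots=2)
```

The T-clustering step would then group stations whose matrices are similar, for example by flattening each matrix into a vector and running a standard clustering algorithm on those vectors.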

Page 9:

Motivation of Hierarchical Prediction

Bike usage in the entire city is more regular, so it can be predicted more accurately.

The city-level prediction bounds the total prediction error in the lower level.

[Figure: check-out of the entire traffic by day (days 1 to 40) next to check-out of a cluster by day (days 1 to 40)]

Hierarchical Prediction: predict bike usage of the entire city, then predict the check-out proportion of each cluster (e.g., 0.2, 0.1, ...).
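
The two-level idea can be sketched in a few lines of Python. This is an illustration rather than the authors' implementation: the function name and the numbers are made up, and renormalizing the proportions is an assumption added so that the cluster predictions always add up to the city-level prediction.

```python
from typing import List


def hierarchical_check_out(city_total: float, proportions: List[float]) -> List[float]:
    """Split a city-level check-out prediction across clusters by predicted proportion."""
    total = sum(proportions)
    if total <= 0:
        raise ValueError("proportions must have a positive sum")
    # Normalize so the cluster predictions add up to the city-level prediction.
    return [city_total * p / total for p in proportions]


# Hypothetical numbers: 2,400 check-outs predicted for the city in the next hour.
cluster_check_out = hierarchical_check_out(2400.0, [0.2, 0.1, 0.3, 0.4])
```

Because the cluster values sum to the city total, an error in the lower level can only shift usage between clusters; it cannot inflate the overall count.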

Page 10:

Bike Usage of the Entire City

Solution: Gradient Boosting Regression Tree, i.e., GBRT

Features Extraction: day, hour, weather, temperature, wind speed

[Figure: entire traffic by day for 6:00am-7:00am, 7:00am-8:00am, and 8:00am-9:00am]

[Figure: entire traffic by day for each day of the week, Mon. to Sun.]

[Figure: entire traffic with 13th Aug. (rainy) and 25th Sep. (windy) marked]

[Figure: entire traffic by day, Feb. to Aug. 2014, and for 8:00am-9:00am by day; temperature keeps increasing]
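
To make the GBRT step concrete, here is a small self-contained Python sketch of gradient boosting for squared loss. It is a simplified stand-in, not the authors' model: the trees are depth-1 stumps, only two of the listed features are used (hour of day and a rainy flag), and the training values, round count, and learning rate are made up.

```python
from typing import List, Sequence, Tuple

Stump = Tuple[int, float, float, float]  # (feature index, threshold, left value, right value)


def fit_stump(X: Sequence[Sequence[float]], residuals: Sequence[float]) -> Stump:
    """Fit a depth-1 regression tree to the residuals by minimizing squared error."""
    best_sse, best = float("inf"), None
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[f] <= thr]
            right = [r for row, r in zip(X, residuals) if row[f] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if sse < best_sse:
                best_sse, best = sse, (f, thr, lm, rm)
    if best is None:  # every feature is constant: predict the mean residual
        mean = sum(residuals) / len(residuals)
        best = (0, float("inf"), mean, mean)
    return best


def stump_predict(stump: Stump, row: Sequence[float]) -> float:
    f, thr, left_value, right_value = stump
    return left_value if row[f] <= thr else right_value


def fit_gbrt(X, y, n_rounds: int = 300, learning_rate: float = 0.3):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, predictions)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump_predict(stump, row)
                       for p, row in zip(predictions, X)]
    return base, stumps, learning_rate


def gbrt_predict(model, row: Sequence[float]) -> float:
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)


# Made-up training data: features are (hour of day, rainy flag),
# target is the city-wide check-out in that hour.
X = [(7, 0), (8, 0), (9, 0), (7, 1), (8, 1), (9, 1)]
y = [600.0, 1200.0, 900.0, 200.0, 800.0, 500.0]
model = fit_gbrt(X, y)
```

In practice a library implementation with deeper trees and the full feature set (day, hour, weather, temperature, wind speed) would be the natural choice; the sketch only shows the residual-fitting loop that defines the method.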

Page 11:

Check-out Proportion Prediction

The check-out proportion at time t is a similarity-weighted average over the H historical proportions P_{t-H}, P_{t-H+1}, P_{t-H+2}, ..., P_{t-1}:

$$\hat{P}_t = \frac{\sum_{i=1}^{H} W(f_i, f_t) \times P_i}{\sum_{i=1}^{H} W(f_i, f_t)}$$

$$W(f_i, f_t) = \lambda_1(i, t) \times \lambda_2(w_i, w_t) \times K\big((p_i, v_i), (p_t, v_t)\big)$$

Time:

$$\lambda_1(t_1, t_2) = 1_{t_1, t_2} \times \rho_1^{\Delta h(t_1, t_2)} \times \rho_2^{\Delta d(t_1, t_2)}$$

Weather: \lambda_2 is read from the table below (only the upper triangle is shown), e.g., \lambda_2(foggy, foggy) = 1.

         snowy   rainy   foggy   sunny
snowy    1       α1      α2      α3
rainy            1       α4      α5
foggy                    1       α6
sunny                            1

Temperature & Wind speed: K((p_i, v_i), (p_t, v_t))
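
A hedged Python sketch of the multi-similarity weighted average follows. Several pieces are assumptions, since the slides do not spell them out: the indicator term in the time similarity is read as "both weekday or both weekend", Δh and Δd are taken as absolute hour and day gaps, K is taken to be an unnormalized Gaussian kernel, and every constant (`RHO1`, `RHO2`, `SIGMA_P`, `SIGMA_V`, the `ALPHA` table) is made up.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Features:
    day: int        # day index, used for the day gap
    hour: int       # hour of day, used for the hour gap
    weekend: bool   # assumed meaning of the indicator term in lambda_1
    weather: str    # "snowy", "rainy", "foggy", or "sunny"
    temp: float     # temperature p
    wind: float     # wind speed v


# All constants below are made up for illustration.
RHO1, RHO2 = 0.9, 0.95
SIGMA_P, SIGMA_V = 5.0, 3.0
ALPHA = {
    frozenset(("snowy", "rainy")): 0.6,
    frozenset(("snowy", "foggy")): 0.4,
    frozenset(("snowy", "sunny")): 0.1,
    frozenset(("rainy", "foggy")): 0.5,
    frozenset(("rainy", "sunny")): 0.2,
    frozenset(("foggy", "sunny")): 0.7,
}


def lambda1(a: Features, b: Features) -> float:
    """Time similarity: zero across weekday/weekend, decaying with hour and day gaps."""
    if a.weekend != b.weekend:
        return 0.0
    return RHO1 ** abs(a.hour - b.hour) * RHO2 ** abs(a.day - b.day)


def lambda2(w1: str, w2: str) -> float:
    """Weather similarity: 1 on the diagonal, a symmetric table entry otherwise."""
    return 1.0 if w1 == w2 else ALPHA[frozenset((w1, w2))]


def kernel(a: Features, b: Features) -> float:
    """Assumed Gaussian kernel on temperature and wind speed (normalizer omitted)."""
    return math.exp(-((a.temp - b.temp) ** 2 / SIGMA_P ** 2
                      + (a.wind - b.wind) ** 2 / SIGMA_V ** 2))


def weight(a: Features, b: Features) -> float:
    return lambda1(a, b) * lambda2(a.weather, b.weather) * kernel(a, b)


def predict_proportion(history: Sequence[Tuple[Features, List[float]]],
                       target: Features) -> List[float]:
    """Similarity-weighted average of historical check-out proportions."""
    weights = [weight(f, target) for f, _ in history]
    total = sum(weights)
    if total == 0:
        raise ValueError("no similar historical period")
    n_clusters = len(history[0][1])
    return [sum(w * p[j] for w, (_, p) in zip(weights, history)) / total
            for j in range(n_clusters)]


history = [
    (Features(day=1, hour=8, weekend=False, weather="sunny", temp=20.0, wind=4.0), [0.2, 0.8]),
    (Features(day=2, hour=8, weekend=True, weather="rainy", temp=12.0, wind=9.0), [0.9, 0.1]),
]
target = Features(day=4, hour=8, weekend=False, weather="sunny", temp=21.0, wind=4.0)
proportion = predict_proportion(history, target)
```

Because the weights are normalized, a constant factor in the kernel cancels out, which is why the sketch leaves the Gaussian normalizer off.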

Page 12:

Transition Matrix & Trip Duration

Transition Probability. The probability that a bike will be checked in to cluster C_j given that it is checked out from C_i in time t.

[Figure: inter-cluster transition graph over clusters C1 to C4, with the transition matrix below]

$$\begin{bmatrix} 0.01 & 0.1 & 0.39 & 0.5 \\ 0.65 & 0.05 & 0.15 & 0.15 \\ 0.1 & 0.6 & 0.01 & 0.29 \\ 0.05 & 0.88 & 0.05 & 0.02 \end{bmatrix}$$

Trip duration. Using a log-normal distribution to fit.

[Figure: density of trip duration (×10^3 s), data and fitted curve, for C1 to C3 and for C1 to C2]
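
Both quantities can be estimated from trip records. The Python sketch below is illustrative only: the record layout (origin cluster, destination cluster, duration), the function names, and the toy data are assumptions, the transition matrix ignores the dependence on the time t for brevity, and the log-normal fit uses the standard maximum-likelihood estimates (mean and standard deviation of the log durations).

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple

Trip = Tuple[str, str, float]  # (origin cluster, destination cluster, duration in seconds)


def transition_matrix(trips: Iterable[Trip],
                      clusters: Sequence[str]) -> Dict[str, Dict[str, float]]:
    """Estimate P(check-in at Cj | check-out from Ci) as relative trip frequencies."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for origin, destination, _ in trips:
        counts[origin][destination] += 1
    matrix = {}
    for ci in clusters:
        total = sum(counts[ci].values())
        matrix[ci] = {cj: (counts[ci][cj] / total if total else 0.0) for cj in clusters}
    return matrix


def fit_lognormal(durations: List[float]) -> Tuple[float, float]:
    """Maximum-likelihood log-normal fit: mean and std of the log durations."""
    logs = [math.log(d) for d in durations]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / len(logs))
    return mu, sigma


def lognormal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(duration <= x) under the fitted log-normal distribution."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


# Hypothetical trips between two clusters.
trips = [("C1", "C2", 600.0), ("C1", "C2", 900.0), ("C1", "C1", 300.0),
         ("C1", "C2", 1200.0), ("C2", "C1", 700.0)]
T = transition_matrix(trips, ["C1", "C2"])
mu, sigma = fit_lognormal([d for o, dst, d in trips if (o, dst) == ("C1", "C2")])
```

Each row of `T` with at least one observed trip sums to 1, and `lognormal_cdf` gives the probability that a trip between two clusters has finished within a given time.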

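Page 13:

Check-in Inference

The check-in of cluster C_i is inferred from the check-out as the sum of two expectations:

$$I_{C_i,t} = E_{1,i} + E_{2,i}$$

Bikes on road: the expectation of on-road bikes to each cluster. Each on-road bike B_j arrives at C_i with probability P_{B_j,i}, which is computed from the transition matrix T and the trip duration D:

$$E_{1,i} = \sum_{j} P_{B_j,i}$$

Bikes will be borrowed: the check-out O_{C_j,t} of each of the m clusters is spread over the 60 minutes of the hour (1st min, 2nd min, ..., 60th min), and E_{2,i} sums the expected arrivals at C_i over clusters and minutes, again using T and D.

[Figure: on-road bikes B_j with arrival probabilities P_{B_j,i} over clusters C1 to C4, and bikes to be borrowed in the 1st, 2nd, ..., 60th min]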
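
The exact formulas on this slide are not recoverable from the transcript, so the following Python sketch shows one plausible way to compute the two expectations rather than the authors' definition. Assumptions: trip durations are log-normal and measured in minutes, `T[a][b]` is the transition probability from cluster a to b, `D[a][b]` is a hypothetical `(mu, sigma)` pair for that trip, on-road bikes are given as (origin cluster, minutes already ridden), and predicted check-outs are spread evenly over the 60 minutes.

```python
import math
from typing import Dict, List, Tuple

Transition = Dict[str, Dict[str, float]]
Duration = Dict[str, Dict[str, Tuple[float, float]]]  # (mu, sigma) of log-minutes


def lognormal_cdf(x: float, mu: float, sigma: float) -> float:
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


def arrival_prob_on_road(origin: str, elapsed: float, target: str,
                         T: Transition, D: Duration, window: int = 60) -> float:
    """P(arrive at `target` within `window` min | left `origin` `elapsed` min ago, still riding)."""
    still_riding = sum(T[origin][k] * (1.0 - lognormal_cdf(elapsed, *D[origin][k]))
                       for k in T[origin])
    if still_riding == 0:
        return 0.0
    mu, sigma = D[origin][target]
    arrive = T[origin][target] * (lognormal_cdf(elapsed + window, mu, sigma)
                                  - lognormal_cdf(elapsed, mu, sigma))
    return arrive / still_riding


def expected_check_in(target: str, on_road: List[Tuple[str, float]],
                      predicted_check_out: Dict[str, float],
                      T: Transition, D: Duration, window: int = 60) -> float:
    # E1: bikes already on the road when the hour starts.
    e1 = sum(arrival_prob_on_road(o, e, target, T, D, window) for o, e in on_road)
    # E2: bikes that will be borrowed during the hour, spread evenly per minute.
    e2 = 0.0
    for origin, count in predicted_check_out.items():
        mu, sigma = D[origin][target]
        per_minute = count / window
        for minute in range(window):
            e2 += per_minute * T[origin][target] * lognormal_cdf(window - minute, mu, sigma)
    return e1 + e2


# Hypothetical setup: every bike from A goes to B and rides for about one minute.
T = {"A": {"B": 1.0}}
D = {"A": {"B": (0.0, 0.01)}}
check_in_b = expected_check_in("B", on_road=[("A", 0.5)],
                               predicted_check_out={"A": 60.0}, T=T, D=D)
```

In the toy setup the single on-road bike is almost certain to arrive, and a bike borrowed in the last minute has an even chance of arriving before the hour ends, so the expected check-in at B is about 60.5.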

Page 14:

Experiments

Datasets

Citi-Bike Data in New York City

Meteorology Data in New York City

Capital Bikeshare in Washington D.C.

Meteorology Data in Washington D.C.

Metric: Error Rate

$$ER = \frac{\sum_{i=1}^{m} \left| \hat{X}_{C_i,t} - X_{C_i,t} \right|}{\sum_{i=1}^{m} X_{C_i,t}}$$

Data Released: http://research.microsoft.com/apps/pubs/?id=255961
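
For reference, the Error Rate metric translates directly into code: the absolute errors of the m clusters are summed and divided by the total actual usage. The function name and the sample numbers below are illustrative.

```python
from typing import Sequence


def error_rate(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Sum of absolute errors over the m clusters, divided by total actual usage."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    total = sum(actual)
    if total == 0:
        raise ValueError("total actual usage must be positive")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / total


# Hypothetical check-outs for m = 3 clusters in one hour.
er = error_rate(predicted=[90.0, 210.0, 50.0], actual=[100.0, 200.0, 50.0])
```

Normalizing by total usage means an error of 10 bikes in a busy cluster counts the same as an error of 10 bikes in a quiet one.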

Page 15:

Experiments

Clustering Results

Check-out
              All Hours         Anomalous Hours
Methods       GC       BC       GC       BC
HA            0.353    0.355    1.964    1.968
ARMA          0.346    0.346    2.276    2.273
GBRT          0.311    0.314    0.696    0.683
HP-KNN        0.298    0.299    0.692    0.685
HP-MSI        0.288    0.282    0.637    0.503

Check-in
              All Hours         Anomalous Hours
Methods       GC       BC       GC       BC
HA            0.347    0.352    1.837    1.835
ARMA          0.340    0.344    2.152    2.143
GBRT          0.309    0.309    0.681    0.671
HP-KNN        0.302    0.295    0.694    0.684
HP-MSI        0.297    0.290    0.642    0.506
P-TD          0.335    0.302    0.498    0.445

Accuracy improvement: >0.03 for all hours, >0.18 for anomalous hours

Page 16:

Conclusions

Bipartite station clustering: cluster stations based on locations and transitions.

Hierarchical prediction improves the accuracy: it bounds the total error in the lower level, with >0.03 improvement for all hours.

Multi-similarity-based model: deals with data imbalance, with >0.18 improvement for anomalous hours.

Page 17:

Thanks!

Contact: Dr. Yu Zheng [email protected]

Released Data: http://research.microsoft.com/apps/pubs/?id=255961