Zheng WANG Yan LIU Jieping YE · Smart Transportation Brain Control Analysis Data Collection Signal...
Transcript of Zheng WANG Yan LIU Jieping YE · Smart Transportation Brain Control Analysis Data Collection Signal...
Zheng WANG Yan LIU Jieping YEDiDiAI Labs
Didi Chuxing
DiDiAI Labs
Univ. of Southern California
DiDiAI Labs
Univ. of Michigan, Ann Arbor
Outlinen ChallengesandopportunitiesintransportationAI (20min)• Overviewofurbantransportation• TheemergingchallengesintransportationAI
n AIapplicationsintransportation (165min+Break)• MapservicesI:mapmatching,routeplanning,estimatedtimeofarrival(ETA) (60min)• Break(30min)• MapservicesII: trafficestimation,trafficforecast(45min)• Decisionmakingservices:dispatching(30min)• AIapplicationsandAIforsocialgood(30min)
n DataandtoolsfortransportationAI (15min)n Q&A(10min)
Part 1:ChallengesandOpportunitiesinTransportationAI
HistoryofUrbanTransportation
HistoryofTrafficLight
1860sBefore
NoTrafficLight Hand-operated TrafficLight
1910s
Electric TrafficLight
1920s
Weigh inMotion
1960s1980s
Camera RadarLoop Detector
Transportation:AMulti-DisciplinaryIndustry
Planning
Management
Engineering PolicyMaking
Design
Science
Transportation
……
Cutting-EdgeIssues
7
CooperativeVehicle-Highway Systems RideSharing Multimodal
Transportation
SmartTransportationSystemSmartTravelers
Smart InfrastructureSmartVehicles Cloud
BigData
TransportationEngineeringAI
FromDriveAlonetoRideSharing
FromHumanDriving toAutonomousDriving
FromIndependent SystemstoCooperativeVehicle-Highway Systems
BuildingtheBrainsofSmartTransportation
Intelligent andConnectedVehicles FutureIntersection
AdaptiveControlasaService
AIandMachineLearning
NeuralNetworks
MachineLearning:supervised,unsupervised
DeepLearning
ReinforcementLearning
?
DataResource
• Locationdata,Trajectorydata
•Transaction data
•Profile data
• Sensors:multimedia data
•Cross-platform identification
GPS Data
LocationDataandFloating-CarTrajectory
Sensors
Loopdetector,camera,microphone,mobilesensors …
TransportationAI
BigdatamakesAIpossiblefortransportation.
SmartTransportationBrain
Control
Analysis
DataCollection
SignalControl
FreewayControl
TrafficGuidance
IncidentManagement
AIDispatch
PerformanceMeasures
CongestionDiagnosis
NetworkDesign
TrafficSimulation
AccidentAnalysis
DiDi Data
GovernmentData
Collaborators’Data
CrowdSourcedData
Supply
Demand
??
? ?
Ride-sharingServices
PlatformOptimization
MapServices
Taxi
Express
Car Pool
Premiere
……
Demand-Supply Prediction
Order Dispatch
CarPooling
ResourceAllocation
Multi-modal
RoutePlanningETA
Pick-uplocations
VRNavigation
RoutePlanning
Outlinen ChallengesandopportunitiesintransportationAI (20min)• Overviewofurbantransportation• TheemergingchallengesintransportationAI
n AIapplicationsintransportation (165min+Break)• MapservicesI:mapmatching,routeplanning,estimatedtimeofarrival(ETA) (60min)• Break(30min)• MapservicesII: trafficestimation,trafficforecast(45min)• Decisionmakingservices:dispatching(30min)• AIapplicationsandAIforsocialgood(30min)
n DataandtoolsfortransportationAI (15min)n Q&A(10min)
Part 2:AIApplicationsinTransportation
FutureTransportation
SmartInfrastructure
SmartVehicle
SharedMobility
ElectricalVehicle,AutonomousVehicle,…
Highway,Road,SmartTrafficLight,…
MapService,DecisionService,…
KeyComponents
•BasicLayer:mapserviceandLBS•UpperLayer:decisionserviceandmarketplace
MapService I
SmartMapforModernTransportation
Navigation
RideHailing TransportationSystemEfficiency
AutonomousDriving
CoreMapService
MapMatching
RoutePlanning ETA
TrafficPositioning
MapMatching
MapMatching
•ProblemDefinition
• Solutions toMapMatching
•Challenges
MapMatching
GPS Points
Road Network
Naive Approach:NearestNeighbor
GPS Points
Road Network
SequencetoSequence:HMM
•ModelthisprocessusingHiddenMarkovModel.
Hidden state (Road
Segment)
Observation(GPS)
TransitionProbability
Emission Probability
zi
xi
x1 x2x0 x3 x4
z4z1 z2 z3
*PaulNewsonetal.HiddenMarkovMapMatchingThroughNoiseandSparseness.ACMSIGSPATIAL 2009
EmissionProbability
0
0.02
0.04
0.06
0.08
0.1
0.12
0 5 10 15 20
DistanceBetweenGPSandMatchedPoint(meters)
GPSDeviationFollowsGaussianDistribution
Data Histogram
Gaussian Distribution
*Partofthisslideisborrowedfrom:PaulNewsonetal.HiddenMarkovMapMatchingThroughNoiseandSparseness.ACMSIGSPATIAL 2009
TransitionProbability
01234567
0 0.5 1 1.5 2
Pro
bab
ility
abs(great circle distance - route distance) (meters)
DistanceDifferenceDistribution
Data Histogram
Exponential Distribution
*Partofthisslideisborrowedfrom:PaulNewsonetal.HiddenMarkovMapMatchingThroughNoiseandSparseness.ACMSIGSPATIAL 2009
ParameterEstimation
• Inreality:heuristicestimation• Emissionprobability parameter(noiseinlocationmeasurements):
• Transitionprobabilityparameter(toleranceofnon-directroutes):
• Parameterrefinement
• In literature:parameterlearning
ParameterLearningwithIRL
•Map Matching with Inverse Reinforcement Learning• Transition probability variant, HMM
• 𝑃"#$%& = )*exp(−
|#∗234|* )
• Conventionally, 𝑟∗ = 𝑑) + 𝑑9 + 𝑑: + 𝑑; − 𝑑2• Here 𝑟∗ = 𝑑) + 𝑑9 + 𝑑: + 𝑑; − 𝑑2 +𝑤"=#%(𝑢?) + 𝑢)9 + 𝑢9:)
• Weight estimation 𝑤"=#% with IRL
*T.Osogami etal.MapMatchingwithInverseReinforcementLearning,IJCAI2013.
StateEstimationusingViterbiAlgorithm
•Map-matchingasstateestimation• Input:asequenceofGPSpoints,HMM• Output:themostlikelystatesequence,i.e.,asequenceofedgesintheroadnetwork.
•ViterbiAlgorithm• DynamicProgrammingbasedmethodtoidentifythestatesequencewiththehighestprobability.
Challenges
• Large-scaleGPSdata
• Low-qualityGPSdataformobiledevice
• Limited amountoflabeleddata• Unsupervised learning:EMforHMM• Semi-supervised learning
Reference• SotirisBrakatsoula etal.Onmap-matchingvehicletrackingdata,VLDB2005• PaulNewsonetal.HiddenMarkovmapmatchingthroughnoiseandsparseness,ACMSIGSPATIAL2009• YinLouetal.Map-matching forlow-sampling-rateGPStrajectories,ACMSIGSPATIAL2009• T.Osogami etal.MapMatchingwithInverseReinforcementLearning,IJCAI 2013
RoutePlanning
RoutePlanning
RoutePlanning
•Challenges:• Large-scaletransportationnetwork• Highqueryspeed• Accurate result
•Classical problem:shortest pathalgorithm• Dijkstra’salgorithmanditsextensions:(high)query speed,lessrobust• A*-algorithm:robust,lowqueryspeed• Customizablerouting:robust,relativehighqueryspeed
•Data-drivenapproaches• Whatistheproperedgeweightforthegraph?• Howcanwetakeadvantageofthebigdata?
Shortest Path
•Buildagraphbasedonmapandtrafficinformation.•Graphedge weightfortravelcost• Findaroutewiththeminimumtravelcost.
A
B
C
D
E
F
G1
1
2
3
4
2
6
2
Dijkstra'salgorithm:agreedyapproach
•Given a weightedgraphwithnonnegativeedgeweights,Dijkstra'salgorithmfindstheshortestpathbetweennodesinthegraph.
Thecomplexity ofDijkstra’s algorithmis𝑂 𝑒 + 𝑣 log|𝑣| .
A*-Algorithm:aheuristicapproach
• Search withaheuristicguidance
*PicturefromHannahBast etal.RoutePlanninginTransportationNetworks. Arxiv 2015.
Forward Search Bidirectional Search A* Search
Speedup
• SpeedupDijkstra’sAlgorithm
Dijkstra’s Algorithm BidirectionalDijkstra’sAlgorithm GraphContraction
FastShortest Path
ContractionHierarchiesmethodisatechniquetospeedupshortest path routingbyfirstcreatingprecomputed"contracted"versionsoftheconnection graph.Itcanberegardedasaspecialcaseof"highway-noderouting".
•ContractionHierarchies(CH)• Preprocessing:sortingthenodes,addingshortcut,layer-wisecontraction• Query:bidirectionalDijkstra's algorithm,unfolding theshortcut
ComponentReduction
shortcut
•Remove nodes andedgesw/oshortcut.
E C
A
BD
E C
A
BD E C
BD
E C
BD
IsAintheshortestpath?N
Y
OrderforContraction
NodeOrder
• Setting the node orderforcontraction:edgedifference,costofcontraction,uniformity,costofqueries…•Orderforcontrationandshortestpathsearch.
4
2
3
5
1
1
2
3
4
5
Graph Contraction
•Multi-levelcontraction
4
2
3
5
11
2
3
4
5
Bidirectional Search
•BidirectionalDijkstra's algorithm:forwardsearchgoesfromlowordernodetohighordernode,andviceversa.•Unpacktheshortestpath.
Node Order
backwardforward
Guarantee
•Provedbycontradiction
MultipleMetric
•CRP:customizablerouteplanning• Metricindependentprocessing(partition)• Metriccustomization• Query
Partition
…
•Multi-layerpartitiononunweighted graph• Overlaygraph:distancepreservingsubgraph
OverlayGraph
•Threepossiblewaysofpreservingdistanceswithintheoverlaygraph:fullclique,arcreductionandskeleton.
*PicturefromDaniel Delling etal.CustomizableRoutePlanninginRoadNetworks.TransportationScience2017.
Customization
Pruning
*PicturefromDaniel Delling etal.CustomizableRoutePlanninginRoadNetworks.TransportationScience2017.
Query
• Shortestpathinasubgraph:overlaygraph+subgraphincludingODnodes.•Unpacktheshortestpath:Dijkstra’sAlgorithmforeachclique.
*PicturefromDaniel Delling etal.CustomizableRoutePlanninginRoadNetworks.TransportationScience2017.
SummaryofShortest Path Search
*HannahBast etal.RoutePlanninginTransportationNetworks. Arxiv 2015.
ResultsonroadnetworkofWestEurope,usingtraveltimesasedgeweights.(18Mnodesand42.5Medges)
Challenges
TravelDistance TravelSpeed Traffic condition Driver/PassengerPreference
It is verydifficulttopresetproperpenalty weighttorepresentall thosefactorsCanwelearnhumandrivingpatterns?
DataDrivenvsHeuristic
Data Driven:decisionprocess
Approximately,RoutePlanningproblemcanbeseenasadeterministicMDPproblem.Ourgoalistomaximize/minimize customer’sreward/cost
decision
Time cost,distance,
user preferenceetc.
Agent
Environment
Data Driven
Classification/Reinforcementlearning(drivingpolicy)problem
S
S
S
S
V(S)
a
a
a
1. Selection
Data Driven
How to guaranteethe long time planningeffectiveness?
S
S
S
S
S
V(S)
V(S)
a
a
a
a
2. Expand & Evaluate
a
a
Data Driven
S
S
S
S
S
V(S)
V(S)a
a
3. Backtrack
a
a
Q
Q
Data Driven
S
S
a
4. Execute
Data Driven
Learning to Plan the Route
•ModelingTrajectorieswithRecurrentNeuralNetworks• H.Wu,Z.Chen,W.Sun,etal.ModelingTrajectorieswithRecurrentNeuralNetworks.IJCAI2017.
•Route Planning with ReinforcementLearning• T.Weber,S.Racanière,D.P.Reichert,etal.Imagination-AugmentedAgentsforDeepReinforcementLearning.NIPS 2017.
ModelingTrajectorieswithRecurrentNeuralNetworks
•RNN:capturevariablelengthsequence• Similarity&differenceinlanguage/trajectorymodelingusingRNN• Similar:generatewords/edgesstepbystep,dependingonthepresentandpastwords/edges.• Different:thetransitionfromonewordtoanyotherwordisfree,whileonlythetransitionfromoneedgetoitsadjacentedgesispossible.
ModelingTrajectorieswithRecurrentNeuralNetworks
•Goal:useRNNtolearnthetopologicalconstraints.
• Solution:modificationofstate-constrainedsoftmax function.
ModelingTrajectorieswithRecurrentNeuralNetworks
HiddenState
HiddenState
ModelingTrajectorieswithRecurrentNeuralNetworks
Route Planning with Reinforcement Learning
Imagination-Augmented Agent (I2A), which incorporating model-free RL and model-
based RL, improvesdataefficiency,performance,androbustness.
Route Planning with Reinforcement Learning
Observed State
Model-FreeApproaches
Disadvantages• Requireslarge amountsoftraining
data• Resultingpoliciesdonotreadily
generalize tonoveltasksinthesameenvironment
Model-BasedApproaches
Advantages• Endowingagentswitha modelof
theworld• Supportgeneralization tostatesnot
previouslyexperienced
Disadvantages
• Complexdomains hard to buildenvironmentmodels.
• Performancesuffersfrommodelerrorsresulting.
Route Planning with Reinforcement Learning
Environment Model Construction Options• Make assumptions aboutthestructureoftheenvironmentmodelwith domain
knowledge• Trained directlyonlow-levelobservationswithlittledomainknowledge,similarly
torecentmodel-freesuccesses.(e.g., I2A)• PerfectWorldModel
Input:Observed State,Chosen Action à Output:Next State,Rewarde.g.
Left
EM
Down
EMState
Action
Input InputOutput Output
10 + (−1)Reward −1Box reach target 10
Step cost -1Step cost -1
Key of Model-Based method —— EnvironmentModel
Route Planning with Reinforcement Learning
Rollout EncoderRNN - GRU
“Last-mile”Value Presentation
Output Layer of Model-based: Path EncoderRole:EM generate path after taking a specific action via imagination.
Encoder transform the path into the evaluation of selected action.
e.g.
Imagination PathO"𝑂I";), �̂�";)𝑂I";9, �̂�";9
Action: Left
. . .
Realized by RNN
EM EM
Input
Output
Imagination
Route Planning with Reinforcement Learning
• I2A performs better than others • Performance ∝ Imagination Steps
• Diminishingreturnswithmorerollout steps
Route Planning with Reinforcement Learning
Address the challenge• RNN Encodercaptures thesequentialinformation.• Work well even with imperfect EM (learntoignorethelatteraserrorsaccumulate)
Alternative Solution• Learningtonavigateincitieswithoutamap,fromdeepmind.
*PiotrMirowski,etal.Learningtonavigateincitieswithoutamap.Arxiv 2018.
Reference• R.Geisberger etal.Contractionhierarchies:Fasterandsimplerhierarchicalroutinginroadnetworks.In7thWorkshoponExperimentalAlgorithms.LNCSSeries,vol.5038.Springer,pp319-333,2008• Daniel Delling etal.CustomizableRoutePlanninginRoadNetworks.TransportationScience.vol.51(2),pp395-789,2017.• HannahBast etal.RoutePlanninginTransportationNetworks. Arxiv 2015.• HWuetal.ModelingTrajectorieswithRecurrentNeuralNetworks.IJCAI2017.• T.Weberetal.Imagination-AugmentedAgentsforDeepReinforcementLearning.NIPS 2017.• PiotrMirowski etal.Learningtonavigateincitieswithoutamap.Arxiv 2018.
ETA(EstimatedTimeofArrival)
ETA(EstimatedTimeofArrival)
ETAApplications
Route Planning Ride SharingOrder Dispatching Pricing
MainStreamETAModels
• Additivemodels:• Rule-basedadditivemodels:explicitlymodelingthesegmentsinapath• Aggregatingthetimeofsub-paths• Machinelearningmodelsforthesub-pathproblem.
• Globalmodels:• Formulating ETAasaregressionproblem:Learning toEstimatetheTravelTimes(L2ETT)• Simple regression model and deeplearningmodel
• Path-freemodels:• Pathisnotavailable
SimpleAdditive Model
• Simple rules based on physical structure of road network•Popular solutionindigitalmapindustry•Challenges: precisespeedestimationanderror accumulation
AggregatingSub-Paths•Thepathcanbedecomposedintosub-paths.•Timeofeachsub-pathisinferencedfromthehistoricaltrajectories.•ETAisthesummationofthetimeofsub-paths.
*HongjianWangetal.Asimplebaselinefortraveltimeestimationusinglarge-scaletripdata.SIGSPATIAL2016.
TensorDecomposition•Build a 3D tensor (driver, road segment, time slot) andusetensordecompositiontoestimatethetraveltimeofeachsegment.•Use dynamic programing to find the optimal concatenations.
*YilunWang etal.Traveltimeestimationofapathusingsparsetrajectories.KDD2014.
RegressionModelonGPSPoints•DeepregressionmodelonGPSdata• GPSpointswillnotbeavailablebeforetrip inreal-worldcase.
*DongWang,etal.WhenWillYouArrive?EstimatingTravelTimeBasedonDeepNeuralNetworks.AAAI2018.
Time Series Prediction•Conventional regressionmodel:support vector regression•Deeplearningmodel:recurrent neural network
*Chun-Hsin etal.Travel-timepredictionwithsupportvectorregression.IEEETrans.ITS2004.*Yanjie Duan etal.TraveltimepredictionwithLSTMneuralnetwork.ITSC2016.
selected routes
f(t-n),..,f(t-1) à f(t)
f(t-n),..,f(t-m) à f(t)
ProbabilisticETA•Output a distributionof ETA•Consider the variance of travel time•Assume Gaussian distribution for linktraveltime
*MohammadAsghari etal.Probabilisticestimationoflinktraveltimesindynamicroadnetworks.SIGSPATIAL2015.
AdditiveModelvsGlobalModel
Additive model?
Pure learning?
1Heuristic rule
2 Indirective objective
3 Less robust
4 Insufficient use of data
Rethinking ETA
Historical &real-timetraffic data
Segment-wisestatistic andaccumulation
Directestimation byregressionmodel
Statisticalmachinelearning
Learning to Estimate the Travel Time
•Big data + machine learning• high accuracy• datadriven• robustness
Feature extraction
Static/dynamic featureDense featureSparse feature
Raw data
Machine learningmodel
ETAservice
ProblemFormulation• Formulation: ETA as a regression problem on sequential input
•Data:acollectionoftrips𝑿 = 𝑥O OP)Q
• 𝑥O denotesfeaturevectoronthetrajectoryalongtheroutepath 𝑝O• morethan20milliontripsaday
•Objective:forMAPEloss,minV∑ |XY2V(ZY)|
XYQOP)
•Robust:differenttrainingdatafordifferentscenarios• pickingthepassenger• deliveringthepassenger• orderdispatch• carpooling
RichFeature System
Roadnetwork Order GPS
trajectoryTrafficrecord Weather User
profile
Staticfeature
Dynamicfeature
Statistical densefeature
Large scalesparse feature
Personalizedfeature
Spatialpattern
Temporalpattern
Historicalpattern
Real-timepattern
Personalizedpattern
Data
Basic Feature
FeatureEngineering
Output
Twofaces:globalfeaturevssequentialfeature
Raw data
Raw data
……
……
feature
labelTod
RecurrentModel
•ModelsequentialfeatureswithRecurrentNeuralNetwork(RNN).
Tree-BasedModel->Wide&DeepModel
• Features• Spatialinformation• Temporalinformation• Trafficinformation• Personalized information• Augmentedinformation
•W&Dmodelprocessestheglobalinformation
Wide-Deep-RecurrentNetwork(WDR)
*Zheng Wangetal.LearningtoEstimatetheTravelTime,KDD2018.
OfflineEvaluationDataset
OfflineEvaluationResults
OnlineEvaluationResults
WDRachievesthebestperformanceforallmetrics.
Progress
RegressionModel
DeepLearningModel
Low risk of overfitting
Easy to implement
Fast training and inference
Better interpretability
Better performance
sequential information
ModelCompression
High inferencespeed
More portable
More Practical:OriginDestinationETA
Routeisnotavailableuntiltheendofthetrip.
OriginDestinationETA
•Origin-destinationETA(pathfree)
•Challenges• Limitedinformation:noactualpath• Complicatedspatiotemporaldependencies
MURAT:Multi-taskRepresentationLearningforETA
RoadNetworkEmbedding
SpatiotemporalSmoothness
MURAT:Multi-taskRepresentationLearningforETA
TravelDistance
#RoadSegments
#Trafficlights
#Left/rightTurns
HistoricalPaths AuxiliaryTasks
MURAT:Multi-taskRepresentationLearningforETA
*Yaguang Lietal.Multi-taskRepresentationLearningforTravelTimeEstimation,KDD,2018
Evaluation
Beijing60M+trips
NewYorkCity20M+trips
LearnedRepresentation
Timeinday
CircularShape
Smoothtransition
Reference• Corrado deFabritiis etal.TrafficEstimationAndPredictionBasedOnRealTimeFloatingCarData,IEEETITS,2008.• ErikJenelius etal.Traveltimeestimationforurban roadnetworksusinglowfrequencyprobevehicledata,TransportationResearchB,2013• Yilun Wangetal.Traveltimeestimationofapathusingsparsetrajectories,KDD2014• ZhengWangetal.Learningtoestimatethetraveltime,KDD2018.• Yaguang Lietal.Multi-taskRepresentationLearningforTravelTimeEstimation,KDD2018.
MapServiceII
TrafficEstimationandForecasting
Introduction
• Trafficcongestingiswastefuloftime,moneyandenergy• TrafficcongestioncostsAmericans$124billion+ direct/indirectlossin2013.
• Accuratetrafficforecastingcouldsubstantiallyimproverouteplanningandmitigatetrafficcongestion.
DataSource
•Traffic estimation and forecasting• Data source: loop detector, GPS trajectory and camera• Target: traffic speed in real time and for the future
TrafficEstimation
•Calculate theground-truthtrafficdata:multiplesourcedatafusion,spatial-temporalspline.
TrafficPrediction
• Input:roadnetworkandpastT’trafficspeedobservedatsensors•Output:trafficspeedforthenextTsteps
7:00AM 8:00AM
Input:ObservationsOutput:Predictions
...
... 8:10AM,8:20AM,…,9:00AM
ChallengesforTrafficForecasting
Page 112
12/08/2017
ComplexSpatial Dependency
Spee
d (m
ile/h
)
Non-linear, non-stationary Temporal Dynamic
ChallengesforTrafficForecasting
• Spatialrelationshipamongtrafficflowis non-Euclideananddirected
KNN-basedApproaches
• Findsimilarhistoricaltraffictimeseries• Extractvariousfeaturesfromtraffictimeseries• Definesimilaritymetricsbetweentraffictimeseries
•Calculatethepredictionbyaggregatingnexttrafficreadingsofnearestneighbors.
*PicturefromWikipedia.https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
TimeSeriesMethods
• SeasonalAutoregressiveIntegratedMovingAverage(SARIMA)
• SupportVectorRegression(SVR)
* Picture from http://www.saedsayad.com/support_vector_machine_reg.htm
TrafficPredictionwithLatentSpaceModel(LSM)
•Modeltheroadnetworkasagraph•ModelcorrelationbetweenroadsegmentwithLatentSpacemodel
*DingxiongDengetal,LatentSpaceModelforRoadNetworkstoPredictTime-VaryingTraffic.KDD,2016
Defineamodelthatcangeneratethetrafficconditionofroadnetwork
LatentAttributes
•Eachvertexhaslatent attributes• Vertexi haslatentattributevectorui
highway arterial business resident intersection non-inter
i 0.9 0.1 0.8 0.2 1 0
j 0.8 0.1 0.7 0.3 0 1
…
NodeattributematrixUnxk(nisthenumberofvertices)
AttributeInteraction
highway arterial business resident intersection non-inter
highway 40 20 20 40 10 20
arterial …
business …
residential …
intersection …
non-inter …
Attributeinteractionmatrix𝑩 ∈ 𝑹𝒌×𝒌
• InteractionmatrixB betweendifferentattributes
SpeedfromLatentAttribute
•Trafficspeedbetweenvertices𝑖 and 𝑗 (𝑖 → 𝑗) isalinearcombinationofthecorrespondingtrafficpatterns
1
1
40 0
0 101 1 = 50XX
highway business
ui
ujTB
i j?
BasicGraphModel
Non-negativeGraphMatrixFactorization(NMF)[1]
120
U
Graphmatrix:𝑮 ∈ 𝑹𝑵×𝑵 Latentproperties:𝑼 ∈ 𝑹𝒏×𝒌 and𝑩 ∈ 𝑹𝒌×𝒌
UTB
[1]Wang,F.,Li,T.,Wang,X.,Zhu,S.andDing,C.,2011.Communitydiscoveryusingnonnegativematrixfactorization.DataMiningandKnowledgeDiscovery,22(3),pp.493-521.
travelspeed
OverviewofLSMforRoadNetwork
Time
Latent attributes evolve with different timestamps
12
G1
Predict
U1 U2 Ut
Gt G2
Learn
Gt+1G3
LSMforRoadNetwork
122
BasicGraphModel
TransitionEffect
TemporalEffect
OvercometheSparsity
*DingxiongDengetal,LatentSpaceModelforRoadNetworkstoPredictTime-VaryingTraffic.KDD,2016
ExperimentalSettings
•Dataset• MarchandApril,2014sensordatawithmorethan60millionrecords• TwosubgraphsofLosAngelesroadnetwork
•Baselines:• LSM-Naive[Zhangetal.KDD’12]• ARIMA[Williamsetal.TRB’98],ARIMA-SP• SVR[Wuetal.ITS’04],SVR-SP
•Measurement:MAPE,RMSE
ExperimentalResults
5% 10% 15% 20% 25% 30% 35%
MAP
E(%
)
Time (pm)
LSM-Global LSM-Inc ARIMA ARIMA-SP
Non-Rush hour
5% 10% 15% 20% 25% 30% 35%
MA
PE (
%)
Time (am)
LSM-Global LSM-Inc ARIMA ARIMA-SP
Rush hour
Latent space models (LSM) outperform other baselines.
DeepLearningforTrafficForecasting
•TrafficPredictionusingStackedAutoencoder(SAE)togetherwithlogisticregressionontopofthenetworkforsupervisedtrafficflowprediction
* Yisheng Lv et al. Traffic flow prediction with big data: A deep learning approach, IEEE TITS, 2015
DeepLearningforExtremeTrafficPrediction
•MainGoal:forecastingtrafficforextremeincludingrush-hourandpost-accident
Y. Qi et al, Deep Learning: A Generic Approach for Extreme Condition Traffic Forecasting. SDM 2016
DeepMixtureLSTM
•DeepMixtureLSTMisamixtureofLSTMandAutoencoder• LSTM:normalconditiontraffic• Autoencoder:accidentspecificfeatures(time,severity)• Merge:linearregression
Experiments
•Baselines:• ARIMA:AutoRegressiveIntegratedMovingAverage• RandomWalk:constanttimeserieswithrandomnoise.• HistoricalAverage:weightedaverageofpreviousseasons• SupportVectorRegression(SVR):regressionusingSupportVectorMachine
•Data:• Traffic:2,018sensorsfromMay19,2012toJune30,2012• Accident:6,811incidents spreadacross1,650sensors
Rush-HourForecasting
•Weobservealmost50%improvementforthepeakhourtrafficforecast
Post-trafficForecasting
• DeepMixtureLSTMisroughly30%betterthanthebaselinemethods
ConvolutionNeuralNetworkforTrafficForecasting
•Modeltrafficspeedsindifferentlocationsasamatrix(image).•Applyconvolutiontomodelthespatiotemporaldependency.
* Xiaolei Ma et al. Learning traffic as images: a deep convolutional neural network for large-scale transportation network speed prediction, Sensors, 2017
TrafficForecastingwithConvolutiononGraph
• Modelspatialdependencywithproposed diffusionconvolutionongraph
• Modeltemporaldependencywithaugmentedrecurrentneuralnetwork
*YaguangLietal,DiffusionConvolutionalRecurrentNeuralNetwork:Data-drivenTrafficForecasting.ICLR,2018
SpatialDependencyinTrafficPrediction• Spatialdependencyamongtrafficflow
Sensor1 Sensor2
Sensor3
CloseinEuclideanspace
Similartrafficspeed
is non-Euclideananddirected
𝑑𝑖𝑠𝑡%i" 𝑣O → 𝑣j ≠ 𝑑𝑖𝑠𝑡%i" 𝑣O → 𝑣j
SpatialDependencyModeling
•Modelthenetworkoftrafficsensors,i.e.,loopdetectors,asadirected graph• Graph𝓖 = (𝐕,𝑨)• Vertices𝑽:o sensors• Adjacencymatrix𝑨:→ weightbetweenvertices
𝐴Oj = exp −disttuv 𝑣O, 𝑣j
9
𝜎9 ifdisttuv 𝑣O, 𝑣j ≤ 𝜅
disttuv 𝑣O, 𝑣j :roadnetworkdistancefrom𝑣O to𝑣j,𝜅:thresholdtoensuresparsity,𝜎9 varianceofallpairwiseroadnetworkdistances
ProblemStatement•Graphsignal:𝑿𝐭 ∈ ℝ|}|×~,observationon𝓖 attime𝑡• 𝑽 :numberofvertices• 𝑃 :featuredimensionofeachvertex.
•ProblemStatement:Learnafunction𝑔(·) tomap𝑇�historicalgraphsignalstofuture𝑇 graphsignals
… …
𝑿"2��;) 𝑿" 𝑿";) 𝑿";�𝑔 .
SpatialDependencyModeling
•ConvolutionNeuralNetworks*(CNN)learnmeaningfulspatialpatterns• State-of-the-artresultsonimagerelatedtasks
Image
Convolutional Filter
* Y LeCun et al. Gradient-based learning applied to document recognition. Proc. IEEE 1998
GeneralizeConvolutiontoGraph•Diffusionconvolutionfilter:combinationofdiffusionprocesseswithdifferentstepsonthegraph.
MaxMinFilterweight
=𝜃? +𝜃) +𝜃9 +…+𝜃�
0StepDiffusion
1StepDiffusion
2StepDiffusion
KStepDiffusion
ExamplediffusionfilterCenteredat
𝑿:,� ⋆𝒢 𝑓� = � 𝜃� 𝑫𝑶2𝟏𝑨�𝑿:,�
�2)
�P?
Transitionmatricesofthediffusionprocess
Learningcomplexity:𝑂 𝐾
⋆𝒢:diffusionconvolution,𝐷�:diagonalout-degreematrix.
GeneralizeConvolutiontoGraph•Diffusionconvolutionfilter:combinationofdiffusionprocesseswithdifferentstepsonthegraph.
= 𝜃? + 𝜃) + 𝜃9 + … + 𝜃�
0StepDiffusion
1StepDiffusion
2StepDiffusion
KStepDiffusion
ExamplediffusionfilterCenteredat
𝑿:,� ⋆𝒢 𝑓� = � 𝜃�,) 𝑫𝑶2𝟏𝑨�+ 𝜃�,9 𝑫𝑰2𝟏𝑨⊺
�𝑿:,�
�2)
�P?
Dualdirectionaldiffusiontomodelupstreamanddownstreamseparately
⋆𝒢:diffusionconvolution,𝐷�:diagonalout-degreematrix,𝐷�:diagonalin-degreematrix
MaxMinweight
AdvantageofDiffusionConvolution
•Efficient• Learningcomplexity:𝑂 𝐾• Timecomplexity:𝑂 𝐾 𝐸 , 𝐸 numberofedges
•Expressive• Manypopularconvolutionoperations,including theChebNet [Defferrardetal.,NIPS’16],canbeseenasspecialcasesofthediffusionconvolution[Lietal.ICLR’18].
𝑿:,� ⋆𝒢 𝑓� = � 𝜃�,) 𝑫𝑶2𝟏𝑨�+ 𝜃�,9 𝑫𝑰2𝟏𝑨⊺
�𝑿:,�
�2)
�P?
* Defferrard,Metal,ConvolutionalNeuralNetworksonGraphswithFastLocalizedSpectralFiltering,NIPS,2016*YaguangLietal.DiffusionConvolutionalRecurrentNeuralNetwork:Data-drivenTrafficForecasting,ICLR,2018
⋆𝒢:diffusionconvolution,𝐷�:diagonalout-degreematrix,𝐷�:diagonalin-degreematrix
Multi-stepaheadpredictionwithRNN
x��
DCGRU
x��
DCGRU
x��
DCGRU
Previousmodeloutput isfedintothenetwork
x�� x��
ErrorPropagation
x�
x
Modelprediction
Observationorgroundtruth
DCGRU
x)
DCGRU
x9
DCGRU
x:
Teachthemodeltodealwithitsownerror.CurrentTime
ModelTemporalDynamicsusingRecurrentNeuralNetwork
ImproveMulti-stepaheadForecasting•Trafficpredictionasasequencetosequencelearningproblem• Encoder-decoderframework
DCGRU
𝑥)
DCGRU
x9
DCGRU
𝑥:
Encoder
x��
DCGRU
x��
DCGRU
x��
DCGRU
Decoder
<GO> 𝑥� 𝑥�
𝑥�
𝑥
Modelprediction
Observationorgroundtruth
CurrentTime
𝑥� 𝑥� 𝑥�
Backproperrorsfrommultiplesteps.
*Sutskever etal.Sequencetosequencelearningwithneuralnetworks,NIPS2014
Groundtruthbecomesunavailableintesting.
𝛿� 𝛿� 𝛿�
𝑥),𝑥9, 𝑥: → 𝑥�
𝑥), 𝑥9,𝑥: → 𝑥�, 𝑥�,𝑥�
ImproveMulti-stepaheadForecasting• Improvemulti-stepaheadforecastingwithscheduledsampling
DCGRU
x)
DCGRU
x9
DCGRU
x:
x��
DCGRU
x��
DCGRU
x��
DCGRU
<GO>
x�x�� x�x��
Scheduledsampling:Choosetousethepreviousgroundtruthormodelpredictionbyflippingacoin
𝑥�
𝑥
Modelprediction
Observationorgroundtruth
Encoder Decoder
CurrentTime
*Bengio,Samy etal.Scheduledsamplingforsequencepredictionwithrecurrentneuralnetworks.NIPS2015
DiffusionConvolutionalRecurrentNeuralNetwork• DiffusionConvolutionalRecurrentNeuralNetwork(DCRNN)• Modelspatialdependencywithdiffusionconvolution• Sequencetosequencelearningwithencoder-decoderframework• Improvemulti-stepaheadforecastingwithscheduledsampling
* Yaguang Li et al. Diffusion Convolutional Recurrent Neural Network: Data-driven Traffic Forecasting, ICLR 2018
Experiment- Datasets
•METR-LA:• 207trafficsensorsinLosAngeles• 4monthsin2012• 6.5Mobservations
•PEMS-BAY:• 345trafficsensorsinBayArea• 6monthsin2017• 17Mobservations
Experiments
•Baselines• HistoricalAverage(HA)• AutoregressiveIntegratedMovingAverage(ARIMA)• SupportVectorRegression(SVR)• VectorAuto-Regression(VAR)• FeedforwardNeuralnetwork(FNN)• FullyconnectedLSTMwithSequencetoSequenceframework(FC-LSTM)
•Task• Multi-stepaheadtrafficspeedforecasting
ExperimentalResults
•DCRNNachievesthebestperformanceforallforecastinghorizonsforbothdatasets
1.00
2.00
3.00
4.00
5.00
6.00
7.00
15Min 30Min 1Hour
MeanAb
soluteError(M
AE)
METR-LA
HA ARIMA VAR SVR FNN FC-LSTM DCRNN
1.00
1.50
2.00
2.50
3.00
3.50
15Min 30Min 1Hour
MeanAb
soluteError(M
AE)
PEMS-BAY
HA ARIMA VAR SVR FNN FC-LSTM DCRNN
EffectsofSpatiotemporalDependencyModeling
•w/otemporal:removingsequencetosequencelearning.•w/ospatial:removethediffusionconvolution.
1.5
2
2.5
3
3.5
4
4.5
5
15Min 30Min 1Hour
MeanAb
soluteError(M
AE)
DCRNNw/oTemporal DCRNNw/oSpatial DCRNN
Removing either spatial or temporal modeling results in significantly worse results.
METR-LA
RelatedWork
•TrafficPredictionwithout spatialdependencymodeling• Simulationandqueuing theory[Drew1968]• KalmanFilter:[Okutani etal.TRB’83][Wangetal.TRB’05]• ARIMA: [Williamsetal.TRB’98][Panetal.ICDM’12]• SupportVectorRegression(SVR):[Mulleretal,ICANN'97][Wuetal.ITS‘04]• Gaussianprocess[Xie etal.TRB’10][Zhouetal.SIGMOD’15]• Recurrentneuralnetworksanddeeplearning: [Lv etalITS’15][Maetal.TRC’15][YuandLietalSDM’17]
RelatedWork
•TrafficPredictionwith spatialdependencymodeling• VectorARIMA[WilliamsandHoel JTE’03],[Chandraetal.ITS’09]• SpatiotemporalARIMA[Kamarianakis etal.,TRB’03][MinandWynter,TRC’11]• k-NearestNeighbor[Lietal.ITS’12][Riceetal.ITS’13]• LatentSpaceModel[Dengetal.KDD’16]• ConvolutionalRecurrentNeuralNetwork[Maetal.Sensors’17]• DiffusionConvolutionalRecurrentNeuralNetwork[Lietal.ICLR’18]
Reference• [Chandraetal.ITS’09]Chandra,S.R.andAl-Deek,H.,Predictionsoffreewaytrafficspeedsandvolumesusingvectorautoregressivemodels.ITS,2009
• [Dengetal.KDD’16]Deng,D.,Shahabi,C.,Demiryurek,U.,Zhu,L.,Yu,R.andLiu,Y.,LatentSpaceModelforRoadNetworkstoPredictTime-VaryingTraffic.KDD,2016.
• [Drew1968]DonaldRDrew.Trafficflowtheoryandcontrol.Technicalreport,1968.• [Lietal.ITS’12]Li,S.,Shen,Z.andXiong,G.,Ak-nearestneighborlocallyweightedregressionmethodforshort-termtrafficflowforecasting,ITS,2012
• [YuandLietal.SDM’17]Li,Y.,Yu,R.,ShahabiC.,Demiryurek,U.,Liu,Y.,DeepLearning:AGenericApproachforExtremeConditionTrafficForecasting,SDM,2017
• [Lietal.ICLR’18]Li,Y.,Yu,R.,ShahabiC.,Liu,Y.,DiffusionConvolutionalRecurrentNeuralNetwork:Data-drivenTrafficForecasting.ICLR,2018
• [Lv etalITS‘15]Lv,Y.,DuanY.,KangW.,Li,Z.,Wang,F.,TrafficFlowPredictionWithBigData:ADeepLearningApproach,ITS,2015.
• [Maetal.TRC’15]Ma, X., Tao Z.,Wang,Y.,Yu, Y.,Longshort-termmemoryneuralnetworkfortrafficspeedpredictionusingremotemicrowavesensordata,TRC2015
• [Maetal.Sensors’17]Ma,X.,Dai,Z.,LearningTrafficasImages:ADeepConvolutionalNeuralNetworkforLarge-ScaleTransportationNetworkSpeedPrediction,Sensors2017
• [Mulleretal,ICANN'97]Müller,K.R.,Smola,A.J.,Rätsch,G.,Schölkopf,B.,Kohlmorgen,J.andVapnik,V.,Predictingtimeserieswithsupportvectormachines.ICANN,1991
Reference• [MinandWynter,TRC’11]Min,W.,Wynter,L.,Real-timeroadtrafficpredictionwithspatio-temporalcorrelations,RTC,2011
• [Okutani etal.TRB’83]Okutani,I,andY.J.Stephanides.DynamicPredictionofTrafficVolumeThroughKalmanFilteringTheory.TransportationResearchB,1984
• [Panetal.ICDM’12]Pan,B.,Demiryurek,U.andShahabi,C.,Utilizingreal-worldtransportationdataforaccuratetrafficprediction.ICDM,2012
• [Wangetal.TRB’05]Wang,Y.,Papageorgiou,M.,Real-timefreewaytrafficstateestimationbasedonextendedKalmanfilter:ageneralapproach.TransportationResearchPartB:Methodological,2015
• [WilliamsandHoel JTE’03]Williams,B.M.,Hoel,L.A.,ModelingandforecastingvehiculartrafficflowasaseasonalARIMAprocess:Theoreticalbasisandempiricalresults.Journaloftransportationengineering,2003
• [Wuetal.ITS‘04]Wu,C.H.,Ho,J.M.andLee,D.T.,Travel-timepredictionwithsupportvectorregression.IEEEtransactionsonintelligenttransportationsystems,2004
• [Xie etal.TRB’10]Xie,Y.,Zhao,K.,Sun,Y.andChen,D.,Gaussianprocessesforshort-termtrafficvolumeforecasting.TransportationResearchRecord,2010
• [Zhouetal.SIGMOD’15]Zhou,J.andTung,A.K.,2015,May.Smiler:Asemi-lazytimeseriespredictionsystemforsensors.InProceedingsofthe,ACMSIGMOD,2015
DecisionService
IntelligentOrder-DispatchingSystem
OrderandDriverForecasting ALarge-ScaleDistributedComputing
PlatformEfficiencyandCustomerExperienceOptimization
MapService
RoutePlanningTheCoreofDispatching
TominimizecostTomaximizedriverefficiencyTooptimizetransportationefficiency
ETA(EstimatedTimeofArrival)
Toestimatethetravelingtimeandthewaitingtimeofeachride
Order-DispatchingMatrix
DiDiPremier
DiDiExpress
DiDiTaxi
OrdersforTaxiOrdersforPremier OrdersforExpress
DiversifiedOrders
TransportCapacity
BipartiteGraphMatching
Hungarianmatchingalgorithm (alsocalledtheKuhn-Munkres algorithm)
MDPFormulationforDispatching
ReinforcementLearningDispatching•Everydispatchingdecisionaffectsfuturesupply(drivers)distribution•Maximizedrivers’collectiveincomethroughoptimizeddispatching,whileensuringgoodcustomerexperience
Reinforcement Learning•Focusonlong-termreward(e.g.oneday)•Considerfutureimpactofcurrentdecision•Valuefunctionisa keyquantitytolearn
Q(s,a)
Method
CombiningReinforcementLearningandCombinatorialOptimization
Online Planning Step Offline Learning Step
…
Matching Value
Instant Reward
Future State-Value+
=
Value Functions
Historical data
Xuetal.,KDD2018
MDPDefinition
EachorderdispatchaccordstoadecisionmadebytheplatformFormulateorderdispatchintoaMarkovDecisionProcess(MDP)Optimalpolicygeneratestheoptimalrevenueoftheplatform–bysatisfyingmorerequestsandmaximizingdriver’sincome
• State�Time&location
• Action�order-driverpair• S:Driver’scurrentstate• S’:Order’sactual
finishing state,distributedneartheestimatedfinishingstateby!""#$
• R:Order’sprice�distributedneartheestimatedpriceby%"$
agent (driver)
order start
order finish
&''#(
pickup
delivery
Agent:ADriver
Episode:Awholedayfrom04:00to04:00(+1)
State:Time&Spacelocation
Action:Pickupanorderordonothing
Reward:Driver’sIncome(orderprice)
ValueFunction:TheexpectedfutureincomeofadriverinastateS
Motivation
max𝑉¢ 𝑠 = 𝐸¢[𝑅";)+ 𝛾𝑅";9+ ⋯|𝑆" = 𝑠]
Learning– PolicyEvaluation
𝑇?
𝑺𝟎(𝑻𝟎,𝑿)
𝑇)
𝑺𝟏(𝑻𝟏,𝑿)
𝑇9 𝑇:
𝑺𝟐(𝑻𝟑,𝒀)
Vacant:V¢ 𝑆? ← 𝑉¢ 𝑆? + 𝛼(0 + 𝛾𝑉¢ 𝑆) − 𝑉¢(𝑆?))Serving:V¢ 𝑆) ← 𝑉¢ 𝑆) + 𝛼(𝑅 + 𝛾9𝑉¢ 𝑆9 − 𝑉¢(𝑆)))
Dynamic Programming
Planning– AdvantageFunction
Linkweight:
Q s, a→ A(s, a)
= R.+V s’ – V(s)
Order’svalue
Valuefunctionofdriver’scurrentstate
Valuefunctionofdriver’sexpectedfinishingstate
DeepRLforDispatching
DeepReinforcementLearning
moreadaptivetoreal-timesupply-demand contextchanges
weightssharingamonginputs:location,time,destination,context- bettergeneralization
。。。
。。。
facilitateslearningfrommultiplecitiesandtimes
Wangetal.,ICDM2018
DeepQ-networkwithactionsearch
• Usehistoricaltripsdataastrainingtransitions.
• Eachtripxdefinesatransitionoftheagent’sstate
(s0,a,r,s1):• Currentstates0:=(l0,t0,f0),location,time,and
contextualfeatures• Action:theassignedtrip;• Next-states1:=(l1,t1,f1)• Reward:thetotalfeecollectedforthetrip.
• Implicitlyconsiderpick-uptime,tripdurationrelativetoreward
q Trainingdata
q Actionsearch
q Expandedactionsearch
DeepQ-networkwithactionsearch
q Trainingdata
q Actionsearch
q Expandedactionsearch
• Constructanapproximatefeasiblespacefortheactionsfor
computingmax·�∈¸
Q(s), a�) forthetargets
• Insteadofsearchingthroughallvalidactions,wesearchwithin
thehistoricaltripsoriginating fromthevicinityofs:
• Thesamesearchprocedureisusedforevaluation,wherewe
simulatethedriver’strajectoryduringthedayusinghistorical
tripsdata.
trip. Random sampling from the action space is thus not appropriate. Hence, we developed anapproximation scheme for computing this term by constructing an approximate feasible space for theactions, ˜A(s). This notation makes explicit the dependency of the action space to the state s wherethe search starts. Instead of searching through all valid actions, we search within the historical tripsoriginating from the vicinity of s:
˜A(s) := {xs1 |x 2 X , B(xs0) = B(s)}, (3)
where X is the set of all trips, and B(s) is discretized spatio-temporal bin that s falls into. For spatialdiscretization, we use the hexagon bin system, where in our case here, a hexagon bin is representedby its center point coordinates. xs0 is the s
0
component of the trip x. Certainly, the larger the searchspace, the more computation is required for evaluating the value network at each action point. Wehave made the number of actions allowed in the action search space a tuning parameter and dorandom sampling without replacement if necessary. The same search procedure is used for policyevaluation, where we simulate the driver’s trajectory during the day using historical trips data.
3.3 Expanded action search
Due to training data sparsity in certain spatio-temporal regions, e.g. some remote area in earlymorning, the above action search may return an empty set. In this case, we perform an expandedaction search in both spatial and temporal spaces. The first search direction is to stay at the lastdrop-off location and wait for a period of time, which corresponds to keeping the l-component, slconstant and advancing st, till one of the following happens (s0 is the searched state.): 1) ˜A(s
0) is
non-empty. We return ˜A(s
0). 2) A terminal state is reached. We return the terminal state. 3) s0t
exceeds the wait-time limit. We return s
0. The second search direction is through spatial expansionby searching the neighboring hexagon bins of s in a layered manner. See Figure 2 for an illustration.For each layer L of hexagon bins, we search within the appropriate time interval to take into accountthe travel time required to reach the target hexagon bin from s. The travel time estimation can beobtained from a map service, for example, and is beyond the scope of this paper. We denote the layerL neighboring spatio-temporal bins of s by B
0(s, L) and the set of historical trips originating from
any of the bins in B
0(s, L) by
˜A0(s, L) := {xs1 |x 2 X , B(xs0) 2 B
0(s, L)}. (4)
We stop increasing L when ˜A0(s, L) is non-empty and return ˜A0
(s, L). Otherwise, we returnB
0(s, L
max
), i.e. the hexagon bins’ center points and their associated time components. Algorithm3.1 summarizes the full action search.
3.4 Terminal state values
From the definition of a terminal state in Section 2, it is clear that Q(s, a) with st near the endof the episode horizon should be close to zero regardless sl. Following the idea of the dynamicprogramming algorithm, we add transitions with s
1
being a terminal state to the replay buffer at thevery beginning of training. We find that it helps getting the terminal state-action values right earlyin training. This is important because the target values for the states s
0
’s in the mini-batch updatesof DQN are computed through bootstrapping on the values of states that are temporally after them.Since the training samples with a terminal state form a very small percentage of the entire data set, auniform sampling would cause the values of many states far away from the terminals to be supervisedwith the incorrect targets, hence slowing down the learning process.
3.5 Experience augmentation
The original training data is the experience generated by the given historical policy, which may notexplore the trajectory space well and, in particular, contains very few transitions that require longwaiting time or repositioning without a passenger before a trip starts. Such transitions are typicalexperiences when the driver is at a state where historically there were few trips originating fromthere. If the agent is only trained on the original trips data, it would not learn to make good decisionsshould the driver go into a rare state in the future. The way we mitigate this problem is to supplementthe original training experience with transitions obtained through action search. Specifically, foreach time bin, we randomly sample a set of locations within the geographical boundary under
3
DeepQ-networkwithactionsearch
q Trainingdata
q Actionsearch
q Expandedactionsearch
• Duetotrainingdatasparsityincertainspatio-temporal
regions,weperformanexpandedactionsearchinboth
spatialandtemporalspaces.
Algorithm 3.1 Spatio-temporal action search1: Given s = (l
0
, t
0
); t0 is search time limit.2: A {}3: T
max
min(T, t
0
+ t
0)
4: if ˜A(s) 6= ; then5: A A+
˜A(s)
6: else7: for t = t
0
, t
0
+ 1, · · · , Tmax
do8: s
0 (l
0
, t)
9: if ˜A(s
0) 6= ; then
10: A A+
˜A(s
0)
11: break12: end if13: end for14: if A did not change from line 6 then15: A A+ s
0
16: end if17: for L = 1, · · · , L
max
do18: if B0
(s, L) 6= ; then19: A A+
˜A0(s, L)
20: break21: end if22: end for23: if A did not change from line 17 then24: A A+B
0(s, L
max
)
25: end if26: end if27: return A
Figure 2: Action search. The red circle lines cover the first two layers of neighboring hexagon binsof l
0
. The arrows represent searches to hexagon bin centered at l1
(first layer) at time t
1
> t
0
andhexagon bin centered at l
2
at time t
2
> t
2
. The spatio-temporal bins covered by the inner red circleare B
0((l
0
, t
0
), 1)..
4
Trainingformultiplecities
Dispatchingsystemsupportsalargenumberofcities§ Computationallyintensive§ Diversesupply-demandsettings
Knowledgetransfer§ Leveragecommonproperties:e.g.rush-hourtrafficpattern§ Improvelearningefficiency
Wangetal.,ICDM2018
TransferLearning
Idea§ Re-useweights§ Betterinitialsolution
ExistingMethods§ Fine-tuning§ Progressive:lateralconnections
betweensourcecitynodesandtargetcitynodes
CorrelatedFeatureProgressiveTransfer(CFPT)
Spatio-temporaldisplacement§ Involverelativetransitionpatternlikedistanceandtimecost
Generalonlinefeatures§ Numberofidledrivers§ Numberoforderscreated§ Averagepick-uptimeforpassengers§ …
ExperimentsTrainingdata:onemonthofExpressCartripdataSingle-drivertestenvironment
DQNv.s.policyevaluation§ Policyevaluation:max-Q->mean-Qinmini-batchupdates
Transferlearningv.s.no-priorDQNSpatialtransfer
§ SourcecitytotargetcityTemporaltransfer
§ Samecity:onetimeperiodtoanother City Size RegionA Large NorthernB Small SouthernC Medium SouthernD Large Western
Results
DQNv.s.policyevaluation§ Optimizationhelps
§ Smallercitieswithsimplerpatternand
fewertransitionshavesmaller
advantages
ResultsNotransferv.s.transfer
Spatialtransfer§ Robustness§ Betterinitialsolution§ Higherconvergedcumulativerewards
ResultsTemporaltransfer
DynamicRideSharing
DynamicRideSharingrequest Aroute request B
𝑣9(0,2)
𝑣º(2,4)
𝑣:(3,0)
𝑣½(5,2)
𝑣�(6,5)
𝑣�(10,2)
𝑣�(8,4)5
5
7 8
4
5
5 3
35
𝑣)(0,3)
1
𝐵
𝐴
Pick A
Pick B
Drop A
Drop BDrop B
• Limitations• targetonsingleobjective• relyoninefficient insertion,O(n2)orO(n3)
• Contributions• aunifiedcostfunction,generalizethreemainobjectives• dynamicprogrammingbasedinsertion,lineartime
Drop A
*YongxinTongetal.AUnifiedApproach toRoutePlanning forSharedMobility,VLDB,2018.
UnifiedProblemDefinition• Givenasetofdrivers,asetofrequestsdynamicallyarrived,theplatformaimstoplanrouteofeachdriverforservingtherequeststominimizetheunifiedcost.• UC W,R = α Ç ∑ D(SÊ)�
Ê∈Ì + ∑ pÍ�Í∈ÎÏ
• α:weight,pÍ:penaltyforrejectingtherequest
• Ð α = 1pÍ = ∞ ⟹
• Ðα = 0pÍ = 1 ⟹
• Ð α = unitpaymentpÍ = fareoftherequest⟹
� 最大化平台收益
MinimizeTotalDistance
MaximizeServedRequest
MaximizeTotalRevenue
*YongxinTongetal.AUnifiedApproach toRoutePlanning forSharedMobility,VLDB,2018.
CoreOperationinRidesharing:Insertion
• Givenaworkerwithcurrentroute,anewrequestisinserted intocurrentroutewithminimalincreaseddistance,whileordersofoldrequestsremainthesame.
PickA DropAPickBC
DropBDropC
PickD
DropD
oldroute newrequest
BeforeInsertion AfterInsertion
PickA DropAPickBC
DropBDropC
PickD
DropD
oldroute newroute
*YongxinTongetal.AUnifiedApproach toRoutePlanning forSharedMobility,VLDB,2018.
DynamicProgrammingbasedInsertion
• EventhoughthereareO(n2)pairs,insertion canbeimplementedbydynamicprogramminginlineartime.
Dio j =∞, ifpicked j − 1 > KÊ −KÍDio j − 1 , if det lÚ2), oÍ, lÚ > slack[j − 1]min Dio j − 1 , det lÚ2), oÍ, lÚ , otherwise
Plc[j] =
NIL, ifpicked j − 1 > KÊ − KÍPlc j − 1 , if det lÚ2), oÍ, lÚ > slack[j − 1]Plc j − 1 , ifDio j − 1 < det(lÚ2), oÍ, lÚ)j − 1, ifDio j − 1 ≥ det(lÚ2), oÍ, lÚ)
DynamicProgramming
*YongxinTongetal.AUnifiedApproach toRoutePlanning forSharedMobility,VLDB,2018.
O(n2)pairs oldroutenewrequest
Reference• [Tong et al. VLDB’18] Tong, Y., Zeng, Y., Zhou, Z., Chen, L., Ye, J., Xu, K., A Unified Approach
to Route Planning for Shared Mobility, VLDB, 2018.• [Tong et al. SIGMOD’18] Tong, Y., Wang, L., Zhou, Z., Chen, L., Du, B., Ye, J., Dynamic
Pricing in Spatial Crowdsourcing: A Matching-Based Approach, SIGMOD, 2018.• [Tong et al. VLDB’17] Tong, Y., Wang L., Zhou Z.,Ding. B., Chen L., Ye J., Xu K., Flexible
Dynamic Task Assignment in Real-Time Spatial Data, VLDB, 2017.
SupplyandDemandForecasting
SupplyandDemandForecasting
Tongetal.,KDD2017
SupplyandDemandForecasting
DeepST or ST-ResNet
• Spatial proximity: extract features from spatially proximate neighborhoods
• Temporal feature: temporal proximity and periodicity (n-weeks ago, n-days ago, n-hours ago)
Zhang,etal, AAAI2017
SupplyandDemandForecasting
Yao,etal, AAAI2018
Deep Multi-View ST Network
• Spatial proximity: extract features from spatially proximate neighborhoods
• Temporal feature: Recurrent Neural network
• Semantic similarity: build semantic graph followed by embedding
SmartTransportation
DataCollection,StateEstimation,andPrediction
10
20
30
40
0
Time
2018-4-18Time MsgType Msg16:47:39 TrafficSign NoParking16:47:39 TrafficSign SpeedLimit60km/h16:47:39 TrafficSign BusLane16:47:51 TrafficSign NoLeftTurn16:48:02 Congestion StopinQueue
Shutong St
Shuhan Rd
Tongde St
0 25 50 75 100
Space
Time-spacediagram
……
PerformanceEvaluationandDiagnosisImbalance
Spillover
Underutilization
Everymovementattheintersection
Imbalance UnderutilizationSpillover Normal
RealTimeTrafficControlReal-timedataprocessingwithdistributedcomputing
Trafficstatesreconstruction
Integratedrampmeteringandtrafficsignalcontrol
Datafusion-mobilesensing+fixedlocationdetection
Machinelearningbasedcontroller
IntegratedSolution- SmartTransportationBrain
Path GuidanceLane Control Signal TimingSmart Dispatch Network Optimization
GovernmentData
IndustryData
CrowdsourceData
DiDi Data
Data Fusion
Simulation
CloudComputation
AI
MeasureSystem
ControlCenter
AnalysisCenterDataCenter
SmartTransportationBrain
Improvingtrafficconditionsinover20cities
WuhanSmartTrafficLights 170+VariableMessageSigns:11
ChengduSmartTrafficLights 120+
ShenzhenSmartTrafficLights 10VariableMessageSigns:2
GuangzhouSmartTrafficLights 110+
SuzhouSmartTrafficLights 160+
NanjingSmartTrafficLights 18
JinanSmartTrafficLights 340+VariableMessageSigns:82
1300+
10�20%
ElectricVehicle
Motivations
• 400,000 electriccarsoperatingonDiDi’s platform
LessEmissionBrighterFuture!
>1.7millionnewenergy vehicles inChina(theworld'slargest)
• Chargingstation• Batterymanagingsystem(BMS)• Dispatchstrategy
Supportingthenewenergyvehicles
ChargingStation
RichTraceData
ChargingStationLocations
StationCapacities
InteractiveMap
SpatialClustering
DemandPrediction
Product
Supply-demand Pilesneeded
PotentialChargingStations
BMS(ConsumptionPrediction)
Driving StyleVehicleinformation Weather
RoadCondition
1. Power output2. Weight3. …
1. Sunny/rainy2. Temperature3. …
1. Acc/break freq2. Max speed3. …
1. Even?2. Up/down hill3. …
Unified Scale
Difference Model
PredictionMAPE distribution
Counts
truthpredicted
Asampletracefitting
DispatchingStrategy
Mydestinationis10kmaway
Ihavetochargenow
Iam100%full
Ihave30%battery
Iamgoingtoaplacenearchargingstation
Iamgoing50kmaway
Adaptingtheuniquecharacteristicofnewenergyvehicles
DiDi Brain
AIforSocialGood
AIforSocialGoodIntelligentMedical Hardware
Barrier-Free
Disease Warning
EnvironmentalSustainability
UrbanAirQuality Monitoring
Health
NeuralNetworks
Reinforcement Learning
CallforParticipant
DeepLearning
EVChargingPilesLocation
MachineLearning
Part 3:DataandToolsforTransportationAI
OpenDataset
KDDCup2017HighwayTollgatesTraffic
FlowPrediction
GAIAOpenDatasetTrajectoryData
Uber Movement
FederalHighwayAdministrationNextGenerationSimulation (NGSIM)Program
Public Data
TheGAIAInitiativeaimstoestablishstronglinksbetweenDiDi andacademia,andfacilitatedata-drivenresearchintransportation.Itprovidesaplatformfortheacademiatoaccessanonymizeddatafromreal-lifeurbanscenarios.
Open InnovativeCollaborative
Introduction
DailyRides
30 Million
DataProcessed
4875TB+
RoutingRequests
40 Billion 100TB+
New Data
Every Day
BigData
Real
High Accuracy
Full Sample
Vehicle Trajectory DataFrom: Chengdu andXi’an, China
Sharing AnonymizedData
01 02 03
Register
VisittheGAIAInitiativewebsite:GAIA.didichuxing.com/en andclick”ApplyNow”toregister.
Verification &Review
IntheDataCenter,click“DataAccessApplication”.Readtheuseragreementcarefullyandfillouttheapplicationformwithyourinfo,thensubmittheformforverification.
Access Data
Onceyourapplicationisapproved,DiDi willsendyouanemailwithinstructionsfordataaccess.Pleasecheckyouremail.
Process
Data
DiDi Academia
Scenarios
ToRedefinetheFutureofMobility
Outreach
GAIA.didichuxing.com/en