OPUS: Optimising the Use of Partial Information in Urban and Regional Systems, Professor John Polak, Imperial College London, Katalysis, Systematica


Page 1

OPUS

Optimising the Use of Partial Information in Urban and Regional Systems

Professor John Polak

Imperial College London

Katalysis Systematica

Page 2

Partners

• Imperial College London (UK)

– Centre for Transport Studies (Coordinator)

– Epidemiology and Public Health

• Transport for London (UK)

• Katalysis Ltd. (UK)

• Swiss Federal Institute of Technology (CH)

• Facultés Universitaires Notre-Dame de la Paix (B)

• Systematica s.r.l. (I)

• PTV (D)

• World Health Organisation (I)

Page 3

Background

• Increasingly complex urban systems – cannot be completely observed

• Yet increasing need for comprehensive information

• Use data from many different sources – sample surveys, census, operational data streams, data captured from IST systems etc.

• Particular issues related to urban transport, health and environment (temporal & spatial)

• How should we combine these data?

Page 4

Specific instances

• Estimation of person trip rates from household surveys and vehicle counts/onboard surveys

• Estimation of congestion from conventional traffic surveys, number plate recognition data and GPS/GSM tracking data

• Estimation of population exposure to environmental hazards from data on population, traffic flows and vehicle emissions

• Estimation of employment from official statistics, energy use, waste generation etc.

Page 5

Objectives

• Develop statistical framework for optimal combination of complex spatial and temporal data

• Develop substantive models for the estimation of indicators of urban/regional mobility

• Develop metadata, databases and estimation software for major applications in London and Zurich

• Undertake feasibility studies in related transport and health domains

• Organise active dissemination and outreach

Page 6

Methodology

• Statistical models

– Structural models (existing domain theory)

– Measurement models (sampling and non-sampling variation)

– Estimation framework and software (Bayesian)

• Metadata

– Consistent description of data

– Process metadata descriptions of models (via XML)

• Statistical databases

– Integrated treatment of data, models, and outputs via object-oriented methods

Page 7

OPUS work structure

• Management and coordination (WP1)

• Statistical and database theory (WP2, WP3)

• Application domain theory and methods (WP4, WP5)

• Application software and tools (WP6, WP7)

• Case studies (WP8, WP9): London, Zurich

• Feasibility studies (WP10, WP11): Health, Transport

• Evaluation (WP12)

• Dissemination (WP13)

Page 8

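Framework (1)

[Figure: a real world process (a pair of parameters subscripted T) generates the world ‘today’ (a quantity subscripted N), which is observed by direct measurement: Measurement 1, Measurement 2, …, Measurement n yield X1, X2, …, Xn.]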

Page 9

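Framework (2)

[Figure: a real world process (a pair of parameters subscripted T) generates the world ‘today’ (a quantity subscripted N), which is observed by indirect measurement: Measurement 1, Measurement 2, …, Measurement n yield Y1, Y2, …, Yn through complex interaction with other quantities as captured in existing domain models.]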

Page 10

Framework (3)

• Given

– direct measurements (X1, X2,…,Xn) and

– indirect measurements (Y1,Y2,…,Yn)

• And assumptions/models describing

– natural variation

– the errors associated with different measurement processes

– the structural relationships between direct and indirect measurements and underlying quantities

• Estimate

– The most likely values of the parameters (the pair subscripted T) of the underlying process
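One way to write this estimation problem down, offered as a sketch rather than as the project's own formulation: the symbol \theta_T is a stand-in for the parameters of the underlying process (the slide's original symbols are not recoverable here), and the measurements are assumed to be conditionally independent of one another given \theta_T. Under those assumptions the posterior factorises, and "most likely values" can be read as the posterior mode.

```latex
p(\theta_T \mid X_1,\dots,X_n,\, Y_1,\dots,Y_n)
  \;\propto\;
  p(\theta_T)\,
  \prod_{i=1}^{n} p(X_i \mid \theta_T)\,
  \prod_{j=1}^{n} p(Y_j \mid \theta_T),
\qquad
\hat{\theta}_T \;=\; \arg\max_{\theta_T}\; p(\theta_T \mid X_1,\dots,X_n,\, Y_1,\dots,Y_n)
```

Here p(\theta_T) carries the assumptions about natural variation, the p(X_i \mid \theta_T) terms carry the direct measurement error models, and the p(Y_j \mid \theta_T) terms carry both the structural relationships and the indirect measurement error models.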

Page 11

Modelling approach (1)

• The key to combining data is to be able to specify the joint probability distribution of all the parameters of interest

• The joint pdf allows us to compute the likelihood of the partial observations, conditional on the underlying parameters

• But in most cases the joint pdf is not easily available

– It may only be defined implicitly (e.g. by a model system)

– For any non-trivial model, it is usually very complex

Page 12

Modelling approach (2)

• Capture existing domain knowledge (structure and measurement) in the form of a graphical model (Bayesian Belief Network)

• Exploit conditional independence (when it exists) to develop appropriate MCMC samplers for the underlying joint pdf (pragmatic Bayesianism)

• WinBUGS/OpenBUGS is useful, but not necessarily always sufficient

– Most useful problems are very large scale

– Conditional independence is not always available (equilibrium systems)
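To make the idea of an MCMC sampler built from conditional distributions concrete, the following is a minimal textbook Gibbs sampler, not the sampler developed in OPUS. It assumes a toy model in which repeated measurements of one quantity are Normal with unknown mean and variance; the data values, priors and function name are all hypothetical. Each step draws one unknown from its full conditional given the other, which is the same mechanism a graphical model exploits at larger scale.

```python
import random


def gibbs_normal(data, n_iter=5000, burn_in=1000,
                 m0=0.0, s0_sq=1e6, a0=0.01, b0=0.01, seed=1):
    """Gibbs sampler for y_i ~ Normal(mu, sigma_sq).

    Assumed priors (illustrative only):
      mu ~ Normal(m0, s0_sq), 1/sigma_sq ~ Gamma(shape=a0, rate=b0).
    Returns the post-burn-in list of (mu, sigma_sq) draws.
    """
    rng = random.Random(seed)
    n = len(data)
    ybar = sum(data) / n
    mu, sigma_sq = ybar, 1.0  # starting values
    draws = []
    for it in range(n_iter):
        # Draw mu from its full conditional given sigma_sq (Normal).
        prec = 1.0 / s0_sq + n / sigma_sq
        cond_mean = (m0 / s0_sq + n * ybar / sigma_sq) / prec
        mu = rng.gauss(cond_mean, prec ** -0.5)
        # Draw the precision from its full conditional given mu (Gamma).
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        ss = sum((y - mu) ** 2 for y in data)
        tau = rng.gammavariate(a0 + n / 2.0, 1.0 / (b0 + 0.5 * ss))
        sigma_sq = 1.0 / tau
        if it >= burn_in:
            draws.append((mu, sigma_sq))
    return draws


# Hypothetical repeated measurements of a trip rate.
data = [9.8, 10.4, 10.1, 9.6, 10.3, 10.0, 9.9, 10.2]
draws = gibbs_normal(data)
mu_mean = sum(m for m, _ in draws) / len(draws)
print("posterior mean of mu:", round(mu_mean, 3))
```

With a nearly flat prior on the mean, the posterior mean should sit close to the sample mean of the data; the spread of the draws gives the uncertainty that a single point estimate would hide.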

Page 13

Illustrative example (1)

• Suppose we have a set of nodes (trip origins and destinations) linked by a transport network

• We wish to estimate the intensity of trip making from different locations based on:

– sample household surveys (direct)

– counts of flows on selected links on the network (indirect)

• Used separately, each type of survey typically gives different estimates (e.g., up to 30% difference in London in 1991)

Page 14

Illustrative example (2)

• Routeing in the network is given by a path-link incidence matrix A, such that

Y = AX

where Y are link flows and X are trip intensities. A is provided by an existing transport model (e.g., VISUM)

• Both X and Y can be assumed to be subject to sampling and non-sampling variation
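The relation Y = AX can be illustrated with a very small made-up network. The matrix and flow values below are hypothetical and are not taken from VISUM or from the OPUS case studies; rows of A are links, columns are trip intensities, and an entry of 1 means that the trips in that column use that link.

```python
# Hypothetical incidence matrix: 3 links (rows) by 2 trip intensities (columns).
A = [
    [1, 0],  # link 1 carries flow 1 only
    [1, 1],  # link 2 carries both flows
    [0, 1],  # link 3 carries flow 2 only
]

# Hypothetical trip intensities X (trips per hour).
X = [120.0, 80.0]


def link_flows(incidence, intensities):
    """Return Y = A X as a list, one entry per link."""
    return [
        sum(a_ij * x_j for a_ij, x_j in zip(row, intensities))
        for row in incidence
    ]


Y = link_flows(A, X)
print("link flows:", Y)
```

Going the other way, from observed Y back to X, is the hard direction: counts are usually available on only some links, and both sides carry error, which is why the slides treat it as a statistical estimation problem.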

Page 15

Illustrative example (3)

• Likewise A can be assumed to have uncertainty, representing uncertainty in (transport) domain model knowledge

• The BBN allows the consistent propagation of uncertainties

• A special MCMC sampler has been developed to efficiently sample from the posterior of X given observations on Y (requires QR factorisation of A to accommodate network topology)

• Method is scalable to large problems
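The sketch below shows what "the posterior of X given observations on Y" means in the simplest possible setting. It is not the OPUS MCMC sampler and uses no QR factorisation: it swaps in exact Gaussian conditioning, assuming A is known without error, the survey estimate of X is Gaussian, and a single link count has Gaussian error. All numbers and the function name are hypothetical.

```python
def update_with_link_count(mean, cov, a, y, noise_var):
    """Condition X ~ Normal(mean, cov) on one count y = a.X + e,
    with e ~ Normal(0, noise_var). Returns (posterior mean, posterior cov).
    """
    n = len(mean)
    # cov times the routing row a
    cov_a = [sum(cov[i][j] * a[j] for j in range(n)) for i in range(n)]
    # Count predicted by the survey estimate, and its total variance.
    y_pred = sum(a[i] * mean[i] for i in range(n))
    s = sum(a[i] * cov_a[i] for i in range(n)) + noise_var
    gain = [cov_a[i] / s for i in range(n)]
    new_mean = [mean[i] + gain[i] * (y - y_pred) for i in range(n)]
    new_cov = [[cov[i][j] - gain[i] * cov_a[j] for j in range(n)]
               for i in range(n)]
    return new_mean, new_cov


# Hypothetical survey-based estimate of two trip intensities (direct data).
prior_mean = [100.0, 50.0]
prior_cov = [[400.0, 0.0],
             [0.0, 100.0]]

# Hypothetical count on one link used by both flows (indirect data).
a_row = [1.0, 1.0]
count = 180.0
count_var = 25.0

post_mean, post_cov = update_with_link_count(
    prior_mean, prior_cov, a_row, count, count_var)
print("posterior mean:", [round(m, 2) for m in post_mean])
print("posterior variances:", [round(post_cov[i][i], 2) for i in range(2)])
```

The count is higher than the survey implies, so both intensities are revised upwards, with the less certain one moving more, and both variances shrink. Once A itself is uncertain or the error models are not Gaussian, this closed form no longer applies, which is where a sampling approach of the kind described on the slide becomes necessary.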

Page 16

Software approach

• Use R as basic platform for implementation

• R calls existing domain modelling systems (VISUM) and relevant MCMC sampler

• Results returned to R for visualisation and export to statistical database

• In future, these functionalities may be embedded directly in the statistical database

Page 17

Conclusion

• OPUS has been a large and ambitious project

• Methodological contributions

– Innovative application of Bayesian statistical methods

– Innovative application of OO database and process metadata methods

• Policy relevance

– Improved quality of information on regional and urban transport/land use systems

– Ability to better quantify and disclose uncertainty in key parameters

Page 18

Contact

Prof. John Polak

Director, Centre for Transport Studies

Department of Civil and Environmental Eng.

Imperial College London

London SW7 2AZ

T: +44(0)20-7594-6089

F: +44(0)20-7594-6102

E: [email protected]

W: www.opus-project.org