Download - Development of Dynamic Census - LIRNEasialirneasia.net/wp-content/uploads/2016/07/DynamicCensus_13July2016.… · Development of Dynamic Census: ... hourly basis, e.g. working male,


Development of Dynamic Census: Estimating demographics and trajectories of actual populations in Bangladesh using CDR data

University of Tokyo, Shibasaki & Sekimoto Lab.

Dynamic Census Development Team: Ayumi Arai*, Apichon Witayangkurn, Hiroshi Kanasugi, Zipei Fan, Ryosuke Shibasaki

What is CDR data?

CDR data = Call Detail Records

What the data look like: each record gives the starting time of a call and the location of the antenna.

[Figure labels: Localization, Trajectory]

CDR data can provide partial views of large-scale human mobility and distribution.


Motivation
• Population statistics are important for activities in both the private and public sectors. But are these enough for understanding human activity?
• CDR data are useful for understanding human mobility. But:
  • Interpretation of analysis results may be misleading if CDRs represent only a limited part of society (James & Versteeg, 2007; Tatem & Smith, 2010).
  • It is difficult to examine the impact of representativeness bias without knowing which part of society CDRs depict (Wesolowski et al., 2013).

Can we develop human trajectory data that are labeled with demographic attributes and represent actual populations using CDR?

Advantages and challenges of CDR data
• Advantages
  1. Potentially high population coverage
  2. Near real-time human mobility
  3. Routinely collected by the mobile network operator (MNO)
• Challenges
  1. Recorded at irregular intervals
  2. Spatial resolution depending on cell antenna locations
  3. Anonymized
  4. Representativeness bias

A novel data set, “Dynamic Census”, is developed by addressing these challenges.


What is “Dynamic Census”?
• Trajectories and demographics of the actual population in areas covered by CDR data
• Gridded population statistics on key demographic attributes on an hourly basis, e.g. working male, housewife, student, and other

Figure: Working males’ 24-hour population distribution in Dhaka

[Figure: three panels comparing CDRs, Census, and Dynamic Census. Each panel stacks x-y maps along a time axis: years for Census (T0, T0+5) and hours for CDRs and Dynamic Census (t0, t0+1, t0+2, t0+3).]
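As a rough illustration of what gridded, hourly population statistics by demographic attribute could look like as data, the sketch below counts people per hour, grid cell, and attribute label. The record layout, the 0.01-degree cell size, and the function names are assumptions made for this example; the slides do not describe the team's actual data format.

```python
from collections import Counter
import math

CELL_DEG = 0.01  # assumed grid cell size in degrees (hypothetical value)


def grid_cell(lon, lat, cell_deg=CELL_DEG):
    """Map a coordinate to integer (column, row) grid indices."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def hourly_grid_counts(points):
    """Count people per (hour, cell, attribute).

    points: iterable of (person_id, hour, lon, lat, attribute),
    with one entry per person per hour.
    """
    counts = Counter()
    for _person, hour, lon, lat, attribute in points:
        counts[(hour, grid_cell(lon, lat), attribute)] += 1
    return counts


# Invented sample points, for illustration only.
sample = [
    ("u1", 9, 90.4125, 23.8103, "working male"),
    ("u2", 9, 90.4127, 23.8105, "working male"),
    ("u3", 9, 90.4127, 23.8105, "student"),
    ("u1", 20, 90.3654, 23.7461, "working male"),
]
counts = hourly_grid_counts(sample)
```

Keying the counter on (hour, cell, attribute) keeps one table for all demographic groups; a map such as the working-male distribution above would correspond to filtering on one attribute value.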


Impacts and Uniqueness of Dynamic Census
• 95%: world’s cellular network coverage. Applicable anywhere covered with cellular networks.
• 40%: those who belong to the Base of the Pyramid (BOP). Can capture the BOP, which has non-marginal impacts on the economy. Populations that are difficult to reach in field surveys can also be captured.
• [illegible]1%: cost necessary for developing Dynamic Census. Time and financial costs are much lower than conducting a conventional census.

How to address challenges in CDR data

[Diagram: CDR data are processed into the Dynamic Census by addressing four challenges.]

Challenges in CDR data and the corresponding processing steps:
① Irregular record interval: location labeling and interpolation
② Anonymized: demographic attribute estimation
③ Non-uniform resolution: spatial disaggregation & route interpolation
④ Representativeness: estimation of the unobservable population

Supplement data:
• Field survey data (mobile phone users)
• Field survey data & building data (users & non-users)
• Building map data & road network data


1. Irregular record interval

Interpolate CDRs based on the routine observed from longer-term data.

Actual behavior: 8:00 departure from Home (H), 8:45 arrival at Office (W)
Calls: called at 8:40 am, called at 3:15 pm

Time          CDR data   After interpolation
8:00~8:59     H          H
9:00~9:59                W
10:00~10:59              W
11:00~11:59              W
12:00~12:59              W
13:00~13:59              W
14:00~14:59              W
15:00~15:59   W          W

Blank cells: no information in CDR data.


Interpolation
• Extracting routine patterns
  • A topic model is employed
  • A routine pattern is expressed as the probability distribution of key locations (Home, Work, and Other)
• Spatiotemporal interpolation
  • A Hidden Markov Model is employed (the timing of transitions is identified)

[Diagram: Topic model + Hidden Markov Model, labeled “Collaborative filtering approach”]
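The idea of filling unobserved hours from a longer-term routine can be sketched in a few lines. The example below is a deliberately simplified stand-in, not the topic model or Hidden Markov Model named on the slide: it estimates a per-hour probability distribution over key locations from past observations and fills each missing hour with the most probable one. The function names and the invented history are assumptions for illustration.

```python
from collections import Counter, defaultdict


def hourly_routine(history):
    """Estimate P(location | hour) from long-term (hour, location) observations."""
    by_hour = defaultdict(Counter)
    for hour, location in history:
        by_hour[hour][location] += 1
    return {
        hour: {loc: n / sum(c.values()) for loc, n in c.items()}
        for hour, c in by_hour.items()
    }


def interpolate_day(observed, routine, hours):
    """Keep observed hours; fill the rest with the most probable routine location."""
    filled = {}
    for hour in hours:
        if hour in observed:
            filled[hour] = observed[hour]
        elif hour in routine:
            dist = routine[hour]
            filled[hour] = max(dist, key=dist.get)
        else:
            filled[hour] = None  # no evidence at all for this hour
    return filled


# Invented long-term history: H = Home, W = Work, O = Other.
history = (
    [(8, "H")] * 3 + [(8, "W")]
    + [(h, "W") for h in range(9, 16) for _ in range(4)]
    + [(12, "O")]
)
routine = hourly_routine(history)

# One day with calls only in the 8:00 and 15:00 slots, as in the slide's example.
observed = {8: "H", 15: "W"}
filled = interpolate_day(observed, routine, range(8, 16))
```

Unlike a Hidden Markov Model, this per-hour lookup treats every hour independently, so it cannot identify when the transition between locations occurred; it only shows how a routine distribution can fill gaps.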



2. No demographic attribute info

Populations in CDRs do not always represent the population under study.
→ Demographic attributes are estimated.
→ Can specify the population under study (target population group).

Demographic attribute estimation
• Approach
  • Random Forest is employed for building an estimation model
  • One-month call records from 58 volunteers are used as training data
  • One-day call records from 922 mobile phone users are used for examining the relationship between calling behavior and demographic attributes
• Estimated features
  • Working male, housewife, student, and other
  • Income level (individual) and age group (-20 / 21-35 / 36-60 / 61-) ← Results to be improved

Estimation results

Class          Accuracy   Precision   Recall
Working male   0.79       0.63        0.70
Housewife      0.67       0.47        0.82
Student        0.89       0.40        0.22
Other          0.63       0.20        0.07
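The table reports an accuracy for each class alongside precision and recall. The slides do not say how these were computed; assuming a one-vs-rest evaluation, the sketch below shows how a minority class can combine high accuracy with low recall, the pattern seen for Student. The labels are invented toy data, not the survey results.

```python
def one_vs_rest_metrics(y_true, y_pred, positive):
    """Return (accuracy, precision, recall), treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy example: 4 true students among 20 people, only 1 of them detected.
y_true = ["student"] * 4 + ["other"] * 16
y_pred = ["student"] + ["other"] * 3 + ["student"] + ["other"] * 15
accuracy, precision, recall = one_vs_rest_metrics(y_true, y_pred, "student")
```

Here accuracy is dominated by the many correctly rejected non-students, while recall counts only the students that were found.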


Calling behavior survey to relate demographic attributes and CDR data
• Purpose
  • Relate calling behavior (call records) and demographic attributes
• Surveyed area and population
  • 15 Wards are chosen based on land use. For each Ward, 18 households (HHs) each are chosen from 3 income groups in Greater Dhaka (two-stage stratified sampling)
  • All members are interviewed
  • Interviewed on demographic attributes, travel activity, and mobile phone use
• Key of this survey
  • Income level is determined based on the type of buildings

[Photos: interview at a slum household; interview at a high-income household]


3. Non-uniform spatial resolution

Spatial disaggregation
• Location in CDR data is at the antenna level
• Disaggregated based on the distribution of buildings

Stay point reallocation
• Modifying spatial resolution
  • Stay points are reallocated to building POIs
  • Antenna-basis locations are reallocated to building POIs within the Voronoi cell
  • Each Voronoi cell is considered to be the area covered by an antenna
• Allocation probability is based on the area size of the building
• Types of buildings are used as the proxy of the income level

[Figure: distribution of POIs, showing a stay point (antenna), the Voronoi cell generated from the antenna location, and the POIs within the Voronoi cell]

2016.03.11
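The reallocation rule above can be sketched as follows. This is a minimal illustration under stated assumptions: coordinates are planar so Euclidean distance applies, a building belongs to the Voronoi cell of its nearest antenna, and the allocation probability is exactly proportional to building area. The dictionary layout and function names are hypothetical.

```python
import math
import random


def nearest_antenna(point, antennas):
    """Index of the closest antenna, i.e. the Voronoi cell containing the point."""
    return min(range(len(antennas)), key=lambda i: math.dist(point, antennas[i]))


def buildings_by_cell(buildings, antennas):
    """Group building POIs by the Voronoi cell (antenna index) they fall in."""
    cells = {i: [] for i in range(len(antennas))}
    for building in buildings:
        cells[nearest_antenna(building["xy"], antennas)].append(building)
    return cells


def reallocate(antenna_index, cells, rng):
    """Pick a building in the antenna's cell with probability proportional to area."""
    candidates = cells[antenna_index]
    if not candidates:
        return None  # no building POI inside this cell
    weights = [b["area"] for b in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Invented antennas and buildings in arbitrary planar units.
antennas = [(0.0, 0.0), (10.0, 0.0)]
buildings = [
    {"id": "b1", "xy": (1.0, 1.0), "area": 100.0},
    {"id": "b2", "xy": (2.0, -1.0), "area": 300.0},
    {"id": "b3", "xy": (9.0, 1.0), "area": 50.0},
]
cells = buildings_by_cell(buildings, antennas)
rng = random.Random(0)
picked = reallocate(0, cells, rng)
```

With these inputs a stay point at antenna 0 lands on b2 three times as often as on b1 over many draws, since the weights are the areas 300 and 100.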


4. Representativeness

Suppose you have CDRs from Dhaka …

Entire population in Dhaka:
• Populations in CDR data: mobile users
• Unobservable in CDR data: those who are not included in CDRs. Scaling factors are computed to estimate this part.


Understanding population covered by CDRs on a household basis

Entire living populations:
(A) People in HHs which include GP users: GP users + unobservables
(B) People in HHs which do NOT include any GP users: unobservables

Estimation of the unobservable
• Scaling factor
  • The approximate number of households is calculated based on the number of buildings
  • The scaling factor is computed from the typical HH structure, obtained through field survey

[Diagram: the real population of areas covered by CDR data consists of users and non-users; non-users do not appear in CDR data.]

(a) HHs including users: Users x scaling factor for (a)
(b) HHs consisting of non-users alone: Users x scaling factor for (b)
Combined: Users x (scaling factor for (a) + scaling factor for (b))
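One way to read the scaling-factor idea is sketched below. It assumes, purely for illustration, that each factor is the number of people in the corresponding household group per observed user in the surveyed area; the slides do not give the actual formula, and the household records and names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Household:
    size: int   # number of household members
    users: int  # members who appear in the CDR data


def scaling_factors(households):
    """Return (factor_a, factor_b): people per user in HHs with and without users."""
    total_users = sum(h.users for h in households)
    if total_users == 0:
        raise ValueError("survey contains no users")
    people_a = sum(h.size for h in households if h.users > 0)
    people_b = sum(h.size for h in households if h.users == 0)
    return people_a / total_users, people_b / total_users


def estimate_population(observed_users, factor_a, factor_b):
    """Scale an observed user count up to the whole population."""
    return observed_users * (factor_a + factor_b)


# Invented survey: three households with users, one without.
survey = [Household(4, 2), Household(5, 1), Household(3, 0), Household(6, 3)]
factor_a, factor_b = scaling_factors(survey)
estimated = estimate_population(1200, factor_a, factor_b)
```

In this toy survey the 6 users stand for 15 people in user households and 3 people in non-user households, so 1,200 observed users scale to 3,600 people. Deriving household counts from building counts, as the slide describes, is left out of this sketch.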


Small-scale census survey (SSC) to see population structure for each income level

Purpose
• Investigate the population structure for each income level
• Obtain data to calculate scaling factors to compute the number of populations from the distribution of buildings by income level

Surveyed area and population
• Entire populations in a Voronoi cell were surveyed in December 2014
• 2,839 HHs consisting of 11,521 people from 366 buildings (out of 367 buildings)

Key of SSC
• Income level is determined based on the type of buildings

[Figures: surveyed Voronoi area; average HH structures obtained from survey, for HHs including GP users and HHs not including GP users]


Type of building and income level

Contents of the map data
• Approx. 650,000 buildings (with the type of buildings)
• Residential buildings are classified into four groups by the height of buildings

Criteria of the type of buildings
• High (seven or more stories)
• Middle (more than two stories)
• Low (one to two stories)
• Slum (one story)

[Figure: sample of the map, with a legend of the type of building: High, Middle, Low, Slum]

Future work

Processing steps from CDR data to Dynamic Census:
• Spatiotemporal interpolation
• Demographic attribute estimation
• Estimation of the unobservable population
• Stay point extraction & location labeling
• Spatial disaggregation & route interpolation

Issues and directions:
• Will be improved with the use of smartphones
• Results depend on data quality → Comparative studies
• Building map data is costly → Satellite image processing
• Variety in life-styles → Comparative studies


Any questions/suggestions? [email protected]