Post on 18-Jan-2016

Re-thinking Modelling: a Call for the Use of Data Mining in Data-driven Social Simulation

Samer Hassan, Javier Arroyo, Celia Gutiérrez

Universidad Complutense de Madrid

Samer Hassan SS@IJCAI 2009 2

Contents

Data-driven ABM

DM-assisted Methodology

Case Study: Mentat

Application

Conclusions


Research Aim

Data-driven
• Non-KISS
• Empirical Validation
• Specific (case study)
• Expressive

Theoretical
• KISS
• Structural Validation
• Abstract
• General

Classical Logic of Simulation

Data-Driven Logic

Data-driven Approach

• Complexity
• Large amounts of data
• Auxiliary AI:
  • Fuzzy Logic
  • Ontologies
  • Evolutionary Computation
  • Data Mining

Data Mining

Data Mining: extracting patterns and relevant information from large amounts of data

Pre-processing of empirical data
• Cluster finding
• Discovery of hidden patterns
• Locates redundancies

Post-processing of simulation output
• Clustering:
  • Discovery of hidden patterns
  • Validation of clusters
  • Locates inconsistencies
• Classification:
  • Cluster matching


Methodology for DM-assisted ABM

Data Collection
• Initial point
• Validation points
  • Necessarily ≠ initial point

Type
• Explicit
• Externalised
• Empirical distributions
  • Secondary sources

Methods
• Quantitative
  • E.g. surveys
• Qualitative
  • E.g. interviews

Methodology for DM-assisted ABM

Analysis
• Preprocessing of empirical data

Roles
• Domain expert
  • Guide DM exploration
  • Interpretation
• DM expert
  • Confirm or refine theories

Methodology for DM-assisted ABM

Selection of Relevant Data
• Filtering
• Adaptation of data
  • Normalisation
  • Discretisation
• Domain expert
  • Theory
• DM
  • Redundancies
  • Overlooked independent variables
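
The data adaptation step (normalisation and discretisation) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' procedure: the function names `min_max_normalise` and `discretise`, the choice of min-max scaling and equal-width bins, and the sample ages are all hypothetical.

```python
def min_max_normalise(values):
    """Rescale numeric values linearly to the [0, 1] interval."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant variable carries no information; map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretise(values, n_bins):
    """Assign each value to one of n_bins equal-width bins (0 .. n_bins - 1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical ages, chosen only for the example.
ages = [18, 30, 42, 54, 66, 78]
normalised_ages = min_max_normalise(ages)
age_groups = discretise(ages, 3)
```

Other scalings (for example z-scores) or quantile-based bins would fit the same step; which one suits a given variable is a judgement for the domain and DM experts.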

Methodology for DM-assisted ABM

Data Analysis
• Large data collections
• Guided by theory

Types
• Cluster analysis
• Principal Component Analysis
• Time series methods
• Association rules

Methodology for DM-assisted ABM

Interpretation of results
• Theory expert
  • Relate results to theory
• New findings are added to the findings base

Methodology for DM-assisted ABM

ABM Building
• Based on findings
• Modeller

Steps
• Formalisation
• Data-driven design
• Implementation
• Initialisation

Methodology for DM-assisted ABM

Simulation
• Fine tuning the ABM
  • Sensitivity analysis
  • Intensive testing
• Output
  • Record agent trace
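
As a rough illustration of the simulation step, the sketch below runs a one-at-a-time parameter sweep (a very basic form of sensitivity analysis) and records a trace of every agent at every step. The toy model, the `influence` parameter, and all names are hypothetical; nothing here reproduces the presented system.

```python
import random

N_STEPS = 20


def run_toy_model(influence, n_agents=50, n_steps=N_STEPS, seed=0):
    """Run a hypothetical toy ABM and return (final mean value, agent trace).

    Each agent holds one value in [0, 1]; at every step it moves a fraction
    `influence` of the way towards the value of a randomly chosen agent.
    """
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    values = [rng.random() for _ in range(n_agents)]
    trace = []  # one record per agent per step: (step, agent_id, value)
    for step in range(n_steps):
        for agent_id in range(n_agents):
            other = rng.randrange(n_agents)
            values[agent_id] += influence * (values[other] - values[agent_id])
            trace.append((step, agent_id, values[agent_id]))
    return sum(values) / n_agents, trace


def spread(values):
    """Range of a list of values, used here as a crude dispersion measure."""
    return max(values) - min(values)


# Sweep the single hypothetical parameter and summarise each run.
results = {}
for influence in (0.0, 0.1, 0.5):
    mean_value, trace = run_toy_model(influence)
    final_values = [value for step, _, value in trace if step == N_STEPS - 1]
    results[influence] = (mean_value, spread(final_values), len(trace))
```

The recorded trace is what a later data mining pass would consume: one row per agent per step, ready to be clustered or compared against empirical data.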

Methodology for DM-assisted ABM

Validation
• Analysis of the results
  • Empirical validation
  • Theoretical consistency
• Roles
  • DM expert
    • Analyse the data
  • Domain expert
    • Extract conclusions
• Iterative cycle


The Problem

Aim: simulate the process of change in social values in a society over a period of time
• Plenty of factors involved

Inertia of generational change:
• To what extent do demographic dynamics explain the change in mentality?

Inter-generational:
• Agent characteristics remain constant
• Macro aggregation evolves

Mentat: architecture

Agent:
• Mental State attributes
• Life cycle patterns
• Demographic micro-evolution:
  • Couples
  • Reproduction
  • Inheritance

Mentat: architecture

World:
• 3000 agents
• Grid 100x100
• Demographic model
• 8 independent parameters

Social Network:
• Communication with Moore Neighbourhood
• Friends network
• Family network
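
To make the world description concrete, here is a minimal sketch of a 100x100 grid holding 3000 agents, with a Moore neighbourhood lookup (the 8 cells surrounding a cell). Only the grid size, the number of agents, and the use of a Moore neighbourhood come from the slide; the one-agent-per-cell placement, the non-wrapping edges, and every function name are assumptions made for the example.

```python
import random

GRID_SIZE = 100   # from the slide: 100x100 grid
N_AGENTS = 3000   # from the slide: 3000 agents


def moore_neighbourhood(x, y, size=GRID_SIZE):
    """Return the up-to-8 cells surrounding (x, y), without wrapping at edges."""
    cells = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue  # skip the cell itself
            nx, ny = x + dx, y + dy
            if 0 <= nx < size and 0 <= ny < size:
                cells.append((nx, ny))
    return cells


def place_agents(n_agents=N_AGENTS, size=GRID_SIZE, seed=0):
    """Place agents on distinct random cells; returns {cell: agent_id}."""
    rng = random.Random(seed)
    all_cells = [(x, y) for x in range(size) for y in range(size)]
    chosen = rng.sample(all_cells, n_agents)
    return {cell: agent_id for agent_id, cell in enumerate(chosen)}


def neighbours_of(cell, occupancy):
    """Agent ids located in the Moore neighbourhood of a cell."""
    return [occupancy[c] for c in moore_neighbourhood(*cell) if c in occupancy]


occupancy = place_agents()
```

With 3000 agents on 10000 cells, roughly 30% of cells are occupied, so an interior agent has between two and three spatial neighbours on average; the friends and family networks listed on the slide would be separate structures on top of this.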

Contents

Data-driven ABM

DM-assisted Methodology

Case Study: Mentat

Application

Conclusions

Samer Hassan SS@IJCAI 2009 25

Data Collection in Mentat

Initial data:
• EVS-1980
  • Representative sample of Spain
• Qualitative info
• Empirically-grounded demographic equations

Validation data:
• EVS-1990
• EVS-1999

Analysis in Mentat

Selection of relevant data
• EVS-1980, 1990, 1999
• Options:
  1. Algorithm for the best subset of variables
  2. Rely on domain expert
• Tested domain knowledge
  • (2) chosen
• Variables adaptation
  • Normalisation

Name           Type         Range
gender         categorical
age            numeric      ≥18
studies        numeric      ≥5
civil state    categorical
economy        numeric      real
ideology       ordinal      1-10
conf. church   ordinal      1-4
church att.    ordinal      1-7
relig. person  categorical

Analysis in Mentat

Data Analysis
• Algorithm selection
  • Wrapped k-means
  • Explore different k (# of clusters)
• Discarded variables
  • Gender & Age provoke the appearance of irrelevant clusters
    • E.g. widowed women
  • Economy is redundant
    • High correlation with Education
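
The two ideas on this slide, exploring several values of k and dropping a variable that is highly correlated with another, can be sketched as follows. This is plain Lloyd's k-means written from scratch, not the "wrapped k-means" setup used by the authors, and the synthetic points and the `education` / `economy` values are invented for the example.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels, within-cluster SS)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k data points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members)
                                           for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    wss = sum(squared_distance(p, centroids[label])
              for p, label in zip(points, labels))
    return centroids, labels, wss


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Synthetic data: three hypothetical groups in two dimensions.
rng = random.Random(1)
centres = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
points = [(cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3))
          for cx, cy in centres for _ in range(30)]

# Explore different k and keep the within-cluster sum of squares for each.
wss_by_k = {k: k_means(points, k)[2] for k in range(1, 6)}

# Redundancy check on two made-up variables.
education = [1, 2, 3, 4, 5]
economy = [2.1, 3.9, 6.2, 8.0, 9.9]
r = pearson(education, economy)
```

Plotting `wss_by_k` against k and looking for the point where the curve flattens is one common heuristic for choosing k; as the slide stresses, sociological interpretability of the resulting clusters matters as much as the numbers.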

Analysis in Mentat

Interpretation
• Sociological research
• Religious typology (RLGTYPE)
  • Based on 3 variables
  • Ecclesiastical, low-intensity, alternatives & non-religious
• Clusters found (1980, 1999)
  • Based on the 9-3=6 variables
  • 5 clusters with sociological meaning
  • Consistent with RLGTYPE
• Theoretical observations of the pattern evolution:
  • Religiosity strength falls
  • Ideological spectrum shifts to the left
    • education & economy
  • Newest type of religiosity, “alternatives”, rises
    • youngsters


Validation in Mentat

Mentat re-building & simulation explored

Mentat output clustered
• Same 5 clusters found
• Similar evolution trends
• 3 theoretical observations shown

Inconsistencies detected
• Liberal cluster percentages do not match
  • although aggregated they do
• Graphics show fewer youngsters
  • Liberal clusters deeply affected

Guide to re-design
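
The kind of check described here, where individual cluster percentages disagree while their aggregate agrees, can be sketched as a comparison of cluster shares. The cluster labels, counts, grouping, and tolerance below are made up purely for illustration; they are NOT Mentat or EVS figures, and the function names are hypothetical.

```python
from collections import Counter


def cluster_shares(labels):
    """Percentage of individuals in each cluster."""
    counts = Counter(labels)
    total = len(labels)
    return {name: 100.0 * n / total for name, n in counts.items()}


def mismatches(empirical, simulated, tolerance):
    """Clusters whose share differs by more than `tolerance` percentage points."""
    names = set(empirical) | set(simulated)
    return sorted(
        name for name in names
        if abs(empirical.get(name, 0.0) - simulated.get(name, 0.0)) > tolerance
    )


def aggregate(shares, groups):
    """Sum the shares of clusters that belong to the same group."""
    totals = {}
    for name, share in shares.items():
        group = groups.get(name, name)
        totals[group] = totals.get(group, 0.0) + share
    return totals


# Made-up labels for illustration only.
empirical_labels = ["liberal_1"] * 30 + ["liberal_2"] * 20 + ["other"] * 50
simulated_labels = ["liberal_1"] * 20 + ["liberal_2"] * 30 + ["other"] * 50
groups = {"liberal_1": "liberal", "liberal_2": "liberal"}

emp = cluster_shares(empirical_labels)
sim = cluster_shares(simulated_labels)
per_cluster = mismatches(emp, sim, tolerance=5.0)
aggregated = mismatches(aggregate(emp, groups), aggregate(sim, groups), tolerance=5.0)
```

In this toy case both hypothetical liberal clusters are flagged individually while the aggregated comparison flags nothing, which is the pattern that would point the modeller towards a finer-grained cause.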


Conclusions

DM-assisted ABM methodology
• Suitable for DDABM
  • Complexity
  • Large amounts of data
• Limitations
  • KISS
  • Qualitative sources

Uses
• Build new ABM
• Re-thinking existing DDABM
  • Revealing hidden facts
  • Detect inconsistencies

Thanks for your attention!

Samer Hassan
samer@fdi.ucm.es

Universidad Complutense de Madrid

Contents License

This presentation is licensed under a Creative Commons Attribution 3.0 license: http://creativecommons.org/licenses/by/3.0/

You are free to copy, modify and distribute it as long as the original work and author are cited.
