Why you should care about synthetic data
Transcript of Why you should care about synthetic data
WHY YOU SHOULD CARE ABOUT SYNTHETIC DATA
Presented by Real Impact Analytics
QUESTIONS? #SYNTHETICRIA
OVERVIEW
What is synthetic data?
Why use it?
How to create it?
Who creates it?
Conclusion
WHAT IS SYNTHETIC DATA?
Generic and artificial data used to mimic real-world data sets.
Protect people’s privacy: substitutes real data that contains personal information
Test robustness and accuracy during software development
Create artificial base with similar features of real data sets
WHY USE IT?
Use of actual data sets is no longer allowed, to protect everyone’s right to privacy.
To develop big data tools, we need realistic data sets for testing algorithms and easy data visualization.
Synthetic data, similar to real data sets and shareable with the public, acts as a substitute without invading anyone’s privacy.
HOW TO CREATE IT?
1 DRAWING NUMBERS
OR
2 AGENT-BASED MODELLING
1 DRAWING NUMBERS
Observe real-world statistical distributions in the original data, then reproduce artificial bases by drawing simple numbers from those distributions.
EXAMPLE: TELECOM DATA (DRAWING NUMBERS)
Observe the real temporal distributions of texts and phone calls from CDR data (call detail records).
Create an artificial base of customers.
Simulate texts and phone calls with time stamps following the distributions. The goal is to simulate CDRs so they follow the same distribution as real CDRs.
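The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the presenters' implementation: the hourly weights, the call/text split, the customer identifiers, the date and the function names are all hypothetical placeholders. In practice the weights would be measured from real CDR data.

```python
import random
from datetime import datetime, timedelta

# Step 1 (assumed): hourly activity weights, index = hour of day.
# Illustrative placeholders, NOT measured from any real CDR data set.
HOURLY_WEIGHTS = [1, 1, 1, 1, 1, 2, 4, 7, 9, 10, 10, 11,
                  12, 11, 10, 10, 11, 12, 12, 10, 8, 5, 3, 2]
# Hypothetical share of each event type.
EVENT_WEIGHTS = {"call": 0.4, "text": 0.6}


def make_customers(n):
    """Step 2: create an artificial customer base with made-up identifiers."""
    return [f"cust_{i:05d}" for i in range(n)]


def draw_cdrs(customers, n_events, day, rng):
    """Step 3: draw synthetic CDR rows whose hours follow HOURLY_WEIGHTS."""
    hours = rng.choices(range(24), weights=HOURLY_WEIGHTS, k=n_events)
    types = rng.choices(list(EVENT_WEIGHTS),
                        weights=list(EVENT_WEIGHTS.values()), k=n_events)
    rows = []
    for hour, event_type in zip(hours, types):
        caller, callee = rng.sample(customers, 2)  # two distinct customers
        stamp = day + timedelta(hours=hour, seconds=rng.randrange(3600))
        rows.append({"caller": caller, "callee": callee,
                     "type": event_type, "timestamp": stamp})
    rows.sort(key=lambda r: r["timestamp"])  # CDRs in chronological order
    return rows


rng = random.Random(42)  # fixed seed so the sketch is reproducible
customers = make_customers(100)
cdrs = draw_cdrs(customers, 1000, datetime(2020, 1, 1), rng)
```

Each row is drawn independently, so the output matches the observed time-of-day distribution but carries no relationships between customers. That is the main limitation of drawing numbers.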
2 AGENT-BASED MODELLING
Create physical models that explain the observed behaviour, then use these models to generate generic, random data.
EXAMPLE: TELECOM DATA (AGENT-BASED MODELLING)
Analyze real data from texts and phone calls, identifying temporal and behavioural patterns.
Create a physical model based on those observations and evolutions over time.
This model simulates texts and phone calls over time as they would occur in real life.
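A toy agent-based simulation might look like the sketch below. It is only an illustration of the idea: the `Agent` class, the activity rates, the night-time damping rule and the contact lists are assumptions made for this example, not patterns taken from real telecom data or from the presenters' model.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Agent:
    """A simulated customer. All attributes are assumptions for illustration."""
    name: str
    activity: float                      # chance of acting in a daytime hour
    contacts: list = field(default_factory=list)

    def step(self, hour, rng):
        """Return one event for this hour, or None if the agent stays idle."""
        # Assumed behavioural rule: agents are far less active before 07:00.
        p = self.activity * (0.1 if hour % 24 < 7 else 1.0)
        if not self.contacts or rng.random() >= p:
            return None
        return {
            "hour": hour,
            "caller": self.name,
            "callee": rng.choice(self.contacts),  # agents only reach contacts
            "type": rng.choice(["call", "text"]),
        }


def simulate(agents, n_hours, rng):
    """Advance every agent hour by hour and collect the events they emit."""
    events = []
    for hour in range(n_hours):
        for agent in agents:
            event = agent.step(hour, rng)
            if event is not None:
                events.append(event)
    return events


rng = random.Random(7)  # fixed seed so the sketch is reproducible
names = [f"agent_{i}" for i in range(20)]
agents = [
    Agent(name, activity=rng.uniform(0.05, 0.3),
          contacts=rng.sample([n for n in names if n != name], 3))
    for name in names
]
events = simulate(agents, n_hours=48, rng=rng)
```

Unlike drawing numbers, the events here emerge from per-agent rules, so structure such as who contacts whom is preserved. The cost is that the rules must first be designed and calibrated against real observations.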
WHO CREATES IT?
IN-HOUSE DEVELOPMENT
OR
AD-HOC DEVELOPMENT
DEPENDING ON THE COMPLEXITY OF THE DATA SET
CONCLUSION
Your ability to generate realistic synthetic data is essential to developing algorithms and software that will maximize the value of your big data tools, without transgressing privacy laws.
About Us
Real Impact Analytics (RIA) taps into rich telecom data to capture its value. The data is turned into action with big data apps that ease our clients’ day-to-day work.
RIA provides guided and predictive analytics through proprietary software. Five of the top ten global telecom operators trust us to enhance customer experience through Customer Value Management, and optimize daily operations with our Commercial Excellence apps.
To learn how Real Impact Analytics can create the same value for you, visit realimpactanalytics.com.
Real Impact Analytics
realimpactanalytics.com
@RIAnalytics
@RealImpactAnalytics