Demystifying Data Science
Transcript of Demystifying Data Science
![Page 1: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/1.jpg)
Demystifying Data Science: What does it mean in practice?
Jonathan Sedar, Principal Data Scientist, Applied AI Ltd
www.applied.ai @applied_ai @jonsedar
![Page 2: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/2.jpg)
Applied AI is a Data Science Consultancy
We create a competitive advantage for financial services companies through applied artificial intelligence
![Page 3: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/3.jpg)
• Know Your Customers
• Develop Your Market
• Manage Risk & Regulation
• Innovate & Experiment
• Streamline Operations
• Embed Data Science
![Page 4: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/4.jpg)
Demystifying Data Science
• Motivations
• A Maturity Model
• An Ecosystem Model
• Practical Examples & Advice
![Page 5: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/5.jpg)
Data Science
![Page 6: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/6.jpg)
$> DATA.SCIENCE()
![Page 7: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/7.jpg)
![Page 8: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/8.jpg)
Intelligently Learning From Data
![Page 9: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/9.jpg)
Extracting information from all that Big Data you're collecting
… and the small stuff too
![Page 10: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/10.jpg)
Discovering correlations, inferring patterns of behaviour ... and training
models to predict outcomes
![Page 11: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/11.jpg)
Running the business more effectively ... and systematising
insights and products
![Page 12: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/12.jpg)
How wonderful for you
![Page 13: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/13.jpg)
Learning from data is nothing new
![Page 14: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/14.jpg)
Most of our business is doing it already
![Page 15: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/15.jpg)
Trading & Quant Finance: Increase Revenue
![Page 16: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/16.jpg)
Process Optimisation: Reduce Costs
![Page 17: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/17.jpg)
Portfolio Risk Modelling: Manage Risk
![Page 18: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/18.jpg)
Reserves & Stress Testing: Meet Compliance
![Page 19: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/19.jpg)
Learning from data benefits the whole business
Increase Revenue, Reduce Cost, Manage Risk, Meet Compliance
• tune risk profile
• understand the competition
• optimise business processes
• improve customer retention
• inform & adapt to regulatory change
• demonstrate leadership
• innovate product-market fit
• increase customer base
![Page 20: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/20.jpg)
![Page 21: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/21.jpg)
Data Science Maturity Model
![Page 22: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/22.jpg)
![Page 23: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/23.jpg)
Getting Started
• Identify new opportunities and useful data sources
• Basic modelling
• Senior leaders help to define & develop the business case
Sophisticated Analyses
• Hypothesis testing & data discovery
• Advanced statistics & predictive modelling
• Deliver immediate value, guide strategy
Business Operations
• Create ‘data products’, reports, new systems to embed change
• Replace legacy systems
• Build internal knowledge and skills
Full Capability Data Science
• Advanced data science is supported throughout the organisation and embedded in:
• Products & Services
• Senior Decision Making
• Business Administration
![Page 24: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/24.jpg)
Getting Started
• Auto Insurer: “Help me price correctly”
• Extracted, cleaned, parsed data from messy internal & external sources
• Lightweight multidimensional analysis of customer base, including interactive dashboards
• Reports and strategic recommendations to board level, proving the need for further analysis
![Page 25: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/25.jpg)
Sophisticated Analyses
• Life & Pensions: “Help me model my customer churn (a credit risk situation)”
• Sourced, cleaned, prepared internal & external data
• Created advanced time-to-event models using Bayesian statistics
• Churn modelling output identified key risk groups & potentially large new revenues and cost savings
![Page 26: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/26.jpg)
Business Operations
• Asset Management Co: “Help me price real estate at the optimal market price”
• Sourced, cleaned, prepared data, undertook initial investigations and statistical modelling
• Created a price prediction “engine” within a microservice API, now used within daily operations
• Accurate estimates and reduced manual effort
![Page 27: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/27.jpg)
Full Capability Data Science
• The holy grail!
• A centre of excellence guiding:
• Products
• Decision Making
• Business Administration
![Page 28: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/28.jpg)
![Page 29: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/29.jpg)
Data Science Ecosystem
![Page 30: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/30.jpg)
![Page 31: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/31.jpg)
Data Curation
• Making the right data available for modelling and maintaining it well.
• Garbage-in-garbage-out
• Getting to ‘good data’ is subtle
• 80% of the process
![Page 32: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/32.jpg)
Machine Learning
• Learning from data
• The empirical practice at the heart of statistics.
• A machine (aka computer or model) is trained on a dataset to predict values
• Predict or infer real-world behaviours.
![Page 33: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/33.jpg)
Business Integration
• Conventional business analysis lives and dies within spreadsheets & presentations
• Expensive dashboards require unstable data pipelines.
• Huge data warehouses and "lakes" are so complicated they're barely utilised.
• Business integration is hard, but critical
![Page 34: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/34.jpg)
![Page 35: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/35.jpg)
Three Stories of Data Science in Practice
![Page 36: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/36.jpg)
Data Curation
![Page 37: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/37.jpg)
Curating external datasets to better understand customers
Clustering, Introspection, Visualisation
![Page 38: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/38.jpg)
We work mainly with insurance companies.
They don’t have a reputation for being exciting.
But from a data science point of view…
![Page 39: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/39.jpg)
It’s quite interesting!
![Page 40: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/40.jpg)
“Our term insurance policies are lapsing before they become profitable”
![Page 41: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/41.jpg)
We modelled lapse using survival analysis (more on which later)
Along the way we noticed something…
![Page 42: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/42.jpg)
The churn rate was sky-high in new estates
![Page 43: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/43.jpg)
Geographic Effects
![Page 44: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/44.jpg)
And Socioeconomic Effects
![Page 45: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/45.jpg)
We could use these effects to:
• Identify lapse-prone customers
• More accurately price credit risk
• Identify new markets
![Page 46: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/46.jpg)
… we’re not the first people to think of this
![Page 47: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/47.jpg)
We can do it better and cheaper ourselves
![Page 48: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/48.jpg)
First: geocode the customer base
Get lat/long based on address
Used Nominatim (FOSS, based on PostGIS) rather than Google, because …
![Page 49: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/49.jpg)
Irish addresses are pathological!
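The geocoding step can be sketched roughly as below. This is a minimal illustration, not the pipeline used in the talk: it assumes the public Nominatim search endpoint and its `q`, `format`, `limit` and `countrycodes` query parameters, the address is made up, and the response is a canned sample so the sketch runs without network access. A self-hosted Nominatim instance would use its own base URL.

```python
import json
from urllib.parse import urlencode

# Assumption: the public Nominatim endpoint; a self-hosted instance has its own base URL.
NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_url(address: str, country_code: str = "ie") -> str:
    """Build a Nominatim free-text search URL that asks for one JSON result."""
    params = {"q": address, "format": "json", "limit": 1, "countrycodes": country_code}
    return f"{NOMINATIM_SEARCH}?{urlencode(params)}"


def parse_first_hit(response_text: str):
    """Return (lat, lon) from a Nominatim JSON response, or None when nothing matched."""
    hits = json.loads(response_text)
    if not hits:
        return None
    # Nominatim returns coordinates as strings, so convert them to floats.
    return float(hits[0]["lat"]), float(hits[0]["lon"])


# Hypothetical address and canned response, for illustration only.
url = build_search_url("1 Main Street, Ballymore, Co. Kildare")
sample_response = '[{"lat": "53.3498", "lon": "-6.2603", "display_name": "Dublin, Ireland"}]'
coords = parse_first_hit(sample_response)
```

Returning `None` for an empty result matters in practice: ambiguous or non-unique addresses will often produce no match, and those rows need a fallback rather than a crash.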
![Page 50: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/50.jpg)
Second: go shopping for socioeconomic data
Irish census produced every 5 years
15 themes, 500+ features
Captures almost everything about daily life
Aggregated to ‘small areas’ of approx 200 households
![Page 51: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/51.jpg)
Census themes
Theme  Subject
1  Sex, Age & Migration
2  Ethnicity & Language
3  Irish Language
4  Families
5  Private Households
6  Housing
7  Hospitals & Prisons
8  Principal Status
9  Social Class
10  Education
11  Commuting
12  Health
13  Occupation
14  Industries
15  PC & Internet
![Page 52: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/52.jpg)
We could do what Experian does, and also:
• We would own the code
• We could integrate with any internal project
• We could tune it to fit our needs
![Page 53: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/53.jpg)
Let’s take a look at the data
Not a trivial task…
What we have is a really big matrix: 18,488 rows x 767 columns
![Page 54: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/54.jpg)
• Data Compression
• Visualisation
• Clustering (unsupervised learning)
![Page 55: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/55.jpg)
Data Compression
Singular Value Decomposition
Rotate and scale data into new frame of reference
Compress into fewer features while maintaining information
Compressed 500+ columns into 100
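A minimal sketch of SVD-based compression, assuming NumPy is available. The matrix here is synthetic (200 rows, 50 columns driven by 5 hidden factors) and only stands in for the census data; the number of components `k` is an illustrative choice, not the value used in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the census matrix: 200 rows, 50 columns driven by 5 hidden factors.
hidden = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 50))
X = hidden @ mixing + 0.01 * rng.normal(size=(200, 50))

# Centre each column, then factorise: Xc = U @ diag(s) @ Vt.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 5                                             # components to keep (a modelling choice)
X_reduced = U[:, :k] * s[:k]                      # rows expressed in the new k-dimensional frame
explained = (s[:k] ** 2).sum() / (s ** 2).sum()   # share of total variance retained
```

Inspecting `explained` for a range of `k` values is one common way to decide how many columns to keep before too much information is lost.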
![Page 56: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/56.jpg)
Data Visualisation
t-Distributed Stochastic Neighbor Embedding (t-SNE)
Visualise 100D in 2D space
View natural clustering in the data
![Page 57: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/57.jpg)
Clustering
Hierarchical Agglomerative Clustering (Ward Clustering)
Progressively group nearby datapoints into larger clusters
Cut nested hierarchy of clusters to fit
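To make the merging idea concrete, here is a deliberately naive, standard-library-only sketch of Ward's criterion on toy data. At each step it merges the two clusters whose union increases the within-cluster sum of squares the least, and it stops once the requested number of clusters remains, which plays the role of cutting the hierarchy. The function name and data are illustrative; on a real dataset an optimised library implementation would be the sensible choice.

```python
from itertools import combinations


def ward_clusters(points, n_clusters):
    """Naive Ward agglomeration: repeatedly merge the pair of clusters whose
    merge gives the smallest increase in within-cluster sum of squares."""
    clusters = [[i] for i in range(len(points))]   # start with one cluster per point

    def centroid(members):
        dim = len(points[0])
        return [sum(points[i][d] for i in members) / len(members) for d in range(dim)]

    def merge_cost(a, b):
        ca, cb = centroid(a), centroid(b)
        sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * sq_dist

    # Stopping at n_clusters is equivalent to cutting the full hierarchy at that level.
    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2), key=lambda pair: merge_cost(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy 2-D data: two visibly separate groups.
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
labels = ward_clusters(data, n_clusters=2)
```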
![Page 58: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/58.jpg)
Interpreting the Clusters
![Page 59: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/59.jpg)
…carefully
![Page 60: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/60.jpg)
Now we can place each small area on a map
Using shapefiles and PostGIS
![Page 61: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/61.jpg)
Dublin, Ireland 2011
![Page 62: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/62.jpg)
Interactive dashboard showing each Small Area (approx 200 households),
plotted by location and cluster id
![Page 63: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/63.jpg)
Data Curation
• A centralised, up-to-date, traceable, documented repository for structured text, tabular & image datasets
• Augment with public data to keep up with competitors and gain an edge
• Update, maintain and optimise your primary data sources to allow for high risk/reward POC projects
![Page 64: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/64.jpg)
Machine Learning
![Page 65: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/65.jpg)
Learning from data to predict outcomes and infer behaviours
• Supervised (classification, regression)
• Unsupervised (clustering, pattern matching)
• Reinforcement (behavioural rewards)
![Page 66: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/66.jpg)
Hot new area, thus word soup
artificial intelligence, machine intelligence, statistical modelling,
robotic process automation, cognitive computing,
deep learning …
![Page 67: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/67.jpg)
Statistics <3 Machine Learning
![Page 68: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/68.jpg)
Example 1: time-to-event modelling
“What’s our projected customer churn (and thus projected credit risk)?”
Supervised Regression
![Page 69: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/69.jpg)
Basic idea: estimate this curve
![Page 70: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/70.jpg)
Counts: Kaplan-Meier
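A count-based survival estimate can be sketched in a few lines. The function below is a minimal Kaplan-Meier estimator: at each time where a lapse is observed, the running survival probability is multiplied by the fraction of at-risk policies that did not lapse. The data is made up, and the variable names are illustrative rather than taken from the talk.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time where an event occurred.

    durations: time each policy was followed for
    observed:  True if the lapse was seen, False if the policy was censored
               (still in force when observation stopped)
    """
    event_times = sorted({t for t, seen in zip(durations, observed) if seen})
    survival = 1.0
    curve = []
    for t in event_times:
        # Everyone followed for at least t is still "at risk" at time t.
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, seen in zip(durations, observed) if seen and d == t)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Made-up data: months in force for six policies; two are censored.
months = [2, 3, 3, 5, 8, 8]
lapsed = [True, True, False, True, False, True]
curve = kaplan_meier(months, lapsed)
```

Censored policies still count in the at-risk denominator up to the point they drop out of observation, which is what distinguishes this from a simple proportion of lapses.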
![Page 71: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/71.jpg)
Parametric (or semi-parametric) models: Exponential, Weibull, Cox PH Regression etc.
![Page 72: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/72.jpg)
Time-varying coefficients: Piecewise, Aalen Additive Regression etc.
![Page 73: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/73.jpg)
Sidenote: Bayesian Inference is perfect for time-based regression
![Page 74: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/74.jpg)
Treat observed values as a realisation of a probability distribution
![Page 75: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/75.jpg)
Big wins: capture prior knowledge, preserve uncertainty, model introspection and inference
![Page 76: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/76.jpg)
Create predictions with qualified uncertainty: “credible regions”
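As a small worked illustration of prior knowledge and credible regions, the sketch below assumes the simplest possible time-to-lapse model: an exponential distribution with a Gamma prior on the lapse rate, for which the posterior is again a Gamma distribution. All numbers are made up, the prior is hypothetical, and this is not the model used in the talk; it only shows how a prior plus data yields a posterior interval.

```python
import random

random.seed(42)

# Made-up data: total months of exposure across policies and number of lapses observed.
lapses = 12
exposure_months = 400.0

# Prior knowledge as a Gamma(shape, rate) distribution over the monthly lapse rate.
# Hypothetical prior: roughly "2 lapses per 100 policy-months".
prior_shape, prior_rate = 2.0, 100.0

# With an exponential time-to-lapse model, the posterior is again a Gamma distribution.
post_shape = prior_shape + lapses
post_rate = prior_rate + exposure_months

# Draw from the posterior; random.gammavariate takes (shape, scale), and scale = 1 / rate.
draws = sorted(random.gammavariate(post_shape, 1.0 / post_rate) for _ in range(20000))

# Central 95% credible interval, read off the sorted posterior draws.
lower, upper = draws[int(0.025 * len(draws))], draws[int(0.975 * len(draws))]
posterior_mean = post_shape / post_rate
```

The interval `(lower, upper)` is a direct probability statement about the lapse rate given the data and the prior, which is what makes it straightforward to carry uncertainty through to business forecasts.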
![Page 77: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/77.jpg)
Straightforward to extend models e.g. time-varying effects
![Page 78: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/78.jpg)
Straightforward to make models robust e.g. outlier detection, mixture models
![Page 79: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/79.jpg)
Example 2: topic modelling
“Can we learn the topics of conversation in broker communications?”
Unsupervised Clustering
![Page 80: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/80.jpg)
NLP upon business data sources
![Page 81: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/81.jpg)
After careful cleaning, anonymisation, preprocessing
![Page 82: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/82.jpg)
Find the ‘topics’ of conversation: words that seem to co-occur
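The co-occurrence idea can be shown with a simple pair count. This is a simplified stand-in, not a full topic model (techniques such as LDA or matrix factorisation go further by grouping words into weighted topics), and the messages are made up. It assumes the text has already been cleaned and tokenised.

```python
from collections import Counter
from itertools import combinations

# Made-up, already-cleaned messages (lower-cased, stop words removed).
documents = [
    ["renewal", "premium", "quote"],
    ["claim", "flood", "damage"],
    ["premium", "quote", "discount"],
    ["flood", "claim", "assessor"],
]

# Count how often each unordered pair of words appears in the same message.
pair_counts = Counter()
for words in documents:
    for pair in combinations(sorted(set(words)), 2):
        pair_counts[pair] += 1

# The most frequent pairs hint at themes, here pricing versus flood claims.
top_pairs = pair_counts.most_common(2)
```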
![Page 83: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/83.jpg)
Use topics as a shortcut to categorise and correlate documents to activity
![Page 84: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/84.jpg)
Create the communications graph
Learn social & organisational structure
![Page 85: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/85.jpg)
Design for interactive investigation
![Page 86: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/86.jpg)
Example 3: anomaly detection
“Can we spot fraudulent activity in claims?”
Unsupervised and Supervised Learning
![Page 87: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/87.jpg)
Supervised Learning: function estimation
Classification: Logistic Regression, Neural / Deep Nets, Trees, Random Forests
Regression: Linear, Non-Linear, Time-Series
![Page 88: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/88.jpg)
Unsupervised Learning: pattern finding
Clustering, distance measures, topologies
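One very simple distance-based approach to spotting unusual claims is a robust z-score. The sketch below measures how far each value sits from the median, scaled by the median absolute deviation (MAD). The claim amounts and the 3.5 cut-off are illustrative assumptions, and the function assumes the MAD is non-zero; real fraud detection would combine many engineered features rather than a single amount.

```python
from statistics import median


def robust_z_scores(values):
    """Score each value by its distance from the median, scaled by the median
    absolute deviation (MAD), which is less distorted by outliers than the
    mean and standard deviation. Assumes the MAD is non-zero."""
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    # 0.6745 rescales MAD so scores are comparable to ordinary z-scores for normal data.
    return [0.6745 * (v - centre) / mad for v in values]


# Made-up claim amounts; one is far outside the usual range.
claims = [1200, 950, 1100, 1300, 1050, 9800, 1150]
scores = robust_z_scores(claims)

# 3.5 is a common rule-of-thumb cut-off, used here as an assumption.
flagged = [c for c, z in zip(claims, scores) if abs(z) > 3.5]
```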
![Page 89: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/89.jpg)
Feature engineering is critical
Understand the data shape, size, behaviours and the processes that generated it
![Page 90: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/90.jpg)
Machine Learning
• Sophisticated statistical techniques, good software dev practices and research-grade, open-source software
• Document and share knowledge to become a technical centre of excellence
• Validate, test, review & maintain your data pipelines, software and models to mitigate risk and allow for audit
![Page 91: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/91.jpg)
Business Integration
![Page 92: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/92.jpg)
Learning from data benefits the whole business
Increase Revenue, Reduce Cost, Manage Risk, Meet Compliance
• tune risk profile
• understand the competition
• optimise business processes
• improve customer retention
• inform & adapt to regulatory change
• demonstrate leadership
• innovate product-market fit
• increase customer base
![Page 93: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/93.jpg)
How to integrate data science into business activities?
![Page 94: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/94.jpg)
Tooling
![Page 95: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/95.jpg)
Open Source
![Page 96: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/96.jpg)
Reproducibility and Documentation
![Page 97: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/97.jpg)
Wider Communication
![Page 98: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/98.jpg)
APIs and Integration
![Page 99: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/99.jpg)
The Team
![Page 100: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/100.jpg)
Data scientist skill set
Drew Conway’s (in)famous Venn Diagram
![Page 101: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/101.jpg)
Not so different from a software development team
![Page 102: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/102.jpg)
Communicate
![Page 103: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/103.jpg)
Iterate
![Page 104: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/104.jpg)
and another thing…
![Page 105: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/105.jpg)
The practice of data science can offer powerful insight and prediction…
![Page 106: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/106.jpg)
… it’s only a model
![Page 107: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/107.jpg)
Business Integration
• Clear path from model inference and predictions to the extrapolation of business actions and impacts
• Communicate results with non-technical stakeholders via engaging dashboards and visualisations
• Integrate an automated, live, on-demand prediction service with business systems
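An on-demand prediction service can be reduced to one function that accepts a JSON request and returns a JSON response. The sketch below shows that core under stated assumptions: `predict_price`, its coefficients and the field names `floor_area_m2` and `bedrooms` are all hypothetical placeholders for a trained model and an agreed request schema.

```python
import json


def predict_price(floor_area_m2: float, bedrooms: int) -> float:
    """Placeholder model: hypothetical coefficients standing in for a trained model."""
    return 50_000.0 + 2_500.0 * floor_area_m2 + 10_000.0 * bedrooms


def handle_request(body: bytes):
    """Turn a JSON request body into (HTTP status, JSON response body)."""
    try:
        payload = json.loads(body)
        price = predict_price(float(payload["floor_area_m2"]), int(payload["bedrooms"]))
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, missing fields or wrong types all map to a client error.
        return 400, json.dumps({"error": "expected floor_area_m2 and bedrooms"}).encode()
    return 200, json.dumps({"predicted_price": price}).encode()


status, response = handle_request(b'{"floor_area_m2": 80, "bedrooms": 3}')
```

Keeping the handler free of any web framework makes it easy to test and audit; it can then be wrapped by whichever HTTP server or API gateway the business systems already use.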
![Page 108: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/108.jpg)
Using a “Data Science” approach:
- Motivations
- A Maturity Model
- An Ecosystem Model
- Practical Examples & Advice
![Page 109: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/109.jpg)
Learning from data benefits the whole business
Increase Revenue, Reduce Cost, Manage Risk, Meet Compliance
• tune risk profile
• understand the competition
• optimise business processes
• improve customer retention
• inform & adapt to regulatory change
• demonstrate leadership
• innovate product-market fit
• increase customer base
![Page 110: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/110.jpg)
![Page 111: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/111.jpg)
![Page 112: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/112.jpg)
Further reading
• Blogs with good technical articles, insights etc.
• http://blog.applied.ai
• http://www.magesblog.com
• https://planet.scipy.org
• http://andrewgelman.com
• http://blog.kaggle.com
• Books / technical articles
• https://www.oreilly.com/ideas/what-is-hardcore-data-science-in-practice
• http://www.oreilly.com/data/free/ten-signs-of-data-science-maturity.csp
• Machine Learning for Hackers: http://shop.oreilly.com/product/0636920018483.do
![Page 113: Demystifying Data Science](https://reader031.fdocuments.in/reader031/viewer/2022011722/586e72f71a28ab99598b52bb/html5/thumbnails/113.jpg)
Thank you
www.applied.ai @applied_ai @jonsedar