Data Science State Of Mind
Transcript of Data Science State Of Mind
A Data Science State Of Mind
Matching Needs With Talents
Vijay Bhat
@vijaysbhat/in/vijaysbhat
About me
13 Years In Tech
Mobile Financial Services
Smart Meter Analytics
Data Science Applications
Forecasting
Fraud Detection
Recommendation Systems
Social Media
Growth Analytics
What I’ll cover
Job trends
Skills your business needs
Dig into some data for insights
Key data scientist mindsets
Wrapup
Skills shortage!
Look closer at the dates
Things haven’t improved in 3 years?
Job Trends - Big Data Postings
Source: Indeed.com
What about Data Science?
Source: Indeed.com
How do the two trends compare?
Source: Indeed.com
Sample Big Data Postings
Hadoop
AWS
Distributed Systems
Stages of Organizational Data Maturity
Descriptive
Diagnostic
Predictive
Prescriptive
Most ‘Big Data’ positions
Data Scientist Skills
● Software Engineering
● Data Engineering
● Statistics
● Machine Learning
● Data Visualization
● Storytelling
Anticipate future needs
What skills does your business need?
Machine Learning
Statistics
Data Engineering
Software Engineering
Data Visualization
Storytelling
Descriptive
Diagnostic
Predictive
Prescriptive
Let’s look at some data
Data Scientist Profiles - LinkedIn Data
Chart junk! We can do better.
Data Scientist Profiles - Minus Chart Junk
Let’s have some fun - Frequent Itemsets
SQL
Hadoop
Statistics
Java
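A minimal sketch of frequent itemset counting over skill lists. The profiles below are hypothetical toy data, not the LinkedIn data from the talk, and the function name, `min_support`, and `max_size` are illustrative assumptions. It enumerates every combination by brute force rather than using Apriori-style pruning, which is fine for a handful of skills per profile.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy profiles (assumption): each set is one person's listed skills.
PROFILES = [
    {"SQL", "Statistics", "R"},
    {"SQL", "Hadoop", "Java"},
    {"SQL", "Statistics", "Python"},
    {"Hadoop", "Java"},
    {"SQL", "Hadoop", "Java", "Statistics"},
]


def frequent_itemsets(profiles, min_support=0.4, max_size=3):
    """Return {itemset: support} for itemsets found in at least min_support of profiles."""
    counts = Counter()
    for skills in profiles:
        # Sort so the same set of skills always yields the same tuple key.
        for size in range(1, max_size + 1):
            for itemset in combinations(sorted(skills), size):
                counts[itemset] += 1
    n = len(profiles)
    return {itemset: c / n for itemset, c in counts.items() if c / n >= min_support}


if __name__ == "__main__":
    result = frequent_itemsets(PROFILES)
    # Print the most common itemsets first.
    for itemset, support in sorted(result.items(), key=lambda kv: -kv[1]):
        print(itemset, round(support, 2))
```

On this toy data, single skills such as SQL dominate, and pairs such as Hadoop with Java surface as frequent together.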
Any signal?
Anomalously high co-occurrence of skill pairs
Any signal?
Anomalously low co-occurrence of skill pairs
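One common way to flag unusually high or low co-occurrence is lift: the observed rate at which two skills appear together, divided by the rate expected if they were independent. The talk does not say which measure it used, so treat lift as an assumption here. The profiles are hypothetical toy data and `pair_lift` is an illustrative name.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy profiles (assumption): each set is one person's listed skills.
PROFILES = [
    {"SQL", "Statistics", "R"},
    {"SQL", "Hadoop", "Java"},
    {"SQL", "Statistics", "Python"},
    {"Hadoop", "Java"},
    {"SQL", "Hadoop", "Java", "Statistics"},
]


def pair_lift(profiles):
    """Return {(a, b): lift} for every pair of skills seen in the profiles.

    lift = P(a and b) / (P(a) * P(b)); above 1 means the pair co-occurs
    more often than independence would predict, below 1 means less often.
    """
    n = len(profiles)
    single = Counter()
    pair = Counter()
    for skills in profiles:
        single.update(skills)
        pair.update(combinations(sorted(skills), 2))
    lifts = {}
    for a, b in combinations(sorted(single), 2):
        expected = (single[a] / n) * (single[b] / n)
        observed = pair[(a, b)] / n
        lifts[(a, b)] = observed / expected
    return lifts


if __name__ == "__main__":
    for skill_pair, lift in sorted(pair_lift(PROFILES).items(), key=lambda kv: -kv[1]):
        print(skill_pair, round(lift, 2))
```

With so few profiles the numbers are noisy; on real data a minimum support threshold or a significance test would be needed before calling a pair anomalous.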
Skills cluster based on career path
Analyst
● R
● Statistics
● SQL
Big data engineer
● Hadoop
● Big data
● Java
Software engineer
● Python
● Algorithms
● SQL
...
Why not expand your search?
There’s a lot of opportunity
52% of data scientists started within the last 4 years
Source: RJ Metrics
But matching talent shouldn’t feel like this
Be creative in combining skills
Key Mindsets
Always Be Automating
A Case For Automation
Automation Example: Web Scraping
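A minimal scraping sketch, assuming a job listing page whose markup looks like the hypothetical `SAMPLE_HTML` below. To stay dependency-free it uses the standard library's `html.parser` rather than BeautifulSoup or Scrapy, and it parses an inline string rather than fetching a live page. The class and function names are illustrative.

```python
from html.parser import HTMLParser

# Hypothetical markup (assumption): real job sites will be structured differently.
SAMPLE_HTML = """
<ul>
  <li class="job"><a href="/jobs/1">Data Scientist</a></li>
  <li class="job"><a href="/jobs/2">Big Data Engineer</a></li>
  <li class="ad"><a href="/promo">Sponsored</a></li>
</ul>
"""


class JobLinkParser(HTMLParser):
    """Collect (title, href) pairs for links inside <li class="job"> items."""

    def __init__(self):
        super().__init__()
        self.jobs = []
        self._in_job = False
        self._href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "job":
            self._in_job = True
        elif tag == "a" and self._in_job:
            self._href = attrs.get("href")

    def handle_data(self, data):
        # Only text that follows a captured link inside a job item is a title.
        if self._href is not None and data.strip():
            self.jobs.append((data.strip(), self._href))
            self._href = None

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_job = False
            self._href = None


def scrape_jobs(html):
    parser = JobLinkParser()
    parser.feed(html)
    return parser.jobs


if __name__ == "__main__":
    for title, href in scrape_jobs(SAMPLE_HTML):
        print(title, href)
```

Once a script like this exists, rerunning it on a schedule replaces a manual copy-and-paste chore.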
Automation Example: ETL
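A minimal extract-transform-load sketch using only the standard library (`csv` and `sqlite3`). The CSV content, table name, and column names are hypothetical; the slide does not describe a specific pipeline, so this only illustrates the three stages.

```python
import csv
import io
import sqlite3

# Hypothetical raw export (assumption): messy titles, stray spaces, one blank title.
RAW_CSV = """title,city,posted
Data Scientist, San Francisco ,2015-03-01
big data engineer,New York,2015-03-02
,Chicago,2015-03-03
"""


def extract(text):
    """Read CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Normalize titles and cities; drop rows that have no title."""
    cleaned = []
    for row in rows:
        title = row["title"].strip().title()
        if not title:
            continue
        cleaned.append((title, row["city"].strip(), row["posted"]))
    return cleaned


def load(rows, conn):
    """Insert cleaned rows into a postings table, creating it if needed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS postings (title TEXT, city TEXT, posted TEXT)"
    )
    conn.executemany("INSERT INTO postings VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(text, conn):
    load(transform(extract(text)), conn)
    return conn.execute(
        "SELECT title, city FROM postings ORDER BY posted"
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_etl(RAW_CSV, connection))
```

Keeping each stage as its own function makes it easy to swap the in-memory database for a real one, or the inline string for a file.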
Key Mindset: Always Be Learning
Personal SWOT Analysis
Strengths Weaknesses
Opportunities Threats
Where should I be focusing my efforts?
Personal SWOT Analysis Example
Strengths
❏ Domain knowledge
❏ Industry relationships
❏ Statistics training
Weaknesses
❏ No formal CS training
❏ Limited professional data science experience
Opportunities
❏ Identify industry trend sweet spots for employer
❏ Present at conferences
Threats
❏ Might hate data janitor work
❏ Oversupply of data scientists
Resources
Sample Study Plan - Beginner
● Scraping
  ○ BeautifulSoup
  ○ Scrapy
● SQL
  ○ SQLAlchemy
● Data Frames
  ○ pandas
● Machine Learning
  ○ Scikit-learn
    ■ Linear Regression
    ■ Naive Bayes
    ■ Random Forests
● Visualization
  ○ matplotlib
Key Mindset: Always Be Skeptical
Ask Fermi questions
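A Fermi question is answered by multiplying a chain of rough guesses to get an order-of-magnitude figure. The sketch below works through the classic "how many piano tuners are in a big city?" example. The slide does not name a question, so the example and every input number are illustrative assumptions, not data.

```python
def estimate_piano_tuners(
    population=9_000_000,         # assumed metro population
    people_per_household=2,       # assumed average household size
    share_with_piano=0.05,        # assumed: 1 household in 20 owns a piano
    tunings_per_piano_per_year=1,
    tunings_per_tuner_per_year=4 * 5 * 50,  # 4 a day, 5 days a week, 50 weeks
):
    """Chain rough guesses into an order-of-magnitude estimate."""
    households = population / people_per_household
    pianos = households * share_with_piano
    tunings_needed = pianos * tunings_per_piano_per_year
    return tunings_needed / tunings_per_tuner_per_year


if __name__ == "__main__":
    print(f"Roughly {round(estimate_piano_tuners())} tuners")
```

The value of the exercise is the sanity check: if a reported figure is off from the estimate by a factor of 100, either the data or one of the assumptions deserves a second look.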
Watch out for data traps
If it’s too good to be true...
● Data Leakage
● p-Value Hacking
● Overfitting
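As one illustration of the p-value hacking trap, the simulation below runs many experiments on a coin that is fair by construction and counts how many still look "significant" at the 0.05 level. Reporting only those would be p-hacking. The function names, experiment sizes, and seed are illustrative assumptions.

```python
import math
import random


def two_sided_p_value(heads, flips):
    """Exact two-sided binomial p-value against a fair coin."""

    def tail(lo, hi):
        # Probability that the number of heads falls in [lo, hi].
        return sum(math.comb(flips, k) for k in range(lo, hi + 1)) / 2 ** flips

    lower = tail(0, heads)
    upper = tail(heads, flips)
    return min(1.0, 2 * min(lower, upper))


def count_false_positives(n_experiments=1000, flips=20, alpha=0.05, seed=0):
    """Count experiments on a fair coin that pass the significance threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        heads = sum(rng.random() < 0.5 for _ in range(flips))
        if two_sided_p_value(heads, flips) < alpha:
            hits += 1
    return hits


if __name__ == "__main__":
    hits = count_false_positives()
    print(f"{hits} of 1000 fair-coin experiments looked significant")
```

Every one of those hits is a false positive, so a result that only appears after many tries deserves the "too good to be true" treatment.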
Key mindsets
Thank You.
@vijaysbhat