"What is Data Science?" High School Version

Post on 27-Jan-2017




What is Data Science?

Renée Teate, March 2016, Harrisonburg High School

Let’s start with: “What is Data?”

http://upload.wikimedia.org/wikipedia/commons/f/f0/DARPA_Big_Data.jpg

Bits

https://encrypted-tbn2.gstatic.com/images?q=tbn:ANd9GcS9dKu3_Tzi-sWW-yAqee5y0EhuvoIZNSya_rAKnuBBd0JYxPX7pw

Numbers

http://fc01.deviantart.net/fs71/i/2012/326/3/4/cute_dog_by_thomasmeadows345-d5lsah9.jpg

Images

http://www.freefoto.com/images/1351/06/1351_06_2---Books--Shakespeare-and-Company-Bookstore--The-Latin-Quarter--Paris_web.jpg

Text
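To a computer, these kinds of data are closely related: text and images are stored as numbers, and numbers are stored as bits. The short Python sketch below illustrates the idea. The word "Hi" and the tiny 2x2 image are made-up examples, not taken from the slides.

```python
# Text is stored as numbers (one or more bytes per character).
text = "Hi"
numbers = list(text.encode("utf-8"))

# Each number is stored as bits (here, 8 bits per byte).
bits = [format(n, "08b") for n in numbers]

# A tiny made-up 2x2 grayscale "image": 0 is black, 255 is white.
image = [
    [0, 255],
    [255, 0],
]

print(numbers)  # the numbers behind the letters
print(bits)     # the bits behind the numbers
print(image)    # an image is just a grid of numbers
```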

Created & Collected

https://c2.staticflickr.com/4/3273/3017878633_65beb1c7d6.jpg

http://upload.wikimedia.org/wikipedia/commons/e/e4/Green_Bank_100m_diameter_Radio_Telescope.jpg

https://c1.staticflickr.com/1/2/1349370_0703fce74c.jpg

http://upload.wikimedia.org/wikipedia/commons/9/96/Bill_Nye,_Barack_Obama_and_Neil_deGrasse_Tyson_selfie_2014.jpg

Analyzed and Visualized

http://upload.wikimedia.org/wikipedia/commons/1/1c/CMS_Higgs-event.jpg

http://upload.wikimedia.org/wikipedia/commons/9/90/Kencf0618FacebookNetwork.jpg

http://upload.wikimedia.org/wikipedia/commons/b/bf/USDA_Hardiness_zone_map.jpg

https://c1.staticflickr.com/3/2300/2596366618_2d6cb01735.jpg

“Big Data”

https://web-assets.domo.com/blog/wp-content/uploads/2014/04/DataNeverSleeps_2.0_v2.jpg

Stored in Databases on Servers in Data Centers

http://pixabay.com/static/uploads/photo/2014/03/13/01/12/datacenter-286386_640.jpg

https://c2.staticflickr.com/2/1296/533233247_b6baa30fdb_z.jpg?zz=1

“The Cloud”

What is a database?

Database [dey-tuh-beys], noun: A comprehensive collection of related data organized for convenient access, generally in a computer.

-dictionary.com

I used a database to look up this definition!

Types of Databases

http://www.oaddo.org

Relational DBMS

Graph Database
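Here is a minimal sketch of the difference between the two types. A relational database keeps data in tables of rows and columns that you query with SQL. A graph database keeps data as things (nodes) and the connections between them (edges). The student names, grades, and friendships are made up for illustration, and the Python dictionary is only a stand-in for a real graph database product.

```python
import sqlite3

# Relational: data lives in a table, and we ask questions with SQL.
conn = sqlite3.connect(":memory:")  # temporary database kept in memory
conn.execute("CREATE TABLE students (name TEXT, grade INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [("Ana", 10), ("Ben", 11), ("Cara", 10)],  # made-up rows
)
tenth_graders = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM students WHERE grade = 10 ORDER BY name"
    )
]
conn.close()

# Graph: data lives in nodes joined by edges (here, made-up friendships).
friends = {
    "Ana": {"Ben", "Cara"},
    "Ben": {"Ana"},
    "Cara": {"Ana", "Dev"},
    "Dev": {"Cara"},
}


def friends_of_friends(graph, person):
    """Return people two steps away who are not already direct friends."""
    result = set()
    for friend in graph[person]:
        result |= graph[friend]
    return result - graph[person] - {person}


print(tenth_graders)
print(sorted(friends_of_friends(friends, "Ben")))
```

Table-style questions ("who is in 10th grade?") fit the relational model naturally, while connection-style questions ("who might I know through a friend?") fit the graph model.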

Databases You Use

https://www.google.com/maps/@38.8905569,-77.1721577,13z/data=!5m1!1e1

http://upload.wikimedia.org/wikipedia/commons/6/69/Netflix_logo.svg

https://c2.staticflickr.com/4/3324/3507973704_563846fe14_z.jpg?zz=1

How is data collected about you used to help you?

Who builds these systems?

Data Scientist

Computer Scientist
• Gathering data
• Writing Code
• Designing Interfaces
• Design / Manage / Query Databases
• Data Mining

Mathematician
• Statistics
• Predictive Analytics
• Data Visualizations
• Evaluating Results

Business Person
• Domain Expertise
• Knowing what questions to ask
• Interpreting results for business decisions
• Presenting outcomes

No one person needs to have all of these skills. More organizations are now building data science teams.

Becoming a Data Scientist Podcast

How have I learned data science?

Some other names for “Data Scientist”

• Statistician
• Data Mining Specialist
• Biostatistician
• Social Science Researcher
• Big Data Analyst
• Spatial/GIS Analyst
• Natural Language Processing Researcher
• Computational Physicist
• Pythonista
• Financial Analyst
• Recommendation System Engineer
• Information Architect
• Artificial Intelligence Researcher
• Neuroscientist
• Data Visualization Designer

Data Science jobs pay an average of $118,000 per year.

It is estimated that by 2018, the US could have a shortage of 140,000+ people with advanced analytical skills and need 1.5M managers/analysts who can make decisions based on data analysis.

Examples

Galaxy Classification from Images
http://benanne.github.io/2014/04/05/galaxy-zoo.html

Choosing Audience for Content Promotion on Facebook
http://citizennet.com/blog/2012/11/10/random-forests-ensembles-and-performance-metrics/

Predicting Seizures
https://www.kaggle.com/c/seizure-detection

March Madness Picks
https://www.kaggle.com/c/march-machine-learning-mania-2015

Facial Recognition (auto-tagging)

http://www.mirrordaily.com/facebook-and-google-develop-amazing-facial-recognition-algorithms/22402/

What other things can facial recognition be used for?

What are the ethical questions about this?

http://xkcd.com/1425/

It’s actually really hard for computers to do things humans consider simple! We train them using “machine learning”.

https://www.linkedin.com/pulse/machine-learning-image-detectioncats-vs-dogs-amrith-kumar
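As a rough sketch of what “training” can mean, the Python example below labels a new animal as a cat or a dog by finding the most similar examples it has already seen (a technique called k-nearest neighbors). The measurements are made up for illustration. Real image-recognition systems learn from the pixels of many photos, not from two hand-picked numbers, and usually use more advanced methods.

```python
import math

# Made-up training examples: (weight in kg, ear length in cm) -> label.
training_data = [
    ((4.0, 6.5), "cat"),
    ((3.5, 6.0), "cat"),
    ((5.0, 7.0), "cat"),
    ((20.0, 12.0), "dog"),
    ((30.0, 14.0), "dog"),
    ((12.0, 10.0), "dog"),
]


def classify(animal, examples, k=3):
    """Label an animal by majority vote of its k most similar examples."""
    # Sort the known examples from most similar (closest) to least similar.
    by_distance = sorted(examples, key=lambda ex: math.dist(animal, ex[0]))
    votes = [label for _, label in by_distance[:k]]
    return max(set(votes), key=votes.count)


print(classify((4.2, 6.8), training_data))    # a small animal
print(classify((25.0, 13.0), training_data))  # a large animal
```

The program is never told a rule like "cats are small". It only compares new animals to labeled examples, so giving it more and better examples is how it improves.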

“Once you start working in robotics, you realize that things that kids learn to do up to age 10 ... are actually the hardest things to get a robot to do.”

-Pieter Abbeel, AI Researcher and Professor at UC Berkeley

https://www.youtube.com/watch?v=gy5g33S0Gzo

AI and Robotics

Different places you can start if you’re interested in data science

Programming
• Any language is good to start with! Just start coding!
• Most common: Python or R
• Database design, SQL

Math
• Statistics
• Linear Algebra

Research and Analysis
• Science involving data collection and interpretation
• Working with “messy” real life data
• Business Analytics
• Data Mining

Others
• Business / Communication
• Graphic Design

Learning Resources

• Doing Data Science by Cathy O’Neil* & Rachel Schutt
• Data Smart by John Foreman* (uses Excel)
• Blogs & News Feeds (FlowingData.com is a good one to start with)
• Podcasts
• Twitter – look for curated lists of people to follow: https://twitter.com/BecomingDataSci/lists/women-in-data-science/members
• Online courses like those on DataCamp, Codecademy, Coursera
• TED talks on Data: http://www.ted.com/search?q=data
• Practice w/public data sets on sites like data.gov
• Volunteer opportunities via DataKind
• Ask me… I have plenty more!

Find me online: @becomingdatasci

“Data Science Renee” on Twitter

BecomingADataScientist.com

DataSciGuide.com

Becoming a Data Scientist Podcast & Learning Club

Questions? Or want a copy of these slides and links?

Renée Teate

renee@becomingadatascientist.com

@becomingdatasci on Twitter