Data Analysis - laulima.hawaii.edu › access › content › user › smosier › ics › 10… ·...

Data Analysis



iClicker Question

I know a lot about analyzing data.

A. Strongly agree
B. Agree
C. Don’t agree or disagree
D. Disagree
E. Strongly disagree


Overview

• What is data analysis?

• Data collection methods

• Big data

• Practical examples

• Analyzing data

• How can you get started?


Workspace 1

What do you think of when you hear the words “analyze data”?


Data analysis

• Data: information collected

– Can be numerical or lexical

– Datum: single piece of information

• Analysis: careful study of information or a close look at information

• Data analysis: Careful study of information collected to create new information


Data collection methods

• Surveys

• Experiments

• Web clicks

• Search engine searches

• Purchases at a store

• Any possible recordable action


Big Data

Very large, complex sets of data that are not easily analyzed with traditional tools.


Big data examples

• Google searches

– Over 3 billion searches per day

– Ensure high ad revenue

• Retail stores

– Predicting sales trends instead of reacting to them

– Identify hot holiday sales items

– Market to people based on geolocation, social media, Web browsing patterns, customer loyalty programs


Practice problems

• What is the difference between data, datum, analysis, and data analysis?

• How can data be collected?

• What is big data?

• How do search engine companies and retail stores use big data?

• Think of other possible examples where big data can be applied.


Data analysis examples

Courtesy of Tony Felix


Car insurance – Thought questions

• Why do some states have variable car insurance rates?

• Which factors affect car insurance rates and why?

– Men

– Women

– Age

– Having children


Car insurance

• High insurance rates are typically due to a higher probability of accidents

– Teenagers and the elderly typically have higher rates

• Individuals in the age range 25-50 typically have lower rates

– Mature and typically drive more safely

• Men typically have higher rates than women

• Women with children typically have the lowest rates due to the safest driving

• Prior tickets received


Supermarkets

• Club cards are used to collect data about customers

– Receive coupons for similar or related items

– May receive a coupon for creamer if you purchase coffee

• Question: What do you think data analysts found regarding their customers on Thursday evenings?


Supermarkets

• There was a correlation for men between beer and diapers.

– Typically purchased beer for the weekend

– Needed diapers for their children

• How did supermarkets respond? How would you respond?

– Move beer to a physical location that is visible from the diapers section

– Increase the cost of diapers or ensure they were not on sale
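
A pattern like the beer-and-diapers correlation can be surfaced by counting how often two items land in the same basket. The sketch below does this on a handful of made-up baskets; the data and the helper name `pair_counts` are illustrative assumptions, not drawn from any real supermarket data set.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) drops repeated items and gives every pair
        # one canonical (alphabetical) order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


# Hypothetical Thursday-evening baskets (invented for illustration).
baskets = [
    ["beer", "diapers", "chips"],
    ["beer", "diapers"],
    ["coffee", "creamer"],
    ["diapers", "beer", "coffee"],
    ["milk", "bread"],
]

counts = pair_counts(baskets)
for pair, n in counts.most_common(3):
    print(pair, n)
```

With real club-card data the same count would run over far more baskets, and an analyst would typically compare each pair's count against how often the two items sell on their own before calling it a correlation.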


Amazon.com and Electronic Arts

Thought question: Who has richer data to mine regarding their customers?


iClicker Question

Who has “richer” data?

A. Amazon.com
B. Electronic Arts


Amazon.com and Electronic Arts

Amazon.com

• Knows what customers purchase

• When customers purchase items

• Frequency at which replacement items are purchased

• Customer reviews of items purchased

• Broad view of customer preferences based on wide product variety

Electronic Arts

• Knows which EA video games customers own

• Collects information regarding actions taken

• Subsequent games purchased

• Links between purchases and actions

Amazon.com does not know what you do with a product after you purchase it.


Practice problems

• What factors affect car insurance rates?

• Why does each factor affect car insurance rates?

• How can supermarkets collect information about customers?

• What can supermarkets do with information they collect regarding customers?

• How do supermarkets use customer information to increase sales?

• What are the differences between the types of data Amazon.com and Electronic Arts collect from their customers?

• Who (Amazon.com or EA) has richer data and why?


Analyzing data


Statistics

• Mean: average

• Mode: most frequent number

• Median: middle number

• Many more (online lecture): variance, standard deviation, kurtosis, skewness, etc.

5 10 1 3 2 3 4 5 10 2


Statistics

• Mean: 5

• Mode: 3

• Median: 3.5

5 10 1 3 2 3 4 5 10 2
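
As a rough check, the three statistics can be computed with Python's standard `statistics` module. This is a minimal sketch that assumes the row of ten values above is the complete data set; `multimode` requires Python 3.8 or later, and the variable names are illustrative.

```python
from statistics import mean, median, multimode

# The ten values exactly as they appear on the slide.
ratings = [5, 10, 1, 3, 2, 3, 4, 5, 10, 2]

ratings_mean = mean(ratings)                # sum of values / number of values
ratings_median = median(ratings)            # middle of the sorted values
ratings_modes = sorted(multimode(ratings))  # every value tied for most frequent

print("mean:", ratings_mean)
print("median:", ratings_median)
print("modes:", ratings_modes)
```

For the values exactly as listed, this gives a mean of 4.5 and a median of 3.5, with 2, 3, 5, and 10 tied as most frequent. The slide's mean of 5 and single mode of 3 therefore presumably reflect rounding or a fuller data set than the row shown here.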

[Chart: # of people (vertical axis) by rating (horizontal axis)]


Bell curve


[Figure: two charts of # of people (vertical axis) by rating (horizontal axis). One chart carries a single label, "mean, mode, median"; the other labels the mode, median, and mean separately.]
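
The difference between the two sets of labels can be reproduced with two small data sets: in a symmetric, bell-shaped set the mean, mode, and median coincide, while a long tail of high ratings pulls the mean above the median and the median above the mode. Both lists below are invented for illustration, and `summarize` is a hypothetical helper.

```python
from statistics import mean, median, mode


def summarize(ratings):
    """Return the three measures of center for a list of ratings."""
    return {
        "mean": mean(ratings),
        "median": median(ratings),
        "mode": mode(ratings),
    }


# Hypothetical symmetric ratings: the counts rise to a peak at 3 and fall again.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]

# Hypothetical skewed ratings: most are low, with a few high values in the tail.
skewed = [1, 2, 2, 2, 3, 3, 4, 8, 11]

print("symmetric:", summarize(symmetric))
print("skewed:   ", summarize(skewed))
```

In the symmetric list all three measures equal 3. In the skewed list the mode is 2, the median is 3, and the mean is 4, which is why a mean reported on its own can overstate what a typical respondent said.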



Net Promoter Score (NPS)

• Customer loyalty metric (Reichheld)

• Calculated based on a single question (0-10 scale): “How likely is it that you would recommend our company/product/service to a friend or colleague?”

• Respondents are classified into 3 groups (promoter, passive, detractor)


iClicker Question

What score is needed to be considered a promoter?

A. 10
B. 9-10
C. 8-10
D. 7-10
E. 6-10


Net Promoter Score (NPS)

• Promoter: 9-10

• Passive: 7-8

• Detractor: 0-6
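
The groupings above translate directly into a small classifier. The sketch below also computes the score itself as the percentage of promoters minus the percentage of detractors, which is the usual NPS definition but is not stated on the slide. The function names and the sample responses are hypothetical.

```python
from collections import Counter


def classify(score):
    """Map a 0-10 answer to its NPS group using the ranges on the slide."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score >= 9:
        return "promoter"
    if score >= 7:
        return "passive"
    return "detractor"


def net_promoter_score(scores):
    """Percentage of promoters minus percentage of detractors (assumed formula)."""
    if not scores:
        raise ValueError("at least one response is required")
    groups = Counter(classify(s) for s in scores)
    return 100 * (groups["promoter"] - groups["detractor"]) / len(scores)


# Hypothetical survey responses, invented for illustration.
responses = [10, 9, 8, 7, 6, 3, 10, 9, 5, 8]

print(Counter(classify(s) for s in responses))
print("NPS:", net_promoter_score(responses))
```

Passives count toward the total number of respondents but do not add to or subtract from the score, so the result always falls between -100 and 100.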


Practice problems

• What are mean, mode, and median?

• How are mean, mode, and median related?

• How can mean, mode, and median be interpreted (or incorrectly interpreted)?

• What is NPS?

• What are the groupings for NPS?

• How can you use the different NPS groups?


How can you get started?

• Look at different types of data and consider how you can analyze them

• Think about how you can summarize data to make sense to others

• Determine different ways to create new information from data you have

• Excel is a powerful tool that can get you started

• We will be covering more details in the next lecture


Online lecture

• Additional statistics

• Data analysis tools

• NPS usage


iClicker Question

I learned a lot about analyzing data.

A. Strongly agree
B. Agree
C. Neutral
D. Disagree
E. Strongly disagree

Please do not pack up yet. We have a short iClicker midterm review on the next few slides.