Correlation

WHICH NUTRITIONAL VALUE AFFECTS THE AMOUNT OF CALORIES IN CANDY THE MOST? OR WHY YOU GET SO MANY CALORIES FROM A CANDY WITH HIGH CONTENTS OF TOTAL FAT

Total Fat

Prepared by: Alexander Voronin, Bonnie Pang, Kayla Mina, Stephanie Heintzman, Vladyslav Akimenko

Image source: Google Images

EXECUTIVE SUMMARY

Calories in candy products are strongly correlated with Total Fat (r ≈ 0.81).

Cholesterol in candies is only weakly correlated with Saturated Fat content (r ≈ 0.47).

TABLE OF CONTENTS

1. Our Data

2. Data Assumptions

Correlation coefficient as method of research

Using SAS to get insights about the data

3. Conclusions about the data

4. How you can reach us

BACKGROUND

Our data set consists of 75 candies.

For each candy, the following information is available: Servings, Weight, Calories, Total Fat, Saturated Fat, Cholesterol, Sodium, Carbohydrates, Fiber, Sugars, Protein, Vitamin A, Vitamin C, Calcium, and Iron.

HERE’S HOW OUR DATA LOOKS IN SAS

WE HYPOTHESIZE THAT…

1. The more Saturated Fat there is, the more Cholesterol there is.

2. Calories are impacted more by Total Fat than by Sugar.

METHOD

We use the correlation coefficient (r) to measure the strength of each relationship.

The analysis is run in the analytical software SAS Enterprise Guide.

HOW DO WE USE THE CORRELATION COEFFICIENT?

It measures the strength and the direction of a linear relationship between two variables.

When the correlation is positive (r > 0), it means that as the value of one variable increases, so does the other.

If a correlation is negative (r < 0), it indicates that when one variable increases, the other variable decreases. This means there is an inverse relationship between the two variables.

[Shen, David. "Computation of Correlation Coefficient and Its Confidence Interval in SAS." Sas.com. Web. ]
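For reference, here is how r is computed, assuming it is the Pearson product-moment coefficient (the usual meaning of r; the slides do not name the variant). The symbols x_i, y_i (the paired values for candy i), n (the number of candies) and the bars (sample means) are notation introduced here, not taken from the slides.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad -1 \le r \le 1
```

The numerator is positive when the two variables tend to sit above or below their means together, and negative when one tends to be high while the other is low. This is what gives r its sign.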

STEP 1

Open the data set.

Select Analyze > Multivariate > Correlations…

STEP 2

The Correlations window opens.

Drag the indicated variables from “Variables to assign” to the Analysis variables role (marked <variable required>) under “Task roles”.

STEP 3

Under the Results tab on the left, check “Create a scatter plot for each correlation pair” and uncheck “Show significance probabilities associated with correlations”.

Finally, run the correlation.
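The steps above produce one correlation value for every pair of selected variables. As a rough illustration of that output outside SAS, here is a minimal Python sketch. It is not the SAS Enterprise Guide task itself, and it assumes a Pearson correlation. The helper names `pearson_r` and `correlation_pairs`, the column names, and the toy values are all made up for illustration and are not taken from the candy data set.

```python
import math
from itertools import combinations


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)


def correlation_pairs(columns):
    """Return {(name_a, name_b): r} for every pair of named columns."""
    return {
        (a, b): pearson_r(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


# Hypothetical toy values, NOT the 75-candy data set from the slides.
toy = {
    "Calories": [100, 150, 200, 250, 300],
    "TotalFat": [2, 4, 7, 9, 12],
    "Sugars": [20, 12, 25, 15, 22],
}

for (a, b), r in correlation_pairs(toy).items():
    print(f"{a} vs {b}: r = {r:.3f}")
```

Each printed line corresponds to one off-diagonal cell of the correlation table that the SAS task displays.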

RESULTS

Each cell in the following data output shows the strength of the relationship between the variables listed in the corresponding row and column.

The closer a number is to 1 (or to -1), the stronger the relationship.

Numbers in the orange box link to hypothesis 1.

Numbers in the red box link to hypothesis 2.

Highest number in the dataset – strongest relationship

CONCLUSIONS

Hypothesis 1 – The more Saturated Fat there is, the more Cholesterol there is.

0.47270 – SatFat & Cholesterol

The correlation between Saturated Fat and Cholesterol is r ≈ 0.47. We treat this as a weak correlation.

This means that a higher Cholesterol level does not reliably indicate a higher Saturated Fat level, and vice versa.

Therefore we reject this hypothesis.

Hypothesis 2 – Calories are impacted more by Total Fat than by Sugar.

0.80707 – Calories & Total Fat

0.41692 – Calories & Sugar

The correlation between Calories and Total Fat (r ≈ 0.81) is about twice as large as the correlation between Calories and Sugar (r ≈ 0.42).

This means that a candy's Calories track its Total Fat level more closely than its Sugar level.

Therefore we accept this hypothesis.
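As a quick check on the arithmetic behind these conclusions, the sketch below takes the three reported correlations and computes the ratio of the two Calories correlations, along with each r squared. The r squared column is an addition for illustration, not part of the original analysis. For a straight-line fit between two variables, it is commonly read as the share of variance in one variable accounted for by the other.

```python
# Correlation values as reported on the conclusions slides.
reported = {
    ("SatFat", "Cholesterol"): 0.47270,
    ("Calories", "Total Fat"): 0.80707,
    ("Calories", "Sugar"): 0.41692,
}

# r squared for each pair (an illustrative addition, see the note above).
r_squared = {pair: r ** 2 for pair, r in reported.items()}

# How many times larger the Total Fat correlation is than the Sugar one.
ratio = reported[("Calories", "Total Fat")] / reported[("Calories", "Sugar")]

for pair, r in reported.items():
    print(f"{pair[0]} & {pair[1]}: r = {r:.5f}, r^2 = {r_squared[pair]:.3f}")
print(f"Total Fat r / Sugar r = {ratio:.2f}")
```

The ratio comes out a little under 2, which is consistent with the "about 2X" statement.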

HOW YOU CAN REACH US

Our email address: [email protected]

Feel free to contact us for data research using other analytical tools and approaches.

Using data analysis, we can find other correlations in your dataset.
