Excel and research

Post on 07-May-2015



USING MICROSOFT EXCEL WITH BUSINESS RESEARCH METHODS

www.drjayeshpatidar.blogspot.com

TITLE BAR

MENU BAR

FORMULA BAR

STANDARD TOOLBAR

FORMATTING TOOLBAR

ACTIVE CELL

PASTE FUNCTION

TOOLS MENU

The Paste Function Provides Numerous Statistical Operations

The Statistical Function Category

Data Analysis Dialog Box

• Click on “Tools”

• Select “Data Analysis”

• Select a statistical operation, such as Histogram

Functions

• Functions are predefined formulas for mathematical operations

• They perform calculations by using specific values, called arguments

• Arguments indicate data or a range of cells

• Arguments are supplied in a particular order, called the syntax

• For example, the SUM function adds values or ranges of cells

Easy to Use Paste Functions

• AVERAGE (MEAN)

• MEDIAN

• MODE

• SUM

• STANDARD DEVIATION

Functions

• The syntax of a function begins with the function name, followed by:

– an opening parenthesis

– the arguments for the function, separated by commas

– a closing parenthesis

• If the function starts a formula, an equal sign (=) is typed before the function name.

The Equal Sign, Then the Function Name and Arguments

• =FUNCTION(Argument1)

• =FUNCTION(Argument1, Argument2)

Arguments

• Typical arguments are numbers, text, arrays, and cell references.

• Arguments can also be constants, formulas, or other functions.

The AVERAGE Function Located in the Statistical Category

Data Array

• The data appear in cells A2 through A14

• Written as the range A2:A14

• Sometimes written with dollar signs (absolute references): $A$2:$A$14

Sum, Average, and Standard Deviation

• =FUNCTION(Argument1)

• =SUM(A2:A9)

• =AVERAGE(A2:A9)

• =STDEVA(A2:A9)
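As a rough cross-check of what these three formulas compute, the sketch below mirrors them in Python using the standard `statistics` module. The `calls` list is hypothetical data standing in for cells A2:A9, and the comparison with STDEVA assumes every cell holds a number (STDEVA also counts text and logical values, which this sketch does not model).

```python
import statistics

# Hypothetical sales-call counts standing in for cells A2:A9.
calls = [4, 8, 6, 5, 3, 7, 9, 6]

total = sum(calls)                  # comparable to =SUM(A2:A9)
average = statistics.mean(calls)    # comparable to =AVERAGE(A2:A9)
# Sample standard deviation (n - 1 divisor); comparable to
# =STDEVA(A2:A9) when every cell is numeric.
spread = statistics.stdev(calls)

print(total, average, spread)
```

Each Python call takes the whole list as its single argument, just as each Excel function above takes one range argument.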

SUM Function: Sales Call Example

AVERAGE (Mean) Function: Sales Call Example

Standard Deviation Function: Sales Call Example

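Variance $s^2$: (algebraic, scalable computation)

Standard deviation $s$ is the square root of variance $s^2$

$$
s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2
    = \frac{1}{n-1}\left[\sum_{i=1}^{n}x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n}x_i\right)^2\right]
$$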

• Variance

• Standard deviation: the square root of the variance

– Measures spread about the mean

– It is zero if and only if all the values are equal

– Both the standard deviation and the variance are algebraic measures
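To illustrate why the variance is called algebraic and scalable, the sketch below computes it two ways: directly from squared deviations about the mean, and from three running totals (the count, the sum, and the sum of squares) that can be gathered separately for each part of a data set and then added together. The function names and the sample data are hypothetical and chosen only for this illustration.

```python
import math

def variance_two_pass(xs):
    """Sample variance from squared deviations about the mean."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def partial_sums(xs):
    """Count, sum, and sum of squares for one chunk of the data."""
    return len(xs), sum(xs), sum(x * x for x in xs)

def variance_from_sums(n, sum_x, sum_x2):
    """Sample variance from the three totals alone."""
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)

data = [4, 8, 6, 5, 3, 7, 9, 6]          # hypothetical values
part_a, part_b = data[:3], data[3:]      # pretend the data is stored in two places

na, sa, qa = partial_sums(part_a)
nb, sb, qb = partial_sums(part_b)
merged = variance_from_sums(na + nb, sa + sb, qa + qb)

direct = variance_two_pass(data)
std_dev = math.sqrt(direct)
print(direct, merged, std_dev)
```

The totals form is convenient for large or partitioned data, but with floating-point values it can lose precision when the mean is large relative to the spread, so the two-pass form is the safer default on a single small list.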


Data Dispersion Characteristics

• Motivation

– To better understand the data: central tendency, variation and spread

• Data dispersion characteristics

– median, max, min, quantiles, outliers, variance, etc.

• Numerical dimensions correspond to sorted intervals

– Data dispersion: analyzed with multiple granularities of precision

– Boxplot or quantile analysis on sorted intervals

• Dispersion analysis on computed measures

– Folding measures into numerical dimensions

– Boxplot or quantile analysis on the transformed cube


Measuring the Central Tendency

• Mean

$$
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
$$

– Weighted arithmetic mean

$$
\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}
$$

• Median: A holistic measure

– Middle value if odd number of values, or average of the middle two values otherwise

– Estimated by interpolation (for grouped data)

• Mode

– Value that occurs most frequently in the data

– Unimodal, bimodal, trimodal

– Empirical formula:

$$
\text{mean} - \text{mode} \approx 3 \times (\text{mean} - \text{median})
$$
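The measures on this slide can be sketched in a few lines of Python with the standard `statistics` module. The values and weights below are hypothetical; the weighted mean is written out by hand so that it matches the formula (sum of weight times value, divided by the sum of the weights).

```python
import statistics

values = [2, 3, 3, 4, 8]     # hypothetical observations
weights = [1, 2, 2, 1, 4]    # hypothetical weights, one per observation

mean = statistics.mean(values)

# Weighted arithmetic mean: sum(w_i * x_i) / sum(w_i)
weighted_mean = sum(w * x for w, x in zip(weights, values)) / sum(weights)

median_odd = statistics.median(values)          # odd count: the middle value
median_even = statistics.median([2, 3, 4, 8])   # even count: average of the middle two

mode = statistics.mode(values)                       # most frequent value
modes = statistics.multimode([1, 1, 2, 2, 3])        # bimodal data has two modes

print(mean, weighted_mean, median_odd, median_even, mode, modes)
```

In Excel the comparable functions are AVERAGE, MEDIAN, and MODE; a weighted mean is usually built from SUMPRODUCT divided by SUM.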


Measuring the Dispersion of Data

• Quartiles, outliers and boxplots

– Quartiles: Q1 (25th percentile), Q3 (75th percentile)

– Inter-quartile range: IQR = Q3 – Q1

– Five number summary: min, Q1, M, Q3, max

– Boxplot: ends of the box are the quartiles, the median is marked, whiskers are added, and outliers are plotted individually

– Outlier: usually, a value more than 1.5 x IQR above Q3 or below Q1

• Variance and standard deviation

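– Variance $s^2$: (algebraic, scalable computation)

– Standard deviation $s$ is the square root of variance $s^2$

$$
s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2
    = \frac{1}{n-1}\left[\sum_{i=1}^{n}x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n}x_i\right)^2\right]
$$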


Boxplot Analysis

• Five-number summary of a distribution:

Minimum, Q1, M, Q3, Maximum

• Boxplot

– Data is represented with a box

– The ends of the box are at the first and third quartiles, i.e., the height of the box is the IQR

– The median is marked by a line within the box

– Whiskers: two lines outside the box extend to Minimum and Maximum
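The numbers behind a boxplot can be computed directly. The sketch below derives the five-number summary, the IQR, and the values flagged by the 1.5 x IQR rule. The data is hypothetical, and quartile conventions differ between tools; this example assumes the `inclusive` method of Python's `statistics.quantiles`, so other software may report slightly different quartiles for the same data.

```python
import statistics

# Hypothetical observations; 40 is deliberately far from the rest.
data = [1, 3, 5, 7, 9, 11, 13, 15, 40]

# Three cut points that split the sorted data into four parts.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

five_number = (min(data), q1, median, q3, max(data))

# 1.5 x IQR rule: flag values beyond the fences around the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(five_number, iqr, outliers)
```

When outliers are plotted individually, the whiskers are typically drawn only as far as the most extreme values inside the fences rather than to the minimum and maximum.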


A Boxplot



Visualization of Data Dispersion: Boxplot Analysis


Mining Descriptive Statistical Measures in Large Databases

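• Variance

$$
s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2
    = \frac{1}{n-1}\left[\sum_{i=1}^{n}x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n}x_i\right)^2\right]
$$

• Standard deviation: the square root of the variance

– Measures spread about the mean

– It is zero if and only if all the values are equal

– Both the standard deviation and the variance are algebraic measures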


Histogram Analysis

• Graph displays of basic statistical class descriptions

– Frequency histograms

• A univariate graphical method

• Consists of a set of rectangles that reflect the counts or frequencies of the classes present in the given data


Quantile Plot

• Displays all of the data (allowing the user to assess both the overall behavior and unusual occurrences)

• Plots quantile information

– For data $x_i$ sorted in increasing order, $f_i$ indicates that approximately $100 f_i\%$ of the data are below or equal to the value $x_i$
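The points of a quantile plot can be generated as in the sketch below. The slide does not say how the fraction for each value is computed; this example assumes one common convention, f_i = (i - 0.5) / n for the i-th smallest of n values, and both the function name and the data are hypothetical.

```python
def quantile_plot_points(values):
    """Pair each sorted value x_i with its fraction f_i = (i - 0.5) / n."""
    ordered = sorted(values)
    n = len(ordered)
    return [((i - 0.5) / n, x) for i, x in enumerate(ordered, start=1)]

# Hypothetical unsorted observations.
points = quantile_plot_points([30, 10, 50, 20, 40])

for f, x in points:
    print(f"about {100 * f:.0f}% of the data are at or below {x}")
```

Plotting f on the horizontal axis against x on the vertical axis gives the quantile plot.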


Quantile-Quantile (Q-Q) Plot

• Graphs the quantiles of one univariate distribution against the corresponding quantiles of another

• Allows the user to view whether there is a shift in going from one distribution to another
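A minimal sketch of the pairing behind a Q-Q plot follows. It assumes the two samples have the same number of observations, so that the sorted values can be matched one to one; samples of different sizes would need interpolated quantiles, which this example does not attempt. The function name and the data are hypothetical.

```python
def qq_pairs(sample_a, sample_b):
    """Match the k-th smallest value of one sample with the k-th smallest of the other."""
    if len(sample_a) != len(sample_b):
        raise ValueError("this sketch assumes samples of equal size")
    return list(zip(sorted(sample_a), sorted(sample_b)))

# Hypothetical measurements from two groups.
group_a = [3, 1, 2, 4]
group_b = [6, 4, 5, 7]

pairs = qq_pairs(group_a, group_b)
shifts = [b - a for a, b in pairs]

print(pairs)
print(shifts)
```

Here every pair differs by the same amount, which on the plot appears as points lying on a line parallel to the diagonal: the second distribution is shifted relative to the first.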


Scatter plot

• Provides a first look at bivariate data to see clusters of points, outliers, etc.

• Each pair of values is treated as a pair of coordinates and plotted as points in the plane


Loess Curve

• Adds a smooth curve to a scatter plot in order to provide better perception of the pattern of dependence

• Loess curve is fitted by setting two parameters: a smoothing parameter, and the degree of the polynomials that are fitted by the regression


Graphic Displays of Basic Statistical Descriptions

• Histogram: (shown before)

• Boxplot: (covered before)

• Quantile plot: each value $x_i$ is paired with $f_i$, indicating that approximately $100 f_i\%$ of the data are $\le x_i$

• Quantile-quantile (q-q) plot: graphs the quantiles of one univariate distribution against the corresponding quantiles of another

• Scatter plot: each pair of values is a pair of coordinates and plotted as points in the plane

• Loess (local regression) curve: adds a smooth curve to a scatter plot to provide better perception of the pattern of dependence

Proportion

• =COUNT

• =COUNTIF

• DIVIDE COUNTIF BY COUNT

• =D3/D2
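The same proportion can be sketched in Python. The `responses` list is hypothetical coded survey data; counting the entries plays the role of COUNT, and counting the entries equal to 1 plays the role of COUNTIF with a criterion of 1.

```python
# Hypothetical coded survey answers (for example 1 = yes, 2 = no, 3 = undecided).
responses = [1, 2, 1, 1, 3, 2, 1, 2]

count_all = len(responses)                            # comparable to =COUNT(range)
count_ones = sum(1 for r in responses if r == 1)      # comparable to =COUNTIF(range, 1)

proportion = count_ones / count_all                   # COUNTIF divided by COUNT

print(count_all, count_ones, proportion)
```

Note that Excel's COUNT counts only numeric cells, so the two agree here because every response is a number.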

Frequency Distributions

• There are alternative ways of constructing frequency distributions

• COUNTIF function

• Histogram tool (under Data Analysis)

=COUNTIF(A6:A134,1)

=D4/D9*100
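Repeating the COUNTIF step for every category produces a full frequency distribution. The sketch below builds one with `collections.Counter` and converts each count to a percentage, as the second formula does. The `responses` data is hypothetical.

```python
from collections import Counter

# Hypothetical coded survey answers.
responses = [1, 2, 1, 1, 3, 2, 1, 2]

counts = Counter(responses)      # one count per distinct code
total = len(responses)

# One row per category: (code, frequency, percent of all responses)
table = [(code, counts[code], counts[code] / total * 100)
         for code in sorted(counts)]

for code, freq, pct in table:
    print(f"{code}: {freq} ({pct:.1f}%)")
```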

Histogram Function

• Tools - Data Analysis - Histogram

• Bins: the bins are the frequency categories
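To make the role of the bins concrete, the sketch below counts values into bins in Python. It assumes each bin value is an upper limit, so a value is counted in the first bin whose limit it does not exceed, and anything above the last limit falls into a final overflow category. That reflects the usual behavior of the Histogram tool, but treat it as an assumption and check it against your own output. The function name and data are hypothetical.

```python
from bisect import bisect_left

def histogram_counts(values, bins):
    """Count values per bin; bins must be sorted upper limits.

    The returned list has one extra entry at the end for values
    greater than the last bin limit.
    """
    counts = [0] * (len(bins) + 1)
    for v in values:
        # Index of the first bin limit that is >= v.
        counts[bisect_left(bins, v)] += 1
    return counts

values = [2, 5, 5, 7, 10, 12, 15, 21]   # hypothetical input range
bins = [5, 10, 15]                      # hypothetical bin range

counts = histogram_counts(values, bins)
print(counts)
```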

Insert Input and Bin Ranges

Text Labels Can Be Included or Excluded From Input Range

The Chart Wizard

The Descriptive Statistics Function

SEVERAL ROWS OF DATA ARE HIDDEN


Correlation

Correlation Coefficient, r = .75
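For readers who want to see what a correlation coefficient measures, the sketch below computes Pearson's r by hand. The paired data is hypothetical and is not the data behind the r = .75 reported on the slide; in Excel the CORREL function returns the same kind of coefficient.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired observations, e.g. sales calls and resulting sales.
calls = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

r = pearson_r(calls, sales)
print(round(r, 3))
```

Values of r near +1 or -1 indicate a strong linear relationship; values near 0 indicate a weak one.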

Regression Analysis
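A simple linear regression fits a straight line through paired data by least squares. The sketch below computes the slope and intercept by hand; the data and function name are hypothetical, and in Excel the SLOPE and INTERCEPT functions or the Regression tool under Data Analysis serve the same purpose.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical paired observations, e.g. sales calls and resulting sales.
calls = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

slope, intercept = fit_line(calls, sales)
predicted = intercept + slope * 6    # predicted sales for 6 calls

print(slope, intercept, predicted)
```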