Johnson Statistics 6e TIManual

TI-84 Plus

GRAPHING CALCULATOR MANUAL

James A. Condor

Manatee Community College

Deanna L. Voehl

Indian River State College

to accompany

Statistics: Principles and Methods

Sixth Edition

by

?? Johnson ?? State University

2

JOHN WILEY & SONS, INC.

3

Contents

Preface

1 Introduction

2 Organization and Description of Data

3 Descriptive Study of Bivariate Data

4 Probability

5 Probability Distributions

6 The Normal Distribution

7 Variation in Repeated Samples - Sampling Distributions

8 Drawing Inferences from Large Samples

9 Small Sample Inferences for Normal Populations

10 Comparing Two Treatments

11 Regression Analysis I; Simple Linear Regression

12 Regression Analysis II; Multiple Linear Regression and Other Topics

4

Preface

Statistics is an important field of study, now more so than ever. We are surrounded by statistical

information in work and in our everyday lives. Many of us work in professions that require us to

understand statistical summaries, and some of us work in areas that require us to produce

statistical information.

The art of teaching statistics has changed dramatically in recent years, with computational

software eliminating the need for many of the previously taught techniques. The answers to

many of the complex computations come easier and faster to students with today’s calculators

performing most of the work in elementary statistical analysis. The challenge to the instructor is

to get the student to acquire a greater understanding of what (s)he is doing, with the calculator

tending to the details of the computations.

The TI-84 Plus calculator, by Texas Instruments, is a leading example of the progress in

statistical technology. Texas Instruments has provided us with an advanced device at an

affordable price that is capable of powerful statistical work and yet is still easy to use. This text

will run through the statistical capabilities of the TI-84 Plus calculator.

This text will follow the order of topics presented in Johnson’s Statistics, sixth edition, published

by John Wiley & Sons, Inc., but should also prove useful with other texts. It will not explain the

underlying statistics but instead focus on how best to use the TI-84 Plus calculator in computing

them.

Chapter

1 Introduction

Use of Technology

Statistics is a field that deals with sets of data. After the data is collected, it needs to be organized

and interpreted. There is a limit to how much of the work can be done effectively without the

help of some type of technology. The use of technology, such as a calculator with enhanced

statistical functions, can take care of most of the details of our work so that we can spend more

time focusing on what we are doing and how to interpret the results.

Technology can help us not only to store and manipulate data, but also to visualize what the data

is trying to tell us. As we work with a calculator, we will be able to:

• Enter, revise, and store data.

• Perform statistical computations on stored data or entered statistics.

• Draw pictures, based on the data, to help us to understand what useful information can be

inferred from that data.

Advantages of Using a Calculator

There are many good statistical software packages available, such as MINITAB, SAS, and

SPSS. Excel also contains many statistical built-in functions, as well as supporting plug-in’s for

statistical work. Still, for the student starting to learn statistics, it’s hard to beat the advantages of

using a powerful hand-held calculator.

• It is portable and easy to use in many different work environments.

• It has battery power that lasts far longer than that of a laptop computer.

• It is less expensive than a computer.

• It is less expensive than a statistical software package.

2

Advantages to Using the TI-84 Plus

This calculator manual will focus on how to get the most out of using the TI-84 Plus calculator

by Texas Instruments. The TI-83 was first released in 1996, improving upon its predecessors (the

TI-81 and TI-82) with the addition of many advanced statistical and financial functions. The TI-

83 Plus and the TI-84 Plus have essentially the same features as the TI-83, but with increased

memory capacity and a few extra statistical features. They are powerful calculators with

advanced functions, but at the same time easy to use.

• Most complicated statistical computations are handled through menus which prompt

you for the necessary input.

• Data entry and revision is handled through a Statistical List Editor that is similar to a

spreadsheet in how it is used.

• Statistical graphs are handled through menus and important parts of the graph can be

read by tracing along with the arrow keys.

• The calculators are built sturdily, and can withstand many falls off of student desks.

Entering and Revising Data

This chapter focuses on getting numbers into your calculator and storing them for the

organization, interpretation, and analysis part of statistics. When you are not given the necessary

statistics to perform calculations, you will need to enter data into the calculator to generate the

statistics. We will learn how to do statistical calculations with the calculator in future chapters.

Using the Statistical List Editor

The Statistical List Editor in the TI-84 Plus calculator provides a convenient way to enter

numbers and review them. Numbers from a data set can be stored in a list in the calculator so

that we can keep numbers that are related to each other together.

Example: Calories Consumed

An individual is modifying eating habits and has kept track of calories consumed for the last 10

days, as follows:

1474, 1633, 1686, 1748, 1326, 1112, 1245, 1539, 1220, 1561

If we want to do any sort of analysis on these numbers, we will need to get them into the

calculator and keep them together as a group.

3

Accessing the Statistical List Editor

Press the STAT key.

To input data or to make changes to an existing set of data values use the Edit function. (number

one under the EDIT list).

Press the number 1 key or

press the ENTER key if 1: is highlighted.

Resetting the Statistical List Editor

If the Statistical List Editor does not show the columns labeled as L1, L2, and L3, you can reset

the Editor to its default settings by selecting SetUpEditor as follows:

Press the STAT key.

Press the 5 key.

Press the ENTER key.

Once the Editor is set up, return to the Edit function.

Entering Data in the Statistical List Editor

Type in the ten calorie counts under the column labeled L1. Press the ENTER key when you are

done with one number and ready to move on to the next number.

4

Type in 1474 and press the ENTER key.



…continue until all the data values have been entered.

Use the up and down arrow keys to go back and forth between the numbers. Try changing the

value of one of the entries by typing in a new calorie count.

Clearing a List of Data Values

After a list of data values is no longer needed, you can delete the values by using one of the

following methods:

You can highlight each data value and use the DEL key. This method is slow and clears

the list one data value at a time.

You can highlight the list name, for example L1 at the top of the column, and press the

CLEAR key and then press the ENTER key.

You can go to the EDIT menu and press the number 4 key to clear list.

Press the STAT key.

Press the number 4 key.

Press the 2nd

key and then press the 1 key to get L1.


Entering Lists Directly to the Statistical Editor List

The home screen is where you do most of your calculator work that doesn’t involve menus.

Wherever you are on your calculator, you can always get back to the home screen by pressing

the 2nd

key and then the MODE key to access the QUIT function. From the home screen, you

can enter data into a list by typing it between a set of braces { and }, and separating the numbers

by commas:

{1474, 1633, 1686, 1748, 1326, 1112, 1245, 1539, 1220, 1561}

5

Once you’ve typed the numbers in, you will want to save them.

Use the store button STO► followed by L1, L2 or any other list.

( L1 – L6 are above the 1–6 keys, using the 2nd

key.) When you

press the STO► key, the screen will display an arrow going to

the right.

Once you’ve entered a list from the Statistical List Editor, you

can see the list by typing its name. For example, if you stored the

calories in L1, typing L1 (2nd

key then 1 key) on the home screen will display the list’s contents.

(You will need to use the left and right arrow keys to see all of the list’s contents.)

Entering Lists Directly to a Name

You can also store data to a name that you create. From the home screen, you can save a list to a

name by using the STO► key.

Moving the Named List to the Statistical List Editor

The data is stored in the name CAL, but it is not stored as a list. In

order to access CAL from the home screen, a list must be created

as follows:

Press the STAT key.


Press the ▲ key to highlight the name at the

top of one of the columns.

Press the 2nd

key and then the DEL key to get to

the INS (insert) function.

Type in the name of your new list - CAL

(note that the A-LOCK is turned on so you do not need to press

the ALPHA key before each letter),


The numbers that you stored in CAL should now appear in the list.

Create a Named List Within the Statistical List Editor

The lists L1 through L6 are good places to work with data if you do not need to save the data. If

you may need the data later and do not want to accidentally over-write it, you can give the data a

name. A list can be named with 1–5 characters. The first character must be a letter A - Z or the

angle symbol θ “theta”. The other characters can be a letter, a θ, or a number 0 - 9.

To get letters from the keyboard, press the ALPHA key before each letter. The letters appear

above and to the right of most of the keys. If you are typing several letters in a row, press A-

LOCK (above the ALPHA key), type the letters, and then press the ALPHA key again to release

the lock.

6

To create a new list named BURN within the Statistical List Editor,

Press the STAT key.


Press the ▲ key to highlight the name at the

top of one of the columns.

Press the 2nd

key and then the DEL key to get to

the INS (insert) function.

Type in the name of your new list

(note that the A-LOCK is already turned on for you, so

you don’t need to press the ALPHA key before each letter),


BURN should now appear at the top of a column.

The individual also kept track of the number of calories burned by

exercising for each of the last 10 days. Enter the following data into

the newly named list, BURN. {128, 37, 440, 128, 258, 486, 325, 171, 0, 529}

Getting the Names of Lists

Some of the calculator commands require that you type in the name

of a list.

If the name of the list is one of L1 – L6, then you can type it quickly

from the keyboard above the 1 – 6 keys.

If the list has a specific name, you cannot just type the name of a list

from the keyboard using the ALPHA key. List names on the TI-84

Plus calculator are distinguished from the names of other variables by

a small L to the left of the name. Select LIST (2nd

STAT) and use the arrow keys to choose one

of the list names. Then press the ENTER key.

Choosing Lists to Edit

There are several ways to control

which lists are displayed in the

Statistical List Editor. Press the

STAT key and then the number 5

key to get to SetUpEditor and

type in the desired list of names,

separated by commas. The

example shown will configure the Statistical List Editor to display

only lists CAL and BURN. Lists L1 – L6 will not be displayed. To get back to the default list of

names, L1 – L6, use command SetUpEditor without any list names.

7

Remove a Named List Within the Statistical List Editor

In the Statistical List Editor, use the arrow keys to move to the name of the list to be removed.

Press the DEL key for delete. The list disappears, but the contents of the list have not been

deleted. To erase the contents of a list, highlight the name of the list and press the CLEAR

button and then the ENTER key. This will leave the list name in the editor and clear its entries.

Deleting Lists

If you store many lists, programs, etc. on your calculator, you may run out of memory. Go to the

MEM menu (above the + key) and select 2:Mem Mgmt/Del.. Select 4: List to see all of the

current lists. Move the cursor to the list that you want to delete and press the DEL key, then

choose YES.

Numeric Operations on Lists

The TI-84 can perform various mathematical operations on data that are stored in lists. Enter the

following data in to Lists L1 and L2.

m 2 5 7 10 14 17 20 25 30

f 4 2 9 12 10 7 8 3 1

Press: STAT > 1:Edit > ENTER

Enter the m values in to List L1.

Enter the f values in to List L2.

Finding the sum of a List

The TI-84 can quickly calculate the sum of a List.

Press: 2nd

> MODE (QUIT) to return to the Home

Screen.

To find Σm, find the sum of List L1.

Press: 2nd

> STAT (LIST)

Use the right arrow key to move the cursor to MATH.

Press: 5:sum(

Press: 2nd

> 1(L1) > ) and press ENTER.

Repeat the above steps, using L2, to find Σf.

8

Find Σmf by creating a new List from an existing List.

The TI-84 can create a new List from one or more existing Lists.

One way to calculate Σmf is to create a new list by multiplying the corresponding

values of m and f.

Press: STAT > 1:Edit and press ENTER.

Use the right arrow key to move the cursor to the L3

column.

Use the up arrow key to move the cursor to the

column header, L3.

Press: 2nd

> 1(L1) > * > 2nd

2(L2) and press ENTER.

List L3 now contains the product, mf.

To calculate Σmf, use the Sum command as follows:

Press: 2nd

> STAT (LIST) > MATH > 5:sum( and

press ENTER.

Press: 2nd

> 1(L3) > ) and press ENTER.

Find Σmf by using the Sum( command and mathematical operations.

An alternative to creating a new List, is to perform the

multiplication of m and f within the Sum( command

itself.

At the HomeScreen,

Press: 2nd

> STAT > MATH >5:sum( and press

ENTER.

Press: 2nd

> 1 > * > 2nd

> 2 > ) and press ENTER.

9

Find Σf2 by using the Sum( command and mathematical operations.

In a similar fashion:

Press: 2nd

> STAT > MATH > 5:sum( and press

ENTER.

Press: 2nd

> 2 > x2 > ) and press ENTER.

Chapter

2 Organization and

Description of Data

One of the tasks of a statistician is to try to make sense of the data by organizing it. In today’s

world of technology, you are presented with tables and graphical displays of data on a daily

basis. This chapter focuses on ways of organizing data on the TI-84 Plus calculator.

Frequency Distributions

One way to organize data is to group similar values together. We can then count the number of

elements in each group.

To help us see how the values are distributed, we will split them into intervals or classes, all of

the same width, and count how many are in each class. Listing the classes and their frequencies

will give us the information needed to create a frequency distribution. The calculator does not

create a frequency distribution table, but it will construct a frequency histogram which will

display the information needed to create one.

Example: Hours of Sleep

Example 5, page 32, gives the number of hours of sleep the

previous night for 59 students at a large Midwest university.

Enter the hours of sleep into a list labeled HOURS.

Create a Frequency Histogram

The data is grouped in to 5 classes beginning with 4.3 and

ending with 10.3. The class width is 1.2.

Press the WINDOW key.

Set Xmin to 4.3 and Xmax to 10.3.

Set Xscl to our class width 1.2.

Graphing Calculator Manual 2

Set Ymin to 0 and Ymax to 25.

Now that we have told the calculator how to organize the data into classes, we are ready to set up

the frequency histogram.

Press the 2nd

key.

Press the Y= key to get to STAT PLOT.


With the flashing cursor over On,

Press the ENTER key to turn the plot On.

Use the arrow keys to select the histogram.

Enter the name of the list HOURS for Xlist (use the LIST

menu).

Leave Freq at 1, since each data value represents only one point.

Now we are ready to draw the frequency histogram.

Press the GRAPH key.

Press the TRACE key.

Use the arrow keys to move from one bar to the next. The screen

will display the frequency in each class (n = 20) and the range of values in each class (min = 6.7;

max < 7.9).

The histogram provides us with the following information.

_Class_ Frequency

4.3 - 5.5 5

5.5 – 6.7 15

6.7 – 7.9 20

7.9 – 9.1 16

9.1 – 10.3 3

Summary of How to Create a Histogram.

1. Enter the data into a list.

2. Determine the range of values for your data, as well as your desired class width.

3. Press the WINDOW key and set Xmin, Xmax, Xscl to the range of values and class

width. Set Ymin to 0 and Ymax to a value large enough for the tallest box in the

histogram. (You may need to revise this.)

Chapter 2: Organization and Description of Data 3

4. Press the 2nd

key and then the Y= key to get to STAT PLOT. Press 4: PlotsOff and

press ENTER. This will turn off all of the plots.

5. Return to STAT PLOT and select a plot. Turn the plot On by pressing the ENTER key

and select the third figure in the first row (the Histogram). Enter the name of your data

list in Xlist, and leave Freq as 1.

6. Press the GRAPH key. Press the TRACE key to display the information needed to

create the Frequency Distribution Table.

Enter the Frequency Distribution Table into the Statistical Editor

Since the class intervals cannot be entered into a List, the midpoint of each class is used. Such a

frequency distribution for the example of the number of hours slept would look as follows:

Create two new lists, MIDPT and FREQ. Enter the above table in to these lists, as shown.

Create a Polygon

The steps to create a Polygon are very similar to those needed to create a histogram. We will use

the data stored under the labels MIDPT and FREQ.

Press the 2nd

key and then the Y= key to get to STATPLOT.


Turn Plot1 On.

Highlight the xyline in Type (2nd

item in the 1st line).

Type MIDPT for the Xlist:

Type FREQ for the Ylist:

Select the square in Mark:

The same WINDOW that was used for the histogram is applicable to the Polygon. Another

option is to let the calculator determine the correct window by using the Zoom feature, as

follows:

Press the Zoom Key.

Press the number 9 key to

choose ZoomStat.

_Midpoint_ Frequency

4.9 5

6.1 15

7.3 20

8.5 16

9.7 3


Press the GRAPH Key.

Press the TRACE Key to see the data points.

Create a Relative Frequency Column

A Relative Frequency Column can be generated from the Frequency Column.

Create a new list named RELFR. With the cursor still highlighting

the name RELFR,

Press 2nd

STAT (LIST) and select the List named FREQ.

Press the ÷ Key.

Press 2nd

STAT (LIST), move the cursor to MATH.

Select 5:sum( and press ENTER.

Press 2nd


Press the ) Key and press ENTER.

The above sequence of commands calculates each value in the

RELFR column by dividing the corresponding value in the FREQ

column by the sum of all of the values in the FREQ column.

Create a Percentage Column

A Percentage Column can be generated from the Relative

Frequency Column.

Create a new list named PERC. With the cursor still highlighting the

name PERC,

Press 2nd

STAT (LIST) and select the List named RELFR.

Press the × (Multiplication) Key.

Type in 100 and press ENTER.


PERC column by multiplying the corresponding value in the

RELFR column by 100.

Create a Cumulative Frequency Column

A Cumulative Frequency Column can be generated from the Frequency Column.

Chapter 2: Organization and Description of Data 5

Create a new list named CUMFR. With the cursor still highlighting the name CUMFR,

Press 2nd

STAT (LIST) and move the cursor to OPS

Select 6:cumSum( and press ENTER.

Press 2nd


Press the ) Key and press ENTER.


CUMFR column by adding the corresponding value in the FREQ

column to the sum of the previous values.

The table at the right was displayed by using the SetUpEditor to

display just these 3 Lists.

Create an Ogive For The Cumulative Frequency

The steps to create an Ogive are very similar to those needed to create a Polygon. We will begin

by using the data stored under the labels MIDPT, FREQ, and CUMFR.

The Ogive uses the first lower and all of the upper boundaries rather than the midpoint. Thus it

does require an additional data value at the beginning of the list.

Insert a new row of data:

Move the cursor to the top value in the MIDPT list (the ‘4.9’).

Press 2nd

DEL (INS).

A ‘0’ will appear.

Repeat the above steps for the FREQ and CUMFR columns.

Overwrite the values in the

MIDPT column with the first

lower and all of the upper

boundaries, as shown at the right.

Press the 2nd

key and then the Y= key to get to STATPLOT.


Turn Plot1 On.

Highlight the xyline in Type (2nd

item in the 1st line).


Type MIDPT for the Xlist:

Type CUMFR for the Ylist:

Select the square in Mark:

Some modifications will be needed for the WINDOW to incorporate the lower boundary and the

increased y-values.

Press the GRAPH Key.

Press the TRACE Key to see the

data points.

Chapter

3 Descriptive Study of

Bivariate Data

Simple Linear Regression Models

A simple linear regression model is an equation describing how to use one variable, x, to predict

another variable, y, based on the relationship existing in the sample data. Since the predictions

made from the sample data may differ from the actual values in the population data, the symbol

y' is used for the predicted value of y. The simplest possible model is a linear one: y' = a + bx.

The graph is a line, where b is the slope of the line and a is the y-coordinate of the y-intercept.

Constructing a Scatter Diagram

In order to use the TI-84 Plus to find the linear regression model, the sample data must be

entered into the STAT editor. The relationship between the two quantitative variables can be

viewed in a scatter diagram.

You can create a scatter diagram by using the scatterplot option under STAT PLOT, which is

located above the Y = key. The steps to create the scatter diagram are similar to those needed to

create a histogram.

Example:

The data in the following table gives the English and Math scores for nine randomly selected

students.

Student A B C D E F G H I

English 77 90 85 62 71 88 95 67 75

Math 68 86 78 73 75 78 85 78 81

2

It is recommended, at this time, to clear Y = of any graphs. Enter the data in to the STAT editor.

Press STAT > 1 to get to the Stat editor.

Type the English scores in L1.

Type the Math scores in L2.

Press 2nd

> Y= to get to STAT PLOT.

Press 1 to get to Plot1.

Highlight ON and press ENTER.

Highlight the scatterplot (1st diagram) under Type and press ENTER.

Type L1 in for the Xlist:

Type L2 for the Ylist:

Select any one of the symbols under Mark: (the square tends to show up the best)

Press Zoom > 9 to get to ZoomStat.

Press the TRACE key and use the left and right arrow keys to

move among the points and see the coordinates of the marks.

Note: if any equations are stored on the Y = page, they may

also appear in your diagram. Delete them if they are in the

way.

Creating a Linear Regression Model

The TI-84 Plus calculator has two built-in functions,

LinReg(ax+b) and LinReg(a+bx) to compute a simple linear

regression model. They are both located on the STAT page in

the CALC list. These are two forms of the same function, one

that writes the equation as ax+b and the other that writes the

equation as a+bx. We will use the latter form, a+bx, but either

is okay.

We will create a linear regression model for the English and

Math scores from the previous example. They are stored in lists

L1 and L2.

Press STAT > CALC > 8: LinReg(a+bx).

Type: L1 > , > L2 > ,

Press VARS > Y-VARS > 1 > 1 to get Y1.

Press ENTER.

3

The LinReg output shows:

general model: y=a+bx

y-intercept: a=53.26159596

slope: b=.3135854034

coefficient of determination: r2=.3883069253

correlation coefficient: r=.6231427808

The equation of the linear regression model for the English and Math scores is

y = 53.2616 + 0.3136x

Coefficient of Determination and Correlation Coefficient

Note: In order to see the coefficient of determination, r2, and

the correlation coefficient, r, on the LinReg output, the

diagnostics must be turned On. Use the TI-84 DiagnosticOn

function to turn diagnostics on. The command only needs to

be executed once, and from then on r2 and r will be displayed

every time you compute a regression model.

Press 2nd

> 0 to get to CATALOG.

**Use the down arrow key until you find the command DiagnosticOn and then pressing

ENTER ENTER.

**Alternative: By pressing the first letter of the function you

are looking for, you can save time getting to the function.

The ALPHA key is engaged automatically when you go into

CATALOG, so to get to the letter D press the x-1

key.

Use the down arrow key to continue down until you find

DiagnosticOn and then press ENTER ENTER.

Enter the LinReg command again and the output should appear

as shown on the left.

Graphing the Linear Regression Line

The LinReg command we entered requested that the

equation of the linear regression model (least squares

line) be stored in Y1.

4

Execute the STAT PLOT command again and the least

squares line will appear on the scatter diagram.

Chapter

4 Probability

Generating Random Numbers

When working with probabilities, there is sometimes a need to generate numbers that you can’t

predict, but at the same time follow some standard rules. Computer simulations are a common

example of the need for random occurrences within a structured setting. These numbers are

called pseudo-random numbers since they are not totally random. Your calculator can generate

these types of random numbers. The numbers that will appear on your calculator screen are hard

to predict, but you will be able to attach probabilities to them. For each kind of pseudo-random

number, we will be able to say what the probability is that it will occur next.

Generating Random Numbers Between 0 and 1

Suppose that you would like to generate a number between 0 and 1. You want the number to be

unpredictable, but you want every number between 0 and 1 to have an equally likely chance of

being generated.

The numbers that are generated with the random number function

on your calculator will be very similar to those found in the

random number table in an Elementary Statistics textbook. The

function is on the MATH page and can found in the PRB list.

Press the MATH key.

Press the ► key until PRB is

highlighted.

Select 1:rand and press ENTER.

Press ENTER again.

If you continue to press the ENTER key you will generate a

different random number between zero and one each time you

press the ENTER key.

2

Generating Random Numbers Between Any Two Values

The TI-84 Plus does not have a built-in function to generate random real numbers that are

equally likely to occur and fall within a specified range of values, but they can be generated by

using the rand function with some additional commands. The following command will generate

random real numbers between 1 and 100.

Press the MATH key.

Press the ► key until PRB is highlighted.

Select 1:rand and press ENTER.

Type: * (100 - 1) + 1

Press ENTER.

If you continue to press the ENTER key, you will generate a different random number between

1 and 100 each time.

In general, the command used to generate a real number between values m and n is:

rand * (n – m) + m , where n is the larger number.

For example: rand * (10 – 1) + 1 will generate a real number between 1 and 10.

rand * (900 – 500) + 500 will generate a real number between 500 and 900.

Generating Random Integer Values Between Any Two Numbers

To generate random integer numbers (no decimals) that are

equally likely to occur and fall within a specified range of

values, the TI-84 Plus has the built-in function randInt. The

following sequence will generate 20 random integer values

between 1 and 100.

Select MATH > PRB.

Select 5:randInt( .

Type: 1, 100, 20) and press ENTER.

Use the right arrow key to see the remaining numbers. Press

ENTER to generate 20 more such random integers.

In general, the command used to generate n integer numbers between values j and k is:

randInt(j, k, n) If n = 1, then you do not need to enter it

For example: randInt(10, 50, 15) will generate 15 random integers between 10 and 50.

randInt(10, 50) will generate 1 random integer between 10 and 50.

3

Store Random Numbers in a List

The random numbers generated can be stored in a list to be used

with other statistical procedures.

The command in the screen shot

on the right will generate 15

randomly-generated integer

values between 10 and 50 and

store them in List L1.

Chapter

5

Probability Distributions

Mean and Standard Deviation of a Discrete Random Variable

Computing the mean and standard deviation of a discrete random variable is slightly different

than computing the mean and standard deviation of a set of data values. Each data value in the

data set weighs equally in the computation. However, in a discrete random variable, the possible

data values are given along with the likelihood of each value occurring on any given single trial.

As was the case for a set of data values, the TI-84 calculator can be used to calculate the mean

and standard deviation of a discrete random variable by either manually using the formulas or by

using a built-in function. We will begin with manually using the formulas.

Example: Number of Heads

Table 1 gives the probability distribution of the number of

heads in three tosses of a fair coin. Enter the number of heads

into a list named X, and the probability into a list named

PROBX.

Mean of a Discrete Random Variable

The formula to calculate the mean of a discrete

probability distribution is

.

Move the cursor to highlight the name of the empty

List next to List PROBX.

Type: List X * List PROBX

2

Press ENTER.

The formula now says to sum of this list of values.

Go to the homescreen (2nd

> Mode).

Select 2nd

> STAT > MATH > 5: sum(

Type: L1) and press ENTER.

µ = 1.5 heads in three tosses of a coin.

Standard Deviation of a Discrete Random Variable

The formula to calculate the standard deviation of a discrete probability

distribution is

Move the cursor to highlight the name of the empty List

next to List PROBX.

Type: List X ^ 2 * List

PROBX Press ENTER.

The formula now says to take the square root of the difference of the sum of this

list of values and the square of the mean.

Go to the homescreen (2nd

> Mode).

Press √ Key.

Select 2nd

> STAT > MATH > 5: sum(

Type: L1) – 1.5 ^ 2) and press ENTER .

σ =0.866 heads in three tosses of a coin.

3

Using TI-84 Plus Built-In Functions For Discrete Probability Distributions

The TI-84 Plus built-in function 1-Var Stats will also calculate

the numerical descriptive statistics for a Discrete Probability

Distribution.

We will use the same

probability distribution as

above, which we stored in Lists

X and PROBX.

Select STAT > CALC > 1-Var Stats.

Press ENTER.

Select 2nd

> STAT >X , 2nd

> STAT > PROBX Press ENTER.

The screen will display the descriptive statistics, which includes

the population mean and standard deviation.

Generating Dependent Probabilities

Factorials

A common function needed to compute dependent probabilities is the factorial function. The

notation for the factorial of n is n! The “!” function is found on the MATH page under the PRB

list.

To find the number of ways six people could be arranged in six different chairs, you would

calculate six factorial (6!).

Type: 6 > MATH > PRB > 4: ! and press ENTER.

6! = 720.

Calculate 10! and 0!.

4

Combinations

The combination formula can also be used to compute dependent probabilities. The notation for

the number of combinations is nCr, where n is the total number of elements, and r is the number

being selected. Combinations are used when selecting a few elements from a larger number of

distinct elements.

Example: Ice Cream

An ice cream parlor offers 6 flavors of ice cream. Kristen would like to purchase 2 flavors of ice

cream. In how many ways can Kristen choose 2 flavors out of the 6 flavors?

In order to find the number of ways of choosing two flavors out of

six, we would need to calculate 6C2.

Type 6 > MATH > PRB > 3:

nCr and press ENTER.

Press the number 2 key and press

ENTER.

There are 15 different combinations of two flavors of ice cream.

Calculate 6C3 and 8C3 .

Permutations

The permutation formula can be used to compute dependent probabilities. The notation for the

number of permutations is nPr, where n is the total number of elements, and r is the number

being selected. Permutations are used when trying to find all possible arrangements of elements

taken from a larger selection. Arrangements involve putting the elements in a particular order.

If Kristen’s story changes as below, then permutations apply

rather than combinations.

An ice cream parlor offers 6 flavors of ice cream. Kristen

would like to purchase 2 flavors of ice cream and concerned as

to which flavor is on the top and which flavor is on the bottom

(i.e. Kristen is concerned about the arrangement of the flavors).

In how many ways can Kristen arrange 2 flavors out of the 6

flavors?

In order to find the number of arrangements of choosing two flavors out of six, we would need to

calculate 6P2.

5

Type 6 > MATH > PRB > 2: nPr and press ENTER.

Press the number 2 key and press ENTER.

There are 30 different arrangements of two flavors of ice cream.

Calculate 6P3 and 8P3 .

Binomial Distribution

Randomly Generating Number of Successes From a Binomial Distribution

There are many situations in statistics where you need to

generate numbers from distributions where the numbers are not

equally likely to occur. One of the most commonly used

distributions used in statistics is the discrete Binomial

distribution.

The TI-84 Plus has a built-in function to generate random real

numbers from a specific Binomial distribution. The random real

number represents an x value, the number of successes.

Select MATH > PRB > 7:randBin(

Type 3, 0.3,5) and press ENTER.

The screen shot on the left repeated the Binomial experiment 5

times. Each time there were 3 trials with a probability of success

were 2, 1, 0, 1, and 2 respectively.

Generate 5 random numbers from a Binomial distribution with 3 trials and 0.9 probability of

success.

The syntax for the randBin( function is randBin(n, p, r). This will generate r random numbers

representing x the number of successes from a binomial distribution with n number of trials and p

probability of success on a given trial. Note: if r = 1, you may omit it.

Compute Binomial Probabilities

The command for computing a probability at x successes for a discrete Binomial distribution is

binompdf(.

To find the probability of x successes out of n trials, each with probability p of success, type

binompdf(n, p, x).

6

Example: VCR’s

Suppose that 5% of all VCR’s manufactured by an electronics

company are defective. Three VCR’s are selected at random. What

is the probability that exactly one of them is defective? P(x = 1)

Select 2nd

> VARS (DISTR) > A:

binompdf( and press ENTER.

Type: 3, 0.05, 1) and press ENTER.

The result is 0.135375 or ≈ 13.5% chance that exactly one of

them is defective.

Calculate the same probability with 8 VCR’s selected at random, rather than 3.

Now there is ≈ 27.9% chance that exactly one of them is defective.

Compute Cumulative Binomial Probabilities

The command for the probability for a cumulative number of

successes from 0 to x for a discrete Binomial distribution with n

number of trials and p probability of success on any given single

trial is binomcdf( . P(number of successes ≤ x)

Using the same Binomial distribution of 3 VCR’s as above, what

is the probability that zero or one of them is defective? P(x ≤ 1)

Select 2nd

> VARS (DISTR) > B: binomcdf( and press

ENTER.

Type: 3, 0.05, 1) and press ENTER.

The result is 0.99275 or ≈ 99.3% chance that at most one of

them is defective.

Calculate the same probability with 8 VCR’s selected at random, rather than 3.

Now there is ≈ 94.3% chance that at most one of them is defective.

Compute Poisson Probabilities

7

The command for computing the probability of x occurrences within a given interval for a

discrete Poisson distribution with a mean number of occurrences λ is poissonpdf(λ, x).

P(number of occurrences = x)

Example: Telemarketing

Suppose that a household receives, on average, 9.5 telemarketing calls per week. Find the

probability that the household receives 6 calls this week.

Select 2nd

> VARS (DISTR) > C: poissonpdf( and press

ENTER.

Type: 9.5, 6) and press ENTER.

The result is 0.076420796 ≈ 7.6% chance that the household

receives 6 calls this week.

Find the probability that the household receives 10 calls this week.

There is ≈ 12.4% chance that the household receives 10 calls this week.

Compute Cumulative Binomial Probabilities

The command for computing the probability of at most x

occurrences (cumulative) within a given interval for a discrete

Poisson distribution with a mean number of occurrences λ is

poissoncdf(λ, x). P(number of occurrences ≤ x)

Using the same Poisson distribution of Telemarketing calls as

above, what is the probability that the household receives at

most 6 calls this week? P(x ≤ 6)

Select 2nd

> VARS (DISTR) > D: poissoncdf( and press

ENTER.

Type: 9.5, 6) and press ENTER.

There is ≈ 16.5% chance that the household receives at most 6

calls this week.

Find the probability that the household receives at most 10 calls this week.

8

There is ≈ 64.5% chance that the household receives at most 10 calls this week.

Geometric Probabilities

Your calculator can compute probabilities for a geometric random variable with probability of

success p using the geometpdf( command, located on the DISTR page. To find the probability of

the random variable taking the value x, type geometpdf(p, x).

Example: Car Ignition

Suppose that a car with a bad starter can be started 90% of the time by turning on the ignition.

What is the probability that it will take three tries to get the car started? Type geometpdf(0.9, 3);

the answer is 0.9%.

Cumulative Geometric Probabilities

As with the binomial and cumulative probability functions, there is a cumulative version

geometcdf( . It can be used to find the probability that a geometric random variable will take a

value of at most x by typing geometcdf(p, x).

Chapter

6 The Normal Distribution

Continuous random variables are used to approximate probabilities where there are many

possibilities or an infinite number of possibilities on a given trial. One of the most well-known

continuous distributions used to approximate probabilities is the normal distribution.

Traditionally normal distribution probabilities were figured using a normal distribution table.

The table method is being replaced with calculators such as the TI-84 Plus. The calculator

reduces the time needed to perform the calculations and reduces the rounding errors that occur

because of the brevity of the tables in elementary statistics textbooks.

Normal Distribution

Randomly Generating a Number From a Normal Distribution

Just as the TI-84 had a built-in function to generate random real

numbers from a Binomial distribution, it also has a built-in

function to generate random real numbers from a specific Normal

distribution with a mean µ and standard deviation σ. The random

real numbers represent x values. The general syntax is

randNorm(µ, σ, n), where n is the number of random real

numbers.

The following command will generate 30 numbers from a

Normal distribution with a mean of 45 and a standard deviation

of 8 and store them in L2.

Select MATH > PRB > 6:randNorm( and press ENTER.

Type: 45, 8, 30) > STO > L2

Generate 200 numbers from a Normal distribution with µ = 100 and σ = 15

and store them in L3.

2

Generate a histogram of the 200 numbers in L3 and observe that the histogram is beginning to

look like a normal distribution. Experiment with generating a larger number of data values.

Computing Normal Distribution Probabilities

The commands for the Normal distribution are normalpdf( , normalcdf( , and invNorm( .

They are located on the DISTR page. DISTR appears above the VARS key.

Compute Cumulative Normal Probabilities

The normalcdf( function stands for normal cumulative density function and gives the probability

of getting an x value that falls within an interval of values from the normal distribution. There

are three possibilities:

Finding the probability that a number will fall between two values under the Normal

distribution.

Finding the probability that a number will fall to the left of a value under the Normal

distribution.

Finding the probability that a number will fall to the right of a value under the Normal

distribution.

The syntax for the normalcdf( function is normalcdf(L, B, µ, σ), where L is the lower bound of

the interval, B is the upper bound of the interval, µ is the mean, and σ is the standard deviation.

The values for µ and σ may be omitted if it is the Standard Normal distribution.

Finding the Area Between Two Values

To find the area between two numbers a and b under the

Standard Normal curve, P(a < z < b) = normalcdf(a, b, 0, 1).

Find the probability of getting a value between 1.04 and 1.82

under the Standard Normal curve.

Select: 2nd

> VARS >2: normalcdf( and press ENTER.

Type: 1.04, 1.82, 0, 1) and press ENTER.

P(1.04 < z < 1.82) = 0.115.

Find the probability of getting a value between 0 and 3 under

the Standard Normal curve.

Find the probability of getting a value between 10 and 13 under

the Normal curve with a mean of 10 and a standard deviation of

2.

3

P(10 < x < 13) = 0.43

Find the probability of getting a value between 2 and 12 under the Normal curve with a mean of

10 and a standard deviation of 2.

Finding the Area to the Left of a Value

To find the area to left of b under the Normal curve, P(z < b) = normalcdf(-∞, b, µ, σ). The

problem is that the TI-84 calculator does not have a built-in key for negative infinity (-∞). Thus,

the value -1E99 is used, which represents a very large negative number. The letter E stands for

scientific notation and it is located above the comma (,) key (2nd

> ,). Thus, the command will

look like: normalcdf(-1E99, b, µ, σ).

Find the probability of getting a value less than 0 under the Standard Normal curve.

Select: 2nd


Type: -1 > 2nd

> , > 99,0) and press ENTER.

P(z < 0) = 0.5.

Find the probability of getting a value less than 32.45 under the

Normal curve with mean 25 and standard deviation 6.

Finding the Area to the Right of a Value

To find the area to right of a under the Normal curve, P(z > a) = normalcdf(a, 1E99, µ, σ).

Find the probability of getting a value greater than -1.08 under the Standard Normal curve.

Select: 2nd


Type: -1 > 2nd

> , > 99,0) and press ENTER.

P(z > -1.08) = 0.8599.

Find the probability of getting a value greater than 15.3 under

the Normal curve with mean 12 and standard deviation 4.

Inverse Normal Distribution Probabilities

4

There are times in statistics when we have a probability and

need a relevant z-score or raw score. The problem of this type

may look like: P(z > ?) = 0.8599. Such problems are known

as inverse normal distribution problems. Such computations

can be performed using tables of normal probabilities, but the

work is tedious, error-prone, and often has rounding errors.

Fortunately, the calculator has a function, invNorm(, that

performs the calculation.

We know from the previous section that the unknown in P(z >

?) = 0.8599 is -1.08.

Select: 2nd

> VARS > 3:invNorm( and press ENTER.

Type: 0.8599) and press ENTER.

The screen is telling us that the answer is positive 1.08. The

invNorm( function gives an answer based on a cumulative

probability of 0.8599 from -∞ to 1.08. Since the Normal distribution is symmetric, the same

cumulative probability applies to -1.08 to ∞. It is always advisable to draw the normal curve to

help in visualizing this concept.

Graph the Normal Probability Density Function

The function normalpdf( stands for Normal probability

density function and does not actually generate a probability,

since it applies to a single x value in a continuous distribution

and that probability is always zero. The main use of this

command is to draw the Normal curve. The syntax for the

function is normalpdf(x, μ, σ), where μ is the mean and σ is

the standard deviation.

The following sequence of commands will draw the standard normal curve (μ =0 and σ = 1).

Select: Y = > 2nd

> VARS > 1: normalpdf( and press ENTER.

Type: x, 0, 1) > ZOOM > 9

This command may be used

to draw any Normal

distribution curve with any

mean and standard deviation.

5

Shade the Normal Probability Density Function

When calculating the probability of an area under the Normal curve, it is often helpful to shade

the area. The syntax for the TI-84 Plus command to do this is ShadeNorm(a, b, µ, σ).

Example

To shade the area under the Standard Normal curve for

P(1.04 < z < 1.82) = 0.115, begin by turning off all other

graphs (STATPLOT or Y =).

Adjust the WINDOW to view the Standard Normal curve,

as shown on the right.

Select: 2nd

> VARS > DRAW > 1: ShadeNorm( and press ENTER.

Type: 1.04, 1.82, 0, 1) and press ENTER.

Notice that the area of

the shaded region is also

shown on the graph and

it is the same value

calculated from the

normalcdf( command.

Thus, the ShadeNorm(

is an alternative

command for normalcdf(, with the added benefit of the shading of the area.

Example

Find the probability of getting a value greater than 15.3 under the Normal curve with mean 12

and standard deviation 4.

Adjust the WINDOW as shown on the right.

Type: ShadeNorm(15.3, 1E99, 12, 4) and press ENTER.

P(x > 15.3) = 0.2047

Chapter

7 Variation in

Repeated Samples -

Sampling Distributions

A large part of statistics consists of analyzing the probability of getting a specific sample mean

or sample proportion from a repeated number of samples drawn from the same population.

Usually we focus on two kinds of statistics from those samples: the sample mean Ë, if the data is

quantitative, or the sample proportion ê, if the data is categorical. For large sample sizes, both

Ë and ê have normal distributions, which have already been discussed. The normalcdf(

function on the TI-84 Plus calculator will be used, with a slight modification. As before, the

answers using normalcdf( function will differ slightly from the answers found from a table of

normal probabilities, since the latter involves rounding.

Probabilities for Sample Means

For a large sample size, the Central Limit Theorem states that the sampling distribution Ë is

normally distributed with µË = µ and σË = σ/ n .

The syntax normalcdf(a, b, µ, σ/ n ) is used to find the

probability that a < Ë < b. The procedure is the same as

finding the probability of x with a given mean and standard

deviation. As before, if you are finding the area to the left of

some value b, we use -1E99 for a. If you are finding the area

to the right of some value a, we use 1E99 for b. The key

stroke for E is 2nd

> comma.

Example: Cookie Weights

2

Assume that the weights of all packages of a certain brand of cookies are normally distributed

with a mean of 32 ounces and a standard deviation of 0.3 ounces. Find the probability that the

mean weight, Ë, of a random sample of 20 packages of this brand of cookies will be less than

31.8 ounces. The sample size here is not large, but the distribution of all such cookies is normal,

so the sample mean will be normal as well.

Select 2nd

> VARS > 2: normalcdf( and press ENTER.

Type: -1E99, 31.8, 32, 0.3/ 20 )) and press ENTER.

= 0.9986

An alternative approach would be to

adjust the WINDOW as shown on

the right, and use the ShadeNorm(

function.

Type: ShadeNorm(-1E99, 31.8, 32, 0.3/ 20 )

Probabilities for Sample Proportions

For a large sample size, the Central Limit Theorem states that the sampling distribution for ê is

normally distributed with µê = p and σê = npq /

To find the probability that a < ê < b on the calculator, use normalcdf(a, b, p, pq / n ). Note

that it is more accurate to type npq / directly into the normalcdf( command than to compute it

separately and type it in. Any time you find yourself typing in an intermediate result in a

computation you may be performing some unnecessary rounding.

Example: Voters

A candidate for mayor in a large city claims that she is favored by 53% of all eligible voters of

that city. Assume that the claim is true. What is the probability that in a random sample of 400

registered voters taken from this city, between 49% and 51% will favor the candidate?

3

Select 2nd

> VARS > 2: normalcdf( and press ENTER.

Type: 0.49, 0.51, 0.53, √(0.53 * 0.47 / 400 )) and press ENTER.

= 0.1570

In other words,

normalcdf(0.49, 0.51, 0.53, 400/47.*53. ) = 0.1570 = 15.7%.

Chapter

8 Drawing Inferences from

Large Samples

In statistics, we collect samples to know more about a population. If the sample is representative

of the population, the sample mean or proportion should be statistically close to the actual

population mean or proportion. A way to judge how close the sample statistic may be, is to

create a confidence interval. This chapter will describe how to use the calculator to compute

confidence intervals and tests of hypothesis for population means and proportions, drawn from

large samples.

Confidence Intervals for Population Means

The function used to compute confidence intervals for the population mean, µ, is ZInterval for

when σ is known and the sample is large. It is found on the STAT page under TESTS.

Known Population Standard Deviation

If you are fortunate enough to know the population standard deviation σ, either from theory or

from a pilot study, then you would use a Z-based confidence interval, ZInterval, to estimate the

population mean, µ. There are two different syntaxes for the ZInterval command.

If you know the statistical information (population

standard deviation, sample mean, and sample size),

then the syntax is ZInterval σ, , n, confidence level.

If you have the sample data stored in a list, then the

syntax is ZInterval σ, List name, Frequency list,

confidence level.

2

Example: Textbook Price

A publishing company has just published a new college textbook. Before the company decides

the price at which to sell this textbook, it wants to know the average price of all such textbooks

in the market. The research department at the company took a sample of 36 such textbooks and

collected information on their prices. This information produced a mean of $70.50 for this

sample. It is known that the standard deviation of the prices of all such textbooks is $4.50.

Construct a 90% confidence interval for the mean price of all such college textbooks.

We have the population standard deviation, σ, so we will use ZInterval command. We do not

have the data itself, so we will select Stats where it asks for input. We enter 4.5 for σ, 70.5 for

Ë, 36 for n, and .90 for C-Level.

Press STAT > TESTS > 7: ZInterval and press ENTER.

Select Stats by moving the cursor over Stats and press the ENTER key.

Type in 4.5 for σ.

Type in 70.5 for Ë.

Type in 36 for n.

Type in .90 for C-Level.

Highlight Calculate and then press the Enter key.

The ZInterval output shows

the 90% confidence level, as

well as the sample mean and sample size.

The confidence interval is between $69.27 and $71.73.

With 90% confidence, we believe that the true population mean

price is between $69.27 and $71.34.

Example: Randomly Generated Sample Data

Generate 50 random numbers from a Normal distribution with a

mean of 45 and a standard deviation of 8 and store them in L1.

Select MATH > PRB > 6:randNorm( and press ENTER.

Type: 45, 8, 50) > STO > L1

We have the population standard deviation, σ = 8, so we will use ZInterval command. We have

the data, so we will select Data where it asks for input. We enter L1 for List, 1 for Freq, and

.90 for C-Level.

3

Press STAT > TESTS > 7: ZInterval and press ENTER.

Select Data by moving the cursor over Data and press the ENTER key.

Type: 4.5 for σ.

Type: L1 for List.

Type: 1 for Freq.

Type: 0.90 for C-Level.


The ZInterval output

shows the 90% confidence

level, as well as the sample mean, sample standard deviation,

and sample size.

The confidence interval is between 45.001 and 47.095.

With 90% confidence, we believe that the true population mean

price is between 45 and 47.1.

Hypothesis Tests About Means

A hypothesis test about a population mean can be Z-based (if σ is known) or T-based (if σ is

unknown and either the population is normal or the sample size is over 30). The TI-84 Plus

calculator provides functions for both the Z-Test and the T-Test, which are located on the STAT

page in the TESTS list.

The menus for Z-Test and T-Test are very similar to the ones for

ZInterval and TInterval described in the last chapter. The

functions work with either the data or the summary statistics.

Both functions ask for the null hypothesis and an alternative

hypothesis. Both functions provide a p-value for comparison with

the test’s significance level.

Example: One Sided Test; Know Sigma

H0: µ = 50; H1: µ > 50; n = 200; Ë = 52.7; σ = 16.2; α = .05

Since sigma is known, Z-Test will be used.

Select: STAT > TESTS . 1: Z-Test and press ENTER.

Highlight Stats and press ENTER.

Type: 50 for μ0.

Type: 16.2 for σ.

Type: 52.7 for Ë.

4

Type: 200 for n.

Move the cursor over >μ0 and press ENTER.

Move the cursor over Calculate and press ENTER.

The Z-Test output shows the

alternative hypothesis: μ>50

test statistic: z=2.357022604

p-value: .0092110461

sample mean: Ë=52.7

sample size: n=200

The p-value is .0092110461, which is less than 5%. We reject H0 and conclude that the

population mean is statistically-significantly higher than 50.

Example: Two Sided Test; Unknown Sigma

H0: µ = 112; H1: µ ≠ 112; n = 85; Ë = 108.5; Sx = 12.4; α = .005

Since sigma is not known, T-Test will be used.

Select: STAT > TESTS > 2: T-Test and press ENTER.


Type: 112 for μ0.

Type: 108.5 for Ë.

Type: 12.4 for Sx.

Type: 85 for n.

Move the cursor over ≠μ0 and press ENTER.


The T-Test output shows the

alternative hypothesis: μ≠112

test statistic: t=-2.602290774

p-value: .0109440164


sample size: n=85

The p-value is .0109440164, which is greater than 0.5%. We do not reject H0 and conclude that

the population mean is not statistically-significantly different than 112.

5

In the above examples, the Calculate option for T-Test and Z-Test was chosen. If the Draw

option is chosen, the calculator will draw the curve and state the z/t value and the p-value.

Select: STAT > TESTS . 2: T-Test and press ENTER.

The original information should still be there.

Highlight Draw and press ENTER.

Confidence Intervals for Population Proportions

The function 1-PropZInt computes Z-based confidence intervals for a population proportion

when the sample size is large enough, i.e., the number of successes, x, is greater than 5 and the

number of failures, n – x, is greater than 5.

1-PropZInt is found on the STAT page under TESTS. To

use 1-PropZInt, enter in the number of successes as x, the

sample size as n, and the confidence level as C-Level.

Note: x must be a whole number. If you are finding x by

multiplying ê by n, you will need to round to the nearest

whole number.

Example: Legal Advice

A recent sample of 500 college students revealed that 82% of them owned a graphing calculator.

Find a 95% confidence interval for the percentage of all college students who own a graphing

calculator.

In our sample of 500 college students, there were 82% or 410 successes and 90 failures. We can

use 1-PropZInt with x = 410, n = 500, and our C-Level set at 0.95.

Press STAT > TESTS > A: 1-PropZInt and press ENTER.

Type: 410 for x.

Type: 500 for n.

Type: 0.59 for C-Level.

Highlight Calculate and press the ENTER key.

6

The 1-PropZInt output shows the 95% confidence interval, the

sample proportion, and the sample size.

The 95% confidence interval is from 0.786 to 0.854.

With 95% confidence, we believe that the true population

proportion is between 78.6% and 85.4%

Hypothesis Tests About Proportions

Hypothesis tests about proportions are Z-based when both

np and nq are greater than 5. The TI-84 Plus has function

1-PropZTest to compute a test statistic and p-value. The

1-PropZTest is located on the STAT page under the

TESTS list. The menu for 1-PropZTest is very similar to

the one for 1-PropZInt described in the last chapter.

Again you need to enter x, the number of successes, and n,

the sample size. You also need to enter p0, the number that

p is being compared with, and the alternative hypothesis.

Example: One Sided Test

H0: p = .41; H1: p < .41; n = 300; x = 97

Select: STAT > TESTS > 5: 1-PropZTest and press ENTER.

Type: .41 for p0.

Type: 97 for x.

Type in 300 for n.

Highlight <P0 and press ENTER.

Highlight Calculate and press ENTER.

The 1-PropZTest output shows the

alternative hypothesis: prop < .41

test statistic: z=-3.052072083

p-value: p =.0011364066

sample proportion: ê=.323

sample size: n=300

7

In the above example, the Calculate option for PropZ-Test was chosen. If the Draw option is

chosen, the calculator will draw the curve and state the z value and the p-value.

Select: STAT > TESTS > 5: 1-PropZTest and press ENTER.



Chapter

9 Small Sample Inferences

For Normal Populations

Confidence Interval for Mean – Small Sample Size

The population standard deviation is usually not known. When this is the case, the sample mean

has a t-distribution, rather than a Normal distribution. Thus, a t-based confidence interval,

TInterval, is used to estimate the population mean, µ. Another consideration is that either the

population is normal or the sample size is larger than 30. Just as for the ZInterval command,

there are two different syntaxes for the TInterval command.

If you know the statistical information (sample mean,

sample standard deviation, and sample size), then the

syntax is TInterval , , n, confidence level.

If you have the sample data stored in a list, then the syntax

is TInterval List name, Frequency list, confidence level.

Example: Household Debt

A local orange grove sells oranges at Saturday’s Downtown Farmer’s Market. They wanted to

estimate the average number of oranges sold on a given Saturday. They took a sample of 35

Saturday’s and found that the average number of oranges sold for this sample is 256 with a

standard deviation of 40. Construct a 99% confidence interval for the population mean µ.

We do not have the population standard deviation, σ, so the TInterval command will be used.

Press STAT > TESTS > 8: TInterval and press ENTER.

2

We do not have the sample data, so we will select Stats, and enter 256 for Ë, 40 for sx, 35 for n,

and .99 for the C-Level.

Select Stats for Inpt: and press the ENTER key.

Type in 256 for Ë.

Type in 40 for Sx.

Type in 35 for n.



The TInterval output shows the 99% confidence interval along with the

sample mean, sample standard deviation, and sample size.


With 99% confidence, we believe that the true population mean number of

oranges sold is between 237.6 and 274.5 oranges.

Hypothesis Test for Mean – Small Sample Size

A hypothesis test about a population mean is T-based if σ is

unknown and either the population is normal or the sample size

is over 30. The TI-84 Plus calculator provides the T-Test, which

is located on the STAT page in the TESTS list.

The T-Test function works with either the data or the summary

statistics, requesting null hypothesis and an alternative

hypothesis. A p-value is provided for comparison with the test’s

significance level.

Example:

H0: µ = 112; H1: µ ≠ 112; n = 85; Ë = 108.5; Sx = 12.4; α = .005

Since sigma is not known, T-Test will be used.

Select: STAT > TESTS > 2: T-Test and press ENTER.


Type: 112 for μ0.

Type: 108.5 for Ë.

Type: 12.4 for Sx.

Type: 85 for n.

3

Move the cursor over ≠μ0 and press ENTER.


The T-Test output shows the

alternative hypothesis: μ≠112

test statistic: t=-2.602290774

p-value: .0109440164


sample size: n=85

The p-value is .0109440164, which is greater than 0.5%. We do not reject H0 and conclude that

the population mean is not statistically-significantly different than 112.

In the above examples, the Calculate option for T-Test and Z-Test was chosen. If the Draw

option is chosen, the calculator will draw the curve and state the z/t value and the p-value.

Select: STAT > TESTS . 2: T-Test and press ENTER.



Chi-Square Tests

Computing Chi-Square Distribution Probabilities

The computation commands for the Chi-Square distribution

are χ2pdf( , and χ

2cdf( . They are located on the DISTR page.

DISTR appears above the VARS key.

Compute Cumulative Chi-square Probabilities

The χ2cdf( function stands for chi-square cumulative density function and gives the probability

of getting an x value that falls within an interval of values from the chi-square distribution for the

specified degrees of freedom. There are three possibilities:

Finding the probability that a number will fall between two values under the Chi-square

distribution.

4

Finding the probability that a number will fall to the right of a value under the Chi-square

distribution (in the right tail).

Finding the probability that a number will fall to the left of a value under the Chi-square

distribution (in the left tail).

The syntax for the χ2cdf( function is χ

2cdf(L, B, df), where L is the lower bound of the interval,

B is the upper bound of the interval, and df is the degrees of freedom.

Finding the Area Between Two Values

To find the area between two numbers a and b under the Chi-square curve, P(a < χ2 < b) =

χ2cdf(a, b, df).

Find the probability of getting a value between 5.14 and 7.28

under the Chi-square curve with 8 degrees of freedom.

Select: 2nd

> VARS > 8: χ2cdf( and press ENTER.

Type: 5.14, 7.28, 8) and press ENTER.

P(5.14 < χ2 < 7.28) = 0.236.

Finding the Area in the Left Tail

To find the area to the left of b (in the left tail) under the Chi-square curve, use P(χ2 < b) =

χ2cdf(-∞, b, df). The TI-84 calculator does not have a built-in key for negative infinity (-∞).

Thus, the value -1E99 is used, which represents a very large negative number. The letter E

stands for scientific notation and it is located above the comma (,) key (2nd

> ,). Thus, the

command will look like: χ2cdf(-1E99, b, df).

Find the probability of getting a value less than 20 under the

Chi-square curve with 17 degrees of freedom.

Select: 2nd

> VARS >8: χ2cdf( and press ENTER.

Type: -1 > 2nd

> , > 99,20,17) and press ENTER.

P(χ2 < 20) = 0.7258

Finding the Area in the Right Tail

To find the area to right of a (in the right tail) under the Chi-square curve, use

P(χ2 > a) = χ

2cdf(a, 1E99, df).

Find the probability of getting a value greater than 31.08

under the Standard Chi-square curve with 25 degrees of

freedom.

Select: 2nd

> VARS >8: χ2cdf( and press ENTER.

Type: 31.08 > , > 1 > 2nd

> , 99,25) and press ENTER.

5

P(χ2 > 31.08) = 0.1864.

Graph the Chi-square Probability Density Function

The function χ2pdf( stands for Chi-square probability density function and does not actually

generate a probability, since it applies to a single x value in a continuous distribution and that

probability is always zero. The main use of this command is to draw the Chi-square curve. The

syntax for the function is χ2pdf(x, df), where df is the degrees of freedom.

The following sequence of commands will draw the chi-square curve with 7 degrees of freedom.

Select: Y = > 2nd

> VARS > 7: χ2pdf( and press ENTER.

Type: x, 7) > ZOOM > 0

This command may be used to draw any Chi-square

distribution curve with any degrees of freedom. It may be

necessary to adjust the window. For example, changing Xmax

to 30 gives a better Chi-square distribution curve (see below).

Shade the Chi-square Probability Density Function

When calculating the probability of an area under the Chi-

square curve, it is often helpful to shade the area. The syntax

for the TI-84 Plus command to do this is Shade χ2 (a, b, df).

To shade the area under the

Standard Chi-square curve

6

with 8 degrees of freedom for P(5.14 < χ2 < 7.28) = 0.236, begin by turning off all other graphs

(STATPLOT or Y =).

Adjust the WINDOW to view the Standard Chi-square curve with 8 degrees of freedom, as

shown on the left.

Select: 2nd

> VARS > DRAW > 3: Shade χ2 ( and press

ENTER.

Type: 5.14, 7.28, 8) and press ENTER.

Notice that the area of the

shaded region is also

shown on the graph and it is the same value calculated from the

χ2cdf( command. Thus, the ShadeNorm( is an alternative

command for χ2cdf(, with the added benefit of the shading of

the area.

A Goodness-of-Fit Test A goodness-of-fit test is used to make a test of hypothesis about

experiments with more than two possible outcomes (or

categories). These are called multinomial experiments. The

frequencies of each possible outcome obtained from the

experiment are called the observed frequencies. A goodness-of-

fit test tests the difference between the observed frequencies

and the expected frequencies (npi). This difference follows a

chi-square distribution. The sample size should be large enough

so that the expected frequency for each category is at least 5.

The TI-84 Plus command for the goodness-of-fit test is χ2GOF-Test, which performs a test to

confirm that sample data is from a population that conforms to a specified distribution (expected

frequencies). The command requires that both the observed and expected frequencies are put in

lists. The degrees of freedom, df, is the number of outcomes minus 1. The command is located

on the DISTR page. DISTR appears above the VARS key.

Example 1:

The following table lists the frequency

distribution of 90 rolls of a die. Test at

the 5% significance level whether the

null hypothesis that the given die is

fair is true.

Outcome of roll 1 2 3 4 5 6

Observed Frequency 16 21 15 12 14 12

Expected

Frequency

15 15 15 15 15 15

7

Enter the observed frequencies into List L1 and the expected

frequencies into List L2, as shown on the left.

There are 6 possible outcomes, so the degrees of freedom, df,

is 6 – 1 = 5.

Select: STAT > TESTS > D: χ2GOF-Test and press

ENTER.

Type in 2nd

> 1 for Observed.

Type in 2nd

> 2 for Expected.

Type in 5 for df:


The output for the χ2GOF-Test

shows the:

Chi-square value: χ2

= 3.7333

p-value = 0.5884

Degrees of freedom: df = 5

CNTRB = {.066666667, 2.4, 0, .6, .066666667}

Note: CNTRB= provides a list of the contributions of each

category to the overall value of χ2. This would be the values in the column. Notice that a

roll of ‘2’ had the largest contribution.

Since the p-value is greater than 5%, reject the null hypothesis and conclude that the dice is not

fair.

Contingency Tables

When measuring the relationship between two categorical variables, one of the most important

tools for analyzing the results is a two-way classification table, also known as a contingency

table.

A Test of Independence

The contingency table can be used to see if the variables are

independent, by comparing observed frequencies with the

frequencies that would be expected from such a sample if they

were independent. A Chi-Square (Χ2) test-statistic can be

computed from the observed and expected frequencies. The TI-

84 function Χ2-Test is used for the Test of Independence and is

located on the STAT page in the TESTS list. The Χ2-Test

function works differently than the other tests on this menu. It

8

requires you to enter the observed frequencies into a matrix. The computed expected frequencies

are then stored automatically in another matrix.

Example: School Referendum

A random sample of

300 adults was selected

and asked if they were

in favor of the new

school referendum. The two-way classification table of the

responses of the adults is presented in the table.

Test at a 5% significance level whether or not Gender and Opinion are independent.

First the data has to be stored differently than in previous

statistical tests on the calculator. The data for a Chi-Square test

of independence has to be stored in a Matrix.

Type: 2nd

> x-1

to get to MATRIX.

Highlight EDIT and press the ENTER key.

After MATRIX [A] type in 2 x3.

The 2 represents the number of rows in the table.

The 3 represents the number of columns in the table.

Enter the data values into the matrix as they appear in the table.

Select: STAT > TESTS > C: χ2-Test and press ENTER.

The χ2-Test screen shows:

That the Observed values are in matrix A

That the Expected values will be put in matrix B


The output for the χ2-Test shows the:

Test statistic: χ2 =8.252773109

P-value: p =.0161410986

Degrees of freedom: df=2

In Favor Against No Opinion

Men 93 70 12

Women 87 32 6

9

Since the p-value of .016 is less than the significance level of .05, we reject the null hypothesis

that row and column variables are independent. We have significant evidence that gender and

opinion concerning the school referendum are related for all adults.

The Expected frequencies can be found in Matrix B.

Type: 2nd

> x-1

to get to MATRIX.

Select: 2: [B] and press the ENTER key.

Press ENTER again.

In the above example, the Calculate option was chosen. If the Draw option is chosen, the

calculator will draw the curve and state the χ2 value and the p-value.




A Test of Homogeneity

A test of homogeneity is a test to determine if two (or more) populations are homogeneous

(similar) with regard to the distribution of a certain characteristic. The procedure to perform this

test on the TI-84 Plus is identical to performing a Test Of Independence (please see above).

Inferences About the Population Variance

In the same way that the population mean and population proportion are tested, so

is the population variance. This is often in response to a desire to control the

consistency of a value. If the population from which the sample is taken is

approximately normally distributed, then the sample variance has a chi-square

distribution with n - 1 degrees of freedom.

The TI-84 Plus does not have a built-in function to generate confidence intervals

about the population variance. The Goodness Of Fit function, χ2GOF-Test, can be

used for a test of hypothesis of the population variance if the data

Chapter

10 Comparing Two Treatments

Confidence Interval for µ1 - µ2

There are two functions used to compute confidence intervals for the difference of two

population means µ1 − µ2:

2-SampZInt for when both σ1 and σ2 are known and the sample sizes are large or the

populations from which the samples are drawn are normal

2-SampTInt for when σ1 and σ2 are not known.

Both are found on the STAT page under TESTS.

Known Population Standard Deviations

If you are fortunate enough to have information about the population standard deviations of the

two populations, either from theory or a pilot study, then you would use a Z-based confidence

interval, 2-SampZInt, to estimate the difference µ1 − µ2 between two population means. As was

the case for estimating one population mean, ZInterval, there are two different syntaxes for the

2-SampZInt command.

If you know the statistical information for both

populations (population standard deviations, sample

means, and sample sizes), then the syntax is 2-SampZInt

σ1, σ2 , , n1, , n2, confidence level.

If you have the sample data stored in two lists, then the

syntax is 2-SampZInt σ1, σ2, List name1, List name2,

Frequency list1, Frequency list2, confidence level.

Example 1:

The following information is obtained from two independent samples selected from two

populations. Construct a 90% confidence interval for µ1 − µ2.

n1 = 200 = 6.4 σ1 = 0.7

n2 = 190 = 5.6 σ2 = 0.55

2

We have the population standard deviations, σ1 and σ2, so we will use 2-SampZInt command.

We do not have the data itself, so we will select Stats where it asks for input. We enter 0.7 for

σ1, 0.55 for σ2, 6.4 for Ë1, 200 for n1, 5.6 for Ë2, 190 for n2, and 0.90 for C-Level.

Press STAT > TESTS > 9: 2-SampZInt and press ENTER.


Type in 0.7 for σ1.


Type in 6.4 for Ë1.

Type in 200 for n1.


Type in 190 for n2.



The 2-SampZInt output shows the 90% confidence interval

for µ1 − µ2, as well as both sample means and both sample

sizes.

The confidence interval is between 0.69543 and 0.90458.

With 90% confidence, we believe that the true difference

between the population means is between 0.69543 and 0.90458.

If you have the actual data rather than the statistics, then enter the

data in two lists and choose Data rather than Stats.

Independent Samples With Unknown But Equal Population Standard Deviations

In the real world, one rarely knows the population standard

deviations. Thus, the approach to estimating the difference µ1 −

µ2 between two population means is to compute a T-based

confidence interval, 2-SampTInt, with the assumption that the

population standard deviations are equal. The pooled sample

standard deviation for the two samples will be used. As was the

case for the 2-SampZInt command, there are two different

syntaxes for the 2-SampTInt command.

3

If you know the statistical information for both samples (sample standard deviations,

sample means, and sample sizes), then the syntax is 2-SampTInt , Sx1, n1, , Sx2,

n2, Confidence Level, Pooled.

If you have the sample data stored in two lists, then the syntax is 2-SampTInt

List name1, List name2, Frequency list1, Frequency list2, Confidence Level, Pooled.

Example 2:


populations with unknown but equal standard deviations. Construct a 95% confidence interval

for µ1 − µ2.

n1 = 42 = 78.4 s1 = 10.13

n2 = 39 = 75.2 s2 = 9.55

We have the sample standard deviations, s1 and s2, so we will use 2-SampTInt command. We do

not have the data itself, so we will select Stats where it asks for input. We enter 78.4 for Ë1,

10.13 for sx1, 42 for n1, 75.2 for Ë2, 9.55 for sx2, 39 for n2, 0.95 for C-Level. We have reason

to believe that the population standard deviations are the same, so select Yes by the prompt

Pooled.

Press STAT > TESTS > 0: 2-SampTInt and press ENTER.



Type in 10.13 for Sx1.

Type in 42 for n1.



Type in 39 for n2.


Highlight Yes for Pooled.


The 2-SampTInt output shows the 95% confidence interval

for µ1 − µ2, as well as both sample means and both sample

standard deviations. It also shows the degrees of freedom

(n1 + n2 – 2).

The confidence interval is between -1.162 and 7.5622.

With 95% confidence, we believe that the true difference

between the population means is between -1.162 and 7.5622.

If you have the actual data rather than the statistics, then enter

the data in two lists and choose Data rather than Stats.

4

Hypothesis Testing: µ1 - µ2

Known Population Standard Deviations

If you are fortunate enough to have information about the population standard deviations of the

two populations, either from theory or a pilot study, then you would use Z-based, 2-SampZTest,

to perform a test of hypothesis about µ1 − µ2. Again, there are two different syntaxes for the 2-

SampZTest command.

If you know the statistical information for both

populations (population standard deviations,

sample means, and sample sizes), then the syntax is

2-SampZTest σ1, σ2 , , n1, , n2,µ1:

If you have the sample data stored in two lists, then

the syntax is 2-SampZTest σ1, σ2, List name1,

List name2, Frequency list1, Frequency list2,

µ1:

Example 1 (from above):


populations. Test at the 5% significance level if the two population means are different, µ1 ≠ µ2.

n1 = 200 = 6.4 σ1 = 0.7

n2 = 190 = 5.6 σ2 = 0.55

Press STAT > TESTS > 3: 2-SampZTest and press ENTER.





Type in 200 for n1.


Type in 190 for n2.

Highlight ≠ µ2 for µ1:


The 2-SampZ-Test output shows the

alternative hypothesis: μ1 ≠ µ2


p-value: 2.707293E-36

sample means: Ë1=6.4; Ë2=5.6

sample sizes: n1=200; n2=190

5

The p-value is 2.707293E-36, which is less than 5%. We reject H0 and conclude that the

population means are statistically-significantly different.


calculator will draw the curve and state the z value and the p-value.

Press STAT > TESTS > 3: 2-SampZTest and press ENTER.





Independent Samples With Unknown But Equal Population Standard Deviations

In the real world, one rarely knows the population standard

deviations. Thus, t-based, 2-SampTTest, is used to perform a

test of hypothesis about µ1 − µ2,with the assumption that the

population standard deviations are equal. The pooled sample

standard deviation for the two samples will be used. Again,

there are two different syntaxes for the

2-SampTTest command.

If you know the statistical information for both samples (sample standard deviations,

sample means, and sample sizes), then the syntax is 2-SampTTest , Sx1, n1, , Sx2,

n2, µ1:, Pooled.

If you have the sample data stored in two lists, then the syntax is 2-SampTTest

List name1, List name2, Frequency list1, Frequency list2, µ1:, Pooled.

6



populations with unknown but equal standard deviations.

Test at the 5% significance level if µ1 > µ2.

n1 = 42 = 78.4 s1 = 10.13

n2 = 39 = 75.2 s2 = 9.55

Press STAT > TESTS > 4: 2-SampTTest and press ENTER.




Type in 42 for n1.



Type in 39 for n2.

Highlight > µ2 for µ1:

Highlight Yes for Pooled.


The 2-SampTTest output shows the:

alternative hypothesis: μ1 > µ2

test statistic: t=1.460144062

p-value: 0.0741080013

degrees of freedom: 79

sample means: Ë1=78.4; Ë2=75.2

sample standard deviations: Sx1=10.13; Sx2=9.55

pooled sample standard deviation: Sxp=9.85527418


The p-value is 0.0741080013, which is greater than 5%. We do not reject H0 and conclude that

population mean1 is not statistically-significantly greater than population mean2.

In the above example, the Calculate option was chosen. If the

Draw option is chosen, the calculator will draw the curve and

state the z value and the p-value.

Press STAT > TESTS > 4: 2-SampTTest and press ENTER.



7



Independent Samples With Unknown and Unequal Population Standard Deviations

In the case of unequal population standard deviations, use the same procedures above for equal

population standard deviations, with the one exception of choosing No for Pooled.

Paired Samples

The confidence intervals and hypothesis tests described above all assume that the samples are

taken independently. Two samples that are taken from the same population are said to be

dependent. The most common data collection design with dependent samples is called

Pretest/Posttest. Data is collected from a sample before some type of treatment and then data is

collected again from that same sample after the treatment. We then work with the mean

difference between the pre- and posttest scores. The null hypothesis is that the average

difference is zero. By working with the differences between the variables, we can perform a

one-sample T-Test (it is rarely the case that the population standard deviations are known), using

the TInterval command.

Example 3:

Find the following confidence intervals for µd assuming that the

population of paired differences are normally distributed at the

99% confidence level.

n = 16 = 21.7 sd = 10.4

Press STAT > TESTS > 8: TInterval and press ENTER.

We do not have the sample data, so we will select Stats.

Select Stats for Inpt: and press the ENTER key.

Type in 21.7 for Ë (this is ).

Type in 10.4 for Sx (this is sd).

Type in 16 for n.



8

The TInterval output shows the 99% confidence interval

along with the sample mean, sample standard deviation, and

sample size.


We can state with 99% confidence that the mean difference

between the pre- and posttest is between 14.039 and 29.361.

Two Population Proportions

Confidence Interval: p1 - p2

Large and Independent Samples

If you have two large and independent samples, then you

would use a Z-based confidence interval, 2-PropZInt, to

estimate the difference p1 − p2 of two population proportions.

The function is located on the STAT page in the TEST list.

For population proportions, a large sample size is defined as

n1p1, n1q1, n2p2, n2q2 are all greater than 5.

Example 4:

The following information is obtained from two large and independent samples selected from

two populations. Construct a 95% confidence interval for p1 − p2.

n1 = 200 p1 = 0.42 n2 = 220 p2 = 0.35

The 2-PropZInt command requests x1 rather than p1. To find x1, use the formula .

x1 = (200)(0.42) = 84 x2 = (220)(0.35) = 77

Press STAT > TESTS > B: 2-PropZInt and press ENTER.

Type in 84 for x1.

Type in 200 for n1.

Type in 77 for x2.

Type in 220 for n2.



The 2-SampZInt output shows the 95% confidence interval for

p1 − p2, as well as both sample proportions and both sample

sizes.

9

The confidence interval is between -0.023 and 0.116301.

With 95% confidence, we believe that the true difference between the population porportions is

between -0.023 and 0.116301.

Hypothesis Testing: p1 - p2

If you have two large and independent samples, then you would

use Z-based, 2-PropZTest, to perform a test of hypothesis

about p1 − p2. The function is located on the STAT page in the

TEST list. For population proportions, a large sample size is

defined as n1p1, n1q1, n2p2, n2q2 are all greater than 5.


The following information is obtained from two large and independent samples selected from

two populations. Test at the 1% significance level if the two population proportions are different,

p1 ≠ p2.

n1 = 200 p1 = 0.42 n2 = 220 p2 = 0.35

The 2-PropZTest command requests x1 rather than p1. To find x1, use the formula .

x1 = (200)(0.42) = 84 x2 = (220)(0.35) = 77

Press STAT > TESTS > 6: 2-PropZTest and press ENTER.

Type in 84 for x1.

Type in 200 for n1.

Type in 77 for x2.

Type in 220 for n2.

Highlight ≠ p2 for p1:


The 2-PropZTest output shows the

alternative hypothesis: p1 ≠ p2


p-value: 0.14

sample proportions: =0.42; =0.35

pooled sample proportion: ; =0.383


The p-value is 0.14, which is greater than 1%. We reject H0 and conclude that the population

means are not statistically-significantly different.

10


calculator will draw the curve and state the z value and the p-value.

Press STAT > TESTS > 6: 2-PropZTest and press ENTER.



Chapter

11 Regression Analysis I

Simple Linear Regression

Simple Linear Regression Models

A simple linear regression model is an equation describing how to use one variable, x, to predict

another variable, y, based on the relationship existing in the sample data. Since the predictions

made from the sample data may differ from the actual values in the population data, the symbol

y' is used for the predicted value of y. The simplest possible model is a linear one: y' = a + bx.

The graph is a line, where b is the slope of the line and a is the y-coordinate of the y-intercept.

Creating a Linear Regression Model

The TI-84 Plus calculator has two built-in functions,

LinReg(ax+b) and LinReg(a+bx) to compute a simple linear

regression model. They are both located on the STAT page in

the CALC list. These are two forms of the same function, one

that writes the equation as ax+b and the other that writes the

equation as a+bx. We will use the latter form, a+bx, but either

is okay.

We will create a linear regression model for the English and

Math scores from the previous example. They are stored in lists

L1 and L2.

Press STAT > CALC > 8: LinReg(a+bx).

Type: L1 > , > L2 > ,

Press VARS > Y-VARS > 1 > 1 to get Y1.

Press ENTER.

2

The LinReg output shows:



slope: b=.3135854034



The equation of the linear regression model for the English and Math scores is

y = 53.2616 + 0.3136x

Graphing the Linear Regression Line

The LinReg command we entered requested that the

equation of the linear regression model (least squares

line) be stored in Y1.

Execute the STAT PLOT command again and the least squares

line will appear on the scatter diagram.

Confidence Interval for B

A goal for determining the regression line is to find the true

value of the slope B of the population regression line. The

slope, b, of the regression line for the sample is a point estimate

of the slope, B, of the regression line for the population. A

different sample would give a different b value. Thus, b is a

random variable and has a t distribution.

The LinRegTInt function can be used to construct the

confidence interval for B, based on b. The LinRegTInt command is located on the STAT page

in the TESTS list. The command does require that the data be stored in two lists.

Example

Using the data in the English and Math scores example above, construct a 95% confidence

interval for B.

Press STAT > TESTS > G:LinRegTInt and press ENTER.

3

Type L1 in for the Xlist:

Type L2 for the Ylist:

Let the Freq: remain at 1.

Type .95 for C-Level:


The LinRegTInt output shows:


confidence interval: (-.0382, .66535)

slope: b=.3135854034

degrees of freedom: df=7

sample standard deviation: s=4.729745193




The 95% confidence interval for the slope of the English and Math scores example is

(-.0382, .66535).

Hypothesis Tests

The LinRegTTest can be used to test whether or not the variable

x can meaningfully predict the variable y. This is equivalent to

testing whether or not B, the population slope coefficient for the

model (approximated by b), is really 0. LinRegTTest is located

on the STAT page in the TESTS list. The test requires the

names of the lists containing the data values and what the

alternative hypothesis is.

Example:

Test at the 1% significance level whether the slope of the regression line for the above example

on English and Math scores is positive.

Perform the test of H0: B = 0 versus H1: B > 0 in LinRegTTest.

4

Press the STAT > TESTS > F:LinRegTTest and press ENTER.

Type in L1 for the Xlist:

Type in L2 for the Ylist:

Let the Freq: remain at 1.

Move the cursor over >0 and press ENTER.

Set the RegEQ: to Y1.


The value of t is 2.108 and the

p-value is .0365, which is

greater than our significance

level of 1%. We fail to reject

the null hypothesis and

conclude that the slope is not

significantly greater than zero.

The regression equation was stored in Y1. This equation (linear

regression model) can be used to find y' for a given value of x,

such as x = 75.

Type Y1(75) and press ENTER.

To find the predicted Math score for someone with an English

score of 60, type Y1(60) to get a predicted Math score of 72.08.

One-Way Analysis of Variance

We have already seen how to test for the equality of means between two different populations

with 2-SampZTest and 2-SampTTest. With the added assumption of common population

standard deviations, we can extend the tests to more than two populations with a technique

known as Analysis of Variance (ANOVA for short).

In a one-way ANOVA, only one factor or variable is being tested. The null hypothesis is that

means of three or more populations are equal; the alternative hypothesis is that at least two of the

means differ.

The TI-84 built-in ANOVA function is on the STAT page in

the TESTS list. First store the sample data into lists, one per

population.

5

The syntax for the ANOVA function is: ANOVA(List1, List2, List3, . . .List20). There is a

minimum of 2 lists and a maximum of 20 lists. Use the list names in which the data is stored.

The result contains the F test statistic, degrees of freedom, various sums of squares, mean sums

of squares, and most importantly the p-value for the test.

Example: Fourth Grade Arithmetic With Equal Sample Sizes

Fifteen fourth-grade students were exposed to one of three

different methods of teaching arithmetic. They were randomly

assigned to three groups of five. At the end of the semester, the

same test was given to all 15 students. The following table gives

the scores of the students in the three groups.

Assume that the three populations are normally distributed with equal standard deviations.

Calculate the value of the test statistic F. At the 1% significance level, can we reject the null

hypothesis that the mean arithmetic score of all fourth-grade students taught by each of these

three methods is the same?

Press STAT > ENTER key to get to the Stat Editor.

Enter the Method I data values into L1.

Enter the Method II data values into L2.

Enter the Method III data values into L3.

Press the STAT > TESTS > H: ANOVA and press ENTER.

Type in: L1, L2, L3) and press ENTER.

Method

1

Method

2

Method

3

48 55 84

73 85 68

51 70 95

65 69 74

87 90 67

6

The one-way ANOVA table would appear as follows:

Source of

Variation

Degrees of

Freedom

Sum of Squares Mean Squares Value of the

Test Statistic

Between

(Factor)

Within (Error)

2

12

432.1333

2372.8

216.0667

197.7333 F = =

1.0927

Total 14 2804.9333 p = 0.3665

The df for Between is the number of populations minus 1: k -1 = 3 – 1 = 2.

The df for Within is the number of data items minus number of populations: n - k = 15 – 3 = 12.

The p-value is 0.366. Since this is greater than the significance level of 1%, we fail to reject the

null hypothesis. There is insufficient evidence from the data to show that the different teaching

methods have significantly different average results.

Example: Bank Tellers With Unequal Sample Sizes

From time to time, unknown to its employees, the research department at Post Bank observes

various employees for their work productivity. Recently this department wanted to check

whether the four tellers at a branch of this bank serve, on average, the same number of customers

per hour. The research manager observed each of the four tellers for a certain number of hours.

The following table gives the number of customers served by the four tellers during each of

the observed hours.

At a 5% level of significance, test the null hypothesis that the mean number of customers served

per hour by each of the four tellers is the same. Assume all the assumptions required to apply the

one-way ANOVA procedure hold true.


Enter the Teller A data values into L1.

Enter the Teller B data values into L2.

Enter the Teller C data values into L3.

Enter the Teller D data values into L4.

Teller A Teller B Teller C Teller D

19 14 11 24

21 16 14 19

26 14 21 21

24 13 13 26

18 17 16 20

13 18

7


Type in: L1, L2, L3, L4) and press ENTER.


Source of

Variation

Degrees of

Freedom


Test Statistic

Between

(Factor)

Within (Error)

3

18

255.6182

158.2

85.2061

8.7889 F = =

9.6947

Total 21 413.8182 p = 0.0005

The df for Between: k -1 = 4 – 1 = 3.

The df for Within: n - k = 22 – 4 = 18.

The p-value is 0.0005. Since this is less than the significance level of 5%, we reject the null

hypothesis. There is significant evidence here to show that the tellers’ job performances are not

all the same.

Chapter

12 Regression Analysis II

Multiple Linear Regression

and Other Topics

The TI-84 Plus calculator does not have multiple regression functions built directly into it. The

calculator can aid in solving multiple regression problems by using the formulas found in chapter

12.

Chapter

13 Analysis of

Categorical Data

A Goodness-of-Fit Test A goodness-of-fit test is used to make a test of hypothesis about

experiments with more than two possible outcomes (or

categories). These are called multinomial experiments. The

frequencies of each possible outcome obtained from the

experiment are called the observed frequencies. A goodness-of-

fit test tests the difference between the observed frequencies

and the expected frequencies (npi). This difference follows a

chi-square distribution. The sample size should be large enough

so that the expected frequency for each category is at least 5.

The TI-84 Plus command for the goodness-of-fit test is χ2GOF-Test, which performs a test to

confirm that sample data is from a population that conforms to a specified distribution (expected

frequencies). The command requires that both the observed and expected frequencies are put in

lists. The degrees of freedom, df, is the number of outcomes minus 1. The command is located

on the DISTR page. DISTR appears above the VARS key.

Example 1:

The following table lists the frequency

distribution of 90 rolls of a die. Test at

the 5% significance level whether the

null hypothesis that the given die is

fair is true.

Enter the observed frequencies into List L1 and the expected

frequencies into List L2, as shown on the left.

Outcome of roll 1 2 3 4 5 6

Observed Frequency 16 21 15 12 14 12

Expected

Frequency

15 15 15 15 15 15

2

There are 6 possible outcomes, so the degrees of freedom, df, is 6 – 1 = 5.

Select: STAT > TESTS > D: χ2GOF-Test and press ENTER.

Type in 2nd

> 1 for Observed.

Type in 2nd

> 2 for Expected.

Type in 5 for df:


The output for the χ2GOF-

Test shows the:

Chi-square value: χ2

= 3.7333

p-value = 0.5884

Degrees of freedom: df = 5

CNTRB = {.066666667, 2.4, 0, .6, .066666667}

Note: CNTRB= provides a list of the contributions of each

category to the overall value of χ2. This would be the values in the column. Notice that a

roll of ‘2’ had the largest contribution.

Since the p-value is greater than 5%, reject the null hypothesis and conclude that the dice is not

fair.

Contingency Tables

When measuring the relationship between two categorical variables, one of the most important

tools for analyzing the results is a two-way classification table, also known as a contingency

table.

A Test of Independence

The contingency table can be used to see if the variables are

independent, by comparing observed frequencies with the

frequencies that would be expected from such a sample if they

were independent. A Chi-Square (Χ2) test-statistic can be

computed from the observed and expected frequencies. The TI-

84 function Χ2-Test is used for the Test of Independence and is

located on the STAT page in the TESTS list. The Χ2-Test

function works differently than the other tests on this menu. It

requires you to enter the observed frequencies into a matrix. The computed expected frequencies

are then stored automatically in another matrix.

Example: School Referendum

In Favor Against No Opinion

3

A random sample of 300 adults was selected and asked if

they were in favor of the new school referendum. The

two-way classification table of the responses of the

adults is presented in the table.

Test at a 5% significance level whether or not Gender and Opinion are independent.

First the data has to be stored differently than in previous

statistical tests on the calculator. The data for a Chi-Square test

of independence has to be stored in a Matrix.

Type: 2nd

> x-1

to get to MATRIX.

Highlight EDIT and press the ENTER key.

After MATRIX [A] type in 2 x3.

The 2 represents the number of rows in the table.

The 3 represents the number of columns in the table.

Enter the data values into the matrix as they appear in the table.


The χ2-Test screen shows:

That the Observed values are in matrix A

That the Expected values will be put in matrix B


The output for the χ2-Test shows the:

Test statistic: χ2 =8.252773109

P-value: p =.0161410986

Degrees of freedom: df=2

Since the p-value of .016 is less than the significance level of .05, we reject the null hypothesis

that row and column variables are independent. We have significant evidence that gender and

opinion concerning the school referendum are related for all adults.

The Expected frequencies can be found in Matrix B.

Men 93 70 12

Women 87 32 6

4

Type: 2nd

> x-1

to get to MATRIX.

Select: 2: [B] and press the ENTER key.

Press ENTER again.


calculator will draw the curve and state the χ2 value and the p-value.




A Test of Homogeneity

A test of homogeneity is a test to determine if two (or more) populations are homogeneous

(similar) with regard to the distribution of a certain characteristic. The procedure to perform this

test on the TI-84 Plus is identical to performing a Test Of Independence (please see above).

Chapter

14 Analysis of Variance

(ANOVA)

One-Way Analysis of Variance

We have already seen how to test for the equality of means between two different populations

with 2-SampZTest and 2-SampTTest. With the added assumption of common population

standard deviations, we can extend the tests to more than two populations with a technique

known as Analysis of Variance (ANOVA for short).

In a one-way ANOVA, only one factor or variable is being tested. The null hypothesis is that

means of three or more populations are equal; the alternative hypothesis is that at least two of the

means differ.

The TI-84 built-in ANOVA function is on the STAT page in

the TESTS list. First store the sample data into lists, one per

population.

The syntax for the ANOVA function is: ANOVA(List1, List2, List3, . . .List20). There is a

minimum of 2 lists and a maximum of 20 lists. Use the list names in which the data is stored.

The result contains the F test statistic, degrees of freedom, various sums of squares, mean sums

of squares, and most importantly the p-value for the test.

Example: Fourth Grade Arithmetic With Equal Sample Sizes

Fifteen fourth-grade students were exposed to one of three

different methods of teaching arithmetic. They were randomly

assigned to three groups of five. At the end of the semester, the

same test was given to all 15 students. The following table gives

the scores of the students in the three groups.

Method

1

Method

2

Method

3

48 55 84

73 85 68

51 70 95

65 69 74

87 90 67

2

Assume that the three populations are normally distributed with equal standard deviations.

Calculate the value of the test statistic F. At the 1% significance level, can we reject the null

hypothesis that the mean arithmetic score of all fourth-grade students taught by each of these

three methods is the same?


Enter the Method I data values into L1.

Enter the Method II data values into L2.

Enter the Method III data values into L3.


Type in: L1, L2, L3) and press ENTER.


Source of

Variation

Degrees of

Freedom


Test Statistic

Between

(Factor)

Within (Error)

2

12

432.1333

2372.8

216.0667

197.7333 F = =

1.0927

Total 14 2804.9333 p = 0.3665

The df for Between is the number of populations minus 1: k -1 = 3 – 1 = 2.

The df for Within is the number of data items minus number of populations: n - k = 15 – 3 = 12.

3

The p-value is 0.366. Since this is greater than the significance level of 1%, we fail to reject the

null hypothesis. There is insufficient evidence from the data to show that the different teaching

methods have significantly different average results.

Example: Bank Tellers With Unequal Sample Sizes

From time to time, unknown to its employees, the research department at Post Bank observes

various employees for their work productivity. Recently this department wanted to check

whether the four tellers at a branch of this bank serve, on average, the same number of customers

per hour. The research manager observed each of the four tellers for a certain number of hours.

The following table gives the number of customers served by the four tellers during each of

the observed hours.

At a 5% level of significance, test the null hypothesis that the mean number of customers served

per hour by each of the four tellers is the same. Assume all the assumptions required to apply the

one-way ANOVA procedure hold true.


Enter the Teller A data values into L1.

Enter the Teller B data values into L2.

Enter the Teller C data values into L3.

Enter the Teller D data values into L4.


Type in: L1, L2, L3, L4) and press ENTER.

Teller A Teller B Teller C Teller D

19 14 11 24

21 16 14 19

26 14 21 21

24 13 13 26

18 17 16 20

13 18

4


Source of

Variation

Degrees of

Freedom


Test Statistic

Between

(Factor)

Within (Error)

3

18

255.6182

158.2

85.2061

8.7889 F = =

9.6947

Total 21 413.8182 p = 0.0005

The df for Between: k -1 = 4 – 1 = 3.

The df for Within: n - k = 22 – 4 = 18.

The p-value is 0.0005. Since this is less than the significance level of 5%, we reject the null

hypothesis. There is significant evidence here to show that the tellers’ job performances are not

all the same.

Chapter

15 Nonparametric

Inference

The TI-84 Plus calculator does not have nonparametric functions built directly into it. However,

the calculator can be used to solve problems involving nonparametric statistical methods. The

following examples show how to use the TI-84 plus calculator to compute the numbers needed

for the nonparametric:

sign test

paired-sample sign test

Wilcoxon Rank Sum test for two independent samples.

Sign Test

The Sign Test is a nonparametric test and it can be used even if nothing is known about the

continuous population distribution. The Sign Test can test claims about the median of the

population, because any sample measurement stands a 50% chance of being above the median

and a 50% chance of being below the median. Sample data can be treated as a binomial

experiment, with successes occurring when the measurement is above the median.

Example: Median Price of Homes (Small Sample)

A new car salesman states that the median price of cars in a small town is $37,000. A sample of

10 cars selected by a statistician produced the following data on the prices in dollars.

Car 1 2 3 4 5 6 7 8 9 10

Price 47,500 23,600 37,000 68,200 29,450 42,00 56,400 88,210 98,425 15,300

2

Using a 5% significance level, can we conclude that the median price differs from $37,000?

The hypotheses are:

H0: the median price is $37,000

H1: the median price differs from $37,000

Since one of the data items is the median, $37,000, it will be ignored. Thus, n = 9.

The sample prices are replaced by signs:

Plus (+) if the price is above the hypothetical median of $37,000

Minus (-) if the price is below $37,000.

We have six +’s out of nine signs. Since it is a two-tailed test, the p-value will be twice the

probability of getting six or more successes out of nine trials in a binomial experiment where

p = 0.5. The binomcdf(n, p, x) function calculates the probability from 0 to x and it can be used

to calculate this probability. The binomcdf( function is located at DISTR.

Thus, n = 9, p = 0.5, and x = 6.

p-value = 2 P(X ≥ 6)

= 2(1 − P(X ≤ 5))

= 2(1 − binomcdf(9, 0.5, 5))

= 0.5078125

Since the p-value of 0.5078 is greater than our significance level of 5%, we fail to reject the null

hypothesis. There is not enough evidence to conclude that the median differs from $37,000.

Paired-Sample Sign Test

Given paired data, the Sign Test can be used to test to see if the medians of their respective

populations are the same. If the medians are the same, then there is a 50% chance for one

measurement of a pair to be larger than the other. Each pair can be replaced by a sign indicating

whether the first or the second measurement is larger.

Example: Blood Pressure (Small Sample)

A researcher wants to find the effect of a special diet on systolic blood pressure in adults. She

selected a sample of 12 adults and put them on this dietary plan for three months. The following

table gives the systolic blood pressure of each adult before and after the completion of the plan.

Car 1 2 3 4 5 6 7 8 9 10

Price + - ignored + - + + + + -

3

Subject A B C D E F G H I J K L

Before 210 185 215 198 187 225 234 217 212 191 226 238

After 196 192 204 193 181 233 208 211 190 186 218 236

Using a 2.5% significance level, can we conclude that the dietary plan reduces the median

systolic blood pressure of adults? Our hypotheses are:

H0: the diet does not reduce the median systolic blood pressure of adults

H1: the diet does reduce the median systolic blood pressure of adults

We begin by adding a plus or minus sign indicating whether the Before or After pressure is

larger.

Plus (+) if the Before is larger

Minus (-) if the After is larger

Subject A B C D E F G H I J K L

Before 215 195 200 198 177 225 244 217 212 191 226 228

After 186 196 185 183 161 203 208 200 189 189 221 240

Sign + - + + + + + + + + + -

There are ten +’s out of twelve signs. Since it is a right-tailed test, the p-value will be the

probability of getting ten or more successes out of twelve trials in a binomial experiment where p

= 0.5.

Thus, n = 12, p = 0.5, and x = 10.

p-value = P(X ≥ 10)

= 1 − P(X ≤ 9)

= 1 − binomcdf(12, 0.5, 9)

= 0.019

Since the p-value is less than the significance level 2.5%, we reject the null hypothesis. The data

leads us to conclude that the diet does lower the median systolic blood pressure of adults.

Johnson Statistics 6e TIManual

Documents

Transcript of Johnson Statistics 6e TIManual