Johnson Statistics 6e TIManual
TI-84 Plus
GRAPHING CALCULATOR MANUAL
James A. Condor
Manatee Community College
Deanna L. Voehl
Indian River State College
to accompany
Statistics: Principles and Methods
Sixth Edition
by
?? Johnson ?? State University
JOHN WILEY & SONS, INC.
Contents
Preface
1 Introduction
2 Organization and Description of Data
3 Descriptive Study of Bivariate Data
4 Probability
5 Probability Distributions
6 The Normal Distribution
7 Variation in Repeated Samples - Sampling Distributions
8 Drawing Inferences from Large Samples
9 Small Sample Inferences for Normal Populations
10 Comparing Two Treatments
11 Regression Analysis I; Simple Linear Regression
12 Regression Analysis II; Multiple Linear Regression and Other Topics
Preface
Statistics is an important field of study, now more so than ever. We are surrounded by statistical
information in work and in our everyday lives. Many of us work in professions that require us to
understand statistical summaries, and some of us work in areas that require us to produce
statistical information.
The art of teaching statistics has changed dramatically in recent years, with computational
software eliminating the need for many of the previously taught techniques. The answers to
many of the complex computations come easier and faster to students with today’s calculators
performing most of the work in elementary statistical analysis. The challenge to the instructor is
to get the student to acquire a greater understanding of what (s)he is doing, with the calculator
tending to the details of the computations.
The TI-84 Plus calculator, by Texas Instruments, is a leading example of the progress in
statistical technology. Texas Instruments has provided us with an advanced device at an
affordable price that is capable of powerful statistical work and yet is still easy to use. This text
will run through the statistical capabilities of the TI-84 Plus calculator.
This text will follow the order of topics presented in Johnson’s Statistics, sixth edition, published
by John Wiley & Sons, Inc., but should also prove useful with other texts. It will not explain the
underlying statistics but instead focus on how best to use the TI-84 Plus calculator in computing
them.
Chapter
1 Introduction
Use of Technology
Statistics is a field that deals with sets of data. After the data is collected, it needs to be organized
and interpreted. There is a limit to how much of the work can be done effectively without the
help of some type of technology. The use of technology, such as a calculator with enhanced
statistical functions, can take care of most of the details of our work so that we can spend more
time focusing on what we are doing and how to interpret the results.
Technology can help us not only to store and manipulate data, but also to visualize what the data
is trying to tell us. As we work with a calculator, we will be able to:
• Enter, revise, and store data.
• Perform statistical computations on stored data or entered statistics.
• Draw pictures, based on the data, to help us to understand what useful information can be
inferred from that data.
Advantages of Using a Calculator
There are many good statistical software packages available, such as MINITAB, SAS, and
SPSS. Excel also contains many built-in statistical functions, as well as supporting plug-ins for
statistical work. Still, for the student starting to learn statistics, it’s hard to beat the advantages of
using a powerful hand-held calculator.
• It is portable and easy to use in many different work environments.
• It has battery power that lasts far longer than that of a laptop computer.
• It is less expensive than a computer.
• It is less expensive than a statistical software package.
Advantages of Using the TI-84 Plus
This calculator manual will focus on how to get the most out of using the TI-84 Plus calculator
by Texas Instruments. The TI-83 was first released in 1996, improving upon its predecessors (the
TI-81 and TI-82) with the addition of many advanced statistical and financial functions. The TI-
83 Plus and the TI-84 Plus have essentially the same features as the TI-83, but with increased
memory capacity and a few extra statistical features. They are powerful calculators with
advanced functions, but at the same time easy to use.
• Most complicated statistical computations are handled through menus which prompt
you for the necessary input.
• Data entry and revision is handled through a Statistical List Editor that is similar to a
spreadsheet in how it is used.
• Statistical graphs are handled through menus and important parts of the graph can be
read by tracing along with the arrow keys.
• The calculators are built sturdily, and can withstand many falls off of student desks.
Entering and Revising Data
This chapter focuses on getting numbers into your calculator and storing them for the
organization, interpretation, and analysis part of statistics. When you are not given the necessary
statistics to perform calculations, you will need to enter data into the calculator to generate the
statistics. We will learn how to do statistical calculations with the calculator in future chapters.
Using the Statistical List Editor
The Statistical List Editor in the TI-84 Plus calculator provides a convenient way to enter
numbers and review them. Numbers from a data set can be stored in a list in the calculator so
that we can keep numbers that are related to each other together.
Example: Calories Consumed
An individual is modifying eating habits and has kept track of calories consumed for the last 10
days, as follows:
1474, 1633, 1686, 1748, 1326, 1112, 1245, 1539, 1220, 1561
If we want to do any sort of analysis on these numbers, we will need to get them into the
calculator and keep them together as a group.
Accessing the Statistical List Editor
Press the STAT key.
To input data or to make changes to an existing set of data values, use the Edit function (number 1
in the EDIT menu).
Press the number 1 key, or press the ENTER key if 1: is highlighted.
Resetting the Statistical List Editor
If the Statistical List Editor does not show the columns labeled as L1, L2, and L3, you can reset
the Editor to its default settings by selecting SetUpEditor as follows:
Press the STAT key.
Press the 5 key.
Press the ENTER key.
Once the Editor is set up, return to the Edit function.
Entering Data in the Statistical List Editor
Type in the ten calorie counts under the column labeled L1. Press the ENTER key when you are
done with one number and ready to move on to the next number.
Type in 1474 and press the ENTER key.
Type in 1633 and press the ENTER key.
Type in 1686 and press the ENTER key.
…continue until all the data values have been entered.
Use the up and down arrow keys to go back and forth between the numbers. Try changing the
value of one of the entries by typing in a new calorie count.
Clearing a List of Data Values
After a list of data values is no longer needed, you can delete the values by using one of the
following methods:
• Highlight each data value and press the DEL key. This method is slow because it clears
the list one data value at a time.
• Highlight the list name (for example, L1 at the top of the column), press the CLEAR key,
and then press the ENTER key.
• Use the ClrList function, number 4 in the EDIT menu, as follows:
Press the STAT key.
Press the number 4 key.
Press the 2nd key and then press the 1 key to get L1.
Press the ENTER key.
Entering Lists Directly to the Statistical List Editor
The home screen is where you do most of your calculator work that doesn’t involve menus.
Wherever you are on your calculator, you can always get back to the home screen by pressing
the 2nd key and then the MODE key to access the QUIT function. From the home screen, you
can enter data into a list by typing it between a set of braces { and }, and separating the numbers
by commas:
{1474, 1633, 1686, 1748, 1326, 1112, 1245, 1539, 1220, 1561}
Once you’ve typed the numbers in, you will want to save them. Use the store key STO► followed
by L1, L2, or any other list. (L1 – L6 are above the 1 – 6 keys and are reached with the 2nd key.)
When you press the STO► key, the screen will display an arrow pointing to the right.
Once a list has been stored, you can see it by typing its name. For example, if you stored the
calories in L1, typing L1 (2nd key, then the 1 key) on the home screen and pressing ENTER will
display the list’s contents. (You will need to use the left and right arrow keys to see all of the
list’s contents.)
Entering Lists Directly to a Name
You can also store data to a name that you create. From the home screen, you can save a list to a
name by using the STO► key.
Moving the Named List to the Statistical List Editor
The data is stored under the name CAL, but CAL is not yet displayed in the Statistical List
Editor. To show CAL there, insert it as a new column as follows:
Press the STAT key.
Press the number 1 key.
Press the ▲ key to highlight the name at the top of one of the columns.
Press the 2nd key and then the DEL key to get to the INS (insert) function.
Type in the name of your list, CAL (note that the A-LOCK is turned on, so you do not need to
press the ALPHA key before each letter).
Press the ENTER key.
The numbers that you stored in CAL should now appear in the list.
Create a Named List Within the Statistical List Editor
The lists L1 through L6 are good places to work with data if you do not need to save the data. If
you may need the data later and do not want to accidentally over-write it, you can give the data a
name. A list can be named with 1–5 characters. The first character must be a letter A - Z or the
angle symbol θ “theta”. The other characters can be a letter, a θ, or a number 0 - 9.
To get letters from the keyboard, press the ALPHA key before each letter. The letters appear
above and to the right of most of the keys. If you are typing several letters in a row, press A-
LOCK (above the ALPHA key), type the letters, and then press the ALPHA key again to release
the lock.
To create a new list named BURN within the Statistical List Editor:
Press the STAT key.
Press the number 1 key.
Press the ▲ key to highlight the name at the top of one of the columns.
Press the 2nd key and then the DEL key to get to the INS (insert) function.
Type in the name of your new list (note that the A-LOCK is already turned on for you, so
you don’t need to press the ALPHA key before each letter).
Press the ENTER key.
BURN should now appear at the top of a column.
The individual also kept track of the number of calories burned by
exercising for each of the last 10 days. Enter the following data into
the newly named list, BURN. {128, 37, 440, 128, 258, 486, 325, 171, 0, 529}
Getting the Names of Lists
Some of the calculator commands require that you type in the name of a list.
If the name of the list is one of L1 – L6, then you can type it quickly from the keyboard above
the 1 – 6 keys.
If the list has a specific name, you cannot just type the name of the list from the keyboard using
the ALPHA key. List names on the TI-84 Plus calculator are distinguished from the names of
other variables by a small L to the left of the name. Select LIST (2nd STAT) and use the arrow
keys to choose one of the list names. Then press the ENTER key.
Choosing Lists to Edit
There are several ways to control which lists are displayed in the Statistical List Editor. Press the
STAT key and then the number 5 key to get to SetUpEditor and type in the desired list of names,
separated by commas. The example shown will configure the Statistical List Editor to display
only lists CAL and BURN. Lists L1 – L6 will not be displayed. To get back to the default list of
names, L1 – L6, use the SetUpEditor command without any list names.
Remove a Named List Within the Statistical List Editor
In the Statistical List Editor, use the arrow keys to move to the name of the list to be removed.
Press the DEL key for delete. The list disappears, but the contents of the list have not been
deleted. To erase the contents of a list, highlight the name of the list and press the CLEAR
button and then the ENTER key. This will leave the list name in the editor and clear its entries.
Deleting Lists
If you store many lists, programs, etc. on your calculator, you may run out of memory. Go to the
MEM menu (above the + key) and select 2:Mem Mgmt/Del... Then select 4:List to see all of the
current lists. Move the cursor to the list that you want to delete and press the DEL key, then
choose YES.
Numeric Operations on Lists
The TI-84 can perform various mathematical operations on data that are stored in lists. Enter the
following data into lists L1 and L2.
m   2   5   7   10   14   17   20   25   30
f   4   2   9   12   10    7    8    3    1
Press: STAT > 1:Edit > ENTER
Enter the m values into list L1.
Enter the f values into list L2.
Finding the sum of a List
The TI-84 can quickly calculate the sum of a List.
Press: 2nd > MODE (QUIT) to return to the home screen.
To find Σm, find the sum of list L1.
Press: 2nd > STAT (LIST)
Use the right arrow key to move the cursor to MATH.
Press: 5:sum(
Press: 2nd > 1 (L1) > ) and press ENTER.
Repeat the above steps, using L2, to find Σf.
Find Σmf by creating a new List from an existing List.
The TI-84 can create a new List from one or more existing Lists.
One way to calculate Σmf is to create a new list by multiplying the corresponding
values of m and f.
Press: STAT > 1:Edit and press ENTER.
Use the right arrow key to move the cursor to the L3 column.
Use the up arrow key to move the cursor to the column header, L3.
Press: 2nd > 1 (L1) > * > 2nd > 2 (L2) and press ENTER.
List L3 now contains the product, mf.
To calculate Σmf, use the Sum command as follows:
Press: 2nd > STAT (LIST) > MATH > 5:sum( and press ENTER.
Press: 2nd > 3 (L3) > ) and press ENTER.
Find Σmf by using the Sum( command and mathematical operations.
An alternative to creating a new List is to perform the multiplication of m and f within the
Sum( command itself.
At the home screen,
Press: 2nd > STAT > MATH > 5:sum( and press ENTER.
Press: 2nd > 1 > * > 2nd > 2 > ) and press ENTER.
Find Σf² by using the Sum( command and mathematical operations.
In a similar fashion:
Press: 2nd > STAT > MATH > 5:sum( and press ENTER.
Press: 2nd > 2 > x² > ) and press ENTER.
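For readers who want to check the calculator results on a computer, the four sums above can be reproduced in a few lines of Python. This is only a sketch for comparison and is not part of the TI-84 procedure; the variable names are chosen here for illustration.

```python
# Data from the example above (lists L1 and L2 on the calculator).
m = [2, 5, 7, 10, 14, 17, 20, 25, 30]
f = [4, 2, 9, 12, 10, 7, 8, 3, 1]

sum_m = sum(m)                           # like sum(L1)
sum_f = sum(f)                           # like sum(L2)
mf = [mi * fi for mi, fi in zip(m, f)]   # like L1*L2 stored in L3
sum_mf = sum(mf)                         # like sum(L3) or sum(L1*L2)
sum_f_sq = sum(fi ** 2 for fi in f)      # like sum(L2^2)

print(sum_m, sum_f, sum_mf, sum_f_sq)
```

The element-by-element product in `mf` mirrors what the calculator does when one list is multiplied by another of the same length.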
Chapter
2 Organization and
Description of Data
One of the tasks of a statistician is to try to make sense of the data by organizing it. In today’s
world of technology, you are presented with tables and graphical displays of data on a daily
basis. This chapter focuses on ways of organizing data on the TI-84 Plus calculator.
Frequency Distributions
One way to organize data is to group similar values together. We can then count the number of
elements in each group.
To help us see how the values are distributed, we will split them into intervals or classes, all of
the same width, and count how many are in each class. Listing the classes and their frequencies
will give us the information needed to create a frequency distribution. The calculator does not
create a frequency distribution table, but it will construct a frequency histogram which will
display the information needed to create one.
Example: Hours of Sleep
Example 5, page 32, gives the number of hours of sleep the
previous night for 59 students at a large Midwest university.
Enter the hours of sleep into a list labeled HOURS.
Create a Frequency Histogram
The data is grouped into 5 classes beginning with 4.3 and
ending with 10.3. The class width is 1.2.
Press the WINDOW key.
Set Xmin to 4.3 and Xmax to 10.3.
Set Xscl to our class width 1.2.
Set Ymin to 0 and Ymax to 25.
Now that we have told the calculator how to organize the data into classes, we are ready to set up
the frequency histogram.
Press the 2nd key.
Press the Y= key to get to STAT PLOT.
Press the ENTER key.
With the flashing cursor over On,
Press the ENTER key to turn the plot On.
Use the arrow keys to select the histogram.
Enter the name of the list HOURS for Xlist (use the LIST menu).
Leave Freq at 1, since each data value represents only one point.
Now we are ready to draw the frequency histogram.
Press the GRAPH key.
Press the TRACE key.
Use the arrow keys to move from one bar to the next. The screen
will display the frequency in each class (n = 20) and the range of values in each class (min = 6.7;
max < 7.9).
The histogram provides us with the following information.
Class         Frequency
4.3 – 5.5     5
5.5 – 6.7     15
6.7 – 7.9     20
7.9 – 9.1     16
9.1 – 10.3    3
Summary of How to Create a Histogram.
1. Enter the data into a list.
2. Determine the range of values for your data, as well as your desired class width.
3. Press the WINDOW key and set Xmin, Xmax, Xscl to the range of values and class
width. Set Ymin to 0 and Ymax to a value large enough for the tallest box in the
histogram. (You may need to revise this.)
4. Press the 2nd key and then the Y= key to get to STAT PLOT. Press 4: PlotsOff and
press ENTER. This will turn off all of the plots.
5. Return to STAT PLOT and select a plot. Turn the plot On by pressing the ENTER key
and select the third figure in the first row (the Histogram). Enter the name of your data
list in Xlist, and leave Freq as 1.
6. Press the GRAPH key. Press the TRACE key to display the information needed to
create the Frequency Distribution Table.
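The grouping that the histogram performs can be sketched in Python as a rough cross-check. The function name `class_counts` and the short `sample` list are made up for illustration; the sample is not the textbook's sleep data. Each class is treated as the half-open interval from its lower boundary up to, but not including, the next boundary, which matches the "min ≤ x < max" readout shown when tracing.

```python
def class_counts(data, xmin, xscl, n_classes):
    """Count how many values fall in each class [lower, lower + xscl)."""
    counts = [0] * n_classes
    for value in data:
        # Index of the class that contains this value.
        index = int((value - xmin) // xscl)
        if 0 <= index < n_classes:
            counts[index] += 1
    return counts


# Hypothetical values chosen away from class boundaries, where
# floating-point rounding could otherwise affect the result.
sample = [4.5, 5.0, 6.0, 6.9, 7.0, 7.5, 8.0, 9.5]
counts = class_counts(sample, xmin=4.3, xscl=1.2, n_classes=5)
print(counts)
```

The arguments `xmin` and `xscl` play the same role as the Xmin and Xscl settings in the WINDOW screen.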
Enter the Frequency Distribution Table into the Statistical Editor
Since the class intervals cannot be entered into a List, the midpoint of each class is used. Such a
frequency distribution for the example of the number of hours slept would look as follows:
Midpoint    Frequency
4.9         5
6.1         15
7.3         20
8.5         16
9.7         3
Create two new lists, MIDPT and FREQ. Enter the above table into these lists, as shown.
Create a Polygon
The steps to create a Polygon are very similar to those needed to create a histogram. We will use
the data stored under the labels MIDPT and FREQ.
Press the 2nd key and then the Y= key to get to STAT PLOT.
Press the number 1 key.
Turn Plot1 On.
Highlight the xyline in Type (2nd item in the 1st line).
Type MIDPT for the Xlist:
Type FREQ for the Ylist:
Select the square in Mark:
The same WINDOW that was used for the histogram is applicable to the Polygon. Another
option is to let the calculator determine the correct window by using the Zoom feature, as
follows:
Press the ZOOM key.
Press the number 9 key to choose ZoomStat.
Press the GRAPH key.
Press the TRACE key to see the data points.
Create a Relative Frequency Column
A Relative Frequency Column can be generated from the Frequency Column.
Create a new list named RELFR. With the cursor still highlighting
the name RELFR,
Press 2nd STAT (LIST) and select the List named FREQ.
Press the ÷ Key.
Press 2nd STAT (LIST), move the cursor to MATH.
Select 5:sum( and press ENTER.
Press 2nd STAT (LIST) and select the List named FREQ.
Press the ) Key and press ENTER.
The above sequence of commands calculates each value in the
RELFR column by dividing the corresponding value in the FREQ
column by the sum of all of the values in the FREQ column.
Create a Percentage Column
A Percentage Column can be generated from the Relative
Frequency Column.
Create a new list named PERC. With the cursor still highlighting the
name PERC,
Press 2nd STAT (LIST) and select the List named RELFR.
Press the × (Multiplication) Key.
Type in 100 and press ENTER.
The above sequence of commands calculates each value in the
PERC column by multiplying the corresponding value in the
RELFR column by 100.
Create a Cumulative Frequency Column
A Cumulative Frequency Column can be generated from the Frequency Column.
Create a new list named CUMFR. With the cursor still highlighting the name CUMFR,
Press 2nd STAT (LIST) and move the cursor to OPS.
Select 6:cumSum( and press ENTER.
Press 2nd STAT (LIST) and select the List named FREQ.
Press the ) Key and press ENTER.
The above sequence of commands calculates each value in the
CUMFR column by adding the corresponding value in the FREQ
column to the sum of the previous values.
The table at the right was displayed by using the SetUpEditor to
display just these 3 Lists.
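The three derived columns can also be reproduced in Python as a sanity check on the calculator output. This is a sketch only; the list names simply echo the calculator lists FREQ, RELFR, PERC, and CUMFR.

```python
from itertools import accumulate

# Class frequencies from the hours-of-sleep histogram.
freq = [5, 15, 20, 16, 3]

total = sum(freq)                      # like sum(FREQ)
relfr = [f / total for f in freq]      # like FREQ / sum(FREQ)
perc = [100 * r for r in relfr]        # like RELFR * 100
cumfr = list(accumulate(freq))         # like cumSum(FREQ)

for row in zip(freq, relfr, perc, cumfr):
    print(row)
```

The last entry of `cumfr` should equal the total number of observations, which is a quick way to confirm that no class was missed.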
Create an Ogive For The Cumulative Frequency
The steps to create an Ogive are very similar to those needed to create a Polygon. We will begin
by using the data stored under the labels MIDPT, FREQ, and CUMFR.
The Ogive uses the first lower and all of the upper boundaries rather than the midpoint. Thus it
does require an additional data value at the beginning of the list.
Insert a new row of data:
Move the cursor to the top value in the MIDPT list (the ‘4.9’).
Press 2nd DEL (INS).
A ‘0’ will appear.
Repeat the above steps for the FREQ and CUMFR columns.
Overwrite the values in the MIDPT column with the first lower and all of the upper
boundaries, as shown at the right.
Press the 2nd key and then the Y= key to get to STAT PLOT.
Press the number 1 key.
Turn Plot1 On.
Highlight the xyline in Type (2nd item in the 1st line).
Type MIDPT for the Xlist:
Type CUMFR for the Ylist:
Select the square in Mark:
Some modifications will be needed for the WINDOW to incorporate the lower boundary and the
increased y-values.
Press the GRAPH Key.
Press the TRACE Key to see the data points.
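The points plotted in the Ogive can be listed with a short Python sketch. It assumes the class boundaries from the hours-of-sleep example (first lower boundary 4.3, class width 1.2) and pairs each boundary with the cumulative frequency reached at that boundary, starting from 0.

```python
from itertools import accumulate

freq = [5, 15, 20, 16, 3]     # class frequencies
lower, width = 4.3, 1.2       # first lower boundary and class width

# First lower boundary followed by every upper boundary.
# round() removes floating-point noise such as 6.699999999999999.
boundaries = [round(lower + width * i, 1) for i in range(len(freq) + 1)]

# Cumulative frequency is 0 at the first lower boundary.
cumulative = [0] + list(accumulate(freq))

ogive_points = list(zip(boundaries, cumulative))
print(ogive_points)
```

The extra leading point is the reason a new row is inserted at the top of each list before plotting.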
Chapter
3 Descriptive Study of
Bivariate Data
Simple Linear Regression Models
A simple linear regression model is an equation describing how to use one variable, x, to predict
another variable, y, based on the relationship existing in the sample data. Since the predictions
made from the sample data may differ from the actual values in the population data, the symbol
y' is used for the predicted value of y. The simplest possible model is a linear one: y' = a + bx.
The graph is a line, where b is the slope of the line and a is the y-coordinate of the y-intercept.
Constructing a Scatter Diagram
In order to use the TI-84 Plus to find the linear regression model, the sample data must be
entered into the STAT editor. The relationship between the two quantitative variables can be
viewed in a scatter diagram.
You can create a scatter diagram by using the scatterplot option under STAT PLOT, which is
located above the Y = key. The steps to create the scatter diagram are similar to those needed to
create a histogram.
Example:
The data in the following table gives the English and Math scores for nine randomly selected
students.
Student A B C D E F G H I
English 77 90 85 62 71 88 95 67 75
Math 68 86 78 73 75 78 85 78 81
It is recommended, at this time, to clear Y= of any graphs. Enter the data into the STAT editor.
Press STAT > 1 to get to the Stat editor.
Type the English scores in L1.
Type the Math scores in L2.
Press 2nd > Y= to get to STAT PLOT.
Press 1 to get to Plot1.
Highlight ON and press ENTER.
Highlight the scatterplot (1st diagram) under Type and press ENTER.
Type L1 in for the Xlist:
Type L2 for the Ylist:
Select any one of the symbols under Mark: (the square tends to show up the best)
Press Zoom > 9 to get to ZoomStat.
Press the TRACE key and use the left and right arrow keys to
move among the points and see the coordinates of the marks.
Note: if any equations are stored on the Y = page, they may
also appear in your diagram. Delete them if they are in the
way.
Creating a Linear Regression Model
The TI-84 Plus calculator has two built-in functions,
LinReg(ax+b) and LinReg(a+bx) to compute a simple linear
regression model. They are both located on the STAT page in
the CALC list. These are two forms of the same function, one
that writes the equation as ax+b and the other that writes the
equation as a+bx. We will use the latter form, a+bx, but either
is okay.
We will create a linear regression model for the English and
Math scores from the previous example. They are stored in lists
L1 and L2.
Press STAT > CALC > 8: LinReg(a+bx).
Type: L1 > , > L2 > ,
Press VARS > Y-VARS > 1 > 1 to get Y1.
Press ENTER.
The LinReg output shows:
general model: y=a+bx
y-intercept: a=53.26159596
slope: b=.3135854034
coefficient of determination: r²=.3883069253
correlation coefficient: r=.6231427808
The equation of the linear regression model for the English and Math scores is
y' = 53.2616 + 0.3136x
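As a cross-check on the LinReg output, the same least squares quantities can be computed directly from the standard formulas b = Sxy / Sxx and a = ȳ − b·x̄. The Python below is a sketch under that assumption; it illustrates the arithmetic and does not describe how the calculator works internally.

```python
from math import sqrt

# English (x) and Math (y) scores from the example above.
english = [77, 90, 85, 62, 71, 88, 95, 67, 75]
math_scores = [68, 86, 78, 73, 75, 78, 85, 78, 81]

n = len(english)
mean_x = sum(english) / n
mean_y = sum(math_scores) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in english)
syy = sum((y - mean_y) ** 2 for y in math_scores)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(english, math_scores))

b = sxy / sxx                 # slope
a = mean_y - b * mean_x       # y-intercept
r = sxy / sqrt(sxx * syy)     # correlation coefficient

print(f"y' = {a:.4f} + {b:.4f}x   r = {r:.4f}   r^2 = {r * r:.4f}")
```

The printed values should agree with the calculator's a, b, r, and r² to the displayed precision.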
Coefficient of Determination and Correlation Coefficient
Note: In order to see the coefficient of determination, r², and the correlation coefficient, r, on
the LinReg output, the diagnostics must be turned On. Use the TI-84 DiagnosticOn function to
turn diagnostics on. The command only needs to be executed once, and from then on r² and r
will be displayed every time you compute a regression model.
Press 2nd > 0 to get to CATALOG.
Use the down arrow key until you find the command DiagnosticOn and then press
ENTER ENTER.
Alternative: You can save time by pressing the first letter of the function you are looking for.
The ALPHA key is engaged automatically when you go into CATALOG, so to get to the letter D
press the x⁻¹ key. Use the down arrow key to continue down until you find DiagnosticOn and
then press ENTER ENTER.
Enter the LinReg command again and the output should appear as shown on the left.
Graphing the Linear Regression Line
The LinReg command we entered requested that the
equation of the linear regression model (least squares
line) be stored in Y1.
Execute the STAT PLOT command again and the least
squares line will appear on the scatter diagram.
Chapter
4 Probability
Generating Random Numbers
When working with probabilities, there is sometimes a need to generate numbers that you can’t
predict, but at the same time follow some standard rules. Computer simulations are a common
example of the need for random occurrences within a structured setting. These numbers are
called pseudo-random numbers since they are not totally random. Your calculator can generate
these types of random numbers. The numbers that will appear on your calculator screen are hard
to predict, but you will be able to attach probabilities to them. For each kind of pseudo-random
number, we will be able to say what the probability is that it will occur next.
Generating Random Numbers Between 0 and 1
Suppose that you would like to generate a number between 0 and 1. You want the number to be
unpredictable, but you want every number between 0 and 1 to have an equally likely chance of
being generated.
The numbers that are generated with the random number function
on your calculator will be very similar to those found in the
random number table in an Elementary Statistics textbook. The
function is on the MATH page and can be found in the PRB list.
Press the MATH key.
Press the ► key until PRB is
highlighted.
Select 1:rand and press ENTER.
Press ENTER again.
If you continue to press the ENTER key you will generate a
different random number between zero and one each time you
press the ENTER key.
Generating Random Numbers Between Any Two Values
The TI-84 Plus does not have a built-in function to generate random real numbers that are
equally likely to occur and fall within a specified range of values, but they can be generated by
using the rand function with some additional commands. The following command will generate
random real numbers between 1 and 100.
Press the MATH key.
Press the ► key until PRB is highlighted.
Select 1:rand and press ENTER.
Type: * (100 - 1) + 1
Press ENTER.
If you continue to press the ENTER key, you will generate a different random number between
1 and 100 each time.
In general, the command used to generate a real number between values m and n is:
rand * (n – m) + m, where n is the larger number.
For example:
rand * (10 – 1) + 1 will generate a real number between 1 and 10.
rand * (900 – 500) + 500 will generate a real number between 500 and 900.
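The same scaling idea carries over to other tools. The Python sketch below uses `random.random()`, which returns a value from 0 up to but not including 1, in place of the calculator's rand; the helper name `rand_between` is made up for illustration.

```python
import random


def rand_between(m, n):
    """Scale a uniform value on [0, 1) to the range from m to n (n larger)."""
    return random.random() * (n - m) + m


# Three draws between 500 and 900, like rand * (900 - 500) + 500.
draws = [rand_between(500, 900) for _ in range(3)]
print(draws)
```

Multiplying by (n − m) stretches the unit interval to the desired width, and adding m shifts it so that it starts at the smaller value.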
Generating Random Integer Values Between Any Two Numbers
To generate random integer numbers (no decimals) that are
equally likely to occur and fall within a specified range of
values, the TI-84 Plus has the built-in function randInt. The
following sequence will generate 20 random integer values
between 1 and 100.
Select MATH > PRB.
Select 5:randInt( .
Type: 1, 100, 20) and press ENTER.
Use the right arrow key to see the remaining numbers. Press
ENTER to generate 20 more such random integers.
In general, the command used to generate n integer numbers between values j and k is:
randInt(j, k, n). If n = 1, then you do not need to enter it.
For example:
randInt(10, 50, 15) will generate 15 random integers between 10 and 50.
randInt(10, 50) will generate 1 random integer between 10 and 50.
Store Random Numbers in a List
The random numbers generated can be stored in a list to be used
with other statistical procedures.
The command in the screen shot
on the right will generate 15
randomly-generated integer
values between 10 and 50 and
store them in List L1.
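A comparable operation in Python, offered only as a sketch, uses `random.randint`, which includes both endpoints just as randInt does. The helper name `rand_int` and the variable `l1` are illustrative stand-ins for the calculator command and list L1.

```python
import random


def rand_int(j, k, n=1):
    """Return a list of n random integers from j to k, inclusive."""
    return [random.randint(j, k) for _ in range(n)]


# Like randInt(10, 50, 15) stored in L1.
l1 = rand_int(10, 50, 15)
print(l1)
```

Because the result is an ordinary list, it can be passed on to other computations in the same way that a stored calculator list can.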
Chapter
5
Probability Distributions
Mean and Standard Deviation of a Discrete Random Variable
Computing the mean and standard deviation of a discrete random variable is slightly different
from computing the mean and standard deviation of a set of data values. In a data set, each data
value carries equal weight in the computation. For a discrete random variable, however, the
possible values are given along with the likelihood of each value occurring on any single trial.
As was the case for a set of data values, the TI-84 calculator can be used to calculate the mean
and standard deviation of a discrete random variable by either manually using the formulas or by
using a built-in function. We will begin with manually using the formulas.
Example: Number of Heads
Table 1 gives the probability distribution of the number of
heads in three tosses of a fair coin. Enter the number of heads
into a list named X, and the probability into a list named
PROBX.
Mean of a Discrete Random Variable
The formula to calculate the mean of a discrete probability distribution is
µ = Σ [x · P(x)]
where P(x) is the probability of the value x.
Move the cursor to highlight the name of the empty List next to List PROBX.
Type: List X * List PROBX
Press ENTER.
The formula now says to sum this list of values.
Go to the home screen (2nd > MODE).
Select 2nd > STAT > MATH > 5:sum(
Type: L1) and press ENTER.
µ = 1.5 heads in three tosses of a coin.
Standard Deviation of a Discrete Random Variable
The formula to calculate the standard deviation of a discrete probability distribution is
σ = √( Σ [x² · P(x)] – µ² )
where P(x) is the probability of the value x and µ is the mean.
Move the cursor to highlight the name of the empty List next to List PROBX.
Type: List X ^ 2 * List PROBX
Press ENTER.
The formula now says to take the square root of the difference of the sum of this
list of values and the square of the mean.
Go to the home screen (2nd > MODE).
Press the √ key.
Select 2nd > STAT > MATH > 5:sum(
Type: L1) – 1.5 ^ 2) and press ENTER.
σ = 0.866 heads in three tosses of a coin.
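The two manual computations can be verified with a short Python sketch. Table 1 is not reproduced in this text, so the probabilities below are an assumption: they are the values that follow from three independent tosses of a fair coin (1/8, 3/8, 3/8, 1/8).

```python
from math import sqrt

# Number of heads in three tosses and the assumed fair-coin probabilities.
x = [0, 1, 2, 3]
probx = [1 / 8, 3 / 8, 3 / 8, 1 / 8]

# Mean: sum of each value times its probability.
mu = sum(xi * pi for xi, pi in zip(x, probx))

# Standard deviation: square root of (sum of x^2 * P(x)) minus mu^2.
sigma = sqrt(sum(xi ** 2 * pi for xi, pi in zip(x, probx)) - mu ** 2)

print(mu, round(sigma, 3))
```

Both results should match the values found with the list operations above.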
Using TI-84 Plus Built-In Functions For Discrete Probability Distributions
The TI-84 Plus built-in function 1-Var Stats will also calculate
the numerical descriptive statistics for a Discrete Probability
Distribution.
We will use the same
probability distribution as
above, which we stored in Lists
X and PROBX.
Select STAT > CALC > 1-Var Stats.
Press ENTER.
Select 2nd > STAT > X, press the , (comma) key, and select 2nd > STAT > PROBX.
Press ENTER.
The screen will display the descriptive statistics, which includes
the population mean and standard deviation.
Generating Dependent Probabilities
Factorials
A common function needed to compute dependent probabilities is the factorial function. The
notation for the factorial of n is n! The “!” function is found on the MATH page under the PRB
list.
To find the number of ways six people could be arranged in six different chairs, you would
calculate six factorial (6!).
Type: 6 > MATH > PRB > 4: ! and press ENTER.
6! = 720.
Calculate 10! and 0!.
Combinations
The combination formula can also be used to compute dependent probabilities. The notation for
the number of combinations is nCr, where n is the total number of elements, and r is the number
being selected. Combinations are used when selecting a few elements from a larger number of
distinct elements.
Example: Ice Cream
An ice cream parlor offers 6 flavors of ice cream. Kristen would like to purchase 2 flavors of ice
cream. In how many ways can Kristen choose 2 flavors out of the 6 flavors?
In order to find the number of ways of choosing two flavors out of
six, we would need to calculate 6C2.
Type 6 > MATH > PRB > 3: nCr and press ENTER.
Press the number 2 key and press ENTER.
There are 15 different combinations of two flavors of ice cream.
Calculate 6C3 and 8C3 .
Permutations
The permutation formula can be used to compute dependent probabilities. The notation for the
number of permutations is nPr, where n is the total number of elements, and r is the number
being selected. Permutations are used when trying to find all possible arrangements of elements
taken from a larger selection. Arrangements involve putting the elements in a particular order.
If Kristen’s story changes as below, then permutations apply
rather than combinations.
An ice cream parlor offers 6 flavors of ice cream. Kristen
would like to purchase 2 flavors of ice cream and is concerned about
which flavor is on the top and which flavor is on the bottom
(i.e., Kristen is concerned about the arrangement of the flavors).
In how many ways can Kristen arrange 2 flavors out of the 6
flavors?
In order to find the number of arrangements of choosing two flavors out of six, we would need to
calculate 6P2.
Type 6 > MATH > PRB > 2: nPr and press ENTER.
Press the number 2 key and press ENTER.
There are 30 different arrangements of two flavors of ice cream.
Calculate 6P3 and 8P3 .
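The factorial, combination, and permutation results in this section can be verified with a short Python sketch (an illustration outside the calculator, using the standard math module, Python 3.8 or later):

```python
from math import comb, factorial, perm

print(factorial(6))  # 720  (6!)
print(comb(6, 2))    # 15   (6C2: order does not matter)
print(perm(6, 2))    # 30   (6P2: order matters)
```

Each permutation count equals the combination count times r!, since every selection of r items can be arranged in r! orders.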
Binomial Distribution
Randomly Generating Number of Successes From a Binomial Distribution
There are many situations in statistics where you need to
generate numbers from distributions where the numbers are not
equally likely to occur. One of the most commonly used
distributions in statistics is the discrete Binomial
distribution.
The TI-84 Plus has a built-in function to generate random
numbers from a specific Binomial distribution. Each random
number is a whole number representing an x value, the number of successes.
Select MATH > PRB > 7:randBin(
Type: 3, 0.3, 5) and press ENTER.
In the screenshot, the Binomial experiment was repeated 5
times. Each time there were 3 trials with a probability of success
of 0.3, and the numbers of successes were 2, 1, 0, 1, and 2 respectively.
Generate 5 random numbers from a Binomial distribution with 3 trials and 0.9 probability of
success.
The syntax for the randBin( function is randBin(n, p, r). This will generate r random numbers
representing x, the number of successes, from a binomial distribution with n trials and
probability p of success on a given trial. Note: if r = 1, you may omit it.
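To make the randBin(n, p, r) syntax concrete, here is a minimal Python sketch of the same idea. The function name rand_bin is hypothetical, and this is only one simple way to simulate binomial counts, not the calculator's internal method.

```python
import random

def rand_bin(n, p, r=1):
    """Return r simulated success counts, each from n trials with success probability p."""
    # Each trial succeeds when a uniform random number falls below p.
    return [sum(random.random() < p for _ in range(n)) for _ in range(r)]

random.seed(1)  # fixed seed only so the sketch is repeatable
draws = rand_bin(3, 0.3, 5)
print(draws)  # five whole numbers, each between 0 and 3
```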
Compute Binomial Probabilities
The command for computing a probability at x successes for a discrete Binomial distribution is
binompdf(.
To find the probability of x successes out of n trials, each with probability p of success, type
binompdf(n, p, x).
Example: VCRs
Suppose that 5% of all VCRs manufactured by an electronics
company are defective. Three VCRs are selected at random. What
is the probability that exactly one of them is defective? P(x = 1)
Select 2nd > VARS (DISTR) > A: binompdf( and press ENTER.
Type: 3, 0.05, 1) and press ENTER.
The result is 0.135375, so there is a ≈ 13.5% chance that exactly one of
them is defective.
Calculate the same probability with 8 VCRs selected at random, rather than 3.
Now there is a ≈ 27.9% chance that exactly one of them is defective.
Compute Cumulative Binomial Probabilities
The command for the probability of a cumulative number of
successes from 0 to x for a discrete Binomial distribution with n
trials and probability p of success on any given single
trial is binomcdf(n, p, x). P(number of successes ≤ x)
Using the same Binomial distribution of 3 VCRs as above, what
is the probability that zero or one of them is defective? P(x ≤ 1)
Select 2nd > VARS (DISTR) > B: binomcdf( and press ENTER.
Type: 3, 0.05, 1) and press ENTER.
The result is 0.99275, so there is a ≈ 99.3% chance that at most one of
them is defective.
Calculate the same probability with 8 VCRs selected at random, rather than 3.
Now there is a ≈ 94.3% chance that at most one of them is defective.
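As a cross-check on the binompdf( and binomcdf( results, the following Python sketch evaluates the binomial formula directly. The function names binom_pdf and binom_cdf are illustrative; they simply mirror the calculator's argument order (n, p, x).

```python
from math import comb

def binom_pdf(n, p, x):
    """P(exactly x successes in n trials)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(n, p, x):
    """P(at most x successes in n trials): sum of the pdf from 0 to x."""
    return sum(binom_pdf(n, p, k) for k in range(x + 1))

print(round(binom_pdf(3, 0.05, 1), 6))  # 0.135375
print(round(binom_cdf(3, 0.05, 1), 5))  # 0.99275
print(round(binom_pdf(8, 0.05, 1), 3))  # 0.279
print(round(binom_cdf(8, 0.05, 1), 3))  # 0.943
```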
Compute Poisson Probabilities
The command for computing the probability of x occurrences within a given interval for a
discrete Poisson distribution with a mean number of occurrences λ is poissonpdf(λ, x).
P(number of occurrences = x)
Example: Telemarketing
Suppose that a household receives, on average, 9.5 telemarketing calls per week. Find the
probability that the household receives 6 calls this week.
Select 2nd > VARS (DISTR) > C: poissonpdf( and press ENTER.
Type: 9.5, 6) and press ENTER.
The result is 0.076420796, so there is a ≈ 7.6% chance that the household
receives 6 calls this week.
Find the probability that the household receives 10 calls this week.
There is a ≈ 12.4% chance that the household receives 10 calls this week.
Compute Cumulative Poisson Probabilities
The command for computing the probability of at most x
occurrences (cumulative) within a given interval for a discrete
Poisson distribution with a mean number of occurrences λ is
poissoncdf(λ, x). P(number of occurrences ≤ x)
Using the same Poisson distribution of Telemarketing calls as
above, what is the probability that the household receives at
most 6 calls this week? P(x ≤ 6)
Select 2nd > VARS (DISTR) > D: poissoncdf( and press ENTER.
Type: 9.5, 6) and press ENTER.
There is a ≈ 16.5% chance that the household receives at most 6
calls this week.
Find the probability that the household receives at most 10 calls this week.
There is a ≈ 64.5% chance that the household receives at most 10 calls this week.
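The Poisson results for the telemarketing example can be reproduced with a short Python sketch of the Poisson formula. The names poisson_pdf and poisson_cdf are illustrative and follow the calculator's argument order (λ, x).

```python
from math import exp, factorial

def poisson_pdf(lam, x):
    """P(exactly x occurrences) when the mean number of occurrences is lam."""
    return exp(-lam) * lam**x / factorial(x)

def poisson_cdf(lam, x):
    """P(at most x occurrences): sum of the pdf from 0 to x."""
    return sum(poisson_pdf(lam, k) for k in range(x + 1))

print(round(poisson_pdf(9.5, 6), 4))   # 0.0764
print(round(poisson_pdf(9.5, 10), 4))  # 0.1235
print(round(poisson_cdf(9.5, 6), 3))   # 0.165
print(round(poisson_cdf(9.5, 10), 3))  # 0.645
```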
Geometric Probabilities
Your calculator can compute probabilities for a geometric random variable with probability of
success p using the geometpdf( command, located on the DISTR page. To find the probability of
the random variable taking the value x, type geometpdf(p, x).
Example: Car Ignition
Suppose that a car with a bad starter can be started 90% of the time by turning on the ignition.
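What is the probability that it will take three tries to get the car started? Type geometpdf(0.9, 3);
the answer is 0.009, or 0.9%, because the first two tries must fail and the third must succeed:
(0.1)(0.1)(0.9) = 0.009.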
Cumulative Geometric Probabilities
As with the binomial and Poisson probability functions, there is a cumulative version,
geometcdf( . It can be used to find the probability that a geometric random variable will take a
value of at most x by typing geometcdf(p, x).
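The geometric formulas behind these two commands are short enough to sketch in Python. The function names geomet_pdf and geomet_cdf are illustrative and use the calculator's argument order (p, x).

```python
def geomet_pdf(p, x):
    """P(first success occurs on try x): x - 1 failures followed by one success."""
    return (1 - p)**(x - 1) * p

def geomet_cdf(p, x):
    """P(first success occurs on or before try x): 1 minus P(x failures in a row)."""
    return 1 - (1 - p)**x

print(round(geomet_pdf(0.9, 3), 4))  # 0.009
print(round(geomet_cdf(0.9, 3), 4))  # 0.999
```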
Chapter 6 The Normal Distribution
Continuous random variables are used to approximate probabilities where there are many
possibilities or an infinite number of possibilities on a given trial. One of the most well-known
continuous distributions used to approximate probabilities is the normal distribution.
Traditionally normal distribution probabilities were figured using a normal distribution table.
The table method is being replaced with calculators such as the TI-84 Plus. The calculator
reduces the time needed to perform the calculations and reduces the rounding errors that occur
because of the brevity of the tables in elementary statistics textbooks.
Normal Distribution
Randomly Generating a Number From a Normal Distribution
Just as the TI-84 has a built-in function to generate random real
numbers from a Binomial distribution, it also has a built-in
function to generate random real numbers from a specific Normal
distribution with a mean µ and standard deviation σ. The random
real numbers represent x values. The general syntax is
randNorm(µ, σ, n), where n is the number of random real
numbers.
The following command will generate 30 numbers from a
Normal distribution with a mean of 45 and a standard deviation
of 8 and store them in L2.
Select MATH > PRB > 6:randNorm( and press ENTER.
Type: 45, 8, 30) > STO > L2 and press ENTER.
Generate 200 numbers from a Normal distribution with µ = 100 and σ = 15
and store them in L3.
Generate a histogram of the 200 numbers in L3 and observe that the histogram is beginning to
look like a normal distribution. Experiment with generating a larger number of data values.
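The same experiment can be tried off the calculator. This is a minimal Python sketch using random.gauss in place of randNorm(; the seed is fixed only so the run is repeatable, and the summary values will change with a different seed.

```python
import random
from statistics import mean, stdev

random.seed(2)  # fixed seed only so the sketch is repeatable

# 200 draws from a Normal distribution with mean 100 and standard deviation 15.
sample = [random.gauss(100, 15) for _ in range(200)]

# The sample mean and standard deviation should land near 100 and 15.
print(round(mean(sample), 1), round(stdev(sample), 1))
```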
Computing Normal Distribution Probabilities
The commands for the Normal distribution are normalpdf( , normalcdf( , and invNorm( .
They are located on the DISTR page. DISTR appears above the VARS key.
Compute Cumulative Normal Probabilities
The normalcdf( function stands for normal cumulative distribution function and gives the probability
of getting an x value that falls within an interval of values from the normal distribution. There
are three possibilities:
Finding the probability that a number will fall between two values under the Normal
distribution.
Finding the probability that a number will fall to the left of a value under the Normal
distribution.
Finding the probability that a number will fall to the right of a value under the Normal
distribution.
The syntax for the normalcdf( function is normalcdf(L, B, µ, σ), where L is the lower bound of
the interval, B is the upper bound of the interval, µ is the mean, and σ is the standard deviation.
The values for µ and σ may be omitted if it is the Standard Normal distribution.
Finding the Area Between Two Values
To find the area between two numbers a and b under the
Standard Normal curve, P(a < z < b) = normalcdf(a, b, 0, 1).
Find the probability of getting a value between 1.04 and 1.82
under the Standard Normal curve.
Select: 2nd > VARS > 2: normalcdf( and press ENTER.
Type: 1.04, 1.82, 0, 1) and press ENTER.
P(1.04 < z < 1.82) = 0.115.
Find the probability of getting a value between 0 and 3 under
the Standard Normal curve.
Find the probability of getting a value between 10 and 13 under
the Normal curve with a mean of 10 and a standard deviation of
2.
P(10 < x < 13) = 0.43
Find the probability of getting a value between 2 and 12 under the Normal curve with a mean of
10 and a standard deviation of 2.
Finding the Area to the Left of a Value
To find the area to the left of b under the Normal curve, P(z < b) = normalcdf(-∞, b, µ, σ). The
problem is that the TI-84 calculator does not have a built-in key for negative infinity (-∞). Thus,
the value -1E99 is used, which represents a very large negative number. The letter E stands for
scientific notation and is entered with the EE function above the comma (,) key (2nd > ,). Thus, the command will
look like: normalcdf(-1E99, b, µ, σ).
Find the probability of getting a value less than 0 under the Standard Normal curve.
Select: 2nd > VARS > 2: normalcdf( and press ENTER.
Type: -1 > 2nd > , > 99, 0) and press ENTER.
P(z < 0) = 0.5.
Find the probability of getting a value less than 32.45 under the
Normal curve with mean 25 and standard deviation 6.
Finding the Area to the Right of a Value
To find the area to the right of a under the Normal curve, P(z > a) = normalcdf(a, 1E99, µ, σ).
Find the probability of getting a value greater than -1.08 under the Standard Normal curve.
Select: 2nd > VARS > 2: normalcdf( and press ENTER.
Type: -1.08, 1 > 2nd > , > 99) and press ENTER.
P(z > -1.08) = 0.8599.
Find the probability of getting a value greater than 15.3 under
the Normal curve with mean 12 and standard deviation 4.
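The three uses of normalcdf( (between two values, left tail, right tail) can be mirrored with Python's statistics.NormalDist. The wrapper name normal_cdf is hypothetical; it keeps the calculator's argument order and, like the calculator, uses ±1E99 as a stand-in for infinity.

```python
from statistics import NormalDist

def normal_cdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the Normal(mu, sigma) curve between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

print(round(normal_cdf(1.04, 1.82), 3))     # 0.115  (between two values)
print(round(normal_cdf(10, 13, 10, 2), 2))  # 0.43   (non-standard mean and sd)
print(round(normal_cdf(-1e99, 0), 1))       # 0.5    (left tail)
print(round(normal_cdf(-1.08, 1e99), 4))    # 0.8599 (right tail)
```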
Inverse Normal Distribution Probabilities
There are times in statistics when we have a probability and
need the corresponding z-score or raw score. A problem of this type
may look like: P(z > ?) = 0.8599. Such problems are known
as inverse normal distribution problems. Such computations
can be performed using tables of normal probabilities, but the
work is tedious, error-prone, and often introduces rounding errors.
Fortunately, the calculator has a function, invNorm(, that
performs the calculation.
We know from the previous section that the unknown in P(z >
?) = 0.8599 is -1.08.
Select: 2nd > VARS > 3: invNorm( and press ENTER.
Type: 0.8599) and press ENTER.
The screen shows positive 1.08. The invNorm( function takes the
cumulative probability to the left of a value, so it reports that the area
from -∞ to 1.08 is 0.8599. Since the Normal distribution is symmetric, the same
area lies to the right of -1.08 (from -1.08 to ∞), so the answer to P(z > ?) = 0.8599
is -1.08. It is always advisable to draw the normal curve to
help in visualizing this concept.
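The left-tail convention of invNorm( can be illustrated with Python's statistics.NormalDist, whose inv_cdf method also takes the area to the left. This is a sketch for checking the example, not a description of the calculator's internals.

```python
from statistics import NormalDist

# inv_cdf takes the cumulative (left-tail) probability, like invNorm(.
z_left = NormalDist().inv_cdf(0.8599)
print(round(z_left, 2))   # 1.08: the area to the left of 1.08 is 0.8599

# By symmetry, the value with area 0.8599 to its right is the negative.
z_right = -z_left
print(round(z_right, 2))  # -1.08
```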
Graph the Normal Probability Density Function
The function normalpdf( stands for Normal probability
density function. It returns the height of the Normal curve at x, not a probability,
since the probability of any single x value in a continuous distribution
is always zero. The main use of this
command is to draw the Normal curve. The syntax for the
function is normalpdf(x, μ, σ), where μ is the mean and σ is
the standard deviation.
The following sequence of commands will draw the standard normal curve (μ = 0 and σ = 1).
Select: Y = > 2nd > VARS > 1: normalpdf( and press ENTER.
Type: x, 0, 1) > ZOOM > 9
This command may be used
to draw any Normal
distribution curve with any
mean and standard deviation.
Shade the Normal Probability Density Function
When calculating the probability of an area under the Normal curve, it is often helpful to shade
the area. The syntax for the TI-84 Plus command to do this is ShadeNorm(a, b, µ, σ).
Example
To shade the area under the Standard Normal curve for
P(1.04 < z < 1.82) = 0.115, begin by turning off all other
graphs (STATPLOT or Y =).
Adjust the WINDOW to view the Standard Normal curve,
as shown on the right.
Select: 2nd > VARS > DRAW > 1: ShadeNorm( and press ENTER.
Type: 1.04, 1.82, 0, 1) and press ENTER.
Notice that the area of the shaded region is also shown on the graph and
it is the same value calculated from the normalcdf( command.
Thus, ShadeNorm( is an alternative to normalcdf(, with the added benefit of shading the area.
Example
Find the probability of getting a value greater than 15.3 under the Normal curve with mean 12
and standard deviation 4.
Adjust the WINDOW as shown on the right.
Type: ShadeNorm(15.3, 1E99, 12, 4) and press ENTER.
P(x > 15.3) = 0.2047
Chapter 7 Variation in Repeated Samples - Sampling Distributions
A large part of statistics consists of analyzing the probability of getting a specific sample mean
or sample proportion from repeated samples drawn from the same population.
Usually we focus on two kinds of statistics from those samples: the sample mean x̄, if the data is
quantitative, or the sample proportion p̂, if the data is categorical. For large sample sizes, both
x̄ and p̂ have approximately normal distributions, which have already been discussed. The normalcdf(
function on the TI-84 Plus calculator will be used, with a slight modification. As before, the
answers using the normalcdf( function will differ slightly from the answers found from a table of
normal probabilities, since the latter involves rounding.
Probabilities for Sample Means
For a large sample size, the Central Limit Theorem states that the sampling distribution of x̄ is
approximately normal with µx̄ = µ and σx̄ = σ/√n.
The syntax normalcdf(a, b, µ, σ/√n) is used to find the
probability that a < x̄ < b. The procedure is the same as
finding the probability of x with a given mean and standard
deviation. As before, if you are finding the area to the left of
some value b, use -1E99 for a. If you are finding the area
to the right of some value a, use 1E99 for b. The key
stroke for E is 2nd > comma.
Example: Cookie Weights
Assume that the weights of all packages of a certain brand of cookies are normally distributed
with a mean of 32 ounces and a standard deviation of 0.3 ounces. Find the probability that the
mean weight, x̄, of a random sample of 20 packages of this brand of cookies will be less than
31.8 ounces. The sample size here is not large, but the distribution of all such cookies is normal,
so the sample mean will be normal as well.
Select 2nd > VARS > 2: normalcdf( and press ENTER.
Type: -1E99, 31.8, 32, 0.3/√(20)) and press ENTER.
P(x̄ < 31.8) = 0.0014
An alternative approach would be to
adjust the WINDOW as shown on
the right, and use the ShadeNorm(
function.
Type: ShadeNorm(-1E99, 31.8, 32, 0.3/√(20))
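The cookie-weight probability can be checked with a short Python sketch that uses the standard error σ/√n as the standard deviation of the sample mean. Since 31.8 lies about three standard errors below the mean of 32, the left-tail area is small; the complementary right-tail area is 0.9986.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 32, 0.3, 20
se = sigma / sqrt(n)  # standard deviation of the sample mean

# Left-tail area: probability that the sample mean is below 31.8 ounces.
p_left = NormalDist(mu, se).cdf(31.8)

print(round((31.8 - mu) / se, 2))  # -2.98 (z-score of 31.8)
print(round(p_left, 4))            # 0.0014
print(round(1 - p_left, 4))        # 0.9986 (area to the right of 31.8)
```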
Probabilities for Sample Proportions
For a large sample size, the Central Limit Theorem states that the sampling distribution of p̂ is
approximately normal with µp̂ = p and σp̂ = √(pq/n), where q = 1 – p.
To find the probability that a < p̂ < b on the calculator, use normalcdf(a, b, p, √(pq/n)). Note
that it is more accurate to type √(pq/n) directly into the normalcdf( command than to compute it
separately and type it in. Any time you find yourself typing in an intermediate result in a
computation you may be performing some unnecessary rounding.
Example: Voters
A candidate for mayor in a large city claims that she is favored by 53% of all eligible voters of
that city. Assume that the claim is true. What is the probability that in a random sample of 400
registered voters taken from this city, between 49% and 51% will favor the candidate?
Select 2nd > VARS > 2: normalcdf( and press ENTER.
Type: 0.49, 0.51, 0.53, √(0.53 * 0.47 / 400)) and press ENTER.
P(0.49 < p̂ < 0.51) = 0.1570
In other words,
normalcdf(0.49, 0.51, 0.53, √(0.53 * 0.47 / 400)) = 0.1570 = 15.7%.
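The voter example can be verified with the following Python sketch, which keeps √(pq/n) at full precision rather than typing in a rounded intermediate value, in the spirit of the note above.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.53, 400
se = sqrt(p * (1 - p) / n)  # standard deviation of the sample proportion

dist = NormalDist(p, se)
prob = dist.cdf(0.51) - dist.cdf(0.49)
print(round(prob, 3))  # 0.157
```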
Chapter 8 Drawing Inferences from Large Samples
In statistics, we collect samples to know more about a population. If the sample is representative
of the population, the sample mean or proportion should be statistically close to the actual
population mean or proportion. A way to judge how close the sample statistic may be is to
create a confidence interval. This chapter will describe how to use the calculator to compute
confidence intervals and tests of hypothesis for population means and proportions, drawn from
large samples.
Confidence Intervals for Population Means
The function used to compute confidence intervals for the population mean, µ, when σ is known
and the sample is large is ZInterval. It is found on the STAT page under TESTS.
Known Population Standard Deviation
If you are fortunate enough to know the population standard deviation σ, either from theory or
from a pilot study, then you would use a Z-based confidence interval, ZInterval, to estimate the
population mean, µ. There are two different syntaxes for the ZInterval command.
If you know the statistical information (population
standard deviation, sample mean, and sample size),
then the syntax is ZInterval σ, x̄, n, confidence level.
If you have the sample data stored in a list, then the
syntax is ZInterval σ, List name, Frequency list,
confidence level.
Example: Textbook Price
A publishing company has just published a new college textbook. Before the company decides
the price at which to sell this textbook, it wants to know the average price of all such textbooks
in the market. The research department at the company took a sample of 36 such textbooks and
collected information on their prices. This information produced a mean of $70.50 for this
sample. It is known that the standard deviation of the prices of all such textbooks is $4.50.
Construct a 90% confidence interval for the mean price of all such college textbooks.
We have the population standard deviation, σ, so we will use ZInterval command. We do not
have the data itself, so we will select Stats where it asks for input. We enter 4.5 for σ, 70.5 for
x̄, 36 for n, and .90 for C-Level.
Press STAT > TESTS > 7: ZInterval and press ENTER.
Select Stats by moving the cursor over Stats and press the ENTER key.
Type in 4.5 for σ.
Type in 70.5 for x̄.
Type in 36 for n.
Type in .90 for C-Level.
Highlight Calculate and then press the ENTER key.
The ZInterval output shows
the 90% confidence interval, as
well as the sample mean and sample size.
The confidence interval is between $69.27 and $71.73.
With 90% confidence, we believe that the true population mean
price is between $69.27 and $71.73.
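The interval reported by ZInterval is x̄ ± z*·σ/√n. A minimal Python sketch of that computation follows; the function name z_interval is hypothetical and mirrors the Stats input order (σ, x̄, n, C-Level).

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sigma, xbar, n, c_level):
    """Z-based confidence interval for a population mean with known sigma."""
    # Critical value that leaves (1 - c_level) / 2 in each tail.
    z_star = NormalDist().inv_cdf((1 + c_level) / 2)
    margin = z_star * sigma / sqrt(n)
    return xbar - margin, xbar + margin

low, high = z_interval(4.5, 70.5, 36, 0.90)
print(round(low, 2), round(high, 2))  # 69.27 71.73
```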
Example: Randomly Generated Sample Data
Generate 50 random numbers from a Normal distribution with a
mean of 45 and a standard deviation of 8 and store them in L1.
Select MATH > PRB > 6:randNorm( and press ENTER.
Type: 45, 8, 50) > STO > L1 and press ENTER.
We have the population standard deviation, σ = 8, so we will use ZInterval command. We have
the data, so we will select Data where it asks for input. We enter L1 for List, 1 for Freq, and
.90 for C-Level.
Press STAT > TESTS > 7: ZInterval and press ENTER.
Select Data by moving the cursor over Data and press the ENTER key.
Type: 8 for σ.
Type: L1 for List.
Type: 1 for Freq.
Type: 0.90 for C-Level.
Highlight Calculate and then press the ENTER key.
The ZInterval output
shows the 90% confidence
interval, as well as the sample mean, sample standard deviation,
and sample size.
For a sample with mean 46.048, the confidence interval is between 44.187 and 47.909.
Because the data are randomly generated, your values will differ.
With 90% confidence, we believe that the true population mean
is between 44.2 and 47.9.
Hypothesis Tests About Means
A hypothesis test about a population mean can be Z-based (if σ is known) or T-based (if σ is
unknown and either the population is normal or the sample size is over 30). The TI-84 Plus
calculator provides functions for both the Z-Test and the T-Test, which are located on the STAT
page in the TESTS list.
The menus for Z-Test and T-Test are very similar to the ones for
ZInterval and TInterval. The
functions work with either the data or the summary statistics.
Both functions ask for the null hypothesis and an alternative
hypothesis. Both functions provide a p-value for comparison with
the test’s significance level.
Example: One Sided Test; Known Sigma
H0: µ = 50; H1: µ > 50; n = 200; x̄ = 52.7; σ = 16.2; α = .05
Since sigma is known, Z-Test will be used.
Select: STAT > TESTS > 1: Z-Test and press ENTER.
Highlight Stats and press ENTER.
Type: 50 for μ0.
Type: 16.2 for σ.
Type: 52.7 for x̄.
Type: 200 for n.
Move the cursor over >μ0 and press ENTER.
Move the cursor over Calculate and press ENTER.
The Z-Test output shows the
alternative hypothesis: μ>50
test statistic: z=2.357022604
p-value: .0092110461
sample mean: x̄=52.7
sample size: n=200
The p-value is .0092110461, which is less than 5%. We reject H0 and conclude that the
population mean is significantly higher than 50.
Example: Two Sided Test; Unknown Sigma
H0: µ = 112; H1: µ ≠ 112; n = 85; x̄ = 108.5; Sx = 12.4; α = .005
Since sigma is not known, T-Test will be used.
Select: STAT > TESTS > 2: T-Test and press ENTER.
Highlight Stats and press ENTER.
Type: 112 for μ0.
Type: 108.5 for x̄.
Type: 12.4 for Sx.
Type: 85 for n.
Move the cursor over ≠μ0 and press ENTER.
Move the cursor over Calculate and press ENTER.
The T-Test output shows the
alternative hypothesis: μ≠112
test statistic: t=-2.602290774
p-value: .0109440164
sample mean: x̄=108.5
sample size: n=85
The p-value is .0109440164, which is greater than 0.5%. We do not reject H0 and conclude that
the population mean is not significantly different from 112.
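The T-Test result can be approximated without a statistics package. The sketch below computes t = (x̄ − µ0)/(Sx/√n) and then obtains the two-sided p-value by numerically integrating the t density with Simpson's rule. That integration is a swapped-in technique chosen because Python's standard library has no t distribution; the function names are hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c) * (1 + t * t / df) ** (-(df + 1) / 2)

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def t_test_two_sided(mu0, xbar, sx, n):
    """Return (t, p-value) for H1: mu != mu0 from summary statistics."""
    df = n - 1
    t = (xbar - mu0) / (sx / sqrt(n))
    # Area between 0 and |t|; what remains of each half is one tail.
    center = simpson(lambda u: t_pdf(u, df), 0.0, abs(t))
    return t, 2 * (0.5 - center)

t_stat, p_value = t_test_two_sided(112, 108.5, 12.4, 85)
print(round(t_stat, 3), round(p_value, 4))  # -2.602 0.0109
```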
In the above examples, the Calculate option for T-Test and Z-Test was chosen. If the Draw
option is chosen, the calculator will draw the curve and state the z/t value and the p-value.
Select: STAT > TESTS > 2: T-Test and press ENTER.
The original information should still be there.
Highlight Draw and press ENTER.
Confidence Intervals for Population Proportions
The function 1-PropZInt computes Z-based confidence intervals for a population proportion
when the sample size is large enough, i.e., the number of successes, x, is greater than 5 and the
number of failures, n – x, is greater than 5.
1-PropZInt is found on the STAT page under TESTS. To
use 1-PropZInt, enter in the number of successes as x, the
sample size as n, and the confidence level as C-Level.
Note: x must be a whole number. If you are finding x by
multiplying p̂ by n, you will need to round to the nearest
whole number.
Example: Graphing Calculators
A recent sample of 500 college students revealed that 82% of them owned a graphing calculator.
Find a 95% confidence interval for the percentage of all college students who own a graphing
calculator.
In our sample of 500 college students, there were 82% or 410 successes and 90 failures. We can
use 1-PropZInt with x = 410, n = 500, and our C-Level set at 0.95.
Press STAT > TESTS > A: 1-PropZInt and press ENTER.
Type: 410 for x.
Type: 500 for n.
Type: 0.95 for C-Level.
Highlight Calculate and press the ENTER key.
The 1-PropZInt output shows the 95% confidence interval, the
sample proportion, and the sample size.
The 95% confidence interval is from 0.786 to 0.854.
With 95% confidence, we believe that the true population
proportion is between 78.6% and 85.4%.
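The 1-PropZInt interval is p̂ ± z*·√(p̂(1 − p̂)/n). A minimal Python sketch follows; the function name one_prop_z_int is hypothetical and takes the same inputs as the calculator menu (x, n, C-Level).

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_int(x, n, c_level):
    """Z-based confidence interval for a population proportion."""
    p_hat = x / n
    z_star = NormalDist().inv_cdf((1 + c_level) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = one_prop_z_int(410, 500, 0.95)
print(round(low, 3), round(high, 3))  # 0.786 0.854
```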
Hypothesis Tests About Proportions
Hypothesis tests about proportions are Z-based when both
np and nq are greater than 5. The TI-84 Plus has the function
1-PropZTest to compute a test statistic and p-value. The
1-PropZTest is located on the STAT page under the
TESTS list. The menu for 1-PropZTest is very similar to
the one for 1-PropZInt described above.
Again you need to enter x, the number of successes, and n,
the sample size. You also need to enter p0, the number that
p is being compared with, and the alternative hypothesis.
Example: One Sided Test
H0: p = .41; H1: p < .41; n = 300; x = 97
Select: STAT > TESTS > 5: 1-PropZTest and press ENTER.
Type: .41 for p0.
Type: 97 for x.
Type in 300 for n.
Highlight <p0 and press ENTER.
Highlight Calculate and press ENTER.
The 1-PropZTest output shows the
alternative hypothesis: prop < .41
test statistic: z=-3.052072083
p-value: p =.0011364066
sample proportion: p̂=.323
sample size: n=300
In the above example, the Calculate option for 1-PropZTest was chosen. If the Draw option is
chosen, the calculator will draw the curve and state the z value and the p-value.
Select: STAT > TESTS > 5: 1-PropZTest and press ENTER.
The original information should still be there.
Highlight Draw and press ENTER.
Chapter 9 Small Sample Inferences For Normal Populations
Confidence Interval for Mean – Small Sample Size
The population standard deviation is usually not known. When this is the case, the standardized sample mean
follows a t-distribution, rather than a Normal distribution. Thus, a t-based confidence interval,
TInterval, is used to estimate the population mean, µ. This requires that either the
population is normal or the sample size is larger than 30. Just as for the ZInterval command,
there are two different syntaxes for the TInterval command.
If you know the statistical information (sample mean,
sample standard deviation, and sample size), then the
syntax is TInterval x̄, Sx, n, confidence level.
If you have the sample data stored in a list, then the syntax
is TInterval List name, Frequency list, confidence level.
Example: Orange Sales
A local orange grove sells oranges at Saturday’s Downtown Farmer’s Market. They wanted to
estimate the average number of oranges sold on a given Saturday. They took a sample of 35
Saturdays and found that the average number of oranges sold for this sample is 256 with a
standard deviation of 40. Construct a 99% confidence interval for the population mean µ.
We do not have the population standard deviation, σ, so the TInterval command will be used.
Press STAT > TESTS > 8: TInterval and press ENTER.
We do not have the sample data, so we will select Stats, and enter 256 for x̄, 40 for Sx, 35 for n,
and .99 for the C-Level.
Select Stats for Inpt: and press the ENTER key.
Type in 256 for x̄.
Type in 40 for Sx.
Type in 35 for n.
Type in .99 for C-Level.
Highlight Calculate and press the ENTER key.
The TInterval output shows the 99% confidence interval along with the
sample mean, sample standard deviation, and sample size.
The 99% confidence interval is from 237.55 to 274.45.
With 99% confidence, we believe that the true population mean number of
oranges sold is between 237.6 and 274.5 oranges.
Hypothesis Test for Mean – Small Sample Size
A hypothesis test about a population mean is T-based if σ is
unknown and either the population is normal or the sample size
is over 30. The TI-84 Plus calculator provides the T-Test, which
is located on the STAT page in the TESTS list.
The T-Test function works with either the data or the summary
statistics, requesting null hypothesis and an alternative
hypothesis. A p-value is provided for comparison with the test’s
significance level.
Example:
H0: µ = 112; H1: µ ≠ 112; n = 85; x̄ = 108.5; Sx = 12.4; α = .005
Since sigma is not known, T-Test will be used.
Select: STAT > TESTS > 2: T-Test and press ENTER.
Highlight Stats and press ENTER.
Type: 112 for μ0.
Type: 108.5 for x̄.
Type: 12.4 for Sx.
Type: 85 for n.
Move the cursor over ≠μ0 and press ENTER.
Move the cursor over Calculate and press ENTER.
The T-Test output shows the
alternative hypothesis: μ≠112
test statistic: t=-2.602290774
p-value: .0109440164
sample mean: x̄=108.5
sample size: n=85
The p-value is .0109440164, which is greater than 0.5%. We do not reject H0 and conclude that
the population mean is not significantly different from 112.
In the above example, the Calculate option for T-Test was chosen. If the Draw
option is chosen, the calculator will draw the curve and state the t value and the p-value.
Select: STAT > TESTS > 2: T-Test and press ENTER.
The original information should still be there.
Highlight Draw and press ENTER.
Chi-Square Tests
Computing Chi-Square Distribution Probabilities
The computation commands for the Chi-Square distribution
are χ²pdf( and χ²cdf( . They are located on the DISTR page.
DISTR appears above the VARS key.
Compute Cumulative Chi-square Probabilities
The χ²cdf( function stands for chi-square cumulative distribution function and gives the probability
of getting an x value that falls within an interval of values from the chi-square distribution for the
specified degrees of freedom. There are three possibilities:
Finding the probability that a number will fall between two values under the Chi-square
distribution.
Finding the probability that a number will fall to the right of a value under the Chi-square
distribution (in the right tail).
Finding the probability that a number will fall to the left of a value under the Chi-square
distribution (in the left tail).
The syntax for the χ²cdf( function is χ²cdf(L, B, df), where L is the lower bound of the interval,
B is the upper bound of the interval, and df is the degrees of freedom.
Finding the Area Between Two Values
To find the area between two numbers a and b under the Chi-square curve, P(a < χ² < b) =
χ²cdf(a, b, df).
Find the probability of getting a value between 5.14 and 7.28
under the Chi-square curve with 8 degrees of freedom.
Select: 2nd > VARS > 8: χ²cdf( and press ENTER.
Type: 5.14, 7.28, 8) and press ENTER.
P(5.14 < χ² < 7.28) = 0.236.
Finding the Area in the Left Tail
To find the area to the left of b (in the left tail) under the Chi-square curve, use P(χ² < b) =
χ²cdf(-∞, b, df). The TI-84 calculator does not have a built-in key for negative infinity (-∞).
Thus, the value -1E99 is used, which represents a very large negative number. The letter E
stands for scientific notation and is entered with the EE function above the comma (,) key (2nd > ,). Thus, the
command will look like: χ²cdf(-1E99, b, df). Because a chi-square value cannot be negative,
a lower bound of 0 gives the same area.
Find the probability of getting a value less than 20 under the
Chi-square curve with 17 degrees of freedom.
Select: 2nd > VARS > 8: χ²cdf( and press ENTER.
Type: -1 > 2nd > , > 99, 20, 17) and press ENTER.
P(χ² < 20) = 0.7258
Finding the Area in the Right Tail
To find the area to the right of a (in the right tail) under the Chi-square curve, use
P(χ² > a) = χ²cdf(a, 1E99, df).
Find the probability of getting a value greater than 31.08
under the Chi-square curve with 25 degrees of
freedom.
Select: 2nd > VARS > 8: χ²cdf( and press ENTER.
Type: 31.08, 1 > 2nd > , > 99, 25) and press ENTER.
P(χ² > 31.08) = 0.1864.
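The three chi-square areas above can be approximated with a short Python sketch. Python's standard library has no chi-square distribution, so this sketch integrates the chi-square density numerically with Simpson's rule; that is a swapped-in technique for illustration, not the calculator's method. The function names are hypothetical, and the sketch assumes df is at least 3 so the density is well behaved at 0.

```python
from math import exp, lgamma, log

def chi2_pdf(x, df):
    """Chi-square density; taken as 0 at and below x = 0 (assumes df >= 3)."""
    if x <= 0:
        return 0.0
    k = df / 2
    return exp((k - 1) * log(x) - x / 2 - k * log(2) - lgamma(k))

def _area(a, b, df, n=4000):
    """Composite Simpson's rule for the density on [a, b]; n must be even."""
    h = (b - a) / n
    total = chi2_pdf(a, df) + chi2_pdf(b, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * chi2_pdf(a + i * h, df)
    return total * h / 3

def chi2_cdf(lower, upper, df):
    """Area between lower and upper; 1E99 or more as upper means a right tail."""
    lower = max(lower, 0.0)  # a chi-square value cannot be negative
    if upper >= 1e99:
        return 1 - _area(0.0, lower, df)
    return _area(lower, upper, df)

print(round(chi2_cdf(5.14, 7.28, 8), 3))   # 0.236
print(round(chi2_cdf(-1e99, 20, 17), 3))   # 0.726
print(round(chi2_cdf(31.08, 1e99, 25), 3)) # 0.186
```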
Graph the Chi-square Probability Density Function
The function χ2pdf( stands for Chi-square probability density function and does not actually
generate a probability, since it applies to a single x value in a continuous distribution and that
probability is always zero. The main use of this command is to draw the Chi-square curve. The
syntax for the function is χ2pdf(x, df), where df is the degrees of freedom.
The following sequence of commands will draw the chi-square curve with 7 degrees of freedom.
Select: Y= > 2nd > VARS > 7: χ2pdf( and press ENTER.
Type: x, 7) > ZOOM > 0
This command may be used to draw any Chi-square
distribution curve with any degrees of freedom. It may be
necessary to adjust the window. For example, changing Xmax
to 30 gives a better Chi-square distribution curve (see below).
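To see what χ2pdf( is plotting, the density can be evaluated directly. This is a rough sketch in Python using the standard density formula for the Chi-square distribution; the name chi2_pdf and the text-based plot are illustrative assumptions, not calculator features. It tabulates the curve for 7 degrees of freedom over 0 to 30, the window suggested above.

```python
import math

def chi2_pdf(x, df):
    """Height of the chi-square density curve at x (not a probability)."""
    if x <= 0:
        return 0.0                      # correct at x = 0 when df > 2
    k = df / 2.0
    log_density = (k - 1) * math.log(x) - x / 2.0 - k * math.log(2) - math.lgamma(k)
    return math.exp(log_density)

# x values from 0 to 30 in steps of 0.5 (Xmin = 0, Xmax = 30)
xs = [i * 0.5 for i in range(61)]

# Crude text plot: one row every 2 units of x
for x in xs[::4]:
    bar = "*" * int(round(chi2_pdf(x, 7) * 400))
    print(f"{x:5.1f} {chi2_pdf(x, 7):.4f} {bar}")
```

The tallest row falls at x = 5 on this grid, which matches the known mode of the curve, df - 2.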
Shade the Chi-square Probability Density Function
When calculating the probability of an area under the Chi-square curve, it is often helpful to shade the area. The syntax for the TI-84 Plus command to do this is Shadeχ2(a, b, df).
To shade the area under the Chi-square curve with 8 degrees of freedom for P(5.14 < χ2 < 7.28) = 0.236, begin by turning off all other graphs (STATPLOT or Y=).
Adjust the WINDOW to view the Chi-square curve with 8 degrees of freedom, as shown on the left.
Select: 2nd > VARS > DRAW > 3: Shadeχ2( and press ENTER.
Type: 5.14, 7.28, 8) and press ENTER.
Notice that the area of the shaded region is also shown on the graph and it is the same value calculated from the χ2cdf( command. Thus, the Shadeχ2( command is an alternative to χ2cdf(, with the added benefit of shading the area.
A Goodness-of-Fit Test
A goodness-of-fit test is used to make a test of hypothesis about experiments with more than two possible outcomes (or categories). These are called multinomial experiments. The frequencies of each possible outcome obtained from the experiment are called the observed frequencies. A goodness-of-fit test compares the observed frequencies with the expected frequencies (npi). The test statistic built from these differences approximately follows a chi-square distribution. The sample size should be large enough so that the expected frequency for each category is at least 5.
The TI-84 Plus command for the goodness-of-fit test is χ2GOF-Test, which tests whether sample data is from a population that conforms to a specified distribution (expected frequencies). The command requires that both the observed and expected frequencies are put in lists. The degrees of freedom, df, is the number of outcomes minus 1. The command is located on the STAT page in the TESTS list.
Example 1:
The following table lists the frequency
distribution of 90 rolls of a die. Test at
the 5% significance level whether the
null hypothesis that the given die is
fair is true.
Outcome of roll        1    2    3    4    5    6
Observed Frequency    16   21   15   12   14   12
Expected Frequency    15   15   15   15   15   15
Enter the observed frequencies into List L1 and the expected
frequencies into List L2, as shown on the left.
There are 6 possible outcomes, so the degrees of freedom, df,
is 6 – 1 = 5.
Select: STAT > TESTS > D: χ2GOF-Test and press
ENTER.
Type in 2nd > 1 for Observed.
Type in 2nd > 2 for Expected.
Type in 5 for df:
Highlight Calculate and press ENTER.
The output for the χ2GOF-Test shows the:
Chi-square value: χ2 = 3.7333
p-value = 0.5884
Degrees of freedom: df = 5
CNTRB = {.066666667, 2.4, 0, .6, .066666667}
Note: CNTRB= provides a list of the contributions of each category to the overall value of χ2. These are the values in the (O − E)²/E column. Notice that a roll of ‘2’ had the largest contribution.
Since the p-value is greater than 5%, we do not reject the null hypothesis. There is not sufficient evidence to conclude that the die is unfair.
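The arithmetic behind the χ2GOF-Test output can be sketched in a few lines of Python. This is only an illustration of the (O − E)²/E computation for the die example; the critical value 11.070 is an assumption taken from a standard chi-square table (upper 5% point, 5 degrees of freedom) and is used in place of a p-value.

```python
# Observed and expected frequencies for 90 rolls of a die
observed = [16, 21, 15, 12, 14, 12]
expected = [15, 15, 15, 15, 15, 15]

# Contribution of each category: (O - E)^2 / E
contrib = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2 = sum(contrib)
df = len(observed) - 1

# Assumed table value: upper 5% point of chi-square with 5 df
CRITICAL_05_DF5 = 11.070
reject = chi2 > CRITICAL_05_DF5

print("contributions:", [round(c, 4) for c in contrib])
print("chi-square:", round(chi2, 4), "df:", df)
print("reject H0 at 5%?", reject)
```

The contributions list has one entry per face of the die, so it also shows the sixth value, which does not fit in the CNTRB line of the calculator display.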
Contingency Tables
When measuring the relationship between two categorical variables, one of the most important
tools for analyzing the results is a two-way classification table, also known as a contingency
table.
A Test of Independence
The contingency table can be used to see if the variables are independent, by comparing observed frequencies with the frequencies that would be expected from such a sample if they were independent. A Chi-Square (χ2) test-statistic can be computed from the observed and expected frequencies. The TI-84 function χ2-Test is used for the Test of Independence and is located on the STAT page in the TESTS list. The χ2-Test function works differently than the other tests on this menu. It requires you to enter the observed frequencies into a matrix. The computed expected frequencies are then stored automatically in another matrix.
Example: School Referendum
A random sample of 300 adults was selected and asked if they were in favor of the new school referendum. The two-way classification table of the responses of the adults is presented in the table.
Test at a 5% significance level whether or not Gender and Opinion are independent.
First the data has to be stored differently than in previous
statistical tests on the calculator. The data for a Chi-Square test
of independence has to be stored in a Matrix.
Type: 2nd > x⁻¹ to get to MATRIX.
Highlight EDIT and press the ENTER key.
After MATRIX [A] type in 2 x 3.
The 2 represents the number of rows in the table.
The 3 represents the number of columns in the table.
Enter the data values into the matrix as they appear in the table.
Select: STAT > TESTS > C: χ2-Test and press ENTER.
The χ2-Test screen shows:
That the Observed values are in matrix A
That the Expected values will be put in matrix B
Highlight Calculate and press ENTER.
The output for the χ2-Test shows the:
Test statistic: χ2 =8.252773109
P-value: p =.0161410986
Degrees of freedom: df=2
          In Favor   Against   No Opinion
Men          93         70         12
Women        87         32          6
Since the p-value of .016 is less than the significance level of .05, we reject the null hypothesis that the row and column variables are independent. We have significant evidence that gender and opinion concerning the school referendum are related in the population of adults.
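What the χ2-Test does with matrix [A] can be sketched as follows. This is a minimal Python illustration, not the calculator's internal code: expected counts are (row total × column total)/n, and the statistic sums (O − E)²/E over all cells. The p-value line relies on the fact that, for exactly 2 degrees of freedom, the right-tail area of the Chi-square curve is exp(−χ2/2); other degrees of freedom would need a general cumulative function.

```python
import math

# Observed counts: rows = Men, Women; columns = In Favor, Against, No Opinion
observed = [[93, 70, 12],
            [87, 32, 6]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts under independence (what the calculator stores in matrix [B])
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Closed form for the right-tail area, valid only when df == 2
assert df == 2
p_value = math.exp(-chi2 / 2)

print("expected:", expected)
print("chi-square:", round(chi2, 6), "df:", df, "p:", round(p_value, 6))
```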
The Expected frequencies can be found in Matrix B.
Type: 2nd > x⁻¹ to get to MATRIX.
Select: 2: [B] and press the ENTER key.
Press ENTER again.
In the above example, the Calculate option was chosen. If the Draw option is chosen, the
calculator will draw the curve and state the χ2 value and the p-value.
Select: STAT > TESTS > C: χ2-Test and press ENTER.
The original information should still be there.
Highlight Draw and press ENTER.
A Test of Homogeneity
A test of homogeneity is a test to determine if two (or more) populations are homogeneous
(similar) with regard to the distribution of a certain characteristic. The procedure to perform this
test on the TI-84 Plus is identical to performing a Test Of Independence (please see above).
Inferences About the Population Variance
In the same way that the population mean and population proportion are tested, so is the population variance. This is often in response to a desire to control the consistency of a value. If the population from which the sample is taken is approximately normally distributed, then the statistic (n - 1)s2/σ2, computed from the sample variance s2, has a chi-square distribution with n - 1 degrees of freedom.
The TI-84 Plus does not have a built-in function to generate confidence intervals
about the population variance. The Goodness Of Fit function, χ2GOF-Test, can be
used for a test of hypothesis of the population variance if the data
Chapter 10 Comparing Two Treatments
Confidence Interval for µ1 - µ2
There are two functions used to compute confidence intervals for the difference of two
population means µ1 − µ2:
2-SampZInt for when both σ1 and σ2 are known and the sample sizes are large or the
populations from which the samples are drawn are normal
2-SampTInt for when σ1 and σ2 are not known.
Both are found on the STAT page under TESTS.
Known Population Standard Deviations
If you are fortunate enough to have information about the population standard deviations of the
two populations, either from theory or a pilot study, then you would use a Z-based confidence
interval, 2-SampZInt, to estimate the difference µ1 − µ2 between two population means. As was
the case for estimating one population mean, ZInterval, there are two different syntaxes for the
2-SampZInt command.
If you know the statistical information for both populations (population standard deviations, sample means, and sample sizes), then the syntax is 2-SampZInt σ1, σ2, x̄1, n1, x̄2, n2, confidence level.
If you have the sample data stored in two lists, then the syntax is 2-SampZInt σ1, σ2, List name1, List name2, Frequency list1, Frequency list2, confidence level.
Example 1:
The following information is obtained from two independent samples selected from two
populations. Construct a 90% confidence interval for µ1 − µ2.
n1 = 200   x̄1 = 6.4   σ1 = 0.7
n2 = 190   x̄2 = 5.6   σ2 = 0.55
We have the population standard deviations, σ1 and σ2, so we will use the 2-SampZInt command. We do not have the data itself, so we will select Stats where it asks for input. We enter 0.7 for σ1, 0.55 for σ2, 6.4 for x̄1, 200 for n1, 5.6 for x̄2, 190 for n2, and 0.90 for C-Level.
Press STAT > TESTS > 9: 2-SampZInt and press ENTER.
Select Stats by moving the cursor over Stats and press the ENTER key.
Type in 0.7 for σ1.
Type in 0.55 for σ2.
Type in 6.4 for x̄1.
Type in 200 for n1.
Type in 5.6 for x̄2.
Type in 190 for n2.
Type in .90 for C-Level.
Highlight Calculate and then press the Enter key.
The 2-SampZInt output shows the 90% confidence interval
for µ1 − µ2, as well as both sample means and both sample
sizes.
The confidence interval is between 0.69543 and 0.90458.
With 90% confidence, we believe that the true difference
between the population means is between 0.69543 and 0.90458.
If you have the actual data rather than the statistics, then enter the
data in two lists and choose Data rather than Stats.
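The interval reported by 2-SampZInt in Example 1 follows the usual formula (x̄1 − x̄2) ± z* · sqrt(σ1²/n1 + σ2²/n2). Below is a short Python sketch of that formula, assuming the standard library's statistics.NormalDist for the critical value z*; the function name two_sample_z_interval is illustrative.

```python
import math
from statistics import NormalDist

def two_sample_z_interval(sigma1, sigma2, mean1, n1, mean2, n2, level):
    """Confidence interval for mu1 - mu2 with known population SDs."""
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)   # e.g. about 1.645 for 90%
    se = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)  # standard error of the difference
    diff = mean1 - mean2
    return diff - z_star * se, diff + z_star * se

# Example 1: same argument order as the Stats input screen
low, high = two_sample_z_interval(0.7, 0.55, 6.4, 200, 5.6, 190, 0.90)
print(round(low, 5), round(high, 5))
```

The two endpoints should match the calculator's interval to the displayed precision.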
Independent Samples With Unknown But Equal Population Standard Deviations
In the real world, one rarely knows the population standard
deviations. Thus, the approach to estimating the difference µ1 −
µ2 between two population means is to compute a T-based
confidence interval, 2-SampTInt, with the assumption that the
population standard deviations are equal. The pooled sample
standard deviation for the two samples will be used. As was the
case for the 2-SampZInt command, there are two different
syntaxes for the 2-SampTInt command.
If you know the statistical information for both samples (sample standard deviations, sample means, and sample sizes), then the syntax is 2-SampTInt x̄1, Sx1, n1, x̄2, Sx2, n2, Confidence Level, Pooled.
If you have the sample data stored in two lists, then the syntax is 2-SampTInt List name1, List name2, Frequency list1, Frequency list2, Confidence Level, Pooled.
Example 2:
The following information is obtained from two independent samples selected from two
populations with unknown but equal standard deviations. Construct a 95% confidence interval
for µ1 − µ2.
n1 = 42   x̄1 = 78.4   s1 = 10.13
n2 = 39   x̄2 = 75.2   s2 = 9.55
We have the sample standard deviations, s1 and s2, so we will use the 2-SampTInt command. We do not have the data itself, so we will select Stats where it asks for input. We enter 78.4 for x̄1, 10.13 for Sx1, 42 for n1, 75.2 for x̄2, 9.55 for Sx2, 39 for n2, and 0.95 for C-Level. We have reason to believe that the population standard deviations are the same, so select Yes by the prompt Pooled.
Press STAT > TESTS > 0: 2-SampTInt and press ENTER.
Select Stats by moving the cursor over Stats and press the ENTER key.
Type in 78.4 for x̄1.
Type in 10.13 for Sx1.
Type in 42 for n1.
Type in 75.2 for x̄2.
Type in 9.55 for Sx2.
Type in 39 for n2.
Type in .95 for C-Level.
Highlight Yes for Pooled.
Highlight Calculate and then press the Enter key.
The 2-SampTInt output shows the 95% confidence interval
for µ1 − µ2, as well as both sample means and both sample
standard deviations. It also shows the degrees of freedom
(n1 + n2 – 2).
The confidence interval is between -1.162 and 7.5622.
With 95% confidence, we believe that the true difference
between the population means is between -1.162 and 7.5622.
If you have the actual data rather than the statistics, then enter
the data in two lists and choose Data rather than Stats.
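The pooled interval in Example 2 can be reproduced by hand with the pooled standard deviation and a t critical value. The Python sketch below is illustrative only. The value T_STAR = 1.9905 is an assumption read from a t table (95% confidence, 79 degrees of freedom), since the Python standard library has no t distribution; the function name is also hypothetical.

```python
import math

def pooled_t_interval(mean1, s1, n1, mean2, s2, n2, t_star):
    """Interval for mu1 - mu2 assuming equal population SDs (Pooled: Yes)."""
    df = n1 + n2 - 2
    # Pooled sample standard deviation (Sxp on the calculator)
    sp = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    diff = mean1 - mean2
    return diff - t_star * se, diff + t_star * se, sp, df

# Assumed table value for 95% confidence with 79 degrees of freedom
T_STAR = 1.9905

low, high, sp, df = pooled_t_interval(78.4, 10.13, 42, 75.2, 9.55, 39, T_STAR)
print(round(low, 3), round(high, 4), "Sxp =", round(sp, 5), "df =", df)
```

Because t* is rounded from a table, the endpoints agree with the calculator only to about three decimal places.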
Hypothesis Testing: µ1 - µ2
Known Population Standard Deviations
If you are fortunate enough to have information about the population standard deviations of the two populations, either from theory or a pilot study, then you would use the Z-based test, 2-SampZTest, to perform a test of hypothesis about µ1 − µ2. Again, there are two different syntaxes for the 2-SampZTest command.
If you know the statistical information for both populations (population standard deviations, sample means, and sample sizes), then the syntax is 2-SampZTest σ1, σ2, x̄1, n1, x̄2, n2, µ1: (the alternative hypothesis).
If you have the sample data stored in two lists, then the syntax is 2-SampZTest σ1, σ2, List name1, List name2, Frequency list1, Frequency list2, µ1: (the alternative hypothesis).
Example 1 (from above):
The following information is obtained from two independent samples selected from two
populations. Test at the 5% significance level if the two population means are different, µ1 ≠ µ2.
n1 = 200   x̄1 = 6.4   σ1 = 0.7
n2 = 190   x̄2 = 5.6   σ2 = 0.55
Press STAT > TESTS > 3: 2-SampZTest and press ENTER.
Select Stats by moving the cursor over Stats and press the ENTER key.
Type in 0.7 for σ1.
Type in 0.55 for σ2.
Type in 6.4 for x̄1.
Type in 200 for n1.
Type in 5.6 for x̄2.
Type in 190 for n2.
Highlight ≠ µ2 for µ1:
Highlight Calculate and then press the Enter key.
The 2-SampZ-Test output shows the
alternative hypothesis: μ1 ≠ µ2
test statistic: z=12.58305739
p-value: 2.707293E-36
sample means: x̄1=6.4; x̄2=5.6
sample sizes: n1=200; n2=190
The p-value is 2.707293E-36, which is far less than 5%. We reject H0 and conclude that the population means are significantly different.
In the above example, the Calculate option was chosen. If the Draw option is chosen, the
calculator will draw the curve and state the z value and the p-value.
Press STAT > TESTS > 3: 2-SampZTest and press ENTER.
The original information should still be there.
Highlight Draw and press ENTER.
If you have the actual data rather than the statistics, then enter
the data in two lists and choose Data rather than Stats.
Independent Samples With Unknown But Equal Population Standard Deviations
In the real world, one rarely knows the population standard deviations. Thus, the t-based test, 2-SampTTest, is used to perform a test of hypothesis about µ1 − µ2, with the assumption that the population standard deviations are equal. The pooled sample standard deviation for the two samples will be used. Again, there are two different syntaxes for the 2-SampTTest command.
If you know the statistical information for both samples (sample standard deviations, sample means, and sample sizes), then the syntax is 2-SampTTest x̄1, Sx1, n1, x̄2, Sx2, n2, µ1: (the alternative hypothesis), Pooled.
If you have the sample data stored in two lists, then the syntax is 2-SampTTest List name1, List name2, Frequency list1, Frequency list2, µ1: (the alternative hypothesis), Pooled.
Example 2 (from above):
The following information is obtained from two independent samples selected from two
populations with unknown but equal standard deviations.
Test at the 5% significance level if µ1 > µ2.
n1 = 42   x̄1 = 78.4   s1 = 10.13
n2 = 39   x̄2 = 75.2   s2 = 9.55
Press STAT > TESTS > 4: 2-SampTTest and press ENTER.
Select Stats by moving the cursor over Stats and press the ENTER key.
Type in 78.4 for x̄1.
Type in 10.13 for Sx1.
Type in 42 for n1.
Type in 75.2 for x̄2.
Type in 9.55 for Sx2.
Type in 39 for n2.
Highlight > µ2 for µ1:
Highlight Yes for Pooled.
Highlight Calculate and then press the Enter key.
The 2-SampTTest output shows the:
alternative hypothesis: μ1 > µ2
test statistic: t=1.460144062
p-value: 0.0741080013
degrees of freedom: 79
sample means: x̄1=78.4; x̄2=75.2
sample standard deviations: Sx1=10.13; Sx2=9.55
pooled sample standard deviation: Sxp=9.85527418
sample sizes: n1=42; n2=39
The p-value is 0.0741080013, which is greater than 5%. We do not reject H0; the data do not show that population mean 1 is significantly greater than population mean 2.
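The pooled t statistic, Sxp, and the right-tail p-value from this example can be approximated without the calculator. The sketch below is in Python with only the standard library. Since no t distribution is built in, the tail area is obtained by integrating the t density numerically with Simpson's rule; this is an assumption of the sketch and not necessarily how the TI-84 computes it. All function names are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0: one half minus Simpson's-rule area on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def pooled_t_test(mean1, s1, n1, mean2, s2, n2):
    """t statistic, df and pooled SD for H0: mu1 = mu2 (Pooled: Yes)."""
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    t = (mean1 - mean2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, df, sp

t_stat, df, sp = pooled_t_test(78.4, 10.13, 42, 75.2, 9.55, 39)
p_value = t_upper_tail(t_stat, df)   # alternative hypothesis: mu1 > mu2
print("t =", round(t_stat, 6), "df =", df, "Sxp =", round(sp, 6), "p =", round(p_value, 6))
```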
In the above example, the Calculate option was chosen. If the Draw option is chosen, the calculator will draw the curve and state the t value and the p-value.
Press STAT > TESTS > 4: 2-SampTTest and press ENTER.
The original information should still be there.
Highlight Draw and press ENTER.
If you have the actual data rather than the statistics, then enter
the data in two lists and choose Data rather than Stats.
Independent Samples With Unknown and Unequal Population Standard Deviations
In the case of unequal population standard deviations, use the same procedures above for equal
population standard deviations, with the one exception of choosing No for Pooled.
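For the unpooled case, the textbook approach replaces the pooled standard error with sqrt(s1²/n1 + s2²/n2) and uses an approximate (Welch-Satterthwaite) number of degrees of freedom. The Python sketch below applies those formulas to the numbers from Example 2 purely as an illustration; it is assumed, not confirmed by this manual, that the calculator's Pooled: No option reports the same quantities.

```python
import math

def welch_statistic(mean1, s1, n1, mean2, s2, n2):
    """Unpooled t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2          # variance of each sample mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

t_unpooled, df_unpooled = welch_statistic(78.4, 10.13, 42, 75.2, 9.55, 39)
print("t =", round(t_unpooled, 4), "df =", round(df_unpooled, 2))
```

With sample standard deviations this close, the unpooled results differ only slightly from the pooled ones; the degrees of freedom are generally not a whole number.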
Paired Samples
The confidence intervals and hypothesis tests described above all assume that the samples are taken independently. Two samples are said to be dependent (paired) when each observation in one sample is matched with an observation in the other, for example when both measurements are taken on the same subjects. The most common data collection design with dependent samples is called Pretest/Posttest. Data is collected from a sample before some type of treatment and then data is collected again from that same sample after the treatment. We then work with the mean difference between the pre- and posttest scores. The null hypothesis is that the average difference is zero. By working with the differences between the variables, we can use the one-sample t procedures (it is rarely the case that the population standard deviations are known): the T-Test command for a test of hypothesis and the TInterval command for a confidence interval.
Example 3:
Find the 99% confidence interval for µd, assuming that the population of paired differences is normally distributed.
n = 16   d̄ = 21.7   sd = 10.4
Press STAT > TESTS > 8: TInterval and press ENTER.
We do not have the sample data, so we will select Stats.
Select Stats for Inpt: and press the ENTER key.
Type in 21.7 for x̄ (this is d̄).
Type in 10.4 for Sx (this is sd).
Type in 16 for n.
Type in .99 for C-Level.
Highlight Calculate and press the ENTER key.
The TInterval output shows the 99% confidence interval
along with the sample mean, sample standard deviation, and
sample size.
The 99% confidence interval is from 14.039 to 29.361.
We can state with 99% confidence that the mean difference
between the pre- and posttest is between 14.039 and 29.361.
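The paired interval is d̄ ± t* · sd/√n with n − 1 degrees of freedom. The Python sketch below reproduces Example 3 using only the standard library. The critical value t* is found by bisection on a numerically integrated t distribution; that method, and every function name here, is an assumption of this sketch rather than a description of the TInterval command.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0, using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(p, df):
    """Value t* with P(T <= t*) = p, for 0.5 < p < 1, found by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def paired_t_interval(d_bar, s_d, n, level):
    """Confidence interval for the mean paired difference."""
    t_star = t_quantile(1 - (1 - level) / 2, n - 1)
    margin = t_star * s_d / math.sqrt(n)
    return d_bar - margin, d_bar + margin

low, high = paired_t_interval(21.7, 10.4, 16, 0.99)
print(round(low, 3), round(high, 3))
```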
Two Population Proportions
Confidence Interval: p1 - p2
Large and Independent Samples
If you have two large and independent samples, then you would use a Z-based confidence interval, 2-PropZInt, to estimate the difference p1 − p2 of two population proportions. The function is located on the STAT page in the TESTS list. For population proportions, the samples are considered large when n1p1, n1q1, n2p2, and n2q2 are all greater than 5.
Example 4:
The following information is obtained from two large and independent samples selected from
two populations. Construct a 95% confidence interval for p1 − p2.
n1 = 200   p̂1 = 0.42   n2 = 220   p̂2 = 0.35
The 2-PropZInt command requests x1 rather than p̂1. To find x1, use the formula x1 = n1p̂1 (and likewise x2 = n2p̂2).
x1 = (200)(0.42) = 84   x2 = (220)(0.35) = 77
Press STAT > TESTS > B: 2-PropZInt and press ENTER.
Type in 84 for x1.
Type in 200 for n1.
Type in 77 for x2.
Type in 220 for n2.
Type in .95 for C-Level.
Highlight Calculate and then press the Enter key.
The 2-PropZInt output shows the 95% confidence interval for p1 − p2, as well as both sample proportions and both sample sizes.
The confidence interval is between -0.023 and 0.16301.
With 95% confidence, we believe that the true difference between the population proportions is between -0.023 and 0.16301.
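The interval for p1 − p2 uses the unpooled standard error sqrt(p̂1q̂1/n1 + p̂2q̂2/n2). The short Python sketch below, with a hypothetical function name and the standard library's NormalDist for z*, recomputes Example 4 from the counts x1 and x2 that the calculator asks for.

```python
import math
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, level):
    """Confidence interval for p1 - p2 from success counts and sample sizes."""
    p1, p2 = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

low, high = two_prop_z_interval(84, 200, 77, 220, 0.95)
print(round(low, 5), round(high, 5))
```

Since the interval is centered at p̂1 − p̂2 = 0.07, the two endpoints must lie equally far from 0.07, which is a quick check on any reported interval.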
Hypothesis Testing: p1 - p2
If you have two large and independent samples, then you would use the Z-based test, 2-PropZTest, to perform a test of hypothesis about p1 − p2. The function is located on the STAT page in the TESTS list. For population proportions, the samples are considered large when n1p1, n1q1, n2p2, and n2q2 are all greater than 5.
Example 4 (from above):
The following information is obtained from two large and independent samples selected from
two populations. Test at the 1% significance level if the two population proportions are different,
p1 ≠ p2.
n1 = 200   p̂1 = 0.42   n2 = 220   p̂2 = 0.35
The 2-PropZTest command requests x1 rather than p̂1. To find x1, use the formula x1 = n1p̂1 (and likewise x2 = n2p̂2).
x1 = (200)(0.42) = 84   x2 = (220)(0.35) = 77
Press STAT > TESTS > 6: 2-PropZTest and press ENTER.
Type in 84 for x1.
Type in 200 for n1.
Type in 77 for x2.
Type in 220 for n2.
Highlight ≠ p2 for p1:
Highlight Calculate and then press the Enter key.
The 2-PropZTest output shows the
alternative hypothesis: p1 ≠ p2
test statistic: z=1.47
p-value: 0.14
sample proportions: p̂1=0.42; p̂2=0.35
pooled sample proportion: p̂=0.383
sample sizes: n1=200; n2=220
The p-value is 0.14, which is greater than 1%. We do not reject H0; the data do not show that the population proportions are significantly different.
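Unlike the confidence interval, the test of p1 = p2 uses the pooled proportion p̂ = (x1 + x2)/(n1 + n2) in its standard error. The following Python sketch (illustrative names, standard library only) recomputes the pooled proportion, the z statistic, and the two-sided p-value for this example.

```python
import math

def two_prop_z_test(x1, n1, x2, n2):
    """Pooled proportion, z statistic and two-sided p-value for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))   # 2 * P(Z > |z|)
    return pooled, z, p_two_sided

pooled, z, p = two_prop_z_test(84, 200, 77, 220)
print("pooled =", round(pooled, 3), "z =", round(z, 2), "p =", round(p, 2))
print("reject H0 at 1%?", p < 0.01)
```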
In the above example, the Calculate option was chosen. If the Draw option is chosen, the
calculator will draw the curve and state the z value and the p-value.
Press STAT > TESTS > 6: 2-PropZTest and press ENTER.
The original information should still be there.
Highlight Draw and press ENTER.
Chapter 11 Regression Analysis I; Simple Linear Regression
Simple Linear Regression Models
A simple linear regression model is an equation describing how to use one variable, x, to predict
another variable, y, based on the relationship existing in the sample data. Since the predictions
made from the sample data may differ from the actual values in the population data, the symbol
y' is used for the predicted value of y. The simplest possible model is a linear one: y' = a + bx.
The graph is a line, where b is the slope of the line and a is the y-coordinate of the y-intercept.
Creating a Linear Regression Model
The TI-84 Plus calculator has two built-in functions,
LinReg(ax+b) and LinReg(a+bx) to compute a simple linear
regression model. They are both located on the STAT page in
the CALC list. These are two forms of the same function, one
that writes the equation as ax+b and the other that writes the
equation as a+bx. We will use the latter form, a+bx, but either
is okay.
We will create a linear regression model for the English and
Math scores from the previous example. They are stored in lists
L1 and L2.
Press STAT > CALC > 8: LinReg(a+bx).
Type: L1 > , > L2 > ,
Press VARS > Y-VARS > 1 > 1 to get Y1.
Press ENTER.
The LinReg output shows:
general model: y=a+bx
y-intercept: a=53.26159596
slope: b=.3135854034
coefficient of determination: r2=.3883069253
correlation coefficient: r=.6231427808
The equation of the linear regression model for the English and Math scores is
y' = 53.2616 + 0.3136x
Graphing the Linear Regression Line
The LinReg command we entered requested that the
equation of the linear regression model (least squares
line) be stored in Y1.
Execute the STAT PLOT command again and the least squares
line will appear on the scatter diagram.
Confidence Interval for B
A goal for determining the regression line is to estimate the true value of the slope B of the population regression line. The slope, b, of the regression line for the sample is a point estimate of the slope, B, of the regression line for the population. A different sample would give a different b value. Thus, b is a random variable, and the standardized statistic (b − B)/sb, where sb is the estimated standard deviation of b, has a t distribution with n − 2 degrees of freedom.
The LinRegTInt function can be used to construct the
confidence interval for B, based on b. The LinRegTInt command is located on the STAT page
in the TESTS list. The command does require that the data be stored in two lists.
Example
Using the data in the English and Math scores example above, construct a 95% confidence
interval for B.
Press STAT > TESTS > G:LinRegTInt and press ENTER.
Type L1 in for the Xlist:
Type L2 for the Ylist:
Let the Freq: remain at 1.
Type .95 for C-Level:
Highlight Calculate and press ENTER.
The LinRegTInt output shows:
general model: y=a+bx
confidence interval: (-.0382, .66535)
slope: b=.3135854034
degrees of freedom: df=7
sample standard deviation: s=4.729745193
y-intercept: a=53.26159596
coefficient of determination: r2=.3883069253
correlation coefficient: r=.6231427808
The 95% confidence interval for the slope of the English and Math scores example is
(-.0382, .66535).
Hypothesis Tests
The LinRegTTest can be used to test whether or not the variable
x can meaningfully predict the variable y. This is equivalent to
testing whether or not B, the population slope coefficient for the
model (approximated by b), is really 0. LinRegTTest is located
on the STAT page in the TESTS list. The test requires the
names of the lists containing the data values and what the
alternative hypothesis is.
Example:
Test at the 1% significance level whether the slope of the regression line for the above example
on English and Math scores is positive.
Perform the test of H0: B = 0 versus H1: B > 0 in LinRegTTest.
Press the STAT > TESTS > F:LinRegTTest and press ENTER.
Type in L1 for the Xlist:
Type in L2 for the Ylist:
Let the Freq: remain at 1.
Move the cursor over >0 and press ENTER.
Set the RegEQ: to Y1.
Highlight Calculate and press ENTER.
The value of t is 2.108 and the p-value is .0365, which is greater than our significance level of 1%. We fail to reject the null hypothesis and conclude that the slope is not significantly greater than zero.
The regression equation was stored in Y1. This equation (linear
regression model) can be used to find y' for a given value of x,
such as x = 75.
Type Y1(75) and press ENTER.
To find the predicted Math score for someone with an English
score of 60, type Y1(60) to get a predicted Math score of 72.08.
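Evaluating Y1 at a value of x simply substitutes x into y' = a + bx. A minimal Python sketch of the same calculation, using the coefficients from the LinReg output above (the function name predict is illustrative):

```python
# Coefficients reported by LinReg(a+bx) for the English and Math scores
A = 53.26159596   # y-intercept
B = 0.3135854034  # slope

def predict(x):
    """Predicted Math score y' for an English score x, like Y1(x)."""
    return A + B * x

print(round(predict(60), 2))   # compare with Y1(60)
print(round(predict(75), 2))   # compare with Y1(75)
```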
One-Way Analysis of Variance
We have already seen how to test for the equality of means between two different populations
with 2-SampZTest and 2-SampTTest. With the added assumption of common population
standard deviations, we can extend the tests to more than two populations with a technique
known as Analysis of Variance (ANOVA for short).
In a one-way ANOVA, only one factor or variable is being tested. The null hypothesis is that
means of three or more populations are equal; the alternative hypothesis is that at least two of the
means differ.
The TI-84 built-in ANOVA function is on the STAT page in
the TESTS list. First store the sample data into lists, one per
population.
The syntax for the ANOVA function is: ANOVA(List1, List2, List3, . . .List20). There is a
minimum of 2 lists and a maximum of 20 lists. Use the list names in which the data is stored.
The result contains the F test statistic, degrees of freedom, various sums of squares, mean sums
of squares, and most importantly the p-value for the test.
Example: Fourth Grade Arithmetic With Equal Sample Sizes
Fifteen fourth-grade students were exposed to one of three
different methods of teaching arithmetic. They were randomly
assigned to three groups of five. At the end of the semester, the
same test was given to all 15 students. The following table gives
the scores of the students in the three groups.
Assume that the three populations are normally distributed with equal standard deviations.
Calculate the value of the test statistic F. At the 1% significance level, can we reject the null
hypothesis that the mean arithmetic score of all fourth-grade students taught by each of these
three methods is the same?
Press STAT > ENTER key to get to the Stat Editor.
Enter the Method I data values into L1.
Enter the Method II data values into L2.
Enter the Method III data values into L3.
Press the STAT > TESTS > H: ANOVA and press ENTER.
Type in: L1, L2, L3) and press ENTER.
Method I   Method II   Method III
   48         55          84
   73         85          68
   51         70          95
   65         69          74
   87         90          67
The one-way ANOVA table would appear as follows:
Source of Variation   Degrees of Freedom   Sum of Squares   Mean Squares   Value of the Test Statistic
Between (Factor)      2                    432.1333         216.0667       F = 216.0667/197.7333 = 1.0927
Within (Error)        12                   2372.8           197.7333
Total                 14                   2804.9333                       p = 0.3665
The df for Between is the number of populations minus 1: k -1 = 3 – 1 = 2.
The df for Within is the number of data items minus number of populations: n - k = 15 – 3 = 12.
The p-value is 0.366. Since this is greater than the significance level of 1%, we fail to reject the
null hypothesis. There is insufficient evidence from the data to show that the different teaching
methods have significantly different average results.
Example: Bank Tellers With Unequal Sample Sizes
From time to time, unknown to its employees, the research department at Post Bank observes
various employees for their work productivity. Recently this department wanted to check
whether the four tellers at a branch of this bank serve, on average, the same number of customers
per hour. The research manager observed each of the four tellers for a certain number of hours.
The following table gives the number of customers served by the four tellers during each of
the observed hours.
At a 5% level of significance, test the null hypothesis that the mean number of customers served
per hour by each of the four tellers is the same. Assume all the assumptions required to apply the
one-way ANOVA procedure hold true.
Press STAT > ENTER key to get to the Stat Editor.
Enter the Teller A data values into L1.
Enter the Teller B data values into L2.
Enter the Teller C data values into L3.
Enter the Teller D data values into L4.
Teller A   Teller B   Teller C   Teller D
   19         14         11         24
   21         16         14         19
   26         14         21         21
   24         13         13         26
   18         17         16         20
              13         18
Press the STAT > TESTS > H: ANOVA and press ENTER.
Type in: L1, L2, L3, L4) and press ENTER.
The one-way ANOVA table would appear as follows:
Source of Variation   Degrees of Freedom   Sum of Squares   Mean Squares   Value of the Test Statistic
Between (Factor)      3                    255.6182         85.2061        F = 85.2061/8.7889 = 9.6947
Within (Error)        18                   158.2            8.7889
Total                 21                   413.8182                        p = 0.0005
The df for Between: k -1 = 4 – 1 = 3.
The df for Within: n - k = 22 – 4 = 18.
The p-value is 0.0005. Since this is less than the significance level of 5%, we reject the null
hypothesis. There is significant evidence here to show that the tellers’ job performances are not
all the same.
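Both ANOVA tables above can be rebuilt from the raw data. The Python sketch below computes the between and within sums of squares, the mean squares, and F for any number of groups, with equal or unequal sizes. The function name and the dictionary keys are illustrative. A p-value is shown only for the arithmetic example, using a closed form for the F distribution that holds only when the numerator has exactly 2 degrees of freedom.

```python
def one_way_anova(groups):
    """Sums of squares, degrees of freedom, mean squares and F."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {"df": (df_between, df_within),
            "ss": (ss_between, ss_within),
            "ms": (ms_between, ms_within),
            "F": ms_between / ms_within}

# Fourth-grade arithmetic: three methods, five students each
methods = [[48, 73, 51, 65, 87], [55, 85, 70, 69, 90], [84, 68, 95, 74, 67]]
# Bank tellers: A and D observed for 5 hours, B and C for 6 hours
tellers = [[19, 21, 26, 24, 18], [14, 16, 14, 13, 17, 13],
           [11, 14, 21, 13, 16, 18], [24, 19, 21, 26, 20]]

arith = one_way_anova(methods)
bank = one_way_anova(tellers)

# Right-tail area of F, valid only when the numerator df is exactly 2
df1, df2 = arith["df"]
assert df1 == 2
p_arith = (1 + 2 * arith["F"] / df2) ** (-df2 / 2)

print("arithmetic:", arith, "p =", round(p_arith, 4))
print("tellers:", bank)
```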
Chapter 12 Regression Analysis II; Multiple Linear Regression and Other Topics
The TI-84 Plus calculator does not have multiple regression functions built directly into it. The
calculator can aid in solving multiple regression problems by using the formulas found in chapter
12.
Chapter 13 Analysis of Categorical Data
A Goodness-of-Fit Test
A goodness-of-fit test is used to make a test of hypothesis about experiments with more than two possible outcomes (or categories). These are called multinomial experiments. The frequencies of each possible outcome obtained from the experiment are called the observed frequencies. A goodness-of-fit test compares the observed frequencies with the expected frequencies (npi). The test statistic built from these differences approximately follows a chi-square distribution. The sample size should be large enough so that the expected frequency for each category is at least 5.
The TI-84 Plus command for the goodness-of-fit test is χ2GOF-Test, which tests whether sample data is from a population that conforms to a specified distribution (expected frequencies). The command requires that both the observed and expected frequencies are put in lists. The degrees of freedom, df, is the number of outcomes minus 1. The command is located on the STAT page in the TESTS list.
Example 1:
The following table lists the frequency
distribution of 90 rolls of a die. Test at
the 5% significance level whether the
null hypothesis that the given die is
fair is true.
Enter the observed frequencies into List L1 and the expected
frequencies into List L2, as shown on the left.
Outcome of roll        1    2    3    4    5    6
Observed Frequency    16   21   15   12   14   12
Expected Frequency    15   15   15   15   15   15
There are 6 possible outcomes, so the degrees of freedom, df, is 6 – 1 = 5.
Select: STAT > TESTS > D: χ2GOF-Test and press ENTER.
Type in 2nd > 1 for Observed.
Type in 2nd > 2 for Expected.
Type in 5 for df:
Highlight Calculate and press ENTER.
The output for the χ2GOF-
Test shows the:
Chi-square value: χ2
= 3.7333
p-value = 0.5884
Degrees of freedom: df = 5
CNTRB = {.066666667, 2.4, 0, .6, .066666667}
Note: CNTRB= provides a list of the contributions of each
category to the overall value of χ2. This would be the values in the column. Notice that a
roll of ‘2’ had the largest contribution.
Since the p-value is greater than 5%, we fail to reject the null hypothesis. There is not enough
evidence to conclude that the die is not fair.
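The arithmetic behind χ2GOF-Test can be cross-checked away from the calculator. The following is a minimal Python sketch, not part of the TI-84 workflow. The function names chi2_sf and gof_test are illustrative, and the p-value comes from a series expansion of the incomplete gamma function, which is an assumption about how to approximate the chi-square tail rather than a description of what the calculator does internally.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with df degrees of freedom.

    Uses the power series of the regularized lower incomplete gamma function,
    which is adequate for the moderate values met in textbook examples.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower

def gof_test(observed, expected):
    """Return (chi-square, p-value, df, per-category contributions)."""
    contrib = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    chi2 = sum(contrib)
    df = len(observed) - 1
    return chi2, chi2_sf(chi2, df), df, contrib

observed = [16, 21, 15, 12, 14, 12]   # die-roll counts from Example 1
expected = [15] * 6                   # 90 rolls / 6 faces for a fair die
chi2, p_value, df, contrib = gof_test(observed, expected)
print(f"chi2 = {chi2:.4f}, p = {p_value:.4f}, df = {df}")
print("contributions:", [round(c, 4) for c in contrib])
```

The contributions list has one entry per die face, so it plays the same role as the calculator's CNTRB list.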
Contingency Tables
When measuring the relationship between two categorical variables, one of the most important
tools for analyzing the results is a two-way classification table, also known as a contingency
table.
A Test of Independence
The contingency table can be used to see if the variables are
independent, by comparing observed frequencies with the
frequencies that would be expected from such a sample if they
were independent. A Chi-Square (Χ2) test-statistic can be
computed from the observed and expected frequencies. The TI-
84 function Χ2-Test is used for the Test of Independence and is
located on the STAT page in the TESTS list. The Χ2-Test
function works differently than the other tests on this menu. It
requires you to enter the observed frequencies into a matrix. The computed expected frequencies
are then stored automatically in another matrix.
Example: School Referendum
A random sample of 300 adults was selected and asked if they were in favor of the new school
referendum. The two-way classification table of the responses of the adults is presented in the
following table.
         In Favor   Against   No Opinion
Men      93         70        12
Women    87         32        6
Test at a 5% significance level whether or not Gender and Opinion are independent.
First, the data has to be stored differently than in previous statistical tests on the calculator.
The data for a Chi-Square test of independence has to be stored in a matrix.
Type: 2nd > x⁻¹ to get to MATRIX.
Highlight EDIT and press the ENTER key.
After MATRIX [A] type in 2 x 3.
The 2 represents the number of rows in the table.
The 3 represents the number of columns in the table.
Enter the data values into the matrix as they appear in the table.
Select: STAT > TESTS > C: χ2-Test and press ENTER.
The χ2-Test screen shows:
That the Observed values are in matrix A
That the Expected values will be put in matrix B
Highlight Calculate and press ENTER.
The output for the χ2-Test shows the:
Test statistic: χ2 = 8.252773109
P-value: p = .0161410986
Degrees of freedom: df = 2
Since the p-value of .016 is less than the significance level of .05, we reject the null hypothesis
that row and column variables are independent. We have significant evidence that gender and
opinion concerning the school referendum are related for all adults.
The Expected frequencies can be found in Matrix B.
Type: 2nd > x⁻¹ to get to MATRIX.
Select: 2: [B] and press the ENTER key.
Press ENTER again.
In the above example, the Calculate option was chosen. If the Draw option is chosen, the
calculator will draw the curve and state the χ2 value and the p-value.
Select: STAT > TESTS > C: χ2-Test and press ENTER.
The original information should still be there.
Highlight Draw and press ENTER.
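For readers who want to see where the expected frequencies and the test statistic come from, here is a hedged Python sketch of the same calculation for the school referendum table. The function name independence_test is illustrative. The p-value line relies on the fact that, for exactly 2 degrees of freedom, the chi-square upper tail equals exp(−x/2); that shortcut is an assumption specific to this 2 x 3 table and does not generalize to other table sizes.

```python
import math

def independence_test(table):
    """Return (chi-square, df, expected frequencies) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: (row total * column total) / grand total
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected

observed = [[93, 70, 12],   # Men:   In Favor, Against, No Opinion
            [87, 32, 6]]    # Women: In Favor, Against, No Opinion
chi2, df, expected = independence_test(observed)

# Closed-form chi-square upper tail, valid only when df == 2
assert df == 2
p_value = math.exp(-chi2 / 2)

print(f"chi2 = {chi2:.9f}, p = {p_value:.10f}, df = {df}")
for row in expected:
    print(row)
```

The printed rows correspond to what the calculator stores in matrix [B].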
A Test of Homogeneity
A test of homogeneity is a test to determine if two (or more) populations are homogeneous
(similar) with regard to the distribution of a certain characteristic. The procedure to perform this
test on the TI-84 Plus is identical to performing a Test Of Independence (please see above).
Chapter 14 Analysis of Variance (ANOVA)
One-Way Analysis of Variance
We have already seen how to test for the equality of means between two different populations
with 2-SampZTest and 2-SampTTest. With the added assumption of common population
standard deviations, we can extend the tests to more than two populations with a technique
known as Analysis of Variance (ANOVA for short).
In a one-way ANOVA, only one factor or variable is being tested. The null hypothesis is that the
means of three or more populations are equal; the alternative hypothesis is that at least two of
the means differ.
The TI-84 built-in ANOVA function is on the STAT page in
the TESTS list. First store the sample data into lists, one per
population.
The syntax for the ANOVA function is: ANOVA(List1, List2, List3, ..., List20). There is a
minimum of 2 lists and a maximum of 20 lists. Use the list names in which the data is stored.
The result contains the F test statistic, degrees of freedom, various sums of squares, mean sums
of squares, and most importantly the p-value for the test.
Example: Fourth Grade Arithmetic With Equal Sample Sizes
Fifteen fourth-grade students were exposed to one of three
different methods of teaching arithmetic. They were randomly
assigned to three groups of five. At the end of the semester, the
same test was given to all 15 students. The following table gives
the scores of the students in the three groups.
Method 1   Method 2   Method 3
48         55         84
73         85         68
51         70         95
65         69         74
87         90         67
Assume that the three populations are normally distributed with equal standard deviations.
Calculate the value of the test statistic F. At the 1% significance level, can we reject the null
hypothesis that the mean arithmetic score of all fourth-grade students taught by each of these
three methods is the same?
Press STAT > ENTER to get to the Stat Editor.
Enter the Method 1 data values into L1.
Enter the Method 2 data values into L2.
Enter the Method 3 data values into L3.
Select: STAT > TESTS > H: ANOVA and press ENTER.
Type in: L1, L2, L3) and press ENTER.
The one-way ANOVA table would appear as follows:
Source of Variation   Degrees of Freedom   Sum of Squares   Mean Squares   Value of the Test Statistic
Between (Factor)      2                    432.1333         216.0667       F = 216.0667/197.7333 = 1.0927
Within (Error)        12                   2372.8           197.7333
Total                 14                   2804.9333                       p = 0.3665
The df for Between is the number of populations minus 1: k -1 = 3 – 1 = 2.
The df for Within is the number of data items minus number of populations: n - k = 15 – 3 = 12.
The p-value is 0.366. Since this is greater than the significance level of 1%, we fail to reject the
null hypothesis. There is insufficient evidence from the data to show that the different teaching
methods have significantly different average results.
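The entries of the ANOVA table above can be reproduced by hand or with a few lines of code. The sketch below is an illustration in Python, not the calculator's own routine, and the function name one_way_anova is an assumption made for this example. It stops at the F statistic because the p-value needs the F distribution, which the Python standard library does not provide.

```python
def one_way_anova(*groups):
    """Return the one-way ANOVA table entries for any number of samples."""
    k = len(groups)                          # number of populations
    n = sum(len(g) for g in groups)          # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between: squared distance of each group mean from the grand mean, weighted by group size
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within: squared distance of each observation from its own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df": (df_between, df_within),
        "SS": (ss_between, ss_within),
        "MS": (ms_between, ms_within),
        "F": ms_between / ms_within,
    }

method1 = [48, 73, 51, 65, 87]
method2 = [55, 85, 70, 69, 90]
method3 = [84, 68, 95, 74, 67]
result = one_way_anova(method1, method2, method3)
for name, value in result.items():
    print(name, value)
```

Passing the lists in the same order as L1, L2, L3 mirrors the ANOVA(L1, L2, L3) call on the calculator.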
Example: Bank Tellers With Unequal Sample Sizes
From time to time, unknown to its employees, the research department at Post Bank observes
various employees for their work productivity. Recently this department wanted to check
whether the four tellers at a branch of this bank serve, on average, the same number of customers
per hour. The research manager observed each of the four tellers for a certain number of hours.
The following table gives the number of customers served by the four tellers during each of
the observed hours.
Teller A   Teller B   Teller C   Teller D
19         14         11         24
21         16         14         19
26         14         21         21
24         13         13         26
18         17         16         20
           13         18
At a 5% level of significance, test the null hypothesis that the mean number of customers served
per hour by each of the four tellers is the same. Assume all the assumptions required to apply the
one-way ANOVA procedure hold true.
Press STAT > ENTER to get to the Stat Editor.
Enter the Teller A data values into L1.
Enter the Teller B data values into L2.
Enter the Teller C data values into L3.
Enter the Teller D data values into L4.
Select: STAT > TESTS > H: ANOVA and press ENTER.
Type in: L1, L2, L3, L4) and press ENTER.
The one-way ANOVA table would appear as follows:
Source of Variation   Degrees of Freedom   Sum of Squares   Mean Squares   Value of the Test Statistic
Between (Factor)      3                    255.6182         85.2061        F = 85.2061/8.7889 = 9.6947
Within (Error)        18                   158.2            8.7889
Total                 21                   413.8182                        p = 0.0005
The df for Between: k -1 = 4 – 1 = 3.
The df for Within: n - k = 22 – 4 = 18.
The p-value is 0.0005. Since this is less than the significance level of 5%, we reject the null
hypothesis. There is significant evidence here to show that the tellers’ job performances are not
all the same.
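With unequal sample sizes the same partition of variation applies: the total sum of squares splits into a between part and a within part. The Python sketch below illustrates this for the teller data by computing the total and within sums of squares directly and obtaining the between sum of squares by subtraction. It assumes that the two extra observations in the last row of the data table (13 and 18) belong to Teller B and Teller C respectively, which is the assignment that reproduces the sums of squares shown in the ANOVA table. The variable names are illustrative.

```python
from statistics import fmean

tellers = {
    "A": [19, 21, 26, 24, 18],
    "B": [14, 16, 14, 13, 17, 13],   # assumed: the extra 13 belongs to Teller B
    "C": [11, 14, 21, 13, 16, 18],   # assumed: the extra 18 belongs to Teller C
    "D": [24, 19, 21, 26, 20],
}

all_values = [x for group in tellers.values() for x in group]
n, k = len(all_values), len(tellers)
grand_mean = fmean(all_values)

# Total variation around the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in all_values)
# Variation of each observation around its own teller's mean
ss_within = sum((x - fmean(group)) ** 2 for group in tellers.values() for x in group)
# What is left over is the variation between tellers
ss_between = ss_total - ss_within

f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(f"SS total = {ss_total:.4f}, SS between = {ss_between:.4f}, SS within = {ss_within:.4f}")
print(f"df = ({k - 1}, {n - k}), F = {f_stat:.4f}")
```

Computing the between part by subtraction is a design choice made here to show the identity Total = Between + Within; the weighted-means formula gives the same value.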
Chapter 15 Nonparametric Inference
The TI-84 Plus calculator does not have nonparametric functions built directly into it. However,
the calculator can be used to solve problems involving nonparametric statistical methods. The
examples below show how to use the TI-84 Plus calculator to compute the numbers needed
for these nonparametric tests:
sign test
paired-sample sign test
Wilcoxon Rank Sum test for two independent samples.
Sign Test
The Sign Test is a nonparametric test and it can be used even if nothing is known about the
continuous population distribution. The Sign Test can test claims about the median of the
population, because any sample measurement stands a 50% chance of being above the median
and a 50% chance of being below the median. Sample data can be treated as a binomial
experiment, with successes occurring when the measurement is above the median.
Example: Median Price of Cars (Small Sample)
A new car salesman states that the median price of cars in a small town is $37,000. A sample of
10 cars selected by a statistician produced the following data on the prices in dollars.
Car 1 2 3 4 5 6 7 8 9 10
Price 47,500 23,600 37,000 68,200 29,450 42,00 56,400 88,210 98,425 15,300
Using a 5% significance level, can we conclude that the median price differs from $37,000?
The hypotheses are:
H0: the median price is $37,000
H1: the median price differs from $37,000
Since one of the data items is the median, $37,000, it will be ignored. Thus, n = 9.
The sample prices are replaced by signs:
Plus (+) if the price is above the hypothetical median of $37,000
Minus (-) if the price is below $37,000.
Car    1   2   3         4   5   6   7   8   9   10
Sign   +   -   ignored   +   -   +   +   +   +   -
We have six +’s out of nine signs. Since it is a two-tailed test, the p-value will be twice the
probability of getting six or more successes out of nine trials in a binomial experiment where
p = 0.5. The binomcdf(n, p, x) function calculates the cumulative probability P(X ≤ x), so the
probability of six or more successes is found as 1 − binomcdf(9, 0.5, 5). The binomcdf( function
is located on the DISTR page.
Thus, n = 9, p = 0.5, and x = 6.
p-value = 2 P(X ≥ 6)
= 2(1 − P(X ≤ 5))
= 2(1 − binomcdf(9, 0.5, 5))
= 0.5078125
Since the p-value of 0.5078 is greater than our significance level of 5%, we fail to reject the null
hypothesis. There is not enough evidence to conclude that the median differs from $37,000.
Paired-Sample Sign Test
Given paired data, the Sign Test can be used to test to see if the medians of their respective
populations are the same. If the medians are the same, then there is a 50% chance for one
measurement of a pair to be larger than the other. Each pair can be replaced by a sign indicating
whether the first or the second measurement is larger.
Example: Blood Pressure (Small Sample)
A researcher wants to find the effect of a special diet on systolic blood pressure in adults. She
selected a sample of 12 adults and put them on this dietary plan for three months. The following
table gives the systolic blood pressure of each adult before and after the completion of the plan.
Subject A B C D E F G H I J K L
Before 215 195 200 198 177 225 244 217 212 191 226 228
After 186 196 185 183 161 203 208 200 189 189 221 240
Using a 2.5% significance level, can we conclude that the dietary plan reduces the median
systolic blood pressure of adults? Our hypotheses are:
H0: the diet does not reduce the median systolic blood pressure of adults
H1: the diet does reduce the median systolic blood pressure of adults
We begin by adding a plus or minus sign indicating whether the Before or After pressure is
larger.
Plus (+) if the Before is larger
Minus (-) if the After is larger
Subject A B C D E F G H I J K L
Before 215 195 200 198 177 225 244 217 212 191 226 228
After 186 196 185 183 161 203 208 200 189 189 221 240
Sign + - + + + + + + + + + -
There are ten +’s out of twelve signs. Since it is a right-tailed test, the p-value will be the
probability of getting ten or more successes out of twelve trials in a binomial experiment where p
= 0.5.
Thus, n = 12, p = 0.5, and x = 10.
p-value = P(X ≥ 10)
= 1 − P(X ≤ 9)
= 1 − binomcdf(12, 0.5, 9)
= 0.019
Since the p-value is less than the significance level 2.5%, we reject the null hypothesis. The data
leads us to conclude that the diet does lower the median systolic blood pressure of adults.
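Both sign-test p-values in this chapter come from the same binomial tail calculation. The Python sketch below is a hedged illustration of that arithmetic. The helper binomcdf is written here to mirror the calculator's argument order (n, p, x); it is not a library function, and the variable names are assumptions made for this example.

```python
from math import comb

def binomcdf(n, p, x):
    """P(X <= x) for X ~ Binomial(n, p), using the calculator's argument order."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))

# Car prices: 6 plus signs among 9 usable observations, two-tailed test
p_cars = 2 * (1 - binomcdf(9, 0.5, 5))

# Blood pressure: 10 plus signs among 12 pairs, right-tailed test
p_diet = 1 - binomcdf(12, 0.5, 9)

print(f"car prices p-value     = {p_cars:.7f}")
print(f"blood pressure p-value = {p_diet:.3f}")
```

The "at least x" probability is always written as 1 minus the cumulative probability up to x − 1, which is why the last argument is 5 for six or more successes and 9 for ten or more.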