Statistics

Post on 24-Apr-2020


Statistics: Chapter 13 in APIC Text

Objectives

• Provide definitions

• Role of Stats

• Descriptive Statistics

• Inferential Statistics

• Data Presentation

• Interpreting the Data

Key Concepts

• A working knowledge of statistics is a requirement for an effective infection prevention and control program.

• Ability to collect, organize, and analyze data is fundamental.

• Appropriate statistical methods must be used if correct interpretation of data is expected.

• Data should be displayed in appropriate graphs and shared with those in the facility who will use the data to improve outcomes.

What is Statistics?

• It is a tool to:

• Aid in organizing and summarizing data

• Communicate findings clearly and meaningfully to others

• Make inferences about data

• Statistics cannot prove either an association or causality. The strength of an association is assessed by computing statistical tests.

• Definition of Statistics……..

• Role of Statistics in Hospital Epidemiology

– To analyze data and prepare reports for healthcare facility

– To obtain knowledge about hospital epidemiology for the facility

– To investigate clusters of infections and describe an outbreak

– To perform research studies within your facility (work with statistician before you begin research).

• Descriptive Statistics – Provide numerical information about variables

– Scales of measurement

• Nominal scale

• Ordinal scale

• Interval scale

• Frequency Measure

• Rates

• Incidence rate

• Attack Rate

• Incidence

• Prevalence rate

• Mortality rate

• Standard deviation

• Ratios
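The frequency measures listed above all share one shape: a count of events divided by a population, scaled by a constant. The following is a minimal sketch of that arithmetic; the counts are hypothetical, and the multipliers (per 100 patients, per 1,000 patient-days) are common conventions assumed here, not values from the slides.

```python
def rate(events: int, population: int, multiplier: int = 100) -> float:
    """Events divided by population, scaled by a multiplier (100 gives a percentage)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events / population * multiplier


# Hypothetical ward data, invented purely for illustration.
new_infections = 6        # new cases identified during the month
existing_infections = 4   # cases already present when the month began
patients_at_risk = 200    # patients at risk during the month
patient_days = 1500       # total days of observation for those patients

# Incidence rate: new cases only, per 100 patients at risk.
incidence_rate = rate(new_infections, patients_at_risk)

# Prevalence rate: all cases (new and existing), per 100 patients.
prevalence_rate = rate(new_infections + existing_infections, patients_at_risk)

# Incidence density: new cases per 1,000 patient-days of observation.
incidence_density = rate(new_infections, patient_days, multiplier=1000)

print(f"Incidence rate:    {incidence_rate:.1f} per 100 patients")
print(f"Prevalence rate:   {prevalence_rate:.1f} per 100 patients")
print(f"Incidence density: {incidence_density:.1f} per 1,000 patient-days")
```

The key distinction is the numerator: incidence counts new cases only, while prevalence counts every case present.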

Notation for a 2x2 Table:

                  Disease    No Disease    Total
Factor Present    A          B             A+B
Factor Absent     C          D             C+D
Totals            A+C        B+D           N
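The cell labels A, B, C and D can be used to compute the two ratios most often taken from a 2x2 table. This is a small sketch, not material from the slides: the function names and the counts are hypothetical.

```python
def risk_ratio(a: int, b: int, c: int, d: int) -> float:
    """Risk of disease with the factor present divided by risk with it absent."""
    risk_exposed = a / (a + b)      # A / (A+B)
    risk_unexposed = c / (c + d)    # C / (C+D)
    return risk_exposed / risk_unexposed


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Cross-product ratio (A*D) / (B*C)."""
    return (a * d) / (b * c)


# Hypothetical counts, invented for illustration only.
A, B = 20, 80   # factor present: diseased, not diseased
C, D = 10, 90   # factor absent:  diseased, not diseased

print(f"Risk ratio: {risk_ratio(A, B, C, D):.2f}")
print(f"Odds ratio: {odds_ratio(A, B, C, D):.2f}")
```

A value of 1 for either ratio means the factor shows no association with the disease in these data. Both functions fail with a division by zero if the relevant cells are empty, which the sketch does not guard against.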

• Testing for reliability

– Definition

– Sensitivity

– Specificity

– Positive predictive value

– Negative predictive value

• Bivariate Relationships

– Correlation

– Regression

– Confounding variables

• Measures of central tendency

– Mean: average of all values

– Median: middle value (50th percentile)

– Mode: most frequent value

• Measures of dispersion (variability)

–Range

–Deviation

– Standard Deviation
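The measures of central tendency and dispersion can be computed with Python's standard `statistics` module. A short sketch follows; the data set is hypothetical and deliberately includes one outlier.

```python
from statistics import mean, median, mode, stdev

# Hypothetical lengths of stay in days; the value 30 is a deliberate outlier.
lengths_of_stay = [2, 3, 3, 4, 5, 6, 30]

center = {
    "mean": mean(lengths_of_stay),      # average of all values
    "median": median(lengths_of_stay),  # middle value of the sorted data
    "mode": mode(lengths_of_stay),      # most frequent value
}

# Range: difference between the largest and smallest value.
value_range = max(lengths_of_stay) - min(lengths_of_stay)

# Deviation: distance of each value from the mean.
deviations = [x - center["mean"] for x in lengths_of_stay]

# Sample standard deviation (divides by n - 1).
sd = stdev(lengths_of_stay)

print(center)
print(f"range = {value_range}, standard deviation = {sd:.2f}")
```

The single outlier pulls the mean well above the median, while the median and mode stay put.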

• Frequency distribution

– Normal distribution

– Skewness – Asymmetrical distribution

• Negative skew: tail of the curve extends to the left

• Positive skew: tail of the curve extends to the right

– Kurtosis – how flat or peaked a curve is

Intervals (for a normal distribution)

• Within one standard deviation of the mean = approx. 68% of the measurements.

• Within two standard deviations of the mean = approx. 95% of the measurements.

• Within three standard deviations of the mean = approx. 99.7% of the measurements.
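These percentages follow from the normal curve and can be checked with `statistics.NormalDist` from the Python standard library. The helper name `share_within` is made up for this sketch.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def share_within(k: float) -> float:
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

The rule applies only to data that are approximately normally distributed; skewed data will not follow these percentages.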

Inferential Statistics

– Terms

• Population

• Sample

• Hypothesis testing

• Types of inferential statistics

– Parametric: assume a normal distribution; continuous, interval-scale data

• z test to compare two means (sample >30)

• t test to compare two means (sample <30)

– Nonparametric: no assumption about the distribution; discrete, nominal, ordinal and interval data

• Mann-Whitney U test to compare two independent groups when a normal distribution cannot be assumed

• Chi-square to compare observed versus expected frequencies in 2 different groups (Fisher’s exact test used when <5 in any cell of a 2x2 table)
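The choice between chi-square and Fisher's exact test for a 2x2 table can be sketched with the standard library alone. This is an illustration under stated assumptions, not the method from the APIC text: the function names are hypothetical, the "<5" rule is applied to expected counts (a common reading of the rule), and the two-sided Fisher p value sums all tables no more probable than the observed one.

```python
from math import comb


def expected_counts(a: int, b: int, c: int, d: int) -> list:
    """Expected cell counts under no association: row total * column total / N."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    return [[r * col / n for col in cols] for r in rows]


def chi_square_statistic(a: int, b: int, c: int, d: int) -> float:
    """Sum over the four cells of (observed - expected)^2 / expected."""
    observed = [[a, b], [c, d]]
    expected = expected_counts(a, b, c, d)
    return sum(
        (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p value from the hypergeometric distribution."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Probability of x in the top-left cell with all margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Small tolerance so ties with the observed table are included.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


def small_cells(a: int, b: int, c: int, d: int) -> bool:
    """True when any expected count is below 5 (assumed threshold rule)."""
    return min(min(row) for row in expected_counts(a, b, c, d)) < 5


# Hypothetical tables, invented for illustration only.
large, small = (20, 80, 10, 90), (3, 1, 1, 3)
print(small_cells(*large), f"{chi_square_statistic(*large):.3f}")
print(small_cells(*small), f"{fisher_exact_two_sided(*small):.4f}")
```

The chi-square statistic still has to be converted to a p value (1 degree of freedom for a 2x2 table) using a table or a statistics package; that step is left out here.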

– Type I error: rejecting a true null hypothesis; its probability is the significance level (alpha), usually set at 0.05

– Type II error: not rejecting a false null hypothesis; its probability (beta) is heavily influenced by sample size

– Control: keeping the alpha level very small decreases the risk of committing a type I error; the type II error can be controlled only through the choice of sample size

– Type I and II errors are inversely related: for a fixed sample size, decreasing the risk of a type I error increases the risk of a type II error, so both cannot be minimized simultaneously

• Test statistic - numeric measure, computed from a set of sample measurements, which quantifies the magnitude of discrepancy between the hypothesized population parameter and the statistic computed from the sample.

– Can be converted to a probability value using special tables

• Level of significance (alpha) - probability value arbitrarily chosen by the researcher as the desired level of probability at which one may feel secure in rejecting the null hypothesis

• P value - the probability, given that the null hypothesis is true, of collecting a random sample of the same size from the same population that yields a test statistic at least as extreme as the one calculated from the sample.

– Rejection region

– Confidence interval (CI)

• The CI is usually calculated to be 95% but can be 90%, 99%, or 99.99%
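The terms above fit together as follows: a test statistic is converted to a P value, the P value is compared with alpha, and a confidence interval gives the range of plausible values. The sketch uses the standard normal distribution in place of printed tables and assumes a large sample with a z statistic. The function names and numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()


def two_sided_p_value(z: float) -> float:
    """Probability of a z statistic at least as extreme as the one observed."""
    return 2 * (1 - std_normal.cdf(abs(z)))


def mean_confidence_interval(sample_mean: float, sd: float, n: int, level: float = 0.95):
    """Large-sample CI for a mean: mean +/- z_critical * sd / sqrt(n)."""
    z_critical = std_normal.inv_cdf(1 - (1 - level) / 2)
    margin = z_critical * sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


alpha = 0.05             # significance level chosen before the analysis
z = 2.3                  # hypothetical test statistic
p = two_sided_p_value(z)
reject_null = p < alpha  # the statistic falls in the rejection region

low, high = mean_confidence_interval(sample_mean=10.0, sd=2.0, n=100)
print(f"p = {p:.4f}, reject null: {reject_null}")
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

Raising `level` to 0.99 widens the interval, because more certainty requires a wider range.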

• Data presentation

– Methods

• Tables- set of data that is arranged in rows and columns

• Graphs- method of showing quantitative data using a system of coordinates

– Types of line graphs

• Arithmetic scale line graph

• Semi-logarithmic scale line graph

• Histogram

• Frequency polygon

– Charts: method of illustrating information using only one coordinate

• Bar chart

• Geographic coordinate chart

• Pictogram

• Pie Chart

• Statistical Process Control

– Definition: its purpose is to ensure that each process is performed consistently and correctly within predetermined parameters. It focuses on process and is based on the principle of random variation

– Used in IC programs to monitor outcomes or processes of care. Helps determine whether variation has a common cause or a special cause (did something different happen?)

– Tools: control charts; the type of data and the frequency of events determine the type of control chart

• Continuous or discrete data

• Events that occur frequently usually follow normal distribution

• Frequency of event and type of sample determine chart to be used

– Constructing a control chart

• Collect your data

• Calculate your mean (average)

• Calculate your standard deviation

• Set up the control chart, indicating the mean and the upper and lower control limits (usually 3 standard deviations above and below the mean)

• Determine the scale for Y and X axis

• Interpret data to determine if it is in control or out of control & monitor for trends
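The construction steps above can be sketched in a few lines. This is a simplified illustration that uses the mean plus or minus 3 standard deviations for every kind of data; in practice the limits depend on the chart type chosen. The function names and monthly rates are hypothetical.

```python
from statistics import mean, stdev


def control_limits(values, sigmas: float = 3.0):
    """Return (lower limit, center line, upper limit) from baseline data."""
    center = mean(values)
    spread = stdev(values)  # sample standard deviation
    return center - sigmas * spread, center, center + sigmas * spread


def out_of_control(values, lower: float, upper: float):
    """List (index, value) pairs that fall outside the control limits."""
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Steps 1-3: collect baseline data, then calculate mean and standard deviation.
baseline_rates = [4, 5, 6, 5, 4, 6, 5, 5, 4, 6]  # hypothetical monthly rates
lower, center, upper = control_limits(baseline_rates)

# Steps 4-6: compare new observations against the limits.
new_rates = [5, 6, 9, 4]
flagged = out_of_control(new_rates, lower, upper)

print(f"center = {center:.2f}, limits = ({lower:.2f}, {upper:.2f})")
print("points suggesting special cause variation:", flagged)
```

Points inside the limits reflect common cause variation. A point outside them, such as the third new value here, suggests a special cause worth investigating. For rates, a negative lower limit would normally be set to zero, which this sketch does not do.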

DECISION TREE for CHOOSING STUDY DESIGN

OBSERVATIONAL
• Research question involves a prevention, treatment, or causal factor.
• Moderate or large effect expected.
• Trial not ethical or feasible.
• Trial too expensive.

versus

EXPERIMENTAL
• Research question involves a prevention or treatment.
• Small effect expected.
• Ethical and feasible.
• Money is available.

COHORT
• Little known about exposure.
• Evaluate many effects of an exposure.
• Exposure is rare.
• Underlying population is fixed.

versus

CASE–CONTROL
• Little known about disease.
• Evaluate many exposures.
• Disease is rare.
• Disease has long induction and latent period.
• Exposure data are expensive.
• Underlying population is dynamic.

RETROSPECTIVE
• Disease has long induction and latent period.
• Historical exposure.
• Want to save time and money.

versus

PROSPECTIVE
• Disease has short induction and latent period.
• Current exposure.
• Want high-quality data.

Practice

1. Which statistical test is used when the data are in small numbers?

a. Fisher’s exact

b. t test

c. Chi-square

d. z test

2. Statistical process control (SPC) charts are used for all of the following purposes except:

a. Monitor the process of care

b. Facilitate the determination of variation

c. Eliminate natural variation

d. Monitor outcomes

3. The frequency measures MOST commonly used in health care epidemiology are:

a. Mean, median and mode

b. Risk ratio and odds ratio

c. Incidence rate, prevalence rate and incidence density

d. Variance, standard deviation and range

4. The measure of central tendency MOST affected by outliers is:

a. Mean

b. Median

c. Mode

d. Range

5. If chance is a likely explanation for the difference between a sample statistic and the corresponding null hypothesis population value, then:

a. The difference is not statistically significant

b. The sample result is not compatible with the null hypothesis

c. The difference is statistically significant

d. The null hypothesis can be rejected

6. One hundred preschool children were monitored for colds during the winter. Eighteen of them had asthma. Of the children with asthma, 65% had two or more colds. Of the children who did not have asthma, 30% had two or more colds. What type of study was this?

a. Case-control

b. Cohort

c. Cross sectional

d. Period prevalence

References:

• Certification Study Guide 6th Edition; APIC.