6.Review of Statistical Methods---Misiri.ppt

Post on 08-Dec-2015


REVIEW OF STATISTICAL METHODS

HE MISIRI

COMMUNITY HEALTH DEPARTMENT

• Describe how you will conduct descriptive statistical analysis in your study

• Describe how you will conduct hypothesis testing in your study (when applicable)

• Describe the statistical tests you will use to analyse data from your proposed study

Random Error

• Research is usually conducted on samples.

• It is expensive, time-consuming and logistically difficult to conduct a census.

• Sample estimates are always inexact because of sampling error, also known as random variation.

• The smaller the sample the greater the variation.
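
The link between sample size and random variation can be illustrated with a small simulation. This is a minimal sketch, not part of the original slides: the population of ages, the sample sizes and the function name `spread_of_sample_means` are all invented for illustration.

```python
import random
import statistics

def spread_of_sample_means(population, sample_size, repeats=2000, seed=1):
    """Standard deviation of the sample mean across repeated random samples."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(repeats)
    ]
    return statistics.stdev(means)

# Hypothetical population of 10,000 ages (invented for illustration only).
rng = random.Random(0)
population = [rng.gauss(35, 12) for _ in range(10_000)]

small = spread_of_sample_means(population, sample_size=10)
large = spread_of_sample_means(population, sample_size=200)
print(f"Spread of sample means, n=10:  {small:.2f}")
print(f"Spread of sample means, n=200: {large:.2f}")
```

The sample means from the small samples scatter much more widely around the population mean than those from the large samples.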

Types of Data

• Categorical data come from categorical variables such as eye colour, sex, marital status and level of education.

• A categorical variable has categories, e.g. sex is categorised as male or female.

• Continuous variables can assume any value on the real line.

• Continuous data come from continuous variables.

Scales of measurement

• Nominal: Sex

• Ordinal: Severity of pain

• Interval/Ratio: Weight, Speed

Describing data

Data can be described by using:

• Charts

• Tables

• Numerical summary values

• Shapes of distributions

1. Charts: Histogram

Pie Chart

2. Tables: Frequency distribution

Age group    Number of patients
< 30          30
31-40        102
41-50        162
51-60         96
61-70         22
71-80          4
Total        416

Percentage distribution: Satisfaction with nursing care

Level of satisfaction    No. of patients    Percentage
Very satisfied           121                25.5
Satisfied                161                33.9
Neutral                   90                18.9
Dissatisfied              51                10.7
Very dissatisfied         52                10.9
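
The percentages in the satisfaction table can be reproduced from the patient counts. The sketch below uses only the counts given in the table; the variable names are illustrative.

```python
# Patient counts taken from the satisfaction table.
counts = {
    "Very satisfied": 121,
    "Satisfied": 161,
    "Neutral": 90,
    "Dissatisfied": 51,
    "Very dissatisfied": 52,
}

total = sum(counts.values())

# Percentage in each category, rounded to one decimal place.
percentages = {
    level: round(100 * n / total, 1) for level, n in counts.items()
}

for level, pct in percentages.items():
    print(f"{level:<18} {counts[level]:>4} {pct:>6.1f}")
print(f"{'Total':<18} {total:>4}")
```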

Table: Mean age (SD) by sex

Sex        Mean age (SD)
Males      20.3 (1.2)
Females    18.2 (1.6)
All        19.3 (1.8)

Table from Misiri et al (2012b)

HIV Rates-Misiri et al(2012a)

4. Shapes of distributions: Symmetry and kurtosis

The degree of “peakedness” (Chris Caple, 1991) is called kurtosis.

• Positively skewed

• Negatively skewed

[Figures: positively skewed distribution; symmetric distribution; kurtosis]

Variation in sample data

Numerical summaries

• For categorical data, one uses counts (frequencies), percentages or proportions, and rates to describe the data.

• For continuous data one uses measures of central tendency and variation

3. Numerical values: Summary statistics

Examples of summary statistics are:

A. Measures of central tendency: mean, median, mode

B. Measures of variation: variance, standard deviation, range, interquartile range

C. Other statistics: proportions, percentiles, etc.

Examples-categorical data

• In a class of 200 students, 51 are males and 149 are females (numbers).

• 25.5% of patients were very satisfied with nursing care

• The prevalence of Chlamydia in young women in England in 1996 was 3.1%.

• The incidence rate of cancer is 90 cases per 100,000 person-years.

More examples: et al (2012b)

Ze & Misiri(2009)

Descriptive statistics-Categorical data

• Proportions

• Percentages

• Each proportion should be reported with a confidence interval (CI).

• Categorical data are better summarised in a percentage distribution or frequency distribution.
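
As an illustration of attaching a confidence interval to a proportion, the sketch below computes a 95% interval for the proportion of patients who were very satisfied (121 of 475, from the satisfaction table). The slides do not say which interval method to use; the normal-approximation (Wald) interval here is an assumption chosen for simplicity, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * sqrt(p * (1 - p) / n)
    return p, p - margin, p + margin

# 121 of 475 patients were very satisfied with nursing care.
p, low, high = wald_interval(121, 475)
print(f"Proportion: {p:.3f}, 95% CI: {low:.3f} to {high:.3f}")
```

For small samples or proportions close to 0 or 1, other interval methods generally behave better than this approximation.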

Appropriate average to use

• Use the mean and standard deviation for symmetric data.

• Use the median and range or quartiles for skewed data.

Misiri et al(2012c)

Standard deviation

• SD = sqrt(44.8/4) = 3.3

• This is a measure of the typical variation in the data.

• That means individual data points differ from the sample mean by about 3.3 on average.
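
The arithmetic in the standard deviation example can be checked directly. The sketch assumes that 44.8 is the sum of squared deviations from the sample mean and that the divisor 4 is n − 1, so that n = 5; the slides give only the two numbers, so both readings are assumptions.

```python
from math import sqrt

# Assumed reading of the slide: 44.8 is the sum of squared deviations
# from the sample mean, and the divisor 4 is n - 1 (so n = 5).
sum_of_squared_deviations = 44.8
n = 5

variance = sum_of_squared_deviations / (n - 1)
sd = sqrt(variance)

print(f"Variance: {variance:.1f}")
print(f"SD: {sd:.1f}")
```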

• A normal distribution is a distribution that is symmetric and looks similar to a bell in shape. If the data in a population follow a normal distribution, then the standard deviation (the measure of spread around the mean) has the following properties:

• The range from 1 SD below to 1 SD above the mean includes about 68% of the distribution.

• The range from 2 SDs below to 2 SDs above the mean includes about 95% of the distribution.

• The range from 3 SDs below to 3 SDs above the mean includes about 99.7% of the distribution.
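
These coverage figures can be verified from the standard normal distribution. The sketch uses Python's standard-library `NormalDist`; the function name `coverage_within` is illustrative.

```python
from statistics import NormalDist

def coverage_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    standard_normal = NormalDist(mu=0, sigma=1)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

coverage = {k: coverage_within(k) for k in (1, 2, 3)}
for k, share in coverage.items():
    print(f"Within {k} SD of the mean: {100 * share:.1f}%")
```

The 2 SD range actually covers slightly more than 95%; the range that covers exactly 95% is about 1.96 SDs either side of the mean.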

• The standard deviation is not used to describe scatter around the median; the measure used for that is the inter-quartile range. There are three quartiles, at the 25th, 50th and 75th percentiles. They divide the data into four quarters, just as the median (the 50th percentile) divides the data into two halves. The inter-quartile range is the range of values between the 25th and 75th percentiles. These values are used to produce a box-and-whisker plot.

Bell-shaped distribution


Example: Plasma glucose

• 4.67

• 4.97

• 5.11

• 5.17

• 5.33

• 6.22

• 6.50

• 7.00
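
The median, quartiles and inter-quartile range of the eight plasma glucose values can be computed as follows. This is a sketch only: the slides do not state the units or which quartile convention to use, and software packages differ slightly in how they interpolate quartiles. The `"inclusive"` method of `statistics.quantiles` is an assumption made here.

```python
import statistics

# Plasma glucose values from the example (units not stated in the slides).
glucose = [4.67, 4.97, 5.11, 5.17, 5.33, 6.22, 6.50, 7.00]

mean = statistics.fmean(glucose)
median = statistics.median(glucose)

# Quartiles at 25%, 50% and 75%; the "inclusive" method is an assumption.
q1, q2, q3 = statistics.quantiles(glucose, n=4, method="inclusive")
iqr = q3 - q1

print(f"Mean:   {mean:.2f}")
print(f"Median: {median:.2f}")
print(f"Q1: {q1:.3f}, Q3: {q3:.3f}, IQR: {iqr:.3f}")
```

The mean is pulled above the median by the larger values, which is the pattern expected in positively skewed data.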

Hypothesis Testing

• Null hypothesis

• Alternative hypothesis

• Type I error

• Type II error

• Level of significance

Example

• Null hypothesis: mothers attending antenatal care (ANC) at clinic A are as likely to be attended by a skilled birth attendant as mothers attending ANC at clinic B.

• Alternative hypothesis: mothers attending ANC at clinic A are either more likely or less likely to be attended by a skilled birth attendant than mothers attending ANC at clinic B.
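
One way to test a pair of hypotheses like these is a two-sided z-test comparing two independent proportions. The slides do not name a test for this example, so the choice of test is an assumption, and the counts below are hypothetical numbers invented purely for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for equality of two independent proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion assumed under Ho
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts (not from the slides): mothers attended by a
# skilled birth attendant out of all mothers sampled at each clinic.
z, p_value = two_proportion_z_test(x1=72, n1=120, x2=60, n2=125)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.3f}")
```

The test is two-sided because the alternative hypothesis allows clinic A to be either more likely or less likely than clinic B.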

Paired samples t-test

• See example

Independent sample t-test

• See example

• The p-value is the probability, assuming the null hypothesis is true, of obtaining a test statistic as extreme as the one observed in your sample, or even more extreme.

Example:

• Ho: Mean difference = 0

• Ha: Mean difference > 0

• The test statistic is Z.

• Given that the level of significance is 5%:

• We will reject Ho if the p-value < 5%.

• This is because such a small p-value implies that findings like ours would be unlikely to arise by chance alone if Ho were true.

• We will fail to reject Ho (often loosely called accepting Ho) if the p-value > 5%.

• This is because such a p-value implies that our findings are consistent with what Ho states.
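
The decision rule for this one-sided Z test can be sketched in code. The Z values used below are hypothetical, since the slides give no data for this example, and the function name is illustrative.

```python
from statistics import NormalDist

def one_sided_z_decision(z, alpha=0.05):
    """Upper-tail p-value and decision for Ho: difference = 0 vs Ha: difference > 0."""
    p_value = 1 - NormalDist().cdf(z)  # probability of a Z this large or larger under Ho
    decision = "reject Ho" if p_value < alpha else "fail to reject Ho"
    return p_value, decision

# Hypothetical test statistics, for illustration only.
for z in (2.1, 1.0):
    p_value, decision = one_sided_z_decision(z)
    print(f"Z = {z}: p-value = {p_value:.4f} -> {decision}")
```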

WARNING!

• Do not abuse p-values.

• P-values should always be accompanied by confidence intervals.

• Confidence intervals give the magnitude of the effect as well as the precision of estimation.

Example: Zverev & Misiri (2009)

• One-way analysis of variance revealed a significant effect of shift phase on total sleep duration (F = 36.8, d.f. = 8, P < 0.001).

• Ho: The mean total sleep durations of the three shift phases are equal.

• Ha: The mean total sleep durations of the three shift phases are not all equal.

Summary of methods