Basic statistics 11/09/13. Topics to cover Averages: Mean, Median, Mode, Range, Confidence...

Post on 12-Jan-2016


Basic statistics

11/09/13

Topics to cover

Averages and spread: mean, median, mode, range, standard deviation, confidence intervals

Incidence and prevalence

Screening tests – positive and negative predictive values, sensitivity and specificity

Basic statistics

Central Tendencies and Spread of Data

Measures of central tendency

15 patients with epilepsy were recruited into a trial. They were asked to record the number of seizures they had in a six month period. The results are presented below.

4 2 1 1 13 1 9 2 1 1 2 3 7 5 8

Calculate the mean, median, mode and range.

Mean = 4, Median = 2, Mode = 1, Range = 12

Mean: the sum of the observations divided by the number of observations

(1 + 1 + 1 + 1 + 1 + 2 + 2 + 2 + 3 + 4 + 5 + 7 + 8 + 9 + 13) / 15 = 60 / 15 = 4

Median: the middle value when the observations are arranged in order of increasing value

1 1 1 1 1 2 2 2 3 4 5 7 8 9 13 (the 8th of the 15 values is 2, so the median is 2)

Mode: the most commonly occurring value

1 1 1 1 1 2 2 2 3 4 5 7 8 9 13 (1 occurs five times, so the mode is 1)

Range: the difference between the highest and lowest values in a set of data

13 – 1 = 12
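
The four calculations above can be checked with a few lines of Python. This is only a minimal sketch using the standard library `statistics` module (Python 3.8 or later is assumed for `multimode`); the variable names are illustrative and not part of the original slides.

```python
from statistics import mean, median, multimode

# Seizure counts reported by the 15 patients in the example.
seizures = [4, 2, 1, 1, 13, 1, 9, 2, 1, 1, 2, 3, 7, 5, 8]

data_mean = mean(seizures)        # sum of observations / number of observations
data_median = median(seizures)    # middle value of the sorted observations
data_modes = multimode(seizures)  # list of the most frequently occurring value(s)
data_range = max(seizures) - min(seizures)  # highest value minus lowest value

print("mean:", data_mean)
print("median:", data_median)
print("mode(s):", data_modes)
print("range:", data_range)
```

`multimode` returns a list because a data set can have more than one mode; here it contains the single value 1.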

Standard deviation

A measure of the spread of the data

Can be used to calculate confidence intervals
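
As a rough illustration, the standard deviation of the seizure data from the earlier example could be computed as below. The slides do not say whether the sample (n - 1) or population (n) form is intended, so this sketch shows both; treating the 15 patients as a sample is an assumption.

```python
from statistics import pstdev, stdev

# Seizure counts from the earlier example (15 patients).
seizures = [4, 2, 1, 1, 13, 1, 9, 2, 1, 1, 2, 3, 7, 5, 8]

# Sample standard deviation: divides the squared deviations by n - 1.
sample_sd = stdev(seizures)

# Population standard deviation: divides the squared deviations by n.
population_sd = pstdev(seizures)

print(f"sample SD:     {sample_sd:.2f}")
print(f"population SD: {population_sd:.2f}")
```

The sample form is always slightly larger than the population form, and the gap shrinks as the number of observations grows.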

Normal Distribution

Confidence intervals

Used to assess statistical significance

Provide a measure of the extent to which a sample estimate is likely to differ from the true population value

Indicate, with a standard level of certainty (usually 95%), the range of values within which the true population mean is likely to lie, e.g. 25 ± 5 (i.e. 20 to 30)

Confidence intervals contd.

For a given level of confidence:

a narrow interval indicates that the sample estimate has good (high) precision

a wide interval indicates that the sample estimate has poor (low) precision

Confidence intervals become narrower as:

the sample size increases

the variability of the data decreases

the degree of confidence required for the population mean decreases, e.g. a 90% interval is narrower than a 95% interval, which is narrower than a 99% interval
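
The effect of the confidence level on the width of the interval can be sketched in Python. The function below is a hypothetical helper, not something from the slides, and it assumes a normal approximation (mean ± z x standard error). With only 15 skewed observations a t-based interval would usually be preferred, so the numbers are illustrative only.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(values, confidence=0.95):
    """Approximate confidence interval for the mean (normal approximation)."""
    n = len(values)
    centre = mean(values)
    # Standard error of the mean: sample SD divided by the square root of n.
    standard_error = stdev(values) / sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return centre - margin, centre + margin

# Seizure counts from the earlier example (15 patients).
seizures = [4, 2, 1, 1, 13, 1, 9, 2, 1, 1, 2, 3, 7, 5, 8]

for level in (0.90, 0.95, 0.99):
    low, high = mean_confidence_interval(seizures, level)
    print(f"{level:.0%} CI: {low:.2f} to {high:.2f} (width {high - low:.2f})")
```

Running it shows the interval widening as the required confidence rises from 90% to 99%, while the point estimate (the mean of 4) stays the same.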

Basic statistics

Incidence and Prevalence

Incidence

In the last year there have been 24 new cases of colorectal carcinoma in your practice (list size 10276). What is the incidence of colorectal carcinoma?

Incidence = 24 / 10276 = 0.00234 per year

Incidence per 1000 = (24 / 10276) x 1000 = 2.34 per 1000 per year

Incidence

Number of new cases diagnosed in a population per unit of time

Incidence rate = (number of new cases diagnosed in a given period of time / population size) x 100,000 (or 1000 etc.)
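
The incidence formula can be written as a small Python function. This is a minimal sketch: the function name and the `per` parameter are illustrative choices, and the figures are those of the colorectal carcinoma example (24 new cases in one year, list size 10276).

```python
def incidence_rate(new_cases, population, per=1000):
    """New cases in the period divided by population size, scaled to `per` people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return new_cases / population * per

# 24 new cases in one year in a practice with a list size of 10276.
rate = incidence_rate(24, 10276, per=1000)
print(f"{rate:.2f} per 1000 per year")  # prints "2.34 per 1000 per year"
```

Scaling the unrounded ratio gives about 2.34 per 1000; rounding the ratio to 0.0023 before multiplying would give 2.30 instead, so it is safer to round only at the end.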

Prevalence

2593 of the 8725 patients registered with your practice have a BMI of 30 or more. What is the prevalence of obesity?

Prevalence = (2593 / 8725) x 100 = 29.7%

Prevalence = (2593 / 8725) x 1000 = 297.2 per 1000

Prevalence

Total number of cases per population at a particular point in time (e.g. number per 100,000 population)

Prevalence rate = (number of cases in population / total size of population) x 100,000 (or 1000 etc.)

Prevalence ≈ incidence x average duration of condition (an approximation that holds when incidence and duration are stable over time)
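
The prevalence rate calculation can be sketched the same way in Python. The function name and `per` parameter are illustrative, and the figures are those of the obesity example (2593 of 8725 registered patients).

```python
def prevalence_rate(existing_cases, population, per=100):
    """Existing cases at a point in time divided by population size, scaled to `per` people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return existing_cases / population * per

# 2593 of 8725 registered patients have a BMI of 30 or more.
as_percent = prevalence_rate(2593, 8725, per=100)
per_thousand = prevalence_rate(2593, 8725, per=1000)

print(f"{as_percent:.1f}%")             # prints "29.7%"
print(f"{per_thousand:.1f} per 1000")   # prints "297.2 per 1000"
```

Unlike incidence, the numerator here counts everyone who has the condition at that point in time, not only the newly diagnosed cases.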

Relationship between incidence and prevalence

Basic statistics

Screening Test Statistics

Screening tests

A blood test to help diagnose cervical cancer has been developed. A study is done on 1000 patients comparing this test to the standard technique.

                     Cervical cancer present     Cervical cancer absent
New test positive    100 True positives (TP)     50 False positives (FP)
New test negative    10 False negatives (FN)     840 True negatives (TN)

Positive and negative predictive values

Positive predictive value (PPV): proportion of people who test positive who actually have the disease

PPV = TP / (TP + FP)

Negative predictive value (NPV): proportion of people who test negative who really don’t have the disease

NPV = TN / (TN + FN)

Give an indication of the reliability of a positive or negative test result

PPV and NPV

PPV = 100 / (100+50) = 0.67 = 67%

NPV = 840 / (840+10) = 0.99 = 99%


The higher the PPV, the more likely it is that a patient with a positive test result does have the disease

The higher the NPV, the more likely it is that someone who has tested negative really doesn’t have the disease
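
The PPV and NPV calculations for the cervical cancer example can be reproduced with a short Python sketch. The function name and keyword arguments are illustrative; the counts are those of the study of 1000 patients (TP = 100, FP = 50, FN = 10, TN = 840).

```python
def predictive_values(tp, fp, fn, tn):
    """Return (PPV, NPV) from the four cells of a 2x2 screening table."""
    ppv = tp / (tp + fp)  # of everyone who tested positive, the share with the disease
    npv = tn / (tn + fn)  # of everyone who tested negative, the share without the disease
    return ppv, npv

ppv, npv = predictive_values(tp=100, fp=50, fn=10, tn=840)
print(f"PPV = {ppv:.0%}")  # prints "PPV = 67%"
print(f"NPV = {npv:.0%}")  # prints "NPV = 99%"
```

Both values are read along the rows of the table (all positive results, or all negative results), so they depend on how common the disease is in the group being tested.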

Sensitivity and specificity

Sensitivity: the proportion of people with a disease who are detected by the test (proportion of positives found)

Sensitivity = TP / (TP + FN)

Specificity: the proportion of people who don’t have a disease who test negative (proportion of negatives found)

Specificity = TN / (TN + FP)

Indicate the proportion of people with the disease who will be correctly detected by the test, and the proportion of people without it who will be correctly ruled out

Sensitivity and specificity

Sensitivity = 100 / (100+10) = 0.91 = 91%

Specificity = 840 / (840+50) = 0.94 = 94%


High sensitivity = few missed diagnoses

High specificity = few false positives
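
Sensitivity and specificity for the same study can be sketched in Python as well. As before, the function name is illustrative and the counts are those of the cervical cancer example (TP = 100, FP = 50, FN = 10, TN = 840).

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from the four cells of a 2x2 screening table."""
    sensitivity = tp / (tp + fn)  # of everyone with the disease, the share who test positive
    specificity = tn / (tn + fp)  # of everyone without the disease, the share who test negative
    return sensitivity, specificity

sens, spec = sensitivity_specificity(tp=100, fp=50, fn=10, tn=840)
print(f"Sensitivity = {sens:.0%}")  # prints "Sensitivity = 91%"
print(f"Specificity = {spec:.0%}")  # prints "Specificity = 94%"
```

These are read down the columns of the table (all people with the disease, or all people without it), which is why they describe the test itself, whereas PPV and NPV describe the meaning of an individual result.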