Statistics introduction
STATISTICS IN MEDICINE
What is statistics?
Statistics involves data or information, eg.
• Information regarding a country
• Information about weather
• Information on health status
• Information about a patient
• Information on normal subjects
WHO
Unesco
CDC
What is statistics?
Statistics is the science dealing with the collection, analysis, presentation & interpretation of data
When is statistics useful in day-to-day life?
• weather report
• in a shop
• in an examination
• in match making
• in lottery
Weather report
• rain will be expected in the central area
• rest of the country most likely will be dry
Likelihood, chance
Colombo weather
In a shop
• what will you buy?
• how to select?
• which one is good?
• how much is it worth?
Previous knowledge
Banana
In an examination
• what is the pass mark?
• who has sound knowledge & who has poor knowledge?
• half knowledge, should you pass?
Performance, cut off point
In match making
• in matching two people
• previously known details
• details of the family
• based on these a prediction is made
• does it always work ???
Prediction
In lottery
• choice between number to win and number to lose
• previous experience
• guesswork
• prediction
Summary
• Likelihood
• Chance
• Previous knowledge
• Cut off point
• Guesswork
• Prediction
• Probability
Statistics
When is statistics useful in medicine?
Decision making on diagnosis, eg. a patient with headache
• do all signs and symptoms fit into already known diseases?
• previous knowledge, experience from previous patients
• guesswork
• does it work ???
When is statistics useful in medicine?
Decision making on treatment, eg. a patient with headache
• which drug is to be given among analgesics A, B, C?
• previous knowledge, experience of using them
• guesswork
• does it work ???
When is statistics useful in medicine?
Acquiring new knowledge, eg. how does blood flow against gravity?
• planning a study
• doing the study
• arriving at conclusions
When is statistics useful in medicine?
Surveys of diseases in a population, eg. how far malaria has spread
• planning a study
• doing the study
• arriving at conclusions
Summary
Statistics is useful in medicine in different areas
Research studies use statistics in data analysis
Variables
Task: write 3 variables and 1 constant found in the human body, eg. height as a variable
• almost all biological features show variation
• it is extremely difficult to find a feature which does not vary ???
Variables
What is the basis of this ‘variation’?
fundamentally genetic, but environmental factors are always important
Variables
• height
• weight
• blood pressure
• pulse rate
• body temperature
• size of a swelling
• social status: income
Variables
Response to a question
• Do you like to treat a patient with AIDS? Y / N / undecided
• Do you agree with what I say in this lecture? Y / N / ?
Attitudes - Opinions - What we feel
Types of variables
a) nominal
b) ordinal
c) interval
d) ratio
Nominal variables
• Qualitative classification
• Distinct categories
• No ranking
eg: gender, race, color, city
males
females
Ordinal variables
Qualitative classification
Categories have order or rank
eg. Socioeconomic status
• Upper
• Middle (upper & lower)
• Lower
Interval & ratio variables
• Quantitative classification
• Interval variable has no absolute zero, eg. temperature in °C
• Ratio variable has absolute zero, eg. Kelvin temperature, time, space
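To illustrate why the absolute zero matters, here is a minimal Python sketch. The two temperature readings are hypothetical values chosen only for illustration. Differences between readings mean the same on both scales, but a ratio is only meaningful on the Kelvin (ratio) scale.

```python
# Hypothetical temperature readings, chosen only for illustration.
ZERO_CELSIUS_IN_KELVIN = 273.15


def celsius_to_kelvin(t_celsius):
    """Convert a Celsius reading (interval scale) to Kelvin (ratio scale)."""
    return t_celsius + ZERO_CELSIUS_IN_KELVIN


cool_c, warm_c = 10.0, 20.0
cool_k, warm_k = celsius_to_kelvin(cool_c), celsius_to_kelvin(warm_c)

# Differences are meaningful on both scales and agree with each other.
diff_c = warm_c - cool_c
diff_k = warm_k - cool_k

# Ratios are only meaningful on the ratio scale (Kelvin).
naive_ratio_c = warm_c / cool_c  # 2.0, yet 20 °C is not "twice as hot" as 10 °C
ratio_k = warm_k / cool_k        # only slightly above 1

print(diff_c, round(diff_k, 2))
print(naive_ratio_c, round(ratio_k, 3))
```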
Methods of collecting data
• questionnaire
• interview
• measurement using special instrument
• BHT studies (Bed Head Ticket)
• Postal surveys
Different types of sampling
• Simple random sample
• Systematic sampling
• Stratified sampling
• Cluster sampling
Population Sample
Simple random sample
• each subject in the population has an equal chance of getting selected to the sample
• each subject is given a number
• subjects are selected using random numbers
• use random tables or computer generated random numbers
Random table
20 17 42 28 31 17 59 66 38 61 03 51 10 55 92 52 44 25 88
74 49 04 03 08 33 53 70 11 54 48 94 60 49 57 38 65 15 40
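The selection with computer generated random numbers could be sketched as follows in Python. The function name `simple_random_sample`, the population size of 99 and the sample size of 10 are assumptions made for illustration, not values from the lecture.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Draw a simple random sample of subject numbers without replacement.

    Subjects are assumed to be numbered 1..population_size, so every
    subject has the same chance of being selected.
    """
    rng = random.Random(seed)  # the seed only makes the illustration repeatable
    subject_numbers = range(1, population_size + 1)
    return sorted(rng.sample(subject_numbers, sample_size))


# Hypothetical sizes: 99 numbered subjects, 10 of them selected.
selected = simple_random_sample(population_size=99, sample_size=10, seed=1)
print(selected)
```

Sampling without replacement means no subject can be picked twice, which matches giving each subject a single number.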
Non-random sample
• this is a biased sample
• certain subjects have more probability of getting selected to the sample
• in certain situations randomisation is not possible, either due to practical difficulties or difficulty in finding subjects
Statistical concepts
In order to arrive at conclusions data are analysed
Conclusions are based on concepts just like geometric theorems
Descriptive statistics: background information
Inferential statistics: hypothesis testing
Central Tendency
A single value representing a dataset, eg. pulse rate
Measures of Central Tendency
• Mean ($\bar{x}$): average, $\bar{x} = \frac{\sum x}{n}$
• Mode: commonest value or the most frequent value
• Median: central value
Example
In a dataset 1 2 2 3 3 3 4 4 5
• mean = 3
• mode = 3
• median = 3
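The three measures for this dataset can be checked with a short Python sketch using the standard `statistics` module (the variable names are chosen here for illustration).

```python
import statistics

data = [1, 2, 2, 3, 3, 3, 4, 4, 5]

mean_value = statistics.mean(data)      # sum of the values divided by n
mode_value = statistics.mode(data)      # most frequent value
median_value = statistics.median(data)  # central value of the sorted data

print(mean_value, mode_value, median_value)
```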
Variation
• Central value gives only the representative figure
• variation of data set is not shown
Measures of Variation
• Range: from minimum to maximum
• Charts or graphs: bar chart, histogram
Bar chart
[Figure: bar chart of patients with headache; frequency (0 to 35) for males and females]
generally used for categorical variables
Pie chart
[Figure: pie chart of patients with headache; males and females]
Multiple bar chart
[Figure: multiple bar chart of average income in 3 years (1986, 1987, 1988) for male and female; vertical axis: average income, 0 to 250]
Histogram
[Figure: histogram of age of employee (23.0 to 38.0) against frequency (0 to 50); Std. Dev = 3.35, Mean = 29.4, N = 292]
for continuous variables
Frequency distribution curve
Standard deviation (SD)
• This is an accurate measure of variability
• Calculated for continuous data
• Based on deviations of data from the mean
Standard deviation (SD)
If the data set is 1, 2, 2, 3, 3, 3, 4, 4, 5
• mean will be 3
• deviations are calculated: (3-1), (3-2), (3-2), (3-3), (3-3), (3-3), (3-4), (3-4), (3-5) = 2, 1, 1, 0, 0, 0, -1, -1, -2
• deviations are squared: 4, 1, 1, 0, 0, 0, 1, 1, 4
• mean of squared deviations is calculated = 12/9
• square root is taken = root of 12/9 ≈ 1.15
Standard deviation (SD)
data   deviation    square of deviation
1      3-1 = 2      4
2      3-2 = 1      1
2      3-2 = 1      1
3      3-3 = 0      0
3      3-3 = 0      0
3      3-3 = 0      0
4      3-4 = -1     1
4      3-4 = -1     1
5      3-5 = -2     4
mean of squared deviations = 12/9
variance = 12/9
SD = √(12/9) ≈ 1.15
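The same steps can be traced in a minimal Python sketch. It follows the slide in dividing by n (9 values), and the comparison with `statistics.pstdev`, which also divides by n, is added here only as a cross-check.

```python
import math
import statistics

data = [1, 2, 2, 3, 3, 3, 4, 4, 5]

mean = sum(data) / len(data)            # 3.0
deviations = [mean - x for x in data]   # 2, 1, 1, 0, 0, 0, -1, -1, -2
squared = [d ** 2 for d in deviations]  # 4, 1, 1, 0, 0, 0, 1, 1, 4
variance = sum(squared) / len(data)     # 12/9
sd = math.sqrt(variance)                # root of 12/9

print(round(variance, 3), round(sd, 3))
# statistics.pstdev divides by n as well, so it should agree with sd
print(round(statistics.pstdev(data), 3))
```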
Standard deviation (SD)
In the previous example of 292 subjects
• mean = 29.41 yrs
• SD = 3.35 yrs
• minimum = 23.00 yrs
• maximum = 37.83 yrs
(n = 292)
Coefficient of variation (COV)
$\mathrm{COV} = \frac{\mathrm{SD}}{\mathrm{mean}} \times 100\,\%$
• COV has no units
• Variations between two variables could be compared using COV (eg. blood pressure and pulse rate)
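As a worked illustration, the formula can be applied to the age example of 292 subjects (mean 29.41 yrs, SD 3.35 yrs). The function name `coefficient_of_variation` is an assumption made for this sketch.

```python
def coefficient_of_variation(sd, mean):
    """Return the coefficient of variation as a percentage (SD / mean x 100)."""
    return sd / mean * 100


# Values from the age example of 292 subjects.
age_cov = coefficient_of_variation(sd=3.35, mean=29.41)
print(round(age_cov, 2))
```

Because the years cancel out in the division, the result is a pure percentage, which is what allows variables measured in different units to be compared.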
Normal distribution
• generally continuous variables in the human body, such as height and weight, have a definite pattern
• it is a symmetrical, bell-shaped curve
• mean, mode and median are similar
• this type of distribution is called “normal distribution”
Normal distribution
[Figure: histogram of VAR00001 (1.0 to 9.0) against frequency (0 to 12); Std. Dev = 2.02, Mean = 5.0, N = 50]
• symmetrical
• bell-shaped
• mean, mode, median similar
• also called a Gaussian curve
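These properties can be checked on an idealised curve with Python's `statistics.NormalDist` (Python 3.8 or later). Using the mean (5.0) and SD (2.02) reported for the histogram, and treating those 50 values as exactly normal, is an assumption made only for this sketch.

```python
from statistics import NormalDist

# Idealised normal curve using the mean and SD reported for the histogram;
# treating the data as exactly normal is an assumption.
curve = NormalDist(mu=5.0, sigma=2.02)

# For a normal distribution the mean, median and mode coincide.
print(curve.mean, curve.median, curve.mode)

# Symmetry: the curve has the same height at equal distances
# on either side of the mean.
left_height = curve.pdf(curve.mean - 1.5)
right_height = curve.pdf(curve.mean + 1.5)
print(round(left_height, 4), round(right_height, 4))
```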
Normal distribution
[Figure: normal distribution curve; labels: mean, mode, median]