Sampling and power analysis in the High Resolution studies Pamela Minicozzi Descriptive Studies and Health Planning Unit, Department of Preventive and Predictive Medicine, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan

Page 1

Sampling and power analysis in the High Resolution studies

Pamela Minicozzi

Descriptive Studies and Health Planning Unit, Department of Preventive and Predictive Medicine, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan

Page 2

High Resolution studies collected detailed data from patients' clinical records, so that the influence of non-routinely collected factors (tumour molecular characteristics, diagnostic investigations, treatment, relapse) on survival, and differences in standard care, could be analysed.

Page 3

Problem

In each country, the population of incident cases for a particular cancer consists of N subjects.

N is large (so rare cancers are not considered here).

Since N is large, not all cases can be investigated.

Solution

Use a representative sample to derive valid conclusions that are applicable to the entire original population.

Page 4

Two questions

1) What kind of probability sampling should we use?

2) What sample size should we use?

Page 5

Sampling

Page 6

Previous High Resolution studies

Samples were representative of:

- 1-year incidence
- a time interval (e.g. 6 months) within the study period, provided that incidence was complete
- an administratively defined area covered by cancer registration

Page 7

Present High Resolution studies

We want to eliminate variations in types of sampling between countries and within a single country.

This implies more sophisticated sampling.

Main types of probability sampling

Page 8

Simple random sampling

- assign a unique number to each element of the study population
- determine the sample size
- randomly select the population elements using a table of random numbers or a list of numbers generated randomly by a computer

Advantage:
- auxiliary information on subjects is not required

Disadvantage:
- if subgroups of the population are of particular interest, they may not be included in sufficient numbers in the sample
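The three steps of simple random sampling can be sketched in a few lines of Python. This is only an illustration: the population size (5000), the sample size (500), the seed and the function name `simple_random_sample` are assumptions made for the example, not values from the studies.

```python
import random


def simple_random_sample(case_ids, sample_size, seed=None):
    """Select `sample_size` distinct cases; every case has the same chance."""
    case_ids = list(case_ids)
    if not 0 <= sample_size <= len(case_ids):
        raise ValueError("sample size must be between 0 and the population size")
    rng = random.Random(seed)  # computer-generated random numbers
    return rng.sample(case_ids, sample_size)  # sampling without replacement


# Hypothetical population: N = 5000 incident cases numbered 1..N.
population = range(1, 5001)
sample = simple_random_sample(population, sample_size=500, seed=2024)
```

Fixing the seed makes the selection reproducible, which can help when the sampling has to be documented or repeated by a registry.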

Page 9

Stratified sampling

- identify stratification variable(s) and determine the number of strata to be used (e.g. day and month of birth, year of diagnosis, cancer registry, etc.)
- divide the population into strata and determine the sample size of each stratum
- randomly select the population elements in each stratum

Advantage:
- a more representative sample is obtained

Disadvantage:
- requires information on the proportion of the total population belonging to each stratum
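A minimal sketch of stratified sampling follows, assuming proportional allocation (the same sampling fraction in every stratum) and cancer registry as the stratification variable. The registry labels, stratum sizes, sampling fraction and the function name `stratified_sample` are hypothetical; other allocation rules are equally possible.

```python
import random
from collections import defaultdict


def stratified_sample(cases, stratum_of, sampling_fraction, seed=None):
    """Draw a simple random sample of the same fraction within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for case in cases:  # divide the population into strata
        strata[stratum_of(case)].append(case)
    sample = []
    for label in sorted(strata):  # fixed order keeps the draw reproducible
        members = strata[label]
        # Proportional allocation, with at least one case per stratum.
        n_stratum = max(1, round(sampling_fraction * len(members)))
        sample.extend(rng.sample(members, n_stratum))
    return sample


# Hypothetical population: three registries of different size.
sizes = {"A": 3000, "B": 1500, "C": 500}
cases = [{"id": f"{reg}{i}", "registry": reg}
         for reg, size in sizes.items() for i in range(size)]
sample = stratified_sample(cases, lambda c: c["registry"], 0.10, seed=1)
```

The stratum sizes must be known before sampling, which is the disadvantage named on the slide.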

Page 10

Systematic sampling

- determine the sample size (n); thus the sampling interval "i" is N/n
- randomly select a number "r" from 1 to "i"
- select all the other subjects in the following positions: r, r + i, r + 2*i, etc., until the sample is complete

Advantage:
- eliminates the possibility of autocorrelation

Disadvantage:
- only the first element is selected on a probability basis (pseudo-random sampling)
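The procedure can be sketched as follows, assuming the sampling interval is the population size divided by the sample size (rounded down when N is not a multiple of n) and that the cases are held in a fixed order. The population, sample size and function name `systematic_sample` are illustrative assumptions.

```python
import random


def systematic_sample(ordered_cases, sample_size, seed=None):
    """Pick every i-th case after a random start r, with i = N // n."""
    ordered_cases = list(ordered_cases)
    population_size = len(ordered_cases)
    if not 0 < sample_size <= population_size:
        raise ValueError("sample size must be between 1 and the population size")
    interval = population_size // sample_size  # sampling interval i
    rng = random.Random(seed)
    start = rng.randint(1, interval)  # random start r in 1..i (1-based)
    # Positions r, r + i, r + 2*i, ... until n cases are selected.
    positions = [start + k * interval for k in range(sample_size)]
    return [ordered_cases[p - 1] for p in positions]


# Hypothetical ordered list of N = 5000 cases; n = 500 gives i = 10.
population = list(range(1, 5001))
sample = systematic_sample(population, sample_size=500, seed=7)
```

Only `start` is random; once it is drawn, the rest of the sample is fully determined, which is why the slide calls the method pseudo-random.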

Page 11

How many subjects do we need?

Page 12

The main elements

- Previous pilot studies: they explored whether some variables can be measured with sufficient precision (or are available) and checked the study vision
- Power: the probability that the difference will be detected (e.g. 80%, 90%)
- Significance level: the probability that a positive finding is due to chance alone (e.g. 1%, 5%)

Page 13

Previous High Resolution studies

Number of patients was defined based on:

- observed differences in survival and risk of death
- incidence of the cancer under study
- difficulties in collecting clinical information
- available economic resources

Notwithstanding that, we were able to identify statistically significant relative excess risks of death

- up to 1.60 among European countries
- up to 1.40 among Italian areas

for breast cancer, for which differences in survival are small.

This is applicable to other cancers, for which survival differences are larger.

Page 14

Example for breast cancer (diagnosis 1995-99)

Plot of power as a function of the hazard ratio for a two-sided log-rank test at the 5% significance level, for sample sizes ranging from 100 to 1000, with 80% power as the target.

Assume 75% survival as reference (the overall survival in Europe; range: 65-90%).

[Figure: power curves; value marked on the plot: 45%]
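Curves of this kind can be approximated with Schoenfeld's formula for the log-rank test. The sketch below is not the calculation behind the slides: it assumes two groups of equal size, proportional hazards, complete follow-up to the time at which the reference survival is measured, and no other censoring. The function name `logrank_power` and the hazard ratio of 1.45 used in the usage lines are illustrative choices.

```python
from math import log, sqrt
from statistics import NormalDist


def logrank_power(n_total, hazard_ratio, ref_survival, alpha=0.05):
    """Approximate power of a two-sided log-rank test (Schoenfeld formula)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    # Under proportional hazards, survival in the comparison group is S0 ** HR.
    other_survival = ref_survival ** hazard_ratio
    # Expected deaths with n_total / 2 subjects in each group.
    expected_events = (n_total / 2) * ((1 - ref_survival) + (1 - other_survival))
    z = sqrt(expected_events) / 2 * abs(log(hazard_ratio)) - z_alpha
    return std_normal.cdf(z)


# Power over the sample sizes on the slide, for 75% reference survival.
sample_sizes = range(100, 1001, 100)
power_by_size = {n: logrank_power(n, 1.45, 0.75) for n in sample_sizes}
```

Changing `ref_survival` to 0.50 or 0.10 gives the corresponding sketch for cancers with poorer survival: more deaths are expected for the same number of patients, so smaller hazard ratios become detectable.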

Page 15

Example for colorectal cancer (diagnosis 1995-99)

Plot of power as a function of the hazard ratio for a two-sided log-rank test at the 5% significance level, for sample sizes ranging from 100 to 1000, with 80% power as the target.

Assume 50% survival as reference (the overall survival in Europe; range: 30-70%).

[Figure: power curves; value marked on the plot: 32%]

Page 16

Example for lung cancer (diagnosis 1995-99)

Plot of power as a function of the hazard ratio for a two-sided log-rank test at the 5% significance level, for sample sizes ranging from 100 to 1000, with 80% power as the target.

Assume 10% survival as reference (the overall survival in Europe; range: 5-20%).

[Figure: power curves; value marked on the plot: 30%]

Page 17

Present High Resolution studies

We want to analyse both differences in survival and adherence to standard care.

Power analysis for both:

- logistic regression analysis (to analyse the odds of receiving one type of care, typically standard care)
- relative survival analysis (to analyse differences in relative survival and relative excess risks of death)
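For the adherence aim, a rough first approximation of power can be obtained by treating the comparison as a two-sided test of two proportions, with the comparison proportion derived from an odds ratio. This is a simplification and not a full power analysis for logistic regression: it ignores covariates and assumes two groups of equal size. The function name `adherence_power`, the 60% reference proportion and the odds ratio of 1.5 are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def adherence_power(n_per_group, odds_ratio, ref_proportion, alpha=0.05):
    """Approximate power to detect an odds ratio between two groups."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    # Convert the odds ratio into the proportion expected in the other group.
    ref_odds = ref_proportion / (1 - ref_proportion)
    other_odds = odds_ratio * ref_odds
    other_proportion = other_odds / (1 + other_odds)
    # Normal approximation for the difference between two proportions.
    std_error = sqrt(ref_proportion * (1 - ref_proportion) / n_per_group
                     + other_proportion * (1 - other_proportion) / n_per_group)
    z = abs(other_proportion - ref_proportion) / std_error - z_alpha
    return std_normal.cdf(z)


# Hypothetical case: 60% of patients receive standard care in the reference area.
power_small = adherence_power(100, 1.5, 0.60)
power_large = adherence_power(500, 1.5, 0.60)
```

Running both this and a survival power calculation for the same candidate sample size shows which of the two aims is the more demanding one.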

Page 18

Conclusions

Taking into account:

- existing sampling and power methodology
- experience from previous studies
- different coverage of Cancer Registries
- available economic resources

We want to:

- standardize the selection of data
- include a minimum number of cases that satisfies statistical considerations related to all aims of our studies

Prof. JS Long [1] (Regression Models for Categorical and Limited Dependent Variables, 1997) suggests that sample sizes of less than 100 cases should be avoided and that 500 observations should be adequate for almost any situation.

[1] Professor of Sociology and Statistics at Indiana University
