Parallelism in practice USP Bioassay Workshop August 2010

Parallelism in practice

USP Bioassay Workshop, August 2010
Ann Yellowlees
Kelly Fleetwood
Quantics Consulting Limited

Contents
What is parallelism?
Approaches to assessing parallelism: significance; equivalence
Experience
Discussion

Setting the scene: Relative Potency

RP: ratio of concentrations of reference and sample materials required to achieve the same effect

RP = Cref / Csamp

Parallelism
One curve is a horizontal shift of the other

These are parallel or similar curves

Finney: a prerequisite of all dilution assays
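To make the horizontal-shift picture concrete, here is a minimal sketch (Python/NumPy; the 4PL curve shape and all parameter values are assumptions for illustration, not the presentation's data). A true relative potency RP moves the reference curve along the log-concentration axis by log(RP) without changing its shape.

```python
import numpy as np

def four_pl(log_c, lower, upper, slope, log_ec50):
    """A standard four-parameter logistic curve in log concentration."""
    return lower + (upper - lower) / (1.0 + np.exp(-slope * (log_c - log_ec50)))

rp = 0.5                        # sample needs twice the concentration: RP = Cref/Csamp = 0.5
log_c = np.linspace(-3, 3, 13)
reference = four_pl(log_c, 0.1, 2.0, 1.2, 0.0)
# the sample responds as the reference would at concentration RP * C,
# i.e. its curve is the reference curve shifted along log(C) by -log(RP):
sample = four_pl(log_c + np.log(rp), 0.1, 2.0, 1.2, 0.0)
```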

Real data: continuous response

Linear model (4 concentrations)

Parallel when the slopes are equal
Linear: Y = a + b log(C)
NB the range: which concentrations?
Do we care about the asymptotes?
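In the linear, parallel-line case the log relative potency is simply the horizontal separation of the two fitted lines. A minimal sketch, assuming a common-slope least-squares fit and made-up responses (not the workshop data):

```python
import numpy as np

# log10 concentrations and responses for reference and sample (made-up numbers)
log_c  = np.array([0.0, 0.5, 1.0, 1.5])
y_ref  = np.array([1.0, 2.1, 2.9, 4.0])
y_samp = np.array([0.4, 1.5, 2.6, 3.5])

# Restricted (parallel) model: separate intercepts, one common slope b.
# Design matrix columns: [reference intercept, sample intercept, common slope]
X = np.zeros((8, 3))
X[:4, 0] = 1.0
X[4:, 1] = 1.0
X[:, 2] = np.concatenate([log_c, log_c])
y = np.concatenate([y_ref, y_samp])
a_ref, a_samp, b = np.linalg.lstsq(X, y, rcond=None)[0]

# The same response is reached when a_ref + b*log(Cref) = a_samp + b*log(Csamp),
# so log10(RP) = log10(Cref / Csamp) = (a_samp - a_ref) / b.
log_rp = (a_samp - a_ref) / b
print(f"log10 RP = {log_rp:.3f}, RP = {10 ** log_rp:.3f}")
```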

Four parameter logistic model

4PL: Y = δ + (α - δ) / [1 + exp(β(γ - log C))]
(one standard parameterisation: δ, α the asymptotes; β the slope; γ the log EC50)
Parallel when the asymptotes and slope are equal
NB symmetry

A looks the same; B does not look the same
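As a rough illustration of fitting a 4PL curve to one preparation, here is a sketch using SciPy's curve_fit on simulated optical densities (the presentation's own fits were weighted 4PL fits in R; the parameterisation and data below are assumptions):

```python
import numpy as np
from scipy.optimize import curve_fit

def four_pl(log_c, lower, upper, slope, log_ec50):
    """Symmetric four-parameter logistic in log concentration."""
    return lower + (upper - lower) / (1.0 + np.exp(-slope * (log_c - log_ec50)))

# simulated optical densities: 8 dilutions, 2 wells per dilution
log_c = np.repeat(np.linspace(-3, 4, 8), 2)
rng = np.random.default_rng(1)
od = four_pl(log_c, 0.1, 2.0, 1.3, 0.5) + rng.normal(0, 0.05, log_c.size)

popt, pcov = curve_fit(four_pl, log_c, od, p0=[0.0, 2.0, 1.0, 0.0])
lower, upper, slope, log_ec50 = popt
print(dict(lower=lower, upper=upper, slope=slope, log_ec50=log_ec50))
```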

Five parameter logistic model

Parallel when the asymptotes, slope and asymmetry are equal

5PL: Y = δ + (α - δ) / [1 + exp(β(γ - log C))]^g
(g: asymmetry parameter)
A: same; B: not the same (slope not the same)
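The 5PL only adds an asymmetry exponent to the 4PL sketched above; in the same assumed parameterisation:

```python
import numpy as np

def five_pl(log_c, lower, upper, slope, log_c50, g):
    """Five-parameter logistic: the extra exponent g allows asymmetry about the midpoint."""
    return lower + (upper - lower) / (1.0 + np.exp(-slope * (log_c - log_c50))) ** g
```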

Tests for parallelism

Approach 1: Is there evidence that the reference and test curves ARE NOT parallel?

Compare unrestricted vs restricted models: test the loss of fit when the model is restricted to be parallel. p-value approaches (a sketch of the F-test version follows this list):

Traditional F-test approach, as preferred by the European Pharmacopoeia

Chi-squared test approach as recommended by Gottschalk & Dunn (2005)
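A minimal sketch of the extra-sum-of-squares F test behind Approach 1 (the RSS values and degrees of freedom below are placeholders, not the presentation's data): fit once with all parameters free and once constrained to parallel, and ask whether the extra residual sum of squares is larger than the residual noise can explain.

```python
from scipy import stats

def parallelism_f_test(rss_parallel, rss_nonparallel, df_extra, df_residual):
    """Extra-sum-of-squares F test: does forcing the curves to be parallel
    significantly worsen the fit?"""
    f_stat = ((rss_parallel - rss_nonparallel) / df_extra) / (rss_nonparallel / df_residual)
    p_value = stats.f.sf(f_stat, df_extra, df_residual)
    return f_stat, p_value

# e.g. 4PL: 3 constrained parameters (two asymptotes and the slope); placeholder residual df
f_stat, p = parallelism_f_test(rss_parallel=120.0, rss_nonparallel=100.0,
                               df_extra=3, df_residual=8)
print(f"F = {f_stat:.2f}, p = {p:.3f}")
```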

Approach 2: Is there evidence that the reference and test curves ARE parallel?

Equivalence test approach, as recommended in the draft USP guidance (Hauck et al 2005)
Fit the model allowing non-parallel curves
Confidence intervals on the differences between parameters

Pharmacopoeial disharmony exists!! (existed?)

In practice...

Four example data sets

Data set 1: 60 RP assays (96-well plates, OD: continuous)
Data set 2: 15 RP assays (96-well plates, OD: continuous)
Data set 3: 12 RP assays (96-well plates, OD: continuous)

Data set 4: 60 RP assays (in vivo, survival at day x: binary*)

* treated as such for this purpose; wasteful of data

In practice...

We have applied the proposed methods in the context of individual assay pass/fail (suitability):

Data set 1: compare the 2 significance approaches; compare equivalence with significance
Data sets 2, 3: compare the 2 significance approaches
Data set 4: compare the F test (EP) with equivalence (USP)

Data set 1
60 RP assays

8 dilutions, 2 independent wells per dilution

4PL a good fit (vs 5PL)

NB precision

Model: log_e OD vs log_e concentration. Average slope = 1/0.384 = 2.6
[Figure: weighted 4PL regression fit]

Data set 1: F test and chi-squared test
F test: straightforward

Chi-squared test: need to establish a mean-variance relationship (see the sketch below)

This is a data driven method!!! Very arbitrary

Establishing equivalence limits
Hauck paper: provisional capability-based limits can be set using reference vs reference assays
Not available in our dataset...
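A rough sketch of the kind of mean-variance modelling the Gottschalk & Dunn chi-squared approach requires (the power-law form and the replicate data below are assumptions, not the choices made for Data set 1): pool replicate wells, regress log variance on log mean, and use the fitted relationship to weight residuals before the weighted RSS comparison.

```python
import numpy as np

# replicate ODs per dilution group (made-up); in practice pooled across assays
groups = [np.array([0.21, 0.24]), np.array([0.55, 0.60]),
          np.array([1.10, 1.02]), np.array([1.80, 1.91])]

means = np.array([g.mean() for g in groups])
variances = np.array([g.var(ddof=1) for g in groups])

# assume a power-law mean-variance relationship: var = a * mean**b
b, log_a = np.polyfit(np.log(means), np.log(variances), 1)
a = np.exp(log_a)

def weight(mean_response):
    """1/variance weights implied by the fitted mean-variance relationship."""
    return 1.0 / (a * mean_response ** b)
```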

Data set 1: F test and chi-squared test

F test: 12/60 = 20% of assays have p < 0.05. Evidence of dissimilarity? Or a precise assay?

Chi-squared test: 58/60 = 97% of assays have p < 0.05! Intra-assay variability is low, so differences between the parallel and non-parallel models are exaggerated.

[Figures: histograms of F-test p-values and G&D p-values, followed by an example graph to illustrate why G&D behaves so poorly]
Intra-assay variability is low compared to the quality of the fit, so differences between curves are exaggerated. A poor choice of statistic.


Data set 1: Comparison of approaches to parallelism

Some evidence of hook in the model; residual SS inflated


NOTE HOOK

Data set 1: Comparison of approaches to parallelism

Excluding top 2 points because of the HOOK: approx 20/60 pass
Remodelled: quadratic relationship refitted

Data set 1: F test and chi-squared test

RSS_parallel = 159; RSS_non-parallel = 112; RSS_p - RSS_np = 47

Pr(χ²_3 > 47) < 0.01

F test: P = 0.03. Example where both tests fail.
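The quoted chi-squared probabilities can be checked directly: the weighted RSS difference is referred to a chi-squared distribution with 3 degrees of freedom (the slide's χ²_3), the number of parameters constrained to be equal. A quick check with SciPy (the F test would additionally need the residual degrees of freedom, which are not shown on the slide):

```python
from scipy.stats import chi2

# slide values: weighted RSS under the parallel and non-parallel 4PL fits
print(chi2.sf(159 - 112, df=3))      # Pr(chi2_3 > 47)  ~ 3e-10, well below 0.01
# the passing example on the next slide: an RSS difference of only 1.2
print(chi2.sf(100.2 - 99.0, df=3))   # Pr(chi2_3 > 1.2) ~ 0.75
```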

Data set 1: F test and chi-squared test

RSS_parallel = 100.2; RSS_non-parallel = 99.0; RSS_p - RSS_np = 1.2
Pr(χ²_3 > 1.2) = 0.75
Example where both tests PASS

Data set 1: USP methodology
Prove parallel
Lower asymptote:

Data set 1: USP methodology
Upper asymptote:

This is interesting: it demonstrates that it's not enough just to order the data and take the 2nd from the end as your limit. You need to examine it. Check for bias!

Data set 1: USP methodology
Scale:

Scale for reference: 0.384 (range 0.344 to 0.416)

NB scale = 1/slope

Data set 1: USP methodology
Criteria for the 90% CI on the difference between parameter values:

Lower asymptotes: (-0.235, 0.235)

Upper asymptotes: (-0.213, 0.213)

Scales: (-0.187, 0.187)

Applying the criteria: 3/60 = 5% of assays fail the parallelism criteria. No assay fails more than one criterion. (A sketch of the check for one parameter follows below.)

Scale parameter from the R parameterisation: allows log RP to be estimated as a1 - a2 (easy variance)
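A minimal sketch of the equivalence check for a single parameter (the difference estimate, its standard error and the residual degrees of freedom below are placeholders; only the limits above come from the presentation): compute the 90% confidence interval for the reference-minus-test difference and pass only if the whole interval lies inside the equivalence limits.

```python
from scipy.stats import t

def equivalence_pass(diff, se_diff, df_resid, limits, level=0.90):
    """Two-one-sided-tests style check: the 90% CI for the parameter
    difference must lie entirely within the equivalence limits."""
    half_width = t.ppf(1 - (1 - level) / 2, df_resid) * se_diff
    lo, hi = diff - half_width, diff + half_width
    return limits[0] < lo and hi < limits[1], (lo, hi)

# placeholder estimate of the lower-asymptote difference; limits from the slide
ok, ci = equivalence_pass(diff=0.05, se_diff=0.07, df_resid=8,
                          limits=(-0.235, 0.235))
print(ok, ci)
```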

Data set 1: Comparison of approaches to parallelism


Data set 1: Comparison of approaches to parallelism
This plate fails all 3 tests
USP: lower asymptote


FAILS ALL, whether or not the hook is included

Data set 1: Comparison of approaches to parallelism

Equivalence test: scales not equivalent
F test p-value = 0.60
Chi-squared test p-value < 0.001

F test passes: high variability

Data set 2: Comparison of approaches to parallelism

Constant variance

Data set 3: Comparison of approaches to parallelism

Linear fit for the mean-variance relationship
Again the G&D test suggests more assays FAIL

In practice...

Data set 4: compare the F test with equivalence
Methodology for the chi-squared test not developed for binary data

Data set 4
60 RP assays

4 dilutions, 15 animals per dilution

Actual model is a GLM (i.e. response 0/1 depending on survival); % survival shown for illustrative purposes only. Slopes: average = -2.41, range (-14.71, -1.03)
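A rough sketch of how the parallelism question can be posed for binary data like Data set 4 (Python/statsmodels with simulated survival data; the presentation's own GLM analysis was done in R/SAS and is not reproduced here): fit a logistic model with a common slope and one with separate slopes per preparation, then compare them with a likelihood-ratio (deviance) test, the binary-data analogue of the F test.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import chi2

# Simulated data: 4 dilutions, 15 animals per dilution, reference and test preparations.
rng = np.random.default_rng(0)
rows = []
for prep, shift in [("ref", 0.0), ("test", 0.4)]:          # shift = assumed log relative potency
    for log_dose in [0.0, 0.5, 1.0, 1.5]:
        p_surv = 1.0 / (1.0 + np.exp(2.0 * (log_dose + shift) - 2.0))  # survival falls with dose
        for surv in rng.binomial(1, p_surv, size=15):
            rows.append(dict(prep=prep, log_dose=log_dose, surv=surv))
df = pd.DataFrame(rows)

# Restricted model: common slope (parallel); full model: separate slope per preparation.
parallel = smf.logit("surv ~ C(prep) + log_dose", data=df).fit(disp=0)
free = smf.logit("surv ~ C(prep) * log_dose", data=df).fit(disp=0)

# Likelihood-ratio (deviance) test for non-parallelism: 1 extra parameter.
lr_stat = 2.0 * (free.llf - parallel.llf)
print("LR statistic:", lr_stat, "p-value:", chi2.sf(lr_stat, df=1))
```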

Data set 4: Comparison of approaches to parallelism

F test: 5/60 = 8% fail
Equivalence: 5% failing = 3 assays
Equivalence: could choose the limits to match

Data set 4: Comparison of approaches to parallelism

The F-test approach and the equivalence approach could be in agreement, depending on how the limits are set.

Broadly...

F test
Fails (?wrongly?) when the assay is very precise
Passes (?wrongly?) when noisy
Linear case: the p-value can be adjusted to match equivalence

Chi-squared
Fails when the assay is very precise (even if the difference is small)
If the model fits badly, the weighting inflates the RSS (e.g. hook)
2 further data sets supported this

USP
Limits are set such that the extreme 5% will fail
They do! Regardless of precision, model fit etc.

Stepping back

What are we trying to do?

Produce a biologic to a controlled standard that can be used in clinical practice

For a batch we need to know its potency, with appropriate precision, in order to calculate the clinical dose
(Perhaps add more information about precision to this)

Some thoughts

1. Establish a valid assay
Use all development assay results unless a physical reason exists to exclude them
Statistical methodology can be used to flag possible outliers for investigation
USP applies this to individual data points

Parallelism / similarity
Are the parameter differences fundamentally zero?
Or is there a consistent slope difference (e.g.)?
Equivalence approach + judgment for an acceptable margin
(Perhaps add more information about precision to this)

Some thoughts

2. Set the number of replicates to provide the required precision
Combine RP values plus confidence intervals for the reportable value

3. Per assay, use all results unless there is a physical reason not to (they are part of the continuum of assays)
Flag for investigation using statistical techniques: reference behaviour, parallelism

4. Monitor performance over time (SPC): reference stability, parallelism
(Perhaps add more information about precision to this)

Which parallelism test?
Our view:
The chi-squared test requires too many complex decisions and is very sensitive to the model

The F test is not generally applicable to the assay validation stage
It does not allow examination of the individual parameters
It does not lend itself to judgment about "How parallel is parallel?"

The equivalence test approach fits in all three contexts, with adjustment of the tolerance limits as appropriate

Thank you
USP, for the invitation

Clients, for the use of data
BioOutsource: www.biooutsource.com
Other clients who prefer to remain anonymous

Quantics staff, for analysis and graphics
Kelly Fleetwood (R), Catriona Keerie (SAS)
