USP_1033_Biological Assay Validation.pdf



BRIEFING 1 

<1033> Biological Assay Validation. General Chapter Biological Assay Validation 2 <1033> is a companion chapter to three other proposed USP chapters pertaining to 3 bioassay: Design and Development of Biological Assays <1032>; Analysis of Biological 4 Assays <1034>; and a “roadmap” chapter (as yet unnumbered) that will include a 5 glossary applicable to <1032>, <1033> and <1034>. Taken together, these General 6 Chapters update and expand on information provided in General Chapter <111>, which 7 will continue to be available with modifications. A draft of <1033> previously appeared in 8 Pharmacopeial Forum [2009;35(2):349–367]. This revision incorporates responses to 9 comments received regarding the chapter, as well as further sections to describe 10 bioassay validation. USP encourages input from all interested parties regarding <1033> 11 and its companion chapters. USP’s intent is to reflect the best contemporary thought 12 regarding bioassay validation. This will be achieved when the bioassay community 13 takes advantage of this opportunity to engage in the chapter’s development by 14 responding to the material in this In-Process Revision. Comments regarding this 15 material should be sent to Tina S. Morris, PhD ([email protected]). 16 

<1033> BIOLOGICAL ASSAY VALIDATION

INTRODUCTION

21 Biological assays (also called bioassays) are an integral part of the quality assessment 22 required for the manufacturing and marketing of many biological and some non-23 biological drug products. Bioassays commonly used for drug potency estimation can be 24 distinguished from chemical tests by their reliance on a biological substrate (e.g., 25 animals, living cells, or functional complexes of target receptors). Because of multiple 26 operational and biological factors arising from this reliance on biology, they typically 27 exhibit a greater variability than do chemically-based tests. 28 

Bioassays are one of several physicochemical and biologic tests with procedures and 29 acceptance criteria that control critical quality attributes (measurands) of a biological 30 drug product. As described in the ICH Guideline entitled Specifications: Test 31 Procedures And Acceptance Criteria For Biotechnological /Biological Products (Q6B), 32 section 2.1.2, bioassay techniques may measure an organism’s biological response to 33 the product; a biochemical or physiological response at the cellular level; enzymatic 34 reaction rates or biological responses induced by immunological interactions; or ligand- 35 and receptor-binding. As new biological drug products and new technologies emerge, 36 the scope of bioassay approaches is likely to expand. Therefore, General Chapter 37 Biological Assay Validation <1033> emphasizes validation approaches that provide 38 flexibility to adopt new bioassay methods, new biological drug products, or both in 39 conjunction for the assessment of drug potency. 40 


Good Manufacturing Practice requires that test methods used for assessing compliance 41 of pharmaceutical products should meet appropriate standards for accuracy and 42 reliability. Assay validation is the process of demonstrating and documenting that the 43 performance characteristics of the procedure and its underlying method meet the 44 requirements for the intended application and that the assay is thereby suitable for its 45 intended use. USP General Chapter Validation of Compendial Procedures <1225> 46 describes the assay performance characteristics that should be evaluated for 47 procedures supporting small-molecule pharmaceuticals and is broadly based on their 48 use for lot release, marketplace surveillance, and similar applications. Although 49 application of these validation parameters is straightforward for many types of analytical 50 procedures for well-characterized, chemically-based drug products, their interpretation 51 and applicability for some types of bioassays has not been clearly delineated. 52 

Assessment of bioassay performance is a continuous process, but bioassay validation 53 should be performed when development has been completed and the bioassay 54 standard operating procedure (SOP) has been approved. Bioassay validation is guided 55 by a validation protocol describing the goals and design of the validation study. General 56 Chapter <1033> provides validation goals pertaining to relative potency bioassays. 57 Relative potency bioassays are based on a comparison of bioassay responses for a 58 Test sample and a designated Standard, allowing a judgment regarding the activity of 59 the Test relative to that of the Standard. 60 

Validation parameters discussed include relative accuracy, specificity, intermediate 61 precision, and range. Laboratories may use dilutional linearity to verify the relative 62 accuracy and range of the method. Although robustness is not a requirement for 63 validation, General Chapter <1033> recommends that a bioassay’s robustness be 64 assessed either pre-validation or with additional experiments soon after the validation. 65 In addition, <1033> describes approaches for validation design (sample selection and 66 replication strategy), validation acceptance criteria, data analysis and interpretation, and 67 finally bioassay performance monitoring through quality control. Documentation of 68 bioassay validation results is also discussed, with reference to pre-validation 69 experiments performed to optimize bioassay performance. In the remainder of General 70 Chapter <1033> the term bioassay is used to indicate a relative potency bioassay. 71 

FUNDAMENTALS OF BIOASSAY VALIDATION

74 The goal of validation for a bioassay is to confirm the operating characteristics of the 75 procedure for its intended use. Multiple dilutions (concentrations) of one or more Test 76 samples and the Standard sample are included in a single bioassay. These dilutions are 77 herein termed a replicate set, which contains a single organism type, e.g., group of 78 animals or vessel of cells, at each dilution for each sample [Test(s) and Standard]. A run 79 consists of performing assay work during a period when the accuracy (trueness) and 80 precision in the assay system can reasonably be expected to be stable. In practice, a 81 run frequently consists of the work performed by a single analyst in one lab, with one 82 set of equipment, in a short period of time (typically a day). An assay is the body of data 83 used to assess similarity and estimate potency relative to Standard for each Test 84 


sample in the assay. A run may contain multiple assays, a single assay, or part of an 85 assay. Determining what will be an assay is an important strategic decision. The factors 86 involved in that decision are described in greater detail in General Chapter <1032> and 87 are assumed to be resolved by the time a bioassay is in validation. Multiple assays may 88 be combined to yield a reportable value for a sample. The reportable value is the value 89 that is compared to a product specification. 90 

In assays where the replication set involves groups at each dilution (e.g., 6 samples, 91 each at 10 dilutions, in the nonedge wells of each of several 96-well cell culture plates) 92 the groups (plates) constitute statistical blocks that should be elements in the assay and 93 validation analyses (blocks are discussed in <1032>). Within-block replicates for Test 94 samples are rarely cost-effective. Blocks will not be further discussed in this chapter; 95 more detailed discussion is found in <1032>. 96 

The amount of activity (potency) of the Standard sample is typically assigned 1.0 or 97 100%, and the potency of the Test sample is calculated by comparing the 98 concentration–response curves for the Test sample and Standard sample pair. This 99 results in a unitless measure, which is the relative potency of the Test sample in 100 reference to the potency of the Standard sample. In some cases the Standard sample is 101 assigned a value according to another property such as protein concentration. In that 102 case the potency of the Test sample is reported as specific activity, which is the relative 103 potency times the assigned value of the Standard sample. An assumption of parallel-104 line or parallel-curve (e.g., four-parameter logistic) bioassay is that the dose–response 105 curves that are generated using a Standard sample and a Test sample have similar 106 (parallel) curve shape distinguished only by a horizontal shift in the log dose. For slope-107 ratio bioassay, curves generated for Standard sample and Test article should be linear, 108 pass through a common intercept, and distinguished only by their slopes. Information 109 about how to assess parallelism is provided in General Chapters <1032> and <1034>. 110 

In order to establish the relative accuracy and range of the bioassay, Test samples may 111 be constructed for validation by a dilution series of the Standard sample to assess 112 dilutional linearity (linearity of the relationship between determined and constructed 113 relative potency). In addition, the validation study should be directed toward obtaining a 114 representative estimate of the variability of the relative potency determination. Although 115 robustness studies are usually performed during bioassay development, key factors in 116 these studies such as incubation time and temperature and, for cell-based bioassays, 117 cell passage number and cell number may be included in the validation. Because of 118 potential influences within a single laboratory on the bioassay from inter-run factors 119 such as multiple analysts, instruments, or reagent sources, the design of the bioassay 120 validation should include consideration of these factors. The variability of potency from 121 these combined elements defines the intermediate precision (IP) of the bioassay. An 122 appropriate study of the variability of potency, including the impact of intra-assay and 123 inter-run factors, can help the laboratory confirm an adequate testing strategy, thereby 124 forecasting the inherent variability of the reportable value (which may be the average of 125 multiple potency determinations). Variability estimates can also be utilized to establish 126 the sizes of differences (fold difference) that can be distinguished between samples 127 tested in the bioassay. 128 


Demonstrating specificity requires evidence of lack of interference (also known as 129 selectivity) from matrix components such as manufacturing process components or 130 degradation products so that measurements describe the target molecule only. Other 131 analytical methods may complement a bioassay in measuring or identifying other 132 components in a sample. 133 

Bioassay Validation Protocol

136 A bioassay validation protocol should include the number and types of samples that will 137 be studied in the validation; the study design, including inter-run and intra-run factors; 138 the replication strategy; the intended validation parameters and justified target 139 acceptance criteria for each parameter; and a proposed data-analysis plan. Note that in 140 regard to satisfying acceptance criteria, failure to find a statistically significant effect is 141 not an appropriate basis for defining acceptable performance in a bioassay; 142 conformance to acceptance criteria may be better evaluated using an equivalence 143 approach. 144 

In addition, assay (possibly run) and sample acceptance criteria such as system 145 suitability and similarity should be specified before performing the validation. Depending 146 on the extent of development of the bioassay, these may be proposed as tentative and 147 can be updated with data from the validation. Assay, run, or sample failures may be 148 reassessed in an appropriate fashion and, with sound justification, included in the 149 overall validation assessment. Additional validation trials may be required to support 150 changes to the method. 151 

The bioassay validation protocol should include target acceptance criteria for the 152 proposed validation parameters. Failure to meet a target acceptance criterion may 153 result in a limit on the range of potencies that can be measured in the bioassay or a 154 modification to the replication strategy in the bioassay procedure. 155 

Documentation of Bioassay Validation Results

158 Bioassay validation results should be documented in a bioassay validation report. The 159 validation report should support the conclusion that the method is fit for use or should 160 indicate corrective action (such as change in bioassay format) that will be undertaken to 161 generate reliable results. The report should include the raw data from the study 162 (bioassay response measurements such as titers, or instrument measurements such as 163 OD) and the results of statistical calculations. Results of intermediate measures should 164 be reported when they add clarity and facilitate reproduction of validation conclusions 165 (e.g., variance component estimates should be provided in addition to overall 166 intermediate precision). Estimates of validation parameters should be reported at each 167 level and overall as appropriate. Deviations from the validation protocol should be 168 documented with justification. The conclusions from the study should be clearly 169 described with references to follow-up action as necessary. Follow-up action can 170 


include amendment of system or sample suitability criteria or modification of the 171 bioassay replication strategy. Reference to prevalidation experiments may be included 172 as part of the validation study report. Prevalidation experiments may include robustness 173 experiments, where bioassay parameters have been identified and ranges have been 174 established for significant parameters, and also may include qualification experiments, 175 where the final procedure has been performed to confirm satisfactory performance in 176 routine operation. Conclusions from prevalidation and qualification experiments 177 performed during development contribute to the description of the operating 178 characteristics of the bioassay procedure. 179 

Bioassay Validation Design

182 The bioassay validation should include samples that are representative of materials that 183 will be tested in the bioassay and should effectively establish the performance 184 characteristics of the procedure. For relative accuracy, sample relative potency levels 185 that bracket the range of potencies that may be tested in the bioassay should be used. 186 Thus samples that span a wide range of potencies might be studied for a drug or 187 biological with a wide specification range or for a product that is inherently unstable, but 188 a narrower range can be used for a more durable product. A minimum of 3 potency 189 levels is required, and 5 are recommended for a reliable assessment. The potency 190 levels chosen will constitute the range of the bioassay if the criteria for relative accuracy 191 and IP are met during the validation. A limited range will result from levels that fail to 192 meet their target acceptance criteria. Samples may also be generated for the bioassay 193 validation by stressing a sample to a level that might be observed in routine practice 194 (i.e., stability investigations). Additionally, the influences of the sample matrix 195 (excipients, process constituents, or combination components) can be studied 196 strategically by intentionally varying these together with the target analyte using a 197 multifactorial approach. 198 

The bioassay validation design should consider all facets of the measurement process. 199 Sources of bioassay measurement variability include sampling, sample preparation, 200 intra-run, and inter-run factors. Representative estimation of bioassay variability 201 necessitates consideration of these factors. Validation samples should be randomly 202 selected from the sources from which test articles will be obtained during routine use, 203 and sample preparation should be performed independently during each validation run. 204 

The replication strategy used in the validation should reflect knowledge of the factors 205 that might influence the measurement of potency. Intra-run variability may be affected 206 by bioassay operating factors that are usually set during development (temperature, pH, 207 incubation times, etc.); by the bioassay design (number of animals, number of dilutions, 208 replicates per dilution, dilution spacing, etc); by the assay acceptance and sample 209 acceptance criteria; and by the statistical analysis (where the primary endpoints are the 210 similarity assessment for each sample and potency estimates for the reference 211 samples). Operating restrictions and bioassay design (intra- and inter-run formulae that 212 result in a reportable value for a test material) are usually specified during development 213 and may become a part of the bioassay operating procedure. IP is studied by 214 


independent runs of the procedure, perhaps using an experimental design that alters 215 those factors that may have an impact on the performance of the procedure. 216 Experiments [e.g., design of experiments (DOE) for treatment design] with nested or 217 crossed design structure can reveal important sources of variability in the procedure, as 218 well as ensure a representative estimate of long-term variability. During the validation it 219 is not necessary to employ the format required to achieve the reportable value for a 220 Test sample. A well-designed validation experiment that combines both intra-run and 221 inter-run sources of variability provides estimates of independent components of the 222 bioassay variability. These components can be used to verify or forecast the variability 223 of the bioassay format. 224 

A thorough analysis of the validation data should include graphical and statistical 225 summaries that address the validation parameters and their conformance to target 226 acceptance criteria. The analysis should follow the specifics of the data-analysis plan 227 outlined in the validation protocol. In most cases, log relative potency should be 228 analyzed in order to satisfy the assumptions of the statistical methods (see Statistical 229 Considerations, Scale of Analysis, below). Those assumptions include normality of the 230 distribution from which the data were sampled and homogeneity of variability across the 231 range of results observed in the validation. These assumptions can be explored using 232 graphical techniques such as box plots and probability plots. The assumption of 233 normality can be investigated using statistical tests of normality across a suitably sized 234 collection of historical results. Alternative methods of analysis should be sought when 235 the assumptions can be challenged. Confidence intervals should be calculated for the 236 validation parameters using methods described here and in Chapter Analytical Data—237 Interpretation and Treatment <1010>. 238 
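The following sketch (not part of the chapter text) illustrates one way the graphical and statistical checks described above might be run in Python; the values in log_rp_by_level are hypothetical log relative potencies grouped by level.

```python
# Illustrative sketch only: graphical and statistical checks of the normality and
# variance-homogeneity assumptions, using hypothetical log relative potency results.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

log_rp_by_level = {                              # hypothetical results grouped by level
    "0.50": np.log([0.52, 0.45, 0.57, 0.51]),
    "1.0":  np.log([1.1, 0.98, 1.2, 1.0]),
}
all_log_rp = np.concatenate(list(log_rp_by_level.values()))

fig, (ax_box, ax_qq) = plt.subplots(1, 2)
ax_box.boxplot(list(log_rp_by_level.values()))   # compare spread (variability) across levels
stats.probplot(all_log_rp, plot=ax_qq)           # normal probability (Q-Q) plot

w_stat, p_value = stats.shapiro(all_log_rp)      # statistical test of normality
print(f"Shapiro-Wilk W = {w_stat:.3f}, p = {p_value:.3f}")
```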

Validation Strategies for Bioassay Performance Characteristics

241 Parameters that should be verified in a bioassay are relative accuracy, specificity, IP, 242 and range. Other parameters discussed in Chapter <1225> and ICH Q2(R1) such as 243 detection limit and quantitation limit have not been included because they are usually 244 not relevant to a bioassay that reports relative potency. Following are strategies for 245 addressing bioassay validation parameters. 246 

247 1. Relative accuracy. The relative accuracy of a relative potency bioassay is the 248 relationship between measured relative potency and known relative potency. Relative 249 accuracy in bioassay refers to a unit slope between log measured relative potency vs. 250 log level when levels are known. The most common approach to demonstrating relative 251 accuracy for relative potency bioassays is by construction of target potencies by dilution 252 of the standard material or a Test sample with known potency. This type of study is 253 often referred to as a dilutional linearity study. The results from a dilutional linearity 254 study should be assessed using the estimated relative bias at individual levels and via a 255 trend in relative bias across levels. The relative bias at individual levels is calculated as 256 follows: 257 

Relative Bias = 100 · (Measured Potency / Target Potency − 1)%

The trend in bias is measured by the estimated slope of log measured potency vs log 259 target potency, which should be held to a target acceptance criterion. If there is no trend 260 in relative bias across levels, the estimated relative bias at each level can be held to a 261 prespecified target acceptance criterion that has been defined in the validation protocol 262 (see Bioassay Validation Example, below). 263 
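As an illustration only (not part of the chapter), the per-level relative bias and the trend slope might be computed as in the following sketch; the measured values shown are hypothetical.

```python
# Illustrative sketch: relative bias at each constructed level and the slope of
# log measured potency vs. log target potency (a slope near 1 indicates no trend in bias).
import numpy as np

target   = np.array([0.50, 0.71, 1.0, 1.4, 2.0])      # constructed (known) potencies
measured = np.array([0.52, 0.70, 1.05, 1.45, 2.15])   # hypothetical measured geometric means

relative_bias = 100 * (measured / target - 1)          # % relative bias at each level
slope = np.polyfit(np.log(target), np.log(measured), 1)[0]

for lvl, rb in zip(target, relative_bias):
    print(f"level {lvl:.2f}: relative bias = {rb:.1f}%")
print(f"slope of log measured vs. log target = {slope:.3f}")
```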

264 2. Specificity. For products or intermediates associated with complex matrices, 265 specificity (sometimes called selectivity) involves demonstrating lack of interference 266 from matrix components or product-related components that can be expected to be 267 present. This can be assessed via parallel dilution of the Standard sample with and 268 without a spike addition of the potentially interfering compound. If the curves are similar 269 and the potency conforms to expectations of a Standard-to-Standard comparison, the 270 bioassay is specific against the compound. For these assessments both similarity and 271 potency may be assessed using appropriate equivalence tests. 272 

273 3. Intermediate precision. Because of potential influences on the bioassay by factors 274 such as analysts, instruments, or reagent lots, the design of the bioassay validation 275 should include evaluation of these factors. The overall variability from measurements 276 taken under a variety of normal test conditions within one laboratory defines the IP of 277 the bioassay. IP is the ICH and USP term for what is also commonly referred to as inter-278 run variability. 279 

IP measures the influence of factors that will vary over time after the bioassay is 280 implemented. These influences are generally unavoidable and include factors like 281 change in personnel (new analysts), receipt of new reagent lots, etc. Although 282 robustness studies are normally conducted during bioassay development, key intra-run 283 factors such as incubation time and temperature may be included in the validation using 284 multifactor design of experiments (DOE). An appropriate study of the variability of the 285 bioassay, including the impact of intra-run and inter-run factors, can help the laboratory 286 confirm an adequate testing strategy, thereby forecasting the inherent variability of the 287 reportable value as well as the critical fold difference (see Use of Validation Results for 288 Characterization, below). 289 

When the validation has been planned using multifactor DOE, the impact of each factor 290 can first be explored graphically to establish important contributions to potency 291 variability. The identification of important factors should lead to procedures that seek to 292 control their effects, such as further restrictions on intra-assay operating conditions or 293 strategic qualification procedures on inter-run factors such as analysts, instruments, and 294 reagent lots. 295 

Contributions of validation study factors to the overall IP of the bioassay can be 296 determined by performing a variance component analysis on the validation results. 297 Variance component analysis is best carried out using a statistical software package 298 


that is capable of performing a mixed-model analysis with restricted maximum likelihood 299 estimation (REML). 300 

A variance component analysis yields estimates of the variance components σ²Intra and σ²Inter, corresponding to intra-run and inter-run variation. These can be used to estimate the IP of the bioassay, as well as the variability of the reportable value for different bioassay formats (format variability). IP expressed as percent geometric standard deviation (%GSD) is given by using the natural log of the relative potency in the analysis (see Statistical Considerations, Scale of Analysis):

Intermediate Precision = 100 · (e^√(σ²Inter + σ²Intra) − 1)%

The variability of the reportable value from testing performed with n replicate sets in each of k runs (format variability) is equal to:

Format Variability = 100 · (e^√(σ²Inter/k + σ²Intra/(n·k)) − 1)%

This formulation can be used to determine a testing format suitable for various uses of 311 the bioassay (e.g., release testing and stability evaluation). 312 
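A brief sketch of these two formulas, assuming variance components estimated on the natural-log scale of relative potency, is shown below (illustrative only).

```python
# Illustrative sketch of the IP and format-variability formulas, with variance
# components on the natural-log scale of relative potency.
import numpy as np

def pct_gsd(log_scale_variance):
    """Express a log-scale variance as percent geometric standard deviation (%GSD)."""
    return 100 * (np.exp(np.sqrt(log_scale_variance)) - 1)

def intermediate_precision(var_inter, var_intra):
    return pct_gsd(var_inter + var_intra)

def format_variability(var_inter, var_intra, k_runs, n_sets):
    """Variability of a reportable value from n replicate sets in each of k runs."""
    return pct_gsd(var_inter / k_runs + var_intra / (n_sets * k_runs))

# Variance components estimated at the 0.50 level in the example later in the chapter
var_run, var_error = 0.003806, 0.000818
print(intermediate_precision(var_run, var_error))                  # about 7.0 %GSD
print(format_variability(var_run, var_error, k_runs=3, n_sets=1))  # 3-run reportable value
```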

313 4. Range. The range of the bioassay is defined as the true or known potencies for 314 which it has been demonstrated that the analytical procedure has a suitable level of 315 relative accuracy and IP. The range is normally derived from the dilutional linearity study 316 and minimally should cover the product specification range for potency. For stability 317 testing and to minimize having to dilute or concentrate hyper- or hypo-potent test 318 articles into the bioassay range, there is value in validating the bioassay over a broader 319 range. 320 

Validation Target Acceptance Criteria

323 The validation target acceptance criteria should be chosen to minimize the risks 324 inherent in making decisions from bioassay measurements and to be reasonable in 325 terms of the capability of the art. When there is an existing product specification, 326 acceptance criteria can be justified on the basis of the risk that measurements may fall 327 outside of the product specification. Considerations from a process capability index 328 (Cpk) can be used to inform bounds on the relative bias (RB) and the IP of the 329 bioassay: 330 

Cpk = (USL − LSL) / (6 · √(σ²Product + RB² + σ²RA))

where USL and LSL are the upper and lower specification acceptance criteria, RB is a bound on the degree of relative bias in the bioassay, and σ²Product and σ²RA are target product and release assay (with associated format) variances, respectively. This formulation requires prior knowledge regarding target product variability, or the inclusion of a random selection of lots to estimate this characteristic as part of the validation. The choice of a bound on Cpk is a business decision and is related to the risk of obtaining an out-of-specification (OOS) result. Some laboratories require process capability corresponding to Cpk greater than or equal to 1.3. This corresponds to approximately a 1 in 10,000 chance that a lot with potency at the center of the specification range will be OOS. The proportion of lots that are predicted to be OOS is a function of Cpk.
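A sketch of this calculation (illustrative only; all quantities on the natural-log, i.e., fold, scale) is shown below, using the example values that appear in Table 3 later in this chapter.

```python
# Illustrative sketch: Cpk = (USL - LSL) / (6 * sqrt(var_product + RB^2 + var_RA)) and
# the corresponding probability that a centered lot is out of specification.
import numpy as np
from scipy.stats import norm

def cpk(usl, lsl, rel_bias_fold, ip_fold, k_runs, var_product=0.0):
    rb = np.log(rel_bias_fold)                  # e.g., 1.12 for a 12% relative bias bound
    var_ra = np.log(ip_fold) ** 2 / k_runs      # release-assay variance for a k-run format
    return (np.log(usl) - np.log(lsl)) / (6 * np.sqrt(var_product + rb ** 2 + var_ra))

def prob_oos(cpk_value):
    return 2 * norm.cdf(-3 * cpk_value)         # chance a centered lot falls outside spec

c = cpk(usl=1.4, lsl=0.71, rel_bias_fold=1.12, ip_fold=1.08, k_runs=3)
print(round(c, 2), round(100 * prob_oos(c), 2))  # about 0.93 and 0.53%, as in Table 3 below
```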

When specifications have yet to be established for a product, a restriction on relative 342 bias or IP can be formulated on the basis of the capability of the art of the bioassay 343 methodology. For example, although chemical assays and immunoassays are often 344 capable of achieving near single digit percent geometric standard deviation (%GSD), a 345 more liberal restriction might be placed on bioassays, such as animal potency 346 bioassays, that operate with much larger %GSD. In this case the validation goal might 347 be to characterize the method, using the validation results to establish an assay format 348 that is predicted to yield reliable product measurements. A sound justification for target 349 acceptance criteria or use of characterization should be included in the validation 350 protocol. 351 

Assay Maintenance

354 Once a bioassay has been validated it can be implemented. However, it is important to 355 monitor its behavior over time. This is most easily accomplished by maintaining 356 statistical process control (SPC) charts for suitable parameters of the Standard sample 357 response curve and potency of assay QC samples. The purpose of these charts is to 358 identify at an early stage any shift or drift in the bioassay. If a trend is observed in any 359 SPC chart, the reason for the trend should be identified. If the resolution requires a 360 modification to the bioassay or if a serious modification of the bioassay has occurred for 361 other reasons (for example, a major technology change), the modified bioassay should 362 be revalidated or linked to the original bioassay by an adequately designed bridging 363 study with acceptance criteria that use equivalence testing. 364 

Statistical Considerations

367 Several statistical considerations are associated with designing a bioassay validation 368 and analyzing the data. These relate to the properties of bioassay measurements as 369 well as the statistical tools that can be used to summarize and interpret bioassay 370 validation results. 371 


1. Scale of analysis. The scale of analysis of bioassay validation, where data are the 373 relative potencies of samples in the validation study, must be considered in order to 374 obtain reliable conclusions from the study. This chapter assumes that appropriate 375 methods are already in place to reduce the raw bioassay response data to relative 376 potency (as described in USP <1034>). Relative potency measurements are typically 377 nearly log normally distributed. Log normally distributed measurements are skewed and 378 are characterized by heterogeneity of variability, where the standard deviation is 379 proportional to the level of response. The statistical methods outlined in this chapter 380 require that the data be symmetric, approximating a normal distribution, but some of the 381 procedures require homogeneity of variability in measurements across the potency 382 range. Typically, analysis of potency after log transformation generates data that more 383 closely fulfill both of these requirements. The base of the log transformation does not 384 matter as long as a consistent base is maintained throughout the analysis. Thus, for 385 example, if the natural log (log to the base e) is used to transform relative potency 386 measurements, summary results are converted back to the bioassay scale utilizing base 387 e. 388 

The distribution of potency measurements should be assessed as part of bioassay development (as described in <1032>). If it is determined that potency measurements are normally distributed, the validation can be carried out using methods described in USP Validation of Compendial Procedures <1225>.

As a consequence of the usual (for parallel-line assays) log transformation of relative 393 potency measurements, there are advantages (balance and representativeness) if the 394 levels selected for the validation study are evenly spaced on the log scale. An example 395 with five levels would be 0.50, 0.71, 1.0, 1.4 and 2.0. Intermediate levels are obtained 396 as the geometric mean of two adjacent levels. Thus for example, the mid-level between 397 0.50 and 1.0 is derived as follows: 398 

GM = √(0.50 · 1.0) = 0.71
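For instance, the full set of five levels evenly spaced on the log scale can be generated directly, as in this illustrative sketch:

```python
# Illustrative sketch: validation levels evenly spaced on the log scale from 0.50 to 2.0.
import numpy as np

levels = np.exp(np.linspace(np.log(0.50), np.log(2.0), 5))
print(np.round(levels, 2))    # [0.5, 0.71, 1.0, 1.41, 2.0]
```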

Likewise, summary measures of the validation are influenced by the log-normal scale. 400 Predicted response should be reported as the geometric mean of individual relative 401 potency measurements, and variability expressed as %GSD. GSD is calculated as the 402 anti-log of the standard deviation, slog, of log transformed relative potency 403 measurements. The formula is given by: 404 

GSD = antilog(slog), and %GSD = 100 · (GSD − 1)%

Variability is expressed as GSD rather than RSD of the log normal distribution in order 406 to preserve continuity using the log transformation. Intervals that might be calculated 407 from GSD will be consistent with intervals calculated from mean and standard deviation 408 of log transformed data. Table 1 presents an example of the calculation of geometric 409 mean (GM) and associated RB, with %GSD for a series of relative potency 410 measurements performed on samples tested at the 1.00 level (taken from an example 411 that follows). The log base e is used in the illustration. 412 

Table 1. Illustration of calculations of GM and %GSD

RP        ln RP
1.1489    0.1388
0.9287   -0.0740
1.1489    0.1388
1.0000    0.0000
1.0060    0.0060
1.0000    0.0000
1.1000    0.0953
1.0488    0.0476

Average   0.0441    GM = 1.0451    RB = 4.51%
SD        0.0755    %GSD = 7.8%

Here the GM of the relative potency measurements is calculated as the antilog of the 417 average log relative potency measurements and then expressed as relative bias, the 418 percent deviation from the target potency: 419 

GM = e^Average = e^0.0441 = 1.0451

RB = 100 · (GM / Target − 1)% = 100 · (1.0451 / 1.00 − 1)% = 4.51%

and the percent geometric standard deviation (%GSD) is calculated as:

%GSD = 100 · (e^SD − 1)% = 100 · (e^0.0755 − 1)% = 7.8%

Note that the GSD calculated for this illustration is not equal to the IP determined in the 423 bioassay validation example for the 100% level (9.2%); see Table 4. This illustration 424 utilizes the average of within-run replicates, but the IP in the validation example 425 represents the variability of individual replicates. 426 
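The Table 1 quantities can be reproduced with a few lines of code, as in this illustrative sketch:

```python
# Illustrative sketch reproducing the Table 1 calculations of GM, RB, and %GSD.
import numpy as np

rp = np.array([1.1489, 0.9287, 1.1489, 1.0000, 1.0060, 1.0000, 1.1000, 1.0488])
ln_rp = np.log(rp)

gm  = np.exp(ln_rp.mean())                       # geometric mean
rb  = 100 * (gm / 1.00 - 1)                      # relative bias vs. the 1.00 target
gsd = 100 * (np.exp(ln_rp.std(ddof=1)) - 1)      # %GSD from the SD of the log potencies

print(f"GM = {gm:.4f}, RB = {rb:.2f}%, %GSD = {gsd:.1f}%")   # 1.0451, 4.51%, 7.8%
```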

427 2. Reporting validation results using confidence intervals. Estimates of bioassay 428 validation parameters should be presented as a point estimate together with a 429 confidence interval. A point estimate is the numerical value obtained for the parameter, 430 such as the GM or %GSD. A confidence interval’s most common interpretation is as the 431 likely range of the true value of the parameter. The previous example determines a 90% 432 confidence interval for average log relative potency, CIln, as follows: 433 

CIln = x̄ ± tdf · s/√n = 0.0441 ± 1.89 · 0.0755/√8 = (−0.0065, 0.09462)

For percent relative bias this becomes: 435 

CIRB = (100 · (e^(−0.0065)/1.00 − 1)%, 100 · (e^(0.09462)/1.00 − 1)%) = (−0.65%, 9.92%)

The statistical constant (1.89) is from a t-table, with degrees of freedom (df) equal to the 437 number of measurements minus one (df = 8 – 1 = 7). Thus the true relative bias falls 438 between –0.65% and 9.92% with 90% confidence. A confidence interval for IP or format 439 variability can be formulated using methods for variance components. 440 
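A sketch of the same confidence-interval calculation, using a t-quantile from scipy, is shown below (illustrative only).

```python
# Illustrative sketch of the 90% confidence interval for relative bias at the 1.00 level.
import numpy as np
from scipy.stats import t

ln_rp = np.log([1.1489, 0.9287, 1.1489, 1.0000, 1.0060, 1.0000, 1.1000, 1.0488])
n = len(ln_rp)
mean, sd = np.mean(ln_rp), np.std(ln_rp, ddof=1)

t_crit = t.ppf(0.95, df=n - 1)                    # two-sided 90% interval
half_width = t_crit * sd / np.sqrt(n)
ci_rb = tuple(100 * (np.exp(mean + s * half_width) / 1.00 - 1) for s in (-1, 1))
print(ci_rb)                                      # approximately (-0.65%, 9.92%)
```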

441 3. Assessing conformance to acceptance criteria. Bioassay validation results are 442 compared to target acceptance criteria in order to demonstrate that the bioassay is fit 443 for use. The process of establishing conformance of validation parameters to validation 444 acceptance criteria should not be confused with establishing conformance of relative 445 potency measurements to product specifications. Product specifications should inform 446 the process of setting validation acceptance criteria. 447 

A common practice is to apply acceptance criteria to the estimated validation 448 parameter. This does not account, however, for the uncertainty in the estimated 449 validation parameter. A solution is to hold the confidence interval on the validation 450 parameter to the acceptance criterion. This is a standard statistical approach used to 451 demonstrate conformance to expectation and is called an equivalence test. It should not 452 be confused with the practice of performing a significance test, such as a t-test, which 453 seeks to establish a difference from some target value (e.g., 0% relative bias). A 454 significance test associated with a P value > 0.05 (equivalent to a confidence interval 455 that includes the target value for the parameter) indicates that there is insufficient 456 evidence to conclude that the parameter is different from the target value. This is not the 457 same as concluding that the parameter conforms to its target value. The study design 458 may have too few replicates, or the validation data may be too variable to discover a 459 meaningful difference from target. Additionally, a significance test may detect a small 460 deviation from target that is practically insignificant. These scenarios are illustrated in 461 Figure 1. 462 


Figure 1. Use of confidence intervals to establish that validation results conform to an 465 acceptance criterion 466 

The solid horizontal line represents the target value (perhaps 0% relative bias), and the dashed lines form the lower (LAL) and upper (UAL) acceptance limits. In scenario a, the confidence bound includes the target, and thus one could conclude that there is insufficient evidence of a difference from target (the significance test approach). However, although the point estimate (the solid diamond) falls within the acceptance range, the interval extends outside the range, which signifies that the true relative bias may be outside the acceptable range. In scenario b, the interval falls within the acceptance range, signifying conformance to the acceptance criterion. The interval in scenario c also falls within the acceptance range but excludes the target. Thus scenario c, although it is statistically significantly different from the target, is nevertheless acceptable because the confidence interval falls within the target acceptance range.

Using the 90% confidence interval calculated previously, we can establish whether the 479 bioassay has acceptable relative bias at the 1.00 level compared to a target acceptance 480 criterion of no greater than + 12%, for example. Because the 90% confidence interval 481 for percent relative bias (– 0.65%, 9.92%) falls within the interval (100*[(1/1.12) – 1]%, 482 100*[(1.12/1) – 1]%) = (– 11%, 12%), we conclude that there is acceptable relative bias 483 at the 1.0 level. Note that a 90% confidence interval is used in an equivalence test 484 rather than a conventional 95% confidence interval. This is common practice and is the 485 same as the two 1-sided test (TOST) approach used in pharmaceutical bioequivalence 486 testing. 487 
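The equivalence-style comparison above amounts to checking whether the confidence interval lies entirely inside the acceptance limits, as in this brief illustrative sketch:

```python
# Illustrative sketch: hold the 90% CI for relative bias to fold-based acceptance limits.
ci_rb = (-0.65, 9.92)                              # % relative bias CI from the example
fold_criterion = 1.12                              # no more than a 1.12-fold bias

lal = 100 * (1 / fold_criterion - 1)               # about -11%
ual = 100 * (fold_criterion - 1)                   # about +12%
passes = (lal <= ci_rb[0]) and (ci_rb[1] <= ual)
print(f"limits ({lal:.0f}%, {ual:.0f}%): {'acceptable' if passes else 'not acceptable'}")
```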

488 4. Risks in decision-making and number of validation runs. The application of 489 statistical tests, including the assessment of conformance of a validation parameter to 490 its acceptance criteria, involves risks. One risk is that the parameter does not meet its 491 acceptance criterion although the property associated with that parameter is 492 satisfactory; another, the converse, is that the parameter meets its acceptance criterion 493 although the procedure is truly unsatisfactory. A consideration related to these risks is 494 sample size. 495 


The two types of risk can be simultaneously controlled by strategic design, including 496 consideration of the number of runs that will be conducted in the validation. Specifically, 497 the minimum number of runs needed to establish conformance to an acceptance 498 criterion for relative bias is given by: 499 

n ≥ (tα,df + tβ/2,df)² · σ²IP / θ²

where tα,df and tβ/2,df are distributional points from a Student's t-distribution, α and β are the one-sided type I and type II errors, respectively, df is the degrees of freedom associated with the study design (usually n − 1), σ²IP is a preliminary estimate of IP, and θ is the acceptable deviation (target acceptance criterion).

For example, if the acceptance criterion for relative bias is ±0.11 log (i.e., θ = 0.11), the bioassay variability is σIP = 0.076 log, and α = β = 0.05,

n ≥ (1.89 + 2.36)² · (0.076)² / (0.11)² ≈ 8 runs

Note that this formulation of sample size assumes no intrinsic bias in the bioassay. A 508 more conservative solution includes some nonzero bias in the determination of sample 509 size. This results in a greater sample size to offset the impact of the bias. In the current 510 example the sample size increases to 10 runs if one assumes an intrinsic bias equal to 511 2%. Note also that this calculation represents a recursive solution (because the degrees 512 of freedom depends on n) requiring statistical software or an algorithm that employs 513 iterative methodology. 514 
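Because the degrees of freedom depend on n, the solution is found iteratively; a minimal sketch of such an iteration, assuming no intrinsic bias, follows. Depending on how the recursion is resolved, the result may differ by a run or so from the rounded value quoted in the text.

```python
# Illustrative sketch: iterative solution of n >= (t_alpha,df + t_beta/2,df)^2 * s_IP^2 / theta^2,
# assuming no intrinsic bias and taking df = n - 1.
from scipy.stats import t

def runs_needed(sigma_ip, theta, alpha=0.05, beta=0.05, n_max=1000):
    for n in range(2, n_max):
        df = n - 1
        required = (t.ppf(1 - alpha, df) + t.ppf(1 - beta / 2, df)) ** 2 * sigma_ip ** 2 / theta ** 2
        if n >= required:
            return n
    raise ValueError("no solution found below n_max")

print(runs_needed(sigma_ip=0.076, theta=0.11))   # compare with the roughly 8 runs in the example
```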

515 5. Modeling validation results using mixed effects models. Many analyses 516 associated with bioassay validation must account for multiple design factors such as 517 fixed effects (e.g., potency level), as well as random effects (e.g., operator, run, and 518 replicate). Statistical models composed of both fixed and random effects are called 519 mixed effects models and usually require sophisticated statistical software for analysis. 520 The results of the analysis may be summarized in an analysis of variance (ANOVA) 521 table or a table of variance component estimates. The primary goal of the analysis is to 522 estimate critical parameters rather than establish the significance of an effect. The 523 modeling output provides parameter estimates together with their standard errors of 524 estimates that can be utilized to establish conformance of a validation parameter to its 525 acceptance criterion. Thus the average relative bias at each level is obtained as a 526 portion of the analysis together with its associated variability. These compose a 527 confidence interval that is compared to the acceptance criterion as described above. If 528 variances across levels can be pooled, statistical modeling can also determine the 529 overall relative bias and IP by combining information across levels performed in the 530 validation. Similarly, mixed effects models can be used to obtain variance components 531 for validation study factors and to combine results across validation study samples and 532 levels. 533 


534 6. Statistical design. Statistical design, such as multifactor DOE or nesting can be 535 used to organize runs in a bioassay validation. Such designs are useful to incorporate 536 factors that are believed to influence the bioassay and that vary during long-term use of 537 the procedure. These are also used to better characterize the sources of variability and 538 to develop a strategic test plan that can be used to manage the variability of the 539 bioassay. 540 

Table 2 shows an example of a multifactor DOE that incorporates multiple analysts, 541 multiple cell culture preparations, and multiple reagent lots into the validation plan. 542 

Table 2: Example of a multifactor DOE with three factors

Run   Analyst   Cell Prep   Reagent Lot
1     1         1           1
2     1         1           2
3     1         2           1
4     1         2           2
5     2         1           1
6     2         1           2
7     2         2           1
8     2         2           2

In this design each analyst performs the bioassay with both cell preparations and both 545 reagent lots. This is an example of a full factorial design because all combinations of the 546 factors are performed in the validation study. To reduce the number of runs in the study, 547 fractional factorial designs may be employed when more than three factors have been 548 identified. Validation designs do not need to regard the resolution requirements of 549 typical screening designs. Unlike screening experiments, the validation design should 550 incorporate as many factors at as many levels as possible in order to obtain a 551 representative estimate of IP. More than two levels of a factor should be employed in 552 the design whenever possible. Validation runs should be randomized whenever 553 possible to mitigate the potential influences of run order or time. 554 
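A full factorial plan such as Table 2 can be enumerated directly, as in this illustrative sketch:

```python
# Illustrative sketch: enumerate the full factorial run plan of Table 2.
from itertools import product

factors = {"Analyst": [1, 2], "Cell Prep": [1, 2], "Reagent Lot": [1, 2]}

for run_no, levels in enumerate(product(*factors.values()), start=1):
    print(run_no, dict(zip(factors, levels)))
# A fractional factorial design would retain only a balanced subset of these runs.
```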

Figure 2 illustrates an example of a validation using nesting (replicate sets nested within 555 plate, plate nested within analyst). 556 

Figure 2: Example of a nested design using two analysts

For both of these types of design, as well as combinations of the two, components of variability can be estimated from the validation results. These components of variability can be used to identify significant sources of variability as well as to derive a bioassay format that meets the procedure's requirements for precision.

7. Significant figures. The number of significant figures in a reported result from a bioassay is related to the latter's precision. In general, a bioassay with %GSD between 2% and 20% will support two significant figures. The number of significant figures should not be confused with the number of decimal places; reported values equal to 1.2 and 0.12 have the same number (two) of significant figures. This standard of rounding is appropriate for log normally scaled measurements, which have proportional rather than additive variability. Note that rounding occurs at the end of a series of calculations, when the final measurement is reported and used for decisions such as conformance to specification acceptance criteria. Likewise, specification acceptance criteria should be stated with the appropriate number of significant figures, consistent with the number of significant figures supported by the precision of the reportable value. For example, if two significant figures are warranted for a particular bioassay, a specification acceptance criterion might be 0.80 to 1.2. Round potencies in the validation to an appropriate number of significant figures when analyzing the results; rounding to too many or too few significant figures will result in nonrepresentative estimates of precision. In general the laboratory should initially round potencies to two significant figures when assessing variability. Because the laboratory would like to obtain a precise estimate of relative accuracy, more significant figures should be used in that portion of the analysis.

A BIOASSAY VALIDATION EXAMPLE

An example illustrates the principles described in this chapter. The bioassay will be used to support a specification range of 0.71 to 1.4 for the product. Using the Cpk approach described in the section on Validation Target Acceptance Criteria, a table is derived showing the projected rate of OOS results for various restrictions on RB and IP. Cpk is calculated on the basis of the variability of a reportable value using three independent runs of the bioassay (see discussion of format variability, above). Product variability is assumed to be equal to 0 in the calculations. The laboratory may wish to include target product variability; an estimate of target product variability can be obtained from data from a product, for example, manufactured by a similar process.

Table 3: Cpk and probability of OOS for various restrictions on RB and IP

LSL–USL     IP (%)   RB (%)   Cpk    Prob(OOS) (%)
0.71–1.4    20       20       0.54   10.5
0.71–1.4    8        12       0.93   0.53
0.71–1.4    10       5        1.54   0.0004

598 The calculation is illustrated for IP equal to 8% and relative bias equal to 12% (n = 3 599 runs): 600 

Cpk = (ln(1.4) − ln(0.71)) / (6 · √(ln(1.08)²/3 + ln(1.12)²)) = 0.93

Prob(OOS) = 2 · Φ(−3 · 0.93) = 0.0053 (0.53%)

where Φ denotes the standard normal cumulative distribution function.

From Table 3, acceptable performance (less than 1% chance of obtaining an OOS 602 result due to bias and variability of the bioassay) can be expected if the IP is ≤ 8% and 603 relative bias is ≤ 12%. The sample size formula given in the section on Risks in decision 604 making and number of runs can be used to derive the number of runs required to 605 establish conformance to an acceptance criterion for relative bias equal to 12% (using 606 σIP = 8%; α = β = 0.05): 607 

n ≥ (1.89 + 2.36)² · ln(1.08)² / ln(1.12)² ≈ 8 runs

Thus 8 runs would be needed to have a 95% chance of passing the target acceptance 609 criterion for relative bias if the true relative bias is zero. 610 

Five levels of the target analyte are studied in the validation: 0.50, 0.71, 1.0, 1.4, and 611 2.0. Runs are generated by two trained analysts using two cell preparations. Cell 612 preparations must be independent and must represent those elements of cell culture 613 preparation that vary during long-term performance of the bioassay. Other factors may 614 be considered and incorporated into the design using a fractional factorial layout. The 615 laboratory should strive to design the validation with as many levels of each factor as 616 possible in order to best model the long-term performance of the bioassay. In this 617 example each analyst performs two runs using each cell preparation. A run consists of a 618 full titration curve of the Standard sample, together with two full titration curves of the 619 Test sample. This yields duplicate measurements of relative potency at each level in 620 each validation run. The example dataset is presented in Table 4. 621 

Table 4: Example of bioassay validation with 2 analysts, 2 cell preparations, and 2 runs

Analyst/Cell Prep   1/1    1/1    1/2    1/2    2/1    2/1    2/2    2/2
Run                 1      2      1      2      1      2      1      2
Level
0.50                0.52   0.45   0.57   0.51   0.52   0.52   0.53   0.51
0.50                0.50   0.45   0.56   0.54   0.50   0.51   0.54   0.55
0.71                0.76   0.67   0.68   0.70   0.70   0.75   0.69   0.74
0.71                0.71   0.62   0.82   0.71   0.64   0.69   0.77   0.74
1.0                 1.1    0.98   1.2    1.0    1.1    1.0    1.1    1.0
1.0                 1.2    0.88   1.1    1.0    0.92   1.0    1.1    1.1
1.4                 1.5    1.3    1.5    1.4    1.4    1.3    1.5    1.5
1.4                 1.5    1.3    1.6    1.4    1.4    1.4    1.5    1.5
2.0                 2.4    1.9    2.4    2.3    2.2    2.1    2.4    2.0
2.0                 2.2    2.0    2.4    2.2    2.1    2.1    2.2    2.3

Note: Levels and potencies are rounded to 2 significant figures following the section on significant figures.

Figure 3. A plot of the validation results vs. the sample levels.

The plot is used to reveal irregularities in the experimental results. In particular, a properly prepared plot can reveal a failure in agreement of validation results with


validation levels, as well as heterogeneity of variability across levels (see discussion of 631 the log transformation in Statistical Considerations). The example plot in Figure 3 632 includes the unit line (line with slope equal to 1, passing through the origin). The analyst 633 1 and analyst 2 data are deliberately offset with respect to the expected potency to 634 allow clear visualization and comparison of the data sets from each analyst. 635 

A formal analysis of the validation data might be undertaken in the following steps: (1) an assessment of variability (IP) should precede an assessment of relative accuracy or specificity in order to establish conformance to the assumption that variances across sample levels can be pooled; and (2) relative accuracy is assessed either at separate levels or by a combined analysis, depending on how well the data across levels can be pooled. These steps are demonstrated using the example validation data, along with some details of the calculations for illustrative purposes. Note that the calculations illustrated in the following sections are appropriate only with a balanced dataset. Imbalanced designs or datasets with missing relative potency measurements should be analyzed using a mixed model analysis with restricted maximum likelihood estimation (REML).

Intermediate Precision

649 Data at each level can be analyzed using variance component analysis. An example of 650 the calculation performed at a single level (0.50) is presented in Table 5. 651 


Table 5. Variance component analysis performed on log relative potency measurements at the 0.50 level

Source             df   Sum of Squares   Mean Square   Expected Mean Square
Run                 7   0.059004         0.008429      Var(Error) + 2·Var(Run)
Error               8   0.006542         0.000818      Var(Error)
Corrected Total    15   0.065546

Variance Component Estimates
Var(Run)     0.003806
Var(Error)   0.000818


The top of the table represents a standard ANOVA analysis; analyst and cell preparation have not been included because of the small number of levels (2 levels) for each factor. The factor "Run" in this analysis represents the combined runs across the analyst by cell preparation combinations. The Expected Mean Square is the linear combination of variance components that generates the measured mean square for each source. The variance component estimates are derived by solving the equation "Expected Mean Square = Mean Square" for each component. To start, the mean square for Error estimates Var(Error), the within-run component of variability:

Var(Error) = MS(Error) = 0.000818

The between-run component of variability, Var(Run), is subsequently calculated by setting the mean square for Run to the mathematical expression for the expected mean square, then solving the equation for Var(Run) as follows:

MS(Run) = Var(Error) + 2·Var(Run)

Var(Run) = [MS(Run) − Var(Error)] / 2 = (0.008429 − 0.000818) / 2 = 0.003806

These variance component estimates are combined to establish the overall IP of the bioassay at 0.50:

IP = 100·(e^√(Var(Run) + Var(Error)) − 1) = 100·(e^√(0.003806 + 0.000818) − 1) = 7.0%
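For illustration only, the following Python sketch (standard library only) reproduces the single-level calculation above from the Table 4 data at the 0.50 level; because the tabulated potencies are rounded to 2 significant figures, the results agree with Table 5 only to the precision shown, and the variable names are choices of this sketch rather than part of the chapter.

import math

# Balanced one-way ANOVA variance components and IP at the 0.50 level (Table 4 data)
duplicates = [
    (0.52, 0.50), (0.45, 0.45),   # analyst 1, cell prep 1, runs 1 and 2
    (0.57, 0.56), (0.51, 0.54),   # analyst 1, cell prep 2, runs 1 and 2
    (0.52, 0.50), (0.52, 0.51),   # analyst 2, cell prep 1, runs 1 and 2
    (0.53, 0.54), (0.51, 0.55),   # analyst 2, cell prep 2, runs 1 and 2
]
logs = [[math.log(x) for x in pair] for pair in duplicates]

k = len(logs)                                  # 8 runs
n = len(logs[0])                               # 2 replicates per run
run_means = [sum(run) / n for run in logs]
grand_mean = sum(run_means) / k

ss_run = n * sum((m - grand_mean) ** 2 for m in run_means)
ss_error = sum((x - m) ** 2 for run, m in zip(logs, run_means) for x in run)
ms_run = ss_run / (k - 1)                      # mean square for Run
ms_error = ss_error / (k * (n - 1))            # mean square for Error

var_error = ms_error                           # within-run component
var_run = (ms_run - var_error) / n             # between-run component
ip = 100 * (math.exp(math.sqrt(var_run + var_error)) - 1)

print(f"Var(Run) = {var_run:.6f}, Var(Error) = {var_error:.6f}, IP = {ip:.1f}% GSD")
# Expected output, approximately: Var(Run) = 0.003806, Var(Error) = 0.000818, IP = 7.0% GSD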

The same analysis was performed at each level of the validation and is presented in Table 6.

Table 6. Variance component estimates and overall variability for each validation level and the average

                              Level
Component     0.50       0.71       1.0        1.4        2.0        Average
Var(Run)      0.003806   0.000460   0.003579   0.003490   0.002960   0.002859
Var(Error)    0.000818   0.004557   0.004234   0.000604   0.002590   0.002561
Overall       7.0%       7.3%       9.2%       6.6%       7.7%       7.6%

A combined analysis can be performed if the variance components are similar across levels. Typically a heuristic is used for this assessment: one might hold the ratio of the maximum variance to the minimum variance to no greater than 10 (10 is used because of the limited number of runs performed in the validation). Here the ratios associated with the between-run variance component, 0.003806/0.000460 = 8.3, and the within-run component, 0.004557/0.000604 = 7.5, meet the 10-fold criterion. Had the ratio exceeded 10, and had this been due to excess variability at one of the extreme levels tested, that extreme level would be eliminated from further analysis and the range would be limited to exclude that level.

The analysis might proceed using statistical software that is capable of applying a mixed effects model to the validation results. That analysis should account for any imbalance in the design, as well as other random effects such as analyst and cell preparation (see Modeling Validation Results Using Mixed Effects Models in Statistical Considerations). Variance components can be determined for Analyst and Cell Preparation separately in order to characterize their contributions to the overall variability of the bioassay.

In the example, variance components can be averaged across levels to report the IP of the bioassay. This method of combining estimates is exact only if a balanced design has been employed in the validation (i.e., the same replication strategy at each level). A balanced design was employed for the example validation, so the IP can be reported as 7.6% GSD.
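As an illustration (not part of the chapter), the short Python sketch below applies the 10-fold max/min heuristic to the per-level components in Table 6, averages them, and reports the pooled IP as %GSD.

import math

# Pooling the per-level variance components of Table 6 (exact only for a balanced design)
var_run = [0.003806, 0.000460, 0.003579, 0.003490, 0.002960]
var_error = [0.000818, 0.004557, 0.004234, 0.000604, 0.002590]

assert max(var_run) / min(var_run) <= 10       # between-run ratio, about 8.3
assert max(var_error) / min(var_error) <= 10   # within-run ratio, about 7.5

avg_run = sum(var_run) / len(var_run)          # about 0.002859
avg_error = sum(var_error) / len(var_error)    # about 0.002561
ip = 100 * (math.exp(math.sqrt(avg_run + avg_error)) - 1)
print(f"Pooled IP = {ip:.1f}% GSD")            # about 7.6% GSD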

Because of the recommendation to report validation results with some measure of uncertainty, a one-sided 95% upper confidence bound can be calculated for the IP of the bioassay. The literature contains methods for calculating confidence bounds for variance components. The upper bound on IP for the bioassay example is 12.2% GSD. The upper confidence bound was not calculated at each level separately because of the limited data at an individual level relative to the overall study design.

Relative Accuracy

The analysis might proceed with an assessment of relative accuracy at each level. Table 7 shows the average and 90% confidence interval of validation results in the log scale, as well as the corresponding potency and relative bias.

Table 7. Average Potency and Relative Bias at Individual Levels

                       log Potency                     Potency                   Relative Bias
Level     n(a)    Average    (90% CI)               Average   (90% CI)       Average   (90% CI)
0.50      8       –0.6608    (–0.7042, –0.6173)     0.52      (0.49, 0.54)   3.29%     (–1.10, 7.88)
0.71      8       –0.3422    (–0.3772, –0.3071)     0.71      (0.69, 0.74)   0.03%     (–3.42, 3.60)
1.00(b)   8        0.0441    (–0.0065, 0.0946)      1.05      (0.99, 1.10)   4.51%     (–0.65, 9.92)
1.41      8        0.3611    (0.3199, 0.4024)       1.43      (1.38, 1.50)   1.77%     (–2.35, 6.05)
2.00      8        0.7860    (0.7423, 0.8297)       2.19      (2.10, 2.29)   9.73%     (5.04, 14.63)

(a) Analyses performed on averages of duplicates from each run.
(b) Calculation illustrated in Statistical Considerations, Scale of Analysis.
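For illustration, the following Python sketch reproduces the first row of Table 7 from the Table 4 data at the 0.50 level, using the run averages of the duplicate log relative potencies; the hard-coded t critical value (1.895, 7 degrees of freedom) and the variable names are choices of this sketch.

import math
import statistics

# Relative accuracy at the 0.50 level: 90% CI on log potency, potency, and relative bias
level = 0.50
duplicates = [
    (0.52, 0.50), (0.45, 0.45), (0.57, 0.56), (0.51, 0.54),
    (0.52, 0.50), (0.52, 0.51), (0.53, 0.54), (0.51, 0.55),
]
run_means = [sum(math.log(x) for x in pair) / 2 for pair in duplicates]

n = len(run_means)
mean_log = statistics.mean(run_means)
se = statistics.stdev(run_means) / math.sqrt(n)
t_crit = 1.895                                         # t(0.95, df = 7); 90% two-sided interval
ci_log = (mean_log - t_crit * se, mean_log + t_crit * se)

potency = math.exp(mean_log)
rel_bias = 100 * (potency / level - 1)
ci_bias = [100 * (math.exp(b) / level - 1) for b in ci_log]

print(f"log potency {mean_log:.4f} (90% CI {ci_log[0]:.4f}, {ci_log[1]:.4f})")
print(f"potency {potency:.2f}; relative bias {rel_bias:.2f}% "
      f"(90% CI {ci_bias[0]:.2f}%, {ci_bias[1]:.2f}%)")
# Expected, approximately: -0.6608 (-0.7042, -0.6173); 0.52; 3.29% (-1.10%, 7.88%)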


The analysis has been performed on the average of the duplicates from each run (n = 8 runs) because duplicate measurements are correlated within a run by shared IP factors (analyst, cell preparation, and run in this case). A plot of relative bias vs level can be used to examine patterns in the experimental results and to establish conformance to the target acceptance criterion for relative bias (12%).

Figure 4. Plot of 90% confidence intervals for relative bias vs the acceptance criterion

Note: lower acceptance criterion equal to 100·[(1/1.12) – 1] = –11%.

Figure 4 shows an average positive bias across sample levels (i.e., the average relative bias is positive at all levels). This consistency is due in part to the lack of independence of bioassay results across levels. In addition, there does not appear to be a trend in relative bias across levels. Such a trend, had one been present, would indicate that a comparison of samples with different measured relative potency (such as stability samples) is biased, perhaps resulting in an erroneous conclusion. Trend analysis can be performed using a regression of log relative potency vs log level. An acceptance criterion on a trend in relative accuracy across the range can be considered and may be introduced during development of the bioassay validation protocol.
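As a hedged illustration of the trend analysis mentioned above, the Python sketch below regresses log relative bias (log measured relative potency minus log level) on log level using the per-run averages from Table 4. It is a simple ordinary least-squares fit that ignores the run-to-run correlation noted above, so it is intended only to show the mechanics; a slope near zero is consistent with the absence of a trend.

import math

# Trend analysis: OLS regression of log relative bias on log level (Table 4 run averages)
table4 = {
    0.50: [(0.52, 0.50), (0.45, 0.45), (0.57, 0.56), (0.51, 0.54),
           (0.52, 0.50), (0.52, 0.51), (0.53, 0.54), (0.51, 0.55)],
    0.71: [(0.76, 0.71), (0.67, 0.62), (0.68, 0.82), (0.70, 0.71),
           (0.70, 0.64), (0.75, 0.69), (0.69, 0.77), (0.74, 0.74)],
    1.0:  [(1.1, 1.2), (0.98, 0.88), (1.2, 1.1), (1.0, 1.0),
           (1.1, 0.92), (1.0, 1.0), (1.1, 1.1), (1.0, 1.1)],
    1.4:  [(1.5, 1.5), (1.3, 1.3), (1.5, 1.6), (1.4, 1.4),
           (1.4, 1.4), (1.3, 1.4), (1.5, 1.5), (1.5, 1.5)],
    2.0:  [(2.4, 2.2), (1.9, 2.0), (2.4, 2.4), (2.3, 2.2),
           (2.2, 2.1), (2.1, 2.1), (2.4, 2.2), (2.0, 2.3)],
}
x, y = [], []
for level, runs in table4.items():
    for a, b in runs:
        x.append(math.log(level))
        y.append((math.log(a) + math.log(b)) / 2 - math.log(level))  # log relative bias

xbar, ybar = sum(x) / len(x), sum(y) / len(y)
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
print(f"Slope of log relative bias vs log level: {sxy / sxx:.3f}")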

After establishing that there is no meaningful trend across levels, the analysis proceeds with an assessment of the relative accuracy at each level. The bioassay has acceptable relative bias at levels from 0.50 to 1.4, yielding 90% confidence bounds (equivalent to two one-sided t-tests) that fall within the acceptance region of –11% to 12% relative bias. The 90% confidence interval at 2.0 extends outside the acceptance region, indicating that the relative bias may exceed 12%.

A combined analysis can be performed using statistical software that is capable of applying a mixed effects model to the validation results. That analysis accurately accounts for the validation study design. The analysis also accommodates random effects such as analyst, cell preparation, and run (see Modeling Validation Results Using Mixed Effects Models in Statistical Considerations).

Range


The conclusions from the assessment of IP and relative accuracy can be used to establish the bioassay's range that demonstrates satisfactory performance. Based on the acceptance criterion for IP equal to 8% GSD (see Table 6) and for relative bias equal to 12% (see Table 7), the range of the bioassay is 0.50 to 1.41. In this range, level 1.0 has a slightly higher than acceptable estimate of IP (9.2% versus the target acceptance criterion ≤ 8.0%), which may be due to the variability of the estimate that results from a small dataset. Because of this and the other results in Table 6, one may conclude that satisfactory IP was demonstrated across the range.

Use of Validation Results for Bioassay Characterization

When the study has been performed to estimate the characteristics of the bioassay (characterization), the variance component estimates can also be used to predict the variability for different bioassay formats and thereby to determine a format that has a desired level of precision. The predicted variability for k independent runs, with n individual dilution series of the test preparation within a run, is given by the following formula for format variability:

Format Variability = 100·(e^√(Var(Run)/k + Var(Error)/(n·k)) − 1)

Thus, using the estimates of inter-run and intra-run variance components from Table 6 [Var(Run) = 0.002859 and Var(Error) = 0.002561], if the bioassay is performed in 3 independent runs with a single dilution series of the test within each run, the predicted variability of the reportable value (geometric mean of the relative potency results) is equal to:

Format Variability = 100·(e^√(0.002859/3 + 0.002561/3) − 1) = 4.3%

This calculation can be expanded to include various combinations of runs and replicate sets within runs, as shown in Table 8.


Table 8. Format variability for different combinations of number of runs (k) and number of replicate sets within run (n)

                 Number of Runs (k)
Reps (n)      1       2       3       6
1             7.6%    5.3%    4.3%    3.1%
2             6.6%    4.7%    3.8%    2.7%
3             6.3%    4.4%    3.6%    2.5%
6             5.9%    4.1%    3.4%    2.4%
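The following Python sketch reproduces the format-variability calculation used to build Table 8 from the averaged variance components in Table 6; the function name and loop are illustrative choices rather than part of the chapter.

import math

# Format variability for k runs and n replicate sets per run (Table 6 averaged components)
VAR_RUN, VAR_ERROR = 0.002859, 0.002561

def format_variability(k, n):
    """Predicted %GSD of the reportable value for k runs and n replicate sets per run."""
    return 100 * (math.exp(math.sqrt(VAR_RUN / k + VAR_ERROR / (n * k))) - 1)

for n in (1, 2, 3, 6):
    row = "   ".join(f"{format_variability(k, n):.1f}%" for k in (1, 2, 3, 6))
    print(f"n = {n}:  {row}")
# The n = 1 row should read approximately 7.6%, 5.3%, 4.3%, 3.1%, matching Table 8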



Clearly the most effective means of reducing the variability of the reportable value (the geometric mean potency across runs and replicate sets) is to increase the number of independent runs of the bioassay procedure. In addition, confidence bounds on the variance components used to derive IP can be utilized to establish the bioassay's format variability.

Significant sources of variability must be incorporated into runs in order to effect variance reduction. A more thorough analysis of the bioassay validation example would include analyst and cell preparation as factors in the statistical model. Variance component estimates obtained from such an analysis are presented in Table 9.

Table 9. REML Estimates of Variance Components Associated with Analyst, Cell Preparation, and Run

Component                        Variance Estimate
Var(Analyst)                     0.0000
Var(CellPrep)                    0.0014
Var(Analyst*CellPrep)            0.0000
Var(Run(Analyst*CellPrep))       0.0021
Var(Error)                       0.0026

Because cell preparation is a significant component of the overall potency variability, multiple cell preparations should be utilized to moderate the inter-run variability of the bioassay.
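One possible (hypothetical) way to obtain REML variance component estimates such as those in Table 9 is sketched below using Python's statsmodels package; the chapter does not prescribe any particular software, and the file name, column names, and model specification shown are assumptions that would need to be adapted to the laboratory's own data layout (R's lme4 or SAS PROC MIXED are common alternatives).

# Hypothetical sketch of a crossed variance-components (REML) analysis; not the chapter's method
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("validation_results.csv")    # assumed long format: one row per relative potency result
df["all_data"] = 1                            # single group so that the random effects are crossed

vc = {
    "analyst": "0 + C(analyst)",
    "cell_prep": "0 + C(cell_prep)",
    "analyst_by_cellprep": "0 + C(analyst):C(cell_prep)",
    "run": "0 + C(run_id)",                   # run within analyst-by-cell-prep combinations
}
model = smf.mixedlm("log_rp ~ C(level)",      # level as a fixed effect; one possible choice
                    data=df, groups="all_data", vc_formula=vc, re_formula="0")
result = model.fit(reml=True)                 # REML variance component estimates, as in Table 9
print(result.summary())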

Estimates of intra-run and inter-run variability can also be used to determine the sizes of differences (fold difference) that can be distinguished between samples tested in the bioassay. The critical fold difference between reportable values for two samples that are tested in the same runs of the bioassay is given by (for k runs, n replicate sets within each run, using an approximate 2-sided critical value from the standard normal distribution, z = 2):

Critical Fold Difference = e^(2·√(Var(Run)/k + Var(Error)/(n·k)))

When samples have been tested in different runs of the bioassay (such as long-term stability samples), the critical fold difference is given by (assuming the same format is used to test the two series of samples):

Critical Fold Difference = e^(2·√(2·[Var(Run)/k + Var(Error)/(n·k)]))

For comparison of samples, the laboratory can choose a design (bioassay format) that has suitable precision to detect a practically meaningful fold difference between samples.
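A brief Python sketch of the two critical fold difference formulas, using the averaged variance components from Table 6; the chosen format (k = 3 runs, n = 1 replicate set per run) is only an illustrative example.

import math

# Critical fold difference for samples tested in the same runs vs different runs
VAR_RUN, VAR_ERROR = 0.002859, 0.002561
Z = 2   # approximate 2-sided critical value from the standard normal distribution

def critical_fold_difference(k, n, same_runs=True):
    var = VAR_RUN / k + VAR_ERROR / (n * k)
    if not same_runs:
        var *= 2            # samples tested in different runs of the same format
    return math.exp(Z * math.sqrt(var))

print(critical_fold_difference(3, 1, same_runs=True))    # samples tested in the same runs
print(critical_fold_difference(3, 1, same_runs=False))   # samples tested in different runs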



Confirmation of Intermediate Precision and Revalidation

The estimate of IP from the validation is highly uncertain because of the small number of runs performed. After the laboratory gains suitable experience with the bioassay, the estimate can be confirmed or updated by analysis of control sample measurements. The variability of a positive control can be used when the control is prepared and tested as is a Test sample (i.e., same or similar dilution series and replication strategy). The assessment should be made after sufficient runs have been performed and after routine changes in bioassay factors (e.g., different analysts, different key reagent lots, and different cell preparations) have taken place. The reported IP of the bioassay should be modified as an amendment to the validation report if the assessment reveals a substantial disparity of results.
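As an illustrative sketch with hypothetical control potencies, the %GSD of a positive control tested once per run with the same dilution and replication strategy as a Test sample can be computed as follows and compared with the IP claimed in the validation report.

import math
import statistics

# Hypothetical positive-control potencies, one reportable result per routine run
control_potencies = [0.98, 1.05, 0.91, 1.02, 1.10, 0.96, 1.01, 0.94]

log_potencies = [math.log(p) for p in control_potencies]
sd_log = statistics.stdev(log_potencies)          # SD of log potency across runs
gsd_percent = 100 * (math.exp(sd_log) - 1)        # %GSD, comparable to the reported IP
print(f"Control variability over {len(log_potencies)} runs: {gsd_percent:.1f}% GSD")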

The bioassay should be revalidated whenever a substantial change is made to the method. This includes, but is not limited to, a change in technology or a change in readout. The revalidation may consist of a complete re-enactment of the bioassay validation or a bridging study that compares the current and the modified methods.

LITERATURE

Additional information and alternative methods can be found in the references listed below.

ASTM. Standard Practice for Using Significant Digits in Test Data to Determine Conformance with Specifications, ASTM E29-08. West Conshohocken, PA: ASTM; 2008.

Berger R, Hsu J. Bioequivalence trials, intersection-union tests and equivalence confidence intervals. Stat Sci. 1996;11(4):283–319.

Burdick R, Graybill F. Confidence Intervals on Variance Components. New York: Marcel Dekker; 1992:28–39.

Haaland P. Experimental Design in Biotechnology. New York: Marcel Dekker; 1989:64–66.

Schofield TL. Assay validation. In: Chow SC, ed. Encyclopedia of Biopharmaceutical Statistics. 2nd ed. New York: Marcel Dekker; 2003.

Schofield TL. Assay development. In: Chow SC, ed. Encyclopedia of Biopharmaceutical Statistics. 2nd ed. New York: Marcel Dekker; 2003.

Winer B. Statistical Principles in Experimental Design. 2nd ed. New York: McGraw-Hill; 1971:244–251.