SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1...

20
STRmix - MCMC Study Page 1 SDPD Crime Laboratory Forensic Biology Unit Validation of the STRmix TM Software MCMC Markov Chain Monte Carlo Introduction The goal of DNA mixture interpretation should be to identify the genotypes of the contributors that comprise the mixture. DNA mixture results can often be explained by multiple possible genotype combinations. Given how many loci there are in the GlobalFiler amplification kit, the number of possible genotype combinations is prohibitively large, and deduction of the component genotypes that comprise the mixture (called a deconvolution) becomes a very complex problem. The Markov Chain Monte Carlo (MCMC) describes a standard statistical methodology that dominates modern analysis of statistical problems across disciplines. STRmix uses MCMC to approach the complex problem of DNA mixture interpretation. Below is an overview of MCMC. The MCMC process involves thousands to millions of iterations, and three main steps in each iteration (Figure 1). First, based on the number of contributors input by the analyst, various genotype combinations that could possibly describe the mixture are determined. This is the prior distribution of genotype sets that could describe the data and all are all sets at each markers are systematically, randomly, and independently sampled to ensure that all combinations are considered. The set of variables describing the amount of DNA in the profile are collectively known as the mass variables, which are: DNA template amount, degradation, and amplification efficiency of each locus. DNA template amount and degradation are variables assigned to each contributor in the mixture, whereas locus specific amplification efficiency is applied to each contributor. An expected DNA profile is built using the possible genotypes combinations and mass variables. Allele-specific stutter is then applied to adjust peak heights. Second, a probability of the expected DNA peaks given the selected mass parameters is calculated by comparing them to the observed peaks in the data. In addition to the mass parameters and genotype sets, STRmix will also select variance values from the distributions determined from ModelMaker. This comparison of expected to observed takes into account the selected allele and stutter variance values. Third, the proposed set of variables are either accepted or rejected depending on whether they are a good description or a poor description of the observed DNA profile data as compared to another expected profile given a different set of mass variables and a genotype set that is different at a single locus. The first 100,000 iterations (termed burn-in) of the MCMC are dedicated to reaching an equilibrium state where a smaller distribution of values for mass parameters and a more limited number of genotype sets are being regularly chosen in accordance with how well they describe the observed data. This prevents putting too much weight on the more random guess that occur

Transcript of SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1...

Page 1: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 1

SDPD Crime Laboratory – Forensic Biology Unit

Validation of the STRmixTM Software

MCMC

Markov Chain Monte Carlo

Introduction

The goal of DNA mixture interpretation should be to identify the genotypes of the contributors

that comprise the mixture. DNA mixture results can often be explained by multiple possible

genotype combinations. Given how many loci there are in the GlobalFiler amplification kit, the

number of possible genotype combinations is prohibitively large, and deduction of the

component genotypes that comprise the mixture (called a deconvolution) becomes a very

complex problem. The Markov Chain Monte Carlo (MCMC) describes a standard statistical

methodology that dominates modern analysis of statistical problems across disciplines. STRmix

uses MCMC to approach the complex problem of DNA mixture interpretation. Below is an

overview of MCMC.

The MCMC process involves thousands to millions of iterations, and three main steps in each

iteration (Figure 1). First, based on the number of contributors input by the analyst, various

genotype combinations that could possibly describe the mixture are determined. This is the prior

distribution of genotype sets that could describe the data and all are all sets at each markers are

systematically, randomly, and independently sampled to ensure that all combinations are

considered. The set of variables describing the amount of DNA in the profile are collectively

known as the mass variables, which are: DNA template amount, degradation, and amplification

efficiency of each locus. DNA template amount and degradation are variables assigned to each

contributor in the mixture, whereas locus specific amplification efficiency is applied to each

contributor. An expected DNA profile is built using the possible genotypes combinations and

mass variables. Allele-specific stutter is then applied to adjust peak heights. Second, a probability

of the expected DNA peaks given the selected mass parameters is calculated by comparing them

to the observed peaks in the data. In addition to the mass parameters and genotype sets, STRmix

will also select variance values from the distributions determined from ModelMaker. This

comparison of expected to observed takes into account the selected allele and stutter variance

values. Third, the proposed set of variables are either accepted or rejected depending on whether

they are a good description or a poor description of the observed DNA profile data as compared

to another expected profile given a different set of mass variables and a genotype set that is

different at a single locus.

The first 100,000 iterations (termed burn-in) of the MCMC are dedicated to reaching an

equilibrium state where a smaller distribution of values for mass parameters and a more limited

number of genotype sets are being regularly chosen in accordance with how well they describe

the observed data. This prevents putting too much weight on the more random guess that occur

Page 2: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 2

in the beginning. During the post-burn-in, deconvolution is the process of creating the list of

genotype sets and assigning weights to each set that reflect how well they ‘fit’ the evidence

profile. If the proposed set of genotypes from each contributor is less likely to lead to the

observed evidence profile then that set will be given a low weighting (close to zero), and if the

proposed genotype set is more likely to lead to the observed evidence profile then that set will be

given a high weighting (close to 1).

Within STRmix, the typical use of MCMC is to ultimately provide weights for genotype sets that

might explain some evidence, given the biological model used to describe DNA profile behavior.

This process describes a fully continuous probabilistic genotyping approach to DNA profile

interpretation.

Figure 1 – A description of the MCMC process used by STRmix. Figure taken from the User’s manual (1).

The circles are modeled by STRmix, and the squares are input parameters by the user (See ModelMaker

Validation Study for details about settings input by the user).

Page 3: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 3

Starting with the process to build a profile with expected peak heights, the steps are as follows:

the input file provides STRmix with detected peaks and peak heights. The first thing it does is to

determine the total possible genotype combos (determined by the formula (n+1)2N, where n = the

number of alleles detected, and N = the number of contributors, input by user). It then removes

duplicates, and goes through and assesses overall peak heights to determine what must be allelic

and what could be stutter (based on the global stutter cutoff -a setting in STRmix - 30%), or

possible drop-in. STRmix then generates a list of genotype combinations that are possible, given

the input file, stutter and drop-in considerations, and the number of contributors input by the

analyst. This list is considerably smaller than what it started with, given these constraints.

A set of genotypes from the list of possible genotypes is randomly assigned. Test values for the

amount of DNA (tn), degradation (dn), and locus specific amplification efficiencies (Al) are

applied. These are known as the mass parameters. Replicates would also be taken into account at

this step, but they are not being assessed as part of the internal validation in this lab. STRmix

determines the total allelic product (TAP) by applying the test values for these mass parameters.

The mass parameters

The total allelic product (TAP) at a locus is equal to the locus specific amplification efficiency

multiplied by the template amount multiplied by the allele count multiplied by the degradation

equation. This is done for every contributor, for every allele detected, at every locus in the

sample. The following is the equation describing the TAP.

offsetmwtdl

ann

l

r

l

an

laneXtAT

Alr = Locus offset, or locus specific amplification efficiencies (LSAE); r = replicate factor (=1

when a single PCR reaction).

tn = template for DNA contributor n

Xlan = count of allele ‘a’ at locus ‘l’ in contributor ‘n’ (2 for homozygotes and 1 for

heterozygotes)

offsetmwtd lane

is the exponential formula that incorporates the base size of the allele.

dn = degradation slope for contributor n

mwt = molecular weight (nucleotide length)

offset = smallest size of a detected peak in the electropherogram

Degradation is dependent on fragment size, so as size increases the amount of degradation

increases. The degradation reported in the output file is a linear approximation of the exponential

curve that is the true degradation factor.

Page 4: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 4

Each of these mass parameters are selected from sliding windows of possible values, and they

are varied at each iteration.

Stutter

The total allelic product is now calculated. Stutter is then taken into account. Some of the total

allelic product expected from DNA amount and degradation will become stutter. STRmix does

this by apportioning total allelic product into allelic and stutter height, using the following

equations:

To get the allele and locus specific stutter, STRmix first looks to the Stutter Exceptions file. If

STRmix sees a non-zero value for stutter ratio for any allele at a locus it will use that as the

stutter ratio expectation. If, instead, a zero is encountered, it will look to the stutter file and use

the stutter ratio value obtained from the regression line for that locus and the observed allele (see

STRmix ModelMaker study for more details about the composition of these two files).

STRmix then calculates the probability of obtaining the expected profile, given the genotype set

and mass parameters if the values above were true.

Metropolis Hastings - probability of an expected profile

If the randomly chosen mass parameters are correct, then the any differences between the

calculated expected (based on the proposed mass parameters) and our observed (the data from

the 3500) are only based on random PCR/injection variation (essentially stochastic or sampling

variation). We have estimates of how much variation we can expect from ModelMaker: allele

and stutter variance parameters (c2 and k2), and LSAE variance. STRmix now compares this

expected profile to the observed profile and gives it a probability. Through modelling, it is

known that the log[observed /expected] has a normal distribution with a mean of 0 and a variance

that is inversely proportional to the expected peak height. There are three assumptions in this

calculation: 1) an approximate normal distribution with a mean of zero, 2) a variance of c2/Elan

for the allele model, 3) a variance of k2/Elan for the stutter model:

What these equations allow is to determine the likelihood for the observed locus given the

parameters chosen by the MCMC.

2( 1)

1

2

log ~ 0, for stutter

log ~ 0, for alleles

a

l l

ana n

a

l l

an an

O kN

E E

O cN

E E

Page 5: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 5

(or k2 instead of c2)

It is important to include Ea in the variance, because we know from the GlobalFiler validation

that the variance is dependent on peak height. For example, the peak height balance is more

variable for lower peak heights than for taller ones. The same is true with stutter. Dividing by Ea

means that we expect less spread (variation) for high RFU alleles (i.e., more template) and more

spread for lower RFU alleles. The probability of the observed peak given the expected peaks

(mass variables) is the “probability” from the normal distribution, centred on zero. Really these

probabilities are determined from the probability density function (curve). This can be done in

excel with the NORM.DIST(test value, average value, 0, standard deviation) function.

These probabilities are calculated for each allele. For the locus ‘l’ we can calculate the likelihood

by multiplying each individual allele likelihoods of the ‘a’ alleles. For the entire profile we can

calculate the likelihood by multiplying the likelihood of each of the ‘l’ loci:

(in the equation above, k is replaced with c2 or k2)

Ultimately, STRmix uses the log of the probability for each allele. This can be calculated in two

ways: take the log(probability) for each peak and then sum them, or multiply the probabilities

across loci and take a log of the product.

Before these probabilities are used for determining whether to accept or reject the current guess,

there are some penalties that are applied. The MCMC will not allow the parameters of allele,

stutter, and LSAE variance stray too far off from reasonable values. Penalties are built in when

random variables are unlikely. A drop-in penalty also occurs if one of the peaks in the profile is

considered to be drop-in in that iteration. Drop-in is considered a rare event and any

combinations that require drop-in for the combinations to occur will be penalized.

LSAE penalties: during the MCMC iterations, LSAE is selected from a distribution. This normal

distribution is centered on 0 with a variance based on the ModelMaker value of LSAE variance.

An LSAE penalty is added to the ~N(log(E),0,c2) for departures that are far off from 1. The

Pr(LSAE) = PLSAE = N(LSAE,0,√𝐿𝑆𝐴𝐸 𝑉𝑎𝑟𝑖𝑎𝑛𝑐𝑒); log(PLSAE) is added to the log(Ppeaks) for the

alleles and the stutter.

Allele and stutter variance penalties: The (stutter and allele) variance is randomly selected from

the gamma distribution during each iteration. The gamma distribution is determined by α and β

and determined during Model Maker. Any extreme values selected for the variance are

)|2Pr()|1Pr()|Pr( MlocusMlocusMprofile

log ;0,l

a

l ll a a a

O kN

E E

Page 6: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 6

(deviations away from the mode) are penalized. Since variance is modeled by a gamma

distribution, deviations that are too far off from the mode (maximum point) are penalized.

Penalty = ~Γ(iteration variance,α,β).

Drop-in penalties: The drop-in penalty is based on the probability density function of a particular

peak being drop-in.

Metropolis Hastings – accept or reject

Metropolis–Hastings algorithm is a Markov chain Monte Carlo method for obtaining a sequence

of random samples from a probability distribution for which direct sampling is difficult. The key

is that the accept/reject criterion gives a sample from the desired probability in the long run as

long as the accept/reject is proportional to the true probabilities. STRmix is comparing the

modelled (expected) profile to the observed profile, and determining if the model built in the

current iteration is better than the model it built in the previous iteration. The goodness of fit is

the heart of the acceptance/rejection step. This is something that can be assigned a numerical

value and can be calculated. STRmix treats each observed and expected peak height comparison

independently of the others. It combines Prob (Observed | Expected) for each peak for all loci

with a few penalties along the way.

If the proposed model is better than the current model then STRmix accepts it, and if the

proposed model is worse than the current model then STRmix accepts it only a certain

percentage of the time. This percentage is determined by the probability of the proposed divided

by the probability of the current guess. In other words: it is a comparison of where the model has

been compared to the new model.

M-Hnew / M-Hold ≥ 1 = take the step

M-Hnew / M-Hold < 1 = take the step in proportion to the ratio

Any proportion >0.5 means that it is more likely going to take that next step.

MCMC iterations

For each step of the MCMC chain, the mass parameters and a genotype set that differs at one

locus are independently chosen (component-wise MCMC). The MCMC is set of algorithms that

act like a calculator for solving very complex equations (those that would take too long to solve

using standard methods). Eventually the MCMC will reach equilibrium where: 1) DNA amount,

degradation, and locus specific amplification efficiency are stable; and 2) Limited number

genotypes are chosen in proportion to their probability. In STRmix the MCMC is ‘solving’ the

equation for genotype weights.

MCMC weightings

There are hundreds of thousands to billions of iterations before reaching the required number of

MCMC accepts (500,000 total accepts; 400,000 post burn-in). During that time STRmix may

spend multiple iterations on the same guess before moving to a better guess. The amount of

iterations STRmix spends on one guess will be proportional to how good a guess it is. STRmix

Page 7: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 7

turns this proportion into the weight of that guess. There is some variability associated with the

MCMC process, and this can be assessed. Each time a sample is run, STRmix gives a different

weighting. Run over and over, these different answers all cluster around each other and the

amount that they would vary is small in relation to the magnitude of the answer.

Burn-in

The MCMC starting point is random and therefore the MCMC chains will (likely) start in a very

bad sample space (bad guess). It takes some time for them to reach good sample space. Tallying

how much time the chain has spent on genotypes isn’t done until 100,000 accepts have been

reached. This “burn-in” period allows the chain to reach better samples space and keeps the bad

genotype combinations from being overrepresented.

Dropout

Dropout is defined as the absence of the observation of a peak above an analytical threshold

where one is expected. Dropout can be considered an extreme form of imbalance. Dropout is a

possibility if one or more of the contributors are providing low levels of DNA to the

amplification reaction. Dropout is designated as a Q allele within STRmix. There are few

important things to note about Q alleles as they are encountered in the MCMC process. Q alleles

within and between donors never sum their RFUs. For the purpose of profile modeling, Q alleles

are always treated as a new allele unlike all others, even though the size will overlap. Also,

Prob(O|E) calculated separately for every Q (drop-out) allele.

Diagnostic Tools

In the summary output of STRmix, there are numerous diagnostics that may indicate

that a deconvolution has not converged on the best sample space. Any of these, on their own do

not indicate a problem with the deconvolution, but can be helpful in identifying aspect of the

sample to go back and double check.

Gelman-Rubin convergence diagnostic: informs the user whether the MCMC analysis has likely

converged. STRmix uses multiple chains to carry out the MCMC analysis and ideally each chain

will be sampling in the same space after burn-in. If the chains spend their time in different spaces

then it is likely that the analysis has not run for long enough. Whether or not the chains have

spent time in the same space can be gauged by the within-chain and between-chain variances.

This diagnostic (GR), is a ratio of the stationary distribution and within-chain variances. For a

converged analysis the GR will be 1. It has been recommended that if the GR is above 1.2 then

there exists the possibility that the analysis hasn’t converged. We would suggest that if the GR

value is above 1.2 the results of the analysis be closely scrutinized. Running the analysis for a

larger number of iterations will likely reduce the GR in these instances to below 1.2.

Page 8: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 8

Effective sample size (ESS): the number of independent samples the MCMC has taken from the

posterior distribution of all parameters. A low ESS in relation to the total number of iterations

suggests that the MCMC has not moved very far with each step or has had a low acceptance rate.

An ESS of NaN indicates that there might be a problem with the input data. For example, a

stutter peak it expects to see is not present in the input data).

Average log(likelihood): this value shows the average log10(likelihood) for the entire post burn-

in MCMC. The larger this value the better STRmix has been able to describe the observed data.

A negative value suggests that STRmix has not been able to describe the data very well given the

information it has been provided.

Allele Variance and Stutter Variance constants: both of these values are the average value for

variance and stutter variance constants across the entire post burn-in MCMC analysis. These

values can be used as a guide as to the level of stochastic variation in peak heights that is present

in the profile. If the variance constant has increased markedly from the mode of the prior

distribution, then this may indicate that the DNA profile is sub-optimal or that the number of

contributors is incorrect. Used in conjunction with the average log10(likelihood), a large

variance or stutter variance constant can indicate poor PCR.

In summary, STRmix creates a list of genotype sets and assigns weights to each set that reflect

how well they ‘fit’ the evidence profile. If the proposed set of single sourced genotypes is

unlikely to lead to the observed evidence profile then that set will be given a low weighting

(close to zero), and if the proposed genotype set is likely to lead to the observed evidence profile

then that set will be given a high weighting (close to 1). STRmix uses information provided by

the user combined with optimized values for properties of the DNA profile being analyzed to

deconvolute a profile and calculate weights.

Purpose

Knowing that STRmix is a fully continuous probabilistic genotyping approach that incorporates

the biological model, the purpose of this study was to assess mixture deconvolution by the

MCMC process. Two different approaches were taken to assess the MCMC.

The first approach utilized one single source sample to examine the results in detail to the level

that they can be reproduced. The extended output provides the iteration-by-iteration detail. In

this extended output each of these results were examined: the genotypes and LSAE at each locus,

template and degradation for each contributor, locus amp probability, allele variance, allele

variance penalty, stutter variance, and stutter variance penalty.

The second approach utilized samples with DNA from more than one person. For mixtures, the

most straightforward way to do this was to use mixtures designed and created in the lab (“ground

Page 9: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 9

truth” mixtures), and compare the STRmix results to known genotype sets of the ground truth

mixtures.

The developmental validation studies for this software included (but not limited to) extensive

evaluation of:

Expected allele and stutter heights given mass parameters

Expected peak heights of drop-out alleles given mass parameters

Probabilities given expected and observed peak heights and varying analytical thresholds

Locus specific amplification efficiency calculations

Summation of probabilities for each allele in a locus and across a profile

Summation of probabilities across multiple replicate profiles

Informed priors on mixture proportion

LR values where there are no assumed contributors

LR values with varying theta values

LR values for propositions with assumed contributors

LR HPD interval values

Sampling from the Beta distributions for theta

Gaussian walk

Gelman-Rubin statistic, ESS, weight resampling

Drop-in function

Model maker

These are described in the manual and in multiple peer reviewed publications. The assessment of

the MCMC in this study is aimed at validating STRmix (v2.3.06) for mixture interpretation at the

SDPD. For the purposes of our laboratory, the MCMC process was assessed by evaluating the

genotype weights determined by STRmix deconvolution. This study included a very wide range

of mixture combinations and template amount so as to assess the MCMC in a variety of contexts

(i.e. in the presence of dropout, in balanced mixtures, and in both high and low template

samples).

Materials and Methods

Two, three, four, and five-person mixtures (a total of 186 mixtures) were created as part of the

GlobalFiler Mixture Study. These were mixtures designed for STRmix that had a range of

contributor compositions – from balanced mixtures to mixtures where there are one or two

contributors that are the source of most of the DNA in the mixture. There are also mixtures in

every set that have at least one contributor dropping out. These mixtures were created and

amplified with the original GlobalFiler master mix formulation. After these results were assessed

in STRmix, a reformulation of GlobalFiler was released. A subset of the mixtures were selected

for re-amplification with the new GlobalFiler product. See the ModelMaker Study for more

details about the new Allele, Stutter and LSAE variance parameters that were collected from the

Page 10: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 10

reformulated GlobalFiler master mix. After determining that new STRmix models were

necessary post-reformulation, a larger subset of the mixtures below were re-deconvoluted with

the new STRmix settings in-order to validate both the new amplification kit, and the new settings

for STRmix. See Table 1 for a list of all mixtures.

Table 1 – A list of mixtures broken down but input level and contributor ratio

All of these mixtures were evaluated assuming the number of contributors the mixture was

designed to have. All of the two and three person mixtures were evaluated extensively. A subset

of the 4 person mixtures were chosen for evaluation of deconvolution but these were studied

more extensively in the STRmix Comparison to Known Contributors study. The five person

mixtures haven’t been run due to a limit in java, unless conditioned on one of the balanced

contributors. The mixtures were assessed for the percent contribution of each contributor,

whether the correct genotypes included in the genotype probability distribution, whether correct

combination was in the top 99%, and whether the STRmix genotype possibilities were intuitive.

Page 11: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 11

Target percentage of contribution for each mixture, and manual calculation of contributor

percentage (see GlobalFiler mixture study for more details) was compared against STRmix

calculations for percent contributor.

Results and Discussion

Single Source MCMC reproducibility

The locus amp probability was reproduced by calculating the normal distribution of the log of

the LSAE for each locus in that iteration (centered around zero with a standard deviation of the

square root of the LSAE variance determined by ModelMaker; equation =

NORMDIST(LOG(LSAE for locus X),0,SQRT(0.01556),0)), and then taking the log of the

product of those normal distributions for every locus. In the single source sample, the locus

amplification probability for three different iterations was recorded from the STRmix extended

output. For iteration 0, it was 10.598; for iteration 28,716 it was 8.351; for iteration 78,292, it

was 8.756. Each of these values was reproduced in excel to at least the 9th decimal place using

the above formula.

The allele variance penalty was reproduced by taking the log of the gamma distribution of the

allele variance for that iteration using the α and β parameters (determined by ModelMaker;

6.6346 and 1.6553). Equation =LOG(GAMMADIST(allele variance, 6.6346, 1.6553, 0)). In the

single source sample, the allele variance penalty for three different iterations was recorded from

the STRmix extended output. For iteration 0, it was -0.9998; for iteration 28,716 it was -0.9998;

for iteration 78,292, it was -1.0369. Each of these values was reproduced in excel to at least the

9th decimal place using the above formula.

The stutter variance penalty was reproduced by taking the log of the gamma distribution of the

stutter variance for that iteration using the α and β parameters (determined by ModelMaker; 7.09

and 2.4927). Equation =LOG(GAMMADIST(stutter variance, 7.09, 2.4927, 0)). In the single

source sample, the stutter variance penalty for three different iterations was recorded from the

STRmix extended output. For iteration 0, it was -1.194; for iteration 28,716 it was -1.5835; for

iteration 78,292, it was -2.5416. Each of these values was reproduced in excel to at least the 9th

decimal place using the above formula.

MCMC performance on mixtures

Results for every two and three person mixture were carefully scrutinized. A subset of the 4

person mixtures were run, and results from these mixtures are described more fully in the

STRmix Comparison to Known Contributors Study.

For this study, 42 two-person mixtures were deconvoluted with the number of contributors set at

2. Originally, these mixtures were run with the Allele, Stutter, and LSAE variance parameters

determined from the original GlobalFiler master mix. Subsequently, a GlobalFiler reformulation

Page 12: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 12

was released and a small subset of these were re-amplified. There was essentially no change in

the way that mixtures were amplified with the reformulated GlobalFiler master mix. However,

new Allele, Stutter, and LSAE variance parameters were determined after the master mix

reformulation. A larger subset of these mixtures were re-deconvoluted with the new STRmix

model settings. The results in Table 2 summarize the most updated results of the deconvolution.

2 person mixtures:

Of the 42 two-person mixtures, 8 of them had alleles from at least one contributor dropping out.

Only one of these mixtures (2-29) had a diagnostic value that warranted a closer look. The

Gelman-Rubin Convergence number was 1.31. This mixture had three instances of allelic

dropout, but full assessment of this sample did not indicate any other problem. Each known

contributor’s genotype fell into the top 99% of weights in the Component Interpretation section,

and all genotypes and weights were intuitive for both contributors.

Only one of the mixtures (2-39) in which one of the contributors genotypes at one locus was not

in the top 99%. This is a low-level balanced mixture where each contributor was contributing

~50%. Upon further inspection of the genotypes, one of the contributors in this mixture had

types that completely dropped out at D5S818, and full dropout of that genotype was considered,

but only with a weighting of 0.40%, which did not make the top 99% cutoff that was investigated

in this study.

All other mixtures, even low level, balanced and imbalanced mixtures were deconvoluted by

STRmix in a way that was intuitive and genotypes from the known contributors fell in the top

99% of weights.

Page 13: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 13

Table 2 – Two person mixture deconvolution results – high level target input amount

Page 14: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 14

Table 2 – Two person mixture deconvolution results, continued – mid level target input amount

Page 15: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 15

Table 2 – Two person mixture deconvolution results, continued – low level target input amount

3 person mixtures:

Of the 67 three-person mixtures, 39 of them had alleles from at least one contributor dropping

out. All of the low level mixtures had alleles dropping out, and in one mixture, all of the alleles

from one of the contributors dropped out. This mixture set allowed us to test a wide range of

scenarios (Table 3).

The mixture ratios provided by STRmix were a lot more accurate than the manual calculation,

and reflected the target values in most mixtures (see Table 3). This is likely because STRmix is

able to use all the loci for this estimate, while for the manual calculation, we were limited only to

Page 16: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 16

a few loci where the genotypes had sufficient separation, and stutter was not taken into account

for the manual estimate of percent contribution from one contributor.

Seventeen of the 67 mixtures had a diagnostic value that warranted a closer look. Of these, all

had Gelman-Rubin convergence numbers greater than 1.2. In looking closer at these 17 mixtures,

11 of them had no problem with the deconvolution into the known contributors. Many of these

mixtures had two or more contributors that were very balanced, and that ambiguity can cause an

increase in the GR number. The other 6 with a GR number higher than 1.2 did have one or more

contributors whose known genotype at one locus did not make the list of the top 99%. Four of

these were 14% contributors (or less) with some dropout associated with them. There simply

wasn’t enough data to deconvolute that minor contributor. The other two contributors in each of

these mixtures were deconvoluted with no problems. The 5th of these had two balanced 10%

contributors where dropout was not adequately accounted for at one locus, and the final of these

6 mixtures with a GR number above 1.2 was a 3 person, low level, balanced mixture with

dropout where none of the contributors were accurately deconvoluted at one locus (due to

dropout not being accounted for highly enough.

Fourteen of the 67 mixtures had only one of three contributors with a known genotype weight of

less than 99% (from the Component Interpretation section of the STRmix results). Ten of these

14 contributors were contributing only 15% or less to the mixture, and there was dropout in all

but one of these. The other 4 were balanced contributors in mixtures with dropout (in one of

these 4, all of that contributors genotypes were dropped from the low level mixture, so it was not

surprising that their genotype was not deconvoluted with sufficient weight).

Three of the 67 mixtures had two of the three contributors that had a genotype at only one locus

not falling into the top 99% of weights. In two mixtures, both contributors were estimated to

provide only 10% & 11%, and 8% and 8%, respectively. The third was a very low level balanced

three person mixture with dropout.

Two of the 67 mixtures had problems with the genotypes of all three contributors. One of these

(3-48) had a deconvolution at TH01 that was not intuitive for the major contributor (70%), and

this resulted in the two known minor genotypes not falling in the top 99%. The other mixture (3-

64) was a low level, balanced mixture in which dropout was not being sufficiently accounted for

at only one locus.

The results presented above are consistent with the idea that the less someone’s DNA is

contributing to a mixture, the more unreliable the genotype weights are. That said, there were 51

contributors that were contributing 20% or less from a total of 201 different contributors making

up the three person mixtures. Only 16 of these 51 ≤ 20% contributors had genotypes that did not

fall into the top 99%. Likewise, 22 of these 201 contributors only contributed ≤ 10% to the

Page 17: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 17

mixture and only 8 of them had a problem with a genotypes not falling into the top 99%. So, just

because someone is only contributing a small fraction of total DNA to the mixture doesn’t make

their deconvolution results unusable, they should just be interpreted with more caution than more

robust contributors.

Table 3 – Three person mixture deconvolution results – high level target input amount

Page 18: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 18

Table 3 – Three person mixture deconvolution results, continued– mid level target input amount

Page 19: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 19

Table 3 – Three person mixture deconvolution results, continued– low level target input amount

Page 20: SDPD Crime Laboratory Forensic Biology Unit Validation · PDF fileSTRmix - MCMC Study Page 1 SDPD Crime Laboratory – Forensic Biology Unit Validation of the STRmixTM Software MCMC

STRmix - MCMC Study Page 20

Conclusion

The MCMC and Metropolis-Hastings are central processes to STRmix. This study was designed

to test STRmix in deconvolution of ground truth mixtures. The internal validation of this

software package was done by providing a wide range of mixture samples designed, amplified,

and electrophoresed in the SDPD crime lab following ModelMaker. Each of these mixtures was

examined in detail to record the known genotype weight of every contributor. Of the 296

contributors making up these 2 and 3 person mixtures, all but one of them had results in which

the known genotype was intuitive when the electropherogram was examined closely for peak

height balance, mixture ratio, and locus specific amplification efficiency. There were several

instances where the GlobalFiler amplification resulted in peak heights at one locus (often SE33)

that were inconsistent from other loci, but given the electropherogram, STRmix was able to

provide reliable, consistent, and robust results for the contributors to those mixtures. The less a

person contributes to the mixture (20% or less), the lower the genotype weight can be associated

with the mixture. Also, when two or more contributors in a mixture are balanced (contributing

equal amounts), there is more ambiguity in their possible genotype combinations. Finally, the

more dropout there is associated with a mixture, the more ambiguous the results can be. These

three principles are to be expected, and are things that have had to be accounted for in the past.

This study was effective for determining some of the interpretation limits within STRmix, which

are important to keep in mind as STRmix results are interpreted. Even with these limits in mind,

STRmix allows interpretation of many more contributors in many more mixtures than was

previously possible in the lab. The level of consistency that STRmix provides is very high, and is

one of the largest benefits in moving to probabilistic genotyping for interpretation of mixed DNA

results. Another benefit in using this software is that different weights are associated with each

genotype choice, and it is dependent on the observed electropherogram generated in the lab.

Having a number associated with a particular genotype allows a very precise calculation for a

likelihood ratio. This provides reliable results as well as clarity for evidence items examined in

the SDPD crime lab.

References

1. STRmix v2.3 Users Manual. Issued by Institute of Environmental Science and Research

Limited; Date of Issue: 20 January 2015