Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Cletus Kwa Kum

Mathematics/Computer Science Department, University of Dschang

Supervisors:
Professor Daniel Thorburn, Statistics Department, Stockholm University, Sweden
Professor Bitjong Ndombol, Maths/Computer Department, University of Dschang

June 14, 2016


Introduction

Malaria is an old disease that has been known for well over 100 years. It is mostly present in tropical countries.

In 2008, the WHO declared that malaria was endemic in 109 countries.

There were about 1 million deaths out of 243 million cases.

No vaccine exists, and the parasite is continuously developing resistance to many available drugs.

Mathematical modelling has to play a central role in the fight against the disease.


Overview of models and modelling in malaria

Malaria models

The application of mathematics to biology started in 1628, when Harvey calculated the amount of blood and proved its circular movement.

It was in 1910 that the first malaria model was proposed by Sir Ronald Ross.

Since then, malaria has been studied from different perspectives, and an immense literature exists describing a large number of models and modelling approaches.


Ongoing clinical Trials

Drug resistance and efficacy

Drug resistance is a major setback in all efforts to eliminate malaria.

Antimalarial drug efficacy clinical trials are on the increase.

One such trial was conducted in Tanzania in 2004 to compare the efficacy of two malaria treatments.

The treatments compared were artesunate plus sulfadoxine-pyrimethamine (ASP) and sulfadoxine-pyrimethamine alone (SP).

The duration was 84 days.


Data from the trial

The data collected carried information on single-nucleotide polymorphisms (SNPs) at three positions of the DNA.

These positions are believed to be related to drug resistance.

Mixed infections were present. In samples where alleles are mixed, they obscure the underlying frequencies of the alleles at each locus and the associations between loci.

Combinations of alleles at different markers along the same chromosome are known as haplotypes.

A major problem has been to determine accurately the number of infections in multiple infections.


Motivation and Objectives of this thesis

Motivation

Malaria has developed resistance to most single-drug treatments, e.g. chloroquine.

WHO recommends a switch to ACTs to reduce drug resistance.

Therefore, knowledge gained through mathematical models will contribute to the fight against drug resistance and malaria.

Main objective

To develop statistical methods that can be used for a better understanding of the epidemiological processes.

To apply these methods to real data on malaria and hence improve and add to scientific knowledge in this field.


Specific objectives

Estimate haplotype frequencies of multiple single-nucleotide polymorphisms (SNPs).

Unveil the hidden combinations of the possible haplotypes that gave rise to the mixed-phenotype infections observed in the genotyped data.

Study the effects of treatment on the proportion of resistant parasites.

Estimate the average number of different haplotypes a child is carrying, both at baseline and at first recurrence of malaria.

Compare the efficacy of ASP and SP using Bayesian methods.

Estimate differences in malaria-free periods for ASP and SP.

Investigate the effect of covariates on the probability of first recurrence of malaria.


Outline of the thesis

Chapter 2 presents a background to some statistical aspects.

Chapter 3 describes the clinical study and data, which are the foundation of this research.

Chapter 4 focuses on the effects of treatment on parasite drug resistance, the estimation of haplotype frequencies and the expected number of times a child is infected with different gene types.

Chapter 5 presents the comparison of the efficacy of the two treatments and the estimation of malaria-free times.

In Chapter 6, we study the effects of some covariates on the probability of first recurrence of malaria.

Chapter 7 contains conclusions and future work.


Chapter 2

Key ideas of Chapter 2

Models and parameter estimation using MLE methods

Model selection and deviance

Difference between classical statistics and Bayesian statistics

Numerical methods: Nelder–Mead simplex method, Markov chain Monte Carlo method and Gibbs sampling

Binary models and related statistical measures: cure rates, odds ratio and logistic regression

Event history analysis: survival analysis, discrete-time hazard, restricted mean survival time and estimator of the survival function.


Maximum likelihood estimator

The maximum likelihood estimator is defined as

\[ \hat{\theta}_{\mathrm{mle}} \overset{\mathrm{def}}{=} \arg\max_{\theta \in \Theta} L(\theta) \qquad (1) \]

In maximum likelihood estimation, the best parameter estimates are those that maximize the likelihood.

When the maximiser of L(θ) has no closed-form solution, we use computational methods to obtain estimates for the parameters.


Model selection

Model selection means choosing a statistical model from a set of competing models for a given data set.

Two criteria that are used to make the choice are parsimony and goodness of fit.

The deviance assesses the goodness of fit of the model by looking at the difference between the log-likelihood functions of the saturated model and the reduced model.

Classical statistics

The classical approach is the most widely used of the statistical paradigms.

Its foundation is the repeatability of events: e.g. the probability of an event is the proportion of times it occurs out of a large number of independent repeated experiments.


Bayesian Statistics

The Bayesian paradigm is based on specifying a probability model for the observed data D, given a vector of unknown parameters θ, leading to the likelihood function L(θ|D).

We assume θ is random and has a prior distribution π(θ).

Inference concerning θ is then based on the posterior distribution given by

\[ \pi(\theta \mid D) = \frac{L(\theta \mid D)\,\pi(\theta)}{\int_{\Theta} L(\theta \mid D)\,\pi(\theta)\,d\theta}, \qquad (2) \]

where Θ denotes the parameter space of θ and θ has an absolutely continuous distribution.
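As an illustration of equation (2), the sketch below approximates a posterior on a grid for a single binomial proportion. It is not taken from the thesis: the flat prior, the data (7 cures among 10 patients) and the function names are all assumptions made for the example. The denominator of (2) is approximated with the trapezoidal rule.

```python
# Grid approximation of equation (2) for a binomial likelihood.
# Assumptions (not from the source): a flat prior on [0, 1] and
# illustrative data of 7 cures among 10 patients.

def grid_posterior(successes, trials, prior, n_grid=1001):
    """Return grid points and normalised posterior density values."""
    thetas = [i / (n_grid - 1) for i in range(n_grid)]
    # Unnormalised posterior: likelihood times prior.
    unnorm = [
        (t ** successes) * ((1 - t) ** (trials - successes)) * prior(t)
        for t in thetas
    ]
    # Trapezoidal rule for the integral in the denominator of (2).
    step = thetas[1] - thetas[0]
    norm = step * (sum(unnorm) - 0.5 * (unnorm[0] + unnorm[-1]))
    return thetas, [u / norm for u in unnorm]


def posterior_mean(thetas, density):
    """Trapezoidal approximation of the posterior mean."""
    step = thetas[1] - thetas[0]
    vals = [t * d for t, d in zip(thetas, density)]
    return step * (sum(vals) - 0.5 * (vals[0] + vals[-1]))


thetas, density = grid_posterior(7, 10, prior=lambda t: 1.0)
mean_theta = posterior_mean(thetas, density)
print(round(mean_theta, 3))
```

With a flat prior the exact posterior is Beta(8, 4), whose mean is 8/12, so the grid result can be checked against that value.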


Simple example in medicine

Before giving a good diagnosis, a physician wants to know the symptoms and the family history of cancer. This initial information forms the prior, which is updated when test results arrive, giving a posterior (the diagnosis).

Under uncertainty, further tests can be made to update the old posterior.

The father of Bayesian statistics is Reverend Thomas Bayes (1702–1761).

Bayesian statistics is popular today because of the availability of fast computational methods (e.g. MCMC).

Numerical methods


Optimization method used in the thesis

We have formulated a maximum likelihood function for the estimation of haplotype frequencies in this thesis, and the function is non-smooth.

We rely on a direct search method for estimation (specifically the Nelder–Mead simplex method, 1965).

The method is quite popular. About 19,721 published papers and books had made reference to the Nelder–Mead method as of May 2016.

The Nelder–Mead simplex algorithm is available in R through the function optim(par, fn, method = "Nelder-Mead"), and it can also be applied to stochastic functions.
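To show what a derivative-free simplex search does, here is a compact textbook variant of Nelder–Mead in plain Python. It is a sketch only, not the thesis code and not R's implementation: the coefficients (reflection 1, expansion 2, contraction 0.5, shrink 0.5), the stopping rule and the smooth quadratic test function are assumptions chosen for the example.

```python
def nelder_mead(f, x0, step=0.5, tol=1e-8, max_iter=5000):
    """Minimise f with a basic Nelder-Mead simplex search (no derivatives)."""
    n = len(x0)
    # Initial simplex: x0 plus one vertex displaced along each axis.
    simplex = [list(x0)]
    for i in range(n):
        vertex = list(x0)
        vertex[i] += step
        simplex.append(vertex)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        # Stop once every vertex lies within tol of the best one.
        if max(abs(a - b) for v in simplex[1:] for a, b in zip(v, best)) < tol:
            break
        # Centroid of all vertices except the worst.
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]

        def along(coef):
            # Point centroid + coef * (centroid - worst).
            return [c + coef * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if f(reflected) < f(best):
            expanded = along(2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            # Outside contraction if the reflection beat the worst vertex,
            # inside contraction otherwise.
            contracted = along(0.5) if f(reflected) < f(worst) else along(-0.5)
            if f(contracted) < min(f(reflected), f(worst)):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway towards the best one.
                simplex = [best] + [
                    [b + 0.5 * (a - b) for a, b in zip(v, best)]
                    for v in simplex[1:]
                ]
    return min(simplex, key=f)


def objective(v):
    # Hypothetical test function with its minimum at (1, -2).
    x, y = v
    return (x - 1.0) ** 2 + (y + 2.0) ** 2


solution = nelder_mead(objective, [0.0, 0.0])
print(solution)
```

To maximise a likelihood with such a routine, one would pass the negative log-likelihood as `f`.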


Markov Chain Monte Carlo (MCMC) techniques and other computational aspects

Markov chain Monte Carlo (MCMC) methods enable the drawing of samples from the joint posterior distribution.

One of the most widely used MCMC techniques is Gibbs sampling.

The idea in Gibbs sampling is to generate posterior samples by sweeping through each variable and sampling from its conditional distribution, with the remaining variables fixed at their current values.
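The sweep described above can be illustrated on a toy target that is not from the thesis: a standard bivariate normal distribution with correlation rho, whose two full conditionals are known normal distributions. The target, the seed and the burn-in length are assumptions made for this sketch.

```python
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=1):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    sd = (1.0 - rho ** 2) ** 0.5  # conditional standard deviation
    x, y = 0.0, 0.0
    draws = []
    for i in range(burn_in + n_samples):
        # Sweep: update x given the current y, then y given the new x.
        # Each full conditional is normal, e.g. x | y ~ N(rho * y, 1 - rho^2).
        x = rng.gauss(rho * y, sd)
        y = rng.gauss(rho * x, sd)
        if i >= burn_in:
            draws.append((x, y))
    return draws


draws = gibbs_bivariate_normal(rho=0.6, n_samples=20000)
n = len(draws)
mean_x = sum(x for x, _ in draws) / n
mean_y = sum(y for _, y in draws) / n
cov = sum((x - mean_x) * (y - mean_y) for x, y in draws) / n
var_x = sum((x - mean_x) ** 2 for x, _ in draws) / n
var_y = sum((y - mean_y) ** 2 for _, y in draws) / n
corr = cov / (var_x * var_y) ** 0.5
print(round(mean_x, 2), round(mean_y, 2), round(corr, 2))
```

The sample means should be close to 0 and the sample correlation close to the chosen rho, which gives a simple check that the chain targets the intended distribution.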


Event history analysis

Survival analysis

Survival analysis is a set of methods for analyzing data where the outcome variable is the time until the occurrence of an event of interest. The event can be death, occurrence or recurrence of a disease, marriage or divorce.

The time to event, or survival time, can be measured in days, weeks or years.

In the thesis, the event of interest is the first recurrence of malaria since the start of treatment.

The survival time is the time in days until a child tests positive for malaria.

Observations are called censored when the information about their survival time is not complete.


Event history analysis

Restricted mean survival time

The restricted mean survival time (RMST) is the expected survival time within a fixed follow-up interval.

It was first proposed by Irwin [1], since the mean survival time is not estimable in the presence of censoring.

The mean survival time can be calculated mathematically as the total area between a survival function and the x-axis; the RMST is the corresponding area up to the end of the follow-up interval.
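The area interpretation can be made concrete with a short sketch. The step-shaped survival curve, the horizon tau = 84 days and the function name below are hypothetical; they are not estimates from the trial.

```python
def restricted_mean_survival_time(times, survival, tau):
    """Area under a right-continuous step survival curve on [0, tau].

    times    -- increasing times at which the curve drops
    survival -- value of S(t) from each of those times onwards
    """
    area, previous_time, current_s = 0.0, 0.0, 1.0
    for t, s in zip(times, survival):
        if t >= tau:
            break
        # Rectangle between the previous drop and this one.
        area += current_s * (t - previous_time)
        previous_time, current_s = t, s
    # Last rectangle, from the final drop before tau up to tau.
    return area + current_s * (tau - previous_time)


# Hypothetical curve: S drops to 0.9 at day 7, 0.7 at day 21, 0.5 at day 42.
rmst = restricted_mean_survival_time([7, 21, 42], [0.9, 0.7, 0.5], tau=84)
print(rmst)
```

For this curve the area is 1.0·7 + 0.9·14 + 0.7·21 + 0.5·42 = 55.3 days.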


Chapter 3: Medical Background and Description of Data

Conduct of the Clinical Trial

The study was undertaken in Tanzania, at Uzini and Konde, with 206 and 178 uncomplicated malaria patients, respectively.

The ethical requirements stipulated in the Declaration of Helsinki were respected.

The conduct of the trial was supervised by the Karolinska Institute, Sweden.

On day 0, the patients were tested for parasites, and they were retested on days 7, 21, 28, 42, 56 and 84.


Description of Data

Genotype malaria

PCR techniques were used to differentiate recrudescent parasitaemia from new infections (at baseline and at recurrence of malaria).

The parasites were analysed, and the single-nucleotide polymorphisms (SNPs) at three positions in the pdhfr gene were determined.

The three positions (pdhfr 51, 59 and 108) could each be defined as either resistant (R) or sensitive (S).

If parasites with both R and S SNPs were present, this was denoted by the letter M.


Description of Data continued

Genotype malaria

Each blood sample was classified by the pdhfr status of its parasites, characterized by three letters from SSS to MMM, denoting the status of each of the individual SNPs.

RSM means that the child had only parasites with resistant SNPs at the first position and only sensitive SNPs at the second position, but that there were parasites with both resistant and sensitive SNPs at the third position.

To avoid the ambiguity in differentiating between malaria due to reinfection and malaria due to recrudescence, any first recurrence of malaria within the study period is referred to in this thesis simply as a first recurrence of malaria.

Table 1 is an extract of such data.


Data for Chapter 4

Motivation

Table 1: Genotype number at baseline and at first recurrence

            KONDE                   UZINI
Genotype    Baseline  Recurrence    Baseline  Recurrence
RRR         36        21            64        53
RRS         9         1             39        15
RSR         14        5             13        9
SRR         0         0             0         1
RSS         0         0             2         1
SRS         0         0             1         1
SSR         0         0             0         1
SSS         3         0             24        6
RRM         19        1             8         6
...         ...       ...           ...       ...
MMM         20        0             9         1


Data for Chapter 5

Cured Data

Table 2: Number of patients cured of malaria and those at start (in parentheses)

Location  Drug  (0–42]     (0–84]
KONDE     ASP   86 (90)    34 (90)
KONDE     SP    43 (86)    29 (86)
UZINI     ASP   63 (94)    56 (94)
UZINI     SP    63 (110)   57 (110)


Data for Chapter 5 contd

Recurrence infection

Table 3: Number with first recurrence of malaria and those at risk (in parentheses)

Location  Drug  (0–7]     (7–21]   (21–28]  (28–42]  (42–56]  (56–84]
KONDE     ASP   3 (90)    8 (87)   21 (79)  16 (58)  7 (42)   3 (35)
KONDE     SP    7 (86)    17 (79)  10 (62)  9 (52)   10 (43)  4 (33)
UZINI     ASP   4 (94)    13 (90)  9 (77)   8 (68)   9 (60)   1 (51)
UZINI     SP    15 (110)  14 (95)  13 (81)  6 (68)   2 (62)   3 (60)
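Counts of this form lend themselves to a discrete-time (life-table) calculation: the hazard in an interval is the number of first recurrences divided by the number at risk, and the survival function is the running product of one minus the hazards. The sketch below applies this to the Konde ASP row of Table 3. It only illustrates the discrete-time hazard idea and is not necessarily the estimator used in the thesis; the function name is an assumption.

```python
# Konde, ASP arm of Table 3: first recurrences and numbers at risk per interval.
interval_ends = [7, 21, 28, 42, 56, 84]
events = [3, 8, 21, 16, 7, 3]
at_risk = [90, 87, 79, 58, 42, 35]


def life_table(events, at_risk):
    """Discrete-time hazards and the implied malaria-free survival function."""
    hazards = [d / n for d, n in zip(events, at_risk)]
    survival, s = [], 1.0
    for h in hazards:
        s *= 1.0 - h  # probability of staying malaria-free through the interval
        survival.append(s)
    return hazards, survival


hazards, survival = life_table(events, at_risk)
for day, h, s in zip(interval_ends, hazards, survival):
    print(f"day {day:2d}: hazard {h:.3f}, malaria-free {s:.3f}")
```

Because the numbers at risk in this row decrease exactly by the number of events, the product telescopes and the final survival value equals 32/90.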


Chapter 4: Estimating Haplotype Frequencies and the Effects of Treatment

Motivation

The problem of estimating the proportions of different haplotypes from blood samples with multiple genes has usually been approached using simple methods, such as neglecting all mixed infections with multiple genotypes or counting the multiple genes as resistant.

Wigger et al. (2013) use MCMC methods in which one step simulates the true state. Hastings and Smith (2008) present a computer programme for the calculations.


Unveiling haplotypes in mixed infections

Example: Hidden combination

The classification MRM can arise from RRR+SRS or from RRS+SRR. M indicates the presence of both R and S.

Other combinations include RRR+RRS+SRR and RRR+RRS+SRS.
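The hidden combinations behind an observed classification can be listed by brute force over all non-empty subsets of the eight haplotypes. The following sketch does this; the function names are ours, not the thesis's.

```python
from itertools import combinations, product

# The eight haplotypes RRR, RRS, ..., SSS.
HAPLOTYPES = ["".join(h) for h in product("RS", repeat=3)]


def classify(haplotypes):
    """Observed three-letter classification of a set of haplotypes."""
    letters = []
    for position in range(3):
        alleles = {h[position] for h in haplotypes}
        # Both R and S present at this position gives M.
        letters.append("M" if len(alleles) == 2 else alleles.pop())
    return "".join(letters)


def hidden_combinations(observed):
    """All non-empty haplotype sets that would be recorded as `observed`."""
    return [
        combo
        for size in range(1, len(HAPLOTYPES) + 1)
        for combo in combinations(HAPLOTYPES, size)
        if classify(combo) == observed
    ]


for combo in hidden_combinations("MRM"):
    print("+".join(combo))
print(len(hidden_combinations("MRM")), len(hidden_combinations("MMM")))
```

The enumeration gives 7 combinations for MRM (the two pairs above, four triples and one quadruple) and 193 for MMM.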


Hypothesis and methodology in this Chapter

Hypothesis: there are relatively more resistant SNPs at the first recurrence of malaria, if we correct for the fact that the children at baseline were infected by multiple types of malaria parasites simultaneously.

We derive models at baseline, at recurrence, and at both time points (page 102).


One time–point probability models

Saturated model

There were 27 different possible genotypes at baseline giving malaria (see page 44) and 28 at the first reappearance of malaria.

We represent the 27 genotypes by IJK, where I = M, R or S, J = M, R or S and K = M, R or S.

Let n_{IJK} be the corresponding number of patients with this infection.

Then the probability of an infection IJK can be estimated by the corresponding relative frequency, that is,

\[ \pi^{*}_{IJK} = \frac{n_{IJK}}{N}, \qquad (3) \]

where N is the total number of observed patients with infection.


Haplotypes and observations with multiple genes

Model derivation

The parasites can be classified into eight haplotypes.

XYZ ∈ {RRR, RRS, RSR, SRR, RSS, SRS, SSR, SSS}.

We use the letters IJK to classify observations into R, S or M, but XYZ when classifying parasites with only R or S.

Let p_{XYZ} = 1 − q_{XYZ} be the probability of a susceptible child being infected by type XYZ.

Assuming that these eight haplotypes infect children independently, the probability that a child stays healthy is

\[ \pi(H) = \prod_{XYZ}^{8} q_{XYZ}. \qquad (4) \]


Model derivation cont’d

Probability of each infection

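The probability of a single infection, e.g. an RRR infection, is

\[ \pi_{RRR} = \left( \frac{p_{RRR}}{q_{RRR}} \right) \prod_{XYZ} q_{XYZ}. \qquad (5) \]

Likewise, the probability of the infection MRR is

\[ \pi_{MRR} = \left( \frac{p_{RRR}}{q_{RRR}} \times \frac{p_{SRR}}{q_{SRR}} \right) \prod_{XYZ} q_{XYZ}. \qquad (6) \]

MRR corresponds to an infection with both RRR and SRR and no other.

There are 12 classifications with one M and 6 classifications with two Ms, and the classification MMM, with three Ms, can arise from 193 different haplotype combinations.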


Model derivation cont’d

Probability of MMM infection

The probability of MMM can be obtained by summing 193 terms, but it is probably simpler to subtract the sum of all the other probabilities from 1:

\[ \pi_{MMM} = 1 - \sum_{IJK} \pi_{IJK}, \quad \text{where } IJK \neq MMM. \qquad (7) \]

Our interest is in the observed proportions conditional on the event of having malaria. Thus the probability of an observation classified as IJK is

\[ \pi'_{IJK} = \frac{\pi_{IJK}}{1 - \prod_{XYZ} q_{XYZ}}, \qquad (8) \]

where I = M, R or S, J = M, R or S and K = M, R or S.


One time–point probability models

Our model

Using equation (8) and the fact that the number of observed types follows a multinomial distribution, we get that the likelihood function is

\[ L(p) = \prod_{IJK}^{27} \left( \frac{\pi_{IJK}}{1 - \prod_{XYZ}^{8} q_{XYZ}} \right)^{n_{IJK}}. \qquad (9) \]
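Equations (4) to (9) can be evaluated directly by enumerating the haplotype sets behind each classification. The sketch below is one way to do so and is not the thesis programme; the function names are ours. For illustration it uses the rounded Konde baseline values of p*_XYZ from Table 4 and only the Konde baseline counts visible in the extract in Table 1, so the resulting log-likelihood is for that partial extract only.

```python
import math
from itertools import combinations, product

HAPLOTYPES = ["".join(h) for h in product("RS", repeat=3)]


def classify(haplotypes):
    """Observed three-letter classification (R, S or M at each position)."""
    letters = []
    for position in range(3):
        alleles = {h[position] for h in haplotypes}
        letters.append("M" if len(alleles) == 2 else alleles.pop())
    return "".join(letters)


def classification_probabilities(p):
    """Conditional probabilities of equation (8), given p[XYZ]."""
    healthy = math.prod(1.0 - p[h] for h in HAPLOTYPES)  # equation (4)
    pi = {}
    for size in range(1, len(HAPLOTYPES) + 1):
        for combo in combinations(HAPLOTYPES, size):
            # Present haplotypes contribute p, absent ones q; this equals
            # the product of p/q ratios times prod(q) in (5) and (6).
            prob = math.prod(
                p[h] if h in combo else 1.0 - p[h] for h in HAPLOTYPES
            )
            key = classify(combo)
            pi[key] = pi.get(key, 0.0) + prob
    return {k: v / (1.0 - healthy) for k, v in pi.items()}


def log_likelihood(p, counts):
    """Logarithm of equation (9) for observed counts n_IJK."""
    pi = classification_probabilities(p)
    return sum(n * math.log(pi[k]) for k, n in counts.items() if n > 0)


# Rounded Konde baseline estimates from Table 4.
p_konde = {"RRR": 0.61, "RRS": 0.32, "RSR": 0.32, "SRR": 0.03,
           "RSS": 0.03, "SRS": 0.02, "SSR": 0.01, "SSS": 0.16}
# Konde baseline counts shown in the extract of Table 1 (incomplete).
counts = {"RRR": 36, "RRS": 9, "RSR": 14, "SSS": 3, "RRM": 19, "MMM": 20}

pi = classification_probabilities(p_konde)
print(len(pi), round(sum(pi.values()), 6))
print(round(log_likelihood(p_konde, counts), 2))
```

A function like `log_likelihood`, negated, is what a direct search method would minimise over the eight probabilities.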


One time–point probability models

Estimation of the model

We maximise this function to obtain estimates of the probabilities of XYZ infections by the maximum likelihood method. A programme was written, and the optimisation technique used was the Nelder–Mead method [4] in R.

This technique does not require derivatives, which makes it suitable for the optimisation of non-smooth functions. It often shows rapid improvement within a relatively small number of iterations.

The optimisation procedure produced maximum likelihood (ML) estimates for p_{XYZ}, which are presented in Tables 4 and 5 for Konde and Uzini.


Haplotype estimation

Estimates from model

Table 4: Estimated haplotype frequencies: No distinction between treatments

            KONDE                                UZINI
            Baseline        First Recurrence     Baseline        First Recurrence
Haplotype   p*_XYZ  f*_XYZ  p*_XYZ  f*_XYZ       p*_XYZ  f*_XYZ  p*_XYZ  f*_XYZ
RRR         0.61    0.47    0.43    0.70         0.29    0.42    0.29    0.55
RRS         0.32    0.20    0.06    0.07         0.17    0.23    0.09    0.16
RSR         0.32    0.19    0.14    0.19         0.12    0.16    0.09    0.14
SRR         0.03    0.01    0.00    0.00         0.00    0.00    0.02    0.03
RSS         0.03    0.02    0.03    0.03         0.01    0.01    0.02    0.03
SRS         0.02    0.01    0.00    0.00         0.02    0.02    0.01    0.02
SSR         0.01    0.003   0.00    0.00         0.003   0.004   0.005   0.01
SSS         0.16    0.09    0.00    0.00         0.12    0.15    0.04    0.07


Haplotype estimation

Further estimates from model

Table 5: Estimated haplotype frequencies: Distinction between treatments

KONDE
            ASP                                  SP
            Baseline        First Recurrence     Baseline        First Recurrence
Haplotype   p*_XYZ  f*_XYZ  p*_XYZ  f*_XYZ       p*_XYZ  f*_XYZ  p*_XYZ  f*_XYZ
RRR         0.584   0.437   0.385   0.465        0.630   0.511   0.591   0.836
RRS         0.355   0.219   0.073   0.072        0.290   0.176   0.065   0.063
RSR         0.304   0.180   0.332   0.385        0.326   0.203   0.102   0.101
SRR         0.028   0.014   0.000   0.000        0.027   0.014   0.000   0.000
RSS         0.041   0.021   0.078   0.077        0.026   0.013   0.000   0.000
SRS         0.045   0.023   0.000   0.000        0.000   0.000   0.000   0.000
SSR         0.008   0.004   0.000   0.000        0.007   0.004   0.000   0.000
SSS         0.187   0.103   0.000   0.000        0.142   0.079   0.000   0.000

UZINI
            ASP                                  SP
            Baseline        First Recurrence     Baseline        First Recurrence
Haplotype   p*_XYZ  f*_XYZ  p*_XYZ  f*_XYZ       p*_XYZ  f*_XYZ  p*_XYZ  f*_XYZ
RRR         0.278   0.381   0.165   0.436        0.301   0.450   0.424   0.705
RRS         0.217   0.287   0.088   0.223        0.147   0.200   0.092   0.013
RSR         0.112   0.147   0.050   0.123        0.119   0.160   0.120   0.163
SRR         0.000   0.000   0.008   0.019        0.000   0.000   0.028   0.036
RSS         0.009   0.011   0.023   0.056        0.010   0.013   0.009   0.011
SRS         0.014   0.016   0.000   0.000        0.018   0.023   0.028   0.036
SSR         0.000   0.000   0.009   0.022        0.006   0.008   0.000   0.000
SSS         0.126   0.158   0.049   0.121        0.110   0.147   0.028   0.036


Average number of infections per child

Average number assuming Poisson model

If we assume that the number of times a child is infected follows a Poisson distribution, then Pr(0) = exp(−λ), where λ is the parameter, i.e. the mean.

p_{XYZ} is the probability of getting infected at least once with haplotype XYZ.

Since we assume a Poisson distribution, λ_{XYZ} = −ln(q_{XYZ}) equals the expected number of times a person is infected with XYZ.

λ_{XYZ}/(1 − q_{XYZ}) = E(number of times a person is infected with XYZ, given that he is infected at least once).

The expected number of infection times can be estimated by

\[ -\sum_{XYZ} \ln(q_{XYZ}) \Big/ \left( 1 - \prod_{XYZ}^{8} q_{XYZ} \right). \qquad (10) \]

These figures can be found in Tables 6 and 7.
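As a numerical check, equation (10) can be evaluated with the rounded Konde estimates of p*_XYZ from Table 4. The sketch also computes the sum of the p_XYZ divided by the probability of being infected; reading this quantity as the "average number of haplotypes" is our assumption, since the slides do not spell out that formula. The function name is hypothetical.

```python
import math

# Rounded Konde estimates p*_XYZ from Table 4, in the order
# RRR, RRS, RSR, SRR, RSS, SRS, SSR, SSS.
konde_baseline = [0.61, 0.32, 0.32, 0.03, 0.03, 0.02, 0.01, 0.16]
konde_recurrence = [0.43, 0.06, 0.14, 0.00, 0.03, 0.00, 0.00, 0.00]


def infection_summaries(p):
    """Mean number of haplotypes and of infections, given malaria."""
    infected = 1.0 - math.prod(1.0 - v for v in p)   # 1 - prod(q_XYZ)
    lambdas = [-math.log(1.0 - v) for v in p]        # lambda = -ln(q)
    mean_haplotypes = sum(p) / infected              # assumed reading
    mean_infections = sum(lambdas) / infected        # equation (10)
    return mean_haplotypes, mean_infections


baseline = infection_summaries(konde_baseline)
recurrence = infection_summaries(konde_recurrence)
print([round(v, 2) for v in baseline])
print([round(v, 2) for v in recurrence])
```

Up to the rounding of the inputs, the results agree with the Konde columns of Table 6 (1.74 and 2.29 at baseline, 1.19 and 1.46 at first recurrence).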


Average number of infections per child

Average estimates

Table 6: Average number of haplotypes and expected number of times patients were infected: No distinction between ASP and SP

                                 KONDE                        UZINI
                                 Baseline  First Recurrence   Baseline  First Recurrence
Average no. of haplotypes        1.74      1.19               1.31      1.21
Expected no. of infection times  2.29      1.46               1.46      1.35


Average number of infections per child

Average estimates

Table 7: Average number of haplotypes and expected number of times patients were infected: Distinction between ASP and SP

KONDE
                                 ASP                          SP
                                 Baseline  First Recurrence   Baseline  First Recurrence
Average no. of haplotypes        1.79      1.34               1.69      1.15
Expected no. of infection times  2.31      1.61               2.27      1.63

UZINI
                                 ASP                          SP
                                 Baseline  First Recurrence   Baseline  First Recurrence
Average no. of haplotypes        1.33      1.15               1.30      1.25
Expected no. of infection times  1.49      1.22               1.47      1.50


A model assuming independence between gene positions: M1

Is the occurrence of an R or an S independent at all positions?

\[ f_{XYZ} = f_{X..} \times f_{.Y.} \times f_{..Z} = \lambda_{X..}\,\lambda_{.Y.}\,\lambda_{..Z} \Big/ \sum_{UVW}^{8} \lambda_{UVW}, \qquad (11) \]

Since p_{XYZ} is fairly small, λ_{XYZ} = −ln(1 − p_{XYZ}) ≈ p_{XYZ}. Thus factorizing f_{XYZ} is approximately the same as factorizing p_{XYZ}, but we may include a normalizing constant, α. Thus, we tested the model

\[ p_{XYZ} = \alpha \times p_{X..} \times p_{.Y.} \times p_{..Z}. \qquad (12) \]


Two time–points models: Baseline and first recurrence time points.

Is there a relation between the baseline and first recurrence time?

The starting point is the model M2, with 16 parameters. Denoting the parameters at baseline with an extra index b, and at first recurrence of malaria with an extra index r,

\[ \begin{cases} p^{b}_{XYZ} = 1 - \exp(-\lambda^{b}_{XYZ}) \\ p^{r}_{XYZ} = 1 - \exp(-\lambda^{r}_{XYZ}) \end{cases} \quad : \text{M2} \qquad (13) \]


Two time–points models: Baseline and first recurrence time points.

A model with varying amount of infection: M3

At baseline there are more infections than at recurrence, due to longer exposure to different types.

We consider a simple model in which the only difference between baseline and the first recurrence is that the total amount of infection t is smaller at the first recurrence of malaria.

\[ p^{r}_{XYZ} = 1 - \exp(-t \times \lambda^{b}_{XYZ}). \qquad (14) \]

Testing against the model M2, the increase in 2 × log-likelihood ratio is 16.5 for Konde and 20.2 for Uzini.

With 7 degrees of freedom, we must reject this model. We thus look at a model where the decrease is different for different genes.
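The rejection can be checked against the chi-square distribution with 7 degrees of freedom (16 parameters in M2 against 9 in M3). The sketch below computes the upper tail probability with the standard recursion over degrees of freedom, using only the Python standard library. The function name and the 5% level are assumptions made for the example; the thesis does not state which level was used.

```python
import math


def chi2_upper_tail(x, df):
    """P(X > x) for a chi-square variable with integer df >= 1."""
    # Start from the closed forms for 1 or 2 degrees of freedom ...
    k = 1 if df % 2 else 2
    tail = math.erfc(math.sqrt(x / 2.0)) if k == 1 else math.exp(-x / 2.0)
    # ... and add one term for every further two degrees of freedom:
    # Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    while k < df:
        tail += (x / 2.0) ** (k / 2.0) * math.exp(-x / 2.0) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return tail


p_konde = chi2_upper_tail(16.5, 7)
p_uzini = chi2_upper_tail(20.2, 7)
print(round(p_konde, 3), round(p_uzini, 3))
```

Both tail probabilities fall below 0.05, in line with the decision to reject M3 at both health centres.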


Two time–points models: Baseline and first recurrence time points.

A model with varying amounts of infection and differences between gene positions

The eight types of infection do not seem to decrease equally.

We try to model the differences with fewer than eight parameters.

We consider two models, one additive and one multiplicative.

These models describe the hypothesis that the proportion of sensitive genes decreases between baseline and the first recurrence of malaria.


Two time–points models: Baseline and first recurrence time points.

Multiplicative model M4 and additive model M5

Multiplicative model:

\[ p^{r}_{XYZ} = 1 - \exp\!\left(-t\,(\alpha_{1}^{I_1} \times \alpha_{2}^{I_2} \times \alpha_{3}^{I_3} \times \lambda^{b}_{XYZ})\right) \quad : \text{M4} \qquad (15) \]

Additive model:

\[ p^{r}_{XYZ} = 1 - \exp\!\left(-t\,(\alpha_{1} I_1 + \alpha_{2} I_2 + \alpha_{3} I_3 + \lambda^{b}_{XYZ})\right) \quad : \text{M5} \qquad (16) \]

where

I_1 = 1 if X in the genotype (XYZ) = S and 0 otherwise
I_2 = 1 if Y in the genotype (XYZ) = S and 0 otherwise
I_3 = 1 if Z in the genotype (XYZ) = S and 0 otherwise


Two time–points models: Baseline and first recurrence time points.

Using the multiplicative model M4 to measure the relative increase in haplotypes with S-genes at different positions

Table 8 shows estimates for the four parameters.

Table 8: Measure of the relative increase in haplotypes with S-genes at different positions

Parameter  Konde  Uzini  All data  ASP    SP
t_K        0.56   –      0.48      0.579  0.439
t_U        –      1.07   1.12      0.512  1.915
α_1        0.00   1.29   0.98      0.605  1.073
α_2        0.83   0.69   0.75      1.464  0.486
α_3        0.37   0.49   0.47      0.882  0.304

The proportion of R-genes at the last position was much higher at recurrence, and somewhat, but not significantly, higher at the second position.
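To see how the multiplicative model M4 turns a baseline probability into a recurrence probability, the sketch below evaluates it for a few haplotypes. It combines the Konde column of Table 8 with the rounded Konde baseline estimates from Table 4. Since these numbers come from separate fits and are rounded, the output is illustrative only; the function name is hypothetical.

```python
import math


def recurrence_probability(haplotype, p_baseline, t, alphas):
    """Model M4: p^r = 1 - exp(-t * prod(alpha_i ** I_i) * lambda^b)."""
    lam_b = -math.log(1.0 - p_baseline)  # inverts p^b = 1 - exp(-lambda^b)
    factor = 1.0
    for letter, alpha in zip(haplotype, alphas):
        if letter == "S":                # I_i = 1 only at sensitive positions
            factor *= alpha
    return 1.0 - math.exp(-t * factor * lam_b)


# Konde column of Table 8 and rounded Konde baseline values from Table 4.
t_konde = 0.56
alphas_konde = (0.00, 0.83, 0.37)
p_baseline = {"RRR": 0.61, "RRS": 0.32, "RSR": 0.32, "SRR": 0.03}

predicted = {
    h: recurrence_probability(h, p, t_konde, alphas_konde)
    for h, p in p_baseline.items()
}
for h, value in predicted.items():
    print(h, round(value, 3))
```

For RRR no α applies, so only the exposure factor t_K acts. For SRR the estimate α_1 = 0.00 forces the predicted recurrence probability to zero.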


Two time–points models: Baseline and first recurrence time points.

A model combining both health centres

We test whether the true relative decrease in the proportion of S-genes at the three positions can be the same at both locations, via the three α-parameters.

\[ p^{rK}_{XYZ} = 1 - \exp\!\left(-t_K\,(\alpha_{1}^{I_1} \times \alpha_{2}^{I_2} \times \alpha_{3}^{I_3} \times \lambda^{bK}_{XYZ})\right) \quad \text{at Konde} : \text{M6} \qquad (17) \]

\[ p^{rU}_{XYZ} = 1 - \exp\!\left(-t_U\,(\alpha_{1}^{I_1} \times \alpha_{2}^{I_2} \times \alpha_{3}^{I_3} \times \lambda^{bU}_{XYZ})\right) \quad \text{at Uzini} : \text{M7} \qquad (18) \]

where I_1, I_2 and I_3 are as defined before.

We may accept that the same model can be used at both centres.

The hypothesis of the exposure parameters t being the same was rejected.

All ASP α's are around 1.


Two time–points models

A model combining both health centres cont’d

The decrease is clear for those treated with SP.

Where we earlier had a decrease by the factors 0.75 and 0.47 (see Table 8), the factors are now 0.49 and 0.30 for SP.

The decrease is most obvious in Uzini, where there were more infections.

Page 46: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Results and Discussion on this Chapter

Important results

We could not reject the hypothesis that the eight haplotypes of malaria infected the children independently of each other.

At first recurrence of malaria, the proportions of some parasite types were smaller compared to baseline.

In particular, those haplotypes with genes marked S at second and third positions decreased when treated with SP.

When the children were treated with ASP, the decrease was much smaller.

Page 47: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

More results on the chapter

Important results cont’d

This is because SP did not kill all the parasites with resistant genes, which led to the reappearance of malaria.

With ASP treatment, all parasites were killed and all observed first recurrences depended on new infections.

At baseline, a child with malaria was on average infected by 2 different parasite types.

The estimated number of times they were infected was 3 in Konde and 2 in Uzini.

At the first recurrence of malaria, the number of haplotypes had decreased.

Results have been published in [3] (The International Journal of Biostatistics, 2013).

Page 48: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Chapter 4: Efficacy of Treatments - Cure Rates and Malaria Free Times: A Bayesian approach

Motivation

Many studies have concluded that ACTs are the better therapies in terms of efficacy.

But ACTs are not a panacea for malaria, for they too can fail. A signal of treatment failure is a recrudescence of malaria.

Some researchers have measured, at the laboratory level, the duration of parasite clearance in hosts.

Objective

Which of the two treatments is more efficacious?

Present a new methodology that can be used to estimate how long a treatment can postpone the recurrence of the disease in case of failure.

Page 49: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Assumptions

Some preliminary assumptions:

We consider the first recurrence of malaria during follow–up periods as a heuristic justification of treatment failure.

All cases of failure must happen before time t_max.

We do not assume any distribution in determining the difference in mean survival times between the two treatments.

Hence nonparametric procedure!

Page 50: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Statistical Models

Cure rate model

Let n_i be the number of patients receiving treatment i at the start.

Let X_i be the number of patients cured before time t_max (the end of the follow–up).

Then our model is X_i ∼ Bin(n_i, p_i), where i = ASP or SP and p_i is the probability of being cured by treatment i.

Assuming a Jeffreys prior (α = β = 1/2), the posterior distribution for p_i is

Beta(α + X_i, β + n_i − X_i) for i = ASP or SP. (19)
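The posterior in (19) can be explored by direct simulation. The following is a minimal sketch, not the author's code: it draws from the two Beta posteriors with the standard library, using the Uzini (0–42] counts reported in Table 9, and estimates the posterior probability that ASP has the higher cure rate. The function name, seed and number of draws are illustrative choices.

```python
import random


def sample_cure_posterior(cured, n, n_draws, rng, a=0.5, b=0.5):
    """Draw from Beta(a + cured, b + n - cured); a = b = 0.5 is the Jeffreys prior."""
    return [rng.betavariate(a + cured, b + n - cured) for _ in range(n_draws)]


rng = random.Random(2016)  # fixed seed so the sketch is reproducible
n_draws = 20000

# Uzini, (0-42]: cured (at start) as listed in Table 9.
p_asp = sample_cure_posterior(63, 94, n_draws, rng)
p_sp = sample_cure_posterior(63, 110, n_draws, rng)

mean_asp = sum(p_asp) / n_draws
mean_sp = sum(p_sp) / n_draws

# Posterior probability that the ASP cure rate exceeds the SP cure rate.
prob_asp_better = sum(x > y for x, y in zip(p_asp, p_sp)) / n_draws

print(f"posterior mean p_ASP = {mean_asp:.3f}")
print(f"posterior mean p_SP  = {mean_sp:.3f}")
print(f"P(p_ASP > p_SP)      = {prob_asp_better:.3f}")
```

Because the two posteriors are independent Beta distributions, independent draws are enough here and no Markov chain is needed for this particular quantity.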

Page 51: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Delay time model

Recurrence rate model at each follow-up

Suppose that rescreening is done at some fixed time points, t_0 = 0, t_1, . . . , t_k = t_max.

Let R_j,i and Y_j,i denote, respectively, the number of children who had been free from malaria up to the time point t_{j−1} and the number who get malaria between time points t_{j−1} and t_j (i = ASP or SP).

Then for each of these intervals, the number of children Y_j,i witnessing the event of interest can be modelled as Bin(R_j,i, θ_ji).

The posterior distribution of θ_ji is

Beta(α_ji + Y_j,i, β_ji + R_j,i − Y_j,i) (20)

Page 52: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Delay time model contd

Survival rate model at each follow-up

The posterior distribution of θji is given by model (20).

Thus the posterior for the survival function at each follow–up, for treatment i, is

$$S(t_j) = \prod_{k=1}^{j} (1 - \theta_{ki})$$

To obtain the full distribution, we assume that it is a piecewise linear function between the follow–up times, t_0 = 0, t_1, . . . , t_k = t_max.

Victims of recrudescence between t_{j−1} and t_j get, on average, a recurrence at the midpoint (t_{j−1} + t_j)/2.
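The product form of the survival function lends itself to simulation: each draw of the interval-specific θ values gives one posterior draw of the whole curve. The sketch below is an illustration under stated assumptions, not the thesis implementation. It uses the Konde ASP counts from Table 10 and assumes Jeffreys priors (α = β = 0.5) for every interval, whereas the slides leave the prior parameters of model (20) general.

```python
import random


def sample_survival_curve(events, at_risk, rng, a=0.5, b=0.5):
    """One posterior draw of S(t_0), ..., S(t_k) from independent Beta posteriors."""
    surv = [1.0]  # S(t_0) = 1: everyone is malaria free at the start
    for y, r in zip(events, at_risk):
        theta = rng.betavariate(a + y, b + r - y)
        surv.append(surv[-1] * (1.0 - theta))
    return surv


# Konde, ASP arm: first recurrences and numbers at risk per interval (Table 10).
times = [0, 7, 21, 28, 42, 56, 84]
events_konde_asp = [3, 8, 21, 16, 7, 3]
at_risk_konde_asp = [90, 87, 79, 58, 42, 35]

rng = random.Random(1)
curves = [
    sample_survival_curve(events_konde_asp, at_risk_konde_asp, rng)
    for _ in range(5000)
]

# Pointwise posterior mean of the survival function at each follow-up time.
mean_curve = [
    sum(curve[j] for curve in curves) / len(curves) for j in range(len(times))
]

for t, s in zip(times, mean_curve):
    print(f"day {t:2d}: S = {s:.3f}")
```

Between the follow–up times the curve would be joined by straight lines, in line with the piecewise linear assumption above.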

Page 53: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Delay time model contd

Mean survival time

Assuming that T is continuous with density f(s), the probability of surviving the event of interest till time t is

$$S(t) = 1 - F(t) = \int_t^\infty f(s)\,ds.$$

Then

$$E(T) = \int_0^\infty s f(s)\,ds = \int_0^\infty S(s)\,ds. \tag{21}$$
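The second equality in (21) is a standard identity for non-negative random variables. For completeness, one way to see it (assuming the mean is finite, so that the order of integration can be exchanged) is to write s as an integral and swap the two integrals:

```latex
\begin{aligned}
E(T) &= \int_0^\infty s\, f(s)\,ds
      = \int_0^\infty \left( \int_0^s du \right) f(s)\,ds \\
     &= \int_0^\infty \left( \int_u^\infty f(s)\,ds \right) du
      = \int_0^\infty S(u)\,du .
\end{aligned}
```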

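For a given time t_k = t_max < ∞,

$$E(T) = \sum_j \left[ (t_j - t_{j-1})\,S(t_j) + \bigl(S(t_{j-1}) - S(t_j)\bigr)(t_j - t_{j-1})/2 \right] + E(\max(T - T_{\max}, 0))$$

$$= \sum_j \left[ (t_j - t_{j-1})\bigl(S(t_j) + S(t_{j-1})\bigr)/2 \right] + E(\max(T - T_{\max}, 0)). \tag{22}$$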
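The two forms of the sum in (22) can be checked numerically. The sketch below is a plug-in illustration only: it uses the observed proportions of malaria-free children in the Konde ASP arm (from the at-risk counts of Table 10, with 35 − 3 = 32 children still free at day 84) in place of posterior draws, and it leaves out the excess term beyond the end of follow–up, so the result is a mean restricted to 84 days. Function and variable names are illustrative.

```python
def restricted_mean_trapezoid(times, surv):
    """Second form of (22): trapezoid rule under a piecewise linear S."""
    total = 0.0
    for j in range(1, len(times)):
        width = times[j] - times[j - 1]
        total += width * (surv[j] + surv[j - 1]) / 2.0
    return total


def restricted_mean_split(times, surv):
    """First form of (22): full interval for survivors, half for those who fail."""
    total = 0.0
    for j in range(1, len(times)):
        width = times[j] - times[j - 1]
        total += width * surv[j] + (surv[j - 1] - surv[j]) * width / 2.0
    return total


# Konde, ASP arm: children still malaria free at each follow-up time (Table 10).
times = [0, 7, 21, 28, 42, 56, 84]
free_counts = [90, 87, 79, 58, 42, 35, 32]
surv = [count / free_counts[0] for count in free_counts]

mean_trapezoid = restricted_mean_trapezoid(times, surv)
mean_split = restricted_mean_split(times, surv)

print(f"trapezoid form: {mean_trapezoid:.2f} days")
print(f"split form:     {mean_split:.2f} days")
```

Both functions return the same value, which is the algebraic simplification carried out between the two lines of (22).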

Page 54: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Delay time model

Mean survival time

The last term in the sum in equation (22) is the healthy time in the period for those who get sick within the same period. The last term outside the sum corresponds to the excess time of those who are healthy at the end of the follow–up.

For t_j < t_max, the conditional survival function of those that have a first recurrence is

$$S^*(t_j \mid T < t_{\max}) = \frac{\prod_{k=1}^{j} (1 - \theta_{ki})}{1 - S(t_{\max})}. \tag{23}$$

Page 55: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Mean delay time model

Mean difference in survival times

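For T1 and T2, the expected difference between the two survival times is

$$E[T_1 - T_2] = E\left[ \int_0^\infty S_1(t_1)\,dt_1 - \int_0^\infty S_2(t_2)\,dt_2 \right].$$

This can further be simplified to

$$E\left[ T_1 - T_2 \mid \tilde{T} \right] = \frac{1}{2} \sum_j \left[ (t_j + t_{j-1}) \left( S_1^*(t_j \mid \tilde{T}) - S_2^*(t_j \mid \tilde{T}) \right) \right], \tag{24}$$

where T̃ denotes the event T < T_max. We note that E(max(T − T_max, 0)) = 0 since, under the model assumption, t_k = t_max cannot be an event time.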

Page 56: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Computations

Monte Carlo implementation

We use the MCMC Gibbs sampler in the Bayesian setup.

For the efficacy posterior estimates and densities, we draw random samples from their posterior distributions defined in model (19).

For delay time computation

Model (24) is a complex posterior, which we call H. It links 2 × (k − 1) Beta distributions corresponding to two treatments and k follow–up times.

Page 57: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Computations

Monte Carlo implementation contd

Sample θ_ji^(1) from Beta(Y_j,i + α, R_j,i − Y_j,i + β), then compute S*_j,i^(1) and H^(1).

Go on, . . ., sample θ_ji^(N) from Beta(Y_j,i + α, R_j,i − Y_j,i + β), then compute S*_j,i^(N) and H^(N).

The resulting sequence {H^(1), H^(2), . . . , H^(N)} constitutes N independent samples from H.

From these, we are able to obtain our estimates.

A histogram is plotted for these simulations to obtain the posterior distribution for H = E[T1 − T2|(.)]
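The sampling loop above can be sketched in a few lines. This is a hedged illustration, not a reproduction of the thesis computation: it uses the midpoint assumption stated earlier to compute, for each draw, the mean time to first recurrence among children who fail before day 42, and takes the ASP minus SP difference as H. Jeffreys priors (α = β = 0.5) are assumed for every interval, the data are the Konde counts of Table 10, and the output is not expected to match Table 13 exactly.

```python
import random


def sample_conditional_mean_time(times, events, at_risk, rng, a=0.5, b=0.5):
    """One draw of the mean recurrence time given failure before the last time point."""
    surv_prev = 1.0
    weighted_time = 0.0
    for j, (y, r) in enumerate(zip(events, at_risk), start=1):
        theta = rng.betavariate(a + y, b + r - y)
        fail_mass = surv_prev * theta  # probability of failing in interval j
        midpoint = (times[j - 1] + times[j]) / 2.0
        weighted_time += midpoint * fail_mass
        surv_prev *= 1.0 - theta
    # 1 - surv_prev is the total probability of failing before t_max.
    return weighted_time / (1.0 - surv_prev)


# Konde, follow-up to day 42: first recurrences and numbers at risk (Table 10).
times = [0, 7, 21, 28, 42]
asp_events, asp_at_risk = [3, 8, 21, 16], [90, 87, 79, 58]
sp_events, sp_at_risk = [7, 17, 10, 9], [86, 79, 62, 52]

rng = random.Random(42)
n_draws = 5000
h_draws = []
for _ in range(n_draws):
    t_asp = sample_conditional_mean_time(times, asp_events, asp_at_risk, rng)
    t_sp = sample_conditional_mean_time(times, sp_events, sp_at_risk, rng)
    h_draws.append(t_asp - t_sp)

h_sorted = sorted(h_draws)
h_mean = sum(h_draws) / n_draws
h_low = h_sorted[int(0.025 * n_draws)]
h_high = h_sorted[int(0.975 * n_draws) - 1]

print(f"mean delay by ASP: {h_mean:.2f} days")
print(f"approximate 95% interval: ({h_low:.2f}, {h_high:.2f})")
```

A histogram of `h_draws` would play the role of the posterior density plot described in the last step.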

Page 58: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Application to data

Cured Data

Table 9: Number of patients cured of malaria and those at start (in parentheses)

Location   Drug   (0 – 42]    (0 – 84]
KONDE      ASP    86 (90)     34 (90)
KONDE      SP     43 (86)     29 (86)
UZINI      ASP    63 (94)     56 (94)
UZINI      SP     63 (110)    57 (110)

Page 59: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Data for recurrence

We applied models to the given data

Table 10: Number with first recurrence of malaria and those at risk (in parentheses)

Location   Drug   (0–7]      (7–21]    (21–28]   (28–42]   (42–56]   (56–84]
KONDE      ASP    3 (90)     8 (87)    21 (79)   16 (58)   7 (42)    3 (35)
KONDE      SP     7 (86)     17 (79)   10 (62)   9 (52)    10 (43)   4 (33)
UZINI      ASP    4 (94)     13 (90)   9 (77)    8 (68)    9 (60)    1 (51)
UZINI      SP     15 (110)   14 (95)   13 (81)   6 (68)    2 (62)    3 (60)

Page 60: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Results of simulations

Cure rate model

Table 11: Posterior Estimates for Overall Treatment Efficacy

Location   Period    Parameter   Estimate   SE     2.5%   50%    97.5%
KONDE      (0–42]    P_ASP       0.51       0.05   0.41   0.51   0.61
KONDE      (0–42]    P_SP        0.50       0.05   0.40   0.50   0.60
UZINI      (0–42]    P_ASP       0.67       0.05   0.57   0.67   0.76
UZINI      (0–42]    P_SP        0.56       0.05   0.47   0.56   0.65
KONDE      (0–84]    P_ASP       0.38       0.05   0.28   0.38   0.48
KONDE      (0–84]    P_SP        0.34       0.05   0.24   0.34   0.44
UZINI      (0–84]    P_ASP       0.59       0.05   0.49   0.60   0.69
UZINI      (0–84]    P_SP        0.52       0.05   0.43   0.52   0.61

Page 61: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Results of simulations contd

Cure rate model

Table 12: Posterior Estimates: P(ASP>SP)

Period    KONDE   UZINI
(0–42]    0.55    0.94
(0–84]    0.71    0.87

Page 62: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Results of simulations contd

Cure rate model

Figure 1: Treatment Efficacy after 42 days Posterior Distribution

Figure 2: Treatment Efficacy after 84 days Posterior Distribution

Page 63: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Results of simulations contd

First recurrence at each follow–up

Figure 3: KONDE: First recurrence posterior densities at each follow–up

Figure 4: UZINI: First recurrence posterior densities at each follow–up

Page 64: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Results of simulations contd

Day 42 Posterior survival plots

Figure 5: Observed 42 days posterior survival plots

Figure 6: Truncated 42 days posterior survival plots

Page 65: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Results of simulations contd

Day 84 Posterior survival plots

Figure 7: Observed 84 days posterior survival plots

Figure 8: Truncated 84 days posterior survival plots

Page 66: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Results of simulations contd

Mean Delay time to first recurrence

Table 13: Posterior Estimates for Mean Delay by ASP (days)

Location       Period    Parameter   Estimate   SE     2.5%    50%     97.5%
KONDE          0 – 42    µk1         6.38       1.50   3.35    6.40    9.23
KONDE          0 – 84    µk2         2.98       2.73   -2.43   3.00    8.28
UZINI          0 – 42    µu1         6.11       1.46   3.18    6.14    8.90
UZINI          0 – 84    µu2         7.78       2.70   2.34    7.827   12.96
KONDE–UZINI    0 – 42    µku         6.24       1.05   4.20    6.24    8.29
KONDE–UZINI    0 – 84    µku         5.41       1.92   1.64    5.41    9.17

Page 67: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Results of simulations contd

Posterior densities

Figure 9: Posterior densities in delay time in 42 days

Figure 10: Posterior densities in delay time in 84 days

Page 68: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Discussion and Conclusions

Discussion

The results obtained still support ACTs as the better treatment for uncomplicated malaria.

Contribution to existing knowledge

A major contribution to existing knowledge on the efficacy of malaria treatment studies is the delay time.

In case of treatment failure, recipients of ASP stay asymptomatic for up to 7 days longer over a follow–up period of 42 days, and about 6 days longer over a follow–up period of 84 days.

Results have been published in [2] (International Journal of Statistics in Medical Research, 2013).

Page 69: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Chapter 6: Effects of background variables

Motivation

In Chapter 5, we compared the efficacies of SP and ASP without considering any covariates.

In the clinical study, there were some background variables on the patients and the severity of the infection at baseline.

We extend the work done in the last chapter by studying the effects of such variables on recurrence or non-recurrence of malaria.

This is done using logistic regression analysis.

We choose the classical logistic regression over the Bayesian logistic regression because the Bayesian logistic approach with independent non-informative priors will provide almost the same results as the classical method.

Page 70: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

A few background variables noted during the study

Covariates

Time: The date of enrolment of the children into the study, in calendar days, was known in Konde.

Age: The age of the children recruited for the study.

Drug Type: Artesunate plus sulfadoxine-pyrimethamine (ASP) or sulfadoxine-pyrimethamine (SP)

R_i, S_i, M_i for the three sites i = 1, 2, 3: they indicate whether there were only resistant genes, only sensitive genes or both types present in the blood sample.

D0p: The number of parasites per millilitre of blood on day zero

Page 71: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Effects of background variables

Logistic model

Here Y is dichotomous, that is, recurrence or non–recurrence of disease.

The expected value (or mean) of Y is the probability that Y = 1 and it is limited to the range 0 through 1, inclusive.

If we let π = P(Y = 1), the ratio π/(1 − π) takes values in (0,+∞) and its logarithm, ln[π/(1 − π)], takes values in (−∞,+∞).

In multiple logistic regression, the probability of patient j being cured subject to some covariates, that is, πj = P(Yj = 1|x1j, x2j, . . . , xkj), is written as

$$\ln\left[ \frac{\pi_j}{1 - \pi_j} \right] = \beta_0 + \beta_1 x_{1j} + \beta_2 x_{2j} + \ldots + \beta_k x_{kj}. \tag{25}$$

Page 72: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Effects of background variables

Logit transformation

Using the logit transformation, we have

$$\pi_j = \frac{\exp(\beta_0 + \beta_1 x_{1j} + \beta_2 x_{2j} + \ldots + \beta_k x_{kj})}{1 + \exp(\beta_0 + \beta_1 x_{1j} + \beta_2 x_{2j} + \ldots + \beta_k x_{kj})}. \tag{26}$$

Concerning the subscript j, we do not actually have logits for each individual observation but just have 0's or 1's. As a consequence, instead of πj on the left hand side of equations 25 and 26, we can simply write π.
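The relationship between equations 25 and 26 is simply a function and its inverse, which a few lines of code make concrete. In this minimal sketch the coefficients and covariate values are hypothetical, chosen for illustration only, and do not come from the study.

```python
import math


def logit(p):
    """Log-odds ln(p / (1 - p)) for 0 < p < 1, the left hand side of equation 25."""
    return math.log(p / (1.0 - p))


def inverse_logit(eta):
    """Map a linear predictor back to a probability, as in equation 26."""
    return 1.0 / (1.0 + math.exp(-eta))


def predicted_probability(betas, covariates):
    """betas[0] is the intercept; betas[1:] are paired with the covariates."""
    eta = betas[0] + sum(b * x for b, x in zip(betas[1:], covariates))
    return inverse_logit(eta)


# Hypothetical coefficients and covariate values, not estimates from the trial.
example_betas = [0.4, -0.5, 0.3]
example_covariates = [1.0, 0.0]

pi_example = predicted_probability(example_betas, example_covariates)
print(f"predicted probability: {pi_example:.3f}")
print(f"log-odds recovered:    {logit(pi_example):.3f}")
```

The form 1 / (1 + exp(−η)) used in `inverse_logit` is algebraically the same as exp(η) / (1 + exp(η)) in equation 26.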

Page 73: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Proposed models

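The following models are motivated by our objective to determine the probability of no recurrence of malaria given clinical or important variables that were measured during the clinical trial.

$$\text{logit}(\pi) = \beta_1 + \beta_2\,\text{DRUG} + \beta_3 M + \beta_4 R + \beta_5 M'_1 + \beta_6 M'_2 + \beta_7 M'_3 + \beta_8 R'_1 + \beta_9 R'_2 + \beta_{10} R'_3. \tag{27}$$

$$\text{logit}(\pi) = \beta_1 + \beta_2 \log(\text{D0p}) + \beta_3\,\text{TIME} + \beta_4\,\text{PCODE} + \beta_5\,\text{DRUG} + \beta_6 M' + \beta_7 R'. \tag{28}$$

$$\text{logit}(\pi) = \beta_1 + \beta_2 \log(\text{D0p}) + \beta_3\,\text{AGE} + \beta_4\,\text{PCODE} + \beta_5\,\text{DRUG} + \beta_6 M' + \beta_7 R'. \tag{29}$$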

Page 74: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Choosing the final models

important covariate

The above models cannot adequately explain the probability of no first recurrence of parasites.

However, we formulated two reduced models that keep the clinically important variables.

In our case Drug type, PCODE, M′, R′ and S′ are clinically and intuitively important. The two models of interest are:

Page 75: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Final retained models

important covariate

The following two were retained

logit(π) = β1 + β2 DRUG + β3 PCODE + β4 M′ + β5 R′. (30)

logit(π) = β1 + β2 DRUG + β3 PCODE + β4 S′. (31)

Applying these models to data from Konde and Uzini, we have results presented in Tables 14 and 15, respectively.

These values are also presented in these tables.

Page 76: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Final retained models

Results from a model in Konde Day 42

Table 14: Effect of some factors on the probability of cure in Konde

ESTIMATES ON KONDE DATA
0–42 days    β        exp(β)   SE(β)   z-value   Pr(>|z|)
Model 30
Intercept    0.016    1.016    0.837   0.019     0.985
DrugSP       -0.152   0.859    0.347   -0.438    0.661
PCODE        0.630    1.878    0.381   1.654     0.098
M′           -0.369   0.691    0.302   -1.225    0.221
R′           -0.175   0.840    0.303   -0.576    0.564

Null dev: 194.03 on 141 df; Resid dev: 188.67 on 137 df (187.01 on 135 df)

Page 77: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Final retained models

Results from a model in Uzini Day 42

Table 15: Effect of some factors on the probability of cure in Uzini

ESTIMATES ON UZINI DATA
0–42 days    β        exp(β)   SE(β)   z-value   Pr(>|z|)
Model 30
Intercept    0.381    1.464    0.577   0.661     0.509
DrugSP       -0.555   0.574    0.309   -1.798    0.072
PCODE        -0.483   0.617    0.453   -1.067    0.286
M′           0.266    1.304    0.213   1.249     0.212
R′           0.365    1.440    0.173   2.111     0.035∗

Null dev: 256.97 on 192 df; Resid dev: 248.19 on 188 df (242.95 on 186 df)
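The columns of Table 15 are linked by simple arithmetic, which the sketch below reproduces from the reported β and SE(β) values: the odds ratio is exp(β), the z-value is β/SE(β), and the two-sided p-value comes from the standard normal distribution. It also compares the null and residual deviances of model 30 with a chi-square distribution on 4 degrees of freedom, using the closed form of its upper tail. This is a check on the reported figures, not the software used in the thesis, and small differences are expected because the table entries are rounded.

```python
import math


def wald_summary(beta, se):
    """Return (odds ratio, z-value, two-sided normal p-value) for one coefficient."""
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return math.exp(beta), z, p_value


def chi2_upper_tail_4df(x):
    """P(X > x) for a chi-square variable with 4 degrees of freedom."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


# (beta, SE) pairs for model 30 as reported in Table 15 (Uzini, 0-42 days).
table15 = {
    "Intercept": (0.381, 0.577),
    "DrugSP": (-0.555, 0.309),
    "PCODE": (-0.483, 0.453),
    "M'": (0.266, 0.213),
    "R'": (0.365, 0.173),
}

summary = {name: wald_summary(beta, se) for name, (beta, se) in table15.items()}
for name, (odds_ratio, z, p_value) in summary.items():
    print(f"{name:10s} exp(beta)={odds_ratio:.3f}  z={z:.3f}  p={p_value:.3f}")

# Drop in deviance from the null model: 192 - 188 = 4 degrees of freedom.
lr_stat = 256.97 - 248.19
lr_p_value = chi2_upper_tail_4df(lr_stat)
print(f"deviance drop = {lr_stat:.2f} on 4 df, p = {lr_p_value:.3f}")
```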

Page 78: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Some results

Interpretation of results for Day 42 in Konde

A large number of parasites in the blood increases the chances of getting cured, especially in Konde

The probability of getting cured is higher with S – genes

Time not significant

Resistance associated more with M – genes

Page 79: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

General Conclusions

Methodology

We have built probability, statistical and survival models motivated by data from a clinical trial on the efficacy of two treatments.

The thesis carefully brings modelling theory and application together in one piece.

Probability models for the estimation of haplotype frequencies were proposed, and the better model was obtained by model discrimination procedures.

Haplotype frequencies would have been underestimated had we not used combinatorics to unveil the hidden possible haplotypes

Page 80: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Conclusions

Important results

There were eight different haplotypes that could infect victims independently.

Haplotypes with sensitive genes at the second and third positions decreased at the first episode of the disease after the start of SP administration.

SP was not effective in killing all parasites with resistant strains, and the surviving parasites were responsible for recurrence of malaria.

ASP, known to have a faster parasite clearance, cleared all parasites, so any first episode of malaria should be from a new infection.

Sick children could, on average, have been bitten between 1 and 3 times by mosquitoes, and there were 1 to 3 different haplotypes present at baseline.

Page 81: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Conclusions

Important results 2

It is well established that the spread of resistance to SP may be delayed by its combination with artesunate.

How long one may remain free from the parasites has not been estimated, to the best of our knowledge.

We obtained Bayesian estimates for the duration, which can be up to 7 days for a follow–up period of 42 days and 6 days for a follow–up period of 84 days.

The logistic models cautiously say that the higher the parasite density, the smaller the risk of a recurrence of malaria.

The children had no partial immunity.

Recurrence of malaria was more common in children harbouring multiple infections, followed by children carrying the single resistant strain.

Page 82: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Room for improvement

For the future

We assumed uniform distributions for everyone within intervals between follow–up dates, which is a limitation. Why not others such as the exponential?

We focused only on the event of a first recurrence of malaria. Why not second or third recurrence?

The methods can be generalized to clinical investigations involving more than two study sites and using more than two treatments.

Page 83: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Remark

Are we perfect?

Models will be useful in the fight against malaria if they are formulated with important biological and practical realities in mind and when their results are interpreted with care

Some of the models were rejected, but the rejection of these models does not remove the intrinsic biological questions that motivated their modelling.

Models in general, malaria models included, are not perfect, just like the real world.

However, their findings can be useful to the universal malaria control community

Page 84: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

Thank you for your attention

Page 86: Probability Models for Estimating Haplotype Frequencies and Bayesian Survival Analysis of Two Treatments

References I

[1] Irwin, J. (1949). The standard error of an estimate of expectation of life, with special reference to expectation of tumourless life in experiments with mice. Journal of Hygiene, 47(02):188–189.

[2] Kum, C. K., Thorburn, D., Ghilagaber, G., Gil, P., and Björkman, A. (2013a). A nonparametric Bayesian approach to estimating malaria prophylactic effect after two treatments. International Journal of Statistics in Medical Research, 2(2):76–87.

[3] Kum, C. K., Thorburn, D., Ghilagaber, G., Gil, P., and Björkman, A. (2013b). On the effects of malaria treatment on parasite drug resistance: Probability modelling of genotyped malaria infections. The International Journal of Biostatistics, 9(1):1–14.

[4] Nelder, J. and Mead, R. (1965). A simplex method for function minimization. The Computer Journal, 7(4):308–313.
