Simulated-likelihood-based Inference for an outbreak of influenza
Marc Baguelin
Health Protection Agency
Some background
Work done at the School of Veterinary Medicine of the University of Cambridge with the Animal Health Trust in Newmarket
Funded by the Horserace Betting Levy Board
I now work at the HPA, currently on the modelling of swine flu to inform public health policies
Equine influenza
Currently circulating strain: H3N8
(current human strains: H3N2, H1N1 and now H1N1pdm; H5N1 does not transmit from human to human)
H7N7 also exists in horses but no longer circulates (it is still in the vaccine)
Two separate sub-lineages co-circulating
From the modelling point of view, the main difference with human influenza is the way the population is distributed
[Figure: EI phylogenetic tree, with branches labelled American sub-lineage, European sub-lineage and S. American family; strains shown: Miami ‘63, Suffolk ‘89, KY ‘91, Nt ‘1/93, Nt ‘2/93, Moulton ‘98, KY ‘98, Leics ‘00]
Why modelling?
To help understand epidemics
Risk assessment
Test different scenarios for vaccine policies
Etc.
The 2003 Newmarket outbreak
21 training yards involved out of ~60
More than 1300 horses at risk (out of ~2500)
The dynamics of the epidemic cannot be understood with a simple one-yard model as used previously for EI: a new model is needed.
The map of the outbreak
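SEIR Model
Compartments: S (susceptible), E (exposed), I (infectious), R (recovered), counted separately for each yard
Ex: I5 will be the number of infectious horses in yard 5
Latent and infectious periods
Within- and between-yard transmission rates:
tij = rate of transmission from an infected horse in yard i to susceptible horses in yard j
tjj = rate of transmission within yard j
T = (tij) is the mixing matrix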
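As a minimal sketch of how the mixing matrix is used, the infection pressure on each yard can be computed by summing the contributions of every infectious horse, weighted by the matching entry of T. The function name `force_of_infection` and the absence of any scaling by yard size are assumptions made for illustration; the slides do not state how the rates are normalised.

```python
def force_of_infection(T, infectious):
    """Infection pressure on a susceptible horse in each yard j.

    T[i][j] is the transmission rate from an infected horse in yard i to
    susceptible horses in yard j; infectious[i] is the number of infectious
    horses in yard i (e.g. infectious[4] plays the role of I5).
    Assumption: contributions simply add up, with no scaling by yard size.
    """
    n = len(infectious)
    return [sum(T[i][j] * infectious[i] for i in range(n)) for j in range(n)]


# Toy example with two yards: strong within-yard, weak between-yard mixing.
T_example = [[1.0, 0.1],
             [0.1, 1.0]]
pressure = force_of_infection(T_example, [2, 0])
```

Here `pressure[0]` is driven by within-yard transmission in the yard that holds the two infectious horses, and `pressure[1]` only by the between-yard entry.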
How to find the mixing matrix?
Depends on the contacts between the horses in the different yards, which are very difficult to quantify (shared facilities, contacts when moving, routes taken to go to training, vets, air spread…); usually modelled as one spatial part and one stochastic part
Assumptions can be made to reduce the number of parameters to estimate: this requires the expertise of field epidemiologists
Some of the available data
Number of horses in training for most of the yards with age structure (from Raceform ‘Horses in training 2003’)
Serological data giving antibody levels for some yards, which allow us to estimate the level of protection of the horses
Geographical location of the yards
Estimates of the proportion of infected horses in each yard at the end of the epidemic
Date of first detected cases in the infected yards
Trainer questionnaires
But…
Although a huge amount of work was done to collect data, there are few inputs for the model:
A lot of quantities have to be averaged
A stochastic (as opposed to deterministic) model gives a different output on each run: in that sense, only one ‘run’ of the real epidemic is available
Lack of temporal data
A classical assumption
Model with two levels of mixing: λG (global rate; between yards) and λL (local rate; inside yards).
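Under this assumption the whole mixing matrix is described by two numbers. A minimal sketch, assuming the simplest possible form (λL on the diagonal, λG everywhere else) and a hypothetical helper name `two_level_mixing_matrix`:

```python
def two_level_mixing_matrix(n_yards, lam_local, lam_global):
    """Build T = (tij) with tjj = lam_local and tij = lam_global for i != j.

    Assumption: every pair of distinct yards shares the same global rate,
    so no spatial structure is represented here.
    """
    return [
        [lam_local if i == j else lam_global for j in range(n_yards)]
        for i in range(n_yards)
    ]


# Three yards, with illustrative (not estimated) rates.
T = two_level_mixing_matrix(3, lam_local=1.0, lam_global=0.02)
```

This reduces the number of unknowns from one per pair of yards to two, which is what makes a grid search over (λL, λG) feasible.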
Number of susceptible
As all the horses are vaccinated, the status of the initial population is uncertain. Vaccine coverage (though theoretically 100%) and efficacy in horses are difficult to predict. There is less data than for humans, and cross-protecting but heterologous strains circulate.
A statistical model (logistic regression) has been proposed to estimate the probability that an animal will be infected given that the virus has entered its yard (using several variables, including the antibody level)
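To illustrate the general shape of such a model, here is a minimal sketch of a logistic risk function with the antibody level as its only covariate. The coefficients `intercept` and `slope`, the single-covariate form, and the use of the mean risk to scale the yard size are all hypothetical choices for illustration; they are not the fitted model from the study.

```python
import math


def infection_risk(antibody_level, intercept, slope):
    """Logistic model: probability of infection given the virus is in the yard."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * antibody_level)))


def mean_yard_risk(antibody_levels, intercept, slope):
    """Average predicted risk over the horses sampled in one yard."""
    risks = [infection_risk(a, intercept, slope) for a in antibody_levels]
    return sum(risks) / len(risks)


def expected_susceptibles(yard_size, antibody_levels, intercept, slope):
    """Assumption: effective number of susceptibles = yard size x mean risk."""
    return yard_size * mean_yard_risk(antibody_levels, intercept, slope)
```

With a negative `slope`, a higher antibody level gives a lower predicted risk, so well-protected yards contribute fewer effective susceptibles.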
Combining the threshold theorem with the statistical model (quantities involved: the average of the risk for the yard from the statistical model, and the mean infectious period)
Data
Inference method
The likelihood cannot be calculated, either analytically or numerically, for each of the pairs (λL, λG)
The model is very easy to simulate
-> use simulated likelihood to estimate λL and λG
Two approaches
1 Estimate the pair (λL, λG) simultaneously
2 Estimate λL first and then λG, since the transmission is mostly locally driven (see 1)
First method
1) Use a grid of (λL, λG) values
2) For each point of the grid, simulate N realisations of the epidemic
3) Count every time the output is close to the real data (ideally an exact match, since the data are discrete)
4) For N sufficiently large, the frequency approximates the likelihood
This is approximate Bayesian computation (ABC) with “flat” priors
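The four steps above can be sketched as follows. The simulator is a toy generation-based (Reed-Frost style) epidemic standing in for the SEIR yard model, and the function names, the scaling of the rates by yard and population size, and the choice of summary statistics (total final size within a tolerance, plus the exact number of affected yards) are assumptions made for illustration only.

```python
import math
import random


def simulate_final_sizes(yard_sizes, lam_local, lam_global, rng, seed_yard=0):
    """Toy stochastic epidemic; returns the number of cases in each yard."""
    n_total = sum(yard_sizes)
    susceptible = list(yard_sizes)
    infectious = [0] * len(yard_sizes)
    susceptible[seed_yard] -= 1          # one initial case in the seed yard
    infectious[seed_yard] = 1
    cases = list(infectious)
    while any(infectious):
        total_inf = sum(infectious)
        new_cases = []
        for j, n_j in enumerate(yard_sizes):
            # Assumed scaling: local rate by yard size, global by population.
            pressure = (lam_local * infectious[j] / n_j
                        + lam_global * (total_inf - infectious[j]) / n_total)
            p_infection = 1.0 - math.exp(-pressure)
            new_cases.append(sum(1 for _ in range(susceptible[j])
                                 if rng.random() < p_infection))
        for j, k in enumerate(new_cases):
            susceptible[j] -= k
            cases[j] += k
        infectious = new_cases           # each generation infects once
    return cases


def simulated_likelihood(yard_sizes, lam_local, lam_global, observed,
                         n_runs, rng, tol=0.005):
    """Fraction of runs matching observed = (total cases, yards affected)."""
    obs_total, obs_yards = observed
    hits = 0
    for _ in range(n_runs):
        cases = simulate_final_sizes(yard_sizes, lam_local, lam_global, rng)
        close_size = abs(sum(cases) - obs_total) <= tol * obs_total
        same_yards = sum(1 for c in cases if c > 0) == obs_yards
        if close_size and same_yards:
            hits += 1
    return hits / n_runs


def grid_likelihood(yard_sizes, local_grid, global_grid, observed, n_runs, rng):
    """Simulated likelihood at every (lam_local, lam_global) grid point."""
    return {
        (lam_l, lam_g): simulated_likelihood(
            yard_sizes, lam_l, lam_g, observed, n_runs, rng)
        for lam_l in local_grid
        for lam_g in global_grid
    }


if __name__ == "__main__":
    rng = random.Random(1)
    toy_yards = [20, 30, 25]             # made-up yard sizes, not the real data
    surface = grid_likelihood(toy_yards, [0.5, 1.0, 1.5], [0.01, 0.05],
                              observed=(30, 2), n_runs=200, rng=rng)
    best = max(surface, key=surface.get)
    print("grid point with highest simulated likelihood:", best)
```

Because each grid point receives equal weight, picking the grid point with the highest match frequency corresponds to the "flat" prior mentioned above. The estimate is noisy for small `n_runs`, so neighbouring grid points should be compared with that in mind.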
Result (first method)
Grey: simulations giving the final size (+/- 0.5%)
Black: simulations giving the final size and the exact number of yards
Estimates: (λL = 1.03; λG = 1.5e−2) for the exponential distributions and (λL = 0.7; λG = 1.5e−2) for the empirical ones.
Non-regular likelihood
The outbreak is essentially locally driven
It is possible to estimate the local transmission more efficiently by using the ten yards for which we have the final sizes, as less than 2% (0.63% on average) of cases will come from re-introduction
Second method
For a grid of local transmission values, estimate the simulated likelihood of obtaining simultaneously the exact counts for the ten final sizes, treating the yards as independent sub-epidemics (seeded from outside) with the number of susceptibles given by the predicted risk
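A minimal sketch of this second method: because the yards are treated as independent sub-epidemics, the joint simulated likelihood is taken here as the product of the per-yard match frequencies. The single-yard simulator is again a toy Reed-Frost style stand-in for the SEIR model, and the names `simulate_yard_final_size` and `local_simulated_likelihood` are hypothetical.

```python
import math
import random


def simulate_yard_final_size(n_susceptible, yard_size, lam_local, rng):
    """Toy single-yard epidemic seeded by one outside introduction."""
    susceptible = n_susceptible - 1      # the seed case is drawn from them
    infectious = 1
    cases = 1
    while infectious > 0 and susceptible > 0:
        p_infection = 1.0 - math.exp(-lam_local * infectious / yard_size)
        new_cases = sum(1 for _ in range(susceptible)
                        if rng.random() < p_infection)
        susceptible -= new_cases
        cases += new_cases
        infectious = new_cases
    return cases


def local_simulated_likelihood(yards, lam_local, n_runs, rng):
    """Product over yards of the frequency of exact final-size matches.

    yards: list of (n_susceptible, yard_size, observed_final_size) tuples.
    """
    likelihood = 1.0
    for n_susceptible, yard_size, observed in yards:
        hits = sum(
            1 for _ in range(n_runs)
            if simulate_yard_final_size(
                n_susceptible, yard_size, lam_local, rng) == observed
        )
        likelihood *= hits / n_runs
    return likelihood


def local_grid(yards, lam_grid, n_runs, rng):
    """Simulated likelihood for each candidate local transmission rate."""
    return {lam: local_simulated_likelihood(yards, lam, n_runs, rng)
            for lam in lam_grid}
```

Requiring an exact match in every yard at once makes each product small, so a large `n_runs` is needed before the surface over the grid becomes smooth enough to read off a maximum.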
Second method: results
The estimated values of the intra-yard transmission λL were 0.78 for the exponential distribution (grey) and 0.69 for the empirical distribution (black).
Then estimate the global transmission knowing the local
λG = 1.7e−2 for the exponential distribution and 1.6e−2 for the empirical distribution
Conclusion
When likelihoods are difficult to derive analytically and models are easy to simulate, simulated-likelihood-based methods are an efficient solution
This is the case for many models of the transmission of infectious diseases
More work has to be done on the methodological side of this, especially the limits/accuracy of these methods, the most efficient way of implementing them, model selection issues and deviations from standard theory due to the threshold/phase transition behaviour of epidemic models
Acknowledgments
Horserace Betting Levy Board
CIDC
Epidemiology group in AHT (esp. Richard Newton)
Vet School at Cambridge University (esp. James Wood)
Prof Bryan Grenfell from Penn State
Dr Nikolaos Demiris from MRC-BSU, Cambridge, now in Athens