Bayes, Data Mining and Pharmacovigilance


Oct 16, 2013


Patrick Graham

University of Otago, Christchurch

Acknowledgements

Funders: MedSafe and HRC

Collaborators: Ruth Savage, Janelle Ashton, Michael Tatley from New Zealand Pharmacovigilance Centre, University of Otago

Background (1)

Pharmacovigilance: post-marketing surveillance of medicines.

Seeks early detection of adverse drug reactions.

Traditionally, clinical review of spontaneous reports

Late 1990s? “Data mining” of databases of spontaneous reports: “signal detection”


Two Bayesian methods prominent:

- WHO: independent 2 x 2 tables, Multinomial-Dirichlet

- FDA (DuMouchel): analyse the all drugs by all reactions table using a hierarchical Poisson, mixture of Gammas model
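A minimal sketch of the independent 2 x 2 table idea, assuming a Dirichlet prior on the four cell probabilities of one drug-by-reaction table and a multinomial likelihood for the report counts. The function name, the prior weights of 0.5 per cell and the example counts are illustrative assumptions, not the WHO implementation. It draws from the Dirichlet posterior and summarises log2 of the observed-to-expected ratio for the drug-reaction cell.

```python
import math
import random


def posterior_ic_draws(n11, n10, n01, n00,
                       prior=(0.5, 0.5, 0.5, 0.5), draws=5000, seed=1):
    """Monte Carlo draws of log2(p11 / (p1. * p.1)) under a Dirichlet posterior.

    n11: reports mentioning both drug and reaction
    n10: drug only, n01: reaction only, n00: neither
    prior: Dirichlet weights per cell (assumed values, for illustration only)
    """
    rng = random.Random(seed)
    # Conjugacy: Dirichlet prior + multinomial counts -> Dirichlet posterior.
    post = [count + a for count, a in zip((n11, n10, n01, n00), prior)]
    out = []
    for _ in range(draws):
        # A Dirichlet draw is a vector of independent Gammas, normalised.
        g = [rng.gammavariate(a, 1.0) for a in post]
        total = sum(g)
        p11, p10, p01, _p00 = (x / total for x in g)
        # Ratio of the joint probability to the product of the margins.
        out.append(math.log2(p11 / ((p11 + p10) * (p11 + p01))))
    return out


if __name__ == "__main__":
    # Hypothetical counts for a single drug-reaction pair.
    ic = posterior_ic_draws(40, 60, 60, 9840)
    print("posterior mean:", sum(ic) / len(ic))
    print("P(ratio > 1):", sum(x > 0 for x in ic) / len(ic))
```

Each table is analysed on its own here, so nothing is learned across drugs or reactions.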




Background (2)


Interest is now turning to the potential of longitudinal health care databases, record-linkage, electronic prescribing and other technologies


MedSafe / HRC formed the “product-vigilance partnership” in 2007 to advance product-vigilance research in NZ.


Feasibility studies in 2008


Collaborative group funded late 2009 led by Dr. Michael Tatley
of the NZ Pharmacovigilance Centre.


Wider project includes work on: risk communication; ethics,
acceptability and methods for accessing general practice data;
epidemiological studies; signal detection in longitudinal
databases.


Signal detection work includes Bayesian methods (me) as well as investigation of machine-learning approaches, text mining and natural language processing.


Features of health care databases


More representative than spontaneous
reports


Usually longitudinal


Lots of time-stamped information


Large size (makes analysis at the level of
individual patients difficult)

Signal detection in health care databases


Aim is exploratory analysis of:


Multiple drugs


Multiple outcomes, some of which will be rare


Large datasets, potentially large number of comparisons, so computational efficiency will often be an issue.

One approach: Noren et al (2010)

Published in Data Mining and Knowledge Discovery, 2010, 20, 361-387.

Simple Poisson-Gamma model for each combination of outcome, drug, and time period.

Exact posterior immediately available

No smoothing, no pooling, just shrinkage towards a prior mean


Applied to a UK General Practice database, > 20 million scripts


Analysed 2,445 drugs x 5,753 outcomes x 72 time periods


Took ~8 hours on a server with 2 dual core, 2.4GHz processors


Nice graphs of temporal trends


Software system developed but not publicly available


A data mining approach?
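Why the exact posterior is "immediately available" can be seen from a generic conjugate Poisson-Gamma update, sketched below. It assumes an observed count O ~ Poisson(mu * E) for expected count E, and a Gamma(shape, rate) prior on the rate ratio mu. The function name and the prior values of 0.5 are illustrative assumptions rather than the published settings.

```python
def gamma_poisson_posterior(observed, expected, prior_shape=0.5, prior_rate=0.5):
    """Conjugate update for one outcome x drug x time-period cell.

    Assumed model: observed ~ Poisson(mu * expected), mu ~ Gamma(shape, rate).
    Returns the posterior shape and rate, the posterior mean of mu, and the
    weight the posterior mean places on the raw ratio observed / expected.
    """
    post_shape = prior_shape + observed
    post_rate = prior_rate + expected
    post_mean = post_shape / post_rate
    # post_mean = w * (observed / expected) + (1 - w) * (prior_shape / prior_rate)
    weight_on_data = expected / post_rate
    return post_shape, post_rate, post_mean, weight_on_data


if __name__ == "__main__":
    # Hypothetical cell: 10 events observed where 2 were expected.
    shape, rate, mean, w = gamma_poisson_posterior(10, 2.0)
    print(shape, rate, mean, w)
```

The posterior mean is a weighted average of the raw ratio and the prior mean, with the weight on the data growing as the expected count grows. This is the only shrinkage involved, and each cell is updated independently of every other cell.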

Noren et al - Details

Why did we not use Noren et al’s approach?

(i) A statistical modeller’s perspective


Many similar parameters to be estimated


Should we not be trying to learn from similar estimations
when estimating each particular parameter?


Old idea for both Bayesians and frequentists.

(ii) Our paradigm is on a smaller scale, e.g.

Compare a new drug with drugs currently used for treating the same condition.

20-30 outcomes thought a priori likely to be associated with ADRs, e.g. (Trefiro et al 2009)

Hierarchical Bayesian Model

(similar notation to previously but specifically reference patient sub-groups by subscript g)

(Separate model for each outcome, o)

Hierarchical Bayesian Model (cont’d)

Prior model permits full flexibility of statistical modelling, e.g.

- jump in event rate just after first prescription

- smooth but nonlinear changes elsewhere

- drug by time interactions

- drug by covariate interactions

- etc

But hierarchical model structure provides some protection against model-misspecification.

Of course we don’t actually condition on

But integrate over the posterior to obtain

Hierarchical Bayesian Model - Computation

First part of the integrand is a product of independent Gammas, second part is the posterior for the parameters of a negative binomial model, suggesting Monte Carlo computation via

Should be faster than Gibbs sampler
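The two-part structure can be sketched as composition sampling. The model below is an assumption made for illustration, since the slide formulas are not reproduced here: counts y_i ~ Poisson(lambda_i * E_i) with lambda_i ~ Gamma(shape alpha, rate alpha / mu_i), so that y_i is marginally negative binomial. Draws of (alpha, mu) from the negative binomial posterior are taken as given (e.g. from an MCMC run, not shown), and the function name is hypothetical.

```python
import random


def draw_rates(hyper_draws, counts, exposures, seed=1):
    """For each hyperparameter draw, sample every cell rate from its Gamma conditional.

    hyper_draws: list of (alpha, mus) pairs, with mus holding one prior mean per cell;
                 assumed to come from the posterior of the negative binomial model
    counts:      observed event counts y_i
    exposures:   exposures or expected counts E_i
    Under the assumed model, lambda_i | y, alpha, mu ~ Gamma(alpha + y_i, alpha / mu_i + E_i),
    independently across cells.
    """
    rng = random.Random(seed)
    rate_draws = []
    for alpha, mus in hyper_draws:
        row = []
        for y, e, mu in zip(counts, exposures, mus):
            shape = alpha + y
            rate = alpha / mu + e
            # random.gammavariate takes a scale parameter, i.e. 1 / rate.
            row.append(rng.gammavariate(shape, 1.0 / rate))
        rate_draws.append(row)
    return rate_draws


if __name__ == "__main__":
    # Hypothetical inputs: two cells, hyperparameters held fixed for simplicity.
    hyper = [(2.0, [0.01, 0.02])] * 1000
    draws = draw_rates(hyper, counts=[3, 10], exposures=[200.0, 300.0])
    print(sum(d[0] for d in draws) / len(draws))
```

Only the hyperparameters need an iterative sampler in this scheme. The cell rates are drawn directly, one independent Gamma each, which is consistent with the expectation of being faster than a Gibbs sampler over all parameters.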

Hierarchical Bayes model computation performance

Burn-in of 4000 for MCMC seems adequate

Example 1: Cohort of 4531, prescribed one of 4 atypical antipsychotics, 23 outcomes, adjusting for age and sex, took 17 minutes

Example 2: Cohort of 10,308 children receiving one of 3 vaccines, 12 outcomes, adjusting for age, sex and season, took 27.5 minutes

Using 64 bit R on a laptop, 4 GB RAM, quad core 2.2 GHz, but with R not optimised for multi-core

Results: Atypical antipsychotics

Ischaemic Heart Disease (IHD)

Posterior probabilities for standardised IHD rates exceeding baseline rates, by drug and period

Drug          Period 1   Period 2   Period 3   Period 4
Clozapine     0.08       0.01       0.00       0.00
Olanzapine    0.05       0.03       0.03       0.03
Quetiapine    0.93       0.89       0.81       0.74
Risperidone   0.00       0.00       0.00       0.00

Posterior probability that each drug has the largest standardised IHD rate, by period

Drug          Period 0   Period 1   Period 2   Period 3
Clozapine     0.16       0.03       <0.01      <0.01
Olanzapine    0.31       0.02       0.03       0.03
Quetiapine    0.02       0.89       0.83       0.94
Risperidone   0.51       0.06       0.14       0.03
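Posterior probabilities of this kind are typically read off Monte Carlo output by counting. The sketch below assumes that posterior draws of each drug's standardised rate are available for a given period. The function names and the toy numbers are hypothetical, not the study data.

```python
def prob_exceeds(draws, baseline):
    """Share of posterior draws in which the rate exceeds the baseline rate."""
    return sum(d > baseline for d in draws) / len(draws)


def prob_largest(draws_by_drug):
    """Share of posterior draws in which each drug has the largest rate.

    draws_by_drug maps a drug name to its list of draws; position s in every
    list must come from the same joint posterior draw.
    """
    names = list(draws_by_drug)
    n_draws = len(draws_by_drug[names[0]])
    wins = dict.fromkeys(names, 0)
    for s in range(n_draws):
        # Drug with the highest rate in joint draw s.
        best = max(names, key=lambda name: draws_by_drug[name][s])
        wins[best] += 1
    return {name: wins[name] / n_draws for name in names}


if __name__ == "__main__":
    # Toy draws for two hypothetical drugs in one period.
    toy = {"drug_a": [1.0, 2.0, 3.0, 4.0], "drug_b": [2.0, 1.0, 5.0, 0.0]}
    print(prob_exceeds(toy["drug_a"], baseline=2.5))
    print(prob_largest(toy))
```

Because exactly one drug wins in each draw (ties aside), the "largest rate" probabilities sum to one within a period, whereas the exceedance probabilities need not.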

Results: IVMP local reactions

Summary


Signal detection in pharmacovigilance is an important and interesting area

“Data mining” approach emphasises large scale computation, simple statistical model, independent analyses, no learning across groups, outcomes, drugs or time-periods.

“Statistical modelling” approach emphasises a more complex statistical model which permits, for each outcome, learning across groups, drugs and time-periods (while still permitting departures from the model), but is designed for smaller scale computations


Can each approach learn from the other?