A normal prior on (-inf, inf) pushed through the inv_logit function will have a lumpy shape, but it won’t be a beta distribution. It may be a good approximation for some parameter values, but it would be a terrible approximation for something like a beta(1/2, 1/2), whose density goes to infinity at both ends.
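To see the mismatch concretely, here is a sketch in Python (stdlib only; the density formulas are standard, the specific points are made up) comparing the logit-normal density, i.e. a normal pushed through inv_logit, with the beta(1/2, 1/2) density near the boundary:

```python
import math

def logit_normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of inv_logit(Normal(mu, sigma)) on (0, 1), by change of variables."""
    z = (math.log(x / (1 - x)) - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi) * x * (1 - x))

def beta_half_half_pdf(x):
    """Density of beta(1/2, 1/2); the normalizing constant is 1/pi."""
    return 1.0 / (math.pi * math.sqrt(x * (1 - x)))

# Near the boundary the two behave completely differently:
x = 1e-6
print(logit_normal_pdf(x))    # vanishingly small: the logit-normal dies at 0 and 1
print(beta_half_half_pdf(x))  # large: beta(1/2, 1/2) diverges at the boundary
```

So any normal on the logit scale forces the density to zero at 0 and 1, and no choice of mu and sigma can mimic a beta that piles mass at the endpoints.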

Is it mostly for computational reasons (i.e. allow sampling from unconstrained space) or is there any connection between the two approaches?

The direct MC algorithm and Stan results are pretty similar. The differences are probably just an artifact of different prior assumptions or simply sampling errors.

For the first study the MC algo gives prevalence of 1.08% (CI95: 0.06% – 1.96%), with a second mode at 0 that doesn’t show up in the coarse plots above but you can see on the plot here: https://testprev.com/?n=3300&k=50&n_u=401&k_u=2&n_v=122&k_v=103&a_u=1&b_u=1&a_v=1&b_v=1&N=1000&M=1000&CI=95&w=1

For the second Santa Clara study the MC algo gives 1.31% (CI95: 0.72% – 1.93%) – again plotted here: https://testprev.com/?n=3300&k=50&n_u=3324&k_u=16&n_v=157&k_v=130&a_u=1&b_u=1&a_v=1&b_v=1&N=1000&M=1000&CI=95&w=1

I’ve put up a few links to other studies along with data for the algorithm so you can plot the posteriors. All feedback welcome!

Jerome:

Yes, the latest Stanford paper is the one I analyzed in the above post. It seems that a key issue is the variation in false positive and false negative rates across studies. The more recent Stanford paper pooled lots of these together, but in my hierarchical model I let them vary.

**An important update**: in the new version of the Santa Clara preprint, the authors have obtained much more data for test specificity and sensitivity (https://www.medrxiv.org/content/10.1101/2020.04.14.20062463v2). As a result, the antibody prevalence distribution for Santa Clara moves away from zero, with the lower end of the 95% credible interval above 0.5% prevalence.

Bob Carpenter pointed out that such results can be sensitive to hyperprior parameters. As a check on model robustness, we compare the MCMC results with those from a generalized linear mixed model with bootstrapping. The results are essentially the same.

–

Thanks for fixing it for me.

There’s no markdown, just basic HTML. So you have to use the <pre> tag for code. The inline code format <code> doesn’t match, which really bugs me, but not enough to try to tweak the CSS. Also, no LaTeX in comments, only in posts :-(

Note to self, html EATS Stan code

Andrew,

Agree, just wanted to illustrate how to do something similar to your simpler analysis in a non-Bayesian way, which as you say gives basically the same result (as expected). If there were a large number of labs, it would not be too hard to adjust for lab variability by resampling labs instead of individuals, but since there are so few labs you really need a model. Your approach seems reasonable.

Will:

You can show code in html with the “pre” tag. And you have to do some special html for angle brackets. (For example, your above code does not have the “lower=0, upper=1” constraints on the three probabilities.)

Ram:

Yes, if we assume all the measurements are independent with equal probabilities, the N’s are so large that just about any statistical method will give the same answer. I did Bayes because it’s the simplest solution for me: I can just set up the model and run. But a classical estimate-and-standard-error approach will do the job too, as long as you’re careful (which maybe the authors weren’t in their original paper). Once we allow the probabilities to vary, it becomes more complicated. But, again, I think a classical approach would work, with some care. As discussed in the above post, the challenge is that, with only 3 experiments on sensitivity, there’s a lot of uncertainty, so some additional information needs to be added in some way to get the analysis to work.

The lab variability is very important, unfortunately. Your method is, I think, quite similar to the original authors’.

I’d recommend putting code offsite in a pastebin or dropbox or something like that.

Sorry about the formatting. Is there a guide somewhere to formatting comments on this blog?

I used different priors — specifically, beta(0.7, 5) priors for the false positive rate, false negative rate, and population proportion infected. (A little arbitrary but they put most probability density below 10%.)

I didn’t write it up or post it anywhere so figure here is as good a place as any to share. Stan code:

data {
  // data from known positives, to estimate false neg rate
  int n_known_pos;  // 122
  int n_true_pos;   // 103
  // data from known negatives, to estimate false pos rate
  int n_known_neg;  // 401
  int n_false_pos;  // 2
  // actual test data
  int n_study;      // 3330
  int n_tested_pos; // 50
}
parameters {
  real<lower=0, upper=1> pi_s; // sample prevalence
  real<lower=0, upper=1> fpr;  // false positive rate
  real<lower=0, upper=1> fnr;  // false negative rate
}
transformed parameters {
  real pos_prob; // prob of getting a positive test result
  pos_prob = (1 - fnr) * pi_s + fpr * (1 - pi_s);
}
model {
  // beta priors for proportions - put most mass below 10%
  pi_s ~ beta(.7, 5);
  fpr ~ beta(.7, 5);
  fnr ~ beta(.7, 5);
  // false positives - estimate false positive rate
  target += binomial_lpmf(n_false_pos | n_known_neg, fpr);
  // true positives - estimate false negative rate
  target += binomial_lpmf(n_true_pos | n_known_pos, 1 - fnr);
  // study data - estimate population proportion
  target += binomial_lpmf(n_tested_pos | n_study, pos_prob);
}

And results:

            mean se_mean    sd    2.5%     25%     50%     75%   97.5%     n_eff  Rhat
pi_s       0.010   0.000 0.005   0.001   0.007   0.010   0.014   0.019  9702.639 1.000
fpr        0.007   0.000 0.004   0.001   0.004   0.006   0.009   0.015  9316.485 1.000
fnr        0.155   0.000 0.032   0.098   0.133   0.154   0.176   0.223 13941.622 1.000
pos_prob   0.015   0.000 0.002   0.011   0.014   0.015   0.017   0.020 30584.499 1.000
lp__     -17.178   0.017 1.391 -20.810 -17.801 -16.813 -16.166 -15.610  6436.322 1.001

[Edit: added formatting for code and output.]

y.p = 50
n.tested = 3330
y.tp = 130
n.cases = 157
y.tn = 3308
n.controls = 3324
alpha = 0.05
R = 1000000
p.hat = y.p / n.tested
sensitivity.hat = y.tp / n.cases
specificity.hat = y.tn / n.controls
prevalence.hat = max(c((p.hat + specificity.hat - 1) /
                         (sensitivity.hat + specificity.hat - 1), 0))
prevalence.tilde = array(dim = R)
for (r in 1:R) {
  y.p.boot = rbinom(1, n.tested, p.hat)
  y.tp.boot = rbinom(1, n.cases, sensitivity.hat)
  y.tn.boot = rbinom(1, n.controls, specificity.hat)
  p.tilde = y.p.boot / n.tested
  sensitivity.tilde = y.tp.boot / n.cases
  specificity.tilde = y.tn.boot / n.controls
  prevalence.tilde[r] = max(c((p.tilde + specificity.tilde - 1) /
                                (sensitivity.tilde + specificity.tilde - 1), 0))
}
(2 * prevalence.hat) - quantile(prevalence.tilde, c(1 - (alpha / 2), alpha / 2))

which yields 0.6-1.8 for the second report. Same deal for the first report yields 0.3-2.2. With all the same caveats as in the OP. Don’t have time to look at the lab variability unfortunately.
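The max(…, 0) expression in the code above inverts the identity p = prev * sens + (1 - prev) * (1 - spec), i.e. the Rogan-Gladen correction for an imperfect test. A quick sanity check of that algebra, sketched in Python with made-up numbers:

```python
def apparent_rate(prev, sens, spec):
    """Probability of a positive test result given true prevalence."""
    return prev * sens + (1 - prev) * (1 - spec)

def corrected_prevalence(p, sens, spec):
    """Invert the relation above (Rogan-Gladen estimator), clipped at 0."""
    return max((p + spec - 1) / (sens + spec - 1), 0)

prev, sens, spec = 0.012, 0.828, 0.995
p = apparent_rate(prev, sens, spec)
print(corrected_prevalence(p, sens, spec))  # recovers 0.012 up to rounding
```

The clipping at 0 matters whenever the apparent positive rate falls below the false positive rate, which is exactly what happens in some of the bootstrap replicates and produces the second mode at zero.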

Hey Dan

Your Larremore lab People page is the best ever!

mu_spec and mu_sens are also missing priors, but because these values are unconstrained, that defaults to improper uniform priors. These are parameters where a prior that’s uniform on the probability scale would make sense; on the log odds scale that’s logistic(0, 1), or about normal(0, 1.8).
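The logistic(0, 1) density mentioned here is exactly what a uniform(0, 1) prior on the probability scale induces on the log odds, and normal(0, 1.8) is a close numerical match. A quick check in Python (a sketch using the standard density formulas, not Stan code):

```python
import math

def logistic_pdf(x):
    """Standard logistic density: the distribution of logit(U) for U ~ uniform(0, 1)."""
    e = math.exp(-abs(x))  # use -|x| so the exponential never overflows
    return e / (1 + e) ** 2

def normal_pdf(x, sigma=1.8):
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

for x in [0.0, 1.0, 2.0, 4.0]:
    print(x, logistic_pdf(x), normal_pdf(x))  # close, though not identical
```

The two densities differ most at the mode (0.25 vs. about 0.22), which is why normal(0, 1.8) is only an approximation to "uniform on the probability scale."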

Thanks for the code, Andrew.

Dan, I found _your group’s_ model and JavaScript interface really useful, and so was the preprint. I shared and discussed it with several of my students and colleagues, some of whom are in public health — all of them have good things to say.

Cats are well known to be both waves and particles

For fun, a PhD student in my group coded up an MCMC for the non-hierarchical model in javascript (!) a couple weeks ago to compute posteriors for seroprevalence, sensitivity, and specificity. It runs in the browser on the user’s side. https://larremorelab.github.io/covid-calculator2 It’s no Stan, but might be helpful if anyone wants to explore in real time how various values affect posteriors.

This’d be easy to compare. The biggest reason to use log odds parameterizations is that they’re easy to extend to predictors when we have them.

I found in my repeated binary trial case study that there was a fairly big difference between using hierarchical betas (parameterized as Andrew describes, which we also recommend in the user’s guide for betas because parameterizing alpha and beta separately leads to too much prior dependence) and a normal logistic approach. The normal logistic led to more skewed posteriors. The Efron and Morris data wasn’t enough to conclude which worked better.

More specifically, we just follow the math. When Stan sees:

y ~ foo(theta);

for a vector y and fixed parameter theta, then it’s equivalent (up to a constant) to this

target += foo_lpdf(y | theta);

where

foo_lpdf(y | theta) = SUM_{n in 1:size(y)} foo_lpdf(y[n] | theta).

When size(y) = 0, that sum evaluates to 0, and 0 gets added to the target.
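The empty-sum behavior is easy to mimic outside Stan. A sketch in Python (the function names are mine, not Stan's) of the same accumulate-into-target logic:

```python
import math

def binomial_lpmf(y, n, theta):
    """Binomial log-pmf via log-gamma, as in Stan's binomial_lpmf."""
    return (math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
            + y * math.log(theta) + (n - y) * math.log(1 - theta))

def vectorized_lpmf(ys, ns, theta):
    """Mimics a vectorized sampling statement: a sum over the elements of y."""
    return sum(binomial_lpmf(y, n, theta) for y, n in zip(ys, ns))

target = 10.0
target += vectorized_lpmf([], [], 0.3)  # zero-size data: the sum is empty
print(target)  # unchanged: 10.0
```

Python's `sum` over an empty sequence is 0, just as the SUM over `1:size(y)` with `size(y) = 0` is 0, so the target is left alone.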

I hadn’t even noticed that (a) there’s not a prior for p, and (b) it doesn’t render the constraint to (0, 1). The HTML really is maddening. I fixed the code in my comment.

Stan’s default is a uniform distribution over the values of a parameter that satisfy the declared constraint. Because the constraint is to (0, 1), that makes it uniform over the interval, or equivalent to a beta(1, 1) prior. We do that by adding the log of the absolute derivative of the inverse transform (from unconstrained to (0, 1)) to the target. You can then add another prior on the (0, 1) scale, like a beta prior; that’ll just multiply.
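That Jacobian adjustment can be checked directly: if the only contribution to the target is the log absolute derivative of the inverse transform, the implied density on the constrained scale is constant. A small numerical sketch in Python (the helper names are mine):

```python
import math

def inv_logit(u):
    return 1 / (1 + math.exp(-u))

def log_jacobian(u):
    """log |d inv_logit(u) / du| = log(x) + log(1 - x), where x = inv_logit(u)."""
    x = inv_logit(u)
    return math.log(x) + math.log(1 - x)

# Implied density on x: exp(target on u) times |du/dx| = x(1-x) / (x(1-x)) = 1,
# i.e. uniform on (0, 1), at every point we check:
for u in [-3.0, 0.0, 2.5]:
    x = inv_logit(u)
    print(math.exp(log_jacobian(u)) / (x * (1 - x)))  # 1.0 each time
```

So "default uniform" is not special-cased anywhere; it falls out of the change of variables.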

Thanks, makes much more sense now. This html behaviour sure is maddening.

Zhou:

No, I had the lower=0, upper=1 bounds on p all along. They just had been absorbed by the html. I just went in and fixed that.

Er, the comment html formatting software seems to have made *my* bounds disappear, but hopefully you get the idea.

The thing that confuses me is that in the first code snippet you have

real<lower=0, upper=1> p;

but in the second hierarchical model the bounds have disappeared and you just have

real p;

So I’m wondering if this is a mistake or does Stan somehow realise on its own that p is meant to be between 0 and 1?

Zhou:

In a Stan program, each line with a “~” or a “target +=” adds a term to the target, that is, the log posterior density. The default prior is always uniform on the nominal scale of the parameter; thus the default prior for p is uniform(0,1) in this example.

I was just wondering if there’s some prior sensitivity happening here given the informativeness of the data seems weak. I’m not sure what Stan’s default is for p in this context, might it be silly?

Colab notebook: https://colab.research.google.com/drive/1dHIa_ex0IYOoZAtM8ueOEA0Wfbur1n_E

Visual summary: https://imgur.com/HJIicN4

Some caveats:

– Obviously the data used are not the raw data

– No hierarchical model for specificity/sensitivity

– I have been using Stan for less than a week and simply used the above model as “inspiration”

– The “no pooling” version gives very high estimates for the Santa Clara-wide prevalence, I guess because the cells are so small that the cell-wise CrIs are very wide and include unreasonably high estimates

– Maybe sigma ~ normal(0, 1) is too informative?

Ethan:

Beta could work too. I used logit-normal just because I can easily work with the hyperparameters (in the above example, throwing on that normal(0, 0.2) prior). With the beta, I’d want to reparameterize to mean and sd, or to alpha/(alpha + beta) and (alpha + beta)—we did this in one of the models in BDA, and it’s not so hard to do, but it’s one more step, and right now I’m more used to the logit-normal model.

Usually I’d say that the specific functional form shouldn’t make much of a difference, but in this case we have a lot of specificities very close to 1, so maybe the beta would work better, as it doesn’t have the hard cutoff at the boundary the way the logit-normal does. I’m not sure.
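The mean-and-precision reparameterization mentioned above is just a change of variables on (alpha, beta). A sketch in Python (the function name and the numbers are mine):

```python
def beta_from_mean_count(mu, kappa):
    """Map (mean, concentration) to the usual beta(alpha, beta) parameters:
    alpha = mu * kappa, beta = (1 - mu) * kappa, so kappa = alpha + beta."""
    return mu * kappa, (1 - mu) * kappa

# e.g. a specificity prior centered near 1 with "prior sample size" 50:
alpha, beta = beta_from_mean_count(0.98, 50.0)
print(alpha, beta)             # 49.0 and 1.0
print(alpha / (alpha + beta))  # mean recovered: 0.98
```

Putting independent hyperpriors on mu and kappa avoids the strong prior dependence you get from putting separate priors on alpha and beta.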

David:

Yes, I fixed a couple of typos, also added more thoughts in a P.P.P.S.

Peter:

Yes, when N=0 it does not increment the target.

Zhou:

It would be easy to add priors for p. I did not do this because (a) the focus in the discussion seemed to be about what could be learned just from these data, and (b) any prior for p would be pretty vague in the region of interest. Given all the other studies that have been happening, I think we can be pretty sure that p is less than 0.1, and on the low end we could probably take 10 times the number of positive tests in the county at the beginning of April divided by the population of the county, but then this wouldn’t help so much for the inference. But, sure, it would be fine to do this.

Maybe this is the wrong place to ask, but does binomial_logit simply not increment the target log probability density at all when N=0? I’ve run this code and can see that it gets the right answer, I just don’t understand why…

> p is [0.007, 0.015]—that is, the data are consistent with an underlying

> infection rate of between 0.7% and 1.5%.

Doesn’t the table say 97.5% for p in the second report is 0.019?

Is it worth working with some different prior specifications for p?

+1. It’s always amusing to me to see all the hoops that non-Bayesian approaches have to jump through.

Whatever their names are, the picture gives a pretty good illustration of cats (in pairs as well as singletons) as fluids.

The key is having the conceptually homogeneous framework of Bayesian inference, where we just integrate over the posterior to calculate expectations. That makes it possible to have high-level languages in which it becomes easy to write models and perform inference. Then hierarchical meta-analysis isn’t a new math problem, but just another model specification. But this would’ve been possible 20+ years ago with BUGS. Maybe it’s no coincidence that both BUGS and JAGS were developed by epidemiologists.

1. Using the offset/multiplier, you don’t have to go through Matt-trick-style reparameterization arithmetic.

2. The naming convention for location and scale doesn’t match the variable names. I changed the names to protect the scales. That makes the names super long, which I don’t like. But it follows the Gelman mu_X, sigma_X naming convention for location and scale of parameter X.

3. I use binomial_logit instead of binomial for the sensitivities and specificities. That shouldn’t make a big difference as long as the probability of success parameter isn’t too close to 0 or 1.

At first I thought that the p_sample definition was a bug until I realized you only have the data from one study. I took the liberty of adding some light doc as a hint to that effect.

data {
  int y_sample;        // for j = 1
  int n_sample;        // for j = 1
  int J_spec;
  int y_spec[J_spec];  // no samples for j > 1
  int n_spec[J_spec];
  int J_sens;
  int y_sens[J_sens];  // no samples for j > 1
  int n_sens[J_sens];
}
parameters {
  real<lower = 0, upper = 1> p;
  real mu_logit_spec;
  real<lower = 0> sigma_logit_spec;
  real mu_logit_sens;
  real<lower = 0> sigma_logit_sens;
  vector<offset = mu_logit_spec, multiplier = sigma_logit_spec>[J_spec] logit_spec;
  vector<offset = mu_logit_sens, multiplier = sigma_logit_sens>[J_sens] logit_sens;
}
transformed parameters {
  vector<lower = 0, upper = 1>[J_spec] spec = inv_logit(logit_spec);
  vector<lower = 0, upper = 1>[J_sens] sens = inv_logit(logit_sens);
}
model {
  real p_sample = p * sens[1] + (1 - p) * (1 - spec[1]);  // j = 1
  y_sample ~ binomial(n_sample, p_sample);                // j = 1
  y_spec ~ binomial_logit(n_spec, logit_spec);
  y_sens ~ binomial_logit(n_sens, logit_sens);
  logit_spec ~ normal(mu_logit_spec, sigma_logit_spec);
  logit_sens ~ normal(mu_logit_sens, sigma_logit_sens);
  sigma_logit_spec ~ normal(0, 1);
  sigma_logit_sens ~ normal(0, 1);
}
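On point 3: binomial_logit takes the log odds directly, so it never round-trips through inv_logit and back through log, which is where the instability near 0 and 1 comes from. A sketch in Python of why the two parameterizations agree (the lpmf formulas are standard; the specific numbers are made up):

```python
import math

def log_choose(n, y):
    return math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)

def binomial_lpmf(y, n, theta):
    """Binomial log-pmf on the probability scale."""
    return log_choose(n, y) + y * math.log(theta) + (n - y) * math.log(1 - theta)

def binomial_logit_lpmf(y, n, alpha):
    """Binomial log-pmf parameterized by log odds alpha = logit(theta).
    Uses the stable identity log(1 + exp(a)) = max(a, 0) + log1p(exp(-|a|))."""
    log1p_exp = max(alpha, 0.0) + math.log1p(math.exp(-abs(alpha)))
    return log_choose(n, y) + y * alpha - n * log1p_exp

alpha = -5.0                         # log odds of a small false positive rate
theta = 1 / (1 + math.exp(-alpha))
print(binomial_lpmf(16, 3324, theta))
print(binomial_logit_lpmf(16, 3324, alpha))  # same value
```

The algebra behind the second function: with theta = inv_logit(alpha), log(theta) = alpha - log(1 + exp(alpha)) and log(1 - theta) = -log(1 + exp(alpha)), so the y and n - y terms collapse to y * alpha - n * log(1 + exp(alpha)).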

Thanks!

P.S. These are the funnest posts to write. It just kills me that the authors of this paper jumped through all those hoops to get those confidence intervals, when they could’ve just thrown it all into Stan, and then the assumptions are so transparent.
