An Easy Layup for Stan

Posted on March 3, 2022 1:29 AM by Phil

This post is by Phil Price, not Andrew.

The tldr version of this is: I had a statistical problem that ended up calling for a Bayesian hierarchical model. I decided to implement it in Stan. Even though it’s a pretty simple model and I’ve done some Stan modeling before I thought it would take at least several hours for me to get a model I was happy with, but that wasn’t the case. Right tool for the job. Thanks Stan team!

Longer version follows.

I have a friend who routinely plays me for a chump. Fool me five times, shame on me. The guy is in finance, and every few years he calls me up and says “Phil, I have a problem, I need your help. It’s really easy” — and then he explains it so it really does seem easy — “but I need an answer in just a few days and I don’t want to get way down in the weeds, just something quick and dirty. Can you give me this estimate in…let’s say under five hours of work, by next Monday?” Five hours? I can hardly do anything in five hours. But still, it really does seem like an easy problem. I say OK and quote a slight premium over my regular consulting rate. And then…as always (always!) it ends up being more complicated than it seemed. That’s not a phenomenon that is unique to him: just about every project I’ve ever worked on turns out to be more complicated than it seems. The world is complicated! And people do the easy stuff themselves, so if someone comes to me it’s because it’s not trivial. But I never seem to learn.

Anyway, what my friend does is “valuation”: how much should someone be willing to pay for this thing? The ‘thing’ in this case is a program for improving the treatment of patients being treated for severe kidney disease. Patients do dialysis, they take medications, they’re on special diets, they have health monitoring to do, they have doctor appointments to attend, but many of them fail to do everything. That’s especially true as they get sicker: it gets harder for them to keep track of what they’re supposed to do, and physically and mentally harder to actually do the stuff.

For several years someone ran a trial program to see what happens if these people get a lot more help: what if there’s someone at the dialysis center whose job is to follow up with people and make sure they’re taking their meds, showing up to their appointments, getting their blood tests, and so on? One would hope that the main metrics of interest would involve patient health and wellbeing, and maybe that’s true for somebody, but for my friend (or rather his client) the question is: how much money, if any, does this program save? That is, what happens to the cost per patient per year if you have this program compared to doing stuff the way it has been done in the past?

As is usually the case, the data suck. What you would want is a random selection of pilot clinics where they tried the program, and the ones where they didn’t, and you’d want the cost data from the past ten years or something for every clinic; you could do some sort of difference-in-differences approach, maybe matching cases and controls by relevant parameters like region of the country and urban/rural and whatever else seems important. Unfortunately my friend had none of that. The clinics were semi-haphazardly selected by a few health care providers, probably slightly biased towards the ones where the administrators were most willing to give the program a try. The only clinic-specific data are from the first year of the program onward; other than that all we have is the nationwide average for similar clinics.

The fact that no before/after comparison is possible seemed like a dealbreaker to me, and I said so, but my friend said the experts think the effect of the program wouldn’t likely show up in the form of a step change from before to after, but rather in a lower rate of inflation, at least for the first several years. Relative to business as usual you expect to see a slight decline in cost every year for a while. I don’t understand why but OK, if that’s what people expect then maybe we can look for that: we expect to see costs at the participating clinics increase more slowly than the nationwide average. I told my friend that’s _all_ we can look for, given the data constraints, and he said fine. I gave him all of the other caveats too and he said that’s all fine as well. He needs some kind of estimate, and, well, you go to war with the data you have, not the data you want.

First thing I did is to divide out the nationwide inflation rate for similar clinics that lack the program, in order to standardize on current dollars. Then I fit a linear regression model to the whole dataset of (dollars per patient) as a function of year, giving each clinic its own intercept but giving them all a common slope. And sure enough, there’s a slight decline! The clinics with the program had a slightly lower rate of inflation than the other clinics, and it’s in line with what my friend said the experts consider a plausible rate. All those caveats I mentioned above still apply but so far things look OK.

If that was all my friend needed then hey, job done and it took a lot less than five hours. But no: my friend doesn’t just need to know the average rate of decrease, he needs to know the approximate statistical distribution across clinics. If the average is, say, a 1% decline per year relative to the benchmark, are some clinics at 2% per year? What about 3%? And maybe some don’t decline at all, or maybe the program makes them cost more money instead? (After all, you have to pay someone to help all those patients, so if the help isn’t very successful you are going to lose out). You’d like to just fit a different regression for each clinic and look at the statistical distribution of slopes, but that won’t work: there’s a lot of year-to-year ‘noise’ at any individual clinic. One reason is that you can get unlucky in suddenly having a disproportionate number of patients who need very expensive care, or lucky in not having that happen, but there are other reasons too. And you only have three or four years of data per clinic. Even if all of the clinics had programs of equal effectiveness, you’d get a wide variation in the observed slopes. It’s very much like the “eight schools problem”. It’s really tailor-made for a Bayesian hierarchical model. Observed costs are distributed around “true costs” with a standard error we can estimate; true inflation-adjusted cost at a clinic declines linearly; slopes are distributed around some mean slope, with some standard deviation we are trying to estimate. We even have useful prior estimates of what slope might be plausible. Even simple models usually take me a while to code and to check so I sort of dreaded going that route — it’s not something I would normally mind, but given the time constraints I thought it would be hard — but in fact it was super easy. I coded an initial version that was slightly simpler than I really wanted and it ran fine and generated reasonable parameter values. Then I modified it to turn it into the model I wanted and…well, it mostly worked. The results all looked good and some model-checking turned out fine, but I got an error that there were a lot of “divergent transitions.” I’ve run into that before and I knew the trick for eliminating them, which is described here. (It seems like that method should be described, or at least that page should be linked, in the “divergent transitions” section of the Stan Reference Manual but it isn’t. I suppose I ought to volunteer to help improve the documentation. Hey Stan team, if it’s OK to add that link, and if you give me edit privileges for the manual, I’ll add it.) I made the necessary modification to fix that problem and then everything was hunky dory.

From starting the model to finishing with it — by which I mean I had done some model checks and looked at the outputs and generated the estimates I needed — was only about two hours. I still didn’t quite get the entire project done in the allotted time, but I was very close. Oh, and in spite of the slight overrun I billed for all of my hours, so my chump factor didn’t end up being all that high.

Thank you Stan team!

Phil.

12 thoughts on “An Easy Layup for Stan”

Andrew on March 3, 2022 8:02 AM at 8:02 am said:

Phil:

We have an updated warnings page here. There’s some issue where it hasn’t yet gone into all the Stan documentation (see discussion here).

Reply ↓
Andrew on March 3, 2022 8:06 AM at 8:06 am said:

Also, yes, I agree 100% on how easy it is to write these models in Stan once you’re familiar with the program. The other day I wanted to write a simple hierarchical model (it was for a comment in our recent discussion of the ivermectin meta-analyses). Usually for that sort of thing I’d look for an example to copy and alter, but in this case I just whipped up the model from scratch—and it worked right away! Super fast to program as well as to run, smooth workflow, I could then play with the model in different ways. Very satisfying. Kinda like ggplot2 for graphics: once you are fluent in the language, you can use it expressively, not just copying phrases out of a phrase book. I think it really makes me a better statistician.

Reply ↓
Chris on March 3, 2022 9:13 AM at 9:13 am said:

Can you paste the model so we have a better sense of the specifics?

Reply ↓
Andrew on March 3, 2022 9:22 AM at 9:22 am said:

Phil:

P.S. If you told him you could do it in X hours and you ended up charging him for 1.5X hours, then he’s the chump, no?

Reply ↓
- Phil on March 3, 2022 11:51 AM at 11:51 am said:
  
  Depends on whether he pays the full amount or just the amount we agreed!
  
  Reply ↓
kj on March 3, 2022 9:43 AM at 9:43 am said:

Phil,

Can you share what you found based on your final model? I felt pretty invested by the end of your post.

Reply ↓
- Phil on March 3, 2022 5:32 PM at 5:32 pm said:
  
  I’ve given enough information for people familiar with the pilot program to recognize it, and I don’t have permission to share the results, so I’ll have to be very circumspect here. But I think I can at least discuss the prior and the extent to which the posterior is consistent with it.
  
  I was told that it was very unlikely that a typical clinic could reduce their per-patient annual cost by more than 20% from the start through the fourth year, suggesting that a 5% reduction per year was at the high end of what might be possible. So the expected per-year benefit was in the very low single digits, as far as percent reduction in cost.
  
  As for variability from clinic to clinic, you expect a fair amount: some of the clinics already had pretty good practices for patient follow-up, others didn’t. The ones with good practices presumably had less room for improvement, while the ones with bad practices could improve a lot. Also, some clinic populations would presumably be less amenable to whatever they did in the pilot program: some would have more language/translation issues, or cultural issues that make them less likely to follow the dietary guidelines, or whatever. So we might expect the standard deviation of the slopes to be not terribly far from the mean slope, i.e. some slopes might be close to zero while others might be double the mean.
  
  So I implemented all of those insights as prior distributions in Stan, using normal distributions (or half-normal, in the case of standard deviations) and setting the widths of the distributions wider than the above paragraph would seem to imply. (Everyone tends to be over-certain, so, for instance, they can tell me that there’s no way most clinics are going to reduce costs 5% per year, but they might be wrong).
  
  And…the posterior estimates all ended up in the meat of the prior distributions, not in the tails; that is, the data were entirely consistent with expectations as I laid them out above. And we did have enough data that the key parameters of interest — the mean improvement per year and the amount of variation between clinics — were estimated with adequate certainty as far as my friend is concerned. I can’t really say much more than that, sorry.
  
  Reply ↓
Jay on March 3, 2022 10:41 AM at 10:41 am said:

I am getting hung up on a simple thing. What does this mean?

> “First thing I did is to divide out the nationwide inflation rate for similar clinics that lack the program, in order to standardize on current dollars”

Are you saying the DV was (dollars per patient / 1 + inflation rate for similar clinics that lack program)?

Reply ↓
- Unanon on March 3, 2022 1:35 PM at 1:35 pm said:
  
  I interpreted it to mean (dollar in year ti)*(1+inflation_rate)^(t_current-ti) where t_current is the (current) analysis year and ti is the year from which the dollar per patient data is from.
  
  So $100 in 2022 is $100, and a $100 in 2019 is $109.30, assuming 3% annual inflation.
  
  Reply ↓
- Phil on March 3, 2022 2:51 PM at 2:51 pm said:
  
  Basically: yes to what Unanon said.
  
  Hmm, looking at what I did, I actually standardized on 2015 dollars, not current dollars. Either would have give the same answer to the slope of the cost savings (as a percent of total), so it doesn’t really matter.
  
  I was given the nation-wide average per-patient cost in 2015, 2016, 2017, etc. So, for instance, maybe it was $65K in 2015, $69K in 2016, … and in 2021 it was $80K, or whatever. Not necessarily a constant rate.
  
  Multiply the 2016 costs by 65/69,…,multiply the 2021 costs by 65/80, to generate an ‘inflation-adjusted’ number. The nationwide average of this adjusted number is 65K per year, every year. If the pilot program is effective in the way ‘they’ expect it to be, then the values for the clinics that participated in the pilot program are not flat at 65K per year but rather decline at some steady rate. This is pretty much what happens in the data.
  
  Reply ↓
Dale Lehman on March 4, 2022 11:28 AM at 11:28 am said:

One comment on the context of this problem: the initial question you stated was “how much should someone be willing to pay for this thing?” What you estimated was how much cost savings might result from this thing? I agree that this is one approach to answering the first question – but clearly not the only one. Cost savings are part of the potential benefits. There may be other benefits. And, from a social point of view, the “value” of the program will generally not be the same thing as how much money is saved, nor may it be measured by how much a clinic (or even a legislative body) is willing to pay for this thing?

I know this is off the point of Stan. I also know that the cost savings are a relevant factor regardless of how this thing is eventually valued. But I just want to avoid the all-too-easy way we can slip into equating willingness to pay with the relatively easier to measure cost savings.

Reply ↓
- Phil on March 4, 2022 1:36 PM at 1:36 pm said:
  
  Dale, yeah, as I said I would hope that the main questions of interest would be related to patient well-being.
  
  I don’t know how financial valuation works, but my friend does: he has a 25-person company that does nothing else. And he wanted to know the answers to the questions discussed in this post. I don’t know what other factors are involved also; I’d be surprised if the answers I provided constitute the only information he needed.
  
  I strongly agree with your point that the value of the program is very different from the amount of money it saves.
  
  Reply ↓

Statistical Modeling, Causal Inference, and Social Science

An Easy Layup for Stan

12 thoughts on “An Easy Layup for Stan”

Leave a Reply Cancel reply