
Parallel JAGS RNGs

As a matter of convention, we usually run 3 or 4 chains in JAGS. By default, each chain draws its samples from a distinct pseudorandom number generator (JAGS ships with four base RNGs). I didn’t go and check whether it does things 111,222,333 or 123,123,123, but in any event the “parallel chains” in JAGS are samples drawn from distinct RNGs computed on a single processor core.

But we all have multiple cores now, or we’re computing on a cluster or in the cloud! So the behavior we’d like from rjags is to run each chain via the foreach package, with each JAGS chain using a parallel-safe RNG. The catch is that with n.chains=1, every parallel instance defaults to .RNG.name[1], the Wichmann-Hill RNG, so all the workers end up with the same generator.

JAGS 2.2.0 includes a new lecuyer module (along with the glm module, which everyone should probably always use, and which doesn’t have many undocumented tricks that I know of). But lecuyer is completely undocumented! I tried .RNG.name="lecuyer::Lecuyer", .RNG.name="lecuyer::lecuyer", and .RNG.name="lecuyer::LEcuyer",
all to no avail. It ought to be .RNG.name="lecuyer::Lecuyer", to be consistent with the other .RNG.name values! I dug around in the source to find where it checks the name from the inits, and discovered that in fact it is

.RNG.name="lecuyer::RngStream"

So here’s how I set up 4 chains now:

library(doMC); registerDoMC()
library(rjags); load.module("glm"); load.module("lecuyer")
library(random)
jinits <- function() {
  list(
    ### all the other params ###
    .RNG.name = "lecuyer::RngStream",
    .RNG.seed = randomNumbers(n = 1, min = 1, max = 1e+06, col = 1)
  )
}
jags.parsamples <- foreach(i=1:getDoParWorkers()) %dopar% {
  model.jags <- jags.model(model, forJAGS,
                           inits=jinits,
                           n.chains=1, n.adapt=1000)
  result <- coda.samples(model.jags,params,1000)
  return(result)
}

I would just as soon initialize them all to the same state and use sequential substreams, but I don’t think there is a way to do that. Four long, separately-seeded streams should be more than fine; a quick look at the source suggests that if you used n.chains>1 (on each core) you’d get sequential substreams.

I should also probably write a better .combine so that the result is an mcmc.list and not just a plain list, but whatever. This works, and it’s almost 4 times faster (yeah yeah, overhead, blah blah) than the usual n.chains=4 would be!
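For what it’s worth, a .combine along these lines should do it. This is only a sketch: combine.mcmc is my own name, and it assumes each worker hands back the length-1 mcmc.list that coda.samples produces for a single chain.

```r
library(coda)

# Flatten several single-chain results into one mcmc.list.
# Each worker returns an mcmc.list of length 1; unlist() one level
# to get the underlying mcmc objects, then rebuild the mcmc.list.
combine.mcmc <- function(...) {
  do.call(mcmc.list, unlist(list(...), recursive = FALSE))
}

jags.parsamples <- foreach(i = 1:getDoParWorkers(),
                           .combine = combine.mcmc,
                           .multicombine = TRUE) %dopar% {
  model.jags <- jags.model(model, forJAGS, inits = jinits,
                           n.chains = 1, n.adapt = 1000)
  coda.samples(model.jags, params, 1000)
}
```

With .multicombine=TRUE, foreach passes all the worker results to combine.mcmc in one call, so the result should behave like the mcmc.list you’d get from an ordinary n.chains=4 run.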

10 Comments

  1. The dclone package also allows for parallel processing through JAGS in one line, via the jags.parfit function.

  2. I just checked: dclone does not deal with the JAGS RNG at all, much less the lecuyer module.

    • Peter Solymos says:

      jags.parfit in dclone does not deal implicitly with RNGs, because that is done by setting up inits as usual for jags.model. The trick is to use jags.model (or jags.fit, in the case of jags.parfit) with 0 iterations to generate the RNG states. These are then used as inits and distributed to the workers.
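      If I’m reading that right, the trick can be sketched roughly like this. The $state(internal = TRUE) call is rjags’s way of extracting per-chain RNG states; whether jags.model hands out RngStream generators by default once the lecuyer module is loaded is an assumption I’d verify first.

```r
library(rjags); load.module("lecuyer")

# Throwaway model with zero adaptation, used only so that JAGS
# assigns a distinct RNG state to each chain up front.
mod0 <- jags.model(model, forJAGS, n.chains = 4, n.adapt = 0)

# internal = TRUE includes the .RNG.name / .RNG.state entries
chain.inits <- mod0$state(internal = TRUE)

# Each worker then gets one saved state as its inits.
jags.parsamples <- foreach(i = 1:4) %dopar% {
  m <- jags.model(model, forJAGS, inits = chain.inits[[i]],
                  n.chains = 1, n.adapt = 1000)
  coda.samples(m, params, 1000)
}
```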

  3. Let's wrap it all up and put it in jags() in the R2jags package.

  4. Xavier says:

    JAGS runs parallel chains on different cores when the BLAS and LAPACK libraries are configured to do so. In any GNU/Linux distribution, JAGS can be compiled against the threaded ATLAS library, so that by default it uses different cores for each chain. I think that is the easiest way.

  5. lingpipe says:

    Why should we be using L'Ecuyer's random-number generator? (That was a real question, not rhetorical.) If it uses a different instance per thread, then the instances don't need to be thread safe. I don't think the generators use huge amounts of memory to store their state, so there doesn't seem to be much to be saved from sharing a random-number generator across threads.

    • Peter Solymos says:

      I think L'Ecuyer's RNG kicks in when one wishes to run more than 4 chains in a thread-safe way. Otherwise, JAGS comes with 4 kinds of RNGs, and jags.model assigns different RNGs by default when they are not declared otherwise in the inits. I haven't tried L'Ecuyer with my dclone package, but I will do so soon enough.

  6. Niels says:

    Would it be possible to have JAGS use different data for each of the parallel chains? This would be convenient when using multiply imputed datasets: the same model needs to be run on each dataset, and an option to estimate these chains in parallel would speed things up considerably. Any pointers on how to accomplish this would be greatly appreciated.

    • @Niels: I’ve wrapped the calls to the JAGS functions into one which accepts just a list with all the control arguments (data and inits), then I use it as the first argument with foreach/%dopar% and the L’Ecuyer RNG. The main point is that foreach can take an arbitrary list to work with; this is not demonstrated above, but it is in the foreach docs. Each element of the list can be the setup for a single run, including data, inits, and control parameters for JAGS. I haven’t gotten around to posting the code yet…
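    A minimal sketch of that shape (run.one and setups are my names, not the commenter’s): each element of setups bundles one imputed dataset with its model, inits, and parameters, and foreach simply walks the list.

```r
library(doMC); registerDoMC()
library(rjags); load.module("lecuyer")

# One self-contained JAGS run per setup (field names are hypothetical)
run.one <- function(setup) {
  m <- jags.model(setup$model, setup$data, inits = setup$inits,
                  n.chains = 1, n.adapt = 1000)
  coda.samples(m, setup$params, 1000)
}

# setups: a list with one element per imputed dataset
results <- foreach(s = setups) %dopar% run.one(s)
```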