There’s a saying in mathematics that one way to solve a hard problem is to frame it in more generality. For example, instead of summing the series, evaluate the generating function. To prove a theorem about prime numbers, prove it in the larger space of ideals.
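A small worked illustration of the generating-function idea, chosen here only as an example: to sum the series of n/2^n over n ≥ 1, it is easier to work with the whole function of x and plug in x = 1/2 at the end.

```latex
\[
G(x) = \sum_{n \ge 0} x^n = \frac{1}{1-x}, \qquad |x| < 1,
\]
\[
x\,G'(x) = \sum_{n \ge 1} n\,x^n = \frac{x}{(1-x)^2},
\]
\[
\sum_{n \ge 1} \frac{n}{2^n} = \frac{1/2}{(1 - 1/2)^2} = 2 .
\]
```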
A similar idea arises in statistics. If a model is hard to fit, embed it in a hierarchical structure. This came up in our paper on meta-analysis with a single study. Or, in computation: even if all you want is a posterior mean, the most direct route to that end can be to construct an algorithm to sample from the posterior distribution. Some integrals are hard enough to evaluate that it takes less effort to just write Stan.
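Here is a minimal sketch of the "sample to get the mean" idea, in plain Python rather than Stan. The model (7 successes in 10 trials with a uniform prior), the function names, and the tuning values are all assumptions made for illustration. The example is deliberately one where the exact answer is known (the posterior is Beta(8, 4), with mean 8/12), so the sampler can be checked against it.

```python
import math
import random


def log_posterior(theta, successes=7, trials=10):
    """Unnormalized log posterior: binomial likelihood, uniform prior on (0, 1)."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    return successes * math.log(theta) + (trials - successes) * math.log(1.0 - theta)


def metropolis_posterior_mean(n_draws=50_000, burn_in=5_000, step=0.2, seed=1):
    """Estimate the posterior mean by averaging random-walk Metropolis draws."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta)
    total = 0.0
    for i in range(n_draws + burn_in):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio).
        # 1 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < candidate - current:
            theta, current = proposal, candidate
        if i >= burn_in:
            total += theta
    return total / n_draws


if __name__ == "__main__":
    print("Monte Carlo estimate:", metropolis_posterior_mean())
    print("Exact Beta(8, 4) mean:", 8 / 12)
```

The sampler never needs the normalizing constant, which is the integral that makes the direct calculation hard in less convenient models.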
Here’s another example. Back in 2012, I wrote:
Google Translate for code
Wouldn’t it be great if Google Translate could work on computer languages? I suggested this and somebody said that it might be a problem because code isn’t always translatable. But that doesn’t worry me so much. Google Translate for human languages isn’t perfect either but it’s a useful guide. If I want to write a message to someone in French or Spanish or Dutch, I wouldn’t just write it in English and run it through Translate. What I do is try my best to write it in the desired language, but I can try out some tricky words or phrases in the translator. Or, if I start by translating, I go back and forth to make sure it all makes sense.
An R help-list bot
We were talking about how to build a Stan community that will be helpful to a diverse range of users without taking up too much of our time, and that’s when I came up with a brilliant idea. Let’s take a successful existing help group (for example, the R-help mailing list) and then make a database of the helpful bits of advice of a distinguished and frequent contributor to the list. The bot would be easy: whatever the question is that comes in, just send back a random tip. I have a feeling that advice such as “PLEASE do, and not send HTML” and “My guess is that this is a Mac-specific question (e.g. you are using the R.app GUI), so please consider if this is the appropriate list” and “The posting guide was not followed” and “Please use the R-devel list to comment on current development versions” would work pretty well for almost any question (maybe after some global sub of Stan for R).
OK, that last one was a joke (followed up here, if you want to know the context).
The point is that here we are, 13 years later, and these tools really exist! There really are computer programs that will automatically translate code from one language to another, and there are computer programs that will give you R help based on what’s been posted on the internet. And the way this was done was not by writing separate translation programs and help-list programs but rather by writing general-purpose chatbots that do both these things (as well as writing stupid government reports and ruining some children’s education). These chatbots are super impressive from a technical perspective and also can be very useful. And it’s interesting how the best way to solve the specific challenges was by constructing something more general.
“And it’s interesting how the best way to solve the specific challenges was by constructing something more general.”
Don’t know if it’s necessarily the “best” way. The general-purpose LLMs in existence have cost something like $10 billion to develop. Maybe an R-specific bot would have cost “only” $50 million.
Small:
Well, it was easier for me. I didn’t have to do anything; it just happened.
I only use ChatGPT rarely and that’s because when I’ve asked it for help with R, its answers have been on the right track (usually) but incorrect. I’m not sure I would consider that “the best way to solve the challenge.” I think I would be happier with an R-specific bot, or god help us, the advice of a qualified professional.