(sigma)/(A + b + n)*sqrt( A*(b + n)/(A + b + n + 2) ), assume n >> A and n >> b, and tidy up the math under those assumptions. Now you get

(sigma/T)^2 <= A = a + w,

where "w" is the number of lottery wins. This agrees with your frequentist version: if we take the (1,1) Bayes/Laplace prior, we need to win the lottery 99 times if sigma=1, or 399 if sigma=2. We don't care about the value of n, provided it swamps the prior. We can also derive this result from the first approach if we assume s is very large.
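As a quick arithmetic check of those win counts (my own sketch; I'm assuming T here is the 0.1 relative tolerance and that the (1,1) prior contributes a = 1):

```python
import math

def wins_needed(sigma, T=0.1, a=1):
    # Smallest number of lottery wins w such that (sigma/T)^2 <= A = a + w.
    return max(0, math.ceil((sigma / T) ** 2) - a)

print(wins_needed(1))  # 99 wins when sigma = 1
print(wins_needed(2))  # 399 wins when sigma = 2
```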

Unfortunately, none of it invalidates my blog post. There's a third way to clean up the inequality, by asserting A = 0. This converts it to 0 <= 0, which is true, and thus my subjective (0,1) prior doesn't need a single sample. Frequentist statistics demands a "reasonable" n, but for the typical sample sizes we'd use, that still leads to a maximum-likelihood estimate of 0 and a confidence interval of 0 the vast majority of the time. That fits your parameters, as well as Neyman's demands for a frequentist confidence interval, so we can reach a conclusion in well under "millions of years."

Frequentism still winds up with two contradictory answers. Unlike Bayesian statistics, it can't pin the blame on the choice of priors, so this is a legit paradox.

How about this interpretation instead: we stop taking samples when the confidence interval is of the form [0.9*x, 1.1*x], for some real value of x. Now we don’t need to know the true value at all.
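One naive way to code up that stopping rule (a sketch, not the commenter's implementation; it assumes a plain Wald interval and ignores the bias that optional stopping introduces):

```python
import math
import random

def sample_until_relative(p_true, rel=0.1, z=1.96, seed=0):
    """Draw Bernoulli(p_true) samples until the z-sd Wald half-width
    fits inside [ (1-rel)*p_hat, (1+rel)*p_hat ]."""
    rng = random.Random(seed)
    wins = n = 0
    while True:
        n += 1
        wins += rng.random() < p_true
        p_hat = wins / n
        # Require 0 < p_hat < 1: the Wald interval degenerates to width 0
        # at the boundaries and would trigger an immediate (spurious) stop.
        if 0 < p_hat < 1 and z * math.sqrt(p_hat * (1 - p_hat) / n) <= rel * p_hat:
            return n, p_hat

n, p_hat = sample_until_relative(0.01)
print(n, p_hat)
```

Note that the required n scales like 1/p: the rarer the event, the longer this rule runs before the relative criterion is met.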

This is what he means by “relative,” as compared with your calculation, which targets an “absolute” size less than 0.1; that can be achieved with the couple dozen or hundred samples he mentions in the third paragraph.

My reading is definitely correct insofar as the relative-vs-absolute distinction is concerned. See the second paragraph, where he says: “Back of the envelope, to get an estimate within 10% of the true value of 1/300M will take many millions of years.”

10% of the true value of 3.3e-9 means an error width of 3.3e-10

So if p = 1.0/300e6 ~ 3.3e-9 and you want the answer to within 3.3e-10, you will need to take a very large number of samples.
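A back-of-the-envelope sketch of how large, assuming a one-sigma error of sqrt(p*(1-p)/M) and (per the sunrise setup) one observation per day:

```python
p = 1.0 / 300e6                  # ~3.3e-9
target = 0.1 * p                 # absolute error target of ~3.3e-10
M = p * (1 - p) / target ** 2    # solve sqrt(p*(1-p)/M) <= target for M
years = M / 365.25               # one sample (one sunrise) per day
print(f"M = {M:.2e} samples, ~{years:.2e} years")
```

That works out to roughly 3e10 samples, i.e. tens of millions of years of daily sunrises, which matches the "many millions of years" estimate.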

I *can* think of a way to wind up in the millions-of-years zone, though it involves ditching Bayesian statistics for frequentism. Even then, it looks like you accidentally stumbled on a contradiction in frequentist statistics that allows for an answer much smaller than millions of years.

That was the TL;DR. The long, math-y version is here: https://freethoughtblogs.com/reprobate/2020/02/19/dear-bob-carpenter/

First off, this equation is wrong:

var(x) = sqrt(p*(1-p)/M)

It’s sd(x) = sqrt(p*(1-p)/M), not var(x).

Second, sd(x) is related to sqrt(p*(1-p)) while the estimate is of p itself, so the error is *relative* (and it’s nonlinear). The problem is that as x goes to 0, d/dx sqrt(x) -> infinity, so even small relative errors get amplified by a big amount.
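A numeric illustration of that point (my own sketch; the values of p and M are arbitrary): even M = 10,000 samples leaves roughly a 30% one-sd relative error when p = 1e-3.

```python
p, M = 1e-3, 10_000
sd = (p * (1 - p) / M) ** 0.5    # this is sd(x), not var(x)
print(f"sd = {sd:.2e}")          # the absolute error looks tiny...
print(f"sd / p = {sd / p:.1%}")  # ...but the relative error is ~32%
```

Pushing sd/p below 10% here would take M > 100*(1-p)/p, about 1e5 samples, and that requirement scales like 1/p as the event gets rarer.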

For discrete outcomes, I talked about the effect on accuracy of marginalizing them out—this is huge for tail probabilities like in the change-point example in the Stan user’s guide chapter on discrete parameters.

And of course, you want to use real knowledge in real situations and not just rely on weak binary outcomes. Like our knowledge of astrophysics in estimating whether the sun will rise tomorrow. Or previous launch-angle and speed statistics for evaluating whether a batter will get a hit in baseball.

I know this is not the point of your post, but . . . if you want to estimate the probability of rare events, it makes sense to use a mixture of analytical and simulation approaches, as for example discussed in this article from 1998. A little bit of analytical work can make a huge difference in getting fast and stable estimates.
