Then write the equations (including prior information), add the data, fit or calibrate the model to get good estimates for parameters, and do the posterior simulations as Andrew suggested to get the quantities of interest and other insights into what’s going on. Since these are basically ODE models, you should be able to do that in Stan (*); I’ve also done it using MCSim and Vensim.

You can, if it fits your needs, consider expanding the model boundaries to include the effects of the D-day results on other key parts of the organization and the effects of their actions back on the measures. For a simple example, lower D-day results than desired might lead to more hiring, which, after an on-boarding phase, might lead to improved deal closure rates. Of course, the on-boarding could lead to temporarily lower productivity, as existing staff divert some time to training new hires. If not understood, that could lead to excess hiring, much as eating too fast without understanding dietary limits or waiting for a feeling of satiation can lead to eating way too much.
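The hiring feedback loop described above can be sketched as a toy stock-and-flow simulation. Every parameter and functional form below is an invented assumption for illustration, not something fit to data:

```python
# Toy stock-and-flow sketch of the hiring feedback loop: a closure shortfall
# drives hiring, new hires pass through an onboarding delay, and trainees
# temporarily drag down productivity. All numbers are made up.

def simulate(months=36, dt=1.0):
    staff, trainees = 15.0, 0.0      # stocks: experienced staff, hires in onboarding
    target_closures = 120.0          # desired deals closed per month (assumed)
    history = []
    for _ in range(int(months / dt)):
        # experienced staff close deals; trainees divert some mentor time
        closures = 6.0 * staff - 2.0 * trainees
        history.append((staff, trainees, closures))
        # hiring responds to the shortfall vs. the target
        hiring = max(0.0, 0.05 * (target_closures - closures))
        onboarded = trainees / 3.0   # roughly a 3-month onboarding delay
        trainees += (hiring - onboarded) * dt
        staff += onboarded * dt
    return history

hist = simulate()
```

Running it shows the dynamic in the text: closures dip below their starting level while trainees absorb mentor time, then recover as hires finish onboarding; misreading that dip as a persistent shortfall would keep the hiring spigot open too long.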

(*) Human and organizational decision processes are often nonlinear, and system dynamics simulators traditionally describe those functions through a set of x-y coordinates that are interpolated into some sort of piecewise linear function or perhaps some smoother splines. Vensim lets you specify the points by clicking on a graph, while MCSim lets you specify both the points and the interpolation method through the GSL in inline code. I don’t know the best, most natural way to approach that in Stan.
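For concreteness, here is a minimal sketch (in Python, with invented table values) of the kind of piecewise-linear lookup function those x-y point tables get turned into; the same logic could be written as a user-defined function in Stan over data arrays:

```python
from bisect import bisect_right

# A minimal sketch of the piecewise-linear "lookup" functions that system
# dynamics tools build from user-specified x-y points. Table values are
# invented for illustration.
xs = [0.0, 0.5, 1.0, 2.0]   # e.g. workload relative to normal
ys = [1.0, 0.95, 0.8, 0.5]  # e.g. productivity multiplier

def lookup(x):
    # clamp to the endpoint values outside the table range,
    # the usual system dynamics convention
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)
    t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + t * (ys[i] - ys[i - 1])
```

The clamping at the endpoints matters: it keeps the nonlinear function bounded even when the simulation wanders outside the range the modeler specified points for.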

dp/dt = Q(covariates, t) * H(t) * (1 - p)

p(0) = 0

Basically, H(t) is some “baseline” rate of occurrence per unit time, Q is a function of the covariates that multiplies the baseline rate and might itself be time dependent, and the (1 - p) factor keeps p from ever going above 1.
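To see the saturation behavior the (1 - p) factor produces, here is a quick numerical integration of the ODE; the particular H(t) and Q below are arbitrary stand-ins, not anything from real data:

```python
import math

# Numerically integrate dp/dt = Q * H(t) * (1 - p) with p(0) = 0 to see
# that p rises toward, but never exceeds, 1. H and Q are made-up examples.

def H(t):
    return 0.1 + 0.05 * math.sin(t)   # assumed time-varying baseline rate

Q = 1.5                               # assumed covariate multiplier

def solve(t_end=60.0, dt=0.01):
    p, t = 0.0, 0.0
    while t < t_end:
        p += Q * H(t) * (1 - p) * dt  # forward Euler step
        t += dt
    return p

p60 = solve()                         # close to, but strictly below, 1
```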

With p(0) = 0, the general solution (whether or not Q depends on t) is

p(t) = 1 - exp(-integrate(Q(c, s) * H(s), s, 0, t))

The basic “proportional hazards” model is the one where Q(c) has no time dependence and so can be pulled out of the integral, I think.
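A quick sanity check of the closed form, using a constant (made-up) baseline hazard so the integral has an obvious antiderivative:

```python
import math

# With constant Q, the closed form is p(t) = 1 - exp(-Q * integral_0^t H(s) ds).
# H here is a made-up constant baseline hazard, so the integral is just 0.1 * t.
Q = 1.5

def H(s):
    return 0.1

def p_closed(t):
    return 1.0 - math.exp(-Q * 0.1 * t)

def p_euler(t_end, dt=1e-3):
    # forward Euler integration of dp/dt = Q * H(t) * (1 - p), p(0) = 0
    p, t = 0.0, 0.0
    while t < t_end:
        p += Q * H(t) * (1 - p) * dt
        t += dt
    return p

err = abs(p_closed(10.0) - p_euler(10.0))  # small discretization error
```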

So in your analysis it’s easy to simply say “the longest a deal can run is 30 days” and mark it lost, but the truth may be that real-world deals can take longer than 30 days. Instead, you look at the deals-won data, which I assume is quite complete because there’s a record of the transaction if it occurs, and you model the survival time of an open deal until it closes as “won”. Then you go back to deals that were opened but never closed, impute probabilistically the “time to be won”, and mark each one lost at that point in time.

Then, via MCMC, you get a probabilistic “time to loss”, since you’ll impute a different loss time at each sampling step. So you don’t know exactly when you lost the deal, but you have a (probably fairly narrow) window in which it “should have closed if it was going to close”, and that is the point you’re calling “lost”.
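The imputation step might look like the sketch below. For simplicity it assumes an exponential time-to-won distribution with rate `lam` (the names `lam` and `impute_close_time` are mine); in the real model `lam` would be a fresh posterior draw at each MCMC iteration, which is what makes the imputed loss time different at each step:

```python
import random

# Sketch of imputing a "time to be won" for a still-open (censored) deal,
# assuming an exponential time-to-won distribution with rate lam.
def impute_close_time(open_days, lam, rng):
    # draw t ~ Exponential(lam) conditioned on t > open_days;
    # by memorylessness, that is open_days plus a fresh exponential draw
    return open_days + rng.expovariate(lam)

# e.g. a deal already open 45 days, with a typical 30-day deal lifetime
draws = [impute_close_time(45.0, 1 / 30, random.Random(i)) for i in range(1000)]
```

Every imputed close time necessarily exceeds the observed 45 open days, and the draws concentrate around open_days + 1/lam, which is the “should have closed by now” window described above.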

Doing this well requires some knowledge of realistic deal lifetimes, which can probably be estimated accurately from the deals-closed-and-won data, assuming that data is high quality.
