Maintenance cost is quadratic in the number of features

Bob Carpenter shares this story illustrating the challenges of software maintenance. Here’s Bob:

This started with the maintenance task of upgrading to the new Boost version 1.69, which is this pull request:

https://github.com/stan-dev/math/pull/1082

for this issue:

https://github.com/stan-dev/math/issues/1081

The issue happens first, then the pull request, then the fun of debugging starts.

Today’s story starts with an issue from today [18 Dec 2018] reported by Daniel Lee, the relevant text of which is:

@bgoodri, it looks like the unit tests for integrate_1d are failing. It looks like the new version of Boost has different behavior than what was there before.

This is a new feature (1D integrator) and it already needs maintenance.

This issue popped up when we updated from Boost 1.68 to Boost 1.69. Boost is one of only three C++ libraries we depend on, but we use it everywhere (the other two libraries are limited to matrix operations and solving ODEs). Boost has been through about 20 versions since we started the project, roughly two or three releases a year.

Among other reasons, we have to update Boost because we have to keep in sync with the CRAN package BH (Boost headers), due to CRAN’s maximum package size limitations. We can’t distribute our own version of Boost, which would let us control when these maintenance events happen; but we’d have to keep updating anyway just to keep up with Boost’s bug fixes, new features, etc.

What does this mean in practical terms? Messages like the one above pop up. I get flagged, as does everyone else following the math lib issues. Someone has to create a GitHub issue, create a GitHub branch, debug the problem on the branch, create a GitHub pull request, get that GitHub pull request to pass tests on all platforms for continuous integration, get the code reviewed, make any updates required by code review and test again, then merge. This is all after the original issue and pull request to update Boost. That was just the maintenance that revealed the bug.

This is not a five-minute job.

It’ll take one person-hour minimum with all the GitHub overhead and reviewing. And it’ll take something like a compute-day on our continuous integration servers if it passes the tests (less for failures). Debugging may take anywhere from 10 minutes to a day, or maybe two in the extreme.

My point is just that the more features we have like integrate_1d, the more of these maintenance events come up. As a result, maintenance cost is quadratic in the number of features.

Bob summarizes:

It works like this:

Let’s suppose a maintenance event comes up every 2 months or so (e.g., a new version of Boost, a reorg of the repo, a new C++ version, etc.). For each maintenance event, the amount of maintenance we have to do is proportional to the number of features we have. If features grow linearly, the work per event grows like 1 + 2 + 3 + …, and since maintenance events come at regular intervals, the total amount of time it takes is quadratic.

This is why I’m always so reluctant to add features, especially when they have complicated dependencies.

9 thoughts on “Maintenance cost is quadratic in the number of features”

  1. 1D integration and ODE solvers are both great features to have, so I’m really happy about that. But I can see that there’s a limit point where you will spend 100% of your time on maintenance… and that’s not great.

  2. Oh, it’d be nice if 100% maintenance were a limit point. But with uncontrollable externalities it can creep above 100% with a resulting increase in number of bugs until the whole thing collapses.

    Also, I wasn’t trying to say we shouldn’t add features, just that we need to be careful to keep the dependencies down where we can to avoid this quadratic behavior.

  3. Out of curiosity, does maintenance like this ever uncover previously unknown bugs or edge cases? It was mentioned that Boost does fix some bugs; do those bugs ever directly affect Stan code?

    • Boost updates haven’t, but compiler updates have. The huge problem with C++ is that there is a lot of undefined behavior in the specification, and it varies by compiler (or even by optimization level in the same compiler). When we change compilers, we sometimes uncover dependencies on undefined behavior, because the undefined behavior we were relying on happened to work the same way in all the configurations we do test. (A minimal example of this kind of behavior appears at the end of this comment thread.)

  4. What’s the scaling in terms of active contributors? To keep up with the quadratic cost of features, the best solution seems to be to increase the number of contributors and maintainers of the code base.

    Skipping features helps maintainability in the short run, but I’m not sure that works in the long run.

      • It’s very important self-discipline in public software projects to be willing to say no to feature requests for this reason. Even if people give you the code and it’s good code, a year later you are the one maintaining it. And adding contributors, as mentioned above, also adds complexity and more points of failure. Part of the “crisis” of software sustainability is that people can’t put in the time to maintain things and are just sick of it at a certain point.

        • It’s very important self-discipline in public software projects to be willing to say no to feature requests for this reason.

          Yes! Skipping features that don’t justify their maintenance cost is absolutely critical for long-term project stability.

          Stan’s at the point now where we can add contributors pretty easily in a way that scales. Lots of folks have been putting in work just fixing what we have, and also refactoring it so that it’s understandable. I’d like to particularly thank Andrew Johnson on that front, who’s been doing a lot of this in a really productive way. And Edward Roualdes and Andre Zapico are taking on flattening our math repo’s directory structure, which has been a huge burden in its present form (it was introduced to solve our previously tangled dependency include structure).

          Adding new independent features isn’t so bad. What’s really terrible is when they cascade. This is where we’re at with things like threading and MPI. This stuff’s ridiculously complicated to think through in terms of implications when, say, GPUs and MPI are turned on together with threading.
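Picking up the compiler point from the reply to comment 3 above, here is a minimal sketch (not code from the Stan math library) of how undefined behavior can vary with optimization level. Signed integer overflow is undefined in C++, so an optimizing compiler is allowed to assume x + 1 > x and delete the check, while an unoptimized build typically wraps around and takes the branch.

// Minimal illustration of optimization-dependent undefined behavior;
// a standalone example, not taken from the Stan math library.
#include <iostream>
#include <limits>

int main() {
    int x = std::numeric_limits<int>::max();
    // Signed overflow is undefined behavior: with optimizations on, the
    // compiler may assume x + 1 > x and fold this condition to false;
    // without optimizations, x + 1 usually wraps to INT_MIN and the
    // condition is true.
    if (x + 1 < x) {
        std::cout << "overflow wrapped around\n";
    } else {
        std::cout << "compiler assumed no overflow\n";
    }
    return 0;
}

Compiling the same file with and without optimizations can print different lines, which is exactly the kind of configuration-dependent behavior that only shows up when the compiler or its flags change.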
