Summer internships at Flatiron Institute’s Center for Computational Mathematics

[Edit: Sorry to say this to everyone, but we’ve selected interns for this summer and are no longer taking applications. We’ll be taking applications again at the end of 2022 for positions in summer 2023.]

We’re hiring a crew of interns again this summer, both undergraduates and graduate students. Here’s the ad.

I’m afraid the pay is low, but to make up for it, we cover travel, room, and most board (3 meals/day, 5 days/week). Also, there’s a large cohort of interns every summer across the five institutes at Flatiron (biology, astrophysics, neuroscience, quantum physics, and math), so there are plenty of peers with whom to socialize. Another plus is that we’re in a great location, on Fifth Avenue just south of the Flatiron Building (in the Flatiron neighborhood, which is a short walk to NYU in Greenwich Village and Google in Chelsea as well as to Times Square and the Hudson River Park).

If you’re interested in working on stats, especially applied Bayesian stats, Bayesian methodology, or Stan, please let me know via email at [email protected] so that I don't miss your application. We have two other Stan devs here, Yuling Yao (postdoc) and Brian Ward (software engineer).

We’re also hiring full-time permanent research scientists at both the junior and senior levels, as well as postdocs and software engineers. For more on those jobs, see my previous post on jobs at Flatiron. That post has lots of nice photos of the office, which is really great. Or check out Google’s album of photos.

Naming conventions for variables, functions, etc.

The golden rule of code layout is that code should be written to be readable. And that means readable by others, including you in the future.

Three principles of naming follow:

1. Names should mean something.

2. Names should be as short as possible.

3. Use your judgement to balance (1) and (2).

The third principle is where all the fun happens. Do we use short names like “i” and “n” for integer loop variables by convention? Yes, we do. Do we choose “inv_logit” or “inverse_logit”? Stan chose “inv_logit”. Do we choose “complex” or “complex_number”? C++ chose “complex”, and it also chose “imag” over “imaginary” for the method that pulls out the imaginary component.

Do we use names like “run_helper_function”, which is long yet provides zero clue as to what it does? We don’t if we want to do unto others as we’d have them do unto us.
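
To make the balancing act concrete, here’s a tiny C++ sketch. “inv_logit” is Stan’s actual name; the other function and variable names are hypothetical, made up for illustration.

```cpp
#include <cmath>
#include <vector>

// Short, conventional, meaningful: "inv_logit" follows Stan's choice;
// "n" and "i" follow integer loop-variable convention.
double inv_logit(double x) {
  return 1.0 / (1.0 + std::exp(-x));
}

double sum_inv_logit(const std::vector<double>& xs) {
  double sum = 0;
  int n = xs.size();
  for (int i = 0; i < n; ++i) sum += inv_logit(xs[i]);
  return sum;
}

// Long and meaningless at the same time: fails principles (1) and (2) at once.
double run_helper_function(const std::vector<double>& the_input_data_values);
```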

P.S. If the producers of Silicon Valley had asked me, Winnie would’ve dumped Richard after a fight about Hungarian notation, not tabs vs. spaces.

Three informal case studies: (1) Monte Carlo EM, (2) a new approach to C++ matrix autodiff with closures, (3) C++ serialization via parameter packs

Andrew suggested I cross-post these from the Stan forums to his blog, so here goes.

  • Maximum marginal likelihood and posterior approximations with Monte Carlo expectation maximization: I unpack the goal of maximum marginal likelihood and approximate Bayes with MMAP and Laplace approximations. I then go through the basic EM algorithm (with a traditional analytic example in the appendix). Only then do I get to the (Markov chain) Monte Carlo approach to the marginalization, stochastic approximation EM (SAEM), generalized EM, and computing gradients of expectations with Monte Carlo (the trick used in Stan’s variational inference algorithm, ADVI), and I conclude with Andrew’s new algorithm, gradient-based marginal optimization (GMO). (I sketch the Monte Carlo E-step after this list.) My goal is to define the algorithms well enough to be implemented. I was just trying to understand MML and the SAEM algorithm (from Monolix) so I could talk to folks like Julie Bertrand and France Mentre here at Paris-Diderot. Eventually, it led me to a much better understanding of GMO and why Andrew thinks of MMAP not as a Bayesian-motivated estimator but as the basis of a posterior approximation.

  • C++ parameter packs for (de)serialization: On a completely different note, one that gets down to actual C++ code, I show how to use C++’s parameter packs feature to implement variadic functions and apply it to serialization and deserialization (packing and unpacking structured data into simple arrays); a minimal code sketch follows this list. This will be the basis of a Stan feature that should be very helpful for marshaling and unmarshaling arguments to our functionals like the ODE solvers, integrators, and algebraic solvers. I did this one so I could understand Ben Bales’s groovy new work on the variadic adjoint-Jacobian product implementation of reverse mode. This is also the key that’s going to unlock our ability to test and get out a reliable higher-order autodiff implementation, which in turn is the gateway to releasing Riemannian HMC.

  • A new continuation-based autodiff by refactoring: I walk through four stages of developing a new autodiff system in C++. I explain how reverse-mode autodiff can be viewed as continuations. These continuations can be implemented cleanly with C++ lambdas and std::function types, but that approach isn’t very efficient. So I develop custom closures and then show how it can all be put together for matrices without the need to hold matrices of autodiff variables. (A stripped-down sketch of the continuation view also follows this list.)
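
For orientation on the first post, here’s the heart of the Monte Carlo E-step in my own notation (a sketch; α for the latent variables, φ for the parameters, and M for the number of draws are my labels, not necessarily the post’s): replace the E-step expectation with an average over MCMC draws from the conditional posterior.

```latex
% EM targets the marginal likelihood p(y | phi) = \int p(y, alpha | phi) d alpha.
% Monte Carlo E-step: approximate the expected complete-data log density
% using draws alpha^(m) ~ p(alpha | y, phi^(t)) from MCMC.
\[
Q(\phi \mid \phi^{(t)})
  = \mathbb{E}\big[\log p(y, \alpha \mid \phi) \,\big|\, y, \phi^{(t)}\big]
  \approx \frac{1}{M} \sum_{m=1}^M \log p\big(y, \alpha^{(m)} \mid \phi\big).
\]
% M-step: set phi^(t+1) to the phi that maximizes the approximate Q.
```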
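
For the second post, here’s a minimal, self-contained sketch of the serialization direction with parameter packs. The function name “serialize” and the double-only value types are my simplifications for illustration, not Stan’s actual design.

```cpp
#include <iostream>
#include <vector>

// Base case: empty pack, nothing left to serialize.
void serialize(std::vector<double>& out) {}

// Scalar case: append one value, then recurse on the rest of the pack.
template <typename... Rest>
void serialize(std::vector<double>& out, double x, const Rest&... rest) {
  out.push_back(x);
  serialize(out, rest...);
}

// Container case: append all elements, then recurse.
template <typename... Rest>
void serialize(std::vector<double>& out, const std::vector<double>& xs,
               const Rest&... rest) {
  out.insert(out.end(), xs.begin(), xs.end());
  serialize(out, rest...);
}

int main() {
  std::vector<double> flat;
  serialize(flat, 1.5, std::vector<double>{2.0, 3.0}, 4.5);
  for (double x : flat) std::cout << x << " ";  // prints: 1.5 2 3 4.5
  std::cout << "\n";
}
```

Deserialization runs the same recursion in the other direction, consuming the flat array according to the static types of the arguments.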
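
And for the third post, here’s the continuation view boiled down to a single multiply node: each operation pushes a closure onto a tape, and running the tape backward propagates adjoints by the chain rule. The Var type and tape here are mine, and a real system would allocate nodes in an arena rather than leaking them.

```cpp
#include <functional>
#include <iostream>
#include <vector>

// A variable holds its value and its adjoint (accumulated partial derivative).
struct Var {
  double val;
  double adj = 0;
};

// One continuation per operation; playback in reverse order is the reverse pass.
std::vector<std::function<void()>> tape;

// y = a * b: record a continuation pushing y's adjoint back to a and b.
Var* multiply(Var* a, Var* b) {
  Var* y = new Var{a->val * b->val};  // leaked; a real system uses an arena
  tape.push_back([=]() {
    a->adj += b->val * y->adj;
    b->adj += a->val * y->adj;
  });
  return y;
}

// Seed the result's adjoint and play the tape backward.
void grad(Var* y) {
  y->adj = 1;
  for (auto it = tape.rbegin(); it != tape.rend(); ++it) (*it)();
}

int main() {
  Var a{2}, b{3};
  Var* y = multiply(&a, &b);          // y->val == 6
  grad(y);
  std::cout << "dy/da = " << a.adj    // 3 (= b.val)
            << ", dy/db = " << b.adj  // 2 (= a.val)
            << "\n";
}
```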

Comments welcome, of course, either here or, even better, on the linked forum discussions.

P.S. I figured out how to install the old WordPress editor without sysadmin help. The new one’s horrible!