Details of implementation of machine learning algorithms

Ben Bales writes:

I saw a presentation from Costa Huang (a graduate researcher working in reinforcement learning) who had some material which I thought would be interesting for the blog. (Full disclosure: Costa and I work at the same company.)

Apparently it can be a bit tricky to reproduce results in reinforcement learning even when code is shared, because there are lots and lots of extra details embedded in the code that aren’t always fully explained in words.

It all just sounded similar to MCMC-land by analogy — a billion tricks floating around, some things work in some places, some things make implementation simpler, some things make models more general, some defaults just seem to work weirdly well — hard to keep track of it all!

Here is a blog post they worked on to document and add citations for all the tips-n-tricks/implementation-details that go into getting the PPO (proximal policy optimization) reinforcement learning algorithm working well.

There’s a GitHub repo that goes along with this that includes 1-file implementations of a lot of these algorithms. If you click through the files here you can navigate through the details of a variety of different RL algorithms. (If you go to that repository’s main page it has a bunch of info on how to use this code; the implementation files are interesting to look at themselves tho.)

Anyway, implementation details! Often frustrating if you need to figure them out to get something working, but kinda interesting when someone else has done the work of cleanly documenting them!

I don’t know anything about this, but, yeah, it’s great to get all the steps that are needed to get these algorithms to run.
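To give a rough sense of what an "implementation detail" can look like, here is a minimal sketch of PPO's clipped policy loss with an optional per-minibatch advantage normalization step. This is an illustration only, not code from the linked post or repository: the function names, the `clip_coef=0.2` default, and the `eps` constant are assumptions chosen for the example, and it uses plain Python lists rather than a tensor library.

```python
from statistics import fmean, pstdev


def normalize(advantages, eps=1e-8):
    """Shift and scale advantages to zero mean and (roughly) unit std."""
    mean = fmean(advantages)
    std = pstdev(advantages)
    return [(a - mean) / (std + eps) for a in advantages]


def clipped_policy_loss(ratios, advantages, clip_coef=0.2, normalize_adv=True):
    """PPO clipped surrogate loss (a quantity to minimize) for one minibatch.

    ratios[i] is the probability ratio pi_new(a_i | s_i) / pi_old(a_i | s_i).
    advantages[i] is the advantage estimate for the same sample.
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if normalize_adv:
        # The easily overlooked step: toggling this changes the loss scale.
        advantages = normalize(advantages)
    losses = []
    for r, a in zip(ratios, advantages):
        # Clamp the ratio to [1 - clip_coef, 1 + clip_coef].
        clipped_r = min(max(r, 1.0 - clip_coef), 1.0 + clip_coef)
        # Take the pessimistic (smaller) objective, negated to form a loss.
        losses.append(-min(r * a, clipped_r * a))
    return fmean(losses)
```

The formula itself fits on one line of a paper, but choices like whether to normalize, over which batch, and with what `eps` live only in the code, which is the sort of thing that makes reproduction tricky.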

3 thoughts on “Details of implementation of machine learning algorithms”

  1. I am not familiar with the details of implementing PPO, but in general if some detail is relevant, shouldn’t it be part of the algorithm?

    Usually I find it particularly troublesome if anything, including unit tests, is very sensitive to seeding. This usually means that the algorithm is not very robust, or the problem itself is very difficult so results are not credible (e.g., optimizing an objective with many local maxima starting from random points; the same applies to MCMC).

    That said, documenting these things is great. I hate it when I come across fudge factors or undocumented adjustments in sources for an algorithm that I am trying to implement, with no explanations, or cryptic ones.

    • > shouldn’t it be part of the algorithm?

      I guess ideally, but algorithms also can change a lot depending on where they’re implemented and still carry the same name. Like an eigensolve has a lot of variations depending on where you want your eigenvalues. In this case, the PPOs vary in different ways depending on what game you’re trying to play.

      These things get so complicated these days though it’s hard to keep up.

      We’ve done a reading group thing at work a few times where we randomly pick papers from conferences and try to give 1-slide presentations on them. I got this one: https://openaccess.thecvf.com/content/CVPR2022/papers/Di_GPV-Pose_Category-Level_Object_Pose_Estimation_via_Geometry-Guided_Point-Wise_Voting_CVPR_2022_paper.pdf

      The complexity is through the roof! Had to dig into a couple background papers to even get the vague sense of what is going on. Not that that’s bad. I wouldn’t get far randomly picking a paper in a Math journal either.

      > Usually I find it particularly troublesome if anything, including unit tests, is very sensitive to seeding

      Oh yeah I like to just avoid seeds. If the thing is random, the thing is random and it’s nice to test it that way. Seeds are useful for triaging specific bugs and whatnot but the test is more valuable with more randoms.

      Ofc. that sorta thinking might fail with a bigger codebase, and random failures in automated testing are annoying when the answer is “just try again.”
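One way to reconcile the points in this thread is to test with fresh random seeds on every run, but record each seed so a failure can be replayed. The sketch below is an illustration under assumptions, not anything from the commenters: `estimate_pi` is a stand-in for whatever stochastic routine is under test, and `check_estimator`, the run counts, and the tolerance are hypothetical choices.

```python
import math
import random


def estimate_pi(n_samples, rng):
    """Monte Carlo estimate of pi; a stand-in for the routine under test."""
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples


def check_estimator(n_runs=20, n_samples=20_000, tol=0.1):
    """Run the estimator under fresh seeds; return (seed, estimate) failures.

    Each seed is drawn from the OS entropy source, so no run is pinned to a
    lucky value, but the seed is kept so a failing case can be reproduced.
    """
    failures = []
    for _ in range(n_runs):
        seed = random.SystemRandom().randrange(2**32)
        estimate = estimate_pi(n_samples, random.Random(seed))
        # tol is several standard errors wide (the standard error is roughly
        # 0.012 at this sample size), so spurious failures should be rare.
        if abs(estimate - math.pi) > tol:
            failures.append((seed, estimate))
    return failures
```

If `check_estimator()` returns a non-empty list, rerunning `estimate_pi(n_samples, random.Random(seed))` with a reported seed replays that exact case, so the seed serves for triage rather than for making the test pass. The tolerance has to be set from the estimator's expected spread; setting it too tight brings back the "just try again" failures.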
