On rules engines and machine learning, graph theory for COVID vaccinations, MLflow and PyTorch, and grad school.

The vulnerable can wait. Vaccinate the superspreaders first. on WIRED, by Christopher Cox.

This article explores two competing ideas in distributing a future (but now, much closer!) COVID-19 vaccine. Most discussions and plans have said to vaccinate the most vulnerable populations first—the elderly, the immunocompromised, healthcare workers, other essential employees. This article makes the point that if your goal is to reduce transmission, you should vaccinate the superspreaders first.

This creates a need for a “utilitarian calculus,” as the author writes. In reality, you’ll want a combination of the two. You might argue that we have a moral imperative to protect those most at risk individually; you might just as easily argue that we have a moral imperative to reduce transmission as much as possible (which will also protect those at risk, but will favor many who are not).

Marc Lipsitch, an epidemiologist at Harvard’s School of Public Health, says the CDC committee is grappling with a fundamental question. “Essentially there are two approaches to using a vaccine,” he says. “One is to protect individuals by vaccinating them, and the other is to reduce transmission and therefore protect the population.”

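Bonus: how do you even find superspreaders? Use the friendship paradox: the idea that, on average, a person has fewer friends than their friends do. Take a random sample of the population, ask each person to name someone they know, and vaccinate the person they name. Well-connected people are more likely to be named, so the named group tends to have more contacts than the original sample.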
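Here is a minimal sketch of that sampling trick, assuming a synthetic contact network. The preferential-attachment graph and the names `make_network` and `compare_strategies` are illustrative assumptions, not something from the article. The sketch compares the average number of contacts of randomly sampled people with that of the people they name.

```python
import random
from statistics import mean


def make_network(n, m, seed=0):
    """Build a hypothetical contact network by preferential attachment.

    Each new person links to m existing people, favoring those who already
    have many contacts, which produces a few highly connected hubs.
    """
    rng = random.Random(seed)
    neighbors = {i: set() for i in range(n)}
    endpoints = []  # each person appears once per contact they have

    # Start from a small, fully connected core of m + 1 people.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            neighbors[i].add(j)
            neighbors[j].add(i)
            endpoints += [i, j]

    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            # Picking from endpoints is proportional to contact count.
            targets.add(rng.choice(endpoints))
        for t in targets:
            neighbors[new].add(t)
            neighbors[t].add(new)
            endpoints += [new, t]
    return neighbors


def compare_strategies(neighbors, sample_size, seed=1):
    """Return (avg contacts of a random sample, avg contacts of people they name)."""
    rng = random.Random(seed)
    sampled = rng.sample(sorted(neighbors), sample_size)
    # Each sampled person names one of their contacts at random.
    named = [rng.choice(sorted(neighbors[p])) for p in sampled]
    avg_sampled = mean(len(neighbors[p]) for p in sampled)
    avg_named = mean(len(neighbors[p]) for p in named)
    return avg_sampled, avg_named


if __name__ == "__main__":
    network = make_network(n=2000, m=2)
    avg_sampled, avg_named = compare_strategies(network, sample_size=500)
    print(f"random sample: {avg_sampled:.1f} contacts on average")
    print(f"named contacts: {avg_named:.1f} contacts on average")
```

On a network like this one, the named group should come out noticeably better connected than the random sample. The size of the gap depends on how uneven the real contact network is.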

Combining rule engines and machine learning by Neal Lathia talks about how machine learning can be used to complement, not replace, rule engines. In the author’s view, this can happen when your goal is to optimise a process that can’t be fully enumerated with rules (as opposed to controlling one that can).

Some examples are:

  • use machine learning to design new rules (then put the model away and just implement what you learned)
  • use ML as one sub-problem of a larger system
  • use ML to make a decision based on the outputs of rules

Great, actionable advice.
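As a rough illustration of the third pattern (a model deciding from the outputs of rules), here is a minimal sketch. Everything in it is hypothetical: the transaction fields, the three rules, and the hand-set weights that stand in for a trained logistic regression. None of it comes from the post.

```python
import math

# Hypothetical rules; in a real system these would come from the rule engine.
RULES = [
    lambda tx: tx["amount"] > 1000,
    lambda tx: tx["account_age_days"] < 7,
    lambda tx: tx["country_mismatch"],
]

# Placeholder parameters standing in for a trained logistic regression.
WEIGHTS = [1.2, 0.8, 1.5]
BIAS = -2.0


def rule_outputs(tx):
    """Run every rule and encode the results as 0/1 features."""
    return [1.0 if rule(tx) else 0.0 for rule in RULES]


def score(tx):
    """Combine the rule outputs into a single probability-like score."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, rule_outputs(tx)))
    return 1.0 / (1.0 + math.exp(-z))


def decide(tx, threshold=0.5):
    """Turn the score into an action."""
    return "review" if score(tx) >= threshold else "allow"


if __name__ == "__main__":
    tx = {"amount": 2500, "account_age_days": 400, "country_mismatch": True}
    print(rule_outputs(tx), round(score(tx), 2), decide(tx))
```

The rules stay readable and auditable, and the model only learns how much weight each rule deserves.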

What color is this? Part 2 (the computational parts) by Dan Tasse (Stitch Fix) is a follow-up to a previous post about finding the colors of clothes from images. The team clusters pixels of an image into “superpixels,” then uses a graph algorithm to merge nearby superpixels with the same color. I particularly appreciated how the post went into detail about ideas they tried that didn’t work. That’s a huge part of the research process that we never get to see.
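To make the merging step concrete, here is a minimal sketch. It is not the team's actual algorithm: it assumes superpixels have already been computed, represents each one by a mean RGB color, and uses union-find over an adjacency list with a hypothetical distance threshold (`max_distance`).

```python
import math


def merge_superpixels(colors, edges, max_distance=30.0):
    """Merge adjacent superpixels whose mean colors are close.

    colors: dict mapping superpixel id -> (r, g, b) mean color
    edges: iterable of (id_a, id_b) pairs for superpixels that touch
    Returns a dict mapping each superpixel id -> id of its merged region.
    """
    parent = {sp: sp for sp in colors}

    def find(sp):
        # Follow parent links to the region representative (with path halving).
        while parent[sp] != sp:
            parent[sp] = parent[parent[sp]]
            sp = parent[sp]
        return sp

    for a, b in edges:
        # Only neighbors with similar colors are merged.
        if math.dist(colors[a], colors[b]) <= max_distance:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[max(root_a, root_b)] = min(root_a, root_b)

    return {sp: find(sp) for sp in colors}


if __name__ == "__main__":
    colors = {
        0: (200, 30, 30),   # red
        1: (205, 35, 28),   # red, touching 0
        2: (20, 20, 180),   # blue
        3: (22, 25, 175),   # blue, touching 2
        4: (210, 28, 33),   # red, but not touching 0 or 1
    }
    edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
    print(merge_superpixels(colors, edges))
```

In this toy input, superpixel 4 is red but only touches a blue neighbor, so it stays in its own region. Color similarity alone is not enough; the superpixels also have to be adjacent.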

Machine learning at Monzo by Neal Lathia (again!) is a doc answering “what does machine learning at my company look like?” They wrote this post for prospective candidates, answering questions like “How do you decide what to work on?” and “How is your impact measured?” I would love to see more teams writing up their approaches to ML in this way.

MLflow and PyTorch - where cutting edge AI meets MLOps by Geeta Chauhan and Joe Spisak discusses the changing relationship between PyTorch and MLflow. MLflow is a platform to manage the machine learning lifecycle, including tracking model runs, managing deployments, monitoring, and more.

The post outlines some of the things the two ecosystems can do together:

  • saving and loading TorchScript models using MLflow
  • deploying models from MLflow through the TorchServe plugin
  • autologging for models written in PyTorch Lightning

I really want to get more experience with deploying and monitoring models. I am thinking about what kind of projects I could work on to get this experience, but don’t really know. If anyone has any ideas, I’d love to hear them!

I did what I wanted during my PhD and I regret nothing by Kimberly Hirsh talks about the different, non-academic activities that the author did during her PhD. Producing an improv group, hosting a Buffy the Vampire Slayer podcast, and creating her own blog all helped her to grow personally and have more fun. Great advice to keep in mind.