Articles I read this week, including predictions for 2020 and FiveThirtyEight’s primary methodology.

Top minds in machine learning predict where AI is going in 2020

Author: Khari Johnson

How I found this: from /r/machinelearning

Summary: this is an interview with “PyTorch creator Soumith Chintala, University of California professor Celeste Kidd, Google AI chief Jeff Dean, Nvidia director of machine learning research Anima Anandkumar, and IBM Research director Dario Gil.”

  • Chintala: thinks the field hasn’t seen a breakthrough on the scale of the Transformer (2017). He predicts that the community will emphasize things besides accuracy, like explainability and “how AI can better reflect the kind of society people want to build.”
  • Kidd: thinks it’s likely that there’ll be a tighter connection between what you already know and what you want to learn next. This was one of the subjects of her NeurIPS talk, where she described the zone between people’s previous interests and what’s surprising to them as the place where learning is most likely to happen. She also thinks that the “black box” term is meaningless, because brains, too, are black boxes. “In 2020, she wants to see increased awareness of the real-life implications of tech tools and technical decisions and a rejection of the idea that the makers of tools aren’t responsible for what people do with them.”
  • Dean: expects to see progress in multitask and multimodal learning, and pointed to the continued growth of language models. He wants to see more work on robust models rather than just chasing SOTA advances.
  • Anandkumar: expects progress in “iterative algorithms, self-supervision, and self-training methods,” her view being that training a feed-forward network in a single pass is where robustness issues come from, and that iterative algorithms can help address this (see the toy self-training sketch after this list). “Policymakers, individuals, and the AI community will also need to grapple with issues of representation and the challenge of ensuring data sets used to train models account for different groups of people.”
  • Gil: predicts more progress toward efficient training “with reduced-precision architectures” (what?). He hopes the field will move beyond accuracy to “building trusted systems,” and he also thinks it’s important to make AI more accessible to ordinary data scientists and software engineers. Issues with data fairness and integrity will continue to get more attention.
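
To make Anandkumar’s iterative-vs.-single-pass point concrete, here is a toy self-training loop. This is my own illustration, not anything from the article: a classifier pseudo-labels the unlabeled points it is confident about and is retrained on the growing set, rather than being fit once and left alone.

```python
# Toy self-training loop: instead of fitting the classifier once, keep
# pseudo-labeling the unlabeled points it is confident about and refit.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_labeled = rng.normal(size=(50, 5))
y_labeled = (X_labeled[:, 0] > 0).astype(int)    # synthetic labels
X_unlabeled = rng.normal(size=(500, 5))

model = LogisticRegression()
for _ in range(5):                               # iterative refinement
    model.fit(X_labeled, y_labeled)
    if len(X_unlabeled) == 0:
        break
    probs = model.predict_proba(X_unlabeled)
    confident = probs.max(axis=1) > 0.95         # keep confident pseudo-labels
    if not confident.any():
        break
    X_labeled = np.vstack([X_labeled, X_unlabeled[confident]])
    y_labeled = np.concatenate([y_labeled, probs[confident].argmax(axis=1)])
    X_unlabeled = X_unlabeled[~confident]

print(f"final training-set size: {len(y_labeled)}")
```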

Thoughts: this is a well-written article with a ton of different ideas. I think one of the recurring themes is a shift away from accuracy and SOTA performance in favor of robustness and explainability. Kidd’s thoughts were the most interesting to me, both that brains are black boxes, too, and the idea of makers being responsible for the tools that they develop. I’m interested to think more about these throughout the year.

FiveThirtyEight: How Our Primary Model Works

Author: Nate Silver

How I found this: reading FiveThirtyEight

I love FiveThirtyEight. This is the first time they’ve published a model for an entire presidential primary; in past cycles they’ve forecast individual primaries but never the whole race. The linked post walks through their methodology at a fairly high level. Broadly speaking, the steps are (a toy sketch of the simulation-and-delegate step follows the list):

  • Take national & state polls and translate them into a polls-only forecast
  • Take demographics, geography, and fundraising to create a non-polls forecast for each state
  • Simulate the primary process day by day
  • Simulate state and district results, then allocate delegates
  • Simulate bounces & crashes from winning & losing primaries
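
As a way to internalize the last few steps, here is a heavily simplified Monte Carlo sketch of the state-simulation and delegate-allocation idea. This is my own toy version with made-up candidates and numbers, not FiveThirtyEight’s actual model: it samples vote shares around a forecast mean, applies the Democratic primaries’ 15% qualifying threshold, and awards a state’s delegates proportionally among qualifying candidates.

```python
# Toy Monte Carlo of the simulate-then-allocate steps: sample vote shares
# around a forecast mean, apply the 15% qualifying threshold, and award
# this state's delegates proportionally among qualifying candidates.
import numpy as np

rng = np.random.default_rng(538)
candidates = ["A", "B", "C", "D"]                     # hypothetical field
forecast_share = np.array([0.30, 0.25, 0.20, 0.15])   # hypothetical means
delegates_at_stake = 41
n_sims = 10_000

totals = np.zeros(len(candidates))
for _ in range(n_sims):
    share = forecast_share + rng.normal(scale=0.05, size=len(candidates))
    share = np.clip(share, 0, None)
    share /= share.sum()
    qualified = share >= 0.15                         # delegate threshold
    if not qualified.any():
        continue
    qualified_share = np.where(qualified, share, 0.0)
    totals += delegates_at_stake * qualified_share / qualified_share.sum()

for name, avg in zip(candidates, totals / n_sims):
    print(f"candidate {name}: {avg:.1f} expected delegates")
```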

The first step (a polling forecast) is FiveThirtyEight’s bread and butter: they adjust state polls based on national trends, correct for individual pollsters’ biases, and vary the forecast’s sensitivity depending on what’s going on (more sensitive around Super Tuesday or Iowa, less sensitive when nothing is happening). Historically, their poll-based forecasts have performed well, so this all sounds reasonable.
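
Here is a minimal sketch of what those adjustments might look like for a single poll. The structure, function name, and parameters below are my own assumptions for illustration, not FiveThirtyEight’s actual formulas.

```python
# A single-poll version of the adjustments described above (assumed
# structure, not FiveThirtyEight's actual formulas).
def adjust_poll(raw_share, house_effect, national_then, national_now,
                sensitivity=1.0):
    """Adjust one candidate's share from one state poll.

    raw_share     -- candidate's share in the state poll (e.g. 0.24)
    house_effect  -- the pollster's estimated lean toward this candidate
    national_then -- national trend-line share when the poll was fielded
    national_now  -- national trend-line share today
    sensitivity   -- how aggressively to chase recent movement
                     (higher near Iowa or Super Tuesday, lower otherwise)
    """
    trend_shift = national_now - national_then
    return raw_share - house_effect + sensitivity * trend_shift

# A 24% state poll from a pollster leaning +1 toward the candidate, taken
# when the candidate was at 20% nationally and has since risen to 23%:
print(adjust_poll(0.24, 0.01, 0.20, 0.23, sensitivity=1.2))
```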

The non-polling forecast is more interesting. Because most states have little or no polling, FiveThirtyEight adjusts national polls in two ways: geographically and demographically. The geographic method adjusts national polls for the candidates’ home states and regions, and also uses state-level fundraising as a proxy for how each candidate is doing in each state. The demographic method runs a bunch of regressions on demographic variables to figure out which combinations best predict the results.
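
The demographic idea can be illustrated with a single toy regression (made-up covariates and numbers, and one regression instead of the many that FiveThirtyEight actually combines): fit a candidate’s support in polled states on demographic features, then predict support in an unpolled state from its demographic profile.

```python
# One toy demographic regression: fit polled-state support on demographic
# covariates, then infer support in an unpolled state. All numbers made up.
import numpy as np
from sklearn.linear_model import LinearRegression

# columns: share college-educated, share nonwhite, median age (scaled)
polled_demographics = np.array([
    [0.35, 0.20, 0.38],
    [0.28, 0.45, 0.36],
    [0.42, 0.15, 0.40],
    [0.31, 0.30, 0.35],
])
polled_support = np.array([0.22, 0.31, 0.18, 0.26])  # candidate's polled share

model = LinearRegression().fit(polled_demographics, polled_support)

unpolled_state = np.array([[0.30, 0.35, 0.37]])      # unpolled state's profile
print(f"inferred support: {model.predict(unpolled_state)[0]:.2f}")
```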

On the surface, this seems similar to Gelman’s multilevel regression and poststratification (MRP), in that both are interested in producing state-level estimates from national polls. But FiveThirtyEight uses 360 different regressions, each weighted differently, to contribute to their forecast. Moreover, this post from Gelman’s blog shows that Nate Silver doesn’t actually think that highly of MRP, so FiveThirtyEight ends up taking a different approach to survey weighting.
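
For contrast, here is just the poststratification step of MRP in toy form (my own example with made-up cells and numbers, not Gelman’s or FiveThirtyEight’s code): once you have model-based support estimates for demographic cells, a state estimate is simply the population-weighted average over that state’s cells, whereas FiveThirtyEight instead weights the outputs of many separate regressions.

```python
# Just the "P" in MRP: a state estimate is the population-weighted average
# of model-based support estimates for that state's demographic cells.
state_cells = [
    # (cell's share of the state population, modeled support in that cell)
    (0.40, 0.18),   # e.g. white, no college degree
    (0.25, 0.29),   # e.g. white, college degree
    (0.35, 0.33),   # e.g. nonwhite
]
state_estimate = sum(weight * support for weight, support in state_cells)
print(f"poststratified state estimate: {state_estimate:.3f}")
```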

The rest of the article is worth reading, too, but as I’m writing this I’ve taken a detour into Gelman’s blog and Silver’s Twitter to hear more about their views on MRP and other statistical techniques. Maybe there’ll be more on that later.