I’m spending a lot of time reading Cathy O’Neil’s Weapons of Math Destruction, which is excellent. This week’s articles include advice for grad students and some advice on software engineering.

ProDUCT Lab PhD Student Guide

Author: Lana Yarosh

I was looking at the GroupLens lab at the University of Minnesota, which specializes in recommender systems and online communities. This was a “guide to being a PhD student” by one of the professors in the group, Dr. Lana Yarosh. I think her work isn’t quite the right fit for me, but the guide was very helpful because it presented information in a clear, structured format.

Write code that is easy to delete, not easy to extend

Author: tef (at programmingisterrible)

This post opens:

Every line of code written comes at a price: maintenance. To avoid paying for a lot of code, we build reusable software. The problem with code re-use is that it gets in the way of changing your mind later on.

I love some of the more specific pieces of advice:

  • create simple-to-use APIs out of simpler-to-implement but clumsier-to-use parts
  • isolate hard-to-write and likely-to-change parts of code from the rest
  • copy/paste code a few times before making it a shared API, because writing it multiple times gives you the flexibility to change how it’s used
  • build libraries that suit your code, even if they’re light wrappers around other things

I’m tired of summarizing for now, but the rest is really nice advice too. It strikes me as most useful for when you’re starting a project on your own and have an infinite design space in front of you.

Array functions and the rule of least power

Author: Jesse Duffield

The Rule of Least Power is the idea that there is a tradeoff between the power (or flexibility) of a language and the ability to understand what some code in that language is doing.

It’s therefore interesting that some people say say that the ‘functional’ array functions like .filter, .map, and .reduce are powerful compared to their crude for-loop alternatives. I would say the opposite: they are far less powerful, and that’s the point.

filter, map, and reduce are less powerful than for-loops because they each do one thing well. With a for-loop, you have no idea what might be inside; with map or filter, you at least know “do something to every element” or “return some elements.”

Perhaps most importantly, callables past to filter and map are typically assumed not to have side effects. With a for-loop, anything goes. But with the less powerful functions, you have more of an idea of what they will and won’t be doing.

Advice to myself when starting out as a software developer

Author: Gregerly Orosz (The Pragmatic Engineer)

The advice is:

  • read two books per year on software engineering
  • learn the language you use at work in-depth, to the very bottom
  • pair with other developers more often
  • write unit tests and run them against CI
  • make refactoring a habit and master refactoring tools
  • know that good software engineering is experience. get lots of it.
  • teach what you learn

I love the idea of understanding your language in-depth. Orosz talks about how he decided to learn C# in-and-out after two years at his first job, then JS and WinJS. The more you know about a language, he writes, the easier it is to evaluate their strengths and weaknesses.

Pairing and unit tests are things that I struggle with in data science, but that I see the value of. I pair with my teammate every couple of weeks, but it’s mostly while debugging and only once while writing new code. Unit tests in data science are a … perpetual work in progress.

And, finally, “teach what you learn” is also wonderful advice:

And I don’t know any experienced, role model developer who is not a decent teacher and mentor. The earlier you start giving back and teaching, the more natural this will come.

This is definitely true in my experience, too! All the data scientists who I strive to be like are great at teaching and mentoring, and it’s something that I’m trying to work on too.