This week’s reading was all over the place: a talk on Docker and Python for data science, authorial intent in code, the book Clean Code, and an essay on NPM.
Speaker: Tania Allard
This talk, from (the online) PyCon 2020, was a data science-focused introduction to Docker. The start of the talk was a pretty standard overview of what Docker is and why you might want to use it. Allard explained these clearly, which I appreciated, and her slides were amazing. I learned about tools like
repo2docker and cookie-cutter for Docker.
Allard concluded by discussing automation. Using CI to rebuild your Docker images on new pushes and every 2-ish weeks can be a good practice, since it helps you to keep dependencies up to date or (at the very least) install security updates. She showed an example workflow using (I think?) Github Actions.
I wish the talk had gone into a little more depth, but otherwise it was quite good!
Author: Mark Brooker
In this blog post, Mark Brooker makes the point that code only tells you what it is does. As a result, understanding intent is one of the most challenging parts of handling software:
Since most software doesn’t have a formal spec, most software “is what it does”, there’s an incredible pressure to respect authorial intent when editing someone else’s code. You don’t know which quirks are load-bearing.
Brooker suggests design documentation to help with this. Document not only the how and what behind the designs, but also how you decided. What assumptions did you have when building the system? What cases did you believe were out of scope? What did you think the software would, and would not, have to handle?
Oh, and comments. Those help too.
I’m not sure who the author is, but this is a post tearing apart Robert Martin’s Clean Code, a widely recommended book about writing good software. “The major problem I have with Clean Code is that a lot of the example code in the book is just dreadful.”
My criticism of the book, ever since I was first introduced to it, was that it was too Java-focused. The “everything in a class, clear public/private distinctions” didn’t sit well with my Python-trained self, but I chalked that up to different languages having different patterns. Here, though, the author calls out specific examples from Clean Code as being simply terrible code.
The post (unlike the example code!) is quite readable. I recommend it (and do not recommend the book).
Author: Sam Bleckley
The NPM ecosystem seems unwell. If you are concerned with security, reliability, or long-term maintenance, it is almost impossible to pick a suitable package to use—both because there are 1.3 million packages available, and even if you find one that is well documented and maintained, it might depend on hundreds of other packages, with dependency trees stretching ten or more levels deep—as one developer, it’s impossible to validate them all.
This is a much more well-argued take than The NodeJS Ecosystem is Chaotic and Insecure, which makes the rounds every few months, while ultimately coming to the same conclusions. Bleckley takes a data-driven approach by charting dependencies of packages in NPM, which ideally would be low-depth, have few indirect dependencies, and few cycles.
Why? NPM makes it easy to create a package; the
“I suggest that this is a social problem, more than a technical one, and propose a semi-social solution”, he writes.
I cannot offer a batteries-included answer, but I want you to walk away with these conclusions:
- It’s not just a feeling; it’s measurable.
- It’s not really a technical problem, but mostly a social one.
- A functional solution will not be to change how packages are published, but how packages are selected for use.
- That solution will need to take into account not just a package, but all its direct and indirect dependencies, too.
- And, for approval and adoption, it will need the social clout of major names in the field.