Resume

I’m currently a senior data scientist at Nielsen. I did my master’s in computer science and undergrad in computer science and the Integrated Science Program, both at Northwestern (go ’Cats!). My email address is me@(this domain).

This is a web version of my resume; if you’d like, you can view a PDF here.

Employment

Senior Data Scientist - Nielsen (July 2020 - present)

  • Researched and shipped novel Bayesian inference methodology for measuring audiences for 100s of ad campaigns on 3 platforms, using PySpark, NumPyro, and Airflow (US patent pending).
  • Designed and developed evaluation framework for benchmarking cutting-edge Bayesian PyTorch TV viewing model against 100s of GBs of historical data.
  • Productionalized ad-hoc ETL runs and ML model training to be more reproducible, increasing iteration speed.
  • Created matplotlib visualizations to evaluate and build trust in machine learning models.

Data Scientist - Nielsen (July 2018 - July 2020)

  • Rebuilt flagship TV ratings model with machine learning in PySpark to train 10x faster and admit 4x fewer false positives (US patent pending).
  • Developed alpha version of internal Python framework for unifying workflows for 1000+ data scientists.
  • Automated documentation build of internal framework using Sphinx and AWS.
  • Created PySpark libraries to unify team’s workflow and enable consistent comparison of different model candidates.
  • Presented tech talks on Spark, Bayesian modeling, MLflow, and Python to mixed audiences of 200+ data scientists, software engineers, and business leaders.

Software Engineering Intern - Qualtrics (June 2017 - August 2017)

  • Added pagination, custom data types, & UI enhancements to “action planning” module on Employee Experience platform (AngularJS, Java).
  • Redesigned handling of page filters for action planning dashboards by refactoring shared and product-specific code.
  • Increased test coverage for product by 10% and wrote test files from scratch for untested services.

Teaching Assistant - Northwestern University (March 2016 - June 2018)

  • Mentored students in intro programming, intermediate Python, discrete mathematics, and data structures courses.
  • Assisted with curriculum and exam design; led small-group tutorial sections; taught students individually.

Lead Helpdesk Analyst - Northwestern University (September 2014 - June 2018)

  • Developed Chrome extension to automate often-forgotten parts of help desk tickets, deploying to 60 student staff members and reducing incomplete tickets by over 90%.
  • Mentored, managed, trained, and completed performance reviews for 5 student consultants semiannually.
  • Wrote Python scripts to assist with scheduling, accounting for staffing needs, class schedules, and individual preferences.

Skills

Python: fluent in core language features, scientific computing libraries (NumPy, pandas, scikit-learn, PyTorch), visualization (Matplotlib, Seaborn, Altair), Bayesian inference (PyMC3, Pyro, NumPyro)

Technologies: Git, Unix, unit testing and TDD, machine learning, statistical modeling, Bayesian inference, data viz in Python and d3.js, AWS, Docker (basic)

Other languages: SQL (advanced), Spark (advanced), JavaScript (intermediate), R (basic)

Education

M.S. Computer Science - Northwestern University, 2018 (GPA 4.0, focusing on machine learning and data science)

B.S. Computer Science - Northwestern University, 2018 (GPA 3.96, summa cum laude)

  • Student in the Integrated Science Program, a selective, research-oriented program in science and mathematics (isp.northwestern.edu).
  • Member of Tau Beta Pi Engineering Honor Society.

Projects

PMD Rescues (Python, JavaScript)

  • Built a responsive and mobile-friendly web tool for helping players of the video game Pokemon Mystery Dungeon: Rescue Team DX.
  • Used Pyodide to run custom Python code in-browser alongside JavaScript.

Tech for Campaigns (volunteer data scientist & engineer)

  • Volunteer data scientist on a team building a model to predict state & local elections.
  • Improved data ingestion pipeline for election prediction model, resulting in 200+ unit tests passing, fewer build failures, and 10% faster CI runs.
  • Built tools to download & process messy, disparate electoral data from 4 state election boards.

Red Tweet, Blue Tweet

  • Investigated political polarization over time on Twitter by replicating methods of Barberá et al. using Python and R.
  • Collected 53 million Tweets over 3 weeks with Twitter Streaming API, storing in MongoDB database.
  • Used correspondence analysis to estimate political ideology of 3 million users and analyze online polarization.

Sleep Analysis

  • Leveraged Fitbit API to obtain two years of minute-by-minute sleep data.
  • Analyzed and visualized data in Python to draw conclusions and gain insights about personal sleep patterns.