2023 NHL Playoff Predictions
Who will win this year’s cup?
Almost a year ago to the day, I published what was by far the most popular post on this blog: a summary of my experience applying to jobs in baseball. That application cycle yielded two interviews, but unfortunately, neither converted to a job offer.
After that experience, I dusted myself off and did some introspection about my future while wrapping up my Ph.D. and applying to jobs. I wasn’t quite ready to let my dream of working in baseball die, and fortunately I received an offer to be a postdoctoral appointee at Argonne National Lab. This gave me the flexibility to continue advancing my skills in statistics and programming, while also doing something interesting (and receiving a paycheck). Also, since I was already doing graduate research at Argonne, I wouldn’t have to relocate in the midst of a pandemic, which was a nice plus. However, from day one, I made sure that I was focused on the next job and that I did everything I could to make myself a better candidate for the next baseball application cycle, even discussing with my supervisor which projects would translate well across domains.
In my spare time, I worked to advance my analysis skills. One of the major takeaways from the last interview process was that I could use some work on classical statistics. To remedy that, starting in June, I woke up at least an hour before work every day and read through statistics textbooks and coursework, taking notes and working on practice problems. I went through several books in this manner; the one I spent the most time on was Statistical Rethinking by Richard McElreath. This book got me excited about statistics, specifically Bayesian statistics. It also got me thinking about topics like causal inference, which I’d never spent much time considering beforehand. Along with this, I spent time learning libraries for probabilistic programming, especially PyMC3 and Turing.jl.
In tandem, I spent a lot of time working on analyses and pieces for this blog, to better show off my abilities. A plot I made in 2018 led me to work on a model to predict MLB hit outcomes, a project I went extremely in-depth on, leading to a four-part series. I took some of the Bayesian statistics I’d worked on and applied them in a piece doing parameter estimation of home-field advantage in 2020, and in a piece looking at how many runs you might expect knowing only the number of hits. In addition to my personal blog, I also put one of the pieces on FanGraphs’ Community Research blog.
For much of this work, I used a library called pybaseball. This library scrapes several data sources, allowing them to be used within the Python programming language. I developed quite a few utility functions while interacting with the library, and decided to implement several of them within the library itself, becoming a contributor. All of this work was in an effort to make myself a better candidate for baseball jobs in upcoming cycles. I also took some time to learn some other things that I found interesting but that weren’t necessarily intended to improve my candidacy, notably learning the Julia programming language and looking into computer vision.
The contract for my postdoctoral appointment was two years, which bought me two hiring cycles to apply. Frankly, I was pretty uncertain about job prospects in the 2020-2021 off-season; with the pandemic and the shortened baseball season, I imagined teams largely didn’t have much money to put toward hiring. To my surprise though, there were actually quite a few listings. I applied to all the jobs I thought I might be reasonably well-suited for. With only two cycles, I couldn’t burn any opportunities, so I applied to listings from Cleveland, Los Angeles (Dodgers), Toronto, and Boston.
I won’t belabor the actual application and hiring processes this year, since they were very similar to what I outlined in the last post. However, the end result this year was quite different. I’m proud to announce that I’ve accepted and started a position with the Boston Red Sox, where I’ll be working as an analyst in baseball research and development. I’m incredibly happy to have received this job offer, and super excited to work for this team. It will be a fantastic place to learn the ropes and start building my career.
My reasons for updating the old post are twofold. First, who doesn’t like a story with a happy ending? But second, and more important, the last post ended with some introspection about how to move forward, and I wanted to give some insight on what I personally found valuable. I think the following things led to a different outcome:
Thanks for reading, and thanks to everyone who has kept up with this blog! Of course, I’ll be leaving up old posts, but I doubt that there will be much more baseball content added in the near future. I’m sure I’ll find other new fun projects to work on and post here, so be sure to continue to keep up with me.