2023 NHL Playoff Predictions
Who will win this year’s cup?
On Reddit, there are dedicated pages (subreddits) for each baseball team, where users can discuss the team, chat during games, and post content. Once a year, the wider baseball community (found at reddit.com/r/baseball) plays a game that some of my friends set up, called the Trade Deadline Game. It’s scheduled around the real All-Star Game and is meant to mirror the frenzy leading up to the trade deadline.
Each subreddit has a manager, tasked with trading every user from their subreddit who chooses to participate to another team’s subreddit. Once a trade is done, the user is supposed to participate with their new team for a week. It can be a lot of fun, and yes, it is incredibly nerdy. Posts and trades are updated throughout the day, some people host a nightly talk show about the game, and people get really into it. My contribution over the last couple of years has been analyzing the data that comes out of the game.
For a sense of scale, 1,549 people participated in the 2018 game, so keeping up with it is no minor task. Wrangling the data for an analysis is challenging, but it provides some really interesting insight into the community, and it’s one of my favorite projects I’ve done.
I wrote up a data-driven story looking at the 2018 game, which can be found here. I’m quite proud of this project, the work that went into it, and the story the data told, so I hope you check it out.
I also took a first, less sophisticated stab at this in 2017, which can be found here.
Just how lucky have the 18-3 Bruins gotten?
Interoperability is the name of the game
I got a job!
Revisiting some old work, and handling some heteroscedasticity
Using a Bayesian GLM in order to see if a lack of fans translates to a lack of home-field advantage
An analytical solution plus some plots in R (yes, you read that right, R)
okay… I made a small mistake
Creating a practical application for the hit classifier (along with some reflections on the model development)
Diving into resampling to sort out a very imbalanced class problem
Or, ‘how I learned the word pneumonoultramicroscopicsilicovolcanoconiosis’
Amping up the hit outcome model with feature engineering and hyperparameter optimization
Can we classify the outcome of a baseball hit based on the hit kinematics?
A summary of my experience applying to work in MLB Front Offices over the 2019-2020 offseason
Busting out the trusty random number generator
Perhaps we’re being a bit hyperbolic
Revisiting more fake-baseball for 538
A deep-dive into Lance Lynn’s recent dominance
Fresh-off-the-press Higgs results!
How do theoretical players stack up against Joe DiMaggio?
I went to Pittsburgh to talk Higgs
If baseball isn’t random enough, let’s make it into a dice game
Or: how to summarize a PhD’s worth of work in 8 minutes
Double the Higgs, double the fun!
A data-driven summary of the 2018 Reddit /r/Baseball Trade Deadline Game
A 2017 player analysis of Tommy Pham