Projects

This page collects a wide variety of projects I have completed over the years, including both school assignments and personal work. Please have a look below, and visit my GitHub to see more of my coding projects.

Staying Alive: A Context-Dependent Valuation of Foul Balls – Link

My team’s final presentation from the 2024 SABR Diamond Dollars Case Competition. We won our section of the competition and finished as runners-up overall. We built a series of models to estimate the relative value of a foul ball in Major League Baseball compared to the expected outcome of a pitch.

Hyannis Harbor Hawks Data Projects – Link to GitHub

While working in the Cape Cod League during the summer of 2023, I completed several exciting data projects to help the team. These projects include:

-Wins Above Replacement Model for the Cape League – Example Output

-Run Value, Run Expectancy, and Win Probability Models for the Cape League

-An Umpire Scorecard generated from Trackman Files – Example

-Postgame Trackman Reports for Catchers and Umpires

-Catcher Framing Runs Model – Example Output

Please click the link above to visit my GitHub and see the code behind these projects and more.

Also, please visit @CCBLAnalytics on Twitter, where my colleagues and I posted the results of these and other models throughout the summer: Link to Twitter

Hyannis Harbor Hawks Advance Scouting

While with the Harbor Hawks, I was also closely involved in game preparation and advance scouting of our opponents. We used TruMedia and Synergy to create reports on the tendencies of opposing pitchers and hitters; an example is linked below. I also created split charts to help our coaches make in-game substitution decisions.

Scouting Report Example: Link

Split Chart Example: Link

The Perfect Umpire: A Weighted Simulation of Alternate Realities – Link

My team’s final presentation from the 2023 SABR Diamond Dollars Case Competition. We won our section of the competition and finished as runners-up overall. We built a Markov chain model trained on the 2022 MLB season to re-imagine how selected MLB games would have turned out if umpires had not missed any ball-strike calls.

PigPen+: Using Linear Regression to Grade College Baseball Pitches – Link

For my STAT 410: Linear Regression class, I created a linear pitch grading model called PigPen+. It is a simple version of a Pitching+ model applied to NCAA Division I baseball. My presentation on the project is linked above.

Analysis of Plate Appearances Following Walks in MLB – Link

I completed this personal research project using R to analyze play-by-play and pitch sequence data from retrosheet.org. This project examines what occurs in plate appearances following walks in MLB, and what strategies hitters can adopt to best take advantage of pitchers’ tendencies in these situations.

Grading MLB Starting Pitcher Outings with Linear Models – Link

In this personal research project, I built several linear models to grade MLB starting pitchers’ line scores, in a manner similar to Bill James’s Game Score.
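For reference, Bill James's original Game Score can be computed from a line score as in the minimal sketch below. It shows only the published benchmark formula, not the linear models from this project; the function and argument names are illustrative assumptions, and the sample line score is made up.

```python
def game_score(outs: int, strikeouts: int, hits: int,
               earned_runs: int, unearned_runs: int, walks: int) -> int:
    """Bill James's original Game Score for a starting pitcher's line."""
    completed_innings = outs // 3
    # +2 for each inning completed after the fourth
    late_inning_bonus = 2 * max(0, completed_innings - 4)
    return (
        50                      # every start begins at 50
        + outs                  # +1 per out recorded
        + late_inning_bonus
        + strikeouts            # +1 per strikeout
        - 2 * hits              # -2 per hit allowed
        - 4 * earned_runs       # -4 per earned run
        - 2 * unearned_runs     # -2 per unearned run
        - walks                 # -1 per walk
    )


# Hypothetical line: 7 IP, 5 H, 2 ER, 0 unearned runs, 1 BB, 8 K
print(game_score(outs=21, strikeouts=8, hits=5,
                 earned_runs=2, unearned_runs=0, walks=1))  # 66
```

A linear model in this spirit would estimate the weights from data instead of fixing them by hand.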

Analyzing US Trip Data in the Context of COVID-19 – Link

In my R For Data Science course, my group used R, SQL, and Shiny to examine how the COVID-19 pandemic affected Americans’ travel behavior at the national, state, and county levels.

NCAA Football Elo Rating System – Link

I developed this Elo-based system for rating FBS college football teams before the 2018 season. Each year, I tweak the rating system and calculate weekly updates for all 130 FBS teams. 
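As a generic illustration of how an Elo-based rating moves after a game, here is a minimal sketch of the standard Elo update. It is not the code behind this system: the K-factor of 20, the 400-point scale, and the function names are assumptions chosen for the example, and it leaves out adjustments such as home-field advantage or margin of victory.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that team A beats team B under the standard Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def update_ratings(rating_a: float, rating_b: float, a_won: bool,
                   k: float = 20.0) -> tuple[float, float]:
    """Return both teams' new ratings after one game (no ties)."""
    expected_a = expected_score(rating_a, rating_b)
    actual_a = 1.0 if a_won else 0.0
    # The winner gains exactly what the loser gives up.
    delta = k * (actual_a - expected_a)
    return rating_a + delta, rating_b - delta


# Two evenly rated teams: the winner gains 10 points, the loser drops 10.
print(update_ratings(1500.0, 1500.0, a_won=True))  # (1510.0, 1490.0)
```

An upset moves the ratings more than an expected result, because the gap between the actual and expected score is larger.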

March Madness Prediction System – Link

I have worked on an analytical system for predicting March Madness results, based on team efficiencies. It has been very successful in the past and I am looking forward to using it for the 2022 tournament.

Fantasy Baseball Draft Choice Algorithm – Link

I created an algorithm to determine which player to draft in each round for my 2021 fantasy baseball season. My team finished comfortably in first place with a 17-3 record. I hope to update the system and use it again in 2022.

Introduction to Data Science – NFL Data Crunchers Notebook – Link

This was my group’s final project for my Introduction to Data Science class. We used Python’s pandas package to explore a large set of NFL play-by-play data using techniques we had learned in class.

Current Topics in Sports – Robot Umpires Presentation – Link

This was my group’s project for my Introduction to Sport Management class. We researched and presented on whether we believed Major League Baseball should implement an automatic umpiring system.

Syracuse Sports Analytics Camp – Lucas Giolito Improvement Project – Link

I attended Syracuse University’s virtual sports analytics camp in 2020. My final project examined Lucas Giolito’s improvement from 2018 to 2019 using a variety of data analysis techniques we learned at the camp.

Syracuse Sports Analytics Camp – Marcus Stroman Free Agent Project – Link

At the Syracuse camp, I also completed a project analyzing Marcus Stroman’s free agent market for the 2020-21 offseason by comparing his statistics to those of similar players and examining the value of their free agent contracts.

Mathematical Modeling Boot Camp – Disease Spread Model – Link

I took a summer enrichment course on Mathematical Modeling at Northwestern University’s Center for Talent Development during the summer of 2020. My group’s final project built a model predicting how a disease will spread among a population.

AP Language and Composition – Junior Theme on Technology Addiction – Link

Attached is my major research paper from high school: my junior theme on technology addiction. This was nearly a year-long project and is without a doubt the most thorough paper I have written.

Wharton Moneyball Academy – Nastiest Pitcher in MLB Project – Link

In 2019, I attended the Moneyball Academy at Wharton in Philadelphia. My group’s final project analyzed which MLB pitcher was the “nastiest” in the league.