I have created multiple Jupyter Notebooks, and keeping track of the order in which they run makes for an efficient workflow. Each day has two parts because the previous day’s results are not available until later in the day.
The first part looks ahead to future days. The first step is to go to FantasyPros and collect the starters for the next three days who are owned in less than 20% of leagues.
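As a rough illustration, the ownership filter could be sketched in Python as follows. The field names and sample rows are made up for the example; only the "less than 20%" rule comes from the description above.

```python
# Hypothetical rows, as they might look after copying a probable-starters
# table into a notebook. The field names and values are assumptions.
probable_starters = [
    {"name": "Pitcher A", "day": 1, "owned_pct": 12.0},
    {"name": "Pitcher B", "day": 1, "owned_pct": 85.0},
    {"name": "Pitcher C", "day": 2, "owned_pct": 19.9},
    {"name": "Pitcher D", "day": 3, "owned_pct": 20.0},
]

OWNERSHIP_CUTOFF = 20.0  # "owned in less than 20% of leagues"


def streaming_candidates(rows, cutoff=OWNERSHIP_CUTOFF):
    """Keep only starters whose ownership is strictly below the cutoff."""
    return [row for row in rows if row["owned_pct"] < cutoff]


streamers = streaming_candidates(probable_starters)
print([row["name"] for row in streamers])
```

Note that a pitcher at exactly 20% is excluded, since the rule is "less than" rather than "at most".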
The second step uses only the streaming pitchers from two days ahead and collects data in five separate areas.
First is information around ERA. Each previous start has an ERA calculation and, in aggregate, VPR and Volatility are calculated. A scorecard color is assigned based on a minimum number of innings pitched per start as well as the ratio of excellent starts to poor starts.
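A minimal sketch of the per-start ERA calculation and a scorecard color might look like the following. ERA for a start is nine times earned runs divided by innings pitched. The innings minimum, the "excellent" and "poor" cutoffs, and the color rules are placeholders chosen for the example, not the actual values used; VPR and Volatility are left out.

```python
def innings_to_float(ip):
    """Convert box-score innings (5.1 = 5 1/3, 5.2 = 5 2/3) to a real number."""
    whole, _, outs = str(ip).partition(".")
    return int(whole) + (int(outs) if outs else 0) / 3


def start_era(earned_runs, ip):
    """ERA for a single start: earned runs per nine innings."""
    innings = innings_to_float(ip)
    return float("inf") if innings == 0 else 9 * earned_runs / innings


# Placeholder thresholds (assumptions, not the author's cutoffs).
MIN_INNINGS_PER_START = 5.0
EXCELLENT_ERA = 2.00
POOR_ERA = 6.00


def era_scorecard(starts):
    """starts is a list of (innings_pitched, earned_runs) tuples."""
    eras = [start_era(er, ip) for ip, er in starts]
    avg_ip = sum(innings_to_float(ip) for ip, _ in starts) / len(starts)
    excellent = sum(1 for era in eras if era <= EXCELLENT_ERA)
    poor = sum(1 for era in eras if era >= POOR_ERA)
    if avg_ip < MIN_INNINGS_PER_START:
        return "red"  # too few innings per start
    if excellent > poor:
        return "green"
    return "yellow" if excellent == poor else "red"


# Made-up starts: (innings pitched, earned runs)
recent_starts = [("6.0", 1), ("5.1", 4), ("7.0", 0), ("4.2", 5)]
print(era_scorecard(recent_starts))
```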
Second is information around WHIP. This is similar to ERA. Each previous start has a WHIP calculation and, in aggregate, VPR and Volatility are calculated. A scorecard color is assigned based on the ratio of excellent starts to poor starts. A minimum number of innings is not used since it has already been applied in the ERA category.
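The WHIP version can be sketched the same way. WHIP is walks plus hits divided by innings pitched; here innings are stored as outs recorded to avoid the 5.1/5.2 notation. The cutoffs for "excellent" and "poor" and the color rules are assumptions for illustration only.

```python
def start_whip(hits, walks, outs):
    """WHIP for a single start: (walks + hits) per inning pitched."""
    innings = outs / 3
    return float("inf") if innings == 0 else (hits + walks) / innings


# Placeholder thresholds (assumptions, not the author's cutoffs).
EXCELLENT_WHIP = 1.00
POOR_WHIP = 1.50


def whip_scorecard(starts):
    """Color based only on excellent versus poor starts; no innings minimum."""
    whips = [start_whip(s["hits"], s["walks"], s["outs"]) for s in starts]
    excellent = sum(1 for w in whips if w <= EXCELLENT_WHIP)
    poor = sum(1 for w in whips if w >= POOR_WHIP)
    if excellent > poor:
        return "green"
    return "yellow" if excellent == poor else "red"


# Made-up starts for one pitcher.
recent_starts = [
    {"outs": 18, "hits": 4, "walks": 1},
    {"outs": 16, "hits": 7, "walks": 2},
    {"outs": 21, "hits": 3, "walks": 2},
]
print(whip_scorecard(recent_starts))
```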
Third is the calculation of luck. There are three measures that determine a pitcher’s luck and all three are available from Fangraphs. I don’t have scorecard colors for this measure because there is doubt about how durable luck is.
Fourth is information regarding the opponent’s ERA. This is complex to calculate since I need to calculate what the team did against each opposing starting pitcher. To do that, I collect the result of every starting pitcher along with the team they faced, then reverse the collection so it is organized by the team that batted. A scorecard color is assigned based on the relative standing versus the other teams.
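The "reverse the collection" idea could be sketched roughly like this: each starter's line is re-keyed by the team that batted against him, and the totals give that team's ERA against starting pitchers. The record layout and the sample numbers are invented for the example.

```python
from collections import defaultdict

# Made-up starter lines: one row per start, including the team faced.
starter_results = [
    {"pitcher": "Pitcher A", "team": "Team X", "opponent": "Team Y", "outs": 18, "earned_runs": 2},
    {"pitcher": "Pitcher B", "team": "Team Y", "opponent": "Team X", "outs": 15, "earned_runs": 4},
    {"pitcher": "Pitcher C", "team": "Team Z", "opponent": "Team Y", "outs": 21, "earned_runs": 1},
]


def opponent_era_vs_starters(results):
    """Aggregate starter lines by the batting team and return ERA per team."""
    totals = defaultdict(lambda: {"outs": 0, "earned_runs": 0})
    for row in results:
        batting_team = row["opponent"]  # the reversal: key by who was faced
        totals[batting_team]["outs"] += row["outs"]
        totals[batting_team]["earned_runs"] += row["earned_runs"]
    # ERA = 9 * ER / IP, and IP = outs / 3, so ERA = 27 * ER / outs.
    return {
        team: 27 * t["earned_runs"] / t["outs"]
        for team, t in totals.items()
        if t["outs"] > 0
    }


team_era = opponent_era_vs_starters(starter_results)
for team, era in sorted(team_era.items()):
    print(team, round(era, 2))
```

A higher number here means the team has scored more against opposing starters, which is the view a streaming decision cares about.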
Fifth is information regarding the opponent’s WHIP. Using the data collected already, this calculation follows the same pattern. A scorecard color is assigned based on the relative standing versus the other teams.
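One way to turn relative standing into a color, sketched under the assumption that teams are ranked and split into thirds: the lowest third (the offenses that reach base least against starters) gets green, the middle third yellow, and the top third red. The split into thirds, the color direction, and the sample values are all assumptions.

```python
def standing_colors(metric_by_team):
    """Rank teams from lowest to highest metric and color them by thirds."""
    ranked = sorted(metric_by_team, key=metric_by_team.get)
    n = len(ranked)
    colors = {}
    for position, team in enumerate(ranked):
        if position < n / 3:
            colors[team] = "green"   # weakest offenses: favorable matchup
        elif position < 2 * n / 3:
            colors[team] = "yellow"
        else:
            colors[team] = "red"     # strongest offenses: unfavorable matchup
    return colors


# Made-up opponent WHIP values (what each team produced against starters).
opponent_whip = {
    "Team A": 1.10,
    "Team B": 1.45,
    "Team C": 1.25,
    "Team D": 1.32,
    "Team E": 1.18,
    "Team F": 1.51,
}
colors = standing_colors(opponent_whip)
print(colors)
```

The same function works for opponent ERA, since it only depends on the ordering of the values.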
Later in the day, Baseball-Reference is updated with the previous day’s results. There is a six-step process I follow.
Step One collects the results from the previous day’s streamers.
Step Two runs a comparison between the selected streamer and all streamers. Eventually I would like to construct a grade based on the daily rank.
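A daily rank could be computed along these lines. The single "score" per streamer is a stand-in; which result metric the comparison actually uses is not specified here, so the scores and the higher-is-better convention are assumptions.

```python
def daily_rank(selected, scores):
    """1-based rank of the selected streamer; higher score is better, ties share a rank."""
    target = scores[selected]
    return 1 + sum(1 for score in scores.values() if score > target)


# Made-up result scores for one day's streamers.
day_scores = {
    "Pitcher A": 62,
    "Pitcher B": 48,
    "Pitcher C": 71,
    "Pitcher D": 48,
}

rank = daily_rank("Pitcher B", day_scores)
print(f"Selected streamer ranked {rank} of {len(day_scores)}")
```

A grade could later be derived from the rank relative to the number of streamers that day.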
Step Three looks at the next three days of streamers and flags those who are missing their Baseball-Reference code or their Fangraphs code.
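The missing-code check is simple to sketch. The key names `bbref_code` and `fangraphs_code` and the sample values are hypothetical; the idea is only to list which lookup codes each upcoming streamer still needs.

```python
# Hypothetical key names for the two site-specific player codes.
REQUIRED_CODES = ("bbref_code", "fangraphs_code")


def missing_codes(streamers):
    """Return {pitcher name: [missing code fields]} for incomplete rows."""
    flagged = {}
    for row in streamers:
        missing = [key for key in REQUIRED_CODES if not row.get(key)]
        if missing:
            flagged[row["name"]] = missing
    return flagged


# Made-up rows; the code values are placeholders, not real identifiers.
upcoming = [
    {"name": "Pitcher A", "bbref_code": "code-a", "fangraphs_code": "101"},
    {"name": "Pitcher B", "bbref_code": "code-b", "fangraphs_code": None},
    {"name": "Pitcher C", "bbref_code": ""},
]
flagged = missing_codes(upcoming)
print(flagged)
```

Treating empty strings, `None`, and absent keys the same way keeps the flag robust to however the blanks were recorded.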
Step Four collects the team schedule from Fangraphs. I use this page because it includes the starting pitchers.
Step Five captures the results from all starting pitchers.
Step Six captures the results from all teams. This is a new step that I recently added because I want to analyze a team’s offense through the entire game. It seems that wOBA is the preferred measure and I am developing a view over a few time periods.
You can see there are quite a few steps and some are dependent on their predecessors. Having a published order makes everything work together smoothly, and I can complete all of this in under fifteen minutes.
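For anyone who wants to check a published order against the dependencies, here is a small sketch using Python's standard `graphlib` module (Python 3.9 or later). The step names and the specific dependency edges are illustrative guesses, not a statement of which notebooks actually depend on which.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it is assumed to need first.
# These edges are illustrative only.
dependencies = {
    "step_1_streamer_results": set(),
    "step_2_compare_streamers": {"step_1_streamer_results"},
    "step_3_flag_missing_codes": set(),
    "step_4_team_schedule": set(),
    "step_5_all_starter_results": {"step_4_team_schedule"},
    "step_6_all_team_results": {"step_5_all_starter_results"},
}


def respects_dependencies(order, deps):
    """True if every step appears after all of the steps it depends on."""
    position = {step: i for i, step in enumerate(order)}
    return all(
        position[before] < position[step]
        for step, needs in deps.items()
        for before in needs
    )


# One valid order derived from the graph, and the published order to verify.
run_order = list(TopologicalSorter(dependencies).static_order())
published_order = sorted(dependencies)  # names sort in step-number order

print(run_order)
print(respects_dependencies(published_order, dependencies))
```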