2018 Season Lucky Random Predictions
by Bernie “Candide” Gilbert
Spring training always brings predictions for the coming season. Rather than try to predict next season’s final standings and playoff results, I decided to try to see if there’s anything a couple of last season’s random results can tell us about how teams will perform next year.
I focused on two elements: How teams fared compared to their Pythagorean won-lost projections, and what correlation – if any – there was between BABIP and wins.
The Pythagorean projection says that a team’s winning percentage will be a function of the runs it scored versus the runs it allowed. It follows the formula winning percentage = S^2 / (S^2 + A^2), where S is the number of runs scored and A is the number allowed. It generally works very well as a predictor; when it doesn’t work, it’s an indication that a team won or lost an unusual number of games either by blowouts or by narrow margins. For example, if a team should have won 92 games in a season but only won 87, it means they “wasted” a lot of runs winning games by ten or fifteen runs, or they lost a lot of close games they could have won. A classic (albeit small sample size) example of a team “wasting” runs in blowouts is the 1960 World Series, where the Yankees scored more than twice as many runs as the Pirates (55 to 27) while losing the series in seven games.
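As a rough illustration of the formula above, here is a minimal Python sketch. The function names and the run totals (800 scored, 700 allowed, 87 actual wins) are made up for the example and are not taken from the 2017 spreadsheet; the exponent of 2 is the one given in the text.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2):
    """Expected winning percentage: S^2 / (S^2 + A^2)."""
    s = runs_scored ** exponent
    a = runs_allowed ** exponent
    return s / (s + a)


def projected_wins(runs_scored, runs_allowed, games=162):
    """Expected wins over a season, rounded to a whole number of games."""
    return round(pythagorean_win_pct(runs_scored, runs_allowed) * games)


# Hypothetical team: 800 runs scored, 700 allowed, 87 actual wins.
expected = projected_wins(800, 700)
actual = 87
difference = actual - expected  # negative means the team underperformed

print(expected, difference)
```

With these invented numbers the team projects to 92 wins, so an 87-win season would leave it 5 games under its projection, the same situation as the 92-versus-87 example in the text.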
BABIP is well known around here – batting average on balls in play, calculated as hits minus home runs, divided by at-bats minus home runs minus strikeouts plus sacrifice flies, or (H – HR)/(AB – HR – K + SF). It’s generally considered to be largely a function of luck: most teams don’t deviate much from the league average, and when the deviation is large, it’s believed to indicate the team was either unusually lucky or unusually unlucky when putting a ball in play. Or maybe that they’re really good or really bad at putting it in play.
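The BABIP formula can be sketched the same way. The function name and the team totals below are invented for illustration; only the formula itself comes from the text.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - HR - K + SF)."""
    balls_in_play = at_bats - home_runs - strikeouts + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical season totals for one team (not real 2017 data).
team_babip = babip(hits=1400, home_runs=200, at_bats=5500,
                   strikeouts=1300, sac_flies=40)

print(f"{team_babip:.3f}")
```

Home runs are removed from both the numerator and the denominator because they never give a fielder a chance to make a play, and strikeouts come out of the denominator because the ball is never put in play at all.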
Both of these elements are subject to what’s called the “Plexiglas effect” – the tendency of luck to even out, to revert to the mean. I pulled the relevant 2017 stats for all thirty teams to see if anything unusual popped out that would suggest teams would do significantly better or worse in 2018 than last year. Obviously, this does not take into account trades and injuries: the Marlins’ 2017 numbers are based on their pre-fire sale roster, and the numbers don’t reflect the Yankees’ acquisition of Giancarlo Stanton or Orioles closer Zach Britton rupturing his Achilles tendon. But they were fun to look at.
Looking at the Pythagorean wins first: There were a few big outliers. The Padres, Royals, and Blue Jays won 14, 9, and 5 MORE games, respectively, than their runs scored and allowed would have predicted. And the Yankees (-11 wins), Indians (-8), Phillies (-5) and Diamondbacks (-5) all won significantly FEWER games than would be expected. The conventional wisdom is that, all other things being equal, they’ll revert to the mean next year: the teams that overperformed will win fewer games, while those that underperformed will win more.
The BABIP numbers appear to be a little less random than their reputation suggests. The MLB average BABIP in 2017 was .300. I ranked all thirty teams by BABIP and sorted them by their actual (not projected) wins. I found that of the ten teams with the highest number of wins, five of them were in the top ten in BABIP. Three were in the second tier (i.e., 11-20) and only two were in the bottom tier (21-30).
Similarly, of the ten teams with the lowest number of wins, six of them were in the bottom ten of team BABIP. All this would suggest that there is at least some correlation between BABIP and win totals, although thirty teams over one season is a small sample.
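The tier comparison above amounts to counting how many teams appear in both the top group by wins and the top group by BABIP. Here is a small sketch of that count; the six teams and their numbers are placeholders, not the 2017 data, and the helper names are my own. The last function gives the overlap you would expect if the two rankings were completely unrelated, which is a useful yardstick.

```python
def top_n(values, n):
    """Names of the n teams with the largest values."""
    ranked = sorted(values, key=values.get, reverse=True)
    return set(ranked[:n])


def tier_overlap(wins, babip, n):
    """How many teams are in the top n by wins AND the top n by BABIP."""
    return len(top_n(wins, n) & top_n(babip, n))


def chance_overlap(total_teams, n):
    """Average overlap if the two rankings were unrelated: n * n / total."""
    return n * n / total_teams


# Placeholder data for six made-up teams.
wins = {"A": 100, "B": 95, "C": 90, "D": 80, "E": 70, "F": 60}
babip = {"A": 0.310, "B": 0.295, "C": 0.305, "D": 0.300, "E": 0.290, "F": 0.308}

overlap = tier_overlap(wins, babip, 3)
baseline = chance_overlap(30, 10)

print(overlap, round(baseline, 2))
```

For thirty teams split into tiers of ten, pure chance would put about 3.3 of the top ten winners in the top ten by BABIP, so the counts of five and six reported above are higher than random, though not overwhelmingly so.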
Looking at the “outlier” teams individually, starting with the teams that outperformed their Pythagorean projection: The Padres, Royals, and Blue Jays were toward the bottom of the BABIP rankings (numbers 26, 20, and 30), yet still somehow outperformed their Pythagorean predicted wins by significant margins. I don’t follow those teams closely enough to explain how they did it, but the numbers suggest that they won a lot of close games and/or a lot of their losses were blowouts.
Of the other “outliers,” who underperformed their projections, only the D-Backs (BABIP of .306) were in the top tier of team BABIP, with the Yanks and Phils just outside the top tier, tied for 12th place, with BABIP of .304.
The conventional wisdom is that BABIP is a function of luck; does this suggest that some teams are simply better than others at getting on base when they put the ball in play, and that some teams are better at defending against balls in play?
So, my predictions, based on all this:
- Padres – 14 wins over their Pythagorean projection is a LOT, and they did it despite a BABIP 13 points below the MLB average. Hard to tell whether Pythagoras or BABIP is dispositive.
- Royals – 9 wins over projection, BABIP 5 points under MLB average. Same conclusion as for the Padres.
- Blue Jays – only 5 wins over projection despite the worst BABIP in MLB, at .276. The Jays will improve significantly next season as they revert to the mean, assuming BABIP really is a matter of luck.
- Yankees – their 91 wins was 11 games UNDER their projected wins, despite BABIP 4 points over the MLB average. I would predict the Yanks to do significantly better in 2018, even if Giancarlo accidentally cuts a leg off with a chain saw.
- Indians – 8 wins under projection, with the 19th highest BABIP (or 12th lowest, if you will). Strong probability they’ll do better than their 2017 102-win season, again assuming that BABIP is truly random.
Some notes: I used two methods to compute the Pythagorean number, and they both came out exactly the same, despite what it says here: https://www.sports-reference.com/blog/baseball-reference-faqs/
I don’t know how https://www.baseball-reference.com/teams/NYY/2017.shtml came up with the Yankees only being 9 wins under the projection.
The Nats had the 3rd highest BABIP (.311) and were spot-on with their projected 97 wins.
Twenty teams were within 2 wins (plus or minus) of their Pythagorean projections.
The Marlins had the second-best BABIP in all baseball and were also spot-on in projected wins at 77. Who wants the over/under for their win total in 2018?
Here’s the full spreadsheet, sorted by plus-or-minus projected wins.
And here it is sorted by actual wins: