Here are a few things I do to fill out my bracket using analytics.
1. Let’s start with what not to do. I usually don’t put a whole lot of weight on a team’s record because strength of schedule matters. Likewise, I don’t put a whole lot of weight on bad ranking tools like RPI that do not do a good job of taking strength of schedule into account.
2. Instead of records, use sophisticated ranking tools. The seeding committee uses some of these ranking tools to select the seeds, so the seeds themselves reflect strength of schedule and implicitly rank teams. Here are a few ranking tools that use math modeling.
- LRMC – the standard LRMC and LRMC with no margin of victory [methodology]
- Ken Pomeroy’s rankings [methodology]
- Sagarin rankings [methodology is proprietary]
- Massey Ratings [some methodology]
- ESPN’s BPI rankings [methodology]
I like the LRMC (logistic regression Markov chain) method from some of my colleagues at Georgia Tech. Again: RPI bad, LRMC good.
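To give a feel for why a Markov chain can account for strength of schedule, here is a toy sketch. It is not LRMC itself: as the name suggests, LRMC fits its transition probabilities with logistic regression, while this sketch uses a fixed, made-up value `p`. The teams, game results, and the function name `markov_ratings` are all hypothetical. The idea is that a random "voter" sitting on a team picks one of that team's games at random and then sides with the winner of that game with probability `p`. Teams where the voter spends the most time rank highest.

```python
def markov_ratings(games, p=0.8, iterations=1000):
    """Toy Markov chain rating. games is a list of (winner, loser) pairs.

    p is an assumed, fixed probability that the voter sides with the
    winner of a game (a stand-in for a fitted model, not LRMC's own).
    """
    teams = sorted({team for game in games for team in game})
    idx = {team: k for k, team in enumerate(teams)}
    n = len(teams)

    # T[i][j] = probability the voter moves from team i to team j.
    T = [[0.0] * n for _ in range(n)]
    played = [0] * n
    for winner, loser in games:
        w, l = idx[winner], idx[loser]
        played[w] += 1
        played[l] += 1
        T[w][w] += p        # voter on the winner stays put
        T[w][l] += 1 - p    # ... or drifts to the loser
        T[l][w] += p        # voter on the loser moves to the winner
        T[l][l] += 1 - p    # ... or stays put
    for i in range(n):
        # Each of team i's games is picked with equal probability.
        T[i] = [x / played[i] for x in T[i]]

    # Power iteration: repeatedly apply T to approach the long-run distribution.
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
    return dict(zip(teams, pi))


# Hypothetical results: A beat B, B beat C, C beat D.
games = [("A", "B"), ("B", "C"), ("C", "D")]
ratings = markov_ratings(games)
for team, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {rating:.3f}")
```

In this made-up schedule B and C both have 1-1 records, but B rates higher because its loss came against the strongest team and its win came against C. A bare win-loss record cannot make that distinction.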
3. Survival analysis quantifies how far each team is likely to make it in the tournament. This doesn’t give you insight into team-to-team matchups per se, but you can think of the probability that a team such as Wisconsin makes the Final Four as a kind of average across the different opponents that team might play along the way.
- FiveThirtyEight has a nice chart of how far each team will go.
- Ken Pomeroy has odds of each team making it to various rounds of the tournament.
- Luke Winn and John Ezekowitz use survival analysis to look at how far teams will go. Their data is from the 2013 tournament, so it is not useful for this year.
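Here is a minimal sketch of the "average across opponents" idea, not the method any of the sites above actually use. It assumes you already have some way to estimate the chance that one team beats another; the four teams, their strength numbers, and the simple formula in `win_prob` are all made up for illustration. The chance a team wins a given round is the chance it got there, times its chance of beating each possible opponent weighted by how likely that opponent is to show up.

```python
def advancement_probs(teams, win_prob):
    """Return {team: [P(wins round 1), P(wins round 2), ...]}.

    teams: list in bracket order (length must be a power of two), so
           teams[0] plays teams[1], teams[2] plays teams[3], and so on.
    win_prob(a, b): estimated probability that team a beats team b.
    """
    n = len(teams)
    alive = [1.0] * n              # P(team is still alive entering this round)
    result = {team: [] for team in teams}
    block = 1                      # size of each already-decided sub-bracket
    while block < n:
        new_alive = [0.0] * n
        for i in range(n):
            # Possible opponents this round sit in the neighboring block.
            start = ((i // block) ^ 1) * block
            beat_opponent = sum(
                alive[j] * win_prob(teams[i], teams[j])
                for j in range(start, start + block)
            )
            new_alive[i] = alive[i] * beat_opponent
        alive = new_alive
        for i, team in enumerate(teams):
            result[team].append(alive[i])
        block *= 2
    return result


# Hypothetical strengths; higher is better. These are not real ratings.
strength = {"A": 4.0, "B": 1.0, "C": 2.0, "D": 2.0}

def win_prob(a, b):
    # Assumed toy model: chance of winning is proportional to strength.
    return strength[a] / (strength[a] + strength[b])

probs = advancement_probs(["A", "B", "C", "D"], win_prob)
for team, rounds in probs.items():
    print(team, [round(p, 3) for p in rounds])
```

With these invented numbers, A wins its first game 80% of the time but wins the whole four-team bracket only about 53% of the time, which is why even a strong favorite faces a tough road.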
4. Look at the seeds. Only once did all four 1-seeds make the Final Four. It’s a tough road. Seeds matter a lot in the rounds of 64 and 32, not so much after that point. There will be upsets. Some seed matchups produce more upsets than others. The 7-10 and 5-12 matchups are usually good to keep an eye on.
- BracketOdds at the University of Illinois focuses on the seeds.
5. Don’t ignore preseason rankings. The preseason rankings are educated guesses about who the best teams are before any games have been played. It may seem silly to consider preseason rankings at the end of the season, after all games have been played (when we have much better information!), but the preseason rankings seem to reflect some of the intangibles that predict success in the tournament (a team’s raw talent or athleticism).
6. Math models are very useful, but they have their limits. Math models implicitly assume that the past is good for predicting the future. This is not usually a good assumption when a team has had any major changes, like injuries or suspensions. You can check out crowdsourcing data (who picked whom in a matchup), expert opinion, and things like injury reports to make the final decision.
For more reading:
- how to pick a winning bracket using analytics
- methodologies used to pick the winner of a basketball game
- roundup of March Madness tournament articles and bracket tips
- why is it so easy to forecast the Presidential election but so hard to forecast the NCAA tournament?
- will someone create a perfect bracket this year?