## tips for filling out a statistically sound bracket

Here are a few things I do to fill out my bracket using analytics.

1. Let’s start with what not to do. I usually don’t put a whole lot of weight on a team’s record because strength of schedule matters. Likewise, I don’t put a whole lot of weight on bad ranking tools like RPI that do not do a good job of taking strength of schedule into account.

2. Instead of records, use sophisticated ranking tools. The seeding committee uses some of these ranking tools to select the seeds, so the seeds themselves reflect strength of schedule and implicitly rank teams. Here are a few ranking tools that use math modeling.

I like the LRMC (logistic regression Markov chain) method from some of my colleagues at Georgia Tech. Again: RPI bad, LRMC good.

3. Survival analysis quantifies how far each team is likely to make it in the tournament. This doesn’t give you insight into team-to-team matchups per se, but you can think of the probability of, say, Wisconsin making it to the Final Four as a kind of average across the different teams it might play during the tournament.

4. Look at the seeds. Only once did all four 1-seeds make the Final Four. It’s a tough road. Seeds matter a lot in the rounds of 64 and 32, not so much after that point. There will be upsets. Some seed matchups produce more upsets than others. The 7-10 and 5-12 matchups are usually good to keep an eye on.

5. Don’t ignore preseason rankings. The preseason rankings are educated guesses on who the best teams are before any games have been played. It may seem silly to consider preseason rankings at the end of the season after all games have been played (when we have much better information!) but the preseason rankings seem to reflect some of the intangibles that predict success in the tournament (a team’s raw talent or athleticism).

6. Math models are very useful, but they have their limits. Math models implicitly assume that the past is good for predicting the future. This is not usually a good assumption when a team has had any major changes, like injuries or suspensions. You can check out crowdsourcing data (who picked whom in a matchup), expert opinion, and things like injury reports to make the final decision.

## roundup of march madness sports analytics articles

1. Michael Lopez (@StatsByLopez) uses analytics to identify which teams are over- and under-valued in the tournament.
2. Evelyn Lamb at Scientific American blogs about the math behind a perfect bracket.
3. Carl Bialik at FiveThirtyEight writes about the odds of getting a perfect bracket using analytical methods. It depends on how good those analytical methods are. Nate Silver claims it might be as high as 1 in 7.4 billion. Interesting.
4. “Will a 16 seed ever beat a 1 seed?” by Ed Feng (@thepowerrank). Ed also has a bracket tool that visualizes different game outcomes.
5. My advisor Sheldon Jacobson, who maintains BracketOdds, was interviewed in the News-Gazette, the local paper in Champaign-Urbana, IL.
6. The Huffington Post has a “Predict-O-Tron” that helps you fill out your bracket using a probabilistic tool that lets you set the importance of different attributes (like seed, offensive efficiency, and even tuition) using moving sliders. It looks interesting but reeks of overfitting.
7. I was on the local NBC 15 affiliate in Madison on March 18 to discuss the odds of a perfect bracket (video included).

## creating a March Madness bracket using integer programming

An Associated Press article on ESPN outlines how the Division I men’s basketball committee wants to make bracket construction fairer [Link]. At present, there are 68 teams with no plans to expand the field. However, the committee has many decisions to make when it comes to who makes it in and who doesn’t as well as the seed and the region. All of this together determines potential matches. Previously, the committee tried to entirely avoid rematches in the first few rounds of the tournament. Given the large number of potential match-ups depending on who wins and loses, this constrained the bracket (possibly too much).

“There have been years where we’ve had to drop a team or promote a team; there was even a year where teams dropped two seed lines. We don’t feel that’s appropriate.” – Ron Wellman, the athletic director at Wake Forest

The article doesn’t exactly hint that integer programming could be used to solve this problem, but that’s the next logical step. In fact, there is a paper on this! Cole Smith, Barbara Fraticelli, and Chase Rainwater developed a mixed integer programming model (published in 2006, back when there were 65 teams) to assign teams to seeds, regions, and pods (locations). The last issue is important: constructing the bracket is intertwined with assigning the bracket to locations for play. For example, four teams in a region in the field of 64 (e.g., the 1, 8, 9, and 16 seeds) must all play at the same location to produce a single team in the Sweet 16.

The Smith et al. model minimizes the sum of the (then) first-round travel costs (the round of 64), the (then) expected second-round travel costs (the round of 32), and the reseeding penalty costs while considering typical assignment constraints as well as several side constraints, including:

• no team plays on its home court (except in the Final Four – that location is selected before the tournament),
• no intra-conference match-ups occur before the regional finals (what was the fourth round). This is the constraint that may be relaxed somewhat in the new system. Therefore, this existing model can be used to make brackets in the proposed new system.
• the top-seeded team from each conference must be assigned to a different region than the second- and third-highest seeded teams from that conference,
• the best-seeded teams should be assigned to nearby pods (locations) in the first weekend (a reward for a good season!), and
• the religious restrictions of certain universities must be obeyed (e.g., Brigham Young University cannot play on Sundays).
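
To make the flavor of this assignment problem concrete, here is a toy sketch. It is not the Smith et al. model: it brute-forces three hypothetical teams over three hypothetical pods instead of solving a mixed integer program, and it keeps only the travel cost and the home-court side constraint. All team names, pod names, and distances are made up for illustration.

```python
from itertools import permutations

# Hypothetical travel distances (miles) from each team to each pod site.
DISTANCE = {
    "Team A": {"Pod 1": 500, "Pod 2": 200, "Pod 3": 900},
    "Team B": {"Pod 1": 800, "Pod 2": 300, "Pod 3": 100},
    "Team C": {"Pod 1": 0, "Pod 2": 500, "Pod 3": 400},
}
# Side constraint (hypothetical data): Team C's home court is Pod 1.
HOME_COURT = {"Team C": "Pod 1"}


def best_assignment(distance, home_court):
    """Enumerate one-team-per-pod assignments; return the cheapest feasible one."""
    teams = sorted(distance)
    pods = sorted(next(iter(distance.values())))
    best_cost, best = None, None
    for perm in permutations(pods, len(teams)):
        assignment = dict(zip(teams, perm))
        # Skip any assignment that puts a team on its home court.
        if any(home_court.get(team) == pod for team, pod in assignment.items()):
            continue
        cost = sum(distance[team][pod] for team, pod in assignment.items())
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, assignment
    return best_cost, best


cost, assignment = best_assignment(DISTANCE, HOME_COURT)
print(cost, assignment)
```

Without the home-court rule, the cheapest plan would send Team C to its own gym. The constraint forces a more expensive but fairer assignment, which is the kind of trade-off the full model handles at scale.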

It is worth pointing out that this model assigns the seeds to teams. A team that could be considered as an 11-13 seed would be assigned its seed (11, 12, or 13) based on the total cost of the system. That may seem like it’s unfair on some level, but it might be better for a team to be a 13 seed and play nearby than a 12 seed but have to travel an extra 1000 miles. (Note: Nate Silver and the 538 team use travel distance in their NCAA basketball tournament prediction model because distance matters.) Flexible seeds allow for a bracket that gives more teams a fair shot at winning their games, but too much flexibility would be unfair to teams. The Smith et al. model allows for some flexibility for 6-11 seeds.

The mixed integer programming model by Cole Smith, Barbara Fraticelli, and Chase Rainwater already addresses the committee’s concerns, which raises the question: why isn’t the committee using integer programming?

OK, it’s probably pretty easy to think of a few reasons, and none of them involve math. One concern is that the general public seems to distrust models of any kind. This may be because models are black boxes to non-experts. This lack of transparency makes it hard to generate any kind of public support (Exhibit A: the debate about the model for the BCS football rankings). Perhaps marketing could improve buy-in (“The average team traveled 500 miles fewer this year than last” or “Five teams had to travel across all four US time zones last year, and none had to do so this year”). A better suggestion may be to give a few of the top integer programming solutions to the committee, who can then use and adapt (or ignore) the solutions as they see fit. Currently, the committee looks at several rankings (including the LRMC method, last time I heard), so they are already using math models to influence the decisions ultimately made by humans.

How would you use operations research and math modeling to improve the tournament selection and seeding process?

Reference:

Smith, J.C., Fraticelli, B.M.P., and Rainwater, C., “A Bracket Assignment Problem for the NCAA Men’s Basketball Tournament,” International Transactions in Operational Research, 13 (3), 253-271, 2006. [Link to journal site]

## will someone create a perfect bracket this year?

Warren Buffett is offering \$1B to whoever produces a perfect bracket [Link]. Here is my take.

There are about 9.2 quintillion ways to fill out a bracket. This number is often mistakenly quoted as the odds of filling out a perfect bracket – it is not 9-quintillion-to-1 because:

(a) the tournament isn’t like the lottery where every outcome is equally likely, and

(b) monkeys are not randomly selecting game outcomes. Instead, people are purposefully selecting outcomes.
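
Here is a quick sketch of where the 9.2 quintillion figure comes from, and of how fast the odds improve once picks are better than coin flips. It assumes 63 games (a 64-team bracket, ignoring play-in games) and, as a simplification, that every pick is right independently with the same probability. The function name and the 72% figure in the usage line are illustrative only.

```python
GAMES = 63  # games in a 64-team single-elimination bracket (play-in games excluded)

# Each game has two possible winners, so there are 2^63 distinct brackets.
total_brackets = 2 ** GAMES


def perfect_bracket_probability(p_correct, games=GAMES):
    """Chance of a perfect bracket if each pick is independently right with probability p_correct."""
    return p_correct ** games


print(f"{total_brackets:,}")              # number of possible brackets
print(perfect_bracket_probability(0.5))   # coin-flip picker: 1 / total_brackets
print(perfect_bracket_probability(0.72))  # illustrative: a picker who is right 72% of the time
```

Real brackets are not built from independent, equally hard picks, so these numbers are only a rough guide to why the odds are much better than 9-quintillion-to-1.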

“Good” brackets are made by people who play the odds and, for example, choose 1 seeds to beat 16 seeds in the second round. These brackets have a much better chance of reaching perfection, with estimates ranging from 128 billion-to-1 to 150 million-to-1 (See here and here).

The limitation here is that these odds give an individual likelihood of getting a perfect bracket; they give no insight into how to construct a pool of brackets that collectively has a high degree of likelihood for producing a perfect bracket.

Just like in the lottery, there is a difference between you winning the lottery and someone winning the lottery (just like in the classic Birthday Problem). Let’s say we have the perfect methodology that gives us the 150 million-to-1 odds. If 150M people filled out brackets, would we expect to see a perfect bracket? Probably not. If everyone used the same methodology that maximized our individual chance of getting a perfect bracket, this wouldn’t necessarily lead to a pool of brackets that collectively guarantee that someone gets a perfect bracket. The problem is, many of the brackets will be identical or almost-identical if they use the same methodology (meaning that they are all perfect or they are all not perfect). There needs to be enough variation between the entries to probabilistically “cover” the possible brackets with a certain reliability level. We would expect to see more variation between entries in the lottery, where many people purchase lottery tickets with randomly generated numbers (and we can more easily estimate the odds that someone will win a lottery based on the number of tickets sold). Recall: randomly generated brackets aren’t the answer! In a nutshell: what is good for the goose isn’t necessarily good for the gander.
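
Two reference cases make the point concrete. The sketch below compares a pool where every entry is an independent 150 million-to-1 shot with a pool where every entry is the same bracket. Both cases are idealized assumptions (real pools sit somewhere in between), and the function names are mine.

```python
import math


def p_someone_perfect_independent(p_single, n_entries):
    """P(at least one perfect bracket) if entries succeed independently."""
    # 1 - (1 - p)^n, computed in a numerically stable way for tiny p.
    return -math.expm1(n_entries * math.log1p(-p_single))


def p_someone_perfect_identical(p_single, n_entries):
    """P(at least one perfect bracket) if every entry is the same bracket."""
    # The whole pool is right or wrong together, so pool size is irrelevant.
    return p_single


p = 1 / 150_000_000
n = 150_000_000
print(p_someone_perfect_independent(p, n))  # close to 1 - 1/e
print(p_someone_perfect_identical(p, n))    # still 1 in 150 million
```

Even in the independent case the pool is far from a sure thing, and the more the entries resemble each other, the closer the pool’s chance falls toward that of a single bracket.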

The probability of a perfect bracket depends on the tournament. Let’s look at brackets in the last 3 years on ESPN. Let’s only look at how many people correctly select all Final Four teams:
– 47 of 8.15 million brackets correctly picked all Final Four teams in 2013
– 23,304 of 6.45 million brackets correctly picked all Final Four teams in 2012
– 2 of 5.9 million brackets correctly picked all Final Four teams in 2011
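
To compare the three years on the same scale, here is a small sketch that converts the counts above into perfect Final Four picks per million entries. The numbers are the ones listed above; the variable names are mine.

```python
# (brackets with all four Final Four teams correct, total ESPN brackets)
FINAL_FOUR_HITS = {
    2011: (2, 5_900_000),
    2012: (23_304, 6_450_000),
    2013: (47, 8_150_000),
}

# Perfect Final Four picks per million entries, by year.
rates_per_million = {
    year: hits / entries * 1_000_000
    for year, (hits, entries) in FINAL_FOUR_HITS.items()
}

for year in sorted(rates_per_million):
    print(year, round(rates_per_million[year], 2))
```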

Both 2011 and 2013 had “Cinderella stories” of VCU and Wichita State, respectively. A single surprise can drastically affect the number of outcomes and make it less likely for someone to have a perfect bracket. On the other hand, when a 1 seed wins the tournament, brackets have more correct picks, on average. Certain tournaments are therefore more likely than others to produce perfect brackets.

While having a good methodology for filling out a bracket is key to maximizing your chances, chance plays a much larger role. However, while you cannot control the randomness of the tournament, you can control how you fill out a bracket. In terms of strategy, a person should use statistics, analytical methods, and expert opinions to fill out a bracket to maximize the chance of picking a perfect bracket.

It would be a mistake to look at the two best brackets in 2011 and use the methodology that went into creating those brackets in other tournaments. Basing your bracket methodology on a single tournament is not a good idea (a single tournament is a small sample, so no statistically significant conclusions can be drawn from it). If we applied the 2011 methodology to other years, we would quickly see that in the long run, we would do very poorly in March Madness office pools.

If we are acting in our own self-interest (and we are if we want that \$1 billion prize!) then we should use the best models to maximize our personal odds and then hope for the best. Luckily, my colleagues have used analytics, operations research, and math to create some pretty good methods we can use to fill out brackets. This is a terrific place to start.

For my tips on filling out a bracket based on analytical methods: read my post here.

Are you participating in the Warren Buffett contest?

## why is it so easy to forecast the Presidential election and so hard to forecast the NCAA basketball tournament?

This blog post is inspired by my disappointing NCAA March Madness bracket. I used math modeling to fill my bracket, and I am currently in the 51st percentile on ESPN. On the upside, all of my Final Four picks are still active so I have a chance to win my pool. I am worried that my bracket has caused me to lose all credibility with those who are skeptical of the value of math modeling. After all, guessing can lead to a better bracket. Isn’t Nate Silver a wizard? How come his bracket isn’t crushing the competition? Here, I will make the case that a so-so bracket is not evidence that the math models are bad. To do so, I will discuss why it is so easy to forecast the Presidential election and so hard to forecast the NCAA basketball tournament.

Many models for the Presidential election and the basketball tournament are similar in that they use various inputs to predict the probability of an outcome. I have discussed several models for forecasting the Presidential election [Link] and the basketball tournament [Link].

All models that didn’t solely rely on economic indicators chose Obama to be the favorite, and nearly all predicted 48+ of the states correctly. In other words, even a somewhat simplistic model to forecast the Presidential election could predict the correct outcome 96% of the time. I’m not saying that the forecasting models out there were simplistic – but simply going with poll averages gave good estimates of the election outcomes.

The basketball tournament is another matter. Nate Silver has blogged about models that predict tournament games using similar math. Here, we can only predict the correct winner 71-73% of the time [Link]:

Since 2003, the team ranked higher in the A.P. preseason poll (excluding cases where neither team received at least 5 votes) has won 72 percent of tournament games. That’s exactly the same number, 72 percent, as the fraction of games won by the better seed. And it’s a little better than the 71 percent won by teams with the superior Ratings Percentage Index, the statistical formula that the seeding committee prefers. (More sophisticated statistical ratings, like Ken Pomeroy’s, do only a little better, with a 73 percent success rate.)

To do well in your bracket, you would need to make small marginal improvements over using the naive model of always picking the better seed (72% success rate). Here, a 96% success rate would be unrealistic; an improved model that gets 75% of the games correct would give you a big advantage. The big advantage here means that if you used your improved method in 1000 tournaments, it would do better on average than a naive method. In any particular tournament, the improved method may still lead to a poor bracket. It’s a small sample.
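
How often would the better model actually win in a single tournament? Here is a rough sketch under strong simplifying assumptions: 63 games, each predicted correctly and independently with a fixed probability, and the two pickers’ results independent of each other. Real brackets violate all of these (errors compound across rounds), so treat the output as an illustration of the small-sample effect rather than an estimate.

```python
from math import comb


def binomial_pmf(n, p):
    """Probabilities of 0..n correct picks when each pick is right with probability p."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def p_improved_beats_naive(n_games=63, p_naive=0.72, p_improved=0.75):
    """P(improved picker gets strictly more games right than the naive picker)."""
    naive = binomial_pmf(n_games, p_naive)
    improved = binomial_pmf(n_games, p_improved)
    total = 0.0
    naive_below = 0.0  # running P(naive picker has fewer than k correct)
    for k in range(n_games + 1):
        total += improved[k] * naive_below
        naive_below += naive[k]
    return total


print(round(p_improved_beats_naive(), 3))
```

Under these assumptions the 75% model comes out ahead in a clear majority of tournaments, but far from all of them, which is why one so-so bracket says little about the model behind it.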

The idea here is similar to batting averages in baseball. It is not really possible to notice the difference between a 0.250 batter and a 0.300 batter in a single game or even across the games in a single week. The 0.250 hitter may even have a better batting average in any given week of games. Over the course of the season of 162 games, the differences are quite noticeable when looking at the batters’ batting average. The NCAA tournament does not have the advantage of averaging performance over a large number of games: we are asked to predict a small set of outcomes in a single tournament where things will not have a chance to average out (it’s The Law of Small Numbers).

It’s worth noting that actual brackets get fewer than 72% of the games correct because errors are cumulative. If you put Gonzaga in the Elite Eight and they are defeated in the (now) third round and do not make it to the Sweet Sixteen, then one wrong game prediction leads to two wrong games in the bracket.

It’s also worth noting that some games are easier to predict than others. In the (now) second round (what most of us think of as the first round), no 1 seed has ever lost to a 16 seed, and 2 seeds have only rarely lost to 15 seeds (it’s happened 7 times). Likewise, some states are easy to predict in Presidential elections (e.g., California and Oklahoma). The difference is that there are few easy-to-predict games in the tournament whereas there are many easy-to-predict states in a Presidential election. Politico lists 9 swing states for the 2012 election. That is, one could predict the outcome in 82% of the states (41 of 50) with a high degree of confidence by using common sense. In contrast, one can confidently predict ~12% of tournament games in the round of 64 using common sense (the four games involving 1 seeds, out of 32). Therefore, I would argue that there is more parity in college basketball than there is in politics.

## methodologies used to predict the outcome of the basketball tournament

My last post was about how to choose a winning bracket in the NCAA men’s basketball tournament. I linked to several tools for predicting which team is likely to win a game. These tools

1. provide a rank ordering of the teams from best to worst,
2. compute the odds of which team would win in a matchup based on their tournament seed, or
3. provide odds of a team making it to different levels of the tournament based on specific matchups.

I linked to the methodologies used by these tools in my last post but didn’t get into the details. Here, I am going to discuss the methodologies in more detail. I am going to focus on tools that predict the outcome of specific tournaments (#3 above).

Wayne Winston noted in Mathletics that there is no transitivity in matchups. That is, if team A is favored to beat team B and team B is favored to beat team C, this does not imply that team A is favored to beat team C. Thus, the team rankings (#1 above) are not a perfect tool for predicting specific matchups. He uses “power ratings” to compute how many points one team is better than the other (a point spread), which takes home field advantage and other factors into account. He then converts the point spread to the probability of winning using historical game outcomes (basically, a normal distribution with a history-derived standard deviation) or simulates the games to compute the odds of winning.
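
The spread-to-probability step can be sketched in a few lines. This assumes the final margin is normally distributed around the point spread; the default standard deviation of 10 points is a placeholder I picked for illustration, not a value from Mathletics, and the function name is mine.

```python
from math import erf, sqrt


def win_probability(point_spread, sd=10.0):
    """P(favorite wins) if the final margin ~ Normal(point_spread, sd).

    point_spread: points by which the team is expected to win (negative if underdog).
    sd: standard deviation of game margins; 10.0 is a hypothetical placeholder.
    """
    # Normal CDF evaluated at point_spread / sd, written with the error function.
    return 0.5 * (1 + erf(point_spread / (sd * sqrt(2))))


for spread in (0, 3, 7, 14):
    print(spread, round(win_probability(spread), 3))
```

A pick’em game (spread of 0) comes out at 50%, and the probability climbs toward 1 as the spread grows relative to the standard deviation.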

Nate Silver’s model is interesting in that it takes many inputs, including the ranking tool outcomes from #1 above. His model blends four ranking models to take a more pluralistic view of who might win. I think this is a strength because it uses the wisdom of crowds (a small crowd in this case). Each of the four tools contributes 1/6 of the total power rating (a margin of victory). Seed number and whether the team was ranked in preseason polls each contribute 1/6 of the power rating. He then makes adjustments for the geography of the game and player injuries and absences. He doesn’t describe his forecast probabilities in detail, but I suspect that his approach is similar to Wayne Winston’s. A team’s power rating is adjusted in each round based on the outcomes from previous rounds to account for potential errors in the power rating, another strength of the model.
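
The equal-weight blend is easy to sketch. The version below assumes all six components have already been converted to the same points scale; how the seed and the preseason ranking are converted is not described here, so those inputs are treated as given. The function name and example numbers are hypothetical, and the geography and injury adjustments are left out.

```python
def blended_power_rating(ranking_ratings, seed_rating, preseason_rating):
    """Average six components, each weighted 1/6: four ranking systems, seed, preseason poll.

    All inputs are assumed to be on a common points (margin of victory) scale.
    """
    if len(ranking_ratings) != 4:
        raise ValueError("expected ratings from exactly four ranking systems")
    components = list(ranking_ratings) + [seed_rating, preseason_rating]
    return sum(components) / len(components)


# Hypothetical team: four computer ratings plus seed-based and preseason-based ratings.
print(blended_power_rating([12.0, 10.5, 11.0, 13.5], seed_rating=9.0, preseason_rating=10.0))
```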

Finally, Luke Winn and John Ezekowitz’s model doesn’t use power ratings [methodology here] – it instead applies survival analysis to predict when a team may drop out of the tournament. This model computes hazard rates for each team based on the team’s RPI and Ken Pomeroy’s ranking. They also consider

1. consistency,
2. tournament experience,
3. out-degree network centrality that captures the number of games played and won against other NCAA tournament teams (see picture below), and
4. the negative interaction of the Experience and Out-Degree Centrality variables

Cox Proportional Hazard regression was used to rerank the teams.
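
To show how hazard rates relate to “how far a team goes,” here is a minimal discrete-time sketch. It is not Winn and Ezekowitz’s fitted Cox model: it simply assumes we already have, for one team, the probability of being eliminated in each round given that the team reached it. The hazards in the example are made up.

```python
def survival_by_round(round_hazards):
    """Turn per-round elimination hazards into the probability of surviving each round.

    round_hazards[i] is P(eliminated in round i | team reached round i).
    Returns a list whose i-th entry is P(team wins rounds 0..i).
    """
    survival = []
    alive = 1.0
    for hazard in round_hazards:
        alive *= 1 - hazard  # must survive this round on top of all earlier ones
        survival.append(alive)
    return survival


# Hypothetical hazards for a strong team over six rounds.
hazards = [0.05, 0.20, 0.30, 0.40, 0.45, 0.50]
for round_number, prob in enumerate(survival_by_round(hazards), start=1):
    print(round_number, round(prob, 3))
```

A team with lower hazards in every round has a higher survival curve everywhere, which is one way a hazard-based model can be used to rerank teams.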

## how to pick a winning bracket using analytics

I’ve written about the NCAA basketball tournament many times before – click on my March Madness tag for my past posts. This time, I am going to summarize the different ways to select the winning teams in your bracket. There are many ways to choose a bracket, but using math models and analytics techniques seems to work best. Plus, it’s more fun than just guessing.

As I see it, there are two ways to pick a bracket: you can look at the team matchups and choose or you can look at the seed number and choose. I lean toward team level matchups when creating my own brackets, but I use the seed numbers, too. Some seed matchups (7/10 seeds for example) have historically high rates of producing upsets.

There are several tools that rank teams, and these ranking tools provide a way to see which team is “better” – the one with the higher ranking. One traditional tool is the RPI (access the RPI rankings here), but it’s not considered to be very good.

There are a number of more sophisticated ranking tools that use math modeling.

These ranking tools are great and do well at predicting individual games, and they do extremely well on average. This means that these methods would do the best when averaged over, say, 1000 basketball tournaments. We don’t have 1000 tournaments – we have just one. Keep that in mind. These rankings also do not necessarily give insight into a matchup in a specific game.

Two models consider individual matchups when computing how far each team will make it in the tournament.

The other way to pick a bracket is to look at the seeds in the matchups. This is useful when a weak team from a major conference plays a top mid-major team. See this Business Week article on Sheldon’s advice for picking a good bracket. There is one tool developed by Sheldon Jacobson and his collaborators that focuses on seeds:

Here is one last thing to keep in mind:

• Preseason rankings matter: teams that are in the top 25 before the season starts are likely to go far in the tournament despite their seeds, and likewise, top 25 teams at the end of the season who were unranked at the beginning of the season are likely to go home early.

There are other articles out there on how to pick a winning bracket. Here is what I recommend reading: