Miners Mills


Musings on Logic, Analysis, Decision-Making, and Other Elements of Natural and Artificial Intelligence

My Imperfect NCAA Tournament Bracket

After one weekend of the annual NCAA Men’s Basketball Tournament, I’m not sure whether to say we’re one-third of the way through (two of six rounds played) or more than three-quarters of the way through (48 of 63 games played, not counting the “play-in” games, which I ignore herein). Either way, I’ve already picked several games wrong, so I have no chance at a perfect “bracket”.

For those who don’t know what an NCAA Tournament “bracket” is: briefly, it’s a chart of the matchups in the 64-team, 6-round, “single-elimination” (i.e. one loss and you’re out) tournament.  In all, there are 63 games (each eliminating one team), leaving one champion at the end with a 6-0 record.

Here is an example of this year's bracket:


There is an annual ritual among millions of people, in the days between the tournament match-ups being set and the games beginning, of filling out a bracket and trying to guess the winner of each game, all the way through the championship.

As there are 63 total games, many mathematicians incorrectly state the odds of filling out a “perfect” (i.e. picking the winner of all games correctly) bracket as 1 in 2^63 ≈ 9 quintillion. However, this assumes that every game is a “coin flip,” where each team has an equal chance of winning… when, in fact, the tournament is set up by experts who seed each of four regions by rankings of 1-16, in which the first round games match seeds totaling 17 (i.e. 1 plays 16, 2 plays 15, etc.). Since the tournament was expanded to 64 teams in 1985, only one 16 has defeated a 1 (out of now 140 games), and only 8 times has a 15 beaten a 2 (also out of 140 tries)… so these are clearly not coin-flip games.
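To make the coin-flip arithmetic concrete, here is a minimal Python sketch. It only restates the numbers quoted above; the variable names are mine, chosen for illustration.

```python
# Number of distinct ways to fill out a 63-game bracket. If every game
# were a fair coin flip, a perfect bracket would be a 1-in-this-many shot.
GAMES = 63
outcomes = 2 ** GAMES
print(f"1 in {outcomes:,}")  # 1 in 9,223,372,036,854,775,808

# Historical first-round upset rates quoted in the text (out of 140 games each).
p_16_over_1 = 1 / 140
p_15_over_2 = 8 / 140
print(f"16 over 1: {p_16_over_1:.1%}, 15 over 2: {p_15_over_2:.1%}")
# 16 over 1: 0.7%, 15 over 2: 5.7%
```

Both upset rates are a long way from the 50% that the coin-flip model assumes.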

Other mathematicians, using statistical models to estimate the chances of the higher seed winning in each round, have come up with better odds, even getting as “high” as 1 in 128 billion.


But even these statistical models don’t actually answer the question “What are the chances of picking a perfect bracket?” What they are trying to calculate is the answer to “What is the probability that the higher seed wins all 63 games?”  That’s not how bracket-picking works.  Most people, even those who don’t follow men’s college basketball at all prior to March, try to pick upsets.  It’s no fun to only pick the higher seed in each round... well, that is, unless you’re like me, and got tired of picking upsets wrong, so you just minimize effort by picking the higher seed to win every game. Boring, yes, but quick, and less emotionally draining.

Instead, the question to which most people want the answer is: “What are the chances that *I* have a perfect bracket?”  This is a much more difficult problem, because it is trying to put a probability on a compounding of 63 different 2-stage events… the first stage being “Did *I* pick the higher seed or lower seed to win?” then “Did the higher seed or lower seed win?”

The latter is the one to which we can try to assign a probability (given 34 years of historical results, and assuming a proper ranking by the experts).  However the former is unanswerable in the abstract… how can we assess an a priori probability that any particular person chooses upsets a certain way?


Theoretically, ESPN or CBS could analyze every filled-out bracket to determine each person X's choices of Seed A_n defeating Seed B_n (or vice-versa) for each of the n = 1 to 63 games, and triangulate that against the historical probabilities that a Seed A_n defeated Seed B_n in that round. This would then seem to give the probability that this particular bracket X will be perfect. Aggregate that over all brackets submitted, and one could seemingly assess the probability that “At least one of these submitted brackets is perfect”.
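A minimal sketch of that calculation, under loud assumptions: the function names are hypothetical, the toy tournament has 4 teams and 3 games rather than 64 and 63, every probability is made up for illustration, and games are treated as independent. One detail worth noting is that two different brackets can never both be perfect, so within this simplified model the chances of distinct brackets simply add.

```python
from math import prod

def perfect_probability(pick_probs):
    """Chance that every pick in one bracket is right.

    pick_probs[n] is the assumed probability that the pick in game n is
    correct, given that all earlier picks were correct. Games are treated
    as independent, which is a simplification.
    """
    return prod(pick_probs)

def any_perfect_probability(brackets):
    """Chance that at least one submitted bracket is perfect.

    `brackets` maps a tuple of picked winners to that bracket's pick_probs.
    Distinct brackets are mutually exclusive outcomes, so their chances add;
    duplicate brackets collapse into one dict key and are counted once.
    """
    return sum(perfect_probability(probs) for probs in brackets.values())

# Toy tournament: A plays B, C plays D, then the winners meet.
# Every number below is invented for illustration.
brackets = {
    ("A", "C", "A"): [0.8, 0.6, 0.7],   # picks the favorite in C vs D
    ("A", "D", "A"): [0.8, 0.4, 0.75],  # picks the upset in C vs D
}
for picks, probs in brackets.items():
    print(picks, round(perfect_probability(probs), 3))
print("at least one perfect:", round(any_perfect_probability(brackets), 3))
```

The hard part is not this arithmetic but the inputs: each entry in `pick_probs` is exactly the kind of number that is difficult to pin down.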

But even this relies upon proper current rankings, as well as the assumption that historical results of Seed vs. Seed performance are correct predictors of the probabilities of results this year, which, given the complexities of rankings and the nuances of performance vs. expectation, is more of a quantum probability cloud than any sort of precise measurement.

That’s why it’s much easier to state the problem, as many mathematicians have, as that of a series of 63 coin-flip games.


Well, in the quest for a perfectly-picked boring (higher-ranked teams picked to win each game) bracket, the experts are doing only “ok” so far this year.  Of the 48 games played in the first weekend, the higher seed has won 34: only 20 of the 32 games in the first round, but a better 14 of 16 in the 2nd round.

That’s an interesting breakdown.  In theory, it should be easier for the experts to pick the first round (wider gap between high and low seeds) than the second round.  This counter-intuitive result could be a function of a wide disparity between the top 16 teams in the country, and the next 48 teams, or the experts spending more time getting the ranking right at the top of the brackets, and less time on the mid and lower rankings.


Two other interesting statistical anomalies so far:

1) In the theoretically close matchups pitting 8 vs 9 and 7 vs 10 in the first round, even assuming these are all toss-up games, there should be only a 3.5% probability that the 9s and 10s win at least 7 of the 8 games. However, that’s exactly what happened this year (with all 7 of those winners losing in the second round).

2) In the 2nd round, only 1 of the 4 games in the East region was decided by double-digits, but 10 of the 12 games in the other 3 regions were decided by double-digits.
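
The 3.5% figure in the first anomaly is a binomial tail probability, and it can be checked with a few lines of Python. This is a sketch under the toss-up assumption stated there (each game is an independent 50/50); the helper name `prob_at_least` is my own.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of at least k successes in n independent trials,
    each succeeding with probability p (a binomial tail sum)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Eight toss-up games (four 8-vs-9 and four 7-vs-10 matchups):
# chance that the lower seeds win at least 7 of them.
print(f"{prob_at_least(7, 8):.2%}")  # 3.52%
```

With p = 0.5 the sum reduces to (8 + 1) / 2^8 = 9/256: eight ways to win exactly 7, plus one way to win all 8, out of 256 equally likely results.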


Since I did simply pick the higher seed to win every game, my bracket has already been busted 14 times.  Ordinarily, I’d be discouraged by failing so often, so quickly.  But it does give me some solace knowing that, according to some mathematicians, I’m no different than the approximately 9 quintillion other people who have failed as well.  Good company indeed.

Thanks for reading! Feel free to email me your thoughts.

David Chariton