Friday, January 16

The BCS Formula

Love it or hate it, the BCS formula has proven its worth as a marketing tool for developing weekly interest in the top college football teams. Expanding the playoff from the current two teams to many more than four risks reducing the BCS formula's marketing value to that of NCAA basketball's RPI. This is a strong argument for keeping the BCS formula in the proposed system and keeping the field of teams small. As the formula is not going anywhere anytime soon, an analysis of it is in order.

Critique of the BCS Formula
As a formula for determining an average ranking, the BCS formula is not bad. The polls have enough voters to dampen the effect of outliers, and the computer average guards against them by throwing out the highest and lowest values.
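The trimming idea can be sketched in a few lines. This is a minimal illustration of discarding the extremes, not the exact BCS point scale; the ranking values are invented:

```python
def bcs_computer_average(rankings):
    """Average a team's computer rankings after discarding the single
    highest and single lowest value, as the BCS computer average does."""
    if len(rankings) < 3:
        raise ValueError("need at least three rankings to trim both extremes")
    trimmed = sorted(rankings)[1:-1]  # drop the lowest and highest values
    return sum(trimmed) / len(trimmed)

# Six hypothetical computer rankings for one team; the 1 and the 7 are dropped
print(bcs_computer_average([2, 3, 1, 2, 7, 2]))  # -> 2.25
```

A single eccentric computer thus cannot move a team's average on its own, which is the point of the trim.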

Using the points awarded in the polls instead of the rankings significantly increases the accuracy of the resulting average. One improvement that could be made is to use the raw data of the computers as well. Each computer ranking used has an underlying raw score that is analogous to the total points of the polls. By linearly scaling each so that the #2 team has a value of 0.96 and the #25 team a value of 0.04, the precision of the computer ranking average could be significantly increased.
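A minimal sketch of that rescaling, assuming each computer exposes a raw score per team where higher is better; the function name and the evenly spaced scores are hypothetical:

```python
def scale_raw_scores(raw_scores):
    """Linearly rescale one computer's raw scores so the #2 team maps to
    0.96 and the #25 team maps to 0.04, preserving the relative spacing
    between teams instead of collapsing them to integer ranks."""
    ordered = sorted(raw_scores.items(), key=lambda kv: kv[1], reverse=True)
    s2, s25 = ordered[1][1], ordered[24][1]  # raw scores of #2 and #25
    slope = (0.96 - 0.04) / (s2 - s25)
    return {team: 0.04 + slope * (score - s25) for team, score in ordered}

# Invented raw scores for 30 teams, evenly spaced for illustration
raw = {"Team%02d" % i: 100 - 3 * i for i in range(30)}
scaled = scale_raw_scores(raw)
print(round(scaled["Team01"], 2), round(scaled["Team24"], 2))  # -> 0.96 0.04
```

Note that the #1 team is not pinned to any value: if a computer rates it far ahead of #2, the rescaled score reflects that, which is exactly the precision the rankings throw away.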

While slight improvements may be suggested, the BCS formula's algorithm is simple and effective.

Critique of the BCS Sampling Methods
One of the primary issues with the formula is its sampling methods. It relies on two polls. The Coaches Poll has long been charged with voter prejudice and partisan voting blocs among coaches who are too focused on their own teams to really consider how the other teams rank. The Harris Poll involves voters who admittedly don't even watch all of the top 25 teams play a game.

The computers used are in the middle of the pack. Several other computer methods exist that are significantly better but are excluded because they use margin of victory. This is despite evidence that margin of victory is the single best NCAA statistic for predicting the winners of bowl games. The BCS computers are too dependent on each team's win-loss record. This is done intentionally to avoid encouraging teams to run up the score, which many see as unsportsmanlike.

In the end, the historical bias of the voters and the dependence of the computers on each team's record pull in opposite directions. I believe they are balanced fairly well, but there is plenty of room for disagreement.

Consensus implied by the gaps in the BCS standings
Massey's ranking comparison indicates that the standard deviation of the rankings for teams in the top 5 runs about 2. This value increases the further down the standings one looks, but usually only the top of the standings is of national interest.

A difference in the BCS standings can be multiplied by 25 to give the corresponding across-the-board ranking difference it represents. Assuming the voters' opinions of the value of each team are normally distributed, the standard deviation of the difference between two teams' rankings would be the square root of 2 times the standard deviation of the rankings. Dividing the ranking difference by the standard deviation of the difference produces a z-score, which can be used to estimate the probability that an individual voter will agree with the results. I will refer to this probability as consensus.

A difference of 0.0181, the gap between Florida and Texas in the final 2008 BCS standings, represents an average ranking difference of 0.4525, less than the discretization error, and corresponds to a consensus of 56.4%. A difference of 0.0600 represents a consensus of 70.2%. That is more agreement than the two-thirds majority needed in Congress to propose an amendment to the U.S. Constitution.
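These figures can be reproduced directly from the steps above. This is a sketch that builds the standard normal CDF from Python's `math.erf`; σ = 2 is the top-5 standard deviation quoted earlier:

```python
from math import erf, sqrt

def consensus(bcs_gap, sigma=2.0):
    """Estimate the fraction of voters agreeing with an ordering separated
    by `bcs_gap` in the standings: multiply by 25 to convert to ranking
    units, divide by the standard deviation of a difference (sqrt(2) * sigma),
    and evaluate the standard normal CDF at the resulting z-score."""
    z = (bcs_gap * 25) / (sqrt(2) * sigma)
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF

print(round(consensus(0.0181), 3))  # Florida-Texas, 2008 -> 0.564
print(round(consensus(0.0600), 3))  # -> 0.702
```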

These values are only valid near the top 5. At a lower ranking the standard deviation increases, so the consensus would be lower for the same difference.

Critique of how the BCS formula is applied
Another issue with the BCS formula is how it is applied. The current system only allows for two teams, so the formula is used to decide between the #2 and #3 teams regardless of their difference in the standings.

Rather than use the BCS formula to separate teams at a given ranking, it would be much better to separate teams at a given gap in the standings, since a large gap indicates that a strong consensus agrees with the cutoff.
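A gap-based cutoff could be selected like this. The 0.06 threshold is the gap discussed above, `max_teams` is an assumed cap to keep the field small, and the standings values are invented:

```python
def cut_at_gap(standings, min_gap=0.06, max_teams=4):
    """Return the teams above the first gap of at least `min_gap` in the
    standings, capped at `max_teams`. `standings` is a list of
    (team, bcs_value) pairs sorted from best to worst."""
    for i in range(1, min(len(standings), max_teams)):
        if standings[i - 1][1] - standings[i][1] >= min_gap:
            return [team for team, _ in standings[:i]]
    return [team for team, _ in standings[:max_teams]]

field = [("A", 0.975), ("B", 0.952), ("C", 0.880), ("D", 0.861)]
print(cut_at_gap(field))  # gap of 0.072 between B and C -> ['A', 'B']
```

If no gap clears the threshold, the cap applies; the cutoff then falls back to a fixed field size rather than expanding indefinitely.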

Using a gap also increases the credibility of the system. It would be significantly more difficult for a few voters, or even a bloc of voters, to manipulate a large gap than to flip the order of two closely ranked teams.

Conclusion
The proposed championship system keeps the number of eligible teams low enough for the BCS formula to maintain its value as a marketing tool, and it uses the formula to produce a more credible cutoff while increasing the integrity of the formula itself.
