| Current Standings | Current Predictions |
| Remaining NL Games | Remaining AL Games |
Method. The data from USA Today is parsed to obtain the game scores to date and the remaining scheduled games (as included in the dataset). An approximate statistical model is then applied to the data to obtain a rating for each team. The model states that if team A has a rating of u, team B has a rating of v, and team A has a home team advantage of f (where 1 is neutral), then the probability that team A wins a home game against team B is f * u / (f * u + v). A maximum likelihood estimator can be used to obtain estimates of the ratings for each team. To simplify the problem, instead of maximizing the likelihood, a least squares estimate is used, which minimizes the sum
$$\sum_{i,j=1}^{N} n[i,j] \, \bigl( a[i] + t[i] - t[j] - \ln(w[i,j]) + \ln(l[i,j]) \bigr)^2$$
where n[i,j] is the number of games that team i plays at home against
team j, w[i,j] is the number of those games that team i won,
and l[i,j] is the number of losses.
Here t[i] is the log of the rating u[i] and a[i] is
the log of the home team advantage f[i].
The home team advantage is determined by considering all
days in the season and then held constant for the
estimate of the ratings (this reduces the variance
of the estimate).
The weighting n[i,j] is used so that when two teams
have played many games against each other, then their
records will 'count more' in the estimate than teams
having played fewer games.
In order to account for changes in team ratings
during the season, only the most recent 70 days of
the season are considered when making the estimate.
Finally, the values for the ratings can all
be multiplied by a constant without changing the
prediction of the model.
For consistency, the ratings are scaled so that the sum
is 14. In other words, a rating of 1 is average for
the given league, with smaller values implying
weaker teams.
Prior to August 27, the home team advantage was not
estimated.
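As a rough illustration of the least squares estimate described above, the following Python sketch fits ratings from matrices of home wins and losses. The function names `win_probability` and `fit_ratings` are hypothetical, as is the `smooth` parameter: the text does not say how pairings with zero wins or zero losses are handled, so a small constant is added to both counts here purely as an assumption to keep the logarithms finite. The home advantage is passed in as a fixed input, and restricting the data to the last 70 days is left to the caller.

```python
import numpy as np

def win_probability(u, v, f=1.0):
    """Probability that the home team (rating u) beats the visitor (rating v)."""
    return f * u / (f * u + v)

def fit_ratings(wins, losses, log_home_adv, smooth=0.5):
    """Weighted least-squares ratings from home win/loss matrices.

    wins[i, j], losses[i, j]: results of team i at home against team j.
    log_home_adv[i]: fixed a[i] = ln f[i], estimated elsewhere.
    smooth: ASSUMPTION, added to both counts so the logs stay finite.
    """
    wins = np.asarray(wins, dtype=float)
    losses = np.asarray(losses, dtype=float)
    n_teams = wins.shape[0]
    games = wins + losses                      # n[i, j]
    home, away = np.nonzero(games)             # pairings that were played
    weight = np.sqrt(games[home, away])        # sqrt so squared residuals get weight n
    # Target for t[i] - t[j]: observed log odds minus the home advantage.
    target = (np.log(wins[home, away] + smooth)
              - np.log(losses[home, away] + smooth)
              - np.asarray(log_home_adv, dtype=float)[home])
    design = np.zeros((home.size, n_teams))
    rows = np.arange(home.size)
    design[rows, home] = 1.0
    design[rows, away] = -1.0
    # The log ratings are only defined up to a constant; lstsq picks one solution.
    t, *_ = np.linalg.lstsq(design * weight[:, None], target * weight, rcond=None)
    u = np.exp(t)
    return u * n_teams / u.sum()               # rescale so the average rating is 1
```

With 14 teams the final rescaling makes the ratings sum to 14, matching the convention in the text.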
Presentation. Once estimates for ratings are obtained for each team, the probabilities of winning can be used to 'predict' games. This is shown in two ways. First, the 'expected' number of remaining wins is used to predict the final records. The expected standings are then shown for each division and the wild card race. Second, 100 seasons of random play are simulated based on the predicted probabilities, and the number of times that each team wins a division or wild card is counted and displayed. This illustrates that a team can be expected to finish 3rd or 4th in a division but still have a positive probability of making the playoffs.
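The two presentations can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the `(home, away)` schedule format are hypothetical, only a single division title is counted (the wild card is omitted), and ties are broken at random because the text does not describe a tie-break rule.

```python
import numpy as np

def expected_final_wins(current_wins, remaining, ratings, home_adv=1.0):
    """Current wins plus the summed win probabilities of the remaining games.

    remaining: list of (home, away) team indices, one entry per scheduled game.
    """
    ratings = np.asarray(ratings, dtype=float)
    expected = np.asarray(current_wins, dtype=float).copy()
    for home, away in remaining:
        p_home = home_adv * ratings[home] / (home_adv * ratings[home] + ratings[away])
        expected[home] += p_home
        expected[away] += 1.0 - p_home
    return expected

def simulate_division_titles(current_wins, remaining, ratings, division,
                             home_adv=1.0, n_seasons=100, seed=0):
    """Count how often each team finishes with the most wins in `division`.

    division: indices of the teams in one division.
    ASSUMPTION: ties are broken uniformly at random; real tie-break rules differ.
    """
    rng = np.random.default_rng(seed)
    ratings = np.asarray(ratings, dtype=float)
    home = np.array([h for h, _ in remaining], dtype=int)
    away = np.array([a for _, a in remaining], dtype=int)
    p_home = home_adv * ratings[home] / (home_adv * ratings[home] + ratings[away])
    division = np.asarray(division, dtype=int)
    titles = np.zeros(len(ratings), dtype=int)
    for _ in range(n_seasons):
        home_won = rng.random(home.size) < p_home
        wins = np.asarray(current_wins, dtype=float).copy()
        np.add.at(wins, home[home_won], 1)     # credit home winners
        np.add.at(wins, away[~home_won], 1)    # credit visiting winners
        # Jitter below 0.5 breaks ties without reordering distinct win totals.
        jittered = wins[division] + rng.random(division.size) * 0.5
        titles[division[np.argmax(jittered)]] += 1
    return titles
```

Dividing the returned counts by `n_seasons` gives a rough estimate of each team's chance of winning the division; with only 100 simulated seasons these estimates are themselves fairly noisy.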
Previous Predictions. To see a record of the predictions made on previous days of the season, take a look at these pages. Eventually they will display a graph but for now the data is presented in a table. This shows what the current algorithm would have predicted for the given day of the season. Because the algorithm changed on August 27, the previous estimates were recalculated to show the performance of the new algorithm.
| National League | American League |
|---|---|
| Ratings | Ratings |
| Predictions | Predictions |
Validation I. The validity of this analysis comes down to two different kinds of questions. (1) Does the model fit the data? and (2) Are the estimates accurate? At the present time not much has been done to answer either of these questions. To help evaluate the first question, a chi-squared value is given for both the full season (records to date) and the critical portion of the season (last 70 days). This value should be distributed something like chi-squared(k/2), but it always seems to be smaller than expected (which would imply a good match). Other than that, the results tend to agree with my own biased opinion of actual team strengths, but there are some exceptions. For example, in the 1993 season, Houston finished at .525 and Colorado at .414 and yet Colorado held an 11-2 advantage over Houston. This cannot be explained by a strictly one-dimensional model (as this is) and what may be needed is a multi-dimensional factor analysis.
Validation II. Towards answering the second question (are the estimates accurate?), two values are now computed and displayed along with each estimated team rating: a bias and a standard deviation. These values are obtained by assuming the estimated ratings are correct, simulating 1000 random outcomes for the games of the last 70 days, and finding the corresponding estimates. This has shown that the estimates are 'real' in the intuitive sense that they tend to be close to the assumed values. But two things can be said to qualify this. First, the estimates are definitely biased, but probably not by much more than 10%. Superior teams tend to be underestimated (negative bias) and inferior teams tend to be overestimated (positive bias), but the amount of bias depends on the actual schedule of the last 70 days. Second, the 'error' as measured by the standard deviation is substantial but not so large that the estimates are meaningless. Typical standard deviations range from 30 to 50 percent and increase with the team rating. I think this is good news because it means that the trends are probably correct (yes, Detroit is currently worse than Cleveland), but there is uncertainty in comparing teams that have close ratings (such as Chicago and Cleveland).
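The simulation described above might look roughly like the following Python sketch. Several things here are assumptions rather than details from the text: the function names, the omission of the home advantage, the 0.5 added to win and loss counts to avoid taking the log of zero, and the choice to report bias and standard deviation as fractions of the assumed rating. The assumed ratings should average 1 so that they are on the same scale as the re-estimated ones.

```python
import numpy as np

def fit_ratings(wins, losses, smooth=0.5):
    """Compact least-squares fit with no home advantage (a[i] = 0 ASSUMED)."""
    games = wins + losses
    home, away = np.nonzero(games)
    weight = np.sqrt(games[home, away])
    # smooth is an ASSUMPTION that keeps the logs finite for 0 wins or 0 losses.
    target = np.log(wins[home, away] + smooth) - np.log(losses[home, away] + smooth)
    design = np.zeros((home.size, wins.shape[0]))
    rows = np.arange(home.size)
    design[rows, home] = 1.0
    design[rows, away] = -1.0
    t, *_ = np.linalg.lstsq(design * weight[:, None], target * weight, rcond=None)
    u = np.exp(t)
    return u * u.size / u.sum()

def bias_and_sd(ratings, games, n_sims=1000, seed=0):
    """Relative bias and standard deviation of the estimator, by simulation.

    ratings: ratings assumed to be correct (average 1).
    games[i, j]: number of home games of team i against team j in the window.
    """
    rng = np.random.default_rng(seed)
    ratings = np.asarray(ratings, dtype=float)
    games = np.asarray(games, dtype=int)
    # Home win probability for every pairing under the assumed ratings.
    p_home = ratings[:, None] / (ratings[:, None] + ratings[None, :])
    estimates = np.empty((n_sims, ratings.size))
    for s in range(n_sims):
        wins = rng.binomial(games, p_home).astype(float)   # replay the schedule
        estimates[s] = fit_ratings(wins, games - wins)
    bias = estimates.mean(axis=0) / ratings - 1.0
    sd = estimates.std(axis=0, ddof=1) / ratings
    return bias, sd
```

A negative entry in `bias` would mean that the team tends to be underestimated under the assumed ratings and schedule.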
Cross League Comparison. Needless to say, there is no reason to expect that the ratings obtained for the separate leagues are comparable. A purely statistical comparison may not be possible except by assuming that the two leagues are composed of teams coming from an identical statistical distribution. Even then, with a sample size of 14 teams per league this approach is weak at best.