Comparison of prediction methods for batter-pitcher matchups
Abstract
Baseball is built on confrontations, and the confrontation between a batter and a pitcher is what makes the game. If a formula could correctly predict the probabilities of the outcomes when the two meet, it would give the head coach (or you, if you play fantasy baseball) the confidence to select the player most likely to come out on the winning end. We would like to know which of our batters are good and which of the small number of possible outcomes will result when a given batter faces a good pitcher from the next opposing team. The batter's past performance against that pitcher seems a natural indicator, and the methods currently in use presumably rely on it. However, the usefulness of batter vs. pitcher data for predicting future outcomes has been debated for quite some time. The debate stems from the small sample size of this data: it is hard to judge when to prefer information drawn from thousands of at-bats against all pitchers and when to prefer information drawn from perhaps a few dozen at-bats against a specific pitcher. This report discusses one well-known method, Log5 [1], which has been used so far to estimate the outcomes of these confrontations. It also discusses other methods, such as logistic regression on past data and the newer Morey-Z [3].