Wednesday, March 09, 2011

Most Productive Offensive Players in Baseball History

I thought it would be interesting to try to determine the most productive offensive players in baseball history, using advanced statistical analysis to do so.

I decided to base this particular analysis on the metric I'm most familiar with, Wins Above Replacement (WAR). But, since this is supposed to be about offense, I needed to look at just the offensive components of WAR.

WAR is made up of seven different categories, only five of which are relevant here:
  • Runs from Batting
  • Runs from Baserunning
  • Runs from Reached on Error
  • Runs from Grounded into Double Plays
  • Runs from Fielding
  • Runs from Positional Scarcity
  • Runs from Replacement Level
All of these numbers are added together to get Runs Above Replacement (RAR), which is then converted to Wins Above Replacement (WAR). Roughly, ten runs equal one win. That is, for every ten runs someone is better than a replacement player, he contributes one additional win to his team's success.
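
As a minimal sketch of that arithmetic, here is the sum-and-convert step in Python. The component values are made up purely for illustration (they don't belong to any real player), the names `runs_above_replacement` and `wins_above_replacement` are my own, and the flat 10-runs-per-win figure is only the rough conversion mentioned above, not an exact one.

```python
# Hypothetical run values for one player; invented for illustration only.
components = {
    "batting": 30.0,
    "baserunning": 2.0,
    "reached_on_error": 1.0,
    "double_plays": -1.0,
    "fielding": 5.0,
    "positional": -3.0,
    "replacement": 20.0,
}

RUNS_PER_WIN = 10.0  # rough conversion: about ten runs per win


def runs_above_replacement(c):
    """Add all seven components together to get RAR."""
    return sum(c.values())


def wins_above_replacement(c, runs_per_win=RUNS_PER_WIN):
    """Convert RAR to WAR using the approximate runs-per-win figure."""
    return runs_above_replacement(c) / runs_per_win


rar = runs_above_replacement(components)
war = wins_above_replacement(components)
print(f"RAR = {rar:.1f}, WAR = {war:.1f}")
```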

Basically, I took each player's Runs Above Replacement and subtracted his Runs from Fielding and Runs from Positional Scarcity. The latter statistic is based on the concept that, if a team were to get the exact same production out of its catcher as from its first baseman, the catcher would be far more valuable. This makes sense, but I'm looking to determine overall offensive production, not value.
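
The subtraction described above can be sketched like this, again in Python with invented numbers. The function name `offensive_rar` and its arguments are assumptions made for the example. Note that subtracting a negative positional adjustment raises the total, so a player at an "easy" position isn't penalized here.

```python
def offensive_rar(total_rar, fielding_runs, positional_runs):
    """Position-independent offensive RAR: total RAR minus the two
    non-offensive components (fielding and positional scarcity)."""
    return total_rar - fielding_runs - positional_runs


# Hypothetical player: 54 total RAR, +5 fielding runs, -3 positional runs.
off_rar = offensive_rar(54.0, 5.0, -3.0)

# The same figure obtained by summing the five offensive components
# (batting, baserunning, reached on error, double plays, replacement).
five_component_sum = 30.0 + 2.0 + 1.0 + (-1.0) + 20.0

print(off_rar, five_component_sum)
```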

Lastly, since I'm talking about who was (or is) the most productive, I decided to rank the players on a per-plate-appearance basis. So, let's cut right to the chase and present the 50 most productive offensive players in baseball history, among those with a minimum of 6,000 career plate appearances, based on position-independent offensive RAR per plate appearance:

  1. Babe Ruth
  2. Lou Gehrig
  3. Ted Williams
  4. Barry Bonds
  5. Dan Brouthers
  6. Mickey Mantle
  7. Ty Cobb
  8. Albert Pujols
  9. Rogers Hornsby
  10. Joe Jackson
  11. Billy Hamilton
  12. Jimmie Foxx
  13. Joe DiMaggio
  14. Ed Delahanty
  15. Willie Mays
  16. Tris Speaker
  17. Stan Musial
  18. Frank Thomas
  19. Hank Greenberg
  20. Mark McGwire
  21. Hank Aaron
  22. Johnny Mize
  23. Mel Ott
  24. Alex Rodriguez
  25. Frank Robinson
  26. Roger Connor
  27. Dick Allen
  28. Manny Ramirez
  29. Jeff Bagwell
  30. Sam Thompson
  31. Edgar Martinez
  32. Eddie Collins
  33. Harry Heilmann
  34. Honus Wagner
  35. Jim Thome
  36. Elmer Flick
  37. Jason Giambi
  38. Cap Anson
  39. Jesse Burkett
  40. Eddie Mathews
  41. Nap Lajoie
  42. Ralph Kiner
  43. Rickey Henderson
  44. Mike Schmidt
  45. Duke Snider
  46. Chipper Jones
  47. Joe Morgan
  48. Willie McCovey
  49. Gary Sheffield
  50. Lance Berkman
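
For anyone who wants to reproduce a ranking like this from their own data, here is a rough Python sketch of the filter-and-sort step. The `Career` class, the `rank` function, and the four placeholder players with their numbers are all hypothetical; only the 6,000 plate appearance minimum and the runs-per-plate-appearance rate come from the description above.

```python
from dataclasses import dataclass

MIN_PA = 6000  # minimum career plate appearances to qualify


@dataclass
class Career:
    name: str
    offensive_rar: float      # position-independent offensive RAR
    plate_appearances: int

    @property
    def rate(self):
        """Offensive RAR per plate appearance."""
        return self.offensive_rar / self.plate_appearances


def rank(careers, min_pa=MIN_PA):
    """Drop careers below the PA minimum, then sort by rate, best first."""
    eligible = [c for c in careers if c.plate_appearances >= min_pa]
    return sorted(eligible, key=lambda c: c.rate, reverse=True)


# Placeholder careers with invented totals.
careers = [
    Career("Player A", 900.0, 10000),
    Career("Player B", 600.0, 6000),
    Career("Player C", 700.0, 5000),   # below the minimum, so excluded
    Career("Player D", 400.0, 8000),
]

ranking = rank(careers)
for place, c in enumerate(ranking, start=1):
    print(f"{place}. {c.name}: {c.rate:.4f} runs per PA")
```

Player C has the best rate of the four in this toy data but never reaches the cutoff, which is the same reason a minimum is needed for the real list.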

If you're interested in the actual numbers, you can view the spreadsheet here.

A few observations:
  • Obviously, this is a rate stat, so it favors players with short or incomplete careers. It will be interesting to see if Albert Pujols can sustain his top ten ranking over the remainder of his career.
  • No two of the top eight peaked in the same decade. This may be an indication of this particular metric's effectiveness at adjusting for era. Taken at face value, Dan Brouthers's numbers can't possibly compare to Bonds's. Bonds hit over 650 more home runs than Brouthers, had a higher on-base percentage and stole twice as many bases, yet Brouthers, who played in the power-deficient 19th century, lands right behind him on the list.
  • The list seems fairly representative of all eras, including 19th century and modern players, with 10 beginning their careers prior to 1900, and 14 having played at least part of their careers in the 21st century.
  • Other than the 19th-century players, with whom most people aren't as familiar, I'd say the list comes pretty close to reinforcing our preconceived notions about how these players stack up against each other. Obviously, different people will have differing opinions, but by how much? Do any of these rankings seem significantly out of whack?
I'm certainly not saying this is the only way we should evaluate players. But, at the same time, there's really no way to compare players who played in entirely different eras without using statistics that are era-adjusted. Statistics such as Wins Above Replacement, and its components, provide us with an excellent starting point for making such comparisons.


  1. Fascinating analysis! The most compelling point about it may be how, as you note, it spans all eras of the game.

    Unless I've overlooked someone, it appears Dick Allen is the highest ranked former player on the list who's not in the Baseball Hall of Fame. Curious what you (and any others) think about whether he should be elected.

  2. Thanks Dan!

    Technically, Mark McGwire ranks highest among Hall of Fame eligibles (so, not counting Joe Jackson and those who retired after 2005) who aren't in the Hall, but I could see how you'd think of him in a different category. That list would be:

    1. Mark McGwire
    2. Dick Allen
    3. Jeff Bagwell
    4. Edgar Martinez
    5. Jack Fournier
    6. Harry Stovey

    If I were kicking everyone out of the Hall of Fame and rebuilding its membership from scratch, Dick Allen would definitely have a place. If you're interested in that sort of thing, check out this series on Beyond the Box Score.

    So, yes, I think Allen's a Hall of Famer, but since the Hall isn't actually being rebuilt from scratch, there are others I would advocate for ahead of him.

  3. Overlooking McGwire was a bit of a Steroidian slip on my part.

  4. Do you think WAR is more valuable than OPS+ or offensive winning percentage? Those are strong stats as well. Thanks.