10 Lessons: Creating a Baseball Simulator

For those of us who are interested in baseball statistics, The Hardball Times (THT) is a consistent source of thought-provoking analysis and commentary on the game.

Of course, we in the hobby know that there is a large segment of hobbyists who remain steadfast in their support of Batting Average and RBI for hitters and Wins and ERA for pitchers. These stalwarts remain unimpressed by the newest, hottest Sabermetric data points, and THT might not always be their cup of tea.

In April, however, Matt Hunter of The Hardball Times wrote an article that any fan of baseball simulations can appreciate.

10 Lessons I Learned from Creating a Baseball Simulator

As anyone who’s ever thought they could do it better than the pros could tell you, designing a baseball game isn’t easy.

From the article:

While making my sim, I quickly realized that each additional variable I included would add a layer of complexity and work, complexity that compounded each time I made a change.

I designed my first game at age 10. It was a very basic simulation. I rated players based on how well I wanted them to do in my league – Kevin Maas was a huge star, if that tells you anything. But I loved the game. I played literally thousands of games with it over the years and my league had a 30 season history, with a Hall of Fame, baseball cards, everything.

I’m curious about the homemade games you played. What are your experiences in designing tabletop-sports simulations? Leave them in the comments, please!



  1. I too have written a baseball SIM called Statarama Baseball (SARBB); it’s free and available at http://www.statarama.com. Having gone through the exact process Matt describes, I can tell you he’s correct on all points.

    I too had the idea that baseball was simple, until I tried to capture all the variables and events and generate the probabilistic event outcomes. Then I quickly found out how very difficult it is. In my eyes it all comes down to the Central Limit Theorem and the way that distributions approximate normality as sample sizes increase. For example, a team has 104 wins in one season simulation, and in the very next one they win 88. Two replays are simply not enough to determine the real level of probabilistic success for an entire team, and thus the results seem inaccurate. One needs far more season sims than this (i.e., a greater sample size) to actually see the dominance of a team in the data.
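The swing Jeff describes between two replays can be illustrated with a rough sketch. It assumes, purely for illustration, that a team wins each of 162 games independently with a fixed probability; the function names and the 0.6 win probability are hypothetical and are not taken from Statarama Baseball.

```python
import random
import statistics


def simulate_season_wins(win_prob, games=162, rng=random):
    """Count wins in one season where every game is an independent coin flip."""
    return sum(1 for _ in range(games) if rng.random() < win_prob)


def replay_seasons(win_prob, n_seasons, seed=None):
    """Replay the same season n_seasons times and return the win totals."""
    rng = random.Random(seed)
    return [simulate_season_wins(win_prob, rng=rng) for _ in range(n_seasons)]


if __name__ == "__main__":
    # Two replays can land far apart even though the team never changes.
    print("Two replays:", replay_seasons(0.6, 2))

    # Many replays reveal the underlying level and the typical spread.
    wins = replay_seasons(0.6, 1000)
    print("Mean wins:", round(statistics.mean(wins), 1))
    print("Std dev:  ", round(statistics.stdev(wins), 1))
```

Under these assumptions the average settles near 97 wins with a standard deviation of about 6, so a gap of ten or more wins between two single replays is not unusual.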

    This being said, my algorithm for my game was to generate a binary random value to determine if the pitcher or batter stats would determine the outcome for an at-bat; then generate a random value from 1 to 999, and compare that to the batter's average or the pitcher's opponent-average. If it's less than or equal to the batter's average, and the batter was the binary choice, then a 'hit' occurs (otherwise it's an out), and the same for the pitcher using opponent batting average. After a 'hit' or 'out' occurs, I just use random numbers to generate what type of out or what type of hit has resulted, and the statistics are pretty well distributed and realistic.
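The at-bat procedure described above can be sketched roughly as follows. This is a minimal interpretation, not Jeff's actual code: it assumes the binary choice is a 50/50 draw and that averages are stored as three-digit integers (285 for .285). The step that picks the type of hit or out is left out because the comment does not say how it works.

```python
import random


def resolve_at_bat(batter_avg, pitcher_opp_avg, rng=random):
    """Return 'hit' or 'out' for a single at-bat.

    batter_avg and pitcher_opp_avg are integers such as 285 for .285
    (an assumed representation, chosen to match the 1-999 roll).
    """
    # Binary draw: does the batter's or the pitcher's number decide this at-bat?
    use_batter = rng.random() < 0.5
    threshold = batter_avg if use_batter else pitcher_opp_avg

    # Roll 1-999; a roll at or below the chosen average is a hit.
    roll = rng.randint(1, 999)
    return "hit" if roll <= threshold else "out"


if __name__ == "__main__":
    # A .300 hitter facing a pitcher who allows a .240 average.
    trials = 100_000
    hits = sum(resolve_at_bat(300, 240) == "hit" for _ in range(trials))
    print("Simulated average:", round(hits / trials, 3))
```

With an even split between the two numbers, the long-run hit rate lands about halfway between the batter's average and the pitcher's opponent average, around .270 in this example.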

    I did it this way due to the fact that 'real predictability' for an outcome would need to consider a wide variety of variables, such as the 'missing variables' Matt lists in his article [see Lesson #2 in the article]. The bottom line is that a player with a higher average will get more hits, and a pitcher with a lower opponent batting average will generate more outs. This system works well at a fundamental level and does not require a huge sample size to produce fairly accurate 'global outcomes', such as who wins the game.

    It's still fun to play, and I only designed it to test out my algorithm/idea. Essentially, I've shifted my focus to a free Virtual Game Board for those who like to use the actual physical SIM game cards from Strat-O-Matic, Statis Pro, APBA, etc., and would like to use a ballpark background on a PC screen and drag the game board pieces around on-screen. I think I'm done with 'developing' the SARBB program for the most part; it's gotten far too complex and it's really a huge undertaking due to many of the points Matt makes in the article.

    — Jeff —

    • Jeff – this is really interesting stuff, thank you for offering your perspective. I downloaded your Virtual Game Board and I will be trying your Statarama game, too. A Virtual Game Board is a really good idea, and I’m sure there are many C&D baseball gamers that would appreciate and use such a product. I’ll be watching your site for further developments!

      Thank you for the comment!

      • Thanks Paul. I’ll keep updating them if people find them fun to use. The VGB was designed for people who like to transfer their paper-and-pencil stats to a spreadsheet and work with their data there. I used to do a lot of that. The baseball game builds on the same idea of giving the user more control over the data. I’m getting ready to put a new version of Statarama Baseball out there that has a couple of fixes and a new feature that hides the batter and pitcher data on the interface so that much more of the ballpark is visible at the top. I have not put in an ‘auto update’ feature that checks for updated versions because I don’t like those ‘nag’ features very much. I may put one on the menu at some point, and users could check when they wanted to.

      • Paul:
        I forgot to mention…. If you’re going to play Strat-O-Matic with the VGB, let me know. I can send you JPG files of the SOM charts I have and you won’t have to recreate them on your own.

  2. I played these games as a child. I bought APBA and played it with my brother in 1962; he said he used to cheat. I got dissatisfied with APBA as an adult, tried SOM and several others, then decided to make my own game, which I called LF Baseball and sold by mail order for 2 years. Then a fan of my game who was in a Replay league published my game for a year because he liked it better than Replay and I had quit. Now I play my own game but don’t sell it. I am currently playing out a 75-game season using players from 1885-1889. You would be familiar with Cap Anson and Pete Browning, the original Louisville Slugger. I self-publish books, sold on Amazon, describing the seasons I have played recently. If you wish to correspond, you can reach me at LFBNO7@AOL.COM. My name is Len Feder.

    • Hi Len – I have heard of LF Baseball but never actually seen the game. I’ve also seen your books on Amazon. Thank you so much for reading and commenting on the blog. I will be emailing you!

      All the best,

      • Hi Paul. This is my favorite hobby, and I would enjoy corresponding with anyone else who likes playing baseball sim games. Every day I play another one or two games.

  3. One of the most basic and silliest games I ever created as a kid was a hockey game. I wanted a hockey game, and I wanted a game where I could rate my own teams. At the time I had recently purchased Go For the Green, and it came with these really cool dice that gave you a number from 10 to 39. The numbers were weighted significantly (for those who have never seen the dice). You’ll roll something in the 10s 1/6 of the time, in the 20s 2/6 of the time and in the 30s 50% of the time. The ones die was created so that I think a 5 was the most common roll (maybe it’s a 4, I have to check to be honest), so a roll of 35 was your most common roll, and a 19 would come up once every 216 rolls.

    I thought these were the coolest dice, so I had to make a game out of them. I purchased what basically was a hockey Who’s Who book and I created a system where players were rated for goal scoring and penalties. That’s about it. To be honest, I can’t even remember how I tracked goalies – probably a +1 to -1 adjustment. The goal range was figured based on how many goals a guy scored per game. I want to say it was based on tenths: 0.01 to 0.10 got a goal range of 10, 0.11 to 0.20 got a goal range of 10-11, etc.
    Well, I guess you can see the issues I had with this system. The guys averaging hardly any goals hardly ever scored, since a 10 had a 2/216 chance of being rolled. Even I understood the imbalance of the system at the time. I knew that guys who averaged 0.4 goals a game would have a ridiculously higher chance of scoring, compared with someone who averaged 0.2 goals a game, than they should, but I kind of had fun with it and rolled with it. It’s actually one of the only games I ever played with friends. I got a friend addicted for about a week and we played 4-5 games.

    A few years later, when Wayne Gretzky scored 92 goals in 80 games, I found those cards and even played a game or two with a “new” Wayne Gretzky card. His range actually hit the 20s, and if I had played a full season, he probably would’ve scored about 150 goals.

    It wasn’t accurate, but it was a ton of fun, and there were hardly any rules. If a player had a penalty, I just yanked him off the ice and kept the others on the ice. When the puck went to the open spot, that meant someone on the other team had a shot. I don’t think I even created an adjustment for shots on the power play – just figured having one extra man on the ice was enough.

    I don’t even think I really had line changes – just eventually changed players when I felt like it.

    It’s too bad I didn’t keep those cards, the fast action cards or the game sheets of the games I played. It would be fun to see them now.

  4. Ever since my early teens, I’ve enjoyed the thrill of a pennant race. Among other systems I came up with, my favorite was this one. I really didn’t have a name for it, but here’s how it went. You would list all the teams down the left side of a paper by division. You would write a number (1-100) to the left of each team. This number would be their winning percentage for that season rounded to two digits. So say you’re replaying the 1985 season: the Phillies would be 46 and the Cardinals 62. I would simply replay the MLB schedule for that year, and here’s the formula. If the Phils played the Cards, you would subtract 46 from 62. That’s 16. You then add that number to 50 to get 66. Using a random number generator to get a number from 1-100, if the number came out 66 or lower the Cards won. It’s always the same formula: subtract lower from higher, then add to 50. This was the chance of the better team winning. It always came out very accurate. There were hot streaks and slumps (which tells me that many streaks and slumps are just chance). For example, I replayed that season and the Cards started 2-8 but ended the season with 100 wins. It sounds pretty nerdy, but it gave me hours of entertainment while other kids were playing video games. I created other, more complex systems but would always come back to this simpler system.
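The pennant-race formula above (subtract the lower rating from the higher, add 50, then roll 1-100) can be sketched as below. The function names are hypothetical, and the sketch deliberately includes no home-field adjustment because the comment describes none.

```python
import random


def favorite_win_chance(pct_a, pct_b):
    """Chance out of 100 that the higher-rated team wins: 50 plus the gap."""
    return 50 + abs(pct_a - pct_b)


def play_game(team_a, pct_a, team_b, pct_b, rng=random):
    """Play one game and return the winner's name.

    pct_a and pct_b are two-digit winning percentages, e.g. 62 for .62.
    If the ratings are equal, team_a is treated as the "favorite" at 50/50.
    """
    if pct_a >= pct_b:
        favorite, underdog = team_a, team_b
    else:
        favorite, underdog = team_b, team_a

    # Roll 1-100; at or below the favorite's chance means the favorite wins.
    roll = rng.randint(1, 100)
    return favorite if roll <= favorite_win_chance(pct_a, pct_b) else underdog


if __name__ == "__main__":
    # The 1985 example from the comment: Phillies 46, Cardinals 62.
    print("Cardinals' chance:", favorite_win_chance(46, 62))
    print("Winner:", play_game("Phillies", 46, "Cardinals", 62))
```

For the Phillies (46) against the Cardinals (62), `favorite_win_chance` gives 66, matching the worked example in the comment.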

    • I made up a season using players from 1876-1879, with each player getting a card with dice spin probabilities, and played out a season. The Chicago White Stockings won the pennant. Then I did one using 1880-1884 and the Providence Grays won the pennant. Now I’m 2/3 done with my 1885-1889 season and the Detroit Wolverines have a 1 game lead over the Chicago White Stockings, with the Philadelphia Athletics 2 games out. I published books showing my season results, available on Amazon. You can find me there. Len Feder

    • Very interesting, but I think you must have an advantage for home teams.
      What do you think?

    • What do you do for the home team?
