Lone Star Ball: An SB Nation Community


Run Distribution-20 games

Ok, everyone, I've been around for a while, but this is my first post with graphs/pictures, so go easy on me if I don't do this correctly.

I was reading an interesting post over at fangraphs.com about run distribution.  In it they link to a previous article by Dave Studeman at The Hardball Times, which I read, but which is a bit more involved.  The basic concept gets at something that I've seen Rangers fans discuss for a while:  total runs scored is a good way to tell whether an offense is good, but it always seems that the Rangers offense isn't quite as good as its run total suggests.  Now, undoubtedly, part of this is due to the RBiA effect.  These two links come up with another aspect of offense to consider:  variance.  In other words, sure, scoring 30 runs in a game is great, but it doesn't help you the next game when you only score 2 runs.  To chart this, Studeman looked at teams from 2000-2004 to see how many times each team scored a particular number of runs.

"For instance, if your league averaged five runs a game, and your team scored exactly five runs in every game, it would typically have a .600 winning percentage instead of .500, even though it had scored the average number of runs. That is the power of looking at distributions instead of averages."-Dave Studeman
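The idea in that quote can be checked with a few lines of Python. This is a minimal sketch, not Studeman's method: the function name `win_pct_when_always_scoring`, the invented `toy_allowed` game counts, and the choice to count a game tied after nine innings as half a win are all assumptions made for illustration.

```python
def win_pct_when_always_scoring(runs, allowed_counts):
    """Win rate for a team that scores exactly `runs` in every game.

    allowed_counts maps a runs-allowed total to the number of games with
    that total. A game tied after nine innings counts as half a win,
    which is a simplifying assumption.
    """
    games = sum(allowed_counts.values())
    wins = sum(n for allowed, n in allowed_counts.items() if allowed < runs)
    ties = allowed_counts.get(runs, 0)
    return (wins + ties / 2) / games

# Invented, right-skewed opposition: 20 games averaging 5.0 runs.
# These counts are hypothetical; they are not Studeman's data.
toy_allowed = {2: 4, 3: 4, 4: 4, 5: 2, 6: 2, 8: 2, 13: 2}

average = sum(r * n for r, n in toy_allowed.items()) / sum(toy_allowed.values())
print(average)
print(win_pct_when_always_scoring(5, toy_allowed))
```

With these invented counts the opposition averages 5.0 runs, yet a team scoring exactly 5 every night comes out above .500, because most games sit below the average while a few blowouts pull it up.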

To make a long story short (though the article is very interesting if you have the time), the most important runs scored (in terms of getting wins) are the 2nd-5th runs scored, followed by the 6th and 7th runs.  What I wanted to know was, how does this hold for the Rangers?  First, here is the Rangers runs scored distribution (Rangers) and runs allowed distribution (Opponent):

[Figure: Rangers runs scored (Rangers) and runs allowed (Opponent) distributions; image 1zqvxgn_medium via i40.tinypic.com]

Now then, assuming that worked, what does that mean?  For those who didn't go through the Studeman article, the biggest thing to note is that this is skewed to the right.  Obviously, small sample sizes apply.  Anyway, a large part of the right skew is due to the fact that the Rangers simply score (and allow) a lot of runs.  However, it is also worth noting that the Rangers have scored exactly 5 or 6 runs more often than any other totals this season (40% of games combined), more often than they have scored fewer than 4 runs (25% combined).  This is good in the sense that scoring more runs is good, but it does lend some credence to the idea that the Rangers are boom or bust with their run-scoring.
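For anyone who wants to build the same kind of tally from a game log, here is a minimal sketch. The function names and `toy_log` are made up for illustration; the log is invented only so that its shares echo the 40% and 25% figures mentioned above, and it is not the Rangers' actual game-by-game scoring.

```python
from collections import Counter

def run_distribution(runs_per_game):
    """Count how many games ended with each run total, lowest total first."""
    counts = Counter(runs_per_game)
    return dict(sorted(counts.items()))

def share_of_games(runs_per_game, low, high):
    """Fraction of games with a run total between low and high, inclusive."""
    hits = sum(1 for runs in runs_per_game if low <= runs <= high)
    return hits / len(runs_per_game)

# Hypothetical 20-game log, invented for illustration.
toy_log = [5, 6, 5, 6, 5, 6, 5, 6, 1, 2, 3, 2, 3, 4, 4, 7, 8, 9, 11, 12]

print(run_distribution(toy_log))
print(share_of_games(toy_log, 5, 6))  # games with 5 or 6 runs
print(share_of_games(toy_log, 0, 3))  # games with fewer than 4 runs
```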

Another topic I wanted to delve into concerns the fact that the Rangers allow runs in a similar manner to how they score them, namely a lot.  So, how does this change the importance of the number of runs scored from earlier (2-5 are most important, then 6 and 7)?  This is pretty striking, especially compared to league average, but remember that we're only 20 games in:

[Figure: cumulative share of wins by runs scored, Rangers compared with league average; image 16h9clf_medium via i42.tinypic.com]

This is only a cumulative look at wins.  It doesn't mean that the Rangers win 90% of the games they score 12 runs in.  It means that in Rangers wins, they have scored 12 or fewer runs 90% of the time.  Why is that useful (since most teams have a high winning percentage when they score 12 or fewer runs)?  Compared to the league average winning percentage for 2000-2004, the Rangers don't win very many (or any so far) games when they don't score 5 runs.
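To make the "cumulative look at wins" concrete, here is a minimal sketch of how such a curve can be computed. The function name `cumulative_win_share` and the `toy_games` list are hypothetical; the results are invented and are not the Rangers' record.

```python
def cumulative_win_share(games):
    """Share of a team's wins in which it scored k runs or fewer, for each k.

    games is a list of (runs_scored, won) pairs. Losses are ignored,
    because the curve only describes how the wins are distributed.
    """
    win_runs = [runs for runs, won in games if won]
    if not win_runs:
        return {}
    total = len(win_runs)
    return {
        k: sum(1 for runs in win_runs if runs <= k) / total
        for k in range(max(win_runs) + 1)
    }

# Invented results: five wins and four losses.
toy_games = [(5, True), (6, True), (6, True), (8, True), (12, True),
             (1, False), (3, False), (4, False), (7, False)]

curve = cumulative_win_share(toy_games)
print(curve[4], curve[6], curve[12])
```

A value of 0 at 4 runs in this toy example means none of the wins came with 4 or fewer runs scored. It says nothing about how often the team wins when it scores 4.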

As a final note to this novel:  It could be argued that the Rangers feel the need to swing for the fences so much because their pitching is bad and thus they have to score 5 runs in order to win.  Remember, though, that consistently scoring an average number of runs results in more wins (usually) than scoring the same number of runs with more variance.  There are some teams who are the exact opposite.  They score fewer runs, but get more wins out of the runs they do score.  While a majority of that is good pitching and defense, we can now see that there is another aspect that matters: consistency.


Comments


I take the second plot to indicate

that you will win ~60% of games with 5 runs (league average), but Texas will win ~30%… maybe after the curve smooths out over time, it will be ~40%.

by bhudson on Apr 30, 2009 8:18 AM CDT

now reading the last few paragraphs...

it looks like I’m exactly wrong.

by bhudson on Apr 30, 2009 8:19 AM CDT

Again...

..small sample sizes are in play. But the way you word it is important. 50% of a normal team’s wins will come from games where they score 4 runs or fewer. None of the Rangers’ wins so far have come from that set of games.

by GhettoBear04 on Apr 30, 2009 9:17 AM CDT

This is my favorite type of analysis

And it angers me that people don’t do it more. I’d give you five rec’s if I could.

The marginal value of additional runs scored is not linear.

If you look at the bottom graph, league average, you see that each of the first five runs scored gives you about an additional 12% chance of winning. But that number begins to drop off substantially (the slope flattens out), meaning that the difference between 5 runs and 3 runs (roughly a 25% boost of winning percentage) is far greater than the difference between 6 runs and 8 runs (~15% boost).

This is a huge effect. This basically means that once you have the Rangers quality of offense, adding more offense gives you diminishing returns, whereas improving pitching or fielding (moving the runs allowed to the left) gives you accelerated returns.

As far as I know, almost no “sophisticated” analysis of players takes this into account. Quantifying a guy’s value to a specific team by adding his defensive value and offensive value is inadequate if the team is operating in this nonlinear range (like the Rangers tend to be). It makes more sense for the Rangers to add a player who decreases runs allowed than an alternate player who increases runs scored by the same amount.

by JBImaknee on Apr 30, 2009 12:02 PM CDT

However,

you don’t add runs linearly either. Put a big bat in a lineup full of hitters, and it adds many more runs than if you put a big bat in a weak offense.

Basically, the marginal runs per reduced out goes up.

4/10/09 - Josh Hamilton's last walk.
"You know a pitching prospect isn't any good if John Daniels doesn't trade him away but keeps him insteaad." - http://crops.mlblogs.com/
"You probably can throw Neftali Feliz on that group of overblown Rangers pitching prospect failures." - http://crops.mlblogs.com/

by DJCahill on Apr 30, 2009 12:30 PM CDT

That's a good point

hitters aren’t independent of one another (whereas fielders and pitchers for the most part are). Their marginal value to the team is determined by the other hitters. I like that.

But does this make a prediction about the distributions of runs scored – that shifting the runs scored distribution to the right should give a fatter right tail than would be expected from a random distribution? Basically, the different runs scored values aren’t independent, it’s easier to score the 7th run in a game if you’ve already scored 6 compared to scoring the 2nd run if you’ve already scored 1?

This makes sense intuitively – in that runs aren’t independent (tend to be scored in clumps), and that bullpen usage varies by situation; good bullpen arms are far more likely to appear in 3-2 games than 10-3 games.

I think this point is related to your point but not exactly what you are referring to.

by JBImaknee on Apr 30, 2009 12:45 PM CDT

What about variance?

Adding a good batter to a good lineup increases the expected number of runs in any given game.

What’s more important, I would think, would be increasing your expected number of runs, without significantly increasing the variance.

It’s better to have 90% of your run-scored games come between 3-7 than to have 90% of your games come between 1 – 9.

I think the type of batter — high OBP low SLG, low OBP high SLG as examples — would have different impacts on the variance even if they had identical impacts on the expected value.

My expectation would be that high-SLG, low-OBP players would increase the variance more than high-OBP, low-SLG players.
In my opinion, an .800 OPS built from a .400 OBP and a .400 SLG would be more valuable than an .800 OPS built from a .350 OBP and a .450 SLG.

by Trickman on Apr 30, 2009 12:46 PM CDT
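The comparison in the comment above (same expected runs, different spread) can be put into numbers. This is a minimal sketch under stated assumptions: the run totals in `narrow`, `wide`, and `toy_allowed` are invented, the function name `expected_win_pct` is made up, scoring and run prevention are treated as independent, and a game tied after nine innings counts as half a win.

```python
from statistics import mean, pvariance

def expected_win_pct(scored_games, allowed_games):
    """Average result over every pairing of a scored total with an allowed total.

    Assumes scoring and run prevention are independent, and counts a
    game tied after nine innings as half a win.
    """
    points = 0.0
    for scored in scored_games:
        for allowed in allowed_games:
            if scored > allowed:
                points += 1.0
            elif scored == allowed:
                points += 0.5
    return points / (len(scored_games) * len(allowed_games))

# Invented run totals: both offenses average 5 runs a game.
narrow = [3, 4, 5, 6, 7]
wide = [1, 2, 3, 4, 5, 6, 7, 8, 9]
# Invented opposition: 20 games averaging 5.0 runs allowed, with a blowout tail.
toy_allowed = [2] * 4 + [3] * 4 + [4] * 4 + [5] * 2 + [6] * 2 + [8] * 2 + [13] * 2

for name, offense in (("narrow", narrow), ("wide", wide)):
    print(name, mean(offense), pvariance(offense),
          expected_win_pct(offense, toy_allowed))
```

With these invented numbers the narrow offense wins more often than the wide one despite the identical average. That is not guaranteed for every pair of distributions; it depends on where the opposition's run totals cluster.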

yeah

High slugging, low OBP guys are the definition of high variance players. (1-OBP) is basically the rate of getting out. So a guy slugging .600 with an OBP of .350 is basically saying that he’s going to have a high value AB or a negative value AB.

The problem is how this works at a team level. How does the variance within one batter affect the variance of the team? It may be that 9 high variance people cancel each other out, whereas one or two high variance people surrounding mostly low variance guys is the most variable state.

by JBImaknee on Apr 30, 2009 12:52 PM CDT

Totally agreed.

And I frequently think this is the primary cause of our run-scored frustrations. Let’s see:
Marlon Byrd: .333/.338/.530 through 68 PAs.
Hank Blalock: .267/.278/.573 through 79 PAs.

Even our better OBP guys like Ian and Michael trail Andruw Jones for walks (eight to 11). Murphy has as many walks as Ian and Michael, too.

As much as I love the power slugging nights, if Nelson Cruz can really be an on-base guy and Marlon/Hank remember that four balls get you on base too, I think we’ll be a much better team.

by jwiscarson on May 1, 2009 10:03 AM CDT

Diminishing returns

I always have in the back of my mind Bill James’ calculation (I think it was his) that you have to add about 1.15 runs scored for every 1 run allowed to increase your pythag win number. Improvements in run prevention from either pitching or defense pay bigger dividends than adding more offense.

G G G E-flat_______ F F F D__________....

by t ball on Apr 30, 2009 1:00 PM CDT

keeping pitching and defense static

wouldn’t the 6th-8th runs mean more to the Rangers’ chances of winning than the 3rd-5th if we are giving up 5 runs a game to begin with?

by kevinkinsler on Apr 30, 2009 4:48 PM CDT

By the end of the season...

I expect it to be something like 4th-6th runs, but it may end up being 5th-7th. I agree with you, though, that it should be higher than it is for most teams. That was one of the main things I was trying to get at. The reason I used 5 runs is just because of the current statistical oddity that is the Rangers not having won any games without scoring at least 5 runs.

The best support for the idea that the Rangers are an all or nothing offense would be a clear bimodal distribution in their runs scored at the end of the year. I’ll track it throughout the year and see if their variance in runs scored and the subsequent winning percentages start to resemble the league averages more.

by GhettoBear04 on Apr 30, 2009 5:02 PM CDT

Seems to me that everyone misses the true point.

The Rangers scored around 900 runs and gave up a roughly equal amount last year. According to pythagorean win theory, that is the simple reason we finished close to a .500 winning percentage. If we give up a slight bit of offense for a slight increase in defensive efficiency (let’s say plus 50 and minus 50), then our runs scored and runs allowed are still equal. Still a .500 winning percentage.

GhettoBear’s elusive concept of consistency (read his last sentence above) is chasing the chimera. It is the equivalent of hoping that our hitters don’t hit poor pitchers well on a given night (think of Kinsler’s cycle against Baltimore’s Hendrickson) and, similarly, wishing that we could have our best night at the plate against the league’s top aces (think Greinke out-pitching Millwood), just so that we could get 5 runs in every game.

I will grant this point though: Adding an extra 20 runs to a 700 runs scored, 700 runs allowed team, will benefit that team slightly more than adding 20 runs to a 900 runs scored, 900 runs allowed team. This is a natural consequence of how the pythagorean works though. It’s not really saying much.

One definition of variance is the square of the standard deviation. Do you think you can eliminate the standard deviation? I don’t see that it’s possible.

In short, variance is a natural part of the game, just like luck too.

"Evolution happened, now get over it." Michael Shermer

by rodcarew on Apr 30, 2009 9:43 PM CDT
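The Pythagorean claims in the comment above are easy to check. This is a minimal sketch that assumes Bill James's original exponent of 2 (other exponents are in common use); the function name `pythag_win_pct` is made up here, and the run totals are the round numbers from the comment, not actual season totals.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean expected winning percentage from season run totals."""
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)

# Trading 50 runs of offense for 50 runs of prevention keeps RS equal to RA.
print(pythag_win_pct(900, 900), pythag_win_pct(850, 850))

# Adding 20 runs scored to a 700/700 team versus a 900/900 team.
gain_low_scoring = pythag_win_pct(720, 700) - pythag_win_pct(700, 700)
gain_high_scoring = pythag_win_pct(920, 900) - pythag_win_pct(900, 900)
print(gain_low_scoring, gain_high_scoring)
```

Under this formula both even-run teams project to .500, and the same 20 extra runs move the lower-scoring team's expected winning percentage a little more. Note that the formula only sees season totals, so it cannot say anything about how those runs are spread across games.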

missed point

Rod, I don’t see how the point GhettoBear is trying to make has anything to do with the pythagorean win theory. GB is basically saying that if the Rangers were to score 5.5 runs/gm (5 for odd # games, 6 for even) they would win more games than if those 891 runs were randomly distributed by the Rangers offense.

An interesting graph to make would be the Rangers run distribution vs last year’s playoff teams on a bell curve. I would bet that the Rangers curve would be wider with a lower peak, indicating they scored 4-6 runs/gm in fewer games than the other teams.

Elvis Andrus - 2009 AL Rookie of the Year

by RangerMad on May 1, 2009 9:30 AM CDT

My take

“One definition of variance is the square of the standard deviation. Do you think you can eliminate the standard deviation? I don’t see that it’s possible.

In short, variance is a natural part of the game, just like luck too.”

You can change the variance/standard deviation of runs scored (and runs allowed) for a team. You can’t eliminate it, of course. But there is no fundamental law that says runs must be distributed in a certain way. And run variance most certainly has a role in translating runs to wins – click through his links up there to the Hardball Times article.

Pythagorean expectations are independent of distributions; that is a shortcoming of using pythagorean expectations, not a drawback of looking at distributions. I imagine that in most cases, the distributions of runs scored and runs allowed are comparable, thus the effects cancel out when using Pythag. wins. But in the event they are not, the expected wins will overestimate or underestimate actual wins.

by JBImaknee on May 1, 2009 10:23 AM CDT

Exactly the kind of misconception I'm talking about is:
“But there is no fundamental law that says runs must be distributed in a certain way.”

Of course there’s a fundamental law. It’s just not written down anywhere. The law goes something like this:

More runs will be scored against pitchers with high ERA’s. More runs will be scored against porous defenses. More runs will be scored when the batters have high OPS numbers. When you combine all these factors you can sometimes get strange aberrations like the game in which we scored 30 against the Orioles. Think how unlikely it would be to get 30 runs against Halladay with a weak hitting lineup.

Trying to get to 5 runs each game is chasing the chimera.

"Evolution happened, now get over it." Michael Shermer

by rodcarew on May 1, 2009 9:00 PM CDT

Huh?

First, you’re straw manning my argument by saying “Trying to get to 5 runs each game is chasing the chimera.”

I don’t get what you’re saying. That the variance in baseball is locked in place? Of course there will always be some noise. There is always noise in any system. But there isn’t a +/- 1.24623 runs associated with any given score, and we’re just throwing good money after bad for trying to make that +/- 1.21314.

In any system, the signal represents things that you can measure or control, and noise represents things that you don’t understand and have no power over. As you learn more about the cause & effect relationships, you can figure out what features of the system cause parts of that noise, and it stops being noise. Explaining variance is what scientific analysis is about, and ultimately allows you to control that variance.

by JBImaknee on May 1, 2009 10:34 PM CDT
