This is not a good study (if you can even call it that). I debated whether or not to even post it, but I decided that since most of my stuff is crap anyway, I may as well go ahead. Anyway, the theory goes something like this: The large outfield in Petco is a major reason why Greene struggles more at home than expected (or, perhaps, it isn’t). Either way, I figured we should forget about Petco and look at how he does at road parks of various sizes (like Phantom suggests). The question then arises: how do we estimate the size of an outfield? MGL used the scales on this site and a computer tracing program to do just that. I’ll do what he did and classify outfields with 106,000 square feet or under as small, and ones with 116,000 or over as large.
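For anyone who wants to redo the grouping, the cutoffs above can be sketched in a few lines of Python. The function name, the "medium" label for everything in between, and the sample areas are all made up for illustration; only the 106,000 and 116,000 square foot thresholds come from the post.

```python
# Cutoffs taken from the post; everything else here is illustrative.
SMALL_MAX_SQ_FT = 106_000
LARGE_MIN_SQ_FT = 116_000


def classify_outfield(area_sq_ft: float) -> str:
    """Bucket an outfield by its estimated area in square feet."""
    if area_sq_ft <= SMALL_MAX_SQ_FT:
        return "small"
    if area_sq_ft >= LARGE_MIN_SQ_FT:
        return "large"
    # The post does not name the middle group; "medium" is an assumption.
    return "medium"


# Hypothetical areas, NOT real park measurements.
example_areas = {"Park A": 103_500, "Park B": 110_000, "Park C": 118_200}
for park, area in example_areas.items():
    print(park, classify_outfield(area))
```

Parks landing in the middle bucket are simply left out of the small/large comparison.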
For the large parks that Greene has played in during his career, we get Arizona, Colorado, Detroit, and Washington. That’s a total of a whopping 328 PAs. In small parks (Chicago (NL), Cincinnati, Florida, Houston, Boston, Philly), Greene has racked up a measly 292 PAs. How has he hit in each?
Small parks: .223/.277/.318
Large parks: .330/.393/.625
Before you start listing the problems, here they are:
- The sample size is ridiculously small — somewhere around half a season in both cases. The variation could just be all, or mostly, randomness.
- Parks have changed. MGL’s calculations were for parks in 2007. Greene’s career numbers are used here.
- Obviously, a large outfield does not equal a pitcher’s park. Colorado and Arizona are two of the best hitters’ parks in the NL, and they’re also the largest. There are numerous other important factors like weather and air density.
- It could be other factors like pitchers/defenses faced causing much of the disparity (if it isn’t simply random variation).
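One rough way to put a number on the sample-size worry is a two-proportion z-test on OBP, using the splits quoted above (.277 in 292 PA, .393 in 328 PA). This is only a sketch under loud assumptions: it treats OBP as times on base divided by PA (the real denominator is slightly different), treats every PA as an independent coin flip, and ignores the fact that the split was picked after looking at the data. The function name is made up.

```python
import math


def two_proportion_z(p1: float, n1: int, p2: float, n2: int):
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    Assumes each plate appearance is an independent trial, which is
    a simplification.
    """
    # Approximate successes (times on base) from the rate and the PA count.
    x1, x2 = p1 * n1, p2 * n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Small-park vs. large-park OBP from the post.
z, p = two_proportion_z(0.277, 292, 0.393, 328)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

Even a small p-value here would only argue against pure chance under that simple model. It says nothing about the other problems in the list (park changes, altitude, pitchers and defenses faced).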
I’m sure there are many more … again, please don’t take this one seriously at all (not that you will). Anyway, the point is (I think) that a large outfield does not mean Greene will struggle in that park. It is obviously more than that. When Greene hits the ball in the air, his subsequent success is probably determined by multiple factors, including outfield size, weather, air density, and so on.
Petco, by the way, is the fourth-largest outfield that he’s played in. Of course, he hits just .230/.292/.377 there. What’s the main difference between the Colorado/Arizona parks and San Diego’s? Altitude and weather conditions.
Further (er, better) research is clearly needed.
The stuff you write isn’t crap. It’s really good.
I agree with Kevin. I also find it interesting that this jibes with what I thought I had perceived (although it is excruciatingly small in terms of sample sizes).
The thing about larger OFs that would seem to favor Khalil is his ability to hit gappers. One of the things I’ve noticed as Khalil has matured as a slugger is that he’s not hitting the towering drives that he did as a rookie (I think at one point he was the only major leaguer to park one on Western Metal during a game). In 2007, perhaps more than any other year, I distinctly remember Khalil hitting more line-drive-type home runs. His shots in Milwaukee in September are a good example of this.
Hmmm, THT’s LD stats don’t seem to support my LD theory, as he posted the lowest LD rate of his career in 07 (17.8%). His high was 22.1 in 05.
So I’m not really sure what to think. At this point, maybe we could try to find hitters who have a similar swing/style to Khalil and examine how they’ve done in large and small parks to get a better comp?
I was thinking the same thing as Phantom: maybe look at Khalil’s line drive/fly ball ratios and see whether hitters with vastly different ratios have more success at Petco or not. Nice work here, this is all good to think about…
Thanks a lot, guys.
I think the first thing I’m going to do is see if this is actually statistically significant (not this particularly, but the whole splits thing we’ve been talking about with Khalil). I’ve got to dust off the old statistics book there, but I’ll see what I can do. I’ve got to see if this is more than just randomness or whatever (which I assume it’s not) before I keep going along this path.
The idea about looking at similar players (in terms of balls in play) is a pretty good one, I think. That way we’d increase the sample quite a bit, while still looking at somewhat similar players. Another thing to think about down the line here.
By the way, if anyone is good at statistics, feel free to let me know how you’d figure out if Greene’s splits are significant (I think I know the basics, but I’m not sure).
Couldn’t find a place to leave a comment on your newest item. Interesting stuff, especially the projections on Wolf and Prior. Looks like the bullpen will shoulder a lot of innings again this season.
Corey, thanks for stopping by!
That’s one of the things I worried about with this theme — for some reason, it doesn’t show a comment link if there are no comments on that post. For future reference, just click the title of the post, and then the comment box will appear at the bottom.
Anyway, yeah, there’s definitely a risk with having guys like Prior and Wolf, and I’m sure the Padres know that. The caveat is that playing time projections are tough, and it’s probably (er, obviously) critical for the Padres to thoroughly investigate guys like Wolf and Prior medically. They could easily beat the playing time projections if they’re “healthier than their past innings would indicate.”
That said, I’m not expecting a ton, at least innings-wise. At least guys like Germano and Hensley are serviceable, because they’ll most likely be needed some.
Re: 4, I just took a stats class last semester so it’s pretty fresh in my head. Let me know what you want to test and I can do the legwork.
-B
Brian, believe it or not, I took an intro stats class last year. I did pretty well, but for some reason, a good portion of it didn’t stick. Basically, I want to figure out if Greene’s OPS (or, preferably, something like weightedOBA … but it doesn’t really matter) at home (below what we’d expect) is statistically significant. That is, if we expect him to have a .750 OPS at home and he’s really at .650, is that significant at like the .05 level or something?
I’m not exactly sure how you’d do it, to tell you the truth. If you know how or whatever, feel free to post it on your blog. If you want to try to walk me through it, comment here or email me if you’d like (“contact” up top). It may be time consuming or not really practical … I’m not sure. Thanks.
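For what it’s worth, here is one hedged way the “expected vs. actual at home” question could be set up. OPS is not a simple proportion, so this sketch swaps in OBP and runs a one-sided exact binomial test: given an expected OBP, how likely is it to reach base this few times or fewer in that many PA? The function names, the 1,000 PA count, and the .320 expected OBP are placeholders, not real figures; only the .292 home OBP comes from the post.

```python
import math


def log_binom_pmf(k: int, n: int, p: float) -> float:
    """Log of the binomial probability of exactly k successes in n trials."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))


def binom_lower_tail(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), with 0 < p < 1."""
    return sum(math.exp(log_binom_pmf(i, n, p)) for i in range(k + 1))


# Placeholder inputs: PA count and expected OBP are assumptions.
home_pa = 1_000
expected_obp = 0.320
observed_obp = 0.292  # home OBP quoted in the post
times_on_base = round(observed_obp * home_pa)

p_value = binom_lower_tail(times_on_base, home_pa, expected_obp)
print(f"one-sided p = {p_value:.4f}; significant at .05: {p_value < 0.05}")
```

The hard part is the expected number, not the test: that has to come from some projection of what he should hit at home, and the answer is only as good as that projection.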
Re: 8, it looks like we’d have to build a formula (or, more likely, formulas) to project Khalil’s home performance and then check it against that. This is actually something I’ve been wanting to work on for a while (i.e. building my own projection system). My school provides me with some software that makes this a lot easier to work on. I have some ideas on how to make this work, so you’ll definitely see me work on this in the near future.
Brian, I’ve always just wanted to put together Marcels, which is the “basic” system Tango does. Anyway, I don’t know how to do much of the coding part, but if you ever need some help or ideas, let me know.