projections

PECOTA on MLB Network

March 9th, 2009  |  Published in Sabermetrics, links, media, projections

by Myron Logan

Here’s the video:

I don’t think the discussion is quite as bad as it is being portrayed on BTF. I mean, some valid points were made. But, yeah, overall, there was a little too much negativity towards sabermetrics from the panelists, outside of Matt Vasgersian, and a little too much faith put into the evaluations of the scouts. And nobody – again, outside of probably Vasgersian – seemed to realize that both things can (and do) coexist. It isn’t one or the other. Couple of random thoughts:

1. The intro to PECOTA was great. It was very concise, but hit on all the main points. Perfect for a mainstream audience, but detailed enough to give people an idea of what PECOTA is really  about.

2. Vasgersian, as we know from his time with the Padres, is a very bright, reasonable guy. What a great hire for MLB Network.

3. The phrase “touch and feel” was said way too much. I think Barry Larkin started it, and it kind of caught on.

It is nice to see that MLB Network is at least going to discuss this stuff. Next time, maybe they will get a couple of guys more familiar with PECOTA, or sabermetrics in general, to argue on its behalf. It seems like they’d have the resources to do this, and it would make for more interesting debate, rather than simply going with three ex-ball players who share a similar opinion on the issue.

So far, though, I’ve been a big fan of MLB Network. I don’t even mind Harold Reynolds, only because he reminds me of when I used to watch Baseball Tonight every night, back in my earlier years. Good times ….

Cameron on college stats

March 4th, 2009  |  Published in College baseball, Sabermetrics, baseball, links, projections

by Myron Logan

Dave Cameron writes so much about baseball that if you’re not disagreeing with him once or twice a week, you probably aren’t paying attention. That is, of course, meant as a compliment to Dave, as he is right on the money most of the time.

But I don’t think I agree with this post on the utility of college baseball stats. First, he talks about the hazards:

“From the use of metal bats, the huge variances in quality of opponents, some parks that heavily impact run environments, and the smaller sample of games played, there are all kinds of adjustments that need to be made to try to translate NCAA statistics into something that resembles context-neutral. And, once you’ve done all that work, there is still limited value in the numbers.”

Those are all good points, I think, but I don’t see why they render the stats useless. The two biggest adjustments to make are probably the park and quality of opponents, and that can certainly be done. Those adjustments don’t necessarily make the stats predictive, but they are a step in the right direction.

Anyway, Cameron later goes on to say this:

“Good hitting prospects hit well in NCAA ball, but so do less good hitting prospects, and just using numbers, it’s basically impossible to tell them apart. We’re big fans of statistical analysis here, obviously, but we also need to know the limits of what numbers can tell us. When it comes to college performances, scouting reports are what you want - the guys hitting the fields everyday and looking at swings and athleticism do a better job of predicting which college players will hit in the majors and which ones won’t.”

I’m just not sure that is true. Brian Cartwright has some interesting stuff on the projection side. After the stats are translated, he finds that most players perform relatively similarly in college, the minors, and the majors.

If you read one of my Q&A’s with Chris Long, I think you’ll get the sense that he *certainly* does not ignore college numbers, or even put them on the back burner.

It’s also clearly apparent that scouting plays a big role. I think the best projection system for college players would involve combining both adjusted statistics and scouting reports, in some fashion. The only people really able to do that, at this point, are guys like Long, who have the access to tons of scouting reports that we really don’t. You can add things to the adjusted numbers like body type, swing type, bat speed, etc., sort of like PECOTA, combining numbers, physical traits, and actual scouting. I have no idea if this is actually being done, but I’d guess that someone is doing it.

Anyway, I don’t have a clear answer as to how to weight the stats and the scouting. I’m not sure anybody does. But I think it’d be just about as silly to ignore the numbers as it would be to ignore the scouting reports. And when you have both, like they definitely do in front offices, and like we sort of do with Baseball America-type sites, why ignore either?

CHONE on the NL West

February 21st, 2009  |  Published in Padres, baseball, links, projections

by Myron Logan

Rally has posted his CHONE team projections for the NL West, and it has the Padres at second in the division with an 80-82 record. Now, second place might not mean that much, as there’s a difference of three wins between second and last (and a difference of five between first and last in this remarkably close division). But, still, 80 wins is better than I expected from any projection system, and CHONE is very well respected.

There’s some additional discussion at BTF, with a good portion of it surrounding the Pads’ surprising win total. I agree with the posters there that 80 seems a bit high, but hey, what do I know? One reason we do these projections is because our perceptions aren’t always on the money, especially when we’re talking about 25+ players and how they are going to perform in the coming year. That said, we should probably weigh this appropriately with other systems, and add in our intuition/additional information when we can. But, still, it’s good news, and I think we can use some of that.

UZR Updates

February 11th, 2009  |  Published in Mike Rogers, Sabermetrics, contracts, fielding, player evaluation, projections

by Mike Rogers

Fangraphs just keeps getting better. They have now updated the UZR defensive numbers to include outfield arms and double play runs. Back when The Hardball Times updated their 2008 outfield arms data, Myron looked at it, and it helped bolster his idea that Brian Giles should be moved out of right field and switched to the opposite corner. So, let’s take a look at Brian Giles now with the UZR outfield arms update.

Brian Giles’ arm is bad. Like, on the extreme end of the worst bad. I’m talkin’ -19.5 runs bad over the last three years. When adding outfield arms to UZR was first being bandied about, it was said that it only really affects the guys at the extremes: the very good (Jeff Francoeur) and the very bad (Brian Giles). Giles’ arm has averaged -6.5 runs against his defensive value from 2006-08, and that’s without weighting recent seasons more heavily, which would make it look worse since he’s declined in each of the last three years: -4.2 in 2006, -5.8 in 2007 and -9.4(!) last year. So, let’s just call it -6.5 runs, over his average of 140 games played in those three years. That would then become about -6.9 runs, or we’ll just call it -7.

Defensively, as I noted in the comments of Myron’s post I linked to earlier, the arms ratings really put a dent in Giles’ overall value. My comment noted that without arms ratings his defense is +4.42 over the last 4 years. Run that through a Wins Above Replacement conversion using CHONE’s projected .346 wOBA (and a league average of .332), and I get +8.52 on offense, +4.4 on defense, +20 for replacement level and -7.5 for the positional adjustment, which converts to 2.4 WAR. Multiply by 0.85 to account for playing time and that’s 2.06 WAR, a bit above average.

However, if you account for his arms ratings, and keep it on the 4-year average like I used in my comment, his 4-year arms rating comes out to -19 (2005 was +0.5 for him in RF). Averaged out, that’s -4.75 runs per year with his arm. Run this through the WAR conversion and he drops from 2.4 to 1.97 WAR (before the playing time adjustment). A one-year deal on the open market for a 1.97 WAR player is $9.07 million. Value for a 2.4 WAR player for a one-year deal on the open market? $10.96 million. Basically, his bad arm is worth about $1.89 million to the bad in terms of his value.
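For anyone who wants to retrace the arithmetic, here is a rough Python sketch of the conversion. The 700 plate appearances, the 1.15 wOBA scale, and the 10.5 runs per win are not stated in the post; they are assumptions that happen to reproduce the quoted figures, so treat them as a guess at the method rather than the method itself.

```python
# Assumptions (not stated in the post, chosen because they reproduce its numbers):
PA = 700             # plate appearances over a full season
WOBA_SCALE = 1.15    # divisor that converts a wOBA difference into runs per PA
RUNS_PER_WIN = 10.5  # runs needed to buy one win


def batting_runs(woba, league_woba, pa=PA, scale=WOBA_SCALE):
    """Runs above average implied by a wOBA edge over the league."""
    return (woba - league_woba) / scale * pa


def war(offense, defense, replacement, positional, runs_per_win=RUNS_PER_WIN):
    """Sum the run components and convert them to wins."""
    return (offense + defense + replacement + positional) / runs_per_win


offense = batting_runs(0.346, 0.332)                 # about +8.5 runs
war_no_arm = war(offense, 4.4, 20.0, -7.5)           # range-only defense
war_playing_time = war_no_arm * 0.85                 # playing time adjustment
war_with_arm = war(offense, 4.4 - 4.75, 20.0, -7.5)  # defense including the arm

print(round(offense, 2), round(war_no_arm, 2),
      round(war_playing_time, 2), round(war_with_arm, 2))
```

Under those assumptions the sketch lands on the same 2.4, 2.06 and 1.97 figures quoted above; the dollar conversion is left out because the post does not say what price per win was used.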

PECOTA’S standings

February 10th, 2009  |  Published in Padres, baseball, links, projections

by Myron Logan

Here they are. Also, everything on that page is free, so we can mess around with it a little without worrying about pissing anyone off. Anyway, let’s park adjust some runs scored, runs allowed numbers. In the table below, you’ve got for the NL West, predicted runs scored/allowed, and park adjusted predicted RS/RA:

Team     Record  RS   RA   paRS  paRA
Dbacks   91-71   818  731  779   696
Dodgers  83-79   761  746  777   761
Giants   79-83   702  716  695   709
Rockies  76-86   829  891  761   817
Padres   74-88   708  770  770   837

Average runs scored/allowed in the NL is 785. So, how about that? Once again, the Padres offense is its strong point, nearly league average, when park adjusted, and right with the top dogs in the division. The pitching, on the other hand, is projected to be much worse, 52 runs below average, and near the bottom of the league with the Pirates, Astros, Marlins, and Rockies.
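If you want to poke at the table yourself, here is a small Python sketch. It does not use real park factors (the post does not list them); it simply backs out the factor implied by each raw/adjusted pair, assuming the adjustment is a straight division, and then compares the adjusted totals to the 785-run league average.

```python
LEAGUE_AVG_RUNS = 785  # NL average runs scored/allowed, from the post

# (RS, RA, park-adjusted RS, park-adjusted RA), copied from the table above
teams = {
    "Dbacks": (818, 731, 779, 696),
    "Dodgers": (761, 746, 777, 761),
    "Giants": (702, 716, 695, 709),
    "Rockies": (829, 891, 761, 817),
    "Padres": (708, 770, 770, 837),
}

results = {}
for team, (rs, ra, pa_rs, pa_ra) in teams.items():
    results[team] = {
        # Assumption: adjusted = raw / park factor, so the factor is raw / adjusted
        "implied_park_factor": rs / pa_rs,
        # Positive numbers mean better than league average
        "offense_vs_avg": pa_rs - LEAGUE_AVG_RUNS,
        "pitching_vs_avg": LEAGUE_AVG_RUNS - pa_ra,
    }

for team, r in results.items():
    print(f"{team:8s} PF~{r['implied_park_factor']:.2f} "
          f"offense {r['offense_vs_avg']:+d} pitching {r['pitching_vs_avg']:+d}")
```

The Padres row is where the "52 runs below average" pitching figure comes from: 785 minus 837.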

In terms of the division, the Dbacks are pretty clear favorites, by PECOTA. They are led by a rotation that is anchored by two of the best starters in the league, Brandon Webb and Dan Haren. PECOTA also projects a 3.77 ERA (unadjusted) and 155 innings out of Max Scherzer, the Dbacks’ young right-hander.

The Dodgers could really use Manny, as PECOTA/Clay Davenport project major playing time out of Juan Pierre in left field. While Pierre’s fielding and base running certainly cut the gap between his and Manny’s value, it’d still be a big improvement to add Manny’s bat. Further, Pierre could be better utilized as a pinch hitter, pinch runner, and replacement fielder.

As bad as things seem for the Pads, well, they really aren’t *that* bad. The offense is pretty decent, and with a couple of holes plugged up, could really be excellent. The pitching, outside of a few guys, obviously needs to be rebuilt. A bullpen, I think, can be fixed on the fly, in one offseason. The rotation, though, will be tougher. The Pads need, at least, two or three guys who are legitimate starters, depending on whether or not Peavy stays with the club. Whether those guys emerge from the organization is still an unanswered question, but I have a feeling there will need to be some shopping in free agency or the trade market, at some point (most likely, next offseason).

With quite a few interesting story lines, like the ownership transition, the high draft pick, and – oh, yeah – the games, this season should be a fun one to follow. And who knows, if some things break right, it’s not out of the question that this ‘09 team makes things interesting.

Hit Tracker projection system

February 6th, 2009  |  Published in Sabermetrics, baseball, projections

by Myron Logan

I think this projection system, created by Greg Rybarczyk, has a chance to be the best one out there. Why? Because it’s working with a different, more accurate dataset than the other systems, which are all of course developed by very smart people. There is, however, only so much you can get out of the traditional data.

If you’ve been around here for a while, you may remember me talking about what Greg’s model is trying to take into account: the idea that a double isn’t necessarily a double, a homer isn’t a homer. Not all hits are created equal, especially when you’re trying to predict the future. Why count a bloop double the same as a rope off the wall, when you want to know how a player is going to perform in the future? We’d all take the rope, right?

For the most part, these things are supposed to even out over time, and that’s what the other projection systems assume. They also regress data back toward some mean, to try to account for the noise. Well, the Hit Tracker model takes it one step further, and actually attempts to correct for this problem using weather information, batted ball speed off the bat, spin off bat, etc. The article I linked to, if you’re into projections, is a must-read.

By the way, my thoughts regarding this were certainly not original. I remember reading an interview of Mark Shapiro where he stated that he thought there was a lot of unexplored work to be done with offensive stats. Surely, they were talking about it in the Spalding Baseball Guides in, like, 1912* : ) The thinking isn’t original, but as far as I know, nobody has actually gathered the data or put the model together, like Greg has. I’m also looking forward to HITf/x, which will apparently debut this year, and could also advance hitting projections (among other things, like fielding analysis). Good time to be a baseball geek!

*And, no, I’m really not kidding. There were some super-advanced articles in those things.

A Mess of a Rotation

January 27th, 2009  |  Published in Daniel Gettinger, Padres, baseball, projections

by Daniel Gettinger

The Padres’ projected 2009 rotation is an absolute disaster.  The depth chart on Padres.com lists (in order) Peavy, Young, Baek, Correia, Geer, LeBlanc, and Prior.  After Peavy and Young, the quality of the pitchers relative to each other is debatable, but none of them can genuinely be thought of as reliable options, either this year, or in future years.  That’s the problem.  Not only is the Padres starting rotation likely to be terrible in 2009, but things do not look poised to improve much in 2010. 

It is one thing to state the Padres’ rotation will be awful in 2009, but aside from the obscure hodgepodge of names, what can we really expect?*  To answer this question, I used an average of the Bill James, CHONE, and Marcel predictions available at FanGraphs.  The numbers for Correia were massaged because CHONE considered him a reliever, and Bill James predicted he would start in only 41% of his appearances.  Also, to make up the gap in games started, I assume replacement level pitchers will start 18.5 games, and last an average of 5 innings per start.  I then calculated the predicted win values using the method outlined by Dave Cameron.  The table posted below contains the relevant data:

*Note: For the purposes of this analysis, I am assuming Peavy will not be traded.  In actuality, I believe Peavy will be traded, but I have no good justification, just a hunch.  Likewise, I am assuming the Padres will not sign a free agent pitcher.  While I am longing for the team to sign Ben Sheets to a 2-year deal worth $12 million per season, I think the probability of that happening is close to 0.

Player   FIP   IP      Starts  WAR
Peavy    3.42  182.66  30      3.62
Young    4.18  131.66  27      1.28
Baek     4.36  125.67  21.5    0.93
Correia  4.50  120     23      0.68
Geer     4.53  120.5   21      0.64
LeBlanc  5.10  60.5    11      -0.09
Prior    4.66  57      10      0.21

This model predicts the starters will throw 890.5 innings, and total 7.27 wins more than a staff full of replacement pitchers.  For comparison purposes, last year’s starting pitchers threw 909.4 innings, and totaled approximately 5.87 wins over replacement level.  (Note: The 5.87 wins are an approximation, arrived at by dividing runs above replacement by 10).  To put into perspective how awful this projected rotation is, consider Tim Lincecum.  Last year, he contributed 7.5 wins more than a replacement level pitcher.  Surrounding the 2008 version of Tim Lincecum with a bunch of replacement pitchers would result in more wins than the expectation of this predicted 2009 Padres staff.  Another, equally depressing view is that the 2009 staff is not expected to be much better than the 2008 version.
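The totals can be checked directly from the table. Here is a quick Python sketch that sums the projected starts, fills the remainder of a 162-game schedule with replacement-level starts of 5 innings each (as described above), and adds up the innings and wins. The variable names are mine, not part of the original analysis.

```python
TEAM_GAMES = 162
REPLACEMENT_IP_PER_START = 5  # innings assumed per replacement-level start

# (player, FIP, IP, starts, WAR), copied from the table above
rotation = [
    ("Peavy", 3.42, 182.66, 30, 3.62),
    ("Young", 4.18, 131.66, 27, 1.28),
    ("Baek", 4.36, 125.67, 21.5, 0.93),
    ("Correia", 4.50, 120, 23, 0.68),
    ("Geer", 4.53, 120.5, 21, 0.64),
    ("LeBlanc", 5.10, 60.5, 11, -0.09),
    ("Prior", 4.66, 57, 10, 0.21),
]

projected_starts = sum(starts for _, _, _, starts, _ in rotation)
replacement_starts = TEAM_GAMES - projected_starts

# Replacement pitchers add innings but, by definition, zero wins above replacement
total_ip = (sum(ip for _, _, ip, _, _ in rotation)
            + replacement_starts * REPLACEMENT_IP_PER_START)
total_war = sum(w for _, _, _, _, w in rotation)

LINCECUM_2008_WAR = 7.5  # figure quoted in the post
print(replacement_starts, round(total_ip, 1), round(total_war, 2),
      total_war < LINCECUM_2008_WAR)
```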

Of course, this analysis only confirms basic intuition.  The 2009 Padres are expected to be bad, and their lack of quality starting pitching is a big reason why.  What bothers me is the lack of quality options available in 2010 as well.  Peavy is young enough that he is unlikely to experience a massive decline in the next few years.  At 29, Chris Young probably has a few good years left in him as well.  So, assuming Peavy is not traded, the 2010 rotation should have at least two solid pitchers. 

The other three rotation spots are dicier.  Neither Baek nor Correia should be expected to suddenly blossom in 2010.  At best, their performance might be a little better than this year’s predicted performance.  The same goes for Geer and LeBlanc, but with perhaps a little more upside.  Honestly though, a good, contending team would probably not have more than one of these guys in the rotation.  So, what other options does the team have?

Mat Latos is the Padres’ top pitching prospect, but he has never pitched above A ball.  To expect a great 2010 season from him at the major league level is extremely optimistic.  Will Inman has had success in the minors, but most scouts see his peak as nothing more than a fourth or fifth starter.  I could see both of these guys making solid contributions in 2011, but expectations for 2010 should be tempered.

The way I see it, for the rotation to improve in 2010, the Padres will either need to sign a quality free agent, or make a trade.  Both of these methods have significant drawbacks.  Free agents, and pitchers in particular, are often very expensive, and very risky.  Trading for pitchers is okay in theory, but few teams are looking to give up quality arms, and acquiring one via trade requires the relinquishment of equally valuable assets. 

Making predictions is always tough.  Making predictions about the distant future is nearly impossible.  Too many things can change.  With that said, I am confident the 2009 Padres will have a terrible rotation, and as of this moment, things don’t look much better for 2010.  

Breaking Down the Draft

January 20th, 2009  |  Published in College baseball, Mike Rogers, Padres, Sabermetrics, draft, park effects, player evaluation, projections, prospects

by Mike Rogers

Over the past couple of months, I’ve been devoting any down time I’ve had (within reason) to importing and quantitatively evaluating college hitters. After about 3-4 weeks of constant tweaking, deciding what works and what doesn’t, I finally started to settle into a system that analyzed what I felt were the key points to hit on. However, I’m still not 100% satisfied with the results.

Thus far, I’ve got the 2007 and 2008 numbers entered into an Excel spreadsheet (that’s too big to upload to Google Docs as-is, so I’ll have to do some more copy/pasting and get it up on Google Docs or Edit Grid). The things that I’m tracking are:

Avg/OBP/SLG
Isolated Power (IsoP)
Strikeout and Walk percentages (K/PA, BB/PA. Note: PAs are estimated since I don’t have Reached On Error results)
A Speed Score that resembles the one Bill James invented many years ago (before I was born).
Stolen Base Runs (SB*.22)-(CS*.33)
Weighted On Base Average
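A few of these are simple enough to sketch in Python. Only the stolen base weights (.22 and .33) come from the list above; the plate appearance estimate (AB + BB + HBP + SF + SH) is an assumption about how the missing reached-on-error data is worked around, and the sample stat line is made up purely for illustration.

```python
def isolated_power(slg, avg):
    """IsoP: slugging percentage minus batting average."""
    return slg - avg


def estimated_pa(ab, bb, hbp, sf, sh):
    """Assumed PA estimate; leaves out reached-on-error, which isn't available."""
    return ab + bb + hbp + sf + sh


def rate_per_pa(events, pa):
    """K/PA or BB/PA."""
    return events / pa


def stolen_base_runs(sb, cs):
    """Stolen Base Runs using the weights from the list: (SB*.22)-(CS*.33)."""
    return sb * 0.22 - cs * 0.33


# Hypothetical hitter, not a real player
pa = estimated_pa(ab=220, bb=30, hbp=5, sf=3, sh=2)
k_rate = rate_per_pa(40, pa)
bb_rate = rate_per_pa(30, pa)
iso = isolated_power(slg=0.520, avg=0.310)
sbr = stolen_base_runs(sb=12, cs=4)

print(pa, round(k_rate, 3), round(bb_rate, 3), round(iso, 3), round(sbr, 2))
```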

In addition to this, I’m using the Park Factor numbers from Boyd’s World (an invaluable tool in my analysis). This allows me to use the Total Park Factor (TPF), which is the park factor of all the stadiums in which a college team played over a 3 year stretch (the 3 year stretch is much more reliable than single year park factors, which fluctuate vastly from year to year, especially in college baseball). In turn, I’ve used this to park-adjust all of the hitting statistics to give me park adjusted Average, OBP, Slugging %, Isolated Power, and wOBA. I also got the average wOBA for each conference, averaged the park factors for each conference, and got an Average Park Adjusted wOBA (APAwOBA?) for each of the conferences I tracked. Using this and a player’s Park Adjusted wOBA, I’ve calculated a Runs Above Average total versus their conference peers.
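To make the last step concrete, here is a rough Python sketch of one way it could work. Everything in it is an assumption: the park factor is treated as a ratio with 1.00 as neutral, the adjustment is a plain division, the wOBA-to-runs conversion uses the common 1.15 scale, and the sample numbers are invented. The actual spreadsheet may do any of this differently.

```python
WOBA_SCALE = 1.15  # assumed divisor for turning a wOBA gap into runs per PA


def park_adjust(stat, park_factor):
    """Assumed adjustment: divide by a park factor where 1.00 is neutral."""
    return stat / park_factor


def runs_above_average(player_pa_woba, conference_pa_woba, pa, scale=WOBA_SCALE):
    """Runs above the average conference hitter over `pa` plate appearances."""
    return (player_pa_woba - conference_pa_woba) / scale * pa


# Hypothetical inputs, for illustration only
player_woba, team_tpf = 0.420, 1.05
conference_woba, conference_avg_pf = 0.385, 1.10
plate_appearances = 260

player_adj = park_adjust(player_woba, team_tpf)                  # 0.400
conference_adj = park_adjust(conference_woba, conference_avg_pf)  # 0.350
raa = runs_above_average(player_adj, conference_adj, plate_appearances)

print(round(player_adj, 3), round(conference_adj, 3), round(raa, 1))
```

Comparing against the conference average, rather than all of Division I, keeps at least some of the quality-of-opponent difference out of the number.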

What I plan on doing in my next post (maybe two posts) is looking at the Padres’ 2008 draft. The Pads took 21 college players, 16 of whom are among the 1,988 entries in my system thus far. Of the 5 I don’t have, one was a Division 2 college player and the other 4 were from smaller schools that weren’t in conferences I tracked. The conferences that I did track, and have 2 seasons’ worth of data for, are the SEC, ACC, Big East, Big West, Big 10, Big 12, Pac 10, Mountain West, and Conference USA. I would like to add in the other conferences (like the Mid American, for instance), but some of the data is missing, so that’s not as straightforward a process, though it’s not impossible either.

However, having 2 years of data and 16 Padres 2008 draft picks is, I think, a decent starting point for a possible two-part breakdown of the hitters they’ve selected. I hope to have the first 8 college hitters the Padres selected up Thursday night or Friday afternoon.

2009 Season Projections

January 11th, 2009  |  Published in Mike Rogers, baseball, projections

by Mike Rogers

The Replacement Level Yankees Weblog has got very early projections for the 2009 campaign.

Well, um, at least the Pads don’t make a run at the Mets’ 1962 season or the Tigers’ 2003 season! Outside of that, there really doesn’t seem to be much to like.  The 850 runs allowed is 4th worst in baseball over these 100 sims and the 712 runs scored is 3rd fewest in baseball.  It likes Colorado to rebound and I’m guessing a big part of that will be Troy Tulowitzki getting back healthy alongside Todd Helton.

Outside of the NL West, the AL East is stacked, but mostly because the Yankees have signed 932 free agents this offseason, and the Red Sox were already a good team that’s getting better by picking up John Smoltz, Rocco Baldelli, and Mark Kotsay (among other low-risk, cheap, high-reward deals that no one else in baseball seems to be doing this offseason for reasons unbeknownst to me). I doubt that a team in the AL East wins the division with 100+ wins, just due to the fact that the Yankees, Red Sox and Rays should all be in the hunt, with the Blue Jays not being a pushover, either.

The AL Central is interesting in that the White Sox fall to the bottom of the division. But with the rash of moves that Kenny Williams made last year that made me chuckle, I put nothing past that franchise in terms of defying the odds. Which brings me to the Twins being mediocre, something every statistical category thought they should have been last year, except for their .948 average with RISP in 2008.

Predicting Trevor Hoffman’s Market Value

December 21st, 2008  |  Published in Daniel Gettinger, Padres, baseball, projections

by Daniel Gettinger 

Before I begin, I want to take a quick opportunity to introduce myself.  I am Daniel Gettinger, and have been blogging over at There are Better Deals in August for the past few months.  I am very happy to announce that I will now be a part of Friar Forecast, and would like to thank Myron for the opportunity to join him, Mike, and Ben/Padman in contributing to the site.

Back in November, the Padres withdrew their $4 million contract offer to Trevor Hoffman.  Hoffman, who did not respond to the offer in a timely manner, clearly believed he was worth more than $4 million.  The Padres, who refused to increase their offer, thought $4 million was more than generous, and was the maximum they were willing to spend to retain their closer.  This raises the question: what can Hoffman expect to earn as a free agent?

To try and answer this question, I have built a simple model that predicts the average annual salary of a free agent reliever.  The model is not intended to predict performance, or return on investment, but merely what a reliever can expect to receive as a free agent.

The model is constructed using data from the 2007/2008 free agent class.  Because teams (try to) pay players for future production, not past production, the model uses predicted performance rather than actual past performance.  Marcel forecasts were used as a proxy for predicted performance.

I spent a while messing around with various combinations of independent variables, but the best equation (as is often the case) ended up being one of the simplest.  Predicted innings pitched, predicted runs allowed, and a dummy variable noting whether a player is considered a “closer” were able to explain 83% of the variation in average annual salary received.  All of the variables were highly significant.  The final equation was:

Average Annual Salary = -3.2 + 0.3462*mIP - 0.4772*mR + 6.2627*CLOSER(dummy)
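Written out as a Python function, the equation looks like this. The coefficients are the ones above; reading the result in millions of dollars is an assumption based on how salaries are quoted in the rest of the post, and the sample reliever (60 innings, 25 runs allowed) is hypothetical.

```python
def predicted_salary(m_ip, m_r, closer):
    """Predicted average annual salary, assumed to be in millions of dollars.

    m_ip:   Marcel-projected innings pitched
    m_r:    Marcel-projected runs allowed
    closer: True if the reliever is considered a "closer"
    """
    closer_dummy = 1 if closer else 0
    return -3.2 + 0.3462 * m_ip - 0.4772 * m_r + 6.2627 * closer_dummy


# Hypothetical reliever, priced with and without the closer label
as_setup_man = predicted_salary(60, 25, closer=False)
as_closer = predicted_salary(60, 25, closer=True)
closer_premium = as_closer - as_setup_man

print(round(as_setup_man, 2), round(as_closer, 2), round(closer_premium, 2))
```

Because the closer variable is a simple dummy, the premium is the same for every reliever: the model adds the full coefficient regardless of projected innings or runs.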

To test whether the model continued to accurately predict free agent paydays, I tested its predictions against what 2009 free agent relievers have signed for.  Due to this year’s rough economic climate, I expected to see the model overestimate reliever salaries.  This was in fact the case, with only Kyle Farnsworth exceeding his predicted salary.  On average, relievers received approximately $500K less than the model predicted they would receive.  Once the change in economic climate is corrected for, this simple model was able to predict average annual salary relatively accurately.

Notable relievers who have signed this year include Francisco Rodriguez, Kerry Wood, Damaso Marte, and Jeremy Affeldt.  The model predicted the relievers to earn an average annual salary of $13.47 million, $10.32 million, $3.92 million, and $3.70 million respectively.  Their actual salaries were: $12.33 million, $10.25 million, $4 million, and $4 million.  As a super quick estimator, the model seems to work pretty well.
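For a quick look at the misses on just these four relievers, here is a short Python sketch using only the figures quoted above (in millions of dollars). It covers four signings, not the full 2009 class, so the averages here are not expected to match the roughly $500K figure mentioned earlier.

```python
# (predicted, actual) average annual salary in $ millions, from the paragraph above
signings = {
    "Francisco Rodriguez": (13.47, 12.33),
    "Kerry Wood": (10.32, 10.25),
    "Damaso Marte": (3.92, 4.00),
    "Jeremy Affeldt": (3.70, 4.00),
}

# Negative error: the reliever signed for less than the model predicted
errors = {name: actual - predicted
          for name, (predicted, actual) in signings.items()}

mean_error = sum(errors.values()) / len(errors)
mean_abs_error = sum(abs(e) for e in errors.values()) / len(errors)

for name, err in errors.items():
    print(f"{name:20s} {err:+.2f}")
print(f"mean error {mean_error:+.2f}, mean absolute error {mean_abs_error:.2f}")
```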

Now, the whole point of this exercise was to estimate what Trevor Hoffman can expect to receive as a free agent.  This is where things get interesting.  As an established “closer,” the model predicts Hoffman would command a $9.45 million salary.  The Padres, however, probably recognize that both “closer experience” and the relative importance of protecting 9th inning leads are overrated.  It is quite conceivable that the Padres would rather not pay for abstract qualities such as whether somebody is a closer, and instead pay mostly for measurable run prevention.  When Hoffman is not classified as a “closer,” but instead as a typical reliever, his predicted salary is $3.2 million.

Remember, the Padres offered Hoffman $4 million.  The difference between their offer, and his predicted salary (discounting the closer premium) could be attributed to a number of factors including: Hoffman’s large fan support in San Diego, a slight boost for his proven reliability in the “closer role,” and forecast error.

Regardless, the model provides some interesting insight into the divide between Hoffman and the Padres.  The Padres have no desire to pay the large “closer premium,” and instead wish to pay Hoffman a salary in line with his predicted performance.  Meanwhile, Hoffman sees no reason he should not get paid like other closers.  After all, he has earned more saves than any other player in major league history.  Whether the Padres and Hoffman are able to close this gap remains to be seen.