It must be something: explaining a close baseball series

Earlier in the week, the Washington Nationals lost their opening-round playoff series against the San Francisco Giants, falling 3-2 in Game 4 in San Francisco. The series offered a lot of gripping, very exciting baseball; and for one Nationals fan, at least, it was an enriching experience even with the loss. After a close playoff series, it is natural to try to understand what happened. I’d like to look at an idea which has surfaced in prominent places in recent days:

** The Nationals suffered from a lack of poise in the face of the heightened pressure in the playoffs; and the Giants exhibited more poise, in a manner which contributed significantly to their victory.

This idea can be found in two recent columns by the Washington Post’s Thomas Boswell (“Washington Nationals must recognize, and embrace, that October is whole new ballgame” and “Hard truth is Nationals are not yet a match for the poised, traditional powers of the NL”, both from October 8). There is similar praise of the Giants in Jayson Stark’s article “For Giants, it’s ‘ugly, but it works’” (also October 8).

I’m afraid I think reactions like this are superficial. Both teams scored nine runs over four games, so by this familiar measure they were equal. But we all share a tendency to think that the Giants must have won for a good reason: there must be something which distinguishes the two teams. Rather than being unique to curious baseball fans, this desire for an explanation has deep roots far outside the sporting world; it is codified in some circles as “the principle of sufficient reason.” Regarding the baseball playoffs, this principle is applied as follows:

Playoff contests between evenly matched teams are often won by the team which possesses more poise. As compared to the regular season, there is more pressure in the playoffs, and what really matters is whether you respond to this with poise. In fact, poise is so important in the playoffs that it often allows a less talented team to beat a more talented team.

Several factors combine to make the poise theory an inevitable diagnosis of the Nationals-Giants series. The Nationals had a better regular season record (96 wins vs. 88 for San Francisco) and are perceived as having more talent. Also, the Giants had established a reputation as a very poised playoff team by winning two of the previous four World Series, and in the course of that, winning six playoff series (now seven) in a row. From my side of the country, it sounds like they also picked up a reputation for outperforming their regular season record in the playoffs.

Not only that, but in 2012 the Nationals had another excellent regular season before losing to the Cardinals in a five-game first round playoff series. As you know, the Cardinals also have a reputation for being a poised playoff team. And it should not be a surprise that the Nationals-Cards series seemed to lend itself to the diagnosis that the Cardinals exhibited more poise.

Our series pitted a post-season poise team against a regular-season performer with question marks surrounding its playoff poise. So, after the series concluded in the manner that it did, it was obvious that the poise theory would quickly make its presence felt.

The problem with the poise theory is that it starts with the winner and works backwards. It cherry-picks moments that are easy to remember, at the expense of more gradual or incremental dynamics. The theory routinely assigns these moments too much significance. Often, this mindset looks at only one side of what happened at various points in the game. The analytical result is that the winner won via poise, and the loser gets no credit for exhibiting poise, or any other positive qualities.

The poise account of the Giants-Nationals series is that the Nationals were frozen by the moment and didn’t hit well, that the Giants tied game 2 when down to their last out (and won it with a poised HR nine innings later, in extra innings), that the Nationals made several on-field errors in game 4, and made two questionable (or just bad) bullpen decisions in games 2 and 4…and that the Giants played gritty, opportunistic, mistake-free baseball throughout the series.

One obvious flaw in the poise account is that the last idea is false: Madison Bumgarner’s throwing error in game 3 allowed the Nationals to score 2 runs in their 4-1 victory. In addition, this error was triggered by a two-strike bunt from Wilson Ramos, which would seem to qualify as an exhibition of playoff poise (and of a player adapting to the moment, etc.).

Why doesn’t Bumgarner’s two-run throwing error count against our attribution of poise to the Giants? One reason is that we are working backwards from the fact that the Giants ultimately won the series. Since the Giants won a close series which can only be explained in terms of poise, elements of the series which clash with this narrative are suppressed to preserve the integrity of the explanation.

The “poise” explanation of the Giants’ victory is also challenged if we admit that the Nationals exhibited poise, because then the two teams do not differ in a way that explains the Giants’ victory.

Unfortunately for the poise theory, the Nationals displayed loads of this quality throughout the series – for example, via a two-strike bunt, via Jordan Zimmermann’s game 2, or via Doug Fister’s game 3. (If you are currently protesting that Ramos’ bunt was very improbable, you are just tracking the series outcome and the prior reputations of the teams).

Also, in game 4, although the Nationals certainly struggled in innings 2 and 7 (loading the bases twice, walking in a run, and throwing a wild pitch), they kept themselves in the game by limiting the total damage to 3 runs. This fact would have played very well in “poise” articles written in the scenario where the Nationals went on to win. It is now somewhat difficult for us to see poise at work in those innings. But again this is perception heavily shaded by the outcome. This illustrates how in baseball the attribution of poise just tracks who won a close game or series.

The poise theory cherry-picks parts of games; it also cherry-picks parts of plays. In the seventh inning of game 4, after his wild pitch, Aaron Barrett threw a ball over the catcher Ramos’ head; they were trying to walk the batter. But Ramos was able to recover the ball, Barrett covered the plate; and, in a poised, well-executed play, they threw out Buster Posey at the plate, thus preventing another run.

In game 2, with Drew Storen pitching in the 9th, Pablo Sandoval hit a ball down the left-field line which scored one run, which tied the game, and which threatened to score two. But the Nationals made two accurate throws starting from deep left field, and a good tag at the plate, to get Buster Posey (again, so to speak) at the plate.

The poise theory presumably gives Posey credit for pushing the action in close games; and here I agree. But we should also give credit to the Nationals for showing the poise, and, relatedly, the baseball fundamentals, to throw him out twice to prevent runs.

I think a normal look at poise finds it in abundance on both teams in this series. However, the baseball variant of this concept has a different logic. This variant just tracks the winner when the outcome is close.

In addition to the poise issue, there were other interesting aspects of the series.

Although the Nationals were regarded as the better team, the two clubs were not far apart with respect to many regular-season statistical measures.

Nationals batting (pitchers excluded):
.261 avg. / .330 oba / .407 slg. *** 107 wRC+, 151 HR *** 8.6% BB / 20.0% K

Giants batting (pitchers excluded):
.263 avg. / .319 oba / .401 slg. *** 107 wRC+, 128 HR *** 7.2% BB / 19.3% K

The two teams had very similar offenses, although the OBA and HR numbers represent real differences. Also, their K and BB rates cohere (to a small degree) with the idea that the Giants are more of a contact hitting team, in that they swung more (i.e., walked less) and struck out less than the Nationals. One suggestion I’ll make below is that some of the Nationals should have swung a bit more.

Turning to pitching, although the Nationals came in with a better pitching reputation, and although the Nationals have better pitching, this point is not straightforwardly validated by the full range of ERA-like measures made available by contemporary analysis:

Nationals: 3.03 ERA / 3.18 FIP / 3.43 xFIP
Giants: 3.50 ERA / 3.58 FIP / 3.59 xFIP

The pitching stats converge as we move to measures which factor out balls in play (roughly, FIP) and then factor out the home run/fly ball rate (roughly, xFIP).
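The convergence is easy to see if you compute the gap between the two staffs under each measure. Here is a quick sketch using only the three pairs of numbers quoted above; the dictionary and variable names are my own.

```python
# Team pitching figures quoted in the text, as (Nationals, Giants).
measures = {
    "ERA": (3.03, 3.50),
    "FIP": (3.18, 3.58),
    "xFIP": (3.43, 3.59),
}

# Gap = Giants minus Nationals; a smaller gap means the staffs look more alike.
gaps = {
    name: round(giants - nationals, 2)
    for name, (nationals, giants) in measures.items()
}

for name, gap in gaps.items():
    print(f"{name}: Nationals ahead by {gap:.2f}")
```

The gap shrinks from 0.47 to 0.40 to 0.16 as each successive measure strips out more of what pitchers do not directly control.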

FIP and xFIP bring the teams together; so do somewhat blunter measures like runs allowed per game:

Nationals: 3.43
Giants: 3.79

The teams’ xFIPs were very close, and they were closer than I would have guessed in terms of Runs Allowed. The Nationals had a better record, but I think this was due in part to the Giants playing the Dodgers more! These teams were closer than the lead-in fanfare communicated.

I’ll offer two observations about the Nationals’ hitting, both of which cut somewhat against the playoff poise theory. The first is that while the Nationals’ offense certainly has a high-gear mode, this is not the only face they present to the world on an ongoing basis. For instance, the non-pitchers were .252 avg. // 101 wRC+ in the first half of this season…vs. a .273 avg. // 115 wRC+ in the second half.

The streakiness is due in part to a group of more or less low-average, high-power players (LaRoche, Desmond, Ramos). These players are somewhat prone to 4-0-0-0 nights anyway, and in the playoff series the Giants appeared to have good plans for them. My subjective recollection is that there were many at-bats when these players were not close to getting a hit.

But what about the Nats’ better hitters? I am thinking of Rendon and Werth in particular, and again the Giants appeared to have a plan. Here I do have a concrete suggestion about what was going on. Werth and Rendon each had 20 plate appearances in the series, and they both had 10 appearances where they took the first pitch as a called strike. This may be a surprise to you, but I doubt it’s a surprise to the Giants. Werth and Rendon are both deliberate hitters, and I think the Giants resolved to take advantage of this and to keep throwing early strikes until Werth and Rendon made them pay.

Of course, Rendon batted .368 for the series, and Werth batted .056. However, Rendon’s hits were all singles, from a 39 2B/21 HR hitter. The Giants gained an edge here – in a specific, tangible way – and Rendon and Werth didn’t make the requisite adjustment. But this is one piece of a story which could easily have been different. For example, Rendon hit a very deep fly ball in extra-innings game 2, which might have made it to the wall or farther in different wind conditions. Werth had similar misfortune on deep fly balls, the most memorable of which was Hunter Pence’s excellent catch late in Game 4.

The Giants deserve credit for executing a good approach against the Nationals’ hitters. On the other side, the Giants did not exactly light up the Nationals’ pitching. After game 2, the Giants did not score a run off a hit. So I suspect that the Nationals’ pitchers executed similar strategies as well. These layers of the competition are more remote to those of us who observe the game from the outside; but they are probably more significant than psychological differences between the teams.

What about Bryce Harper?

Bryce Harper did more than exhibit poise in this series. Bryce Harper displayed the superlative animal dynamism which our games can extract from us and showcase, the best they can offer. More than any other player, Harper elevated a series marked largely by deadlock and attrition. A series like that does require poise, which both teams showed. A series like that is exciting, but not transcendent. Poets celebrate poise when a contest offers little other inspiration.

OK, what are the proper takeaways?

Looking at what the Nationals should take from this series, Boswell writes:

If you send the winning run home on a wild pitch (Aaron Barrett); if you can’t field a two-hop grounder back to the mound (Gio Gonzalez); if three players look at each other and none of them picks up a sacrifice bunt attempt (Gonzalez, Anthony Rendon, Ramos); if you can’t throw a strike with the bases loaded and walk home a run (Gonzalez); if you get confused and throw home when no Giant is actually running toward the plate (LaRoche), squandering an out, then you have no business staying at baseball’s October party.

Amen! But why not issue a similar edict against the Giants, who, again, did not score a run off a hit in the last two games? Out of context, that doesn’t sound like a terribly promising formula either.

Boswell also draws an analogy to golf: “Right now, the Nationals are like professional golfers who win a bunch of weekly Tour events but falter under the pressure in major championships.” His remark connects us with a long-running discussion in golf about competitors with various records in the majors (the Masters, the US Open, the British Open, and the PGA Championship) and in regular events. This discussion of golf players is characterized by an all-too-familiar blend of mythology, pop psychology, and information gaps. Nonetheless I think there are instructive parallels between the majors and the baseball playoffs, which help us understand the recent Nationals-Giants series, and perhaps offer some lessons for the Nationals looking ahead.

Boswell’s peroration about disqualifying mistakes is wrong. Golfers win major tournaments despite serious, embarrassing, incriminating blow-ups. At Carnoustie’s 18th hole on Sunday of the 2007 British Open, Padraig Harrington twice hit his ball into a narrow, winding waterway, but ended up winning a playoff against Sergio Garcia. I am fine with the idea that you have no business trying to win a major if you find the water twice on the 18th hole. But this plausible moral stance is falsified by events. Similarly, in 1999, on the same hole at Carnoustie, Jean van de Velde elaborated an even greater disaster; he blew a three-shot lead, but still qualified for a playoff.

The significance of an error depends on where you are in the competition and on what your opponents are doing. In a high-pressure situation, they may not be doing very much. At Carnoustie in 2007, the course and the moment got the better of everyone, in that the top three finishers (Harrington included) were a combined six over par for the last two holes. At Carnoustie in 1999, the course had been winning all week, in that no one finished under par for the tournament. Van de Velde’s blow-up brought him into a three-way tie at 6 over par. Looking at a different golf course, in the 2006 US Open, won by Geoff Ogilvy, the top four finishers all suffered serious damage on the final day, with Phil Mickelson and Colin Montgomerie taking double bogeys on the final hole.

The Nationals should work on their play inside the diamond, but they shouldn’t beat themselves up about it. Everyone is likely to screw up in the furnace of playoff pressure, including the Giants…who yielded two runs on one bunt.

Let’s say that an attrition contest is one in which even the winner takes a beating. Although this model is prominent in major golf, it is not universal. (I’m sure it isn’t in baseball either. But I have a better grasp of recent golf). Some players get a lead early and are never seriously threatened. Many of Tiger Woods’ victories fit this pattern. A recent, more mortal example is Martin Kaymer’s 8-shot victory in the 2014 US Open.

Another interesting major winner is Charl Schwartzel, who birdied the final 4 holes at the 2011 Masters, to resolve a highly fluid final-day horserace in which 8 different players had at least a tie for the lead at different times during the day. Five past or future major winners finished behind Schwartzel in the top 10, as well as Rory McIlroy, who lost a two-stroke lead, shot an 80 for the day, and finished out of the top 10. (McIlroy won the next major and has since won three more majors). Schwartzel elevated his play above his competitors at the climax of one of the world’s great sporting events. In this setting, against this group, poise is out as an explanatory variable. Schwartzel won with the sort of imperious dynamism which I have already praised as the most admirable character trait athletic competition reveals to us.

I think the Nationals can win an attrition playoff series, because they almost did. (Just ask the Giants in a candid moment). But playoff success for them is likely to go by a different path. A team which can post a second-half 115 wRC+ without a healthy Ryan Zimmerman and Bryce Harper, while posting a team 2.96 ERA over the same period, may not need to change the way it plays. It may need to embrace the way it plays.

Less poetically, I’m optimistic about what the team can do with a full season of Zimmerman and Harper, Harper, Harper, Harper :-).


3 ways to improve the Washington Nationals’ offense by about 10 percent

In lieu of a full “preview” of the Nationals’ 2014 season, I want to look at one of the main questions about the team going into the season, which is whether they can improve their production on offense. In 2013 the Nationals were an average National League offense. They scored 656 runs, just above the NL average of 649 runs.

I’d like to look at why the Nationals were merely average last year, why we think they are better than this, and also look at how they might get better in 2014. Many other people have already cited the poor performance of the Nationals’ bench last year (including Dave Cameron on Fangraphs); and my suggestions are broadly in line with this diagnosis. I’ll offer some development of the fairly common idea that there is a lot of room for improvement in certain parts of the Nationals’ batting order. My optimistic take is that it would not be difficult for them to get about 40 percent more offense from 20 percent of their lineup. That would be about an 8 percent improvement overall, which is real money in this part of the world.

As we discuss the Nationals’ 2013 season, I will look at an index statistic which is related to runs scored, called weighted Runs Created, or wRC+. This statistic attempts to capture a player’s overall run production or contribution to team offense. wRC+ gives MLB-average run production a value of 100, and looks at the percentage by which you surpass or fall short of this level. For example, a player with a 103 wRC+, such as Washington’s Adam LaRoche, outperformed MLB average by 3 percent. Washington’s Denard Span, at 97 wRC+, underperformed MLB average by 3 percent.
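To make the index reading concrete, here is a tiny sketch of how a wRC+ value translates into "percent above or below average." The function name is mine, and this only interprets the index; it is not the formula used to compute wRC+ itself.

```python
def percent_vs_average(wrc_plus: float) -> float:
    """Percent above (+) or below (-) MLB-average run production."""
    return wrc_plus - 100.0


# The two examples from the text.
for player, wrc_plus in [("Adam LaRoche", 103), ("Denard Span", 97)]:
    print(f"{player}: {percent_vs_average(wrc_plus):+.0f}% vs. MLB average")
```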

Although wRC+ is a stat for individual players, you can also apply it to teams by looking at the team’s total plate appearances for the year. wRC+ involves a more complex formula than simply adding up a team’s runs, but the two methods for assessing a team’s offensive performance are convergent.

For instance, in 2013 the Nationals had a wRC+ of 95, which was just above the NL average wRC+ of 93.9. (The National League team average is less than 100 because pitchers bat in the NL, and bring down the team and league wRC+ averages). The Nats’ total of 95 was good for seventh in the NL:

National League offense, 2013 (top seven teams)

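| Team | PA | HR | R | RBI | ISO | BABIP | AVG | OBP | SLG | wRC+ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Cardinals | 6202 | 125 | 783 | 745 | 0.133 | 0.314 | 0.269 | 0.332 | 0.401 | 106 |
| Dodgers | 6145 | 138 | 649 | 618 | 0.133 | 0.308 | 0.264 | 0.326 | 0.396 | 104 |
| Braves | 6133 | 181 | 688 | 656 | 0.153 | 0.300 | 0.249 | 0.321 | 0.402 | 101 |
| Giants | 6168 | 107 | 629 | 596 | 0.121 | 0.304 | 0.260 | 0.320 | 0.381 | 99 |
| Pirates | 6135 | 161 | 634 | 603 | 0.151 | 0.294 | 0.245 | 0.313 | 0.396 | 98 |
| Reds | 6293 | 155 | 698 | 664 | 0.142 | 0.293 | 0.249 | 0.327 | 0.391 | 97 |
| Nationals | 6047 | 161 | 656 | 621 | 0.146 | 0.292 | 0.251 | 0.313 | 0.398 | 95 |
| Average | 6141.1 | 143.8 | 648.67 | 616.2 | 0.137 | 0.297 | 0.251 | 0.315 | 0.388 | 93.9 |
| Std Dev | 92.1547 | 23.64 | 60.5 | 58.3 | 0.013 | 0.011 | 0.012 | 0.011 | 0.019 | 8.21 |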

The Nats were 127 runs behind the Cardinals, and 11 points behind them in wRC+. Can the Nats close this gap in 2014, and get up to around 105 wRC+? Although it’s an obtuse question out of context, we are going to look at several fairly specific ways in which the Nationals might get to 105 wRC+. Doing so would give them a top offense and a much better foundation for a playoff run and success in the playoffs.
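For anyone who wants to check the size of that gap, here is a short sketch using the full-season figures from the table above. The variable names are mine, and the 105 target is simply the round number I am using throughout this post.

```python
# Full-season 2013 figures from the table above.
cardinals = {"runs": 783, "wrc_plus": 106}
nationals = {"runs": 656, "wrc_plus": 95}

run_gap = cardinals["runs"] - nationals["runs"]
wrc_gap = cardinals["wrc_plus"] - nationals["wrc_plus"]

# How big a jump is 95 -> 105, relative to where the Nationals started?
target_wrc_plus = 105
needed_points = target_wrc_plus - nationals["wrc_plus"]
needed_pct = needed_points / nationals["wrc_plus"] * 100

print(f"Run gap: {run_gap}, wRC+ gap: {wrc_gap}")
print(f"Reaching {target_wrc_plus} means +{needed_points} points, "
      f"about {needed_pct:.1f}% more than 95")
```

Going from 95 to 105 is a gain of roughly 10 percent, which is where the figure in the title comes from.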

As you probably know, the Nats’ offense was very bad in the first half of the season, and very good in the second half. In the first half of the year (through the All-Star break) the Nationals’ offense was at 88 wRC+. This number corresponded in particular to an unimpressive team batting average (.241) and on-base percentage (.301). After the All-Star break, the Nationals caught fire and dramatically improved their offensive performance. They batted .265, their OBP went up to .329, and they out-homered everyone else in the league while outscoring every team in the table below except the Cardinals.

National League offense, second half 2013 (top seven teams)

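| Team | PA | HR | R | RBI | ISO | BABIP | AVG | OBP | SLG | wRC+ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Dodgers | 2608 | 63 | 289 | 274 | 0.141 | 0.314 | 0.267 | 0.329 | 0.408 | 108 |
| Nationals | 2554 | 75 | 299 | 282 | 0.149 | 0.301 | 0.265 | 0.329 | 0.414 | 105 |
| Pirates | 2624 | 72 | 277 | 264 | 0.158 | 0.296 | 0.250 | 0.320 | 0.408 | 103 |
| Giants | 2571 | 45 | 251 | 240 | 0.116 | 0.299 | 0.256 | 0.322 | 0.372 | 98 |
| Cardinals | 2642 | 43 | 321 | 305 | 0.123 | 0.306 | 0.259 | 0.324 | 0.382 | 98 |
| Reds | 2587 | 63 | 285 | 268 | 0.138 | 0.291 | 0.247 | 0.327 | 0.385 | 96 |
| Braves | 2512 | 67 | 273 | 261 | 0.140 | 0.299 | 0.247 | 0.317 | 0.387 | 96 |
| Average | 2564.2 | 56.93 | 265.2 | 252.5 | 0.132 | 0.298 | 0.250 | 0.316 | 0.382 | 92.7 |
| Std Dev | 70.4548 | 11.94 | 31.4 | 29.3 | 0.016 | 0.013 | 0.014 | 0.013 | 0.024 | 10.02 |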

The second-half statistics provide grounds for optimism going into 2014, and they also provide our first answer to how the Nats might get up to 105 wRC+ in 2014. Namely, the Nationals should just keep doing what they did from mid-July onwards in 2013!

Although I think we should all be encouraged by how the Nationals finished the season, we shouldn’t forget the beginning either. In fact, the “first half” sample that we have in hand comprised about 60% of the 2013 season’s plate appearances. In addition, there are a couple of other somewhat optimistic analyses available which build up from the full-season data. Anecdotally, these analyses still support giving a little more evaluative weight to the Nationals’ performance in the second half of the season. 
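As a sanity check on the "about 60%" figure, here is a rough sketch that splits the season by plate appearances and blends the two half-season wRC+ numbers. It assumes, as a simplification of my own, that wRC+ can be averaged in proportion to plate appearances; the real statistic is built from run values, so treat this as an approximation.

```python
full_season_pa = 6047    # Nationals, all of 2013 (first table)
second_half_pa = 2554    # Nationals, after the All-Star break (second table)
first_half_pa = full_season_pa - second_half_pa

first_half_share = first_half_pa / full_season_pa

first_half_wrc = 88      # quoted in the text
second_half_wrc = 105    # second table

# PA-weighted blend of the two halves (an approximation, see above).
blended = (first_half_share * first_half_wrc
           + (1 - first_half_share) * second_half_wrc)

print(f"First-half share of PA: {first_half_share:.1%}")
print(f"Blended full-season wRC+: {blended:.1f}")
```

The first half works out to a little under 58 percent of the plate appearances, and the blend lands almost exactly on the full-season 95, so the half-season numbers hang together.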

As Nationals fans will remember, in 2013 several players from the bench, and players who were fighting injuries, got many more plate appearances than ideally they would have received (Espinosa, Bernadina, Suzuki, Tracy, Lombardozzi, Moore). These players, who each had 136 or more at-bats, by and large performed very poorly. They performed far below league average, and often well below their previous career levels. I believe that combinations of playing hurt, playing different positions, and getting irregular turns in the lineup all contributed to their collective poor performance.

Although the effect was at its worst in the first half of the season, the six players got enough playing time to grab 1236 plate appearances, or about 20 percent of the season’s total. And the overall impact on the Nats offense was very damaging, in that the Nats ended up getting a terrible total return from these 1236 plate appearances:

| PA | HR | R | RBI | AVG | wRC+ |
| --- | --- | --- | --- | --- | --- |
| 1236 | 18 | 95 | 97 | .214 | 54 |

Leaving aside a couple of nuances, the Nats’ 2013 performance from these plate appearances is comparable to having two players hitting .214 with 9 homers each (etc.) batting 7th and 8th in your lineup, for every game of the season. In terms of wRC+, two spots in the Nationals’ lineup (not including the pitcher) were performing at levels almost 50 percent below league average.

But we can also look at various bright sides. To begin with, if two spots in the batting order were at 54 wRC+ for the year, the rest of the lineup must have been doing pretty well to post an overall 95 wRC+. More precisely, if 20 percent of the Nationals’ 2013 plate appearances were at 54 wRC+, what did we get from the remaining 80 percent?

The answer is actually 105 wRC+, or five percent above MLB-average run production. This is a more intriguing exposition of the Nationals’ offensive potential. If you subtract the contributions of six struggling players who got an atypically high number of at bats in 2013, you get a full-season offensive number which would have nearly matched the Cardinals’ league-leading 106. This is the second way for the Nationals to get up to 105 wRC+.
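The step from 95 and 54 to 105 is a weighted-average calculation run backwards. Here is a minimal sketch, again assuming that team wRC+ behaves like a plate-appearance-weighted average of its parts (a simplification); the function name is my own.

```python
def rest_of_lineup_wrc(team_wrc: float, subset_wrc: float, subset_share: float) -> float:
    """Solve team = share * subset + (1 - share) * rest for 'rest'."""
    return (team_wrc - subset_share * subset_wrc) / (1 - subset_share)


team_wrc = 95        # Nationals, full season 2013
struggling_wrc = 54  # the six players discussed above

# Using the round "20 percent" from the text.
rounded = rest_of_lineup_wrc(team_wrc, struggling_wrc, 0.20)

# Using the actual share of plate appearances, 1236 of 6047.
exact = rest_of_lineup_wrc(team_wrc, struggling_wrc, 1236 / 6047)

print(f"Rest of lineup (20% share): {rounded:.1f}")
print(f"Rest of lineup (1236/6047 share): {exact:.1f}")
```

Either way the remaining 80 percent comes out a shade above 105, so the figure in the text is, if anything, slightly conservative.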

Of course, as a practical matter, we can’t simply eliminate ~1200 plate appearances from the Nationals’ season. They had around 6000 plate appearances in 2013, and they will have a similar number in 2014. But we can ask who might take those ~1200 plate appearances in 2014, as opposed to the players who got them in 2013. And we can look at what sort of composite wRC+ the 2014 replacements might produce with those ~1200 plate appearances.

This is a story about who was hurt on the Nationals in 2013, who else emerged in 2013, and who they picked up in the offseason. In some ways, this is going to be a pretty casual exercise, in that I’m simply going to apply various numbers from last season to 2014. But I think the results still have some value.

  • Bryce Harper (LF) played 118 games in 2013, with a total of 497 plate appearances. For present purposes, I will assume that his plate appearances go up to 600 in 2014. And I’m going to assume that he repeats his 2013 wRC+ number of 137, or 37 percent above MLB average. That is to say, Harper gets 100 of the ~1200 available plate appearances, at a wRC+ of 137.
  • Wilson Ramos (C) played 78 games in 2013, with a total of 303 plate appearances. For present purposes, I’ll assume that Ramos’ plate appearances go up to 450. So he gets 150 of the available ~1200 plate appearances, at his 2013 wRC+ of 114.
  • Anthony Rendon (2B) played 98 games in 2013, with a total of 394 plate appearances. For present purposes, I’ll assume his plate appearances go up to ~600; so he gets 200 of the 1200 plate appearances, at his 2013 wRC+ of 100.
  • Nate McLouth (OF) is an offseason free agent pickup for the Nationals who is slated to be their fourth outfielder. I am making a very casual estimate that he will get about 250 plate appearances in 2014, at his 2013 wRC+ of 100.
  • Jose Lobaton (C) is an offseason free agent pickup for the Nationals who will back up Wilson Ramos at catcher. For present purposes I will assume that Lobaton gets 150 plate appearances in 2014, at his career wRC+ of 87. His wRC+ was 103 in 2013, but I’m going with a more conservative number.

These assignments take 850 of the 1236 plate appearances in question, with a balance of 386. I’m going to assume that the remaining 386 plate appearances go to players to be named later, and that these players perform at 80 wRC+. More and less conservative estimates would of course yield different outcomes.

Note that the plate appearance figures listed below for Harper, Rendon, and Ramos are not all the plate appearances they will presumably get in 2014. The figures in the table below are the plate appearances beyond their 2013 totals which, we hope, will cut into the ~1200 plate appearances last year in which the Nats performed very poorly.

Here is what we get on the present assignments:

| Player | PA | Percent of available PA | Assumed wRC+ | Raw input to composite wRC+ |
| --- | --- | --- | --- | --- |
| Bryce Harper | 100 | 8% | 137 | 11.1 |
| Wilson Ramos | 150 | 12% | 114 | 13.8 |
| Anthony Rendon | 200 | 16% | 100 | 16.2 |
| Nate McLouth | 250 | 20% | 100 | 20.2 |
| Jose Lobaton | 150 | 12% | 87 | 10.6 |
| Others | 386 | 31% | 80 | 25.0 |
| Total PA | 1236 | 100% | Composite wRC+ | 96.9 |

The above paints a scenario in which the Nationals get a roughly 43-point wRC+ improvement (from 54 to 96.9) from 20 percent of their plate appearances, by relying on healthier starters, and on a better performing bench. If we combine these improved figures with the other 80 percent of the Nationals’ 2013 offensive performance, we get a full-season figure of 103.6 wRC+. This is the third and final way for the Nationals to (almost) get to 105 wRC+ for 2014. My discussion has not been free of speculation, assumptions, and shortcuts. But I think we are right to focus on the areas where the Nats performed worst in 2013, and on simple ways in which they might get better in 2014.
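If you want to try your own assumptions, the composite is just a plate-appearance-weighted average, and it is easy to recompute. The sketch below reproduces the assignments above; the variable names are mine, and the final step again assumes that team wRC+ blends in proportion to plate appearances, which is an approximation.

```python
# (player, extra 2014 plate appearances, assumed wRC+), as assigned above.
assignments = [
    ("Bryce Harper", 100, 137),
    ("Wilson Ramos", 150, 114),
    ("Anthony Rendon", 200, 100),
    ("Nate McLouth", 250, 100),
    ("Jose Lobaton", 150, 87),
    ("Others", 386, 80),
]

total_pa = sum(pa for _, pa, _ in assignments)
composite = sum(pa * wrc for _, pa, wrc in assignments) / total_pa

# Swap the 2013 return (54 wRC+) for the composite in that slice of the season.
season_pa = 6047
team_wrc_2013 = 95
old_slice_wrc = 54
share = total_pa / season_pa
full_season = team_wrc_2013 + share * (composite - old_slice_wrc)

print(f"Composite wRC+ over {total_pa} PA: {composite:.1f}")
print(f"Improvement in that slice: {composite - old_slice_wrc:.1f} points")
print(f"Projected full-season wRC+: {full_season:.1f}")
```

The composite matches the 96.9 in the table. The full-season number comes out between 103 and 104; small differences from the 103.6 quoted above presumably come down to how the shares are rounded. Changing the "Others" line from 80 to a more or less conservative figure shows how sensitive the result is to the bench.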


“Be first; be smarter; or cheat.”

With spring training getting going in a few days, the Nationals face several roster decisions, notably in the infield and at the back of their starting rotation. The spring season and workouts represent an opportunity for players to compete for spots on the roster. I have been thinking about the nature of this competition, partly in connection with some likable players who may not make the roster. What conclusions can we reach about a player’s ability based on his performance in spring training, and based on whether he makes the roster? Conclusions of this kind are not as solid as we might think. First, the spring training season is too short to provide a solid projection of how players might perform in the regular season. Second, since there is only a small sample of performance to look at, other factors play a large role in teams’ decisions about who to keep. These factors move spring training away from being an ability-driven process through which the players who play the best make the team.

Moreover, the smaller role accorded to ability in spring training fits the dynamics which we find in other segments of society, looking at other sports, and looking beyond sports as well. As others like Malcolm Gladwell have pointed out, the idea that ability or merit often does not play a large role in selection or decision making may be discomforting; this idea may clash with firmly held parts of our self-image. In this post I introduce these issues with some starting points outside baseball, before returning to spring training at the end.

At one of many dramatic high points in the film Margin Call, the investment-banking CEO played by Jeremy Irons tells his assembled executive team that “there are three ways to succeed in this business,” and then lays out the options listed above. Irons’ remark is neither idle nor indulgent. The company is experiencing the beginning of what will become the 2008 financial crisis. Jeremy Irons delivers the remark at an emergency meeting being held at around 3:30 in the morning. Irons and other executives have come in because the company has discovered that its bond portfolio is about to experience catastrophic losses. And Irons has just asked his immediate reports for recommendations on what to do. Because of the severity of the situation, the suddenness of the news, and perhaps because it’s kind of an unusual time for a meeting, initially Irons’ team is not able to say anything. So Irons perorates a bit with the remark above, which they’ve all heard many times before. Irons also adds that he doesn’t cheat, and says that while they have lots of smart people in the company, in many cases it’s a lot easier to just be first.

So prompted, the company decides to try to sell all their bonds when the markets open up later that morning. Selling first will allow them to get out ahead of the massive losses that are about to affect the investment banking industry and in turn much of the rest of the world. Although Irons’ remark plays a catalytic role in the film, you may have discerned that I think it has larger significance beyond this, and beyond the industry to which Irons applies it.

The three avenues provide a nice framework for understanding how individuals and companies can get ahead or obtain a competitive advantage. There are no sharp boundaries between them. Many effective courses of action will display elements of two or even all three of the strategies. For example, although Jeremy Irons clearly sees the “sell” strategy as an application of “be first,” many others would say that knowingly selling securities at normal market price when their value is plummeting is a straightforward example of “cheating.” This view is well represented by Kevin Spacey’s character in the film.

In addition, since the company discovered the impending losses by way of complex technical analysis performed by highly educated former engineers, the idea that the company was “smarter” is worth a hearing. However, the film itself and much of the coverage of the 2008 crisis indicate that core components of this sort of analysis, including the “value at risk” approach mentioned in the movie, were used by many investment banks. One character in the movie specifically says that it’s only a matter of time before other people figure out the problems associated with these securities.

So I think we should understand Margin Call as providing an example of mixing being first with cheating. It is also an example of how individuals and companies may not be fully self-aware or candid with themselves about which strategy for success they are employing, even when the options are staring them in the face because of their own pronouncements. Before exploring this point a bit more in terms of, yes, baseball, it might be helpful to refine the three options a bit so the differences between them are a little clearer. According to my proposed refinement,

  • being first encodes the importance of positioning, and the possibility of succeeding simply because of your position, broadly understood, or succeeding primarily because of your position
  • being smarter encodes the importance of ability, broadly understood, and the possibility of succeeding because you are more able in relevant ways than your competition
  • and cheating just encodes the point that cheating can sometimes be a successful strategy, distinct from the above avenues.

To see what I have in mind, think about a 10-lap car race which someone wins because of track position. An artificial but perfectly coherent example of this would be one car which is on the inside, shorter portion of the track for the whole race, as opposed to its competitor car which is on the outside, longer portion of the track. If the cars are going the same speed, the inside car does not display more ability than the outside car. Nonetheless, the inside car wins because its 10 laps are shorter than the outside 10 laps. In this case, the inside car wins because of positioning.

There are other more real-world cases where a driver wins because he has the fastest car. Importantly, this is how experts such as the announcers, drivers, and crew chiefs will report the race: “Jimmie Johnson had an incredible car today.” This is an example of winning with ability. Sticking with car racing, you might also figure out a way to win by illegally making your car lighter, or something like that. Here we’d say that you won by cheating.

These three cases are meant to be examples in which one of the three options dominates, and provides the best understanding of the example, or the best single answer to what happened with it. Although there is often some intermingling between the factors of position, ability, and cheating, there are plenty of cases in which one of the factors provides the primary understanding of what’s going on.

Now, applying this framework, it follows that there are 3 ways to get a job:

  • you can get it because of your positioning
  • you can get it because of your ability
  • you can get it because you cheat

Examples of route 1 and route 3 are all too easy to come by. You get a job because you know someone; another candidate gets a job because they falsify their resume. It may be somewhat disconcerting to realize that clear examples of route 2 are somewhat hard to find in the real world. I would suggest that by and large people do not win political elections because of their ability. Also, it is a rather fraught matter whether people get into elite colleges, law schools, medical schools, because of their ability; and for what it’s worth I don’t think the best view assigns great weight to this factor.

More broadly, there are other “anchoring” situations where someone may interview John first, or read John’s writing sample first, and the interviewer likes John well enough that they develop bias or laziness towards the rest of the candidates, who might be as good as John or even better. In this all too plausible scenario, if someone else had interviewed first, he or she would have gotten the job. The process did not necessarily identify the best candidate — rather, it identified John! — who was reasonably qualified, and, more importantly, first. (Equally, interviewing last can be an advantage for similar superficial reasons).

When you hire someone you know, there is often a plausible basis for doing so, in terms of your knowledge and trust in the person. You might resist the verdict that you hired person X because of their positioning, but you would acknowledge the relationship as an important factor in your hiring decision. This sort of approach enjoys widespread acceptance, and I’m not criticizing it.

By contrast, however, I think many people would resist the idea that they hired John because he was the first person they interviewed. They are self-consciously trying to select on the basis of ability. But for firmly rooted psychological reasons beyond their immediate control, they end up selecting on the basis of positioning. This is another example of how individuals and companies may not be fully self-aware or candid with themselves about which strategy for success they are employing.

At first glance, ability is more of a factor in the world of sports. I want to stick with the dynamic of people winning jobs, so let’s think of things like winning a spot on the Olympic team and winning a spot on a major league baseball team. Some Olympic spots are assigned by human selection, such as the figure skating spots and places on the basketball team. Human selection can be influenced by positioning factors of different kinds.

And some Olympic spots are occasionally won by cheating and by crime, such as the figure skating spots. Nonetheless, there are plenty of Olympic team spots and Olympic medals whose achievement is more ability driven. Perhaps the best examples of this are from the summer Olympics, which have the track and field events and swimming. The qualification standard to compete for the US at the Olympics in 100M freestyle is very simple: you have to qualify for the trials, by meeting a time benchmark, and then you have to finish first or second at the trials themselves. If you swim faster than all but one person, or faster than everyone, you’re on the team.

This criterion is quantitative rather than a matter of judgment. Therefore, although you can cheat, you do so in an ability-driven arena. In the same vein, the people who run US swimming and who coach the Olympic team are trying to identify the fastest swimmers, and their approach constitutes a good method of achieving this. Good swimmers can have an off day or get sick, and might not swim their fastest on the day of the trials. But I think this is an example in which the relevant actors, the lords of US swimming, have an accurate view of the method they’re employing to achieve success, which is an ability-based approach anchored in quantitative measurement of competition.

Even so, the relative importance of ability in swimming can be overstated. Positioning is important in that the training required to reach elite levels in swimming is expensive, and accordingly beyond the reach of many families. Other sports seem not to be as expensive, like running, and thus in my view they represent a more pure form of open, ability-based competition. For this reason, I tend to think that the gold medal in the 100-meter dash is one of the most impressive credentials around, in or outside of athletics.

But we should not let this favorable and indeed sentimental look at the 100-meter dash determine our view of the roles of position and ability in sports. There are cases in which positioning has a bizarre and unsettling amount of influence. As I was writing this post, I remembered a great vignette in Malcolm Gladwell’s book Outliers which is relevant to our current topics. Early in the book Gladwell relates the tale of a woman watching an elite youth hockey game in Canada, who studied the rosters and noticed that a disproportionate number of the players had been born in the first few months of the year. She was watching the game with her psychologist husband, who had the training and orientation to make the most of this information. He looked up a bunch of data and determined that elite Canadian hockey players reliably have the following birth distribution:

  • 40 percent are born between January and March
  • 30 percent are born between April and June
  • 20 percent are born between July and September
  • 10 percent are born between October and December

The explanation of this disorienting fact has to do with the January 1st age-group cutoff for youth hockey in Canada, whereby your age on New Year’s Day determines the age group you will play in that year. Kids on the front end of this effect are often bigger and stronger than kids who are in the middle or later — especially at very young ages where a few months is developmentally quite significant. The front-end kids outperform everyone else, and they get selected for elite travel teams. They play more hockey and get better coaching; and, in this manner, as Gladwell astutely points out, they become better players. They also end up disproportionately represented in the NHL.

This is a position-driven phenomenon. I haven’t made the notion of position entirely clear, but your month of birth is a paradigm example of it! The lords of Canadian hockey are trying to run an ability-driven process in which they pick the best players. This is not exactly what is happening. Positioning ends up favoring a subset of the population, and the population as a whole does not get the development and consideration it deserves. They’re limiting their player pool, and they’re limiting the life opportunities of large chunks of the population. Gladwell describes similar effects in other sports and indeed in arenas like education. In our terms, we can say that a main theme of Gladwell’s book is that position is much more important than we realize.

Turning finally to baseball, we can provisionally distinguish between different stages at which position has an influence. I’ve suggested that although positioning affects whether you’ll be able to train enough for the Olympic swimming trials, positioning is not a factor at the trials themselves. You have to beat everyone else, just about, by way of your ability or performance. So there’s the “runup” and the “selection arena”. Let’s apply these ideas to the arena of competing in spring training to make a major league baseball team. (This whole post got going because I’ve been thinking about spring training, which of course starts in a couple of weeks).

Unlike swimming, there are no purely quantitative criteria for making a baseball team. In addition, the spring season is only about 30 games long, and most players do not play all the games. As a result, the statistical samples which accumulate during this period are not large enough to be meaningful (compare http://www.fangraphs.com/library/principles/sample-size/). So when Player A outperforms Player B in spring training, it’s not a solid measure of their respective abilities to perform over an entire season.
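To put a rough number on the sample-size worry, here is a minimal sketch that treats each at-bat as an independent trial with a fixed hit probability. That independence assumption, the .270 "true" average, and the 60 and 550 at-bat totals are all illustrative choices of mine, not figures from any team or from the Fangraphs page.

```python
import math


def batting_average_se(true_avg: float, at_bats: int) -> float:
    """Standard error of an observed batting average under a simple binomial model."""
    return math.sqrt(true_avg * (1 - true_avg) / at_bats)


TRUE_AVG = 0.270       # hypothetical underlying ability
SPRING_AT_BATS = 60    # hypothetical spring-training workload
SEASON_AT_BATS = 550   # hypothetical full-season workload

spring_se = batting_average_se(TRUE_AVG, SPRING_AT_BATS)
season_se = batting_average_se(TRUE_AVG, SEASON_AT_BATS)

print(f"spring: +/- {spring_se:.3f}")
print(f"season: +/- {season_se:.3f}")
```

Under these assumptions the spring figure carries about three times the noise of the season figure, so a .270 hitter could post a spring average 50 points higher or lower by chance alone.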

For this reason, teams do not exclusively look at performance during spring training. But this broader approach points to the role of positioning. The coach may already have a lot of confidence in you, which can compensate for a bad spring training. Equally, the coach may not have much confidence in you at all, in which case a great spring training might not be enough for you to make the team.

At the swimming Olympic trials, you do not have this sort of handicap or head start afforded by the past. At spring training, you come in as part of a pecking order of players, which influences your space of possibilities and the lens through which your performance will be viewed. This is a very clear expression of the importance of position at spring training. The sharp division between runup and selection arena in swimming does not exist in baseball. At spring training, players operate under the inheritance of prior judgment.

Looking at spring training itself, there isn’t enough game data to form solid projections about future performance. Looking at the runup, it’s not like this era is free of the influence of position either. What we have is a pyramid of judgments which are based in part on limited evidence, and in part on each other. A high draft pick will benefit from a halo effect. A late-round draft pick will be on a shorter leash. If we assume that more high picks make the majors, this may not confirm the talent evaluations at work, because the halo effect could be contributing to the promotion of higher-round picks. Two players who perform in the same manner in the minors could have different career paths, because of differences in their initial position.

Baseball teams try to pick the best players, those with the most ability. They do not try to pick players because they liked them six months ago or two years ago. However, their process opens them up to this sort of selection bias. Are baseball teams self-deceived about what they’re doing, like some of the other actors we’ve considered in this post?

I actually don’t think so, and perhaps some of this optimism should also reflect back on some of the earlier examples. I think that many baseball people would agree with the idea that they are making judgments about probabilities on the basis of limited evidence, and on limited time frames. In case you haven’t noticed, I struggle with the idea that spring training cuts represent definitive, warranted verdicts about player ability. Perhaps baseball people feel the same way. They may not be super confident about the respective ability levels of the players they are evaluating; they might agree that player A might still turn out to be better than player B. But the suggestion is that by the end of spring training, the teams have reached the point where they have to act.


Tanner Roark’s Z-Swing%, and Related Observations

Although the Nationals had a disappointing 2013 season overall, Tanner Roark (RHP) was one of their more pleasant surprises. The Nats brought him up in August, as injuries and performance problems created openings for several pitchers in their minor league system.

While Taylor Jordan also performed well, I think it’s fair to say that Roark had the most impressive and intriguing debut for the big-league team. Roark accumulated excellent “traditional” stats, and he did so at least in part by exploiting an unusual but highly effective talent: making batters not swing at good pitches. This post explores Roark’s story, and opens up the question of how his distinctive forte, zone-swing rate, contributes to effective pitching.

To recap, Roark finished 7-1 with a 1.51 ERA over 53 2/3 innings. He allowed only 1 home run in total, or 0.17 home runs per 9 innings; and the league batted .197 against him (“batting average against,” or “BAA”). The Nationals’ ace, Stephen Strasburg, allowed 0.79 home runs per 9 innings, with a BAA of .205. Roark was comparable in BAA to Strasburg and much, much better at preventing home runs.
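For readers who want to check the per-9 arithmetic, here is a small sketch. The function names are my own, and the inputs are just the 1 home run and 53 2/3 innings quoted above.

```python
from fractions import Fraction


def innings_to_float(whole: int, thirds: int = 0) -> float:
    """Convert innings written as 'whole and thirds/3' into a decimal number."""
    return float(Fraction(whole) + Fraction(thirds, 3))


def per_nine(count: int, innings: float) -> float:
    """Scale a counting stat to a per-9-innings rate."""
    return 9 * count / innings


roark_ip = innings_to_float(53, 2)   # 53 2/3 innings
roark_hr9 = per_nine(1, roark_ip)    # 1 home run allowed

print(round(roark_hr9, 2))           # 0.17
```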

Of course, Strasburg reached his figures in 183 innings of pitching as compared to Roark’s 53 innings of pitching. This is what is sometimes described as a smaller sample. But we should not discount Roark’s performance too quickly. His 53 innings involved five starts and nine relief appearances, including a total of 12 appearances with at least two innings pitched. This is considerably more than, say, one start and no relief appearances. Roark played for the Nationals for the last two months of the season. His stint in the majors last year was substantial enough, I think, to merit serious interest.

Roark’s 2013 performance was surprising in part because of his pedigree. In 2012 Roark was 6-17 as a starter in Triple-A, pitching for the Nationals. His 2012 ERA in Triple-A was 4.39 (although his FIP [Fielding Independent Pitching rating] of 3.85 was better). Providing more background, Adam Kilgore wrote in September 2013 that

Roark has never been regarded as a star or a significant prospect. In 2008, the Rangers drafted him in the 25th round. The Nationals acquired him and another minor league pitcher for Cristian Guzman at the 2010 trade deadline. Last winter, the Nationals left Roark unprotected from the Rule 5 draft for the second straight year. They invited him to major league spring training this year, and shipped him out in the first round of cuts.

(Washington Post, Nationals Journal, 9/17/2013; http://www.washingtonpost.com/blogs/nationals-journal/wp/2013/09/17/tanner-roarks-incredible-start-built-on-command-feel-for-pitching/)

Roark’s 2013 performance was also surprising because, with a fastball averaging 92.6 mph, he had good but not overwhelming velocity.

Going back to FIP and similar topics, another reason why Roark’s 2013 performance was surprising was because of some relationships between his statistics. For instance, although his 2012 Triple-A ERA (4.39) was higher than his 2012 Triple-A FIP (3.85), this relationship reversed itself last year in the majors, with Roark posting a 1.51 ERA and a 2.41 FIP. In addition, his xFIP (“expected Fielding Independent Pitching”) was 3.14, significantly higher than the FIP.

“ERA < FIP < xFIP” spreads of this size are not unheard of, but they are rare, especially when your ERA is less than 2.00. In fact, ERA < FIP < xFIP distributions of this type suggest that you are identical to Clayton Kershaw (1.83 ERA / 2.39 FIP / 2.88 xFIP) and that you have just signed a contract worth 215 million dollars!

These observations about Tanner Roark’s performance and pedigree raise several questions:

  • How did he perform so well in 2013?
  • What is going on with his ERA<FIP<xFIP distribution?
  • What can we say about his future performance?

Taking a quick initial look at the ERA<FIP<xFIP distribution, a “negative” delta between ERA and FIP is often attributable to the pitcher having a low Batting Average on Balls in Play (BABIP). Roark’s BABIP was indeed very low, at .243. (Kershaw’s was .251).

Also, although this might sound odd, Roark’s extremely low HR rate (0.17 per 9 innings) pushed his ERA below his FIP, even though home runs are a fielding-independent matter. Roark was fine (league average or better) on the other FIP elements — walks, K’s, HBP’s. But combining these normal-range statistics with his homer rate produces a compromise number and some information loss.

Turning to xFIP, this calculation substitutes out the pitcher’s own homer rate for the league average homer rate. As we might expect, the league average homer-rate was much higher than Roark’s, and this explains the FIP < xFIP delta, while also contributing to the delta between his ERA and his xFIP.
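The mechanics of the last two paragraphs can be sketched in a few lines. This follows the FIP and xFIP formulas as I understand them from Fangraphs: home runs weighted 13, walks and hit batters weighted 3, strikeouts weighted -2, all divided by innings, plus a league constant. The constant of 3.10 is an approximation (it is recalculated each season), and the pitching line below is hypothetical, not Roark's actual totals.

```python
def fip(hr: float, bb: int, hbp: int, k: int, ip: float, constant: float = 3.10) -> float:
    """Fielding Independent Pitching from a pitcher's own home run total."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fb: int, league_hr_per_fb: float, bb: int, hbp: int, k: int, ip: float,
         constant: float = 3.10) -> float:
    """Same formula, but home runs are replaced by fly balls times the league HR/FB rate."""
    expected_hr = fb * league_hr_per_fb
    return fip(expected_hr, bb, hbp, k, ip, constant)


# Hypothetical low-homer pitcher (illustrative numbers only).
line = dict(bb=12, hbp=1, k=45, ip=60.0)
example_fip = fip(hr=1, **line)
example_xfip = xfip(fb=50, league_hr_per_fb=0.10, **line)

print(f"FIP  = {example_fip:.2f}")
print(f"xFIP = {example_xfip:.2f}")
```

Because the only difference between the two functions is the home run input, a pitcher who allows far fewer homers than his fly-ball total would predict will always show FIP below xFIP.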

These observations tend to intimate that some of Roark’s statistics are not likely to repeat themselves. Before turning to the “future performance” question identified above, I want to look more at the first question of trying to understand Roark’s 2013 success. There are aspects of Roark’s pitching last year which suggest that his strong performance numbers were not an accident, and that his apparent prowess is not simply overmagnified by the small prism of his innings total.

The first statistic of interest is that Roark was seriously good at throwing pitches in the strike zone which batters did not swing at. This is the Z-Swing% statistic recorded on Fangraphs and other places. Roark’s Z-Swing rate in 2013 was 54.8% (per Baseball Info Solutions [BIS]), or 55.9% per PITCHf/x. This means that batters only swung at Roark’s pitches in the strike zone about 55% of the time.

(BIS and PITCHf/x converge around 55% for Roark’s Z-Swing%. These systems actually diverge, or report different percentages, for some other stats which are not independent of Z-Swing%. Although this is interesting, the differences do not materially affect our evaluative questions. I will cite the BIS plate discipline statistics throughout and compare them to PITCHf/x at various points below).

The complement of Z-Swing% is what I will call “Z-pass” — the phenomenon of non-swings on pitches in the strike zone. Tanner Roark’s Z-pass rate last year was 45% — batters passed on about 45% of his pitches in the strike zone.
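Since “Z-pass” is my own label rather than a published statistic, here is the definition spelled out as a tiny sketch, using the two Z-Swing% figures quoted above.

```python
def z_pass(z_swing_pct: float) -> float:
    """Share of in-zone pitches the batter did not swing at (the complement of Z-Swing%)."""
    return 100.0 - z_swing_pct


roark_z_swing = {"BIS": 54.8, "PITCHf/x": 55.9}
roark_z_pass = {source: round(z_pass(pct), 1) for source, pct in roark_z_swing.items()}

print(roark_z_pass)   # {'BIS': 45.2, 'PITCHf/x': 44.1}
```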

This was a very high Z-pass rate. In fact,

  • It was the highest Z-pass rate on the Washington Nationals, by about 5 percentage points, among Nationals pitchers with at least 50 innings.
  • It also was more or less the highest Z-pass rate in all of major league baseball. Roark came in first in Z-pass rate according to BIS. According to PITCHf/x, Roark was tied for sixth-best in Z-pass rate, behind Sonny Gray with a 47% Z-pass rate.

A high Z-pass rate is indicative of several good pitching qualities. Z-passes are good because they mean that batters are laying off a higher number of pitches which damage their cause and advance the pitcher’s cause. A high Z-pass rate indicates that the pitcher is accumulating strikes while maintaining an atypically lower risk of allowing a hit. (This is true if the pitcher is hitting the strike zone at a reasonable rate. More on this below). Tactically speaking, the Z-pass is the best outcome on the swing v. strike zone matrix below.

[Image: swing vs. strike-zone matrix]

(obviously i had a little trouble w/ the formatting, which i got tired of dealing with)

Swings on pitches in the zone and out of the zone can lead to hits, and worse. By contrast, if we assume that non-swings in the zone lead to strikes, the Z-pass simply constitutes a good outcome for the pitcher.

How often did Roark throw strikes? In 2013 Roark hit the strike zone 47.7% (BIS) of the time. This was about 3 percentage points ahead of major league average (44.9%). 3 percentage points comes out to about one standard deviation above average. (PITCHf/x reports a higher league-wide strike-zone rate — 49.4% — and a higher strike-zone rate for Roark as well, at 53.8%. PITCHf/x appears to have a larger strike zone than BIS).

It therefore appears Roark was exploiting his elite Z-pass rate often enough for it to be useful, and indeed for him to have an advantage over hitters. Roark accumulated strikes at a good rate; and, by strongly suppressing swings at pitches in the zone, he lowered the risk of allowing a hit. It appears this dynamic was a main factor in Roark’s success in 2013. That’s part of the answer to our “How did he perform so well” question.
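Two quick calculations sit behind this paragraph, sketched here with the BIS figures quoted above. The 3-point standard deviation is my reading of the “about one standard deviation” remark, not a published value. The second calculation follows the earlier assumption that every taken pitch in the zone is called a strike.

```python
def standard_score(value: float, league_mean: float, league_sd: float) -> float:
    """How many standard deviations a value sits above (+) or below (-) the league mean."""
    return (value - league_mean) / league_sd


ROARK_ZONE_PCT = 47.7      # BIS, from the text
LEAGUE_ZONE_PCT = 44.9     # BIS, from the text
ASSUMED_ZONE_SD = 3.0      # assumption inferred from the text
ROARK_Z_SWING_PCT = 54.8   # BIS, from the text

zone_z = standard_score(ROARK_ZONE_PCT, LEAGUE_ZONE_PCT, ASSUMED_ZONE_SD)

# Share of ALL pitches that were in the zone and not swung at.
taken_in_zone_share = (ROARK_ZONE_PCT / 100) * (1 - ROARK_Z_SWING_PCT / 100)

print(f"Zone% standard score: {zone_z:.2f}")
print(f"Taken in-zone pitches: {taken_in_zone_share:.1%} of all pitches")
```

On these numbers, a bit more than one pitch in five was a strike that the batter simply watched.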

Another factor which stands out from Roark’s strike-zone data is that he threw first-pitch strikes 70.6% of the time. This tied for third in major-league pitchers with at least 50 innings in 2013. Consistently gaining an initial advantage over hitters, and doing so at an elite rate, was another main factor in Roark’s success.

Other discussions of Roark have cited his command, his aggression, and an improved mental approach. Going back to Adam Kilgore, he writes:

Roark’s ascension began last season, when he told himself he would not allow his temper to control him on the mound. He would not let the things out of his control – fluky hits, errors, whatever – distract him. He would throw strikes. He would be confident. He would attack, above all else.

“I feel that last year is when I had my, I guess, mental turnaround,” Roark said. “That was the biggest thing for me.”

(Washington Post, Nationals Journal, 9/17/2013; http://www.washingtonpost.com/blogs/nationals-journal/wp/2013/09/17/tanner-roarks-incredible-start-built-on-command-feel-for-pitching/)

We can certainly see command at work in Roark’s low homer rate, and his low walk rate (5.4%). We can see both command and aggression at work in his first-pitch strike rate. Roark’s league-leading Z-pass rate substantiates the command/aggression understanding of his performance, and also adds to this understanding.

A pitcher who suppresses swings on pitches within the zone is presumably hitting unattractive parts of the zone, but he may also be throwing in-zone pitches which do not present to hitters as strikes. This sounds like a pitcher on whom it is difficult to make good contact. This is a third idea, beyond Z-pass rate and first-pitch strike rate. One way, however, to be averse to good contact is to be a high Z-pass pitcher.

Being a high Z-pass pitcher does not entail being a high strikeout pitcher. Roark’s strikeout rate was only one percent below major-league average (again, among pitchers with 50 innings and up). Of course, on other measures, like ERA, Roark was much better than league average. I think that connecting Z-pass rate with suppression of good contact can help us understand why.

Z-passes represent hittable pitches – pitches in the zone – which were not hittable enough to induce a swing. Poetically speaking, Z-passes involve real visual ambiguity: since they end up in the strike zone, they can’t look that bad; but they do not look good enough to induce a swing.

How well does this characterization actually apply to Roark’s pitches? On this question, we have the following from the Atlanta Braves:

 “He wasn’t missing with any pitches over the plate, it seemed like,” said Braves catcher Gerald Laird. “When he was going away, he was throwing that little two-seamer back door, when he was coming in he was running that two-seamer in on your hands, and he had that little slider working.

“Tonight it seemed like he was hitting his spots and wasn’t making any mistakes. I know (Freddie Freeman) was saying he was starting it at him and running it back over. When he’s doing that it’s hard to pull the trigger.”

(http://www.washingtontimes.com/blog/nationals-watch/2013/sep/17/tanner-roark-shines-nationals-complete-doubleheade/#ixzz2prxGGOUh)

Of course, these descriptions of visual ambiguity — or of evidence which shifts within a fraction of a second — presumably apply to all or most of a high Z-pass pitcher’s offerings, not just to his pitches in the strike zone which do not elicit a swing. The image that emerges is of a player whose whole volume of pitches is tough to react to in a manner that creates good contact.

Roark was actually pretty good at inhibiting contact of any kind, especially on pitches within the strike zone. However, a look at his contact numbers does not immediately confirm this interesting and important point. As we see in the table below (from BIS by way of Fangraphs), many of Roark’s contact rates were actually above league average, sometimes by more than one standard deviation.

[Image: table of Roark’s plate-discipline rates compared with league average (BIS, via Fangraphs)]

Before turning to contact rates, you will have noticed that this table also gives us a look at how Roark’s Z-swing rate compared to the rest of baseball. According to BIS, Roark was 3 standard deviations above average on a positive pitching statistic which is completely independent of fielding. He was two standard deviations (56% Z-Swing%, as opposed to 63% league average) ahead according to PITCHf/x — this is still pretty good for a former 25th-round pick! Some other observations:

  • O-contact. Here Roark was much higher than average, but this may not be a bad thing, since contact outside the zone is less likely to be productive for the hitter.
  • Z-contact. Roark again was higher than average. But this somewhat unsettling number should not be digested outside of its relevant context, which is helpfully provided by Roark’s Z-swing rate. Looking at Z-contact multiplied by Z-swing yields the interesting result that Roark allowed contact on 51 percent of his strike zone pitches, as opposed to a league average of 57 percent, with a standard deviation of 3 percent.

(PITCHf/x condenses this gap, in much the same way that it condenses the gap between Roark and MLB on Z-pass. PITCHf/x reports Roark at 52.2% contact on all pitches within the zone, and MLB at 54.6%. Thus, if we switch from BIS to PITCHf/x, Roark’s contact rate goes up, and MLB’s goes down.

However, as noted above, PITCHf/x appears to be working with a larger strike zone than BIS (MLB-average Zone% of 49.3 vs. MLB-average Zone% of 44.9). This point complicates Roark’s apparent movement back towards league average. In brief, the fact that Roark’s swing rates go up, while the MLB average goes down, on larger renditions of the strike zone may be a testament to his effectiveness, rather than a knock against it.)

  • SwStr (swinging strikes/total pitches). Since Roark did a good job suppressing contact within the zone, Roark’s low swinging-strike number does not seem to be an especially important piece in his overall puzzle.

The standard contact rates reported by BIS and PITCHf/x do not do a good job of communicating how well a pitcher actually prevents contact, because these contact rates only look at swings. Since you can suppress contact by suppressing swings, multiplying the contact rate by the swing rate provides a better view of how a pitcher is actually doing along this dimension. Despite a “zone-contact” rate which was higher than league average, Roark was very good to excellent at suppressing contact within the strike zone.
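The adjustment described here is a single multiplication, sketched below with the BIS figures already quoted in this post: a 54.8% Z-Swing rate, 51% contact per zone pitch for Roark, 57% for the league, and a 3-point standard deviation. The function name is my own. The “implied” Z-Contact value is backed out of those rounded figures rather than read from the table, so treat it as approximate.

```python
def contact_per_zone_pitch(z_swing: float, z_contact: float) -> float:
    """Share of in-zone pitches put in contact: swing rate times contact-per-swing rate."""
    return z_swing * z_contact


ROARK_Z_SWING = 0.548        # BIS, from the text
ROARK_ZONE_CONTACT = 0.51    # contact on all in-zone pitches, from the text
LEAGUE_ZONE_CONTACT = 0.57   # from the text
LEAGUE_SD = 0.03             # from the text

# Back out the per-swing contact rate implied by the two rounded figures above.
implied_z_contact = ROARK_ZONE_CONTACT / ROARK_Z_SWING
round_trip = contact_per_zone_pitch(ROARK_Z_SWING, implied_z_contact)

sds_from_average = (ROARK_ZONE_CONTACT - LEAGUE_ZONE_CONTACT) / LEAGUE_SD

print(f"Implied Z-Contact: {implied_z_contact:.1%}")
print(f"Contact per zone pitch: {round_trip:.1%}")
print(f"Standard deviations from league average: {sds_from_average:.1f}")
```

A per-swing contact rate in the low 90s can look alarming, yet it is consistent with a per-pitch contact rate two standard deviations better than average once the low swing rate is factored in.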

We are exploring a clue provided by Roark’s excellent Z-pass rate that Roark was good at inhibiting solid contact. This clue was supported by our look at Roark’s contact rates, which indicate that he was pretty good at suppressing contact flat out. The idea that Roark’s pitches were visually ambiguous enough to limit good contact receives further confirmation from his batted-ball statistics. In addition, looking at these statistics will bring us around nicely to the question of how well Roark might sustain his performance in future seasons.

[Image: table of Roark’s batted-ball statistics]

Roark’s ground-ball, fly-ball, and infield-fly rates combine to indicate a strong bias against good contact. Roark had a somewhat high line drive rate, and, admittedly, line drives are a form of good contact. For instance, I suspect it’s unusual to have a somewhat high line-drive rate and a markedly low BABIP. Roark’s line-drive rate provides one specific indication that his BABIP is due to increase. However, a somewhat high line-drive rate is not entirely at odds with the idea that a pitcher is suppressing good contact — especially if we are thinking about home runs. Since most line drives are not home runs, a slight tendency towards line drives is a small but genuine homer-prevention measure.

In this way, Roark’s line drive rate coheres with his ground-ball, fly-ball, and infield-fly rate statistics. All of these rates, and especially their combination, suggest a low-homer pitcher. Why didn’t Roark give up a lot of home runs? Well, he got a lot of grounders and infield flies, while limiting his fly balls overall, and he gave up a somewhat high proportion of line drives. It is very plausible to suppose that Roark’s extremely low HR/FB rate overshoots the anti-homer bias suggested by his other batted-ball rates. Equally, however, the other rates tell a clear enough story that a low homer rate is not at all a surprise. Roark was very good at inhibiting good contact.

How will he do in the future? A nice way to frame this question is in terms of Roark’s ERA, FIP, and xFIP numbers mentioned earlier. And, leading up to that, I think it’s helpful to assess the respective importance of two things: (1): the overall coherence of Roark’s 2013 statistics; and (2) the sample sizes in which they were achieved.

In terms of coherence, Roark’s statistics tell a consistent story:

  • Looking at Z-pass, Roark was very good at limiting swings on good pitches
  • Looking at Z-swing * Z-contact, Roark was very good at limiting contact within the zone
  • Looking at his batted ball rates, Roark was very good at limiting good contact.

I could be wrong about this, but I do not see relationships among Roark’s 2013 statistics which point to trouble looking ahead. These statistics tell a consistent story of effectiveness. You can focus on his low swinging-strike rate if you like, but this rate was consistent with Roark being at least one standard deviation (two sd’s according to BIS) better than average on limiting contact within the zone.

In addition, there are pockets within Roark’s portfolio where some stats are very good and others are even better, like the HR/FB rate relative to Roark’s other batted-ball statistics. However, this type of overshooting is a good problem to have. To the extent that the non-harmonic components of Roark’s statistical portfolio are extremely good statistics, this relates to the issue of our expectations for future years. A version of Tanner Roark based on 2013, but without the extra anti-homer overshooting, would still be above MLB-average.

As noted above, Roark pitched only 53 innings, and that’s a much lower total than what a starting pitcher would typically accumulate over a full year. Although we intuitively regard this as a small sample, it does not follow that Roark’s performance is without predictive value. As is often pointed out on the pages of Fangraphs, statistics stabilize, or acquire predictive value, at different thresholds (http://www.fangraphs.com/library/principles/sample-size/). Generally speaking, fielding-independent stats stabilize more quickly for pitchers than fielding-dependent stats; this is a helpful point in assessing the forward relevance of Roark’s 53 innings.

Some of Roark’s relevant statistics are above their stabilization thresholds. Roark allowed 153 balls in play (BIP), which puts him above the stabilization points for groundball rate and flyball rate:

70 BIP: GB rate

70 BIP: FB rate

Roark faced 204 batters, which is above the stabilization points for walks and strikeouts:

70 BF: Strikeout rate

170 BF: Walk rate
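The comparison above can be sketched in a few lines of Python. This is only an illustration: the thresholds and Roark’s sample sizes are the ones quoted above, while the names `THRESHOLDS`, `ROARK_SAMPLE`, and `stabilized` are made up for the example.

```python
# Stabilization thresholds quoted above: stat -> (unit, sample size needed).
THRESHOLDS = {
    "GB rate": ("BIP", 70),
    "FB rate": ("BIP", 70),
    "Strikeout rate": ("BF", 70),
    "Walk rate": ("BF", 170),
}

# Roark's 2013 sample sizes as quoted above (balls in play, batters faced).
ROARK_SAMPLE = {"BIP": 153, "BF": 204}


def stabilized(sample, thresholds):
    """Return {stat: bool}, True when the sample meets that stat's threshold."""
    return {
        stat: sample[unit] >= needed
        for stat, (unit, needed) in thresholds.items()
    }


if __name__ == "__main__":
    for stat, ok in stabilized(ROARK_SAMPLE, THRESHOLDS).items():
        unit, needed = THRESHOLDS[stat]
        status = "above" if ok else "below"
        print(f"{stat}: {ROARK_SAMPLE[unit]} {unit} vs {needed} needed ({status})")
```

A check like this says only that the sample is large enough for the stat to carry some predictive value; it says nothing about how good the stat itself is.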

However, Roark was league-average in K’s and was “only” one standard deviation above average in walks; these numbers are not as good as Roark’s plate discipline statistics like Z-pass and suppression of contact within the zone. So it’s not clear whether Roark reached the stabilization points for key parts of his performance.

But this is more or less where I will have to leave it. Figuring out the stabilization point for Z-pass is beyond the scope of the present study. Indeed, my post has probably pushed us close to overload on everything we ever wanted to know about Tanner Roark! By the same token, it’s not clear that learning more about Roark’s statistical profile would shift our opinion much about his prospects for future performance. This is what I think we have to consider:

In an intuitively small sample size, Roark put up a consistent portfolio of excellent fielding-independent stats: on limiting zone-swings, limiting contact in the zone, and limiting good contact. Very broadly, the size of a sample has to be balanced with the consistency of the evidence within it. Just imagine watching a one-round boxing match in which one competitor knocks the other one down three times. This is a small sample which tells a very compelling story about the respective abilities of the boxers. Roark’s sample size is larger, of course, and his performance was not as dominant. Nonetheless, his limited 2013 season is packed with a lot of positive indicators.

Here are a few final comments about what Roark might do in the future, framed in terms of his ERA, FIP, and xFIP:

[Image: Roark’s ERA, FIP, and xFIP]

As we discussed above, the delta between Roark’s ERA and FIP is primarily a matter of his low BABIP and his very low homer rate. Although Roark’s BABIP will probably go up, there are signs he may be better than average at suppressing hits: he showed a tendency to induce ground balls and infield flies, and the latter especially hold BABIP down.

Roark’s very low homer rate pulls down both his ERA and his FIP. Although his 0.17 homers per 9 innings will almost certainly go up, there are signs he may be better than average at suppressing home runs... signs which are distinct, that is, from his one homer allowed in 53 2/3 major league innings!! Roark’s tendencies toward ground balls, infield flies, and line drives are all anti-homer measures. These tendencies flow, by hypothesis, from his ability to inhibit good contact by throwing visually ambiguous pitches.

The most plausible view by far is that Roark will regress towards league average in future years. But accepting this view should not deprive us of optimism. Roark could go back at least one standard deviation on each of the ERA-like measures and still be at league average or better than league average. That’s a good position for any pitcher. It’s a great position, albeit a paradoxical one, for a pitcher who is currently slated to compete for no better than the 5th spot in the Washington Nationals’ 2014 starting rotation!! Suffice it to say I think that Roark ought to receive full consideration for the opportunities available to him.


What if Desmond and Zimmermann were free agents this year?

This is a quick post to get things started, applying a formula often used on Fangraphs.com to estimate free-agent contracts. All or most Nationals fans are aware that the team recently signed SS Ian Desmond and starting RHP Jordan Zimmermann to two-year contracts, which cover the remaining period before both players are eligible for free agency in 2016. One alternative outcome was that the players went to arbitration, but another possibility of course was that the players signed long-term extensions, and the Nationals secured the services of two of their best players for many years to come.

Although we haven’t heard much about the negotiations, it appears that the respective sides in both cases were very far apart regarding valuations of the players. We can be more confident about this with respect to Zimmermann, who has publicly taken a somewhat firm line about his contract discussions with the Nationals…but, hey, Desmond didn’t sign a long-term contract last week either!

As is routinely pointed out, when a player is at least two years away from free agency, his team can often sign him for less money than he might get on the free agent market a couple of years down the road, and indeed for less money than similar players may be getting in free agency at the time.

In some quarters, this dynamic has acquired the title of the “hometown discount.” If this title suggests that the player is doing the team a favor, then I think it’s misleading. When a player re-signs with his team in this situation, he typically gets a large raise, effective immediately; in addition he gets — now, not two years down the road — a long-term guaranteed contract. Importantly, he thereby avoids the risks of injury, of decline in his performance, or of stiff competition, or a lack of buyers, in the year in the future when he actually goes on the free agent market. Some of these benefits are not easily measured in terms of dollars, but they are concrete benefits all the same, which do not reflect in any way the goodness of the player’s soul.

I mention this because I cannot entirely shake the thought that an unreformed conception of the hometown discount has affected some recent contract negotiations with the Nationals. On the other hand, both Desmond and Zimmermann have played excellent baseball over the past few seasons, and accordingly their value may have somewhat outrun what fans like myself — and management? — were expecting for these guys in the way of long-term contracts. Is Zimmermann the best starter on the Nationals? Is Desmond the best shortstop in baseball? These are somewhat melodramatic questions, of course, but the last couple of years have given them a foothold.

One way to impose a little bit of objective order on this situation is to look at what Desmond and Zimmermann might get now if they were free agents. This exercise does not directly address the question of what the Nationals should have offered them, but it does set a provisional upper limit on the answer. Provisionally, the Nats shouldn’t pay what the players might get as free agents, because the players are not currently free agents. As we rehearsed above, a contract extension now can offer benefits which are otherwise unavailable to the players, and which offset a lower contract dollar value. A look at hypothetical free-agent valuations is nonetheless instructive, because it suggests that the Nationals could reasonably have offered less.

Fangraphs uses the Wins Above Replacement statistic, coupled with observation-based estimates of how much teams are currently paying for Wins Above Replacement in the free agent market, to project what players might get in free agency. So we have WAR and the $x.xx/WAR being paid in the free agent market. The model looks ahead in two ways. It assumes that player performance will go down, and that the dollar price for performance will go up. More expansively, the model assumes that players will decline as they age, and it assumes that the dollar value of WAR will increase over time, perhaps as more money comes into the game, as competition increases, and indeed as the dollar itself inevitably loses value.

Dave Cameron recently used the model to project a $230M/7 contract for Clayton Kershaw. Here is what the model says about six-year free agent contracts for Desmond and Zimmermann, if these contracts were to start in 2014. All dollar figures are of course in millions.

[Image: projected six-year free-agent contracts for Desmond and Zimmermann]

By way of full disclosure, I did not religiously apply the Fangraphs standard of a 0.5 per year decline in WAR for Zimmermann. He had about 3.5 WAR in 2013, and for present purposes I assumed he would replicate this in 2014…whereas I tuned down Desmond’s 5.0 WAR in 2013 to 4.5 in 2014. Different projections of WAR would have yielded different overall contract estimates. Nonetheless the above figures have a whiff of plausibility about them, and therefore some explanatory power.
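For readers who want to see the arithmetic, here is a minimal Python sketch of a model of this kind. The 0.5-per-year WAR decline and the 2014 starting WAR values (4.5 and 3.5) come from the text above. The starting price of $6M per WAR and the 5% annual increase in that price are assumptions chosen for illustration and are not stated in this post. The function name `project_contract` is likewise hypothetical.

```python
def project_contract(first_year_war, years=6, war_decline=0.5,
                     dollars_per_war=6.0, inflation=0.05):
    """Estimate a contract total in $M.

    Each year's value is projected WAR times the projected price of a win.
    dollars_per_war and inflation are illustrative assumptions, not figures
    taken from the post.
    """
    total = 0.0
    for year in range(years):
        # Performance is assumed to decline by a fixed amount each year.
        war = max(first_year_war - war_decline * year, 0.0)
        # The market price of a win is assumed to grow at a constant rate.
        price = dollars_per_war * (1 + inflation) ** year
        total += war * price
    return total


DESMOND_2014_WAR = 4.5      # tuned down from 5.0 in 2013, as described above
ZIMMERMANN_2014_WAR = 3.5   # assumed to repeat his 2013 figure

if __name__ == "__main__":
    print(f"Desmond:    ${project_contract(DESMOND_2014_WAR):.1f}M over 6 years")
    print(f"Zimmermann: ${project_contract(ZIMMERMANN_2014_WAR):.1f}M over 6 years")
```

Under these assumed inputs the sketch comes out at roughly $130M for Desmond and $89M for Zimmermann. Changing the starting WAR, the price per win, or the inflation rate moves the totals considerably, which is the point made above about different projections yielding different estimates.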

I suspect that $130M/6 is significantly more than the Nationals are ready to pay for Ian Desmond. Although this is, importantly, a free-agent estimate, my guess is the Nats think that $130M/6 is a satellite in the wrong stratosphere. However, Desmond’s back-to-back 5.0 WAR seasons at shortstop represent a transformative and inarguably elite level of performance. So we all may have a little catching up to do regarding Desmond’s value and prospective salary.

Turning to Zimmermann, I suspect that, as he sees it, $89M/6 is too low. And again, this is an estimate of what Zimmermann might get as a free agent, so our theory is that the Nats could reasonably have offered less.

However, it is all too easy to imagine that Zimmermann is moved by free agent contracts for pitchers on the scale of Masahiro Tanaka’s recent deal with the Yankees, which was $155M/7 (with some opt-outs and the like). Another recent deal in this range is Zack Greinke’s $147M/6 contract with the Dodgers.

Admittedly, both of these contracts very likely overshoot the dollar-per-WAR benchmarks at work in the above tables. There is in addition some correlative doubt as to whether these contracts will turn out to be good investments for the teams. Finally, the teams in question, the Yankees and the Dodgers, are notoriously big spenders. Even between the two of them, however, they don’t sign everyone. So their contracts are not perfectly predictive of what any given free agent might get in any given year.

Nonetheless, the Tanaka and Greinke contracts represent recent, salient, concrete data points which might shape a prospective free agent’s thinking. Of course, Tanaka signed his contract after Zimmermann signed his two-year extension; but the idea that Tanaka would be getting a very large contract from someone was established in the media fairly early in the offseason.

This is easy to say now, but I had been worried for some time about the effect of Tanaka vis-à-vis Zimmermann. After all, Tanaka has pitched zero innings in the majors! And how do you think Zimmermann would perform over in the Japanese league? Jordan Zimmermann is not in any rush to take approximately 50% less than Masahiro Tanaka.

Greinke too may be exerting a recalcitrant effect on Zimmermann’s thinking. Greinke had only 2.9 WAR last year; Zimmermann has posted 3.0 WAR or better in each of the last 3 seasons. Zimmermann might not cite his WAR figures in verbal or written discussion; but they may not be too far removed from the mindset that’s driving him. (Of course, Greinke posted a 9.1 WAR season in 2009... but that was ages ago!)

The second and final year of the deal that Zimmermann reached with the Nationals was for $16.5M in salary. Multiplying that by six years yields $99M. Unless the Nats have simply given up, the short-term agreement they reached with Zimmermann seems to indicate some amenability on their part towards a longer contract on this scale. After all, a contract extension for a prospective free agent is not likely to involve a pay cut. Notably, $99M over six years is above the Fangraphs estimate. Our working assumption is that this would be generous since Zimmermann is not currently a free agent. However, Zimmermann has powerful reasons to oppose a contract on this scale.
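Since the contracts mentioned above run for different lengths, one rough way to line them up is by average annual value. The sketch below uses only the totals and lengths quoted in this post. It ignores opt-outs and any other contract details, and the names `CONTRACTS` and `average_annual_value` are made up for the example.

```python
# (total in $M, years), using only the figures quoted in the post.
CONTRACTS = {
    "Tanaka (Yankees)": (155.0, 7),
    "Greinke (Dodgers)": (147.0, 6),
    "Zimmermann (free-agent estimate)": (89.0, 6),
    "Zimmermann ($16.5M x 6)": (16.5 * 6, 6),
}


def average_annual_value(total, years):
    """Average annual value in $M: contract total divided by its length."""
    return total / years


if __name__ == "__main__":
    for name, (total, years) in CONTRACTS.items():
        aav = average_annual_value(total, years)
        print(f"{name}: ${total:.0f}M/{years} -> ${aav:.1f}M per year")
```

Average annual value is a blunt measure, but it makes the gap between the model’s estimate for Zimmermann and the Tanaka and Greinke deals easy to see.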