Just about everyone is familiar by now with the so-called “hockey stick graph” showing that global temperatures have gone from steady to a sharp increase over the past two decades. The reason for its name is apparent in the graph below: the previous 1000 years are long and flat, like the long handle of a hockey stick, and the past few decades show a sharp increase, like the blade at the end of a hockey stick. A new paper from two statisticians questions whether the data actually produce a hockey-stick-shaped graph at all.

The paper looks at the methods used to estimate a single global temperature series from the hundreds of data sets on temperature from tree rings, ice cores, and other natural phenomena. The dataset used to create the above graph contains 1,209 climate proxies ranging from 8855 BC to 2003 AD, eight global annual temperature aggregates from 1850-2006 AD, and 1,732 local annual temperatures dating 1850-2006 AD. Reducing these series to one time series of global temperatures is a difficult statistical task, especially given the spatial and temporal autocorrelation, missing observations, more covariates than time periods, regime switching, and a weak signal-to-noise ratio.

The basic statistical task is to model the relationship between the longer time series, which are temperature proxies, and the more recent time series, which are actual temperature measurements. If a reliable relationship can be modeled, then actual temperatures can be backcast for periods where such measurements don’t exist, e.g. before 1850, using the proxy variables.
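To make the backcasting idea concrete, here is a minimal sketch in Python. It is not the authors' method and uses no real data: the temperature series, the five proxies, the noise levels, and the `backcast` helper are all made up for illustration, and plain least squares stands in for the far more careful models in the paper. The point is only to show the shape of the problem: fit on the years where temperatures were measured, then predict the years where only proxies exist.

```python
import numpy as np

def backcast(proxies, temps, calib_mask):
    """Fit temps ~ proxies on the calibration rows, then predict every row.

    Hypothetical helper for illustration; only temps[calib_mask] is used.
    """
    # Design matrix with an intercept column.
    X = np.column_stack([np.ones(len(proxies)), proxies])
    # Ordinary least squares on the period where temperatures were measured.
    coef, *_ = np.linalg.lstsq(X[calib_mask], temps[calib_mask], rcond=None)
    return X @ coef

rng = np.random.default_rng(0)
years = np.arange(1000, 2001)
# Synthetic "true" temperature anomaly: a slow random walk (purely invented).
true_temp = np.cumsum(rng.normal(0, 0.02, size=years.size))
# Five synthetic proxies, each a noisy linear function of the true series.
slopes = rng.uniform(0.5, 1.5, size=5)
proxies = true_temp[:, None] * slopes + rng.normal(0, 0.3, size=(years.size, 5))

calib = years >= 1850  # stand-in for the instrumental period
estimate = backcast(proxies, true_temp, calib)

# Compare the error inside and outside the calibration window.
rmse_in = np.sqrt(np.mean((estimate[calib] - true_temp[calib]) ** 2))
rmse_out = np.sqrt(np.mean((estimate[~calib] - true_temp[~calib]) ** 2))
print(f"in-sample RMSE {rmse_in:.3f}, backcast RMSE {rmse_out:.3f}")
```

With real proxies the hard part is everything this toy ignores: thousands of candidate proxies, autocorrelated errors, and missing observations, which is why the choice of model matters so much.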

Here is how the authors summarize their results:

On the one hand, we conclude unequivocally that the evidence for a “long-handled” hockey stick (where the shaft of the hockey stick extends to the year 1000 AD) is lacking in the data. The fundamental problem is that there is a limited amount of proxy data which dates back to 1000 AD; what is available is weakly predictive of global annual temperature. Our backcasting methods, which track quite closely the methods applied most recently in Mann (2008) to the same data, are unable to catch the sharp run up in temperatures recorded in the 1990s, even in-sample… Consequently, the long flat handle of the hockey stick is best understood to be a feature of regression and less a reflection of our knowledge of the truth.

The authors present the following graphs of global temperature generated using Bayesian models:

Long story short, a hockey stick shape may not be the best representation of the data; some of their models produce graphs that look like hockey sticks, and some don’t. As seen in the figure above, using Bayesian methods the long handle of the hockey stick disappears. They conclude that proxy data may not be useful for predicting actual temperatures over periods of several decades, let alone centuries.

However, the authors importantly note that “the temperatures of the last few decades have been relatively warm compared to many of the thousand-year temperature curves sampled from the posterior distribution of our model.” Furthermore, the evidence for global warming comes from a variety of sources and “paleoclimatological reconstructions constitute only one source of evidence in the AGW debate”. Global warming is still real, and still a serious threat, but this single visually compelling piece of evidence does not appear to be the best interpretation of the data.

ADDENDUM: In the comments, Kevin Drum says that he doesn’t see a substantial difference between the two graphs. Fair enough. But I would point out two things about the graphs that illustrate the important differences. In the hockey stick graph the recent run-up in temperature anomalies exceeds the confidence intervals for the past century, whereas in the authors’ graph the recent run-up does not. Thus in the former case we can say the current run-up exceeds previous values in the past 1,000 years, whereas in the latter case we may not have exceeded previous values. Comparing confidence intervals over the entire series emphasizes that fact further.
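For anyone who wants to see that comparison spelled out, here is a rough sketch of one way to check it, assuming you have a set of sampled temperature curves (for example, draws from a posterior). The function name `recent_exceeds_past_band`, the 95% level, and the toy arrays are my own assumptions for illustration, not anything taken from the paper.

```python
import numpy as np

def recent_exceeds_past_band(curves, n_recent, level=0.95):
    """Does the recent central estimate rise above the earlier upper band?

    curves: array of shape (n_samples, n_years), one sampled path per row.
    n_recent: number of final years treated as the "recent run-up".
    """
    # Pointwise upper edge of the interval for every year.
    upper = np.quantile(curves, 0.5 + level / 2, axis=0)
    # Central (median) estimate for the recent years only.
    recent_center = np.median(curves[:, -n_recent:], axis=0)
    # Compare the recent peak against the highest earlier upper bound.
    return bool(recent_center.max() > upper[:-n_recent].max())

# Toy example: 11 flat curves spread evenly between -1 and 1 over 20 years.
spread = np.tile(np.linspace(-1, 1, 11)[:, None], (1, 20))

narrow = spread.copy()
narrow[:, -3:] += 2.0   # recent jump well above the earlier band
wide = spread.copy()
wide[:, -3:] += 0.5     # recent jump that stays inside the earlier band

print(recent_exceeds_past_band(narrow, 3), recent_exceeds_past_band(wide, 3))
```

The same recent rise can land on either side of the line depending on how wide the earlier intervals are, which is exactly the difference between the two graphs.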

As one more illustration I’ll provide another graph from the paper which shows three different models that perform similarly:

Here the difference is clear. If the red line is the correct measure of temperature anomalies, it is a significantly different story than if the green line is correct. The red line here provides a hockey stick graph consistent with the first graph that we are all familiar with, whereas the green and blue lines don’t look like any hockey stick I’ve ever seen. Now I know they do some things a little differently in California, where Kevin is from, than they do here on the east coast, so I could chalk this up to cultural sporting differences… but I saw The Mighty Ducks, Kevin, they were from Anaheim, and their hockey stick handles were flat like the red graph.

Final note: please avoid using the comments to debate anything other than the statistical issues at hand; this is not the place to argue about climate science or climate politics in general. Let’s keep it limited to dimensionality reduction and time series statistical analysis.