14 Jul
Smooth Operator
Posted by David Stockwell in Climate, Statistics, Theory
Table of contents for Sea Level
- A semi-empirical approach to sea level rise
- Smooth Operator
The previous post reported a replication of the highly influential Rahmstorf 2007 paper, A Semi-Empirical Approach to Sea Level Rise, one of the main sources of projected sea level rise.
Steve McIntyre showed that, in a now discredited (and disowned) Rahmstorf et al 2007 publication, Rahmstorf had pulled an elaborate stunt on the community by dressing up a simple triangular filter as “singular spectrum analysis” with “embedding dimensions”. I can now report another, perhaps even more spectacular, stunt.
His Figure 2 is crucial, as it is where the correlation between the rate of sea level increase, deltaSL, and the global temperature, Temp, is established. If these were not correlated, then there would be no basis for his claims of a significant “acceleration” in the increase in sea level when temperature increases, and his estimates of sea level rise by 2100 would not be nearly so high.
It is well known that smoothing introduces spurious autocorrelations into data that can artificially inflate correlations, and one of the comments on his paper (attached to the first link above) picked up on this. Rahmstorf’s procedure applies no fewer than five different types of smoothing to produce his Figure 2:
1. singular spectrum analysis (the first EOF)
2. padding the end of the series with a linear extrapolation of 15 points
3. convolution (15 point filtering)
4. calculating the linear trend from 15 points (on the sea level data only)
5. binning with bins of size 5
I replicated his procedure in the previous post in the series. Here, the entire procedure is replaced with a single binning step (averaging each successive m data points). The figure below compares the Rahmstorf procedure at parameters m=13:16 (red line) with the result of binning the same data into bins of size m=13:16 (black line). The sea level data is differenced after binning to get deltaSL.
The only real difference is that the Rahmstorf method has a few extra points at the high temperature end, the right hand end of the graph. These are produced by the padding, which introduces artificial data.
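The binning replacement described above can be sketched in a few lines. This is only a minimal illustration, not the code used for the figure: the function names are hypothetical, an incomplete final bin is assumed to be dropped, and pairing each difference with the mean temperature of the two adjacent bins is an assumption, since the post does not say how the two series were aligned.

```python
def bin_means(values, m):
    """Average each successive block of m values; an incomplete tail is dropped (assumption)."""
    n_bins = len(values) // m
    return [sum(values[i * m:(i + 1) * m]) / m for i in range(n_bins)]


def diff(values):
    """First differences of a sequence."""
    return [b - a for a, b in zip(values, values[1:])]


def binned_rate_vs_temp(sea_level, temp, m):
    """Bin both series, then difference the binned sea level to get deltaSL.

    Each difference is paired with the mean temperature of the two bins it
    spans. That alignment is an assumption made for this sketch.
    """
    sl_binned = bin_means(sea_level, m)
    t_binned = bin_means(temp, m)
    delta_sl = diff(sl_binned)
    t_mid = [(a + b) / 2 for a, b in zip(t_binned, t_binned[1:])]
    return delta_sl, t_mid
```

Calling `binned_rate_vs_temp(sea_level, temp, 15)` on two equal-length annual series would return the points to plot against each other, with no smoothing step anywhere in the chain.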
That Rahmstorf was ignorant of the padding in the algorithms was shown when he categorically asserted on RC that they did “not use padding”:
you equate “minimum roughness” with “padding with reflected values”. Indeed such padding would mean that the trend line runs into the last point, which it does not in our graph, and hence you (wrongly) conclude that we did not use minimum roughness. The correct conclusion is that we did not use padding.
The effect of the padding is to duplicate points in the part of the curve where the temperature and sea level rise are highest, thus artificially inflating the influence of this part of the graph. If Rahmstorf did not even know that the algorithm was padding the data, it is unlikely he was aware that the padding was influencing the results.
At the end of the day, all of the triangular filters and Rahm-smoothing can be reproduced by binning the data with a bin size equal to the smoothing width. One can speculate on the motive for the padding, as it clearly adds more data at the alarming end.
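To make the padding step concrete, here is a rough sketch of end-padding by linear extrapolation. It assumes the line is a least-squares fit to the last m points and that m new points are appended; the exact rule in Rahmstorf's code may differ, and the function names are hypothetical.

```python
def linear_fit(y):
    """Ordinary least-squares slope and intercept of y against 0, 1, ..., len(y)-1."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def pad_linear(series, m):
    """Append m artificial points extrapolated from a line fitted to the last m points."""
    slope, intercept = linear_fit(series[-m:])
    # The fitted tail occupies positions 0..m-1, so the new points sit at m..2m-1.
    extra = [intercept + slope * (m + k) for k in range(m)]
    return list(series) + extra
```

Every appended value continues the most recent trend, so any later filtering or trend step sees extra points at the end of the record, which is the end where temperature and sea level rise are highest.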
There are inherent advantages in binning. The figure above presents the autocorrelation function of sea level data out to lag 15, on smoothed (black) and unsmoothed (blue) data. That is, one series was smoothed with Rahmstorf’s method (m=15). The smoothed and unsmoothed series were then binned (m=5), differenced (to remove the trend), and the acf plotted. The autocorrelation in the unsmoothed data drops quickly below significance at lag 2. The autocorrelation in the smoothed data persists for longer, shows a significant anticorrelation between lag 6 and 10, and is significant again by lag 15.
The effect of smoothing is to smear out the information in the series, introducing spurious correlation that can seriously affect significance tests. Binning, on the other hand, keeps the information localized to each bin, and uncorrelated.
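The contrast can be illustrated on synthetic data. The sketch below uses white noise rather than the sea level series, and a plain moving average (boxcar) rather than Rahmstorf's full procedure, so it only illustrates the general point. The function names, the seed and the series length are arbitrary choices made for this example.

```python
import random


def moving_average(values, m):
    """Boxcar smooth: mean of every length-m window, advancing one point at a time."""
    return [sum(values[i:i + m]) / m for i in range(len(values) - m + 1)]


def bin_means(values, m):
    """Average each successive, non-overlapping block of m values."""
    n_bins = len(values) // m
    return [sum(values[i * m:(i + 1) * m]) / m for i in range(n_bins)]


def acf(values, lag):
    """Sample autocorrelation at the given lag."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    cov = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return cov / var


rng = random.Random(0)  # fixed seed so the run is repeatable
noise = [rng.gauss(0.0, 1.0) for _ in range(6000)]  # uncorrelated by construction

smoothed_lag1 = acf(moving_average(noise, 15), 1)
binned_lag1 = acf(bin_means(noise, 15), 1)
print(f"lag-1 acf, smoothed: {smoothed_lag1:.2f}  binned: {binned_lag1:.2f}")
```

Adjacent windows of the moving average share 14 of their 15 points, so the smoothed series is strongly autocorrelated even though the input has no structure at all. The bins share no points, so the binned series stays close to uncorrelated.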
While there may be a rationale for binning to reduce noise in the dataset, and capture an inherent ‘climate-relevant’ scale, there is no reason at all for smoothing, and certainly not 5 times!

