Yes, I watch “House”. I want to return to the question of whether the snowfall in Antarctica is normally distributed, as it bears on this claim from the abstract of van Ommen and Morgan:

The precipitation anomaly of the past few decades in Law Dome is the largest in 750 years, and lies outside the range of variability for the record as a whole, suggesting that the drought in Western Australia may be similarly unusual.

The relevant passage in the supplementary information, where the significance of the past few decades was tested with a t-test and normality was said to be confirmed, follows:

The use of a t-test does assume a normal distribution for events. Tests for non-normality confirmed the validity of this assumption: accepting the null hypothesis (Kolmogorov-Smirnov, P = 0.28, Lilliefors, P = 0.88). Also, inspection of a quantile-quantile plot supports the assumption of normally distributed data.

These ‘events’, as you recall, are defined by the points where the smoothed Law Dome snowfall series crosses the mean, giving 56 variable-length periods of snow accumulation or deficit. I can’t even think about the statistics of such a thing, though when I attempted to replicate it in the first posts of the series, I found the last event of snow accumulation (since 1970) to be less significant than claimed.
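To make the construction of these events concrete, here is a minimal Python sketch. It is only an illustration of the idea, not the paper's procedure: the centred moving average, the `half_width` parameter, the function names, and the use of the summed anomaly as the event size are all assumptions of mine.

```python
from statistics import fmean

def moving_average(series, half_width):
    """Centred moving average; the window is truncated at both ends."""
    n = len(series)
    out = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        out.append(fmean(series[lo:hi]))
    return out

def split_into_events(series, half_width):
    """Split a series into runs where the smoothed series stays on one
    side of the long-term mean.

    Returns a list of (start, end_exclusive, summed_anomaly) tuples.
    The summed anomaly is an assumed measure of event size.
    """
    mean = fmean(series)
    smoothed = moving_average(series, half_width)
    events = []
    start = 0
    for i in range(1, len(series) + 1):
        # An event ends at the end of the record, or where the smoothed
        # series moves to the other side of the mean.
        if i == len(series) or (smoothed[i] > mean) != (smoothed[start] > mean):
            total = sum(x - mean for x in series[start:i])
            events.append((start, i, total))
            start = i
    return events

# Toy series (not the Law Dome data), no smoothing.
for event in split_into_events([1, 1, 3, 3, 1, 1, 3, 3], half_width=0):
    print(event)
```

Note that the events have different lengths by construction, which is part of why their statistics are awkward to reason about.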

So here I am going to do the same analysis on the raw and aggregated data (which, as I explained previously, is more robust) and try to see how significant the last few decades really are.

There is an R package called nortest that contains the following tests of normality:

ad.test Anderson-Darling test for normality
cvm.test Cramer-von Mises test for normality
lillie.test Lilliefors (Kolmogorov-Smirnov) test for normality
pearson.test Pearson chi-square test for normality
sf.test Shapiro-Francia test for normality

Running each of these on the raw snowfall data gives P values of 3×10^-6, 6×10^-5, 2×10^-3, 0.04 and 6×10^-6 respectively. That is, normality is rejected by every test at the 5% level. Here is the distribution of snowfall at Law Dome (LD), with the normal distribution in red.
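For readers who want to see what a test of this kind does, here is a rough Python sketch of a Lilliefors-style test. It is not the nortest implementation and does not use the Law Dome data: the P value is estimated by simulation, and the function names, `n_sim`, `seed` and the synthetic lognormal sample are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, fmean, stdev

def ks_normal_statistic(sample):
    """KS distance between the sample and a normal fitted to that sample."""
    fitted = NormalDist(fmean(sample), stdev(sample))
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d

def lilliefors_p_value(sample, n_sim=500, seed=0):
    """Monte Carlo P value: the share of normal samples of the same size
    whose KS distance is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = ks_normal_statistic(sample)
    n = len(sample)
    exceed = 0
    for _ in range(n_sim):
        # The statistic does not depend on location or scale, so
        # standard normal draws are enough.
        sim = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if ks_normal_statistic(sim) >= observed:
            exceed += 1
    return (exceed + 1) / (n_sim + 1)

# Synthetic skewed sample standing in for real data.
rng = random.Random(42)
skewed = [rng.lognormvariate(0.0, 1.0) for _ in range(200)]
print(lilliefors_p_value(skewed))
```

The key point is that the mean and standard deviation are estimated from the same sample being tested, so the reference distribution of the KS distance has to account for that. A plain Kolmogorov-Smirnov test with estimated parameters tends to give P values that are too large.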

fig6

To me this looks more like a lognormal distribution, so I will next estimate its parameters in order to test how significant the snowfall in the last few decades has been. But first, here is the figure produced by assuming a normal distribution, where the red dashed and dotted lines are the 2 and 3 sigma significance limits respectively, and the black line is the average snowfall at a range of aggregations, from 1 through to 100 years.

fig5

The snowfall in the most recent period becomes significantly high once the data are aggregated to more than 25 years, peaks, then declines to meander along the 95% confidence limit from aggregations of about 40 years on. Remember that this is only an estimate based on an assumed normal distribution, so the suspicion is that when the distribution is properly estimated, we may drop below the statistical 95% threshold.
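One plausible way to compute a curve like this is sketched below in Python. It is an assumption about the calculation, not a record of the code behind the figure: it treats annual values as independent and normal, so the mean of the last k years has standard error sd / sqrt(k), and the function name and `max_window` parameter are made up for the example.

```python
from math import sqrt
from statistics import fmean, stdev

def recent_mean_z_scores(series, max_window):
    """For each window k, the z-score of the mean of the last k values.

    Assumes independent, normally distributed annual values, so the
    mean of k values has standard error sd / sqrt(k).
    """
    mean = fmean(series)
    sd = stdev(series)
    scores = {}
    for k in range(1, max_window + 1):
        recent = fmean(series[-k:])
        scores[k] = (recent - mean) / (sd / sqrt(k))
    return scores

# Toy series (not the Law Dome data): a flat record ending in a high run.
toy = [1.0, -1.0] * 20 + [1.5, 2.0, 1.0, 2.5]
for k, z in recent_mean_z_scores(toy, 6).items():
    # Flag windows beyond the 2 and 3 sigma limits.
    flag = "3 sigma" if abs(z) > 3 else "2 sigma" if abs(z) > 2 else ""
    print(k, round(z, 2), flag)
```

In this sketch the long-term mean and standard deviation include the recent years themselves, which pulls the z-scores slightly toward zero. Autocorrelation in the real series would also widen the limits relative to the independent case.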

For comparison, here is the previous figure, which looked at the improbability of the final event by calculating the set of ‘events’ and their standard deviations for a range of filter sizes:

fig2LDL

The size of the final event becomes 3-sigma significant at around half-filter size 4 and declines thereafter. In other words, the final event is less significant at scales between 8 and 40 years when we use a robust aggregation approach and a normal distribution.

The next post will be the same figure, only using a lognormal distribution to test the significance of the final event.