Global Warming Statistics

It is often stated that global temperature has increased over some specific time frame. Few realize that there are different ways to answer this question, and that the increase may not actually be significant, particularly in view of long-term persistence (LTP), the persistent correlation of temperature over long time scales.

In Statistical analysis of hydroclimatic time series: Uncertainty and insights, Koutsoyiannis evaluates two publications that take different approaches to this issue: the evaluation of trends, as in Cohn, T. A., and H. F. Lins (2005), or the simple change in temperature between two points, as in Rybski et al. (2006).

Continue reading Global Warming Statistics

Spurious Regression Random Walk

The Draft Garnaut Report is to be commended for commissioning a study, Global temperature trends, by Breusch and Vahid (BV), two prominent Australian National University (ANU) econometricians, to examine global temperature series. The approach they take to modeling temperature has a long history; see for example RealClimate, Rybski, and Koutsoyiannis. Their finding that the significance barely reaches the 95% level with these kinds of models is not inconsistent with any of them. Even if there are no new insights, independent statistical studies by experts outside the field help to build trust and confidence in a controversial issue.

However, the results of the study are no reason for ‘high fives’ among proponents of man-made global warming. Despite concluding that “the temperatures recorded in most of the last decade lie above the confidence level produced by any model that does not allow for a warming trend”, the study also reveals just how close the case for anthropogenic global warming is to a spurious regression on a random walk.

Below are some of the issues that have been raised about the paper on the various climate blogs.

Continue reading Spurious Regression Random Walk

Hurst Coefficient Software

Long-range dependence is being identified in many disciplines, such as networking, databases, economics, climate and biodiversity. LTP is competing with the sexy “long tail” for top spot as a theory of cultural consumption. Thus, the need for software offering complete long-range dependence analysis is crucial.

While there are some steps in this direction, none are yet completely satisfactory. For one, the Hurst exponent cannot be calculated in a definitive way; it can only be estimated. Second, there are several different methods for estimating the Hurst exponent, but they often produce conflicting estimates, and it is not clear which of the estimators are most accurate.

A first step towards a systematic approach to estimating self-similarity and long-range dependence is SELFIS, a Java-based tool that automates self-similarity analysis.

In R there is the fracdiff package on CRAN for fitting ARFIMA(p,d,q) models with fractional ‘d’, and there is a one-to-one relationship between ‘d’ and the Hurst parameter for these models: d = H – 1/2. These estimators are better than the classical R/S method, which is known to be far from optimal.
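For readers without R, here is a minimal Python sketch of the same ideas (the function names and the use of the aggregated-variance estimator are my choices, not fracdiff's API): generate an ARFIMA(0,d,0) series by truncated fractional differencing, then recover H from how the variance of block means decays with block size.

```python
import numpy as np

def fd_noise(n, d, seed=0, trunc=1000):
    """ARFIMA(0,d,0) noise via the truncated moving-average expansion.
    For these models H = d + 1/2, so d = 0.2 gives H = 0.7."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n + trunc)
    psi = np.empty(trunc)
    psi[0] = 1.0
    for k in range(1, trunc):
        psi[k] = psi[k - 1] * (k - 1 + d) / k  # fractional-differencing weights
    return np.convolve(e, psi, mode="full")[trunc:trunc + n]

def hurst_aggvar(x):
    """Aggregated-variance estimate of H: Var(block means) ~ m**(2H - 2)."""
    n = len(x)
    ms = np.unique(np.logspace(np.log10(4), np.log10(n // 16), 12).astype(int))
    v = [np.var(x[:(n // m) * m].reshape(-1, m).mean(axis=1)) for m in ms]
    slope, _ = np.polyfit(np.log(ms), np.log(v), 1)
    return 1 + slope / 2

h = hurst_aggvar(fd_noise(8192, d=0.2, seed=42))  # true H is 0.7
```

As the post notes, different estimators give conflicting answers; the aggregated-variance method here is just one of the alternatives to R/S.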

A useful summary of the issues, with references to other resources, is Estimating the Hurst Exponent.

Demonstrating just how pervasive these concepts are in our daily life is the Physics of Fashion Fluctuations by R. Donangelo, A. Hansen, K. Sneppen and S. R. Souza. Here a simple model for the emergence of fashions — goods that become popular not due to any intrinsic value, but simply because “everybody wants it” — in markets where people trade goods shows the spontaneous emergence of random products as money. The model supports collectively driven fluctuations characterized by a Hurst exponent of about 0.7.

Scale Invariance for Dummies is an investigation of scale invariance or long term persistence (LTP) in time series including tree-ring proxies – the recognition, quantification and implications for analysis – drawn largely from Koutsoyiannis.

Anybody know of any commercial packages dealing with the Hurst coefficient?

Niche Modeling. Chapter Summary

Here is a summary of the chapters in my upcoming book Niche Modeling to be published by CRC Press. Many of the topics have been introduced as posts on the blog. My deepest thanks to everyone who has commented and so helped in the refinement of ideas, and particularly in providing motivation and focus.

Writing a book is a huge task, much of it a slog, and it’s not over yet. But I hope to get it to the publishers so it will be available at the end of this year. Here is the dustjacket blurb:

Through theory, applications, and examples of inferences, this book shows how to conduct and evaluate ecological niche modeling (ENM) projects in any area of application. It features a series of theoretical and practical exercises in developing and evaluating ecological niche models using a range of software supplied on an accompanying CD. These cover geographic information systems, multivariate modeling, artificial intelligence methods, data handling, and information infrastructure. The author then features applications of predictive modeling methods with reference to valid inference from assumptions. This is a seminal reference for ecologists as well as a superb hands-on text for students.

Part 1: Informatics

Functions: This chapter summarizes the major types, operations and relationships encountered in the book and in niche modeling. This and the following two chapters could be treated as a tutorial in R. For example, the main functions for representing the inverted ‘U’ shape characteristic of a niche – step, Gaussian, quadratic and ramp functions – are illustrated in both graphical form and R code. The chapter concludes with the ACF and lag plots, in one or two dimensions.
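The book illustrates these in R; as a language-agnostic sketch (function names and parameterizations are my own), the four inverted-‘U’ niche response shapes can be written as:

```python
import numpy as np

def step(x, lo, hi):
    """Environmental envelope: 1 inside [lo, hi], 0 outside."""
    return np.where((x >= lo) & (x <= hi), 1.0, 0.0)

def gaussian(x, opt, width):
    """Bell-shaped response peaking at the environmental optimum."""
    return np.exp(-((x - opt) / width) ** 2)

def quadratic(x, opt, width):
    """Inverted parabola, truncated at zero."""
    return np.maximum(1.0 - ((x - opt) / width) ** 2, 0.0)

def ramp(x, lo, opt, hi):
    """Piecewise-linear rise to the optimum and fall beyond it."""
    return np.interp(x, [lo, opt, hi], [0.0, 1.0, 0.0])
```

All four peak at the environmental optimum and fall to zero outside the tolerated range; they differ only in the shape of the shoulders.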

Data: This chapter demonstrates how to manage simple biodiversity databases using R. By using data frames as tables, it is possible to replicate the basic spreadsheet and relational database operations with R’s powerful indexing functions. While a database is necessary for large-scale data management, R can eliminate conversion problems as data is moved between systems.
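The chapter’s examples are in R; the same idea in schematic form (toy data, Python standard library only, table and column names invented for illustration) looks like:

```python
# Toy species-occurrence "tables" as rows of dicts, standing in for data frames
occ = [
    {"site": 1, "species": "A"},
    {"site": 1, "species": "B"},
    {"site": 2, "species": "A"},
]
sites = [
    {"site": 1, "lon": 145.0, "lat": -37.8},
    {"site": 2, "lon": 146.2, "lat": -36.9},
]

# Selection, as in occ[occ$species == "A", ] in R
a_only = [row for row in occ if row["species"] == "A"]

# Inner join on the shared "site" key, as in merge(occ, sites)
lookup = {s["site"]: s for s in sites}
joined = [{**row, **lookup[row["site"]]} for row in occ]
```

Selection and join are the two relational operations a spreadsheet user most often needs; everything else (projection, aggregation) composes from the same indexing idea.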

R and image processing operations can perform many of the elementary spatial operations necessary for niche modeling. While these do not replace a GIS, they demonstrate that generalizing arithmetic concepts to images allows simple spatial operations to be implemented efficiently.
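A minimal illustration of the idea (hypothetical environmental layers, numpy arrays standing in for raster images):

```python
import numpy as np

# Two hypothetical environmental layers on the same 2x2 grid
temp = np.array([[10.0, 15.0],
                 [20.0, 25.0]])
rain = np.array([[200.0, 100.0],
                 [50.0, 300.0]])

# Element-wise arithmetic and logic act as whole-image spatial operations
mask = (temp > 12) & (rain > 80)          # boolean suitability mask
suitability = np.where(mask, 1.0, 0.0)    # a derived layer, with no explicit loops
```

This is the "map algebra" style of spatial analysis: each arithmetic or logical operator, applied to whole grids, is a spatial operation.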

Part 2: Modeling

Theory: Set theory helps to identify the basic assumptions underlying niche modeling, and the relationships and constraints between these assumptions. The chapter shows that the standard definition of the niche as environmental envelopes is equivalent to a box topology. It is proven that when extended to infinite dimensions of environmental variables this definition loses the property of continuity between environmental and geographic spaces. Using the product topology for niches would retain this property.
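The topological distinction can be stated compactly; this is a sketch of the standard definitions, not the book's proof:

```latex
% Environmental envelope niche over variables indexed by I: a box
N = \prod_{i \in I} [a_i, b_i]

% Box topology: basic open sets may constrain every coordinate
U_{\mathrm{box}} = \prod_{i \in I} U_i

% Product topology: basic open sets constrain only finitely many coordinates
U_{\mathrm{prod}} = \prod_{i \in I} U_i
\quad \text{with } U_i = X_i \text{ for all but finitely many } i
```

With infinitely many environmental variables, the box topology has strictly more open sets than the product topology, so a map that is continuous in the product topology can fail to be continuous in the box topology.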

Continue reading Niche Modeling. Chapter Summary

Random numbers predict future temperatures

Previously “A New Temperature Reconstruction” used random data with long term persistence (LTP) to illustrate the circular reasoning behind the ‘hockey stick’ reconstruction of past temperatures. This one shows the potential for false positives due to the statistics used in the ‘hockey stick’. The dynamic simulation below shows future temperatures predicted using a random fractional differencing algorithm that generates realistic LTP behavior. Future temperatures and validation statistics are calculated each time the page is reloaded. One unusual statistic used in MBH98 suggests the future can be predicted using random numbers.

Note: This is a first version of the application; it may contain errors and be improved considerably. The code is freely available under the GPL in order to promote open science. See The Reference Frame for more information.

Reload page for new prediction. Measured and predicted future temperatures, with years on the x axis, and temperature anomalies on the y axis. The measured temperatures are in blue and the simulated temperatures are in red. Black points are measured temperatures for years in the validation period.

Continue reading Random numbers predict future temperatures

In Praise of Numeracy

Mathematical shapes can affect our lives and the decisions we make.

The hockey stick graph, describing the earth’s average temperature over the last millennium, has been the subject of a controversial debate over the reliability of methods of statistical analysis.


The long tail is another new icon, described in a new book by Chris Anderson, developed in the Blogosphere, called “The Long Tail”:

Forget squeezing millions from a few megahits at the top of the charts. The future of entertainment is in the millions of niche markets at the shallow end of the bit stream. Chris Anderson explains all in a book called “The Long Tail”. Follow his continuing coverage of the subject on The Long Tail blog.

As explained in Wikipedia:

The long tail is the colloquial name for a long-known feature of statistical distributions (Zipf, power laws, Pareto distributions and/or general Lévy distributions). The feature is also known as “heavy tails”, “power-law tails” or “Pareto tails”. Such distributions resemble the accompanying graph.

In these distributions a low frequency or low-amplitude population that gradually “tails off” follows a high frequency or high-amplitude population. In many cases the infrequent or low-amplitude events—the long tail, represented here by the yellow portion of the graph—can cumulatively outnumber or outweigh the initial portion of the graph, such that in aggregate they comprise the majority.
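A quick numerical illustration of the tail outweighing the head (a Zipf distribution with exponent 1 and a million items, numbers chosen for the example):

```python
import numpy as np

ranks = np.arange(1, 1_000_001)   # a million products ranked by popularity
w = 1.0 / ranks                   # Zipf weights, exponent 1
p = w / w.sum()                   # normalize to a probability distribution

head = p[:100].sum()              # the 100 "megahits"
tail = p[100:].sum()              # everything else: the long tail
# individually tiny, the tail items outweigh the megahits in aggregate
```

With these numbers the 100 most popular items take only about a third of the total, and the long tail takes the rest.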

Continue reading In Praise of Numeracy

AIG Article

The Australian Institute of Geoscientists News has published online my article “Reconstruction of past climate using series with red noise” on page 14. Many thanks to the editor, Louis Hissink, for the rapidity of this publication. It is actually a very interesting newsletter, with articles on the IPCC and a summary of the state of the hockey stick (or hokey stick). There are also articles on the K-T boundary controversy and how to set up an exploration company.

Reconstructing the hokey stick with random data neatly illustrates the circular reasoning in a general context, showing that the form of the hokey stick is essentially encoded in the assumptions and procedures of the methodology. The fact that 20% of LTP series (or 40% if you count the inverted ones) correlate significantly with the instrumental temperature record of the last 150 years illustrates (1) that 150 years is an inadequate constraint on possible models on which to base an extrapolation over 1000 years, and (2) the propensity of natural series with LTP to exhibit ‘trendiness’, or apparent long runs that can be mistaken for real trends. And check back shortly for the code; I have been playing around with RE and R2 and trying some ideas suggested by blog readers to tighten things up.

With the hokey stick discredited from all angles, even within the paleo community itself, where recent reconstructions by Esper and Moberg show large variation in temperature over the last 1000 years, including temperatures on a par with the present day, one wonders why it is taking so long for the authors of the hokey stick to recant and admit natural climate variability. While the past variability of climate may or may not be important to the attribution debate, it is obviously important on the impacts side, as an indicator of the potential tolerances of most species.

RE of random reconstructions

To follow up on the last post, I have calculated the RE as well as the R2 statistics for the reconstruction from the random series. The same approach was used: generate 1000 sequences with LTP, select those with positive slope and R2 > 0.1, calibrate on a linear model, and average. Here is the reconstruction again, with the test and training periods marked with a horizontal dashed line (test period to the left, training to the right of the temperature values):
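For readers unfamiliar with these verification statistics, the definitions are straightforward (a sketch; the function and variable names are mine):

```python
import numpy as np

def reduction_of_error(obs, pred, calib_mean):
    """RE: skill relative to a no-knowledge prediction of the
    calibration-period mean; RE > 0 is conventionally read as skill."""
    sse = np.sum((obs - pred) ** 2)
    ss0 = np.sum((obs - calib_mean) ** 2)
    return 1.0 - sse / ss0

def r2(obs, pred):
    """Squared Pearson correlation between observed and predicted."""
    return np.corrcoef(obs, pred)[0, 1] ** 2
```

Note that r2 is insensitive to offset and scale, while RE rewards any prediction that beats the calibration mean, which is why the two can tell very different stories about the same reconstruction.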

Continue reading RE of random reconstructions

Scale invariance for Dummies

Below is an investigation of scale invariance or long term persistence (LTP) in time series including tree-ring proxies – the recognition, quantification and implications for analysis – drawn largely from Koutsoyiannis [2] (preprints available here). In researching this topic, I found a lot of misconceptions about LTP phenomena, such as LTP implying a long term memory process, and a lack of recognition of its implications. As to implications, the standard error of the mean of global temperatures at 30 data points is 4 times larger than the usual estimate for normal errors. Given that LTP is a fact of nature – attributed by Koutsoyiannis to the maximum entropy (ME) principle – this strongly suggests that H should be considered in all hypothesis testing.
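The inflation factor follows from Koutsoyiannis’ adjusted standard error of the mean, σ·n^(H−1), versus the classical σ/√n; their ratio is n^(H−1/2). A sketch (the value H = 0.91 is my choice, picked to reproduce the roughly fourfold factor quoted above):

```python
def se_inflation(n, H):
    """Ratio of the LTP standard error of the mean, sigma * n**(H - 1),
    to the classical sigma / sqrt(n); equals n**(H - 0.5)."""
    return n ** (H - 0.5)

factor = se_inflation(30, 0.91)   # roughly 4 for ~30 points and H near 0.9
```

At H = 0.5 (no persistence) the factor is exactly 1 and the classical formula is recovered; the inflation grows with both H and the sample size.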
Continue reading Scale invariance for Dummies