It is often stated that global temperature has increased over some specific time frame. Few realize that there are different ways to measure such an increase, and that the increase may not actually be significant, particularly in view of the persistent correlation of temperature over long time scales (long term persistence, or LTP).
In “Statistical analysis of hydroclimatic time series: Uncertainty and insights”, Koutsoyiannis evaluates two publications that take different approaches to this issue: the evaluation of trends, as done in Cohn, T. A., and H. F. Lins (2005), and the simple change in temperature between two points, as in Rybski et al. (2006).
* be full 3D coupled ocean-atmospheric GCMs,
* be documented in the peer reviewed literature,
* have performed a multi-century control run (for stability reasons), and
* have participated in CMIP2 (Second Coupled Model Intercomparison Project).
He observes that there are no actual criteria that show predictive skill. I am glad the IPCC are not designing mobile phone networks or market research software. One of the IPCC models called FOALS (aka Planet Alternating Current) is plotted over at The Blackboard.
The Draft Garnaut Report is to be commended for commissioning a study, “Global temperature trends” by Breusch and Vahid (BV), two prominent Australian National University (ANU) econometricians, to examine global temperature series. The approach they take to modeling temperature has a long history. See for example at RealClimate, Rybski, and Koutsoyiannis. Their finding that the significance of the warming trend barely reaches the 95% level with these kinds of models is not inconsistent with any of them. Even if there are no new insights, independent statistical studies by experts outside the field help to build trust and confidence in a controversial issue.
However, the results of the study are no reason for ‘high fives’ among proponents of man-made global warming. Despite concluding that “the temperatures recorded in most of the last decade lie above the confidence level produced by any model that does not allow for a warming trend”, the study also reveals just how close the case for anthropogenic global warming is to a spurious regression on a random walk.
Below are some of the issues that have been raised about the paper on the various climate blogs.
Here is a simple statistical analysis using linear regression showing global warming of 0.2C this decade (as projected in the IPCC Fourth Assessment Report 2007) is “unlikely”.
Below are graphs for the last ten years and the trend line for global temperatures for four sources from Anthony Watts over the period January 1998 to February 2008. The simple linear regression line through the points shows the 10 year trend.
One of the main claims of the theory of global warming is that greenhouse gases in the atmosphere cause increasing temperatures. If temperatures stop increasing for long enough, while greenhouse gases such as CO2 continue to rise, then we could be justified in not believing the theory.
The basic numeracy skill from statistics is the hypothesis test. To set up the test we assume no difference between the observed value and the value being tested (this assumption is called the null hypothesis, or H0), and then estimate from the data the probability of seeing a difference at least this large if that assumption were true. The hypothesis test on these data would be as follows:
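A minimal sketch of such a test follows, in Python. It fits an ordinary least squares line to monthly anomalies and asks whether the fitted slope is compatible with a null hypothesis of 0.2C per decade. The data are synthetic stand-ins, not the four series plotted in this post. The function name `slope_test`, the normal approximation to the t distribution, and the assumption of independent residuals are all choices made for illustration. Persistence in the residuals would widen the standard error, so this sketch is likely to overstate significance.

```python
import math
import random
from statistics import NormalDist


def slope_test(y, slope_h0, dt=1.0 / 12.0):
    """Fit y = a + b*t by least squares and test H0: b == slope_h0.

    dt is the spacing between samples in years, so slopes are per year.
    Returns (slope, standard_error, z, two_sided_p).
    """
    n = len(y)
    t = [i * dt for i in range(n)]
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    # Residual variance with n - 2 degrees of freedom.
    rss = sum((yi - (intercept + slope * ti)) ** 2 for ti, yi in zip(t, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    z = (slope - slope_h0) / se
    # Normal approximation to the t distribution (reasonable for n > 100).
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return slope, se, z, p


# Synthetic monthly anomalies with no underlying trend (NOT real data).
# 122 months covers January 1998 to February 2008 inclusive.
rng = random.Random(1)
anomalies = [0.4 + rng.gauss(0.0, 0.1) for _ in range(122)]

# H0: warming of 0.2C per decade, i.e. 0.02C per year.
slope, se, z, p = slope_test(anomalies, slope_h0=0.02)
print(f"trend = {10 * slope:+.3f} C/decade, z = {z:.2f}, p = {p:.4f}")
```

A small p value would lead us to reject H0, that is, to call warming of 0.2C per decade “unlikely” given the data and the assumptions above.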
Predicting global temperatures seems to be entering general awareness as a worthwhile exercise. As I have published about recently, I think climate models are inadequately validated, confidence in the skill of models to forecast global warming is vastly exaggerated, and current skill is not enough to serve useful purposes. I thought I would tabulate some of the various predictions as I come across them. This is a fair test, as the future is unknown, and at the end of the year we can see whose is most accurate.
Long-range dependence is being identified in many disciplines, such as networking, databases, economics, climate and biodiversity. LTP is competing with the sexy “long tail” for top spot as a theory of cultural consumption. Thus, the need for software offering complete long-range dependence analysis is crucial.
Here is a summary of the chapters in my upcoming book Niche Modeling to be published by CRC Press. Many of the topics have been introduced as posts on the blog. My deepest thanks to everyone who has commented and so helped in the refinement of ideas, and particularly in providing motivation and focus.
Writing a book is a huge task, much of it a slog, and it’s not over yet. But I hope to get it to the publishers so it will be available at the end of this year. Here is the dustjacket blurb:
Through theory, applications, and examples of inferences, this book shows how to conduct and evaluate ecological niche modeling (ENM) projects in any area of application. It features a series of theoretical and practical exercises in developing and evaluating ecological niche models using a range of software supplied on an accompanying CD. These cover geographic information systems, multivariate modeling, artificial intelligence methods, data handling, and information infrastructure. The author then features applications of predictive modeling methods with reference to valid inference from assumptions. This is a seminal reference for ecologists as well as a superb hands-on text for students.
Part 1: Informatics
Functions: This chapter summarizes major types, operations and relationships encountered in the book and in niche modeling. This and the following two chapters could be treated as a tutorial in R. For example, the main functions for representing the inverted ‘U’ shape characteristic of a niche (step, Gaussian, quadratic and ramp functions) are illustrated in both graphical form and R code. The chapter concludes with the ACF and lag plots, in one or two dimensions.
Data: This chapter demonstrates how to manage simple biodiversity databases using R. By using data frames as tables, it is possible to replicate the basic spreadsheet and relational database operations with R’s powerful indexing functions. While a database is necessary for large-scale data management, R can eliminate conversion problems as data is moved between systems.
Spatial: R and image processing operations can perform many of the elementary spatial operations necessary for niche modeling. While these do not replace a GIS, the chapter demonstrates that, by generalizing arithmetic concepts to images, simple spatial operations can be implemented efficiently.
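As a rough illustration of the four response shapes named for the Functions chapter, here is a sketch in Python rather than the book’s R. The parameter names, the scaling of each function to a maximum of 1, and the reading of “ramp” as a linear rise followed by a linear fall are assumptions made for this example, not the book’s definitions.

```python
import math


def step(x, lo, hi):
    """1 inside the interval [lo, hi], 0 outside."""
    return 1.0 if lo <= x <= hi else 0.0


def gaussian(x, mu, sigma):
    """Bell-shaped response peaking at mu."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)


def quadratic(x, mu, width):
    """Inverted parabola peaking at mu, clipped at zero beyond mu +/- width."""
    return max(0.0, 1.0 - ((x - mu) / width) ** 2)


def ramp(x, lo, peak, hi):
    """Linear rise from lo to peak, then linear fall from peak to hi."""
    if x <= lo or x >= hi:
        return 0.0
    if x <= peak:
        return (x - lo) / (peak - lo)
    return (hi - x) / (hi - peak)


# Response of a hypothetical species to temperature, optimum at 20 degrees.
for temp in (5, 10, 15, 20, 25, 30, 35):
    print(temp,
          step(temp, 15, 25),
          round(gaussian(temp, 20, 5), 3),
          round(quadratic(temp, 20, 10), 3),
          round(ramp(temp, 10, 20, 30), 3))
```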
Part 2: Modeling
Theory: Set theory helps to identify the basic assumptions underlying niche modeling, and the relationships and constraints between these assumptions. The chapter shows the standard definition of the niche as environmental envelopes is equivalent to a box topology. It is proven that when extended to infinite dimensions of environmental variables this definition loses the property of continuity between environmental and geographic spaces. Using the product topology for niches would retain this property.
Previously “A New Temperature Reconstruction” used random data with long term persistence (LTP) to illustrate the circular reasoning behind the ‘hockey stick’ reconstruction of past temperatures. This one shows the potential for false positives due to the statistics used in the ‘hockey stick’. The dynamic simulation below shows future temperatures predicted using a random fractional differencing algorithm that generates realistic LTP behavior. Future temperatures and validation statistics are calculated each time the page is reloaded. One unusual statistic used in MBH98 suggests the future can be predicted using random numbers.
Note: This is a first version of the application; it may contain errors and could be improved considerably. The code is freely available under the GPL in order to promote open science. See The Reference Frame for more information.
Reload page for new prediction. Measured and predicted future temperatures, with years on the x axis, and temperature anomalies on the y axis. The measured temperatures are in blue and the simulated temperatures are in red. Black points are measured temperatures for years in the validation period.
Mathematical shapes can affect our lives and the decisions we make.
The hockey stick graph describing the Earth’s average temperature over the last millennium has been the subject of a controversial debate over the reliability of methods of statistical analysis.
From this to this …
The long tail is another new icon, described by Chris Anderson in a new book developed in the Blogosphere, “The Long Tail”:
Forget squeezing millions from a few megahits at the top of the charts. The future of entertainment is in the millions of niche markets at the shallow end of the bit stream. Chris Anderson explains all in a book called “The Long Tail”. Follow his continuing coverage of the subject on The Long Tail blog.
The long tail is the colloquial name for a long-known feature of statistical distributions (Zipf, power laws, Pareto distributions and/or general Lévy distributions). The feature is also known as “heavy tails”, “power-law tails” or “Pareto tails”. Such distributions resemble the accompanying graph.
In these distributions a low frequency or low-amplitude population that gradually “tails off” follows a high frequency or high-amplitude population. In many cases the infrequent or low-amplitude events—the long tail, represented here by the yellow portion of the graph—can cumulatively outnumber or outweigh the initial portion of the graph, such that in aggregate they comprise the majority.
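A small numerical sketch can make the “majority in the tail” claim concrete. It assumes a Zipf distribution in which the item of rank r has weight 1/r^s. The catalogue size, head size and exponent are hypothetical values chosen for illustration, and the function name `tail_share` is my own.

```python
def tail_share(n_items, head_size, exponent=1.0):
    """Fraction of total weight held by items ranked below the top head_size,
    when the item of rank r has weight r ** -exponent (a Zipf distribution)."""
    weights = [rank ** -exponent for rank in range(1, n_items + 1)]
    return sum(weights[head_size:]) / sum(weights)


# Hypothetical catalogue of one million titles with the top 100 as "hits".
share = tail_share(1_000_000, 100)
print(f"share of demand outside the top 100: {share:.2f}")
```

With an exponent of 1 the tail holds more than half of the total here. A steeper exponent shifts weight back toward the head, so the claim depends on how heavy the tail actually is.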
The Australian Institute of Geoscientists News has published online my article “Reconstruction of past climate using series with red noise” on page 14. Many thanks to Louis Hissink, the editor, for the rapidity of this publication. It is actually a very interesting newsletter with articles on the IPCC, and a summary of the state of the hockey stick (or hokey stick). There are articles on the K-T boundary controversy and how to set up an exploration company.
Today I am reporting more results of reconstructing past climates with randomly generated sequences (http://www.climateaudit.org/?p=566). Here are a few experiments to identify the critical components of the dendroclimatology methodology. I record the skill of reconstruction with: different types of series (i.i.d., alternating means and fractional differencing), and dropping each component of the methodology in turn (positive slope, positive correlation, calibration with inverse linear model).
Random Series
Some alternatives for generating random series are independent and identically distributed errors (called i.i.d.), and two ways of generating series with ‘red noise’ or long term persistence (LTP): alternating means and fractional differencing. An example of each series, with the CRU temperature data overlaid, is below.
Figure 1. Three random series generated to simulate CRU temperatures over 2000 years. The i.i.d. series has a standard deviation equal to that of the CRU temperatures. Parameters for altmeans were arbitrarily chosen, while parameters for fracdiff were calibrated using the R fracdiff package. Note that the i.i.d. series is least realistic, altmeans is similar with some artifactual ‘jumps’, while fracdiff is very similar to temperatures.
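The three generators can be sketched as follows, in Python rather than the R used for the figure. This is an illustration under stated assumptions, not the code behind Figure 1. The alternating means rule (a level that jumps to a new random value with a fixed probability per step), the value d = 0.4, the truncation length `m` and all function names are hypothetical. The fractional differencing generator uses the truncated moving average form of an ARFIMA(0, d, 0) process.

```python
import random


def iid_series(n, sd, rng):
    """Independent, identically distributed normal errors."""
    return [rng.gauss(0.0, sd) for _ in range(n)]


def altmeans_series(n, sd, rng, switch_prob=0.02, level_sd=0.3):
    """Noise around a mean that occasionally jumps to a new random level."""
    out = []
    level = rng.gauss(0.0, level_sd)
    for _ in range(n):
        if rng.random() < switch_prob:
            level = rng.gauss(0.0, level_sd)
        out.append(level + rng.gauss(0.0, sd))
    return out


def fracdiff_weights(d, m):
    """First m moving average weights of (1 - B)^(-d): w0 = 1, wk = w(k-1)*(k-1+d)/k."""
    w = [1.0]
    for k in range(1, m):
        w.append(w[-1] * (k - 1 + d) / k)
    return w


def fracdiff_series(n, d, sd, rng, m=300):
    """Fractionally integrated noise, truncated to the last m innovations."""
    w = fracdiff_weights(d, m)
    e = [rng.gauss(0.0, sd) for _ in range(n + m)]
    return [sum(w[k] * e[t + m - k] for k in range(m)) for t in range(n)]


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation, a crude indicator of persistence."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


rng = random.Random(0)
series = {
    "iid": iid_series(1000, 0.2, rng),
    "altmeans": altmeans_series(1000, 0.2, rng),
    "fracdiff": fracdiff_series(1000, 0.4, 0.2, rng),
}
for name, values in series.items():
    print(name, round(lag1_autocorr(values), 2))
```

The lag-1 autocorrelation is only a quick check. The slow decay of correlation at long lags is what distinguishes LTP from ordinary short-memory red noise.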
To follow up on the last post, I have calculated the RE as well as the R2 statistics for the reconstruction from the random series. The same approach was used, i.e. generate 1000 sequences with LTP, select those with positive slope and R2>0.1, calibrate on a linear model, and average. Here is the reconstruction again, with the test and training periods marked with a horizontal dashed line (test period to the left, training to right of temperature values):
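The procedure can be sketched in Python as below. This is a hedged illustration rather than the code used for the post: the “temperatures” are synthetic, the series count is cut from 1000 to 200 to keep the run short, and the fractional differencing parameter d = 0.4, the period lengths and all function names are assumptions. RE is computed as one minus the ratio of the reconstruction’s squared error to that of the calibration-period mean, over the test period.

```python
import random


def fracdiff_series(n, d, rng, m=100):
    """LTP noise from a truncated ARFIMA(0, d, 0) moving average."""
    w = [1.0]
    for k in range(1, m):
        w.append(w[-1] * (k - 1 + d) / k)
    e = [rng.gauss(0.0, 1.0) for _ in range(n + m)]
    return [sum(w[k] * e[t + m - k] for k in range(m)) for t in range(n)]


def linfit(x, y):
    """Least squares fit y = a + b*x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy * sxy / (sxx * syy)


def reconstruct(temp, n_total, cal, n_series, d=0.4, seed=0):
    """Average the random series that pass screening in the last cal steps."""
    rng = random.Random(seed)
    temp_cal = temp[len(temp) - cal:]
    steps = list(range(cal))
    kept = []
    for _ in range(n_series):
        proxy = fracdiff_series(n_total, d, rng)
        p_cal = proxy[n_total - cal:]
        trend, _, _ = linfit(steps, p_cal)
        b, a, r2 = linfit(p_cal, temp_cal)
        # Screening: positive slope, positive correlation, R2 > 0.1.
        if trend > 0 and b > 0 and r2 > 0.1:
            kept.append([a + b * p for p in proxy])  # linear calibration
    if not kept:
        raise ValueError("no series passed the screening")
    return [sum(col) / len(kept) for col in zip(*kept)], len(kept)


def verification_stats(recon, temp, n_total, cal):
    """RE and R2 over the test period (observed steps before calibration)."""
    n_obs = len(temp)
    obs = temp[:n_obs - cal]
    rec = recon[n_total - n_obs:n_total - cal]
    cal_mean = sum(temp[n_obs - cal:]) / cal
    sse = sum((o - r) ** 2 for o, r in zip(obs, rec))
    ss_ref = sum((o - cal_mean) ** 2 for o in obs)
    return 1.0 - sse / ss_ref, linfit(rec, obs)[2]


# Synthetic "temperatures" with a rising trend (NOT real data).
noise = random.Random(1)
temp = [0.01 * i + noise.gauss(0.0, 0.15) for i in range(120)]
recon, n_kept = reconstruct(temp, n_total=200, cal=60, n_series=200)
re, r2 = verification_stats(recon, temp, n_total=200, cal=60)
print(f"kept {n_kept} of 200 series, RE = {re:.2f}, R2 = {r2:.2f}")
```

Because every input series is random, any apparent skill in the test period comes from the screening and calibration steps rather than from climate information.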
Below is an investigation of scale invariance or long term persistence (LTP) in time series including tree-ring proxies – the recognition, quantification and implications for analysis – drawn largely from Koutsoyiannis [2] (preprints available here). In researching this topic, I found a lot of misconceptions about LTP phenomena, such as LTP implying a long term memory process, and a lack of recognition of the implications of LTP. As to implications, the standard error of the mean of global temperatures at 30 data points is 4 times larger than the usual estimate for normal errors. Given that LTP is a fact of nature – attributed by Koutsoyiannis to the maximum entropy (ME) principle – this strongly suggests that the Hurst coefficient H should be considered in all hypothesis testing.
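The size of that effect can be sketched with the scaling law for a simple scaling process, in which the standard error of the mean of n points is σ/n^(1−H) rather than the classical σ/√n. The value H = 0.9 below is an assumption chosen to illustrate the order of magnitude, not an estimate taken from the post or from the temperature data, and the function names are my own.

```python
import math


def se_mean_classical(sigma, n):
    """Standard error of the mean for independent errors."""
    return sigma / math.sqrt(n)


def se_mean_ltp(sigma, n, hurst):
    """Standard error of the mean under simple scaling with Hurst coefficient H."""
    return sigma / n ** (1.0 - hurst)


def inflation(n, hurst):
    """Ratio of the LTP standard error to the classical one: n ** (H - 0.5)."""
    return n ** (hurst - 0.5)


for h in (0.5, 0.7, 0.9):
    print(f"H = {h}: standard error inflated {inflation(30, h):.2f}x at n = 30")
```

At H = 0.5 the two estimates coincide, and the inflation grows with both H and n, which is why ignoring H makes trends look more significant than they are.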