Mathematical shapes can affect our lives and the decisions we make.

The hockey stick graph describing the Earth's average temperature over the last millennium has been the subject of a controversial debate over the reliability of the methods of statistical analysis.

[Figure: hockey stick.jpg]
From this to this …
[Figure: Long_tail.PNG]

The long tail is another new icon, described in Chris Anderson's new book “The Long Tail”, which was developed in the blogosphere:

Forget squeezing millions from a few megahits at the top of the charts. The future of entertainment is in the millions of niche markets at the shallow end of the bit stream. Chris Anderson explains all in a book called “The Long Tail”. Follow his continuing coverage of the subject on The Long Tail blog.

As explained in Wikipedia:

The long tail is the colloquial name for a long-known feature of statistical distributions (Zipf, power laws, Pareto distributions and/or general Lévy distributions). The feature is also known as “heavy tails”, “power-law tails” or “Pareto tails”. Such distributions resemble the accompanying graph.

In these distributions a low frequency or low-amplitude population that gradually “tails off” follows a high frequency or high-amplitude population. In many cases the infrequent or low-amplitude events—the long tail, represented here by the yellow portion of the graph—can cumulatively outnumber or outweigh the initial portion of the graph, such that in aggregate they comprise the majority.
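The claim that the tail can outweigh the head in aggregate is easy to check numerically. The following is a minimal sketch, assuming a pure Zipf distribution in which the item at rank r has volume proportional to 1/r^s; the function name `tail_share`, the exponent s = 1 and the head size of 100 items are illustrative assumptions, not figures from the book or from Wikipedia.

```python
def tail_share(n_items: int, head_size: int, exponent: float = 1.0) -> float:
    """Fraction of total volume contributed by items ranked below the head.

    Assumes an idealised Zipf law: the item at rank r has volume r ** -exponent.
    """
    if n_items < 1 or not 0 <= head_size <= n_items:
        raise ValueError("need n_items >= 1 and 0 <= head_size <= n_items")
    weights = [rank ** -exponent for rank in range(1, n_items + 1)]
    total = sum(weights)
    tail = sum(weights[head_size:])  # everything after the top `head_size` ranks
    return tail / total


if __name__ == "__main__":
    # Hold the head fixed at the top 100 items and grow the catalogue.
    for n_items in (1_000, 100_000):
        share = tail_share(n_items, head_size=100)
        print(f"{n_items} items: tail share = {share:.3f}")
```

Under these assumptions the tail holds a minority of the volume in a catalogue of 1,000 items but a majority in a catalogue of 100,000, which is the sense in which a long enough tail can "comprise the majority".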


The recently published paper by Jane Elith, Catherine Graham et al., “Novel methods improve prediction of species’ distributions from occurrence data” (EG06), is sure to be a landmark study in the field. EG06 compares 16 modeling methods using 226 well-surveyed species in 6 regions of the world. Measures of statistical skill on held-back data show a spread across a wide range of methods: from the older methods such as BIOCLIM and DOMAIN, through GARP, GLM and GAM, to the newer arrivals from machine learning, MAXENT and BRT, and the community-based method GDM, prompting the conclusion that “novel methods improve prediction”. The work of a great many people is appreciated, as these results will no doubt be very helpful to many biodiversity modellers in the future.
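To make "statistical skill on held-back data" concrete, here is a minimal sketch of one widely used skill measure for presence/absence predictions, the area under the ROC curve (AUC), computed by pairwise comparison. It is an illustration only and is not taken from EG06; the function name `auc` and the example scores are hypothetical.

```python
from typing import Sequence


def auc(presence_scores: Sequence[float], absence_scores: Sequence[float]) -> float:
    """AUC as the probability that a held-back presence site outscores an absence site.

    Ties count as half a win. 0.5 means no better than random; 1.0 means
    every presence site is ranked above every absence site.
    """
    if not presence_scores or not absence_scores:
        raise ValueError("need at least one presence and one absence score")
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))


if __name__ == "__main__":
    # Hypothetical model outputs at held-back sites (not data from the paper).
    print(auc([0.9, 0.3], [0.5, 0.1]))
```

The pairwise form is quadratic in the number of sites, which is fine for a sketch; a rank-based computation would be the usual choice for large evaluation sets.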


Having run across two recent notes on science blogs by academics, and written a post about the benefits of blogs to scientists here, I felt compelled to warn readers that, while there are positives and negatives to blogs, these academics just don’t get it.


Each time WhyWhere runs, a tally is kept of the accuracy of each of the variables. Even with the few examples so far, this tally is showing some quite surprising results.
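A running tally of this kind can be sketched in a few lines. This is not WhyWhere's actual code, and the post does not say how the tally is stored; the class `AccuracyTally`, its methods and the variable names `var_a` and `var_b` are all hypothetical, and the sketch assumes the tally is summarised as a mean accuracy per variable.

```python
from __future__ import annotations

from collections import defaultdict


class AccuracyTally:
    """Accumulates per-variable accuracy across runs (hypothetical structure)."""

    def __init__(self) -> None:
        self._total = defaultdict(float)  # sum of accuracies per variable
        self._runs = defaultdict(int)     # number of runs that used the variable

    def record(self, accuracies: dict[str, float]) -> None:
        """Add the accuracies observed in one run to the tally."""
        for name, accuracy in accuracies.items():
            self._total[name] += accuracy
            self._runs[name] += 1

    def mean_accuracy(self) -> dict[str, float]:
        """Mean accuracy of each variable over the runs in which it appeared."""
        return {name: self._total[name] / self._runs[name] for name in self._total}

    def ranked(self) -> list[tuple[str, float]]:
        """Variables sorted from highest to lowest mean accuracy."""
        return sorted(self.mean_accuracy().items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    tally = AccuracyTally()
    # Made-up accuracies for two made-up variables over two runs.
    tally.record({"var_a": 0.6, "var_b": 0.8})
    tally.record({"var_a": 0.8})
    for name, mean in tally.ranked():
        print(name, round(mean, 2))
```

Keeping the sum and the run count separately, rather than a stored mean, lets each new run be folded in without revisiting earlier results.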


In an earlier post on the spatial analysis of increasing house prices in the US, I used the small set of annual climate variables (7) and found that precipitation, rather than temperature, was the better predictor of metropolitan areas with increases greater than 20% in median price in 2005.

Distribution of metropolitan areas with house price increases greater than 20% in 2005, as predicted by elevation model.

Here I have run the analysis again using the new version of WhyWhere and the entire set of available terrestrial variables (All_Terrestrial). This time the best variable was etopo-terr (accuracy = 0.80), a raw elevation variable. I think this is a more sensible result than the one achieved with the climate variables, as appreciation is well known to have been concentrated in coastal areas.

Dataset: The default test points at http://landscape.sdsc.edu/ww-testform.html for a bird in Mexico (I think the Eared Trogon, but I can’t be sure anymore; my interest is the stats).

[Figure: 0.0888124196480469.png]

The new WhyWhere application is starting to work smoothly on large datasets now (http://landscape.sdsc.edu/ww-testform.html). I have added a list of all terrestrial data (All_Terrestrial). There will be some errors until I clean that lot up, but it should be usable. I have been thinking about how to deal with the numbers of data sets in a streaming data framework.
