First Time At Niche Modeling?

This is a blog on the power of numeracy. My first book — Niche Modeling — is now in print.

The first six chapters cover tutorial topics in R programming and theoretical topics in niche modeling: functions, data, spatial, topology, environmental data collections, and examples. The last six chapters are about using niche modeling to detect errors: bias, autocorrelation, non-linearity, long-term persistence, circularity, and fraud. This is useful information for any biological modeler.

May 31, 2006

‘Results management’ — detection and diagnosis using Benford’s Law

Filed under: Science, Finance — admin @ 11:53 pm

Can the fabrication of research results be prevented? Can the peer review process be augmented with automated checking? These questions become more important with automated submission of data to archives. The potential usefulness of automated methods of detecting at least some forms of either intentional or unintentional ‘result management’ is clear.

Benford’s Law is a postulated relationship on the frequency of digits (Benford 1938). It states that the leading digits of random data drawn from a set of random distributions follow a logarithmic distribution (Hill 1998). Benford’s Law, actually more of a conjecture, suggests that the probability of a number beginning with the sequence of digits d, read as a single integer, is given by the equation:

Prob(d) = log10(1+1/d)

For example, the probability that a number begins with the sequence of digits 1,2,3 is given by log10(1+1/123), or about 0.0035. Below is the distribution predicted by Benford’s Law for the first four digits.

Fig 1. Expected distributions of the first four digits according to Benford’s Law.
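
The formula above can be tried out directly. What follows is a minimal sketch in Python (the book itself works in R), and the function names benford_prob and digit_distribution are made up for this illustration. It assumes that the distribution of the digit in a given position is obtained by summing Prob(d) over every possible run of digits that precedes it.

```python
import math


def benford_prob(d: int) -> float:
    """Probability that a number's leading digits form the integer d (d >= 1)."""
    if d < 1:
        raise ValueError("d must be a positive integer")
    return math.log10(1 + 1 / d)


def digit_distribution(position: int) -> list[float]:
    """Distribution of the digit (0-9) found at `position`, where 1 is the first digit.

    For positions after the first, sum Prob(d) over every prefix of
    position - 1 digits followed by the digit of interest.
    """
    if position < 1:
        raise ValueError("position must be >= 1")
    if position == 1:
        # A leading digit is never 0, so its probability is 0.
        return [0.0] + [benford_prob(d) for d in range(1, 10)]
    lo, hi = 10 ** (position - 2), 10 ** (position - 1)
    return [
        sum(benford_prob(10 * prefix + digit) for prefix in range(lo, hi))
        for digit in range(10)
    ]


# The worked example from the text: the digits 1,2,3 read as the integer 123.
p_123 = benford_prob(123)

# Distributions for the first four digit positions.
first_four = [digit_distribution(pos) for pos in range(1, 5)]

for pos, dist in enumerate(first_four, start=1):
    print(pos, [round(p, 4) for p in dist])
```

Each of the four distributions sums to one, and they flatten towards a uniform 0.1 per digit as the position increases, which is why tests of this kind usually concentrate on the first one or two digits.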

(more…)

May 23, 2006

Niche modeling — what is it?

Filed under: Statistics — admin @ 9:09 pm

There are a number of ways to answer this question. There is a rich diversity of methods for predicting species’ distributions, and they could be listed and described. Alternatively, the biological relationships between species and the environment could be emphasized, and approaches from population dynamics used as a starting point.

A more general approach to niche modeling can be based on the statistical idea of the probability distribution.

Definition: A niche model is a probability distribution defined on environmental variables.

Definition: A probability distribution f(E) is an assignment of a probability to every interval on a set of environmental variables E.

This definition of the niche as a probability distribution over sets of environmental variables allows for developing niche models in new ways over new entities. (more…)
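
As an illustration of the two definitions, here is a minimal sketch in Python (the book itself works in R) of one very simple way to build such a distribution: count observations in intervals of a single environmental variable and normalize. The temperature values, the interval breaks, and the function names niche_model and interval_probability are all hypothetical; this is just one possible estimator, not the method developed in the book.

```python
from bisect import bisect_right


def niche_model(observations, breaks):
    """Empirical f(E): one probability per interval [breaks[i], breaks[i+1])."""
    counts = [0] * (len(breaks) - 1)
    for x in observations:
        i = bisect_right(breaks, x) - 1
        if 0 <= i < len(counts):  # ignore values outside the breaks
            counts[i] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no observations fall inside the breaks")
    return [c / total for c in counts]


def interval_probability(probs, breaks, low, high):
    """Probability of [low, high), where low and high are existing break points."""
    return sum(
        p
        for p, lo, hi in zip(probs, breaks, breaks[1:])
        if lo >= low and hi <= high
    )


# Hypothetical temperatures (degrees C) at sites where a species was recorded.
temperatures = [12.1, 14.5, 15.2, 15.8, 16.3, 17.0, 18.4, 21.9]
breaks = [10, 15, 20, 25]

probs = niche_model(temperatures, breaks)
p_10_to_20 = interval_probability(probs, breaks, 10, 20)

print(probs)       # one probability per interval
print(p_10_to_20)  # probability assigned to the interval [10, 20)
```

Nothing in the sketch is specific to species: any entity with measurements on a set of variables could be passed in, which is the generality the definition is after.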

May 19, 2006

Geographic models with R and netpbm

Filed under: Uncategorized — admin @ 11:04 pm

Geographic information is a major component of niche modeling in any spatial science such as ecology. Geographic Information Systems (GIS) are the tool of choice when the main purpose is managing geographic information.

As in the previous chapter, where R was used as a relational database, R can be used to perform simple spatial tasks. This avoids the need for a separate GIS when one is not necessary, and it helps to build knowledge of advanced use of the R language.

R is not very efficient for some of these operations because the data must be held in a form suitable for mathematical operations, which limits the size of the data that can be handled. A more efficient way to perform basic ecological niche modeling (ENM) functions on large sets of data is to use image processing. A good image processing package for this is netpbm, and examples of using its image utilities to perform fundamental analytical operations for modeling are given.
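
To give a feel for the image processing approach, here is a minimal sketch in Python (not R, and not the netpbm utilities themselves). It assumes that environmental layers are stored as grayscale images in netpbm's plain PGM format (magic number P2), and it uses two tiny made-up layers. The function names and the suitable ranges are hypothetical. It selects the cells where both layers fall inside a chosen range.

```python
def read_plain_pgm(text):
    """Parse a plain (ASCII, 'P2') PGM image into (width, height, maxval, rows)."""
    tokens = []
    for line in text.splitlines():
        tokens.extend(line.split("#", 1)[0].split())  # drop '#' comments
    if not tokens or tokens[0] != "P2":
        raise ValueError("not a plain PGM image")
    width, height, maxval = (int(t) for t in tokens[1:4])
    pixels = [int(t) for t in tokens[4:]]
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match the header")
    rows = [pixels[r * width:(r + 1) * width] for r in range(height)]
    return width, height, maxval, rows


def in_range_mask(rows, low, high):
    """1 where low <= pixel value <= high, else 0."""
    return [[1 if low <= v <= high else 0 for v in row] for row in rows]


def intersect(mask_a, mask_b):
    """Cell-by-cell AND of two masks with the same dimensions."""
    return [[a & b for a, b in zip(ra, rb)] for ra, rb in zip(mask_a, mask_b)]


# Two hypothetical 3x2 layers, scaled to the range 0-255.
temperature_pgm = "P2\n# hypothetical temperature layer\n3 2\n255\n10 120 200\n130 140 90\n"
rainfall_pgm = "P2\n# hypothetical rainfall layer\n3 2\n255\n50 60 70\n20 80 90\n"

_, _, _, temperature = read_plain_pgm(temperature_pgm)
_, _, _, rainfall = read_plain_pgm(rainfall_pgm)

# Assumed suitable ranges: temperature 100-150, rainfall 50 and above.
suitable = intersect(
    in_range_mask(temperature, 100, 150),
    in_range_mask(rainfall, 50, 255),
)
print(suitable)
```

Because each layer is just a grid of small integers, the same threshold and combine steps can be applied to very large images one row at a time, without loading everything into memory.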

(more…)