This is a guest blog by Bill Fothergill, also known as billthefrog.
He sent it to me a couple of weeks ago, but it's still topical.
Many of the contributors to the Arctic Sea Ice Blog have passed comment on the dangers of basing an end-of-melt-season prediction on anything as simple as the current area or extent values. The variables that determine the eventual outcome of the annual melt season are truly legion, and it is sheer folly to think that a single parameter provides anything other than a tenuous glimpse at what eventually may transpire.
However, whilst few might disagree with the above assertion, it would obviously be better to base this claim on something a trifle more rigorous than a vague gut feeling or some unshakeable belief that the universe should conform to one’s prejudices. There are obviously many ways to perform some form of mathematical analysis on the available data, but an example of such a method would be to look at the correlation between various mean monthly values over a predetermined time frame.
In the example outlined here, the NSIDC monthly area and extent values are used to provide the data, and the CORREL function in Excel is used to calculate the correlation coefficient, a measure of how strongly two sets of figures are linearly related. The dataset(s) were read into a rectangular array with “months” along one dimension and “years” along the other such that, say, the April extent for each year from 1979-2012 could be compared in turn with its respective equivalent figure for May, June, July etc.
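For readers who prefer code to spreadsheets, the calculation described above can be sketched roughly as follows. This is only an illustration, not the workbook actually used: the `table` layout, the helper names `month_column` and `month_correlation`, and above all the numbers are assumptions made up for the example, not NSIDC values. The `pearson` function computes the Pearson product-moment coefficient, which is the quantity Excel's CORREL returns.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: a constant series has zero variance and would divide by zero here.
    return cov / sqrt(var_x * var_y)


# Hypothetical layout: one entry per year, one value per month.
# The figures are invented placeholders, NOT real NSIDC data.
table = {
    2001: {"Apr": 14.0, "May": 12.9, "Jun": 11.2},
    2002: {"Apr": 13.8, "May": 12.5, "Jun": 10.9},
    2003: {"Apr": 14.1, "May": 12.8, "Jun": 11.0},
    2004: {"Apr": 13.6, "May": 12.4, "Jun": 10.6},
}


def month_column(table, month):
    """All values for one month, ordered by year."""
    return [table[year][month] for year in sorted(table)]


def month_correlation(table, month_a, month_b):
    """Correlate one month's values with another month's, year by year."""
    return pearson(month_column(table, month_a), month_column(table, month_b))


r_apr_may = month_correlation(table, "Apr", "May")
print(f"April vs May: r = {r_apr_may:.3f}")
```

Keeping the years as the pairing key guarantees that each April value is matched with the May value from the same year, which is the whole point of the rectangular array.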
Correlation coefficients can take on any value between +1 and -1. A value of +1 would indicate a perfect positive linear relationship between the sets of figures. For example, every value for, say, July might be exactly 25% larger than the corresponding value for August. Similarly, if every August value was, say, 1 million square kilometres above its corresponding September figure, then the correlation coefficient would again be +1. (This will of course be recognisable to many as our dear old (dys)functional friend “y = mx + c”, here with a positive slope m.) As the correlation gets weaker, the coefficient falls towards zero. It becomes negative if a rise in one variable tends to be matched by a drop in the corresponding partner value, and reaches -1 when that inverse relationship is perfectly linear.
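For those who like to see the algebra, here is a short sketch of why any exact straight-line relationship gives a coefficient of +1 or -1. The notation is my own assumption rather than anything taken from Excel: x_i and y_i are the two monthly values for year i, n is the number of years, and a bar denotes the mean over all years.

```latex
% Pearson correlation coefficient for n paired values (x_i, y_i)
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}

% If y_i = m x_i + c for every year i (with m \neq 0), then
% \bar{y} = m \bar{x} + c and y_i - \bar{y} = m (x_i - \bar{x}), so
r = \frac{m \sum_{i=1}^{n} (x_i - \bar{x})^2}
         {|m| \sum_{i=1}^{n} (x_i - \bar{x})^2}
  = \frac{m}{|m|}
  = \pm 1
```

The offset c drops out entirely and only the sign of m survives, which is why "25% larger" (m = 1.25, c = 0) and "1 million square kilometres above" (m = 1, c = 1) both give exactly +1.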
As those living in the real world know all too well, the anomalies over the summer and autumn (fall) seasons have been getting rapidly larger of late. This then raises the question “how does this impact upon the predictive skill of any early-season value?”
Initially, I simply repeated the correlation exercise on a subset (2007-2012) of the overall dataset. (The reason for this choice of subset should be pretty obvious.) However, as this contained just 6 years’ worth of data, there was a concern that any result could be interpreted as being only an artefact of the selection period (i.e. inadvertent cherry picking). In order to allay any such fears, two further subsets were also included by way of comparison.
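Restricting the calculation to a run of years is a small change, sketched below under the same caveats as before: the function name `subset_correlation`, the dictionary layout and the figures in `toy` are all invented for illustration and are not NSIDC data. The three-year minimum is likewise an arbitrary guard of my own, chosen only because two points always give a coefficient of exactly +1 or -1.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def subset_correlation(table, month_a, month_b, first_year, last_year):
    """Correlate two months using only the years first_year..last_year."""
    years = [y for y in sorted(table) if first_year <= y <= last_year]
    if len(years) < 3:
        # Arbitrary guard: two points always yield r = +1 or -1.
        raise ValueError("subset should contain at least 3 years")
    xs = [table[y][month_a] for y in years]
    ys = [table[y][month_b] for y in years]
    return pearson(xs, ys)


# Invented placeholder values, NOT real NSIDC data.
toy = {
    2001: {"Jun": 11.0, "Sep": 6.0},
    2002: {"Jun": 10.8, "Sep": 5.7},
    2003: {"Jun": 10.9, "Sep": 5.9},
    2004: {"Jun": 10.5, "Sep": 5.2},
    2005: {"Jun": 10.6, "Sep": 4.6},
    2006: {"Jun": 10.2, "Sep": 4.9},
    2007: {"Jun": 10.4, "Sep": 4.3},
}

r_all = subset_correlation(toy, "Jun", "Sep", 2001, 2007)
r_late = subset_correlation(toy, "Jun", "Sep", 2004, 2007)
print(f"all years: {r_all:.3f}, 2004-2007 only: {r_late:.3f}")
```

Running the same function over several different windows is a cheap way of checking whether a result depends on one particular choice of start and end year.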
Although the above table might seem somewhat confusing to those unversed in this form of analysis, some patterns should nevertheless be pretty obvious. Possibly the most striking is that, almost without exception, the coefficient based on area is significantly higher than that based on extent. In other words, when trying to predict what may transpire later in the year, one is more likely to meet with some degree of success if one concentrates on area, rather than extent.
One can also clearly see that the correlation coefficients pertaining to any of the subset periods are markedly lower than those for the entire 1979-2012 dataset. In other words, it has become more difficult to predict end-of-melt-season outcomes (assuming one’s prediction is predicated solely upon present extent/area data). Although the coefficients for both area and extent drop significantly in each of the subsets, the deleterious effect on the extent coefficient is much more marked.
The third obvious thing that can be seen from the above is that the correlations grow weaker as the separation between the two months being compared grows. For example, the May:June correlation is stronger than that for May:August. This is of course exactly what one would expect. The Arctic sea ice can be regarded as a “system” that exhibits chaotic behaviour – as the time period increases, so does the scope for the unexpected.
By filtering and sorting, additional information can be gleaned as follows…
Taken over the entire dataset, the adjacent month correlations are pretty strong. This behaviour is generally what one might expect and is an example of “autocorrelation” – basically if the level of ice is high (or low) one month, then it is very likely to be somewhat similar the following month. However, if one looks at the subset figures, even this normally tight relationship appears to have broken down, particularly around the April – June period.
Two and three month separations
The above can obviously be repeated for non-adjacent months as shown below…
It can clearly be seen that, even when looking at the entire dataset, the April coefficients are distinctly lower than those starting in subsequent months. However, in more recent years (as demonstrated by the various subset values) it is becoming necessary to wait until June or even July before one can make a reasonably reliable prediction for even just two or three months ahead. The presence of negative correlations can also be seen in the April and May figures.
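The month-separation comparisons discussed here (adjacent months, then two and three months apart) all follow one pattern, which can be sketched as a single function taking the separation as a parameter. As with any such sketch, the names `MONTHS` and `lagged_correlations` and the figures in `toy` are my own inventions for illustration; the numbers are placeholders and are not NSIDC data.

```python
from math import sqrt

MONTHS = ["Apr", "May", "Jun", "Jul", "Aug", "Sep"]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlations(table, lag, months=MONTHS):
    """Correlate each month with the month `lag` places later in `months`."""
    if lag < 1:
        raise ValueError("lag must be at least 1")
    years = sorted(table)
    result = {}
    for i in range(len(months) - lag):
        start, end = months[i], months[i + lag]
        xs = [table[y][start] for y in years]
        ys = [table[y][end] for y in years]
        result[(start, end)] = pearson(xs, ys)
    return result


# Invented placeholder values, NOT real NSIDC data.
toy = {
    2001: {"Apr": 14.0, "May": 12.9, "Jun": 11.2, "Jul": 8.9, "Aug": 6.4, "Sep": 5.9},
    2002: {"Apr": 13.8, "May": 12.6, "Jun": 10.9, "Jul": 8.4, "Aug": 5.9, "Sep": 5.3},
    2003: {"Apr": 14.1, "May": 12.7, "Jun": 11.0, "Jul": 8.6, "Aug": 6.1, "Sep": 5.6},
    2004: {"Apr": 13.7, "May": 12.8, "Jun": 10.7, "Jul": 8.1, "Aug": 5.6, "Sep": 5.0},
    2005: {"Apr": 13.9, "May": 12.4, "Jun": 10.5, "Jul": 7.8, "Aug": 5.2, "Sep": 4.7},
}

for (start, end), r in lagged_correlations(toy, lag=2).items():
    print(f"{start} -> {end}: r = {r:.3f}")
```

Calling it with `lag=1` gives the adjacent-month pairs and `lag=3` the three-month separations, so one routine covers every table of this kind.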
Inevitably, people want to get an early feel for what the situation is likely to be at the time of the September minima. One can easily extract the relevant coefficients for just this purpose…
Even prior to recent years, the April and May figures appear to have been of limited value when trying to predict the September value. However, as can be seen from the amount of red in the above table(s), April and May seem to have become almost worse than useless as an indicator of what will happen later.
Summing up, attempting to predict the September extent based solely on the level of ice in April or May is basically a fool’s errand. Although there is a reasonable chance of being close based upon June figures (much more so for area, rather than extent), one really needs to wait until July before betting the family silver.