
Neil Blanchard

The physical structure of the ice, and the salinity of the surrounding sea water will have large effects, as well. The volume of the ice is probably the best measure; much better than area alone.


Nightvid Cole

Very interesting that there are so many negative correlation values.

I think the approach of Kaleschke and Spreen in their "Sea Ice Outlook 2010" (see also ftp://ftp-projects.zmaw.de/seaice/prediction/ ) is better, and reduces artifacts of ice in areas that will be gone by September no matter what (e.g. Hudson Bay, Bering Sea).

But they haven't updated their ongoing outlooks page to include 2013 yet :(

Rick Aster

This is a nice analysis, and I hope readers can take a few minutes to look over it closely, including the tables of coefficients.

If “chaotic behavior” is hard to visualize, another way of picturing the change is that Arctic sea ice has become more like a pawn, at the mercy of larger forces at work. That is just a metaphor, of course, but in physical terms, it makes sense that as ice gets smaller and weaker, it is more easily pushed around.


"In other words, it has become more difficult to predict end-of-melt-season outcomes"

This is not a conclusion that can be drawn from the data shown here. To see why, consider this data set: (10, 11), (11, 10), (10, 10) and (11, 11). There is no correlation. And this data set has no correlation either: (0, 1), (1, 0), (0, 0) and (1, 1). Now, if you pool these two data sets into one, suddenly there is a good correlation: a high number on the first datapoint always goes together with a high number on the second, and a low number with another low number.

This is exactly what happened in the arctic. In the 80s, there was high spring extent and high summer extent. In the '07-'12 period, there was much lower spring extent and summer extent. Apart, there is no correlation (natural variability overcomes the trend). Taken together, there is a correlation (the trend overcomes natural variability).
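The pooling effect described above can be checked with a minimal Python sketch. The two four-point data sets are the ones given in the comment; the helper names `pearson` and `corr_of` are illustrative only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def corr_of(points):
    """Correlation between the first and second coordinate of (x, y) pairs."""
    xs, ys = zip(*points)
    return pearson(xs, ys)

high = [(10, 11), (11, 10), (10, 10), (11, 11)]  # "80s-like" group from the comment
low = [(0, 1), (1, 0), (0, 0), (1, 1)]           # "'07-'12-like" group from the comment

r_high = corr_of(high)          # 0.0: no correlation within the group
r_low = corr_of(low)            # 0.0: no correlation within the group
r_pooled = corr_of(high + low)  # about 0.99: the offset between groups dominates

print(r_high, r_low, round(r_pooled, 3))
```

Within each group the coefficient is zero, while the pooled set gives a coefficient of about 0.99, purely because the two groups sit at different levels.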

This phenomenon occurs in every dataset with a trend and random perturbations, so in itself this tells us nothing about the predictability of the end-of-melt-season outcomes. A better way to test that hypothesis is to try to 'predict' each year by using the data of the other years, and see if there's a trend in the quality of the predictions. If people are interested, I can do that kind of analysis, but I suspect that any trend found is not significant. There's not enough data.
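One way to read the "predict each year from the other years" test is leave-one-out cross-validation with a straight-line fit. The sketch below assumes that reading; the function names are hypothetical, and the `spring` and `september` numbers are made-up placeholders, not observations.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def leave_one_out_errors(xs, ys):
    """Absolute prediction error for each point, fitted on all the other points."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        errors.append(abs(ys[i] - (intercept + slope * xs[i])))
    return errors

# Placeholder values (M km^2), invented for illustration only.
spring = [12.0, 11.8, 12.1, 11.5, 11.6, 11.2, 11.4, 11.0]
september = [7.0, 6.9, 6.7, 6.2, 6.4, 5.6, 5.9, 5.1]

errors = leave_one_out_errors(spring, september)
print([round(e, 2) for e in errors])
```

A trend in these per-year errors over time, if it were statistically significant, would be evidence that prediction is getting harder; with a record this short it may well not be.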


Nice post. The sea ice extent in the early melting season is indeed a poor predictor for the September mean extent.

The PIOMAS volume and the long term models (like Gompertz curves) have been more reliable. Based on the latest PIOMAS update, they are not predicting a new record for 2013 (yet).

The ice conditions this year are causing a high extent compared to volume. This pattern might change by September. But a complete pattern reversal seems unlikely. We'll have to wait and see.

Chris Reynolds


Did you detrend before taking correlations?

I've previously posted a graph of correlations in 10 day increments with the ice area at 7/9/XX.
But this was done with the difference between each successive year. In other words I have a table of years in columns with rows being ten day increments of CT Area. From that I make a secondary table in which each value is the difference from the previous year. i.e. from 1980 to 2012.

It is on that table of differences that I calculate the correlations, because the correlation equation gives spurious results in the presence of trends.
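The differencing step described above can be sketched in Python as follows. This is an illustration under stated assumptions: the two series are synthetic (a shared downward trend plus small unrelated wiggles), and `first_differences` and `pearson` are hypothetical helper names.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def first_differences(series):
    """Year-over-year change: each value minus the previous year's value."""
    return [b - a for a, b in zip(series, series[1:])]

# Synthetic series: same trend (-0.2 per year), unrelated wiggles of size 0.1.
wiggle_early = [0.1, -0.1] * 6
wiggle_late = [0.1, 0.1, -0.1, -0.1] * 3
early = [10.0 - 0.2 * i + wiggle_early[i] for i in range(12)]
late = [6.0 - 0.2 * i + wiggle_late[i] for i in range(12)]

r_raw = pearson(early, late)  # high: driven by the shared trend
r_diff = pearson(first_differences(early), first_differences(late))  # near zero

print(round(r_raw, 2), round(r_diff, 2))
```

The raw series correlate strongly only because both decline; once the year-over-year differences are used, the shared trend drops out and the remaining correlation is weak.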

When I calculate the correlations in a likewise manner to the above graph, but for just a five year period, I get a noisier plot (to be expected), but a plot that more or less tracks the evolution of correlations seen above.

I think your shorter periods show less correlation because the longer ones contain more of a trend.

Is the trend in CT Area greater than that for NSIDC Extent? This is implied by the ratio of the two indices. If so that would explain the difference between extent and area.


I miss Intrade. Last year, I cleaned up on the minimum extent markets.

While you guys are, of course, trying to make accurate predictions based on science, if we ever get any kind of prediction markets like Intrade back, the key to making money on betting on this (and I recommend against betting the family silver) is to keep going long on the low numbers for years. Over time, you'll win.

Just like with the GISS temp markets, or the last named storm of Atlantic hurricane season markets (much more risky, but over time, if you bet the signal, you'll make up for your loss on noise).

The real trick is convincing the deniers to put their money where their mouths are. It's amazing how many of them are too chicken to bet--proof that they don't really believe their own BS.


I find this exceptionally interesting, but for different reasons.

It seems to me that every estimate is little more than a guess. OK pure statistical analysis does come closer to the actual end result than modelling, but pure statistical analysis did not predict 2005, 2007 or 2012. It can’t. Because it can only model on a trend. Events which are driven by factors we don’t understand and can’t predict, are not present in a statistical model. They can’t be.

In the don’t know camp we have
W/m^2 on average for the summer
Solar output
Weather events (including summer cyclones and blocking highs)
El Nino
La Nina
Ice mechanics in dangerously weakened ice
Impact of Local methane
NAO future impact

In the “known for certain” camp we have
What happened last time

To me, to take the ONE variable we DO know and subject it to rigorous statistical analysis is simply creating a formula which states that

We are absolutely certain that our results equal “Don’t Know^3”

Every analysis, every prediction, every heavily worked out, massaged and strictly analysed offering starts with “If we see x, y or z then we will get”.

It seems to me that the stricter the rules of analysis and the more time and effort put into it, the wider the margin of the guess. Because, on a year by year basis, the predictions are nothing more than a guess. Whether they are a better guess (more reasoned with more data), or a worse guess (gut call), or in the middle guess (taking all the changes into account I think….), they are nothing more than a guess. Because we do not clearly understand and cannot model, on a day by day basis, the environment that will unfold in each melting season.

Otherwise the climate scientists would beat us every time. They’ve had decades to work on all of this….


I see potential outcomes of melting seasons (extent/area minimum) as a battle between weather conditions and ice thickness/quality. Until a couple of years ago the minimum was determined almost entirely by weather conditions during the different phases of the melting season.

Last year we saw clearer than ever (more clearly than in 2010 and 2011) that ice thickness/quality has diminished to a point where it starts to challenge the dominance of weather conditions.

This year - with a start that was incredibly non-conducive to extent/area decrease - might turn out to be further evidence of this process, although it's too early to tell. I think we'll know more at the end of July.

Espen Olsen


I am pretty sure we know a lot more by the end of July ;)!


I am reminded by my explorations a few years ago into algorithms for day trading.

The issue we are wrestling with is very similar, interestingly enough; it is the dynamic relationship between short term and long term trends in a system with multiple inputs, each of which *themselves* may be cyclical, and be modified by feedback from others.

In day trading, the result is almost indistinguishable from a casino; I found that with the inputs available, the output/success was essentially random, regardless of how involved the algorithm applied was. (No, I did not lose any money; I ran the models first.)

Chris Reynolds said: "I think your shorter periods show less correlation because the longer ones contain more of a trend."

I think this captures the essence of it. Short term trends are decoupled from the specific outcome of the longer term cycle. Further, interaction between aspects of the system could multiply or cancel out others contributing to the outcome. As a practical example, a long enough anticyclone over the Beaufort, or a pan-arctic dipole, could cancel and roll back the negative feedback of cloud and low temperatures in late May and early June which preserved ice extent and concentration. Conversely, a mild arctic cyclone forming might further ablate the heat entering the arctic, and form a bastion against rising continental temperatures on the margins.

Those, I think, are the phenomena which in the short term directly affect the end of season state. So to that end, the answer lies in understanding the dynamics of the atmosphere, and over the short term (solstice to solstice) determining what effects it will have on the conditions which either strengthen or weaken the ice. At any given juncture, the state of the ice is mostly independent of the weather conditions immediately to follow. To that end, I'd suggest we'd benefit by examining year over year weather data as a factor for correlations, as well as the state of the ice.


I claim 0 expertise here.

But it strikes me that, besides weather, the _quality_ of the ice--how slushy, briny, thin, full of holes..--is becoming a much larger factor than the quantity measures of extent, area, and even volume that we're used to dealing with.

But, perhaps with the exception of thickness, we have few solid numbers to attach confidently to these more qualitative aspects of the sea ice.

We do, though, have a pretty good general idea that these qualities are generally deteriorating quite markedly.

That's why I agree with some others here and on the blog that we are just a few days or weeks of strong sunshine away from a major cliff, a new minimum and maybe even something more stunning. But I also agree that things are more uncertain and harder to read than ever, especially within any one year.

Rob Dekker

Bill, thank you for a great post.

I've been looking at these correlations of early area/extent as predictors for later area/extent as well, but from the perspective of physical effect of solar energy uptake during the melting season.

Specifically I focused on May and June as predictors of September ice extent, so my analysis does not have the 'width' of yours, although it has some 'depth' that I think may be valuable in this analysis.

Specifically, I worked with 4 variables (predictors) that should (on the basis of laws of physics of albedo effect) affect energy uptake and thus ice melt during the melting season :

1) Northern Hemisphere snow cover
2) Arctic sea ice area
3) Effect of the area of polynyas and melt ponds throughout the ice extent, which I think is pretty well captured by the simple relation : (ice_extent - ice_area).
4) ice thickness of the ice that will melt before September.

I calculated (theoretically, and with a lot of guessing) how much of an effect each one of these variables should have on the September minimum, and then looked for evidence of these physical effects in the correlation data and linear regression results of observational data.

The results are quite interesting.

For starters, I can confirm that 'area' is a better predictor for the later state of the ice than 'extent', simply from a theoretical physics point of view : area determines how much energy gets absorbed, and thus how much ice will melt later on, while 'extent' can vary simply with how the wind blows, and does not (for the part of extent that does not include area) affect the amount of energy absorbed.

Secondly, I get a better correlation if I include early "snow cover" (using Rutgers data) than if I just include 'area' as a 'predictor'.
The effect is not large for Jun->Sept prediction, but it is significant, and for May, it seems that 'snow cover' is dominating the correlation.

Thirdly, I cannot find much evidence that point (4) (ice thickness) has changed much over the past 30 years, and this was surprising to me at first, until I realized that for any particular melting season, the ice that will melt out between May (or June) until September will be mostly First Year Ice. Whatever amount of MYI there is in any particular year will be mostly unaffected, and thus does not significantly affect the outcome of the prediction.

Finally, sorry that I don't have any data to present right now.
I don't have my programs and data at the computer I'm at right now, but tomorrow or Tuesday I'll present some result from my analysis.

For now, thank you for your post. It made me realize that, although correlation does not equal causation, it may be time to explain in more detail that the statistical analyses (of area, extent and snow cover) that we often do on this blog are actually grounded in the laws of physics, and that our observational data confirm theory. As a result, our predictions (if properly calculated) may not be as 'problematic' as you suggest.

More tomorrow.

Kevin McKinney

Thanks, Bill--nice post.

I agree, though, that "In other words, it has become more difficult to predict end-of-melt-season outcomes..." is an unwarranted conclusion. Since the reference period is much longer, variability is much lower; this by itself should improve the correlations.

Your conclusion could still be right, though; you could, for instance, analyze comparable periods from early in the record, say 1979-1984. How 'easy' was prediction then, as measured by the coefficients?

The thought about looking at detrended data is also a good one.

I agree, though, that "In other words, it has become more difficult to predict end-of-melt-season outcomes..." is an unwarranted conclusion. Since the reference period is much longer, variability is much lower; this by itself should improve the correlations.

Your conclusion could still be right, though; you could, for instance, analyze comparable periods from early in the record, say 1979-1984. How 'easy' was prediction then, as measured by the coefficients?

In a sense, the less dominant weather conditions become, the easier it should be to predict the minimum, right? Especially when the Arctic has become ice-free. ;-)


Well done, just shows the sensitivity of this thin ice to changes in weather and climate. Compare that to thick glaciers that respond over long periods. Juneau Icefield a good example.


Those numbers are great info. It seems SIA is more important to look at than extent through July.

Kevin McKinney
In a sense, the less dominant weather conditions become, the easier it should be to predict the minimum, right? Especially when the Arctic has become ice-free. ;-)

Easier, yes, but much less interesting!

Maybe you could look at this idea of the weather becoming less important. Rather than just doing one early comparison, follow the lead of a paper like Santer et al 2011 by computing and comparing coefficients for all possible bins of defined lengths.

For example, 5-year bins would look like this:

...and so on to:

Would you find a clear pattern of change in the resulting coefficients? If so, then that would suggest that the role of weather--the main ingredient in short-term variability, one would presume--might be changing.

(It wouldn't prove it, of course; you'd still need to look at whether the weather itself was changing in ways that bias the analysis. And we already know it is changing--it's getting warmer and warmer up there, on average!)
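A rough Python sketch of the binning idea follows. It assumes the bins are overlapping windows advanced one year at a time over the 1979-2012 record used in this thread; that reading, and the names `sliding_windows` and `windowed_correlations`, are my assumptions rather than anything specified above.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def sliding_windows(first_year, last_year, length):
    """All (start, end) year pairs of the given length, stepping one year at a time."""
    return [(start, start + length - 1)
            for start in range(first_year, last_year - length + 2)]

def windowed_correlations(years, early, late, length):
    """Correlate early-season and late-season values inside each window."""
    index = {year: i for i, year in enumerate(years)}
    results = {}
    for start, end in sliding_windows(years[0], years[-1], length):
        i, j = index[start], index[end] + 1
        results[(start, end)] = pearson(early[i:j], late[i:j])
    return results

bins = sliding_windows(1979, 2012, 5)
print(bins[0], bins[-1], len(bins))  # (1979, 1983) (2008, 2012) 30
```

Plotting the coefficient from `windowed_correlations` against the window start year would show whether the correlations drift over the record, keeping in mind that five points per window make each coefficient very noisy.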


The SIA stats in here suggest that a 2012 min is almost impossible, but we shouldn't rule anything out. However, it makes me question why so many are confident of a new record. This isn't extent which is unreliable. This is area which by the numbers is strong. Why is this the year where we get area to go off the grid from June? Even last year didn't.

Rob Dekker

As promised, a bit more on getting physics of the albedo effect involved in correlations.

In Bill's excellent overview of correlations here, he used 'area' earlier as a predictor for 'area' later (area->area), and similarly 'extent' earlier as a predictor of 'extent' later (extent->extent).

Here, Bill noted that the correlation of area->area is better than extent->extent, and also that when taken over the shorter timeframes (2003->2012, 2005->2012 and 2007->2012) the correlation factor reduces significantly (and in the case of 'extent->extent' almost completely disappears).

Others have pointed out that the correlation over the longer time frames is better because it simply qualifies the long term down trend that we all know is happening. So overall, it looks like predictions based on statistics are indeed problematic.

I think the reason that the correlations fall apart is because no physical meaning has been attached to using simply 'area' or 'extent' as a predictor for later 'area' or 'extent' of sea ice.

What I tried is to come up with a few variables known in June that reflect how much energy the Arctic absorbs in June, and see if a combination (formula) of these variables works better as a predictor for 'later' 'area' or 'extent'.

I chose my variables so that they reflect the albedo ('whiteness') of the Arctic, since that determines how much solar energy the Arctic will absorb for the remainder of the season. I kept it real simple, and thus chose 3 variables for albedo :

(1) Snow - Northern Hemisphere snow cover
(2) Extent - Arctic sea ice extent
(3) (Extent - Area) : Effect of the area of polynyas and melt ponds throughout the ice

The 'predictor' formula (how 'white' the Arctic is) can then be :

Snow + Extent - (Extent - Area)

Now, for each of these factors, we need to determine how much of the solar radiation will cause ice melt. As a start that has physical meaning, I chose the following weight factors :

For (3) : 1.0 (assuming that ALL solar radiation onto melting ice and into polynyas will cause ice to melt later in the season).
For (2) : 0.5 (assuming that half of the heat absorbed in the ocean OUTSIDE of the main pack will cause ice melt, while the other half would cause the ocean to warm up).
For (1) : 0.25 (assuming that half the heat from lack of snow cover will be blown North, and half of that will go to ice melt).

With that rough theoretical 'guess', we then get to the formula that should serve as a 'predictor' for the amount of heat that is absorbed in June due to snow/ice cover :

0.25 * Snow + 0.5 * Extent - 1.0 * (Extent - Area)

Note that this formula, expressed in simple factors is :

0.25 * Snow - 0.50 * Extent + 1.0 * Area
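As a quick check that the two forms of the formula above agree, here is a small Python sketch. The function names are illustrative and the input values are arbitrary placeholders (nominally in M km^2), not June observations.

```python
import math

def albedo_predictor(snow, extent, area):
    """Predictor as first written: weighted snow, extent, and (extent - area)."""
    return 0.25 * snow + 0.5 * extent - 1.0 * (extent - area)

def albedo_predictor_expanded(snow, extent, area):
    """Same predictor with the (extent - area) term multiplied out."""
    return 0.25 * snow - 0.50 * extent + 1.0 * area

# Placeholder inputs for illustration only.
snow, extent, area = 8.0, 11.0, 9.0

a = albedo_predictor(snow, extent, area)
b = albedo_predictor_expanded(snow, extent, area)
print(a, b)  # both 5.5 for these inputs
```

The extent weight changes sign on expansion because 0.5 * Extent - 1.0 * Extent = -0.5 * Extent, while Area picks up the full weight of 1.0.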

When I use this formula as the 'predictor' I get improved correlation numbers compared with using only 'Area' or 'Extent' as a predictor (especially for the shorter terms), which suggests we are on the right track !

But what I was really interested in is whether, by tweaking these weight factors (which after all were just educated guesses), we can improve the correlation numbers even more. If it turns out that the 'optimum' correlation is way off from the weight factors I suggested above, then we know that the physical effects of 'albedo' amplification are simply not significantly visible in the later ice cover numbers.
So I used the 1995-2012 series (long enough for statistical quantity and short enough to not be affected by completely different melting regions) and tweaked the numbers until I found optimal correlation. This was the resulting formula (normalized Area weight to 1.0) :

Formula1 = 0.26 * Snow - 0.59 * Extent + 1.0 * Area

This optimal weight factor choice based on observational data is almost scarily close to the very rough numbers that were based on pretty rough theoretical estimates, and we can talk more about the meaning of that later.
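The comment does not say how the weights were tweaked. One simple possibility, shown here only as a sketch, is a grid search over the snow and extent weights (area weight fixed at 1.0) that maximizes the correlation with the September value. Everything below is an assumption for illustration: the data are synthetic, generated with arbitrary "true" weights of 0.3 and -0.6, and `best_weights` is a hypothetical helper.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def best_weights(snow, extent, area, september, steps=10):
    """Grid-search snow weight in [0, 1] and extent weight in [-1, 0]."""
    best = (None, None, -2.0)
    for i in range(steps + 1):
        w_snow = i / steps
        for j in range(steps + 1):
            w_extent = -j / steps
            predictor = [w_snow * s + w_extent * e + a
                         for s, e, a in zip(snow, extent, area)]
            r = pearson(predictor, september)
            if r > best[2]:
                best = (w_snow, w_extent, r)
    return best

# Synthetic June inputs; NOT observations. True weights chosen arbitrarily.
rng = random.Random(0)
snow = [rng.uniform(5, 10) for _ in range(30)]
extent = [rng.uniform(8, 13) for _ in range(30)]
area = [rng.uniform(6, 11) for _ in range(30)]
september = [0.3 * s - 0.6 * e + a for s, e, a in zip(snow, extent, area)]

w_snow, w_extent, r = best_weights(snow, extent, area, september)
print(w_snow, w_extent, round(r, 3))
```

On noise-free synthetic data the search recovers the weights used to build it. With only 18 real years and noisy inputs, the optimum would be much flatter, so agreement between fitted and theoretical weights should be read with that in mind.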

For now, let me show you the initial results I get for the June->September prediction.
The first two lines are the results I got by reproducing Bill's June->Sept correlations and the last two lines are the correlations I get by applying my formula to predict Sept extent/area from June extent/area/snow-cover data :

                   1979-2012  1995-2012  2003-2012  2005-2012  2007-2012
area->area           0.93       0.90       0.84       0.74       0.85
extent->extent       0.84       0.77       0.48       0.14       0.28

formula1->area       0.94       0.94       0.93       0.90       0.90
formula1->extent     0.92       0.93       0.91       0.85       0.86

Now, what is interesting is that the correlation holds up, even for the shorter timeframes.

This means that Formula1 is a better predictor than simply 'area' or 'extent' alone for determining the September sea ice extent from information available in June.

And the interesting thing is that this formula is based on physics of ice melt, rather than on statistical extrapolation of area and extent.

Another interesting thing is the prediction of the 2013 September average minimum based on this formula. NSIDC has just published the June extent and area numbers. In the absence of Rutgers' snow cover numbers, I plugged in 200 k km^2 of extra snow cover for June; the formula then predicts 4.5 M km^2 ice extent this September, with a standard deviation of about 250 k km^2.

Sorry for the long post, but it may be that predictions are NOT as problematic as we thought, and are NOT simply guesses, as long as we use some physically relevant data.

Much more on this later, as there is a lot more interesting info to deduce now that we know that physical effects ARE recognizable in the observational data.


Great post Rob. I think we can go pretty low if the weather sets up right. But my money is still higher than your predicted numbers, but it will be fun to watch. I am not on the new record train this year given the data to this point.

I like to see detailed reasoning like that. Thanks for the post.

Rob Dekker

Thanks Henry. Of course weather for the remainder of the season still plays a role, but based on the simple model I presented, there is virtually no chance of reaching a record below 2012 this year.

The only thing I am specifically concerned about is these holes (caused by the PAC 2013) in the Central Basin, which fall in category (3) of my formula. Heat absorbed there has the highest impact, and since they are in the Arctic Basin, they may cause quite a bit of havoc (significant ice loss in areas where we normally don't see it) later in the season.

Again, it is a very interesting melting season !


Note that the limited resolution of the area measurements allows for a situation where numerous small leads add up to some albedo decrease. I can't come up with a proper measure for these. But it is great to see some advance in forecasting. My attempt failed badly by not taking the snow amount into consideration. Thanks, Rob Dekker.


Rob, your comment was so good that I elevated it to a follow-up guest blog: Problematic predictions 2.

Truly great stuff.


the attempt (2011) (IJIS extent) http://erimaassa.blogspot.fi/2011/07/number.html (I've lost the file in the recent crash of the computer)
true 4663594 vs. predicted 4015554
