In Bill's excellent overview of correlations here, he used 'area' earlier as a predictor for 'area' later (area->area), and similarly 'extent' earlier as a predictor of 'extent' later (extent->extent).
Here, Bill noted that the correlation of area->area is better than that of extent->extent, and also that over the shorter time frames (2003->2012, 2005->2012 and 2007->2012) the correlation factor drops significantly (and in the case of 'extent->extent' almost completely disappears).
Others have pointed out that the correlation over the longer time frames is better because it simply captures the long-term downtrend that we all know is happening. So overall, it looks like predictions based on statistics are indeed problematic.
I think the reason that the correlations fall apart is that no physical meaning has been attached to using simply 'area' or 'extent' as a predictor for later 'area' or 'extent' of sea ice.
What I tried is to come up with a few variables, known in June, that reflect how much energy the Arctic absorbs in June, and see if a combination (formula) of these variables works better as a predictor for 'later' 'area' or 'extent'.
I choose my variables so that they reflect the albedo ('whiteness') of the Arctic, since that determines how much solar energy the Arctic will absorb for the remainder of the season. I keep it real simple, and thus choose 3 variables for albedo:
(1) Snow - Northern Hemisphere snow cover
(2) Extent - Arctic sea ice extent
(3) (Extent - Area) : Effect of the area of polynyas and melt ponds throughout the ice
The 'predictor' formula (how 'white' the Arctic is) can then be:
Snow + Extent - (Extent - Area)
Now, for each of these factors, we need to determine how much of the solar radiation will cause ice melt. As a start, which has physical meaning, I choose the following weight factors:
For (3): 1.0 (assuming that ALL solar radiation onto melting ice and into polynyas will cause ice to melt later in the season).
For (2): 0.5 (assuming that half of the heat absorbed in the ocean OUTSIDE of the main pack will cause ice melt, while the other half would cause the ocean to warm up).
For (1): 0.25 (assuming that half the heat from lack of snow cover will be blown North, and half of that will go to ice melt).
With that rough theoretical 'guess', we then get to the formula that should serve as a 'predictor' for the amount of heat that is absorbed in June due to snow/ice cover:
0.25 * Snow + 0.5 * Extent - 1.0 * (Extent - Area)
Note that this formula expressed in simple factors is:
0.25 * Snow - 0.50 * Extent + 1.0 * Area
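As a quick check of the algebra, here is a minimal Python sketch of the two forms of the predictor. The function names, the argument names, and the idea of passing all three inputs in the same units (for example million km^2) are assumptions made for illustration; the post itself only gives the formulas.

```python
def predictor_weighted(snow, extent, area, w_snow=0.25, w_extent=0.5, w_gap=1.0):
    """Weighted form: w_snow*Snow + w_extent*Extent - w_gap*(Extent - Area).

    Default weights are the rough theoretical guesses from the text.
    """
    return w_snow * snow + w_extent * extent - w_gap * (extent - area)


def predictor_expanded(snow, extent, area):
    """Same formula with the (Extent - Area) term multiplied out."""
    return 0.25 * snow - 0.50 * extent + 1.0 * area
```

Both functions should return the same number for any input, since 0.5 * Extent - 1.0 * Extent collapses to -0.5 * Extent.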
When I use this formula as the 'predictor' I get improved correlation numbers compared with using only 'Area' or 'Extent' as a predictor (especially over the shorter time frames), which suggests we are on the right track!
But what I was really interested in is whether, by tweaking these weight factors (which after all were just educated guesses), we can improve the correlation numbers even more. If it turns out that the 'optimum' correlation is way off from the weight factors I suggested above, then we know that the physical effects of 'albedo' amplification are simply not significantly visible in the later ice cover numbers.
So I used the 1995-2012 series (long enough to give a reasonable sample size and short enough to not be affected by completely different melting regions) and tweaked the numbers until I found optimal correlation. This was the resulting formula (with the Area weight normalized to 1.0):
Formula1 = 0.26 * Snow - 0.59 * Extent + 1.0 * Area
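The post does not say how the weights were tweaked. One simple way to do this kind of search would be a brute-force grid over the Snow and Extent weights with the Area weight held at 1.0, keeping whichever pair gives the highest Pearson correlation with the September value. The sketch below assumes exactly that; the function names, the search ranges (Snow weight in [0, 1], Extent weight in [-1, 0]) and the step size are all hypothetical choices.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def best_weights(snow, extent, area, target, step=0.01):
    """Grid-search the Snow and Extent weights (Area weight fixed at 1.0).

    snow, extent, area: June values, one per year.
    target: September area or extent for the same years.
    Returns (w_snow, w_extent, r) for the highest correlation found.
    """
    steps = int(round(1 / step))
    best = (None, None, -2.0)  # any real correlation beats -2
    for i in range(steps + 1):
        w_snow = i * step            # assumed range: 0 .. 1
        for j in range(steps + 1):
            w_extent = -j * step     # assumed range: -1 .. 0
            pred = [w_snow * s + w_extent * e + a
                    for s, e, a in zip(snow, extent, area)]
            r = pearson(pred, target)
            if r > best[2]:
                best = (w_snow, w_extent, r)
    return best
```

Called with the June snow, extent and area series and the matching September series, `best_weights` returns the weight pair and its correlation. Because correlation is unchanged by rescaling the predictor, fixing one weight (here Area) is what makes the optimum unique.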
This optimal weight factor choice based on observational data is almost scarily close to the very rough numbers that were based on pretty rough theoretical estimates, and we can talk more about the meaning of that later.
For now, let me show you the initial results I get for the June->September prediction. The first two lines are the results I got by reproducing Bill's June->Sept correlations and the last two lines are the correlations I get by applying my formula to predict Sept extent/area from June extent/area/snow-cover data:
June->Sept         1979-2012  1995-2012  2003-2012  2005-2012  2007-2012
area->area              0.93       0.90       0.84       0.74       0.85
extent->extent          0.84       0.77       0.48       0.14       0.28
formula1->area          0.94       0.94       0.93       0.90       0.90
formula1->extent        0.92       0.93       0.91       0.85       0.86
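For anyone wanting to reproduce a table like this, the sketch below computes a Pearson correlation for each start year, always running through the last year in the series. It assumes the per-year June predictor and September target are already available as plain lists; the function names are hypothetical, and no actual sea ice data is included.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlations_by_start_year(years, predictor, target, start_years):
    """Return {start_year: r}, using only the years >= start_year."""
    result = {}
    for start in start_years:
        idx = [i for i, y in enumerate(years) if y >= start]
        result[start] = pearson([predictor[i] for i in idx],
                                [target[i] for i in idx])
    return result
```

With `years` running from 1979 to 2012 and `start_years=[1979, 1995, 2003, 2005, 2007]`, each entry of the result would correspond to one column of the table. Note that the shortest window (2007-2012) has only six points, so its correlation is sensitive to any single year.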
Now, what is interesting is that the correlation holds up, even for the shorter time frames. This means that Formula1 is a better predictor than simply 'area' or 'extent' alone for determining the September sea ice extent from information available in June. And the interesting thing is that this formula is based on the physics of ice melt, rather than on statistical extrapolation of area and extent.
Another interesting thing is the prediction of the 2013 September average minimum based on this formula. NSIDC has just published the June extent and area numbers, and if, in the absence of Rutgers' snow cover numbers, I plug in 200 k km^2 extra snow cover for June, this formula predicts 4.5 M km^2 ice extent this September, with about 250 k km^2 standard deviation [2012 was 3.61, 2007 was 4.30 and 2011 was 4.63 million km^2; N.]
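The comment does not spell out how the September figure and its standard deviation were derived from the formula. One common approach, assumed here purely for illustration, is an ordinary least-squares line from the June predictor to the September extent, with the standard deviation of the residuals as the uncertainty. The function names below are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares fit y = intercept + slope * x.

    Returns (slope, intercept, resid_std), where resid_std is the residual
    standard deviation with n - 2 degrees of freedom (needs n >= 3).
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    resid_std = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    return slope, intercept, resid_std


def predict(x_new, slope, intercept):
    """September estimate for a new June predictor value."""
    return intercept + slope * x_new
```

Here `xs` would be the Formula1 value for each past June and `ys` the matching September extent; `predict` is then applied to the current year's June value, and `resid_std` gives a rough one-sigma spread around that estimate.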
Sorry for the long post, but it may be that predictions are NOT as problematic as we thought, and are NOT simply guesses, as long as we use some physically relevant data.
Much more on this later, as there is a lot more interesting info to deduce now that we know that physical effects ARE recognizable in the observational data.
And in a second comment Rob writes:
Of course weather for the remainder of the season still plays a role, but based on the simple model I presented, there is virtually no chance of reaching a record below 2012 this year.
The only thing I am specifically concerned about is these holes (caused by the PAC 2013) in the Central Basin, which fall in category (3) of my formula. Heat absorbed there has the highest impact, and since they are in the Arctic Basin, they may cause quite a bit of havoc (significant ice loss in areas where we normally don't see it) later in the season.
Again, it is a very interesting melting season!