date: Fri, 11 Jan 2008 18:02:54 -0800
from: Karl Taylor <taylor13atXYZxyzl.gov>
subject: Re: Updated Figures
The upper panel of figure 2 shows the distribution of differences
between simulated and observed trends. The lower panel shows the kind
of differences we can expect to get by chance alone (i.e., unforced
variability), according to ensembles of simulations by individual
models. If we had larger ensembles, we would expect the distribution of
these intra-ensemble differences to be more nearly symmetrical about
zero. By chance the mean of the results is displaced negatively.
As Ben mentioned, if he had included run 1 minus run 2, as well as run 2
minus run 1 (and similarly for other pairs), the expected symmetry would
be realized, but he was afraid that this would constitute "double
counting". The point of the diagram, however, is to obtain our best
estimate of the differences we can expect to get by chance within a
model ensemble. I contend that the likelihood of getting a difference
of x is equal to the likelihood of getting a difference of -x (within a
single model's ensemble), so why not use this information to fill in the
pdf in a reasonable way? Thus, I would like to see each difference
plotted twice, once with a positive sign and again with a negative sign
(and, if you like, we can say we are weighting each point by a half, but
of course that doesn't matter here). In this way we will provide a
better picture of the true range of differences we would expect to get
from each model ensemble.
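A minimal sketch of the symmetrization proposed above, assuming the intra-ensemble differences are formed from all pairs of runs within one model. The function names and the three trend values are hypothetical and are not taken from the figure; this only illustrates the "plot each difference twice, weight by a half" idea.

```python
from itertools import combinations


def pairwise_differences(trends):
    """Run i minus run j for every pair i < j within one model's ensemble."""
    return [a - b for a, b in combinations(trends, 2)]


def symmetrized(differences):
    """Return each difference with both signs, each copy weighted by 1/2."""
    values = list(differences) + [-d for d in differences]
    weights = [0.5] * len(values)
    return values, weights


# Hypothetical trends for a three-member ensemble (illustrative values only).
trends = [0.25, 0.10, 0.30]

diffs = pairwise_differences(trends)
values, weights = symmetrized(diffs)

# Mean of the one-sided differences: nonzero purely because of pair ordering.
mean_one_sided = sum(diffs) / len(diffs)

# Weighted mean after symmetrization: zero up to rounding, by construction.
mean_symmetric = sum(v * w for v, w in zip(values, weights)) / sum(weights)

print(mean_one_sided, mean_symmetric)
```

With the half weights, the total weight equals the number of original pairs, so a weighted histogram keeps the same overall count while becoming symmetric about zero.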
One of the unfortunate problems with the asymmetry of the current figure
is that to a casual reader it might suggest a consistency between the
intra-ensemble distributions and the model-obs distributions that is not
real (and would be unexpected): namely that the differences between
trends in runs by individual models also typically are displaced
negatively, just like the difference between model and obs. This is, of
course, incorrect, and I think we should guard against this misreading.
Ben and I have already discussed this point, and I think we're both
still a bit unsure about the best thing to do here. Perhaps others
can provide convincing arguments for keeping the figure as is or making
it symmetric as I suggest.
There are a few other minor points concerning figure 2 which I'll write
down here, so that I don't forget them before I see Ben next on Monday.
1. In panel A, I would plot the histogram for model-obs, not obs-model.
I'm used to thinking of errors as being positive when the model value
is greater than observed and vice versa.
2. It would appear that if we believe FGOALS or MIROC, then the
differences between many of the model runs and obs are not likely to be
due to chance alone, but indicate a real discrepancy. If, on the other
hand, we believe several of the other models (e.g., MRI or PCM),
relatively few of the model-obs differences are significant. This
would seem to indicate that our conclusion depends on which model
ensembles we have most confidence in. Am I reasoning about this correctly?
One complicating factor here is that the normalized differences are
ratios, which in the intra-ensemble case roughly measure the amplitude
of variability on 20-year time scales (since the true forced trend is
the same for both runs) relative to unforced variability on shorter
time scales (as represented by the standard errors calculated from the
de-trended time-series). Thus, a model that has the total variability
about right will not yield the correct distribution unless the ratio of
the longer-term to shorter-term variability is correct. Similarly, a
model that has the incorrect total variance might yield a better
normalized trend distribution if the fraction of the total variability
exhibited on 20-year time-scales is correct.
3. Instead of '"Between realization" tests', wouldn't it be better to
say 'Intra-ensemble tests'?
4. Instead of "Model-vs-model results", wouldn't it be better to say
"Realization-vs-realization", so as not to imply that one model's run is
compared to a different model's run?
5. The model labels could be placed as axis labels in place of the model numbers.
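On point 2, a rough sketch of the kind of normalized trend difference under discussion. It assumes (the exact form is not spelled out in this message) that the statistic is the difference of two least-squares trends divided by the combined standard error of those trends, with each standard error computed from the de-trended residuals. The adjustment for temporal autocorrelation mentioned in Ben's message is left out for brevity, and the function names and series values are made up for illustration.

```python
import math


def trend_and_stderr(series):
    """Least-squares trend per time step and its (unadjusted) standard error."""
    n = len(series)
    times = range(n)
    t_mean = sum(times) / n
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    slope = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, series)) / sxx
    # Residuals of the de-trended series (shorter time scale variability).
    resid = [y - y_mean - slope * (t - t_mean) for t, y in zip(times, series)]
    resid_var = sum(r * r for r in resid) / (n - 2)
    return slope, math.sqrt(resid_var / sxx)


def normalized_difference(series_a, series_b):
    """Assumed form: trend difference over the combined standard error."""
    b_a, s_a = trend_and_stderr(series_a)
    b_b, s_b = trend_and_stderr(series_b)
    return (b_a - b_b) / math.sqrt(s_a ** 2 + s_b ** 2)


# Two made-up realizations from the same hypothetical model.
run_1 = [0.0, 0.3, 0.1, 0.5, 0.4, 0.8]
run_2 = [0.1, 0.0, 0.3, 0.2, 0.5, 0.4]

d_12 = normalized_difference(run_1, run_2)
d_21 = normalized_difference(run_2, run_1)
print(d_12, d_21)
```

Because the numerator reflects the spread of trends between runs and the denominator reflects residual variability within each run, the ratio depends on how a model partitions variance between the two time scales, not only on its total variance. Swapping the two runs only flips the sign, which is the symmetry argued for above.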
Ben Santer wrote:
> My apologies. I forgot to attach the Figures in my last email. Figures
> are appended now. I plead Douglass-induced forgetfulness...
> Best regards,
> Ben Santer wrote:
>> Dear folks,
>> Here are the revised Figures 1-3 of our contribution to IJoC.
>> Changes made:
>> Figure 1: In panel A, I've added some space to separate the UAH and
>> RSS trends from the tick marks on the right hand side of the plot, as
>> per Leo's request.
>> Figure 2: As Peter suggested, I've converted the Figure from one to
>> two panels. I agree that this is an improvement. The original Figure
>> was fairly "busy". Furthermore, the colored symbols (which denote
>> results for the "between realization" trend tests) bore no
>> relationship to the "Frequency of occurrence" scale on the y-axis.
>> This is now clear from panel B.
>> Figure 3: As Mike suggested, I've removed the legend from the interior
>> of the Figure (it's now below the Figure), and have added arrows to
>> indicate the theoretically-expected rejection rates for 5%, 10%, and
>> 20% tests. As Dian suggested, I've changed the colors and thicknesses
>> of the lines indicating results for the "paired trends". Visually,
>> attention is now drawn to the results we think are most reasonable -
>> the results for the paired trend tests with standard errors adjusted
>> for temporal autocorrelation effects.
>> Please let me know if you would like me to make any other changes.
>> With best regards,
>> Benjamin D. Santer
>> Program for Climate Model Diagnosis and Intercomparison
>> Lawrence Livermore National Laboratory
>> P.O. Box 808, Mail Stop L-103
>> Livermore, CA 94550, U.S.A.
>> Tel: (925) 422-2486
>> FAX: (925) 422-7675
>> email: santer1atXYZxyzl.gov