

cc: "Thomas.R.Karl" <>, carl mears <>, "David C. Bader" <>, "'Dian J. Seidel'" <>, "'Francis W. Zwiers'" <>, Frank Wentz <>, Karl Taylor <>, Melissa Free <>, "Michael C. MacCracken" <>, "'Philip D. Jones'" <>, Sherwood Steven <>, Steve Klein <>, 'Susan Solomon' <>, "Thorne, Peter" <>, Tim Osborn <>, Tom Wigley <>
date: Sun, 16 Dec 2007 14:39:09 +0100
from: Leopold Haimberger <>
subject: Re: [Fwd: sorry to take your time up, but really do need a scrub

Hello John and colleagues,
My colleagues from Vienna and I have a paper under review at J. Climate; we
submitted the second revision last week. The reviews of the first revision
were quite positive.
It tries to explain why versions 1.3 and 1.4 in particular are better than 1.2. It also
explains a method that uses the breakpoint dates gained from
RAOBCORE as metadata for a neighbor composite homogenization method similar to HadAT. This
second method is much less dependent on possible inhomogeneities in the ERA-40 background.
I sent it to Ben already but those interested may download it from
but please keep it confidential.
The paper shows that radiosonde homogenization is making progress, though it is
of course ongoing work. At least it shows that there seem to be ways to remove
much of the pervasive bias in radiosonde temperatures.
Time series from RAOBCORE v1.4 are already published in Arguez et al. (2007),
Supplement to State of the Climate in 2006, BAMS 88(6), S1-S135 (I believe the
plot numbers are 2.4 and 2.5). These plots were collected by J. Christy about
the same time last year.
I have to leave now but can give you details later.
John Lanzante wrote:


Perhaps a resampling test would be appropriate. The tests you have performed
consist of pairing an observed time series (UAH or RSS MSU) with each
of the 49 GCM time series from your "ensemble of opportunity". Significance
of the difference between each pair of obs/GCM trends yields a certain
number of "hits".

To determine a baseline for judging how likely it would be to obtain the
given number of hits one could perform a set of resampling trials by
treating one of the ensemble members as a surrogate observation. For each
trial, select at random one of the 49 GCM members to be the "observation".
From the remaining 48 members draw a bootstrap sample of 49, and perform
49 tests, yielding a certain number of "hits". Repeat this many times to
generate a distribution of "hits".

The actual number of hits, based on the real observations could then be
referenced to the Monte Carlo distribution to yield a probability that this
could have occurred by chance. The basic idea is to see if the observed
trend is inconsistent with the GCM ensemble of trends.
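
The resampling scheme described above can be sketched as follows. This is a
minimal illustration, not the actual analysis: the trends, standard errors,
and the simple two-sided z-test on trend differences are all hypothetical
stand-ins for whatever values and test the real study would use.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical stand-ins for the 49 GCM trends and their standard errors
# (units, e.g., K/decade); the real values would come from the analysis.
gcm_trends = rng.normal(0.20, 0.05, size=49)
gcm_ses = np.full(49, 0.04)

def count_hits(ref_trend, ref_se, trends, ses, z_crit=1.96):
    """Count pairwise trend differences significant at ~5% (two-sided z-test)."""
    z = np.abs(ref_trend - trends) / np.sqrt(ref_se**2 + ses**2)
    return int((z > z_crit).sum())

def null_hit_distribution(trends, ses, n_trials=2000):
    """For each trial, one member plays the 'observation'; a bootstrap
    sample of 49 is drawn from the remaining 48 and tested against it."""
    n = len(trends)
    hits = np.empty(n_trials, dtype=int)
    for t in range(n_trials):
        i = rng.integers(n)                            # surrogate observation
        rest = np.delete(np.arange(n), i)              # the remaining 48
        boot = rng.choice(rest, size=n, replace=True)  # bootstrap sample of 49
        hits[t] = count_hits(trends[i], ses[i], trends[boot], ses[boot])
    return hits

null_hits = null_hit_distribution(gcm_trends, gcm_ses)
actual_hits = count_hits(0.10, 0.04, gcm_trends, gcm_ses)  # hypothetical obs
p_value = (null_hits >= actual_hits).mean()  # chance of this many hits
```

Referencing the actual hit count to the Monte Carlo distribution of hits, as in
the last line, gives the probability that the observed count could have arisen
by chance within the ensemble.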

There are a couple of additional tweaks that could be applied to your method.
You are currently computing trends for each of the two time series in the
pair and assessing the significance of their differences. Why not first
create a difference time series and assess the significance of its trend?
The advantage of this is that you would reduce somewhat the autocorrelation
in the time series and hence the effect of the "degrees of freedom"
adjustment. Since the GCM runs are coupled model runs, this differencing
would help remove the common externally forced variability, but not the
internal variability, so the adjustment would still be needed.

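One way to sketch this differencing approach, assuming a simple least-squares
trend and the usual lag-1-autocorrelation ("effective sample size") adjustment
to the degrees of freedom; the synthetic series below are illustrative only:

```python
import numpy as np

def difference_trend_test(obs, model):
    """Trend of the obs-minus-model difference series, with the standard
    error inflated for lag-1 autocorrelation of the regression residuals."""
    d = np.asarray(obs) - np.asarray(model)
    t = np.arange(len(d), dtype=float)
    slope, intercept = np.polyfit(t, d, 1)
    resid = d - (slope * t + intercept)
    r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]   # lag-1 autocorrelation
    n_eff = len(d) * (1.0 - r1) / (1.0 + r1)        # effective sample size
    se = np.sqrt((resid**2).sum() / (n_eff - 2.0)
                 / ((t - t.mean())**2).sum())
    return slope, slope / se                        # trend and its t-ratio

# Synthetic 20-year monthly series with slightly different trends:
rng = np.random.default_rng(1)
t = np.arange(240)
obs = 0.0010 * t + rng.normal(0.0, 0.3, 240)
model = 0.0015 * t + rng.normal(0.0, 0.3, 240)
slope, t_ratio = difference_trend_test(obs, model)
```

Because the common forced signal cancels in the difference series, its
residual autocorrelation (and hence the degrees-of-freedom penalty) is
typically smaller than in either raw series.
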
Another tweak would be to alter the significance level used to assess
differences in trends. Currently you are using the 5% level, which yields
only a small number of hits. If you made this less stringent you would get
potentially more, albeit weaker, hits. But it would all come out in the wash, so to
speak since the number of hits in the Monte Carlo simulations would increase
as well. I suspect that increasing the number of expected hits would make the
whole procedure more powerful/efficient in a statistical sense since you
would no longer be dealing with a "rare event". In the current scheme, using
a 5% level with 49 pairings you have an expected hit rate of 0.05 X 49 = 2.45.
For example, if instead you used a 20% significance level you would have an
expected hit rate of 0.20 X 49 = 9.8.
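
The expected hit rates quoted above follow directly, under the idealized
assumption that the 49 pairwise tests are independent (in practice they share
the same observed series, so this is only a rough guide):

```python
# Expected number of chance "hits" among 49 pairwise tests at several
# significance levels, assuming (idealized) independent tests.
n_tests = 49
expected = {alpha: alpha * n_tests for alpha in (0.05, 0.10, 0.20)}
# expected[0.05] = 2.45 and expected[0.20] = 9.8, matching the figures above.
```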

I hope this helps.

On an unrelated matter, I'm wondering a bit about the different versions of
Leo's new radiosonde dataset (RAOBCORE). I was surprised to see that the
latest version has considerably more tropospheric warming than I recalled
from an earlier version that was written up in JCLI in 2007. I have a
couple of questions that I'd like to ask Leo. One concern is that if we use
the latest version of RAOBCORE is there a paper that we can reference --
if this is not in a peer-reviewed journal is there a paper in submission?
The other question is: could you briefly comment on the differences in
methodology used to generate the latest version of RAOBCORE as compared to
the version used in JCLI 2007, and what/when/where did changes occur to
yield a stronger warming trend?

Best regards,


On Saturday 15 December 2007 12:21 pm, Thomas.R.Karl wrote:

Thanks Ben,

You have the makings of a nice article.

I note that we would expect about 10 cases that are significantly different
by chance (based on the 196 tests at the .05 significance level). You found 3.
With the appropriately corrected Leopold data I suspect you will find there
are indeed statistically significant similar trends, including amplification.
Setting up the statistical testing should be interesting with this many
combinations.
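
For what it's worth, if the 196 tests were treated as independent (they are
not strictly, since the ensemble members share forcings, so this is only
indicative), the chance of seeing as few as 3 hits can be read off a binomial
distribution:

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), by direct summation."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n_tests, alpha, hits_found = 196, 0.05, 3
expected_hits = alpha * n_tests                   # about 9.8 chance hits
p_so_few = binom_cdf(hits_found, n_tests, alpha)  # well below 0.05
```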

Regards, Tom

Ao. Univ. Prof. Dr. Leopold Haimberger
Institut für Meteorologie und Geophysik, Universität Wien
Althanstraße 14, A - 1090 Wien
Tel.: +43 1 4277 53712
Fax.: +43 1 4277 9537
