Friday, April 27, 2012

3578.txt

cc: "Thorne, Peter" <peter.thorneatXYZxyzoffice.gov.uk>, Leopold Haimberger <leopold.haimbergeratXYZxyzvie.ac.at>, Karl Taylor <taylor13atXYZxyzl.gov>, Tom Wigley <wigleyatXYZxyz.ucar.edu>, John Lanzante <John.LanzanteatXYZxyza.gov>, "'Susan Solomon'" <ssolomonatXYZxyznoaa.gov>, Melissa Free <Melissa.FreeatXYZxyza.gov>, peter gleckler <gleckler1atXYZxyzl.gov>, "'Philip D. Jones'" <p.jonesatXYZxyz.ac.uk>, Thomas R Karl <Thomas.R.KarlatXYZxyza.gov>, Steve Klein <klein21atXYZxyzl.llnl.gov>, carl mears <mearsatXYZxyzss.com>, Doug Nychka <nychkaatXYZxyzr.edu>, Gavin Schmidt <gschmidtatXYZxyzs.nasa.gov>, Frank Wentz <frank.wentzatXYZxyzss.com>, ssolomonatXYZxyzi.com
date: Fri, 25 Apr 2008 12:55:28 -0700
from: Ben Santer <santer1atXYZxyzl.gov>
subject: Re: [Fwd: JOC-08-0098 - International Journal of Climatology]
to: Steve Sherwood <steven.sherwoodatXYZxyze.edu>

Dear Steve,

Thanks very much for these comments. They will be very helpful in
responding to Reviewer #1.

Best regards,

Ben

Steve Sherwood wrote:
> Ben,
>
> It sounds like the reviewer was fair. If (s)he misunderstood or didn't
> catch things, the length of the manuscript may have been a factor, and I
> am definitely sympathetic to that particular complaint.
>>
>> CONCERN #1: Assumption of an AR-1 model for regression residuals.
> I also am no great fan of AR1 models parameterized by the lag-1
> autocorrelation, because if the time step is too short they can go greatly
> astray at longer lags where it matters. But if you choose the
> persistence parameter to give a good fit to the entire autocorrelation
> function--i.e. make sure it decays to 1/e at about the right lag--it
> should work fine. I suggest trying this to see whether it changes
> anything much, and if not, leaving it at that. I think that for simply
> generating confidence intervals on a scalar measure there is no reason
> to go to higher-order AR processes, as a matter of principle.
>
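A minimal Python sketch of the fitting suggestion above (the function names, the 60-lag cutoff, and the surrogate-based confidence interval are illustrative assumptions, not anything taken from the manuscript): choose the AR(1) persistence parameter so that phi**k decays to 1/e at the same lag as the empirical residual autocorrelation, then use AR(1) surrogates of the residuals to bracket the trend.

import numpy as np

def acf(x, maxlag):
    # Sample autocorrelation function out to maxlag.
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.sum(x * x)
    return np.array([np.sum(x[:len(x) - k] * x[k:]) / denom for k in range(maxlag + 1)])

def fit_phi(resid, maxlag=60):
    # Pick phi so the AR(1) ACF (phi**k) reaches 1/e at the lag where the data's ACF does.
    maxlag = min(maxlag, len(resid) - 1)
    rho = acf(resid, maxlag)
    below = np.where(rho < np.exp(-1.0))[0]
    tau = int(below[0]) if below.size else maxlag
    return np.exp(-1.0 / tau)

def trend_ci(y, nsim=2000, seed=0):
    # Least-squares trend plus a 95% confidence interval from AR(1) surrogates.
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    b, a = np.polyfit(t, y, 1)                               # slope, intercept
    resid = y - (a + b * t)
    phi = fit_phi(resid)
    sig = np.std(resid, ddof=1) * np.sqrt(1.0 - phi ** 2)    # innovation std dev
    trends = np.empty(nsim)
    for i in range(nsim):
        e = np.empty(len(y))
        e[0] = rng.normal(0.0, np.std(resid, ddof=1))
        for k in range(1, len(y)):
            e[k] = phi * e[k - 1] + rng.normal(0.0, sig)
        trends[i] = np.polyfit(t, a + b * t + e, 1)[0]
    lo, hi = np.percentile(trends, [2.5, 97.5])
    return b, (lo, hi), phi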
>> CONCERN #2: No "attempt to combine data across model runs."
> The only point of doing this would seem to be to test whether there are
> any individual models that can be falsified by the data. It is a
> judgment call whether to go down this road--my judgment would be, no,
> that is a subject for a model evaluation/intercomparison paper. The
> question at issue here is whether GCMs or the CMIP3 forcings share some
> common flaw; the implication of the Douglass et al paper is that they
> do, and that future climate may therefore venture outside the range
> simulated by GCMs. The appropriate null hypothesis is that the observed
> data record could with nonnegligible probability have been produced by a
> climate model---not that it could be reproduced by every climate model.
>
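For reference, a rough sketch of a paired-trends comparison of the kind discussed here (an illustrative version only; the effective-sample-size adjustment and all names below are assumptions, not the manuscript's exact procedure): each model run's trend is compared with the observed trend, with both standard errors inflated for lag-1 autocorrelation of the regression residuals.

import numpy as np

def trend_and_stderr(y):
    # OLS trend with its standard error inflated for lag-1 autocorrelation.
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    b, a = np.polyfit(t, y, 1)
    resid = y - (a + b * t)
    r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]            # lag-1 autocorrelation
    n_eff = max(len(y) * (1.0 - r1) / (1.0 + r1), 3.0)       # effective sample size
    s2 = np.sum(resid ** 2) / (n_eff - 2.0)                  # adjusted residual variance
    se = np.sqrt(s2 / np.sum((t - t.mean()) ** 2))           # adjusted trend standard error
    return b, se

def paired_trend_stat(y_obs, y_model):
    # d = (b_obs - b_model) / sqrt(se_obs^2 + se_model^2); roughly N(0,1) under the null.
    b_o, se_o = trend_and_stderr(y_obs)
    b_m, se_m = trend_and_stderr(y_model)
    return (b_o - b_m) / np.sqrt(se_o ** 2 + se_m ** 2)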
>>
>> The Reviewer seems to be arguing that the main advantage of his
>> approach #2 (use of ensemble-mean model trends in significance
>> testing) relative to our paired trends test (his approach #1) is that
>> non-independence of tests is less of an issue with approach #2. I'm
>> not sure whether I agree. Are results from tests involving GFDL CM2.0
>> and GFDL CM2.1 temperature data truly "independent" given that both
>> models were forced with the same historical changes in anthropogenic
>> and natural external forcings? The same concerns apply to the high-
>> and low-resolution versions of the MIROC model, the GISS models, etc.
> (S)he seems to have been referring to the fact that all models are
> tested with the same data. I also fail to see how any change in
> approach would affect this issue.
>>
>> I am puzzled by some of the comments the Reviewer has made at the top
>> of page 3 of his review. I guess the Reviewer is making these comments
>> in the context of the pair-wise tests described on page 2. Crucially,
>> the comment that we should use "...the standard error if testing the
>> average model trend" (and by "standard error" he means DCPS07's
>> sigma{SE}) IS INCONSISTENT with the Reviewer's approach #3, which
>> involves use of the inter-model standard deviation in testing the
>> average model trend.
> I also am puzzled. The standard error is appropriate if you have a
> large ensemble of observed time series, but not if you have only one.
> Computing the standard error of the model mean is useless when you have
> no good estimate of the mean of the real world to compare it to. The
> essential mistake of DCPS was to assume that the single real-world time
> series was a perfect estimator of the mean.
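A back-of-envelope illustration of that point, with an assumed inter-model trend spread of 0.10 K/decade (the numbers are invented, not from any dataset): sigma_SE shrinks toward zero as the number of runs grows, so a test built on the ensemble mean plus or minus 2*sigma_SE measures how precisely the mean is known, not whether a single realization could have come from the ensemble.

import numpy as np

s = 0.10                          # assumed inter-model standard deviation of trends (K/decade)
for N in (5, 19, 49):
    sigma_se = s / np.sqrt(N)     # DCPS07-style standard error of the multi-model mean
    print(f"N = {N:2d}   sigma_SE = {sigma_se:.3f}   inter-model s = {s:.3f}")
# sigma_SE says how well the multi-model MEAN is known; the distance a single
# realization (the one observed record) can wander from that mean is governed
# by s plus observational and internal-variability noise.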
>>
>> And I disagree with the Reviewer's comments regarding the superfluous
>> nature of Section 6. The Reviewer states that, "when simulating from a
>> known (statistical) model... the test statistics should by definition
>> give the correct answer." The whole point of Section 6 is that the
>> DCPS07 consistency test does NOT give the correct answer when applied
>> to randomly-generated data!
> Maybe there is a more compact way to show this?
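One compact way to show it might be a Monte Carlo along the lines of Section 6 (a hypothetical sketch; the AR(1) persistence, series length, and ensemble size below are invented): draw every "model" run and the "observations" from the same zero-trend AR(1) process and count how often a DCPS07-style test (observed trend outside the ensemble mean plus or minus 2*sigma_SE) rejects. A correct 5% test should reject about 5% of the time; this one rejects far more often.

import numpy as np

rng = np.random.default_rng(1)

def ar1_series(n, phi, sigma=1.0):
    # Zero-trend AR(1) noise with persistence phi.
    e = np.empty(n)
    e[0] = rng.normal(0.0, sigma / np.sqrt(1.0 - phi ** 2))
    for k in range(1, n):
        e[k] = phi * e[k - 1] + rng.normal(0.0, sigma)
    return e

def trend(y):
    t = np.arange(len(y))
    return np.polyfit(t, y, 1)[0]

n, phi, n_models, n_trials = 252, 0.8, 19, 500    # assumed, purely for illustration
rejects = 0
for _ in range(n_trials):
    model_trends = np.array([trend(ar1_series(n, phi)) for _ in range(n_models)])
    obs_trend = trend(ar1_series(n, phi))          # "observations" from the same process
    sigma_se = model_trends.std(ddof=1) / np.sqrt(n_models)
    if abs(obs_trend - model_trends.mean()) > 2.0 * sigma_se:
        rejects += 1
print("rejection rate:", rejects / n_trials)       # far above the nominal 0.05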
>> In order to satisfy the Reviewer's curiosity, I'm perfectly willing to
>> repeat the simulations described in Section 6 with a higher-order AR
>> model. However, I don't like the idea of simulating synthetic
>> volcanoes, etc. This would be a huge time sink, and would not help to
>> illustrate or clarify the statistical mistakes in DCPS07.
> I wouldn't advise any of that.
>
> -SS
>


--
----------------------------------------------------------------------------
Benjamin D. Santer
Program for Climate Model Diagnosis and Intercomparison
Lawrence Livermore National Laboratory
P.O. Box 808, Mail Stop L-103
Livermore, CA 94550, U.S.A.
Tel: (925) 422-2486
FAX: (925) 422-7675
email: santer1atXYZxyzl.gov
----------------------------------------------------------------------------

