cc: mannatXYZxyzginia.edu

date: Mon, 19 Feb 2001 10:58:36 -0500

from: "Michael E. Mann" <mannatXYZxyztiproxy.evsc.virginia.edu>

subject: Re: verification results

to: Tim Osborn <t.osbornatXYZxyz.ac.uk>, Scott Rutherford <rutherfoatXYZxyzchutes.gso.uri.edu>

Hi Tim,

Thanks for your response.

Yes, I was a bit (though not too) surprised at the similarity between the

seasonal results, but clearly warm season and calendar mean are superior to

cold-season, and at least this much is to be expected. Much of the skill

comes from getting the baseline 19th century temperature right, and this is

somewhat similar between seasons.

The combined (multiproxy+MXD) results, if you saw those, are particularly

promising w/ a verification resolved variance of 0.83. Scott is currently

extending all 3 reconstructions (Multiproxy, mxd and combined) back to AD

1400. This should be very telling!

I agree that the verification scores, as currently computed, may be

slightly inflated by the fact that the infilled data have been used rather

than the raw data. I've given this some thought, and I propose that we

do the verification 3 different ways. Tim, perhaps you can help out in

particular w/ number "3".

1) (multiproxy, MXD, and combined): Use all reconstructed gridpoints, using

the infilled data, the way Scott has done it. [ALREADY DONE]

2) (multiproxy, MXD, and combined): Only perform the MSE and total variance

sums used to compute the resolved variance over those points (in both space

and time) where the original instrumental data is available (i.e., just skip

in the sum any location where there is no original value in the raw Jones

data, or a point in time for a given gridpoint where the original data is

missing).

3) (only makes sense for MXD, warm season) Perform the sums using only the

actual available Jones gridpoint data (warm season mean) for only those

temperature gridpoints which actually are MXD gridpoints themselves.

Compare the verification results for the current reconstruction of the

corresponding temperature gridpoints, based on the ridge

regression/expectation maximization algorithm, with those for the

reconstructions of those gridpoints directly from the MXD data.
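
The three ways of doing the sums can be sketched in Python as below. This is only an illustration under stated assumptions: the resolved variance is taken to be 1 - SSE / (sum of squared observed anomalies), missing raw values are marked as NaN, and the function names and toy numbers are hypothetical rather than taken from Scott's code.

```python
import math

def resolved_variance(obs, rec, keep=None):
    """Assumed form: 1 - SSE / sum of squared observed anomalies.

    obs, rec: lists of rows (time steps) of gridpoint anomalies.
    keep: optional same-shaped booleans; None keeps every cell.
    """
    sse = 0.0
    total = 0.0
    for t, (obs_row, rec_row) in enumerate(zip(obs, rec)):
        for x, (o, r) in enumerate(zip(obs_row, rec_row)):
            if keep is not None and not keep[t][x]:
                continue  # skip cells excluded from the sums
            sse += (o - r) ** 2
            total += o ** 2
    return 1.0 - sse / total

def available(raw):
    """True wherever the raw instrumental field has a value (not NaN)."""
    return [[not math.isnan(v) for v in row] for row in raw]

def restrict_to(keep, gridpoints):
    """Additionally keep only the given gridpoint (column) indices."""
    return [[k and (x in gridpoints) for x, k in enumerate(row)] for row in keep]

# Toy data (hypothetical): 2 time steps x 2 gridpoints.
nan = float("nan")
infilled = [[1.0, 2.0], [3.0, 4.0]]   # infilled instrumental anomalies
raw = [[1.0, nan], [3.0, 4.0]]        # raw field, one value missing
rec = [[1.0, 1.0], [3.0, 2.0]]        # a reconstruction
mxd_gridpoints = {0}                  # assumed set of MXD gridpoint columns

way1 = resolved_variance(infilled, rec)                  # all cells, infilled data
way2 = resolved_variance(infilled, rec, available(raw))  # only where raw data exist
way3 = resolved_variance(
    infilled, rec, restrict_to(available(raw), mxd_gridpoints)
)                                                        # raw data at MXD gridpoints only
```

For way 3, the same call would be made once for each of the two reconstructions being compared, using the same mask for both.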

Scott, perhaps you can let Tim know where he might most easily be of help

here. Let me know if there are any questions/comments about the above. Thanks.

I should have more than enough to talk about at EGS once we are done w/ the

additional runs/exercises described above. Scott and I will meet in C'Ville

in early March to try to get the results together. After EGS, we should

think about starting to write some of this up!

more soon,

mike

At 01:20 PM 2/19/01 +0000, Tim Osborn wrote:

>Dear Scott and Mike,

>

>sorry for the lack of response to your e-mail in January. I've been away

>quite a bit and then had three proposal deadlines come up last week, so was

>rather busy. The results look *very* promising, with good verification

>statistics etc. The multivariate resolved variance is only slightly higher

>for the warm season than for the cold season which is a little surprising,

>though I guess warm & cold seasons are more strongly correlated for the

>leading covariance structures that are most highly weighted than for the

>smaller scale variations that are suppressed. A couple of specific things:

>

>(1) MXD errors. A really useful measure that one can use here is the EPS

>(Expressed Population Signal), based on the mean correlation between the

>tree cores that make up each chronology (rbar), and the number of tree

>cores (n). Time-dependence can be included by putting the time-varying

>number of tree cores into the EPS equation. This would give some measure

>of the 387 chronologies. For the gridded case, some improvement would have

>to be estimated for the many grid boxes that contain >1 chronology, but

>that's not difficult to do. What is *difficult* is that I don't have

>either rbar or n for each chronology! All I have is, for each year and

>each chronology, the number of tree cores with data expressed as a fraction

>of the maximum number of tree cores for that chronology, i.e.,

>fraction(x,t) = n(x,t) / nmax(x). Since I don't know nmax for each

>chronology x, I can't work out n(x,t) from fraction(x,t). Nor do I know

>rbar(x). I've been wanting to get rbar(x) and n(x,t) ever since I started

>using this tree-ring data set (1997), but it's apparently not easily

>available. How important do you think it is for the current piece of work?

>If it is important then I could try to get it again, but if it's a fairly

>minor side issue then I won't.

>

>(2) From what I remember about the plan of action, the idea was to do these

>first runs using the already infilled instrumental data (though of course

>withheld for the pre-1901 period). I assume, therefore, that the timeseries

>and statistics shown are from this infilled & complete data set (I see now

>from the annual maps that this *is* the case because the "raw data" maps

>are complete). It would be useful to compare these MXD-based

>reconstructions with the instrumental data by sampling only those grid

>boxes in both the raw and reconstructed fields that originally contained

>real data in the instrumental data set. [I assume that the infilled

>instrumental data will be the same as the original instrumental data for

>these grid boxes.] The comparisons/statistics shown are useful, but only

>in addition to the comparison done against the (subsampled) original data,

>because (i) the message is harder to get across when using already infilled

>data; and (ii) the statistics *might* be slightly (artificially) improved

>when using infilled data. If you'd prefer me to do this, then let me know

>where the data sets are (plus format) and I can compare against my copy of

>the Jones instrumental data.
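
For reference on point (1) above, the EPS is commonly written as EPS = n*rbar / (1 + (n - 1)*rbar). The short Python sketch below assumes that standard form and shows how a time-varying n(x,t) would enter once nmax(x) is known. The function names and the rbar, nmax and fraction values are hypothetical, chosen only for illustration, since neither rbar nor nmax is available for these chronologies.

```python
def eps(rbar, n):
    """Expressed Population Signal for n cores with mean inter-core correlation rbar."""
    return n * rbar / (1.0 + (n - 1.0) * rbar)

def cores_from_fraction(fraction, nmax):
    """Recover n(x,t) from fraction(x,t) = n(x,t) / nmax(x) for one chronology."""
    return [round(f * nmax) for f in fraction]

# Hypothetical values for a single chronology x.
rbar = 0.5                    # assumed mean correlation between cores
nmax = 20                     # assumed maximum number of cores
fraction = [0.05, 0.5, 1.0]   # fraction(x,t) for three sample years

n_t = cores_from_fraction(fraction, nmax)   # time-varying core counts
eps_t = [eps(rbar, n) for n in n_t]         # time-dependent EPS
```

With a fixed rbar, the EPS rises toward 1 as the number of cores increases, which is why the time-varying core count matters for the error estimate.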

>

>So I guess that you should have enough to talk about at the EGS in Nice

>then Mike? Please let me know if you need any further input from me at

>this stage.

>

>Best regards to you both,

>

>Tim


