cc: Phil Jones <p.jonesatXYZxyz.ac.uk>, Kate Willett <kate.willettatXYZxyzoffice.gov.uk>, Dick Dee <email@example.com>
date: Mon, 16 Feb 2009 08:36:13 +0000
from: "peter.thorne" <peter.thorneatXYZxyzoffice.gov.uk>
subject: Re: updated HadCRUH
to: Adrian Simmons <Adrian.SimmonsatXYZxyzwf.int>
Adrian et al.,
Without looking at the data in detail, I suspect it's because N. America
is so data sparse in HadCRUH (the data feed for N. America was corrupted
for Kate's PhD - we will remedy this in our new dataset efforts - there are
literally a couple of thousand stations) that Kate's approach has led to
discernible differences. Kate will produce some sensible words on this
for the paper, but basically:
1. I have performed some intra-station qc (different to HadCRUH) as part
of planned follow-on work.
2. Kate has taken this data and anomalised it so that on an annual-mean
basis the station series should agree over the overlap period.
3. Because there were obvious hotspots when we gridded it up, Kate then
went cross-eyed for several days looking at station-neighbour comparison
plots to weed out those with obvious discontinuities (think of this as a
Because N. America in HadCRUH is so data sparse and Kate rejected a
fairly hefty proportion of stations (about 1/4 - more in RH than q) as
having obvious issues, I suspect this is the reason for your disagreement
in that region. The other regions are better sampled, so dropping
stations won't have as big an effect on the noise (red or white!). I
think there's a figure in Kate's J. Clim piece with gridbox sampling; if
not, then it's in her thesis. If you combine the two data files Kate sent
then over the overlap between the datasets there should be very good
Kate looked at the different datasets (good vs. rubbish) and the broad-
scale signal is similar so all that splitting the data should do over
most regions (we only looked at extra-tropics, tropics, globe) is reduce
the noise from having rank bad data included. It shouldn't affect the
On Sun, 2009-02-15 at 20:36 +0000, Adrian Simmons wrote:
> Thanks, Phil. I'll reconsider figure order, and blending, once we see
> whether North America is repairable.
> All the best
> P.JonesatXYZxyz.ac.uk wrote:
> > Adrian,
> > Author order fine with me. Results look very good except for
> > North America. As the rest seem OK, it looks unlikely that
> > you've done something wrong, Adrian. I guess checks need
> > to be made.
> > As Kate has updated from 1994-2007, the overlaps could be
> > compared. On the two plots you've sent, you'll need to combine
> > the q and RH series from HadCRUH either side of 1994. The grey
> > lines don't go back to the first year in the plots.
> > Cheers
> > Phil
> >> Dear Kate
> >> I'm not really netcdf literate, but after my last email yesterday I
> >> remembered that someone helped me read a file a few years ago (for the
> >> 2004 ERA/NCEP/CRUTEM2v comparison paper with Phil and others, as it
> >> happens). So I've read your file with the extended (1994-2007) version
> >> of HadCRUH, and done some comparisons. The outcome is just as I'd hoped,
> >> with one proviso. If Phil will forgive me, I'll move you ahead of him in
> >> the author list because of this extra work.
> >> The first figure shows the comparison of ERA-Interim, the original
> >> HadCRUH and your new extension. Plotted are the global averages over
> >> land points, from 1989-2008. For 1989-1993 the spatial and temporal
> >> sampling is as in HadCRUH, and for 2004-2008 it is as in the new dataset
> >> (using the December 2007 coverage in computing the ERA values for 2008).
> >> For 1994 to 2003, the averaging is done (for all three datasets) only
> >> over points for which both HadCRUH and the new dataset have non-missing
> >> values. You can see that the extended dataset agrees with ERA-Interim as
> >> to the decrease in relative humidity in recent years, which is a bit of
> >> a relief as I can stick with the current explanation of the result.
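The common-coverage averaging described above (using only grid points where both HadCRUH and the new dataset have values) could be sketched roughly as follows. This is an illustration only, not the calculation actually used: the function name, the dictionary layout and the cos(latitude) area weighting are all assumptions, and missing values are represented as `None`.

```python
import math

def common_coverage_mean(fields, lats, required):
    """Area-weighted mean of each field over points where all `required` fields are present.

    fields:   dict mapping a dataset name to a list of values (None = missing),
              one value per grid point.
    lats:     latitude in degrees of each grid point (same order as the values).
    required: names of the datasets that must be non-missing at a point.
    The cos(latitude) weighting is an assumption made for this sketch.
    """
    # Keep only grid points where every required dataset has a value.
    use = [i for i in range(len(lats))
           if all(fields[name][i] is not None for name in required)]
    result = {}
    for name, vals in fields.items():
        pairs = [(vals[i], math.cos(math.radians(lats[i])))
                 for i in use if vals[i] is not None]
        wsum = sum(w for _, w in pairs)
        result[name] = sum(v * w for v, w in pairs) / wsum if wsum > 0 else None
    return result
```

Passing `required=("hadcruh", "update")` would restrict all three series, including the reanalysis, to the points the two station datasets share.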
> >> The proviso concerns the values for North America in the new dataset.
> >> The second attachment shows RH time series for the six land regions
> >> considered earlier. For five of the six regions the agreement is
> >> excellent, but North America stands out as different, with the new
> >> dataset matching neither HadCRUH nor ERA-Interim early in the period,
> >> and not showing the drying that ERA-Interim shows late in the period.
> >> This is reflected a little in the global plot in the first attachment.
> >> As discussed in an earlier mail, ERA-Interim does have relatively low
> >> rainfall after about 1999 compared with GPCC (and more so GPCP),
> >> suggesting it might also overdo the drying in RH, but the mismatch
> >> between HadCRUH and your new dataset over just North America looks odd.
> >> It is always possible I've made a mistake somewhere, but it's hard for
> >> me to see why in this case only North America would be affected, given
> >> the way I do the calculations. Maybe you could find time to have a
> >> closer look at this region.
> >> Otherwise, ERA-40 should reach the end of 2008 during the week ahead, so
> >> I'll finish the figures then, and try to get on with the writing.
> >> Best regards
> >> Adrian
> >> Kate Willett wrote:
> >>> Hi Adrian,
> >>> I've finished the HadCRUH update! It took longer than expected, as usual,
> >>> but I've just gridded it, made zonal averages for the Globe, Northern
> >>> Hemisphere, Tropics and Southern Hemisphere and it looks good.
> >>> I have sent you both the homogeneity checked version
> >>> (ISDSUB_update_5by5_9407.nc.gz) and a gridded version of all the
> >>> stations kicked out from the homogeneity check
> >>> (ISDBADS_update_5by5_9407.nc.gz). These contain 'q_anomalies',
> >>> 'q_num_obs', 'rh_anomalies', 'rh_num_obs' and 'time' which is months
> >>> since January 1973. Missing data should be -1e+30. Next week I'll put
> >>> together a page of text describing exactly what I've done for your
> >>> paper. I'm happy to redo this text as necessary.
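For anyone reading these files, a small sketch of how the time axis and missing-data flag described above might be decoded. It assumes that a time value of 0 means January 1973 itself and that missing values may not be stored as exactly -1e+30 (hence the tolerance); both are assumptions, and the helper names are made up for this example.

```python
MISSING = -1e30  # missing-data flag quoted in the message

def month_index_to_year_month(index, base_year=1973, base_month=1):
    """Convert 'months since January 1973' to (year, month).

    Assumes index 0 corresponds to January 1973 itself.
    """
    total = (base_month - 1) + int(index)
    return base_year + total // 12, total % 12 + 1

def mask_missing(values, flag=MISSING):
    """Replace flagged values with None.

    Anything at or below one tenth of the flag (-1e29) is treated as
    missing, in case reduced-precision storage altered the exact value.
    """
    return [None if v <= flag * 0.1 else v for v in values]
```

Under that assumption, index 252 is January 1994 and index 419 is December 2007, which matches the 9407 in the file names.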
> >>> In brief, I took all the ISD stations that matched the HadCRUH stations
> >>> (2676) from 1994 to 2007 to give 10 years of overlap with HadCRUH. I
> >>> then created pentad mean anomalies using the HadCRUH 1973-2003
> >>> climatology. These were then adjusted to match the homogenisation
> >>> process of HadCRUH for the overlapping period and the entire series for
> >>> each station offset to match the median of the HadCRUH station for the
> >>> 1994-2003 period.
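The final offsetting step above can be illustrated with a short sketch. It assumes "offset to match the median" means shifting the whole new series by the difference between the two medians over the 1994-2003 overlap; the function names are hypothetical and this is not the code used for the update.

```python
from statistics import median

def median_offset(new_overlap, ref_overlap):
    """Shift that makes the new series' overlap median equal the reference's.

    Both arguments are lists of anomalies restricted to the overlap period,
    with missing values already removed.
    """
    return median(ref_overlap) - median(new_overlap)

def apply_offset(series, offset):
    """Add the offset to every non-missing value of the full series."""
    return [None if v is None else v + offset for v in series]
```

The offset is computed from the overlap only but applied to the entire station series, so the post-2003 values move with it.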
> >>> I then turned these into monthly mean anomalies and ran a t-test over
> >>> the overlapping period. Any station with sparse data or a series
> >>> significantly different from its HadCRUH equivalent (at the 5% level) was
> >>> removed (leaving 2473 stations). I made neighbour composites for each station using
> >>> at least 3 and a maximum of 10 of its closest correlating neighbours.
> >>> This removed a few more stations (some q only and some rh only). I then
> >>> ran a homogeneity check to see if each station appeared reasonably
> >>> homogeneous in relation to its neighbours. This left me with 2095
> >>> stations. Both the homogeneous and non-homogeneous stations have now been
> >>> gridded to 5 by 5.
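A rough sketch of the neighbour-composite step mentioned above. It reads "closest correlating neighbours" as the most highly correlated ones, which is an interpretation rather than something the message states, and it assumes complete, equal-length series. The function names and the plain averaging are likewise assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def neighbour_composite(target, neighbours, min_n=3, max_n=10):
    """Mean of the up-to-max_n neighbours best correlated with the target.

    Returns None when fewer than min_n neighbours are available, in which
    case the station cannot be checked this way.
    """
    ranked = sorted(neighbours, key=lambda s: pearson(target, s), reverse=True)
    chosen = ranked[:max_n]
    if len(chosen) < min_n:
        return None
    return [sum(vals) / len(chosen) for vals in zip(*chosen)]
```

A station whose series departs from its composite by a step change would then be a candidate for the "kicked out" file.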
> >>> I've also attached two plots comparing large scale timeseries with
> >>> HadCRUH. HadCRUH is shown in black, and the ISD update shown in orange.
> >>> The blue line shows the data kicked out from the homogeneity check and
> >>> the red line is all ISD stations (found in HadCRUH).
> >>> Pete and I have just looked over the plots and feel that the homogeneity
> >>> check, while not making a very large difference, was still a useful part
> >>> of the process. There are clear places where the orange line is much
> >>> closer to HadCRUH than the blue. The downwards shift in RH in the
> >>> Southern Hemisphere in 2002 is consistent with your Australia plot. This
> >>> also appears in the global average.
> >>> No time to write more now as I have a review to do before I leave
> >>> tonight, but I will hopefully get the write-up to you next week, along
> >>> with some more thoughts on the updated data.
> >>> Thanks
> >>> Kate
> >> --
> >> --------------------------------------------------
> >> Adrian Simmons
> >> European Centre for Medium-Range Weather Forecasts
> >> Shinfield Park, Reading, RG2 9AX, UK
> >> Phone: +44 118 949 9700
> >> Fax: +44 118 986 9450
> >> --------------------------------------------------
Peter Thorne Climate Research Scientist
Met Office Hadley Centre, FitzRoy Road, Exeter, EX1 3PB
tel. +44 1392 886552 fax +44 1392 885681