Friday, March 30, 2012


date: Wed, 19 Dec 2007 15:08:18 +0000
from: Ian Harris <>
subject: Decisions, decisions
to: Phil Jones <>, Kevin Marsh <>

Please forward as necessary.


I need to make a number of decisions concerning the automation and
packaging of the CRU TS Dataset. I can make them in isolation (and
already have fallback positions) but I'd appreciate thoughts.

1. Station Counts.

I am producing two sets of station counts - the traditional one,
based on correlation decay distances ('spheres of possible
influence'), and a new one which just counts the number of stations
in each cell at each timestep. The former ranges from zero to over
800, the latter from zero to less than 10.
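
The second kind of count (stations per cell per timestep) can be sketched roughly as below. This is only an illustration, not the actual CRU TS code: the 0.5-degree cell size, the record layout (station id, latitude, longitude, timestep) and the function names are all assumptions made for the example.

```python
from collections import Counter
import math


def cell_index(lat, lon, cell_size=0.5):
    """Map a latitude/longitude to a (row, col) cell index.

    Assumes a regular global grid with its origin at 90S, 180W.
    The 0.5-degree default is an assumption for this sketch.
    """
    n_rows = int(round(180.0 / cell_size))
    n_cols = int(round(360.0 / cell_size))
    row = int(math.floor((lat + 90.0) / cell_size))
    col = int(math.floor((lon + 180.0) / cell_size))
    # Clamp so that lat == 90 or lon == 180 falls in the last cell.
    return min(row, n_rows - 1), min(col, n_cols - 1)


def count_stations(observations, cell_size=0.5):
    """Count distinct reporting stations per (timestep, cell)."""
    seen = set()
    counts = Counter()
    for station_id, lat, lon, timestep in observations:
        key = (timestep, station_id)
        if key in seen:
            continue  # do not count a station twice in one timestep
        seen.add(key)
        counts[(timestep, cell_index(lat, lon, cell_size))] += 1
    return counts


# Hypothetical records: (station id, latitude, longitude, timestep)
observations = [
    ("A", 52.62, 1.24, "2007-11"),
    ("B", 52.70, 1.30, "2007-11"),
    ("C", 33.57, -7.59, "2007-11"),
    ("A", 52.62, 1.24, "2007-12"),
]
counts = count_stations(observations)
```

Cells with no reporting station simply do not appear in the result, which corresponds to a count of zero. The correlation-decay count would instead need a distance search around each cell, which is not shown here.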

Two questions:

1a. Are we happy to release both sets of data? (provisional answer: yes)
1b. Should the station counts be in the same NetCDF files as the data
they refer to? (provisional answer: yes)

2. Update and Release Strategy

We agreed some time ago that there would be monthly, incremental data
releases. However, this does not sit perfectly with the current file
arrangements, which are decadal files plus a full file. The strategy
can only work if we re-release the latest decadal file, *and* the
full file, every month. It's not impossible but it's a bit excessive
(each decadal file could have 120 releases!).

There is a secondary issue, that of when to republish updated
material. If, say, all Moroccan data is replaced with improved
versions, at what stage should we re-release the existing published
data to take account of this? If we are republishing the full
database every month to include the new month's data, should we
include all new data for previous years? If not, how do we manage
this? And if we do it, the full file and decadal files will have
different data in them!

I know we have covered some of this before, but it was a long time ago.

Two questions:

2a. Are we happy to release new full and latest-decadal files every
month? (provisional answer: no, I suggest we only update the latest
decadal file with the new month, and the full file is updated once a
year)
2b. When do changes in past years get processed? (provisional answer:
once a year, the full file is reprocessed and any changed decadal
files are reissued)
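
To make the provisional answers to 2a and 2b concrete, here is a rough sketch of the resulting release schedule. It is an illustration only: the file names, the decade boundaries (2001-2010 and so on), and the choice of December as the annual reprocessing month are assumptions, not anything agreed.

```python
MONTHS_PER_YEAR = 12
YEARS_PER_DECADE = 10

# If the latest decadal file were re-released every month, it could
# see up to this many releases over its lifetime.
max_releases_per_decadal_file = MONTHS_PER_YEAR * YEARS_PER_DECADE


def decade_of(year):
    """Return (start, end) of the decadal file covering `year`.

    Assumes decades run 2001-2010, 2011-2020, etc.
    """
    start = year - (year - 1) % 10
    return start, start + 9


def decadal_name(start):
    """Hypothetical file label for the decade beginning at `start`."""
    return f"decadal_{start}_{start + 9}"


def files_to_release(year, month, changed_decades=(), annual_month=12):
    """List the files to issue for a given month.

    Every month: the latest decadal file only (answer 2a).
    Once a year: the full file, plus any decadal files whose past
    data changed (answer 2b). `changed_decades` holds decade start
    years; `annual_month` is an assumed reprocessing month.
    """
    latest_start, _ = decade_of(year)
    files = [decadal_name(latest_start)]
    if month == annual_month:
        files.append("full")
        for start in sorted(changed_decades):
            name = decadal_name(start)
            if name not in files:
                files.append(name)
    return files


november = files_to_release(2007, 11, changed_decades={1951})
december = files_to_release(2007, 12, changed_decades={1951})
```

Under these assumptions a change to 1950s data made in November is held back until the December reprocessing, so the full file and the decadal files only disagree on the most recent months between annual runs.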

As you can see, the issues are more complex than they seem.


Ian "Harry" Harris
Climatic Research Unit
School of Environmental Sciences
University of East Anglia
Norwich NR4 7TJ
United Kingdom


