Thanks to Piotr Djaków for prompting me to write this post.
ISTI is the International Surface Temperature Initiative. I’ve previously written about using ccc-gistemp with the ISTI data.
While I’ve modified ccc-gistemp to be able to use the ISTI Stage 3 data, it’s far from push-button. Instead of making it easier, I’ve laid out a step-by-step guide here.
0. Prerequisites
You’ll need Python 2, Python 3, and git. It will definitely work better if you’re on Unix.
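A quick way to confirm the prerequisites are in place is to look each tool up on your PATH. This is only a minimal sketch: the executable names python2, python3, and git are assumptions and may differ on your system (Python 2 is sometimes installed as python2.7, for example), and missing_tools is a hypothetical helper, not part of ccc-gistemp.

```python
import shutil


def missing_tools(names=("python2", "python3", "git")):
    """Return the names from `names` that cannot be found on PATH.

    The default names are assumptions; adjust them to match your system.
    """
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```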
1. Get source code
If you have git installed, you can use that:
git clone https://github.com/ClimateCodeFoundation/ccc-gistemp
git clone https://github.com/ClimateCodeFoundation/madqc
That creates directories named ccc-gistemp and madqc.
cd into ccc-gistemp:
cd ccc-gistemp
2. Get ISTI data
I added code so that ccc-gistemp “knows” how to download the ISTI data:
tool/fetch.py isti
This creates the files input/isti.merged.inv and input/isti.merged.dat.
3. QC the ISTI data
The ISTI data comes from a variety of sources, some more raw than others. Much of the data is not Quality Controlled (QC). It’s a good idea to QC it, and I suggest MADQC; see What Good is MADQC for more.
../madqc/mad.py --progress input/isti.merged.dat
This creates the file input/isti.merged.qc.dat; you also need to copy the .inv file:
cp input/isti.merged.inv input/isti.merged.qc.inv
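To give a feel for what this kind of QC does, here is a minimal sketch of flagging outliers by their distance from the median. It assumes that the MAD in MADQC stands for median absolute deviation. It is not madqc's actual code: the function name mad_outliers, the threshold of 5, and the way values are grouped are all hypothetical choices made for illustration.

```python
from statistics import median


def mad_outliers(values, threshold=5.0):
    """Return the indexes of values lying more than
    threshold * MAD away from the median of `values`.

    The threshold is an illustrative assumption, not madqc's setting.
    """
    med = median(values)
    # Median absolute deviation: the median distance from the median.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread at all, so nothing can be flagged on this basis.
        return []
    return [i for i, v in enumerate(values) if abs(v - med) > threshold * mad]


# Made-up temperatures for one calendar month at one station.
sample = [10.1, 9.8, 10.3, 10.0, 9.9, 25.0, 10.2]
print(mad_outliers(sample))
```

The median and MAD are barely affected by a few wild values, which is why they suit data that has not yet been cleaned; a mean and standard deviation would be dragged towards the very outliers being looked for.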
4. Run ccc-gistemp
tool/run.py -p 'data_sources=isti.merged.qc.dat;element=TAVG' -s 0-1,3-5
This is a lot slower than the usual run because there are far more stations in ISTI than in GHCN-M. It took about 50 minutes on my late 2014 ultrabook-class laptop (about the same as a low- to mid-range cloud instance).
The data_sources parameter tells ccc-gistemp to use the ISTI data; the element parameter tells ccc-gistemp to analyse the TAVG element (monthly mean). This latter parameter was added specifically for ISTI because the ISTI files have TMIN, TMAX, and TAVG all in the same file.
The ISTI data does not have the metadata that GHCN-M comes with; in particular, it has no urban indicators. ccc-gistemp’s Step 2 modifies urban stations, which it cannot identify without those indicators, so we have to skip this step; hence the -s 0-1,3-5 option.
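If you would rather drive the run from a script, the command above can be assembled from its parts. This is a sketch only: run_command is a hypothetical helper, and all it does is join name=value pairs with semicolons in the same shape as the -p argument shown above.

```python
def run_command(parameters, steps):
    """Build the argument list for tool/run.py.

    parameters: dict mapping ccc-gistemp parameter names to values.
    steps: the value for the -s option, for example "0-1,3-5".
    """
    joined = ";".join("{}={}".format(k, v) for k, v in parameters.items())
    return ["tool/run.py", "-p", joined, "-s", steps]


cmd = run_command(
    {"data_sources": "isti.merged.qc.dat", "element": "TAVG"},
    "0-1,3-5",  # Step 2 is left out because ISTI has no urban indicators.
)
print(cmd)
```

The resulting list can be handed to subprocess.run(cmd, check=True) from inside the ccc-gistemp directory; passing a list rather than a single string means the semicolon in the -p value needs no shell quoting.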
5. View results
The results are available in fixed format text files in the result/ directory.
The land-only result is available in result/landGLB.Ts.GHCN.CL.PA.txt
If the land and ocean data sources finish on the same month then there will be a file result/mixedGLB.Ts.ERSST.GHCN.CL.PA.txt containing the merged land–ocean dataset.
If the land and ocean data sources do not finish on the same month then the blended land–ocean dataset makes no sense for the most recent months (which will have either no ocean contribution, or no land contribution). The results are still produced, but they are placed in the file tainted*. In that case it is best to use the land-only result.
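The choice between the files can be written down as a small check. This is a minimal sketch based only on the file names mentioned above; preferred_result and tainted_files are hypothetical helpers, and the exact names of the tainted files are not assumed beyond the tainted prefix.

```python
import glob
import os

LAND_ONLY = "landGLB.Ts.GHCN.CL.PA.txt"
MIXED = "mixedGLB.Ts.ERSST.GHCN.CL.PA.txt"


def tainted_files(result_dir="result"):
    """Return any files in result_dir whose names start with 'tainted'."""
    return sorted(glob.glob(os.path.join(result_dir, "tainted*")))


def preferred_result(result_dir="result"):
    """Return the merged land-ocean file if it exists, otherwise the
    land-only file, otherwise None."""
    for name in (MIXED, LAND_ONLY):
        path = os.path.join(result_dir, name)
        if os.path.exists(path):
            return path
    return None


if __name__ == "__main__":
    if tainted_files():
        print("Land and ocean sources end on different months.")
    print("Use:", preferred_result())
```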
Do you prefer CSV files? You can run
tool/gistemp2csv.py result/*.txt
to convert the fixed format text files to CSV files (also written to the result/ directory).
The land-only result looks like this for me:

[Figure: land-only result]