Meet Harry the Programmer
November 30, 2009
When this hits Main Street, Cap and Trade is dead.
A lot of the talk on ClimateGate thus far has centered on the emails and the revelations that scientists are, in fact, human. Some smart CAPS players know that there is a lot more to it. Currently spreading across the Net is a text file found in the hacked files called HARRY_READ_ME. It is a collection of the programming notes of a man named Harry, diligently attempting to debug the source code and data sets for the CRU's climate modeling program. In simple terms, this collection of programs accepts data readings entered by the scientists, collects them in databases, runs them through a few verification subroutines, and spits out the numbers that form the basis of Anthropogenic Global Warming climate models. It is a very important set of programs - the CRU TS2.1/3.0 datasets, 2006-2009.
What follows are actual programming code and notes. Interspersed in italics and [brackets] are comments from myself and other bloggers who have done a remarkable job of deconstructing the code. A reference section will be posted at the end of the analysis.
I apologize to all the programmers here for the crappy formatting of the code that I have pasted. That's on me, not Harry :) I've tried to reproduce the indentations but they don't "keep" when I go to submit the post.
The (soon to be infamous) HARRY_READ_ME.txt file and some of the files in the /cru-code/ directory are most likely authored by Ian "Harry" Harris (not confirmed). He's the poor sod who seems to have been landed with the responsibility of reworking the code after the departure of Tim Mitchell and/or Mark New, who apparently wrote much of it (not confirmed).
Beginner Programming Mistakes
[See if you can figure out the beginner programming mistake that is baffling Harry in this routine.]
17. Inserted debug statements into anomdtb.f90, discovered that a sum-of-squared variable is becoming very, very negative! Key output from the debug statements:
OpEn= 16.00, OpTotSq= 4142182.00, OpTot= 7126.00
DataA val = 93, OpTotSq= 8649.00
DataA val = 172, OpTotSq= 38233.00
DataA val = 950, OpTotSq= 940733.00
DataA val = 797, OpTotSq= 1575942.00
DataA val = 293, OpTotSq= 1661791.00
DataA val = 83, OpTotSq= 1668680.00
DataA val = 860, OpTotSq= 2408280.00
DataA val = 222, OpTotSq= 2457564.00
DataA val = 452, OpTotSq= 2661868.00
DataA val = 561, OpTotSq= 2976589.00
DataA val = 49920, OpTotSq=-1799984256.00
DataA val = 547, OpTotSq=-1799684992.00
DataA val = 672, OpTotSq=-1799233408.00
DataA val = 710, OpTotSq=-1798729344.00
DataA val = 211, OpTotSq=-1798684800.00
DataA val = 403, OpTotSq=-1798522368.00
OpEn= 16.00, OpTotSq=-1798522368.00, OpTot=56946.00
forrtl: error (75): floating point exception
IOT trap (core dumped)
..so the data value is unfeasibly large, but why does the sum-of-squares parameter OpTotSq go negative?!!
Probable answer: the high value is pushing beyond the single-precision default for Fortran reals?
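[My own two cents, hedged: single-precision reals don't go negative when they get big, but 32-bit integers do. If the data value is held in a default Fortran INTEGER, then 49920 squared (about 2.49 billion) overflows the 32-bit range and wraps to a large negative number before it is ever added to the real-valued running sum - which matches the debug output above almost exactly. A minimal sketch of mine (not the CRU code, names invented) that reproduces the pattern:]
! Sketch only: reproduces the negative sum-of-squares signature above.
! Assumes the data value is a default (32-bit) INTEGER; on most compilers
! the integer square silently wraps rather than raising an error.
program overflow_sketch
  implicit none
  integer :: DataVal      ! hypothetical stand-in for the data value
  real    :: OpTotSq      ! single-precision accumulator, as in the debug output

  OpTotSq = 2976589.0
  DataVal = 49920

  OpTotSq = OpTotSq + DataVal**2         ! square computed in 32-bit integer arithmetic
  print *, 'wrapped:  ', OpTotSq         ! roughly -1.79998E+09

  OpTotSq = 2976589.0
  OpTotSq = OpTotSq + real(DataVal)**2   ! promote to real first and the sum stays positive
  print *, 'promoted: ', OpTotSq         ! roughly  2.49498E+09
end program overflow_sketch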
Missing Data
Here, the expected 1990-2003 period is MISSING - so the correlations aren't so hot! Yet the WMO codes and station names /locations are identical (or close). What the hell is supposed to happen here? Oh yeah - there is no 'supposed', I can make it up. So I have :-)
[This is why you never put large scale programming projects on the shoulders of one individual. Have they never heard of SCRUM?]
Questionable Programming Selection
[When I was a developer, in addition to the concepts of version control and frequent archiving, one thing my evil commercially oriented supervisors insisted on was "code reviews". This is the hated point where your manager and/or some other experienced developer goes through your code and critiques it in terms of clarity and quality.
As a general rule, code reviews teach you a lot. And in places where you have a choice of language, one of the big questions in a code review is often "why develop this in X?"
The first thing I note is that a lot of the stuff is written in Fortran (of different vintages), and much of the rest in IDL. When you look at the code you see that a lot of it is not doing complex calculations but rather is doing text processing. Neither Fortran nor IDL is the tool I would use for text processing - perl, awk, and sed (all traditional Unix tools available on the platforms the code runs on, as far as I can tell) are all better at this. Indeed, awk is used in a few spots, which makes one wonder why it is not used elsewhere. The use of Fortran means you get weird (and potentially non-portable) things like this in loadmikeh.f90 (a program written to load Mike Hulme RefGrids):]
call system ('wc -l ' // LoadFile // ' > trashme-loadcts.txt') ! get number of lines
open (3,file='trashme-loadcts.txt',status="old",access="sequential",form="formatted",action="read")
read (3,"(i8)"), NLine
close (3)
call system ('rm trashme-loadcts.txt')
[For those who don't do programming, this is counting the number of lines in the file. It does so by getting an external (Unix) utility to do the counting, making it store the result in a temporary file (trashme-loadcts.txt), then reading that file to learn how many lines there are before deleting it. The count is then used as the test for whether the file is finished in the subsequent line-reading loop.]
XLine=0
do
  read (2,"(i7,49x,2i4)"), FileBox,Beg,End
  XExe=mod(FileBox,NExe) ; XWye=nint(real(FileBox-XExe)/NExe)+1
  XBox=RefGrid(XExe,XWye)
  do XEntry=Beg,End
    XYear=XEntry-YearAD(1)+1
    read (2,"(4x,12i5)"), (LineData(XMonth),XMonth=1,NMonth)
    Data(XYear,1:12,XBox) = LineData(1:12)
  end do
  XLine=XLine+End-Beg+2
  if (XLine.GE.NLine) exit
end do
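[Just for contrast, here is a minimal sketch (mine, not anything from the CRU code) of how the same line count could be done in Fortran itself, with no shelling out to wc and no shared temporary file - the read loop simply watches IOSTAT for end-of-file:]
! Sketch only: count lines natively instead of via 'wc -l' and a temp file.
! The filename is a placeholder for whatever LoadFile would be.
program count_lines_sketch
  implicit none
  integer :: NLine, ios
  character(len=256) :: dummy

  NLine = 0
  open (3, file='example-loadfile.txt', status='old', action='read')
  do
    read (3, '(a)', iostat=ios) dummy
    if (ios /= 0) exit      ! end of file (or a read error) ends the count
    NLine = NLine + 1
  end do
  close (3)
  print *, 'lines: ', NLine
end program count_lines_sketch
[Better still, the main reading loop could test IOSTAT itself and never need to know the line count in advance.]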
[There are a bunch of related no-nos in the original code to do with bounds checking and trusting of input - including the interesting point that this temporary file is the same for all users, meaning that if, by some mischance, two people ran loadmikeh at the same time on different files in the same place, one of them could get the wrong file length (or crash because the file had been deleted before it could be read). Now I'm fairly sure that this process is basically single-user and that the input files are going to be correct when this is run, but, as Harry discovers, sometimes it isn't clear what the right input file is:]
Had a hunt and found an identically-named temperature database file which did include normals lines at the start of every station. How handy - naming two different files with exactly the same name and relying on their location to differentiate! Aaarrgghh!! Re-ran anomdtb:
Major Database Problems
[If the underlying database structure is complete junk, even simple input routines become nightmares.]
You have failed a match despite the WMO codes matching.
This must be resolved!! Please choose one:
1. Match them after all.
2. Leave the existing station alone, and discard the update.
3. Give existing station a false code, and make the update the new WMO station.
Enter 1,2 or 3:
You can't imagine what this has cost me - to actually allow the operator to assign false WMO codes!! But what else is there in such situations? Especially when dealing with a 'Master' database of dubious provenance (which, er, they all are and always will be).
False codes will be obtained by multiplying the legitimate code (5 digits) by 100, then adding 1 at a time until a number is found with no matches in the database. THIS IS NOT PERFECT but as there is no central repository for WMO codes - especially made-up ones - we'll have to chance duplicating one that's present in one of the other databases. In any case, anyone comparing WMO codes between databases - something I've studiously avoided doing except for tmin/tmax where I had to - will be treating the false codes with suspicion anyway. Hopefully.
Of course, option 3 cannot be offered for CLIMAT bulletins, there being no metadata with which to form a new station.
This still meant an awful lot of encounters with naughty Master stations, when really I suspect nobody else gives a hoot about. So with a somewhat cynical shrug, I added the nuclear option - to match every WMO possible, and turn the rest into new stations (er, CLIMAT excepted). In other words, what CRU usually do. It will allow bad databases to pass unnoticed, and good databases to become bad, but I really don't think people care enough to fix 'em, and it's the main reason the project is nearly a year late.
And there are STILL WMO code problems!!!
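[To make Harry's false-code scheme concrete, here is a rough sketch of my own reading of it: multiply the five-digit WMO code by 100, then keep adding 1 until the candidate collides with nothing already in the database. The in-memory lookup below is a stand-in; the real code presumably checks the station databases themselves:]
! Sketch only: the false-WMO-code scheme as Harry describes it.
! 'used' stands in for whatever codes are already present in the database.
module false_code_sketch
  implicit none
contains
  integer function make_false_code(wmo, used, nused)
    integer, intent(in) :: wmo           ! legitimate 5-digit WMO code
    integer, intent(in) :: nused
    integer, intent(in) :: used(nused)   ! codes already in the database
    make_false_code = wmo * 100
    do
      make_false_code = make_false_code + 1
      if (all(used(1:nused) /= make_false_code)) exit   ! no match: this code is free
    end do
  end function make_false_code
end module false_code_sketch

program false_code_demo
  use false_code_sketch
  implicit none
  integer :: existing(3)
  existing = (/ 1234501, 1234502, 7654321 /)
  print *, make_false_code(12345, existing, 3)   ! prints 1234503
end program false_code_demo
[As Harry says, nothing stops two databases from inventing the same false code independently - which is exactly the sort of thing a central, open repository would fix.]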
[....and Harry has had enough. I do really feel sorry for the guy. He's simply in over his head.]
OH F*CK THIS!!! It's Sunday evening, I've worked all weekend, and just when I thought it was done I'm hitting yet another problem that's based on the hopeless state of our databases. There is no uniform state of integrity, it's just a catalog of issues that continues to grow as they're found.
It's Not My Fault My Predecessor Is An Idiot
[Have you ever heard the story about the monkeys and the banana in the cage? It definitely applies here.]
#####
Welcome! This is the MERGEGRIDS program.
I will create decadal and full gridded files
from the output files of (eg) glo2abs.for.
Enter a gridfile with YYYY for year and MM for month: tmxabs/tmx.MM.YYYY.glo.abs
Enter Start Year: 1901
Enter Start Month: 01
Enter End Year: 2006
Enter End Month: 12
Please enter a sample OUTPUT filename, replacing
start year with SSSS and end year with EEEE: cru_ts_3_00.SSSS.EEEE.tmx.dat
Writing cru_ts_3_00.1901.1910.tmx.dat
(etc)
This took longer than hoped.. running out of disk space again. This is why Tim didn't save more of the intermediate products - which would have made my detective work easier. The ridiculous process he adopted - and which we have dutifully followed - creates hundreds of intermediate files at every stage, none of which are automatically zipped/unzipped. Crazy. I've filled a 100gb disk!
So, anyway, back on Earth I wrote wmocmp.for, a program to - you guessed it - compare WMO codes from a given set of databases.
Stupid Shell Tricks
[It is worth noting that the "wc -l" shell trick is also repeated in the IDL files - e.g. in cru-code/idl/pro/quick_interp_tdm2.pro, where it's used even less efficiently in a pipe:]
"cat " + ptsfile + " | wc -l"
[This would almost merit an entry at thedailywtf.com but for the fact that it is far from the worst 'feature' of this particular program.]
Tim, You Are An Idiot
[Ok, maybe not all of Harry's predecessors are idiots and it's not everyone else's fault, but Tim definitely made Harry's life miserable.]
AGREED APPROACH for cloud (5 Oct 06).
For 1901 to 1995 - stay with published data. No clear way to replicate process as undocumented.
For 1996 to 2002:
1. convert sun database to pseudo-cloud using the f77 programs;
2. anomalise wrt 96-00 with anomdtb.f;
3. grid using quick_interp_tdm.pro (which will use 6190 norms);
4. calculate (mean9600 - mean6190) for monthly grids, using the
published cru_ts_2.0 cloud data;
5. add to gridded data from step 3.
This should approximate the correction needed.
On we go.. firstly, examined the spc database.. seems to be in % x10.
Looked at published data.. cloud is in % x10, too.
First problem: there is no program to convert sun percentage to cloud percentage. I can do sun percentage to cloud oktas or sun hours to cloud percentage! So what the hell did Tim do?!! As I keep asking.
[I don't know Harry. What the hell did Tim do?!]
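[For what it's worth, steps 4 and 5 of the agreed approach above read to me as a simple change of baseline: anomalies computed against 1996-2000 get shifted onto the published 1961-90 baseline by adding the difference of the two period means. A trivial sketch of mine (grid sizes and names are invented) of that arithmetic:]
! Sketch only: the baseline shift in steps 4-5 of the cloud approach above.
! anomaly vs 61-90  =  anomaly vs 96-00  +  (96-00 mean - 61-90 mean)
program cloud_baseline_sketch
  implicit none
  integer, parameter :: NLon = 720, NLat = 360   ! assumed 0.5 degree grid
  real, allocatable :: anom9600(:,:), mean9600(:,:), mean6190(:,:), anom6190(:,:)

  allocate (anom9600(NLon,NLat), mean9600(NLon,NLat), mean6190(NLon,NLat), anom6190(NLon,NLat))

  ! In reality these come from the step-3 output and the published
  ! cru_ts_2.0 cloud grids; zeros here just keep the sketch runnable.
  anom9600 = 0.0 ; mean9600 = 0.0 ; mean6190 = 0.0

  anom6190 = anom9600 + (mean9600 - mean6190)
end program cloud_baseline_sketch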
Then - comparing the two candidate spc databases:
spc.0312221624.dtb
spc.94-00.0312221624.dtb
I find that they are broadly similar, except the normals lines (which both start with '6190') are very different. I was expecting that maybe the latter contained 94-00 normals, what I wasn't expecting was that they are in % x10 not %! Unbelievable - even here the conventions have not been followed. It's botch after botch after botch. Modified the conversion program to process either kind of normals line.
[The same file also gets a mention in Part 31 (at least I think it's the same one - either that or there are two programs that do the same 'trick'):]
; map all points with influence radii - that is with decay distance [IDL variable is decay]
; this is done in the virtual Z device, and essentially finds all points on a 2.5 deg grid that
; fall outside station influence
dummymax=max(dummygrid(*,*,(im-1)))
while dummymax gt -9999 do begin
  if keyword_set(test) eq 0 then begin
    set_plot,'Z'                  ; set plot window to "virtual"
    erase,255                     ; clear current plot buffer, set backgroudn to white
    device,set_res=[144,72]
  endif else begin
    initx
    set_plot,'x'
    window,0,xsize=144,ysize=72
  endelse
  lim=glimit(/all)
  nel=n_elements(pts1(*,0))-1
  map_set,limit=lim,/noborder,/isotro,/advance,xmargin=[0,0],ymargin=[0,0],pos=[0,0,1,1]
  a=findgen(33)*!pi*2/32
[etc.]
[What this is doing is finding out whether stations may influence each other (i.e. how close they are). It's doing this not by means of basic trig functions but by creating a virtual graphic and drawing circles on it and seeing if they overlap! There are a couple of problems here. First, it seems that sometimes, as the next few lines report, IDL doesn't like drawing virtual circles and throws an error.]
for i=0.0,nel do begin
  x=cos(a)*(xkm/110.0)*(1.0/cos(!pi*pts1(i,0)/180.0))+pts1(i,1)
  x=x<179.9 & x=x>(-179.9)
  y=sin(a)*(xkm/110.0)+pts1(i,0)
  y=y>(-89.9) & y=y<89.9
  catch,error_value               ; avoids a bug in IDL that throws out an occasional
                                  ; plot error in virtual window
  if error_value ne 0 then begin
    error_value=0
    i=i+1
    goto,skip_poly
  endif
  polyfill,x,y,color=160
  skip_poly:
endfor
[In that case we just skate over the point and (presumably) therefore assume it has no influence - which is, um, very reassuring for people trying to reproduce the algorithm in a more rational manner.
However, that's not the only problem, because our hero also reports some inconsistencies:]
..well that was, erhhh.. 'interesting'. The IDL gridding program calculates whether or not a station contributes to a cell, using.. graphics. Yes, it plots the station sphere of influence then checks for the colour white in the output. So there is no guarantee that the station number files, which are produced *independently* by anomdtb, will reflect what actually happened!!
[Fortunately our hero is able to fix this (although it isn't at all clear to me how the new process is integrated with the old one)]
Well I've just spent 24 hours trying to get Great Circle Distance calculations working in Fortran, with precisely no success. I've tried the simple method (as used in Tim O's geodist.pro) and the more complex and accurate method found elsewhere (wiki and other places). Neither give me results that are anything near reality. FFS.
Worked out an algorithm from scratch. It seems to give better answers than the others, so we'll go with that.
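[For anyone who wants to sanity-check the distances, the textbook haversine form of the great-circle distance is only a few lines of Fortran. This is a generic sketch of mine - not Harry's from-scratch algorithm and not geodist.pro - assuming a spherical Earth of radius 6371 km:]
! Sketch only: generic haversine great-circle distance (spherical Earth).
program haversine_sketch
  implicit none
  real, parameter :: pi = 3.14159265, rearth = 6371.0, deg2rad = pi/180.0

  ! roughly Norwich to Paris; should print something near 420 km
  print *, gcdist(52.6, 1.3, 48.9, 2.3), ' km'

contains

  real function gcdist(lat1, lon1, lat2, lon2)
    real, intent(in) :: lat1, lon1, lat2, lon2   ! degrees
    real :: dlat, dlon, a
    dlat = (lat2 - lat1) * deg2rad
    dlon = (lon2 - lon1) * deg2rad
    a = sin(dlat/2.0)**2 + cos(lat1*deg2rad)*cos(lat2*deg2rad)*sin(dlon/2.0)**2
    gcdist = 2.0 * rearth * asin(sqrt(a))
  end function gcdist

end program haversine_sketch
[With something like this, the 'sphere of influence' test becomes a straight comparison of gcdist against the decay distance, with no virtual graphics device and no colour checking required.]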
[And really, at this point I think I am going to nominate this program for thedailywtf.com and move on.]
Does Any Of This Really Matter?
[Not according to Harry:]
Gotta love the system! Like this is ever going to be a blind bit of use.
=====================================================================
Ok, the fun is over. There is a great deal more, but we have space and time limitations, so let's stop.
So what to make of all this? And what are you doing right now? Why aren't you checking the source code yourself to verify everything I posted here? Let's get to it, Fools! :)
GISS and NCDC Agree
In their defense, the CRU note that their temperature series mostly agrees with the ones from NCDC and GISS, as if this excuses the abysmal quality of the processing. This is indeed true, but given that there is significant shared source data and apparent cooperation between the teams, it is not precisely reassuring. I haven't paid any attention in this analysis to the mathematical calculations being done, since that's not what I can readily comprehend, but it seems quite likely that each system applies similar data "cleansing" and "homogenizing", because the synthetics (see above) are going to be similar. Hence the confirmation bias problem: results that don't match up with the other two will tend to be scrutinized (and possibly adjusted), while the ones that do are left alone.
Since the adjustment mechanism, indeed the entire process, is opaque, the claim that black box 1 produces a result similar to black box 2 is not convincing.
Can the AGW Science Be Salvaged?
Absolutely. Two words: OPEN SOURCING.
In my opinion the entire project needs to be rewritten, and part of the rewrite should include using an actual database (SQL of some variety) to store the intermediate data. In fact, the more I read through the HARRY_READ_ME doc (and glance at the code), the more I realize that open sourcing this, as suggested by Eric Raymond, would bring huge benefits. By open sourcing it, the calculation and work flow would have to become clearer, and we would be able to identify the questionable parts. Open sourcing would also lead to the imposition of basic project management practices like archiving and version control, which seem to have been sorely lacking, and it would allow a large group of people to think about the data structures and the process flow so that they can be made less subject to error.
Furthermore, there are enormous problems in terms of station IDs and identifying whether stations have moved, been renamed, and so on. And there are problems with data quality for stations (records missing, mistranscribed, etc.) where a distributed team would almost certainly do better in terms of verifying the data (and potentially adding more raw data when it is found).
Even if the end result of the open source version were identical to the current one, opening this stuff up to public scrutiny and allowing people to contribute patches would go a long way towards improving the quality of the output, and even further towards improving its credibility, which right now is low amongst anyone who has actually taken a proper look at the process.
What Happens When The Public Finds Out About HARRY_READ_ME?
The same thing that happened to Scientology when the Internet Gods decided enough was enough. In the modern age, momentum turns quickly. Cap and Trade is officially dead. Whatever happens at Copenhagen will be moot. The public outcry to investigate the total irresponsibility, incompetence, and arrogance of the AGW scientific establishment will blow the doors off Air Force One as soon as President Obama lands upon his return from Europe.
David in Qatar
References:
1. L'Ombre de l'Olivier
2. Watts Up With That?
3. Climate Audit
4. Devil's Kitchen
5. Neutral Net Writer
6. Asimov
Final Notes:
If you can't see how badly this destroys the credibility of the IPCC and the AGW community, you are married to your ideology. I don't have any beef with you; just please don't take the tremendous amount of time that so many great bloggers have spent breaking down this code and throw it back in my face as some twisted act of "right wing" propaganda.
As for me, my background is IT as well. I've been in the field for 7 years (I'm a rookie compared to most.) I currently work in a mixed Unix/Windows environment and I have enough experience to know a crappy piece of work when I see one.
But I'm by no means the final authority nor is the verdict in. Let's hear from some of the more experienced programmers in the Fool community. Let's see what we find when the rest of the scientific community divulges all of their codes and data sets. If it all looks like this, what will be the response? What will be the defense of the AGW believers? How will they rationalize it?
Unfortunately, I have to run. I'm on vacation and I have a flight to catch to Chicago. I'll be in Chicago on Tuesday night (staying in the Wrigleyville area). If any locals want to get in touch and have a beer, I might be able to do that. You can email me here.
David in Qatar