Data-leak lessons learned from the 'Climategate' hack
- 26 November, 2009 07:41
In case you've missed it, someone recently dumped a large cache of e-mail files and documents from the University of East Anglia's prestigious Climatic Research Unit (CRU) onto the 'Net. The CRU is one of the leading climatology research institutions, and its data and models provide much of the infrastructure on which the theory of anthropogenic global warming (AGW) is based.
Many of the files and e-mails discuss hiding or manipulating data, which has disturbing connotations for the credibility of AGW overall. (In one document, a researcher explicitly acknowledges making up data sources.) For a relatively unbiased look at some of the issues, see Science Magazine's coverage.
Leaving aside the political hot potato of AGW itself, there are several lessons for networkers to take away from the exposure of the CRU's internal data.
Lesson 1: Don't let users put passwords in their signatures. Yep, you read that right: One of the scientists included his login credentials in his e-mail signature, which means that anyone receiving an e-mail from this guy had access to his files. This may have been the source of the hack; in fact, some folks have theorized that a recipient of the e-mail was the source of the data dump.
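One way to catch this kind of slip is to screen outbound mail for lines that look like credentials. The following is only a minimal sketch: the pattern, the function name find_credential_lines and the sample signature are all hypothetical, and a real filter would need tuning to avoid false positives.

```python
import re
from typing import List

# Hypothetical pattern: a label that commonly precedes a credential,
# followed by ":" or "=" and a value. Tune this for your own environment.
CREDENTIAL_PATTERN = re.compile(
    r"\b(pass(?:word|wd)?|pwd|login|user(?:name)?)\s*[:=]\s*\S+",
    re.IGNORECASE,
)


def find_credential_lines(body: str) -> List[str]:
    """Return the lines of an e-mail body that look like they contain credentials."""
    return [
        line.strip()
        for line in body.splitlines()
        if CREDENTIAL_PATTERN.search(line)
    ]


if __name__ == "__main__":
    # Made-up signature, for illustration only.
    sample = "Regards,\nDr. Example\nftp login: jdoe\npassword = s3cret\nTel: 555-0100"
    for hit in find_credential_lines(sample):
        print("Possible credential in outbound mail:", hit)
```

A check like this could run in a mail gateway or as a periodic audit of signature templates. It flags suspicious lines for a human to review rather than blocking mail outright.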
Lesson 2: Don't evade Freedom of Information requests. As noted in the Science Magazine link above, many of the e-mails discuss how to destroy documents in anticipation of Freedom of Information requests. That's a criminal offense in the United Kingdom (where the CRU is located). IT folks should be aware that an increasing amount of data (particularly scientific and research data gathered via public funding) is subject to FOIA. They should work with researchers to ensure documents are stored and organized with that in mind.
Lesson 3: Lock down sensitive servers. Another theory behind the supposed "hack" is that the files were compiled in response to a FOIA request — then stored on an unlocked server. The CRU declined to honor the FOIA request, but left the compiled response freely available.
Lesson 4: Advise your users that all e-mails (and indeed, voice, message and video communications) may be the subject of public disclosure. You may work in an industry that's not subject to FOIA — but anyone can get sued. And the process of "e-discovery" may make plenty of data public. If you don't have a comprehensive multimedia data retention policy (what gets retained, what gets destroyed and on what timeframe, how destruction is confirmed), get one now.
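To make the retention-policy idea concrete, here is a rough sketch of how such a policy might be written down as data: what gets retained, for how long, and how destruction is confirmed. Every name and value below (RetentionRule, POLICY, the retention periods, the on_hold flag) is an assumption for illustration, not a recommendation. Real timeframes have to come from your legal and records teams.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionRule:
    media_type: str         # e.g. "email", "voice", "video"
    retain_days: int        # how long the item is kept before destruction
    destruction_check: str  # how destruction is confirmed


# Hypothetical policy values, for illustration only.
POLICY = {
    "email": RetentionRule("email", 7 * 365, "deletion log signed off by records officer"),
    "voice": RetentionRule("voice", 90, "storage audit report"),
    "video": RetentionRule("video", 180, "storage audit report"),
}


def destruction_due(media_type: str, created: date) -> date:
    """Return the earliest date on which an item may be destroyed."""
    rule = POLICY[media_type]
    return created + timedelta(days=rule.retain_days)


def may_destroy(media_type: str, created: date, today: date, on_hold: bool = False) -> bool:
    """An item may be destroyed only after its retention period has elapsed
    and only if it is not under a hold (for example, a pending FOIA request
    or litigation)."""
    if on_hold:
        return False
    return today >= destruction_due(media_type, created)
```

The on_hold flag reflects Lesson 2: anything that is the subject of a pending request or lawsuit stays put, whatever the schedule says.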
As someone with a background both in IT and in science (I participated in particle physics experiments as a physics PhD student), I would also add the following lesson to the folks writing scientific code: Don't make stuff up. The released document HARRY_READ_ME.txt contains examples in which the coder, supremely frustrated with the poor quality of his data, simply creates some. Even if the underlying science is sound, "created" data taints the integrity of the entire process. Don't do it, no matter how tempting.
Johnson is president and senior founding partner at Nemertes Research, an independent technology research firm. She can be reached at email@example.com.