<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Climate Code Foundation</title>
	<atom:link href="http://climatecode.org/feed/" rel="self" type="application/rss+xml" />
	<link>http://climatecode.org</link>
	<description></description>
	<lastBuildDate>Wed, 25 Jan 2012 19:36:11 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Activity and Status</title>
		<link>http://climatecode.org/blog/2012/01/activity-and-status/</link>
		<comments>http://climatecode.org/blog/2012/01/activity-and-status/#comments</comments>
		<pubDate>Wed, 25 Jan 2012 19:06:33 +0000</pubDate>
		<dc:creator>Nick Barnes</dc:creator>
				<category><![CDATA[News]]></category>

		<guid isPermaLink="false">http://climatecode.org/?p=618</guid>
		<description><![CDATA[It&#8217;s been some months since we updated this blog. My apologies for that. Here&#8217;s a quick summary of our recent activities. I hope to make a quick series of blog posts over the next week or two describing some of &#8230; <a href="http://climatecode.org/blog/2012/01/activity-and-status/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s been some months since we updated this blog.  My apologies for that.  Here&#8217;s a quick summary of our recent activities.  I hope to make a quick series of blog posts over the next week or two describing some of these in more detail, and linking to presentations and related materials:</p>
<ul>
<li>2011-08-26: Invited panelist at the <a href="http://www1.ccls.columbia.edu/~cmontel/ciWorkshop.html">First International Workshop on Climate Informatics</a>, at the New York Academy of Sciences in New York City.  This fascinating workshop, organized by Gavin Schmidt (of Columbia and NASA GISS) and Claire Monteleoni, brought together researchers in climate science and in informatics to find common ground.</li>
<li>2011-08-29: Invited talk at Google&#8217;s New York City offices, describing the Foundation, our Clear Climate Code project, our Summer of Code successes, and the then-draft Science Code Manifesto.</li>
<li>2011-09-01: Attended a meeting of the <a href="http://royalsociety.org/policy/projects/science-public-enterprise/">&#8216;Science as a Public Enterprise&#8217;</a> policy study, at the Royal Society in London.  <a href="http://michaelnielsen.org/blog/michael-a-nielsen/">Michael Nielsen</a> spoke about the Polymath project, and the open science revolution.  He took the time to talk with me later, and offered his support for the Science Code Manifesto.</li>
<li>2011-09-02: Invited panel member at <a href="http://www.scienceonlinelondon.org/">Science Online London</a>, providing an outsider&#8217;s perspective in a discussion of the research funding system.</li>
<li>2011-09-22: Visited the Macmillan offices in London, to meet Olive Heffernan (editor of Nature Climate Change) and Mark Hahnel (the man behind <a href="http://figshare.com/">FigShare</a>).</li>
<li>2011-10-10/11: Attended the two-day meeting at the Royal Society on <a href="http://royalsociety.org/events/2011/warm-climates/">&#8220;Warm Climates of the Past&#8221;</a>.  A lot of fascinating science attempting to draw lessons for the Anthropocene from some particular paleoclimate episodes (the LIG, the PETM, and so on).</li>
<li>2011-10-13: Launched the <a href="http://sciencecodemanifesto.org/">Science Code Manifesto</a>, to a great response from a wide range of scientists.</li>
<li>2011-10-21/23: Invited to the <a href="http://gsoc-wiki.osuosl.org/index.php/2011">Google Summer of Code Mentor Summit</a>, in Mountain View, CA.  A terrific gathering of people from across the open-source world.  We had a couple of positive sessions on Open Science: open data, open source code, open access publications.</li>
<li>2011-10-22/23: I couldn&#8217;t get to the <a href="http://opensciencesummit.com/">Open Science Summit</a> (also in Mountain View, CA), because it clashed with the Google event.  I did show my face at the Saturday evening social, and met a number of Open Science movers and shakers in person for the first time (apparently I just missed <a href="http://www.stanford.edu/~vcs/">Victoria Stodden</a>).  </li>
<li>2011-10-24: Talk at the GooglePlex (invited by Peter Norvig). Covered the Science Code Manifesto especially.</li>
<li>2011-10-24: Attended a seminar at Stanford by Robert Reicher, on sustainable energy futures.  Met John Mashey in the flesh (20+ years after first encountering him online).</li>
<li>2011-10-25: Invited visit to the <a href="http://berkeleyearth.org/">Berkeley Earth Surface Temperatures</a> team at the LBL.  Met Richard Muller and the rest of the team.  Sat in on their weekly team meeting (honoured to sit opposite Saul Perlmutter and Arthur Rosenfeld).</li>
<li>2011-11: Our <a href="http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=5999649">paper on the ccc-gistemp project</a> was published in IEEE Software.  This is the Foundation&#8217;s first peer-reviewed publication.  Sadly my copy of this particular issue has been lost in the post; I must chase it up with the IEEE.</li>
<li>2011-11-10: The <a href="http://surfacetemperatures.blogspot.com/2011/11/ghcn-m-v310-showing-value-of-engaging.html">GHCN 3.1.0 dataset</a> was released by NCDC.  This incorporates fixes to their homogenization code prompted by Dan Rothenberg&#8217;s Summer of Code work.</li>
<li>2011-11-22: Invited talk to an <a href="http://icesfoundation.org/Pages/Home.aspx">ICES Foundation</a> meeting at the WMO in Geneva.  They&#8217;re a group in the very early stages of an exciting project.  I had finally received my copy of Michael Nielsen&#8217;s book <a href="http://michaelnielsen.org/blog/reinventing-discovery/">&#8220;Reinventing Discovery&#8221;</a> just before this trip.  I read it on my outward journey, and as a result completely rewrote my talk.  Read this book now.</li>
<li>2011-11-25: Invited seminar at <a href="http://climate.ncas.ac.uk/">NCAS</a> in Reading.  Covered the usual ground: the CCF, ccc-gistemp, Summer of Code, the Science Code Manifesto.  Very lively Q&#038;A session afterwards, which continued through coffee into lunch.</li>
<li>2011-12-15: Attended an AAAS-sponsored seminar by David MacKay at Imperial College, London.  MacKay is the chief scientific adviser to the UK Department of Energy and Climate Change (DECC), and talked about their <a href="http://www.decc.gov.uk/en/content/cms/tackling/2050/2050.aspx">&#8220;2050 Pathways&#8221;</a> web tool, which has grown out of <a href="http://www.withouthotair.com/">his great book on sustainable energy</a>.  He is far too busy, but I managed to buttonhole him briefly to discuss a possible project to build a Pathways app for smartphones.</li>
<li>2012-01-18: Invited talk at <a href="http://www.ncdc.noaa.gov/oa/ncdc.html">NCDC</a> in Asheville, NC, hosted by Peter Thorne.  Very interesting meetings with Tom Petersen and Scott Hausman, then the Ingest and Analysis group, and the Climate Data Records team.</li>
<li>2012-01-22/26: Invited speaker and panellist at the AMS annual meeting in New Orleans.  Johnny Lin (of <a href="http://pyaos.johnny-lin.com/">PyAOS</a>) ran the &#8220;Second Symposium on Advances in Modeling and Analysis Using Python&#8221; and kindly invited me to speak.  An excellent meeting, full of new contacts and ideas.  One particular highlight for me was that all three of our Summer of Code students gave presentations.  I was also very glad to meet Travis Oliphant, the creator of NumPy &#8211; I hope to be able to work with him, and his new <a href="http://www.continuum.io/">Continuum Analytics</a> company and NumFocus foundation, in future.</li>
</ul>
<p>In the next month I am giving a seminar at the Met Office and attending a round-table meeting of the Royal Society policy study.  We&#8217;re expecting the Google Summer of Code 2012 to be announced shortly, and are hoping to take part.</p>
<p>It&#8217;s possible I&#8217;ve missed a few items.  I haven&#8217;t mentioned any of the amazing related work going on, especially in the open science field.  I&#8217;ll try to summarize that in another blog post.</p>
<p>As you can see, the blog silence has not been a sign of inactivity &#8211; rather the reverse.  In fact, I&#8217;m actually writing this blog post in a stolen moment between sessions at the AMS meeting. Some other aspects of the Foundation&#8217;s work have also been neglected (for instance, we failed to schedule a meeting of our advisory committee).  But the Foundation is in rude health.</p>
<p>In other news, one of our founders, David Jones, is now working for <a href="https://scraperwiki.com">scraperwiki</a>, a truly excellent open-source/open-data website.  He continues as a director of the Foundation, and is writing a paper on some of our work.</p>
<p>The Foundation is still unfunded and all our work continues to be unpaid (and most &#8220;invited talks&#8221; do not include travel or accommodation expenses).  We are meeting most of our expenses from a small fee for contract programming in March 2011.</p>
]]></content:encoded>
			<wfw:commentRss>http://climatecode.org/blog/2012/01/activity-and-status/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Science Code Manifesto</title>
		<link>http://climatecode.org/blog/2011/10/science-code-manifesto/</link>
		<comments>http://climatecode.org/blog/2011/10/science-code-manifesto/#comments</comments>
		<pubDate>Thu, 13 Oct 2011 09:58:55 +0000</pubDate>
		<dc:creator>Nick Barnes</dc:creator>
				<category><![CDATA[News]]></category>
		<category><![CDATA[manifesto]]></category>
		<category><![CDATA[sciencecodemanifesto]]></category>

		<guid isPermaLink="false">http://climatecode.org/?p=595</guid>
		<description><![CDATA[I am pleased to announce the launch of the Science Code Manifesto, laying out general principles of publication for science software. Please read the manifesto, endorse it if you agree, then come back here to discuss it. This issue isn&#8217;t &#8230; <a href="http://climatecode.org/blog/2011/10/science-code-manifesto/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><a title="Science Code Manifesto" href="http://sciencecodemanifesto.org/" target="_blank"><img class="size-full wp-image-602 alignleft" title="'Code is Method' button, 240x120" src="http://climatecode.org/wp-content/uploads/2011/10/code-is-method-bloomfilter-240x120.png" alt="'Code is Method' button, 240x120" width="240" height="120" /></a>I am pleased to announce the launch of the <a href="http://sciencecodemanifesto.org/">Science Code Manifesto</a>, laying out general principles of publication for science software. Please read the manifesto, <a href="http://sciencecodemanifesto.org/endorse/">endorse it</a> if you agree, then come back here to discuss it.</p>
<p>This issue isn&#8217;t specific to climate science. I originally created this as a response and contribution to the Royal Society’s policy study on “Science as a Public Enterprise”. It is partly inspired by the <a href="http://pantonprinciples.org/">Panton Principles</a>, a bold statement of ideals in scientific data sharing. It refines the ideas I laid out in an <a href="http://www.nature.com/news/2010/101013/full/467753a.html">opinion piece</a> for Nature in 2010.</p>
<p>However, I did not originate these ideas. They are simply extensions of the core principle of science: publication. Publication is what distinguishes science from alchemy, and is what has propelled science – and human society – so far and so fast in the last 300 years. The Manifesto is the natural application of this principle to the relatively new, and increasingly important, area of science software.</p>
<p>Go on, <a href="http://sciencecodemanifesto.org/endorse/">endorse it</a> now.</p>
]]></content:encoded>
			<wfw:commentRss>http://climatecode.org/blog/2011/10/science-code-manifesto/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Royal Society submission</title>
		<link>http://climatecode.org/blog/2011/10/royal-society-submission/</link>
		<comments>http://climatecode.org/blog/2011/10/royal-society-submission/#comments</comments>
		<pubDate>Fri, 07 Oct 2011 11:43:54 +0000</pubDate>
		<dc:creator>Nick Barnes</dc:creator>
				<category><![CDATA[News]]></category>

		<guid isPermaLink="false">http://climatecode.org/?p=591</guid>
		<description><![CDATA[The Royal Society is conducting a policy study entitled &#8220;Science as a Public Enterprise&#8221;, and called for submissions from the public. The Climate Code Foundation made the following submission in August, (and I just realised that I never posted it &#8230; <a href="http://climatecode.org/blog/2011/10/royal-society-submission/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>The Royal Society is conducting a policy study entitled &#8220;Science as a Public Enterprise&#8221;, and called for <a href="https://royalsociety.org/policy/projects/science-public-enterprise/call-for-evidence/">submissions from the public</a>.  The Climate Code Foundation made the following submission in August (and I just realised that I never posted it to the blog):</p>
<p><b>1. What ethical and legal principles should govern access to research results and data? How can ethics and law assist in simultaneously protecting and promoting both public and private interests?</b></p>
<p>Two prefatory remarks which apply to all our answers:  First, the Climate Code Foundation is a non-profit organisation to promote the public understanding of climate science.  Thus, it is focused specifically on climate science.  Although its arguments may well apply to other fields, it takes no formal view on those fields.</p>
<p>Secondly, in this and the following answers, we take the use of &#8216;data&#8217; in the questions to mean both scientific measurements (and their accompanying metadata such as instrument type and time of observation) and code (that is, computer programs written by scientists to process their raw data into results).</p>
<p>In specific response to question 1: since climate science results are of critical public importance, the ethical principle of least harm is relevant to these datasets, and dictates that they should be made available to the general public as open data <a href="http://opendefinition.org/">http://opendefinition.org/</a>.  That is, at no cost, and under no restrictions save only, at most, attribution and share-alike requirements.</p>
<p>The software source code, which creates, defines, and interprets the datasets, should be available as open source software <a href="http://opensource.org/">http://opensource.org/</a> for the same reason: so that any interested party may inspect, copy, reason about, criticise, improve, or run it, without restrictions.</p>
<p>Regarding legal principles and law, the Climate Code Foundation does not have a view.  We have observed many legal actions against climate scientists, and use of legal instruments in place of polite enquiry, and legal threats in place of debate, and find the effects to be damaging to discourse and chilling to research.  The criminal and civil law should of course be available at the last resort, but that is not how it has been used.</p>
<p><b>2 a) How should principles apply to publicly-funded research conducted in the public interest?</b></p>
<p>In this case, and again restricting my remarks to climate science code and data, the public-pays principle confirms the conclusion that publicly-funded research results should be open source and open data.</p>
<p><b>2 b) How should principles apply to privately-funded research involving data collected about or from individuals and/or organisations (e.g. clinical trials)?</b></p>
<p>This is not relevant to climate science and so the Climate Code Foundation has no view.</p>
<p><b>2 c) How should principles apply to research that is entirely privately-funded but with possible public implications?</b></p>
<p>The possible threat of climate change to our collective well-being is so severe that privately-funded research data and code, in this field, should be open.  However, I see no way to enforce this.</p>
<p><b>2 d) How should principles apply to research or communication of data that involves the promotion of the public interest but which might have implications from the privacy interests of citizens?</b></p>
<p>This is not relevant to climate science and so the Climate Code Foundation has no view.</p>
<p><b>3. What activities are currently under way that could improve the sharing and communication of scientific information?</b></p>
<p>There are many, and I&#8217;m sure other responses will give a much broader view.  The top trends I would mention are as follows:</p>
<ul>
<li>open access publication;</li>
<li>open data repositories;</li>
<li>open bibliographic data;</li>
<li>linked open data;</li>
<li>electronic lab notebooks, and open notebook science;</li>
<li>open source software;</li>
<li>blog science and tweet science;</li>
<li>citizen science and &#8216;crowd-sourced&#8217; science.</li>
</ul>
<p>In the Climate Code Foundation we ourselves are working with several climate science groups to create shared science software resources, and to improve access to and understanding of existing climate data resources.</p>
<p><b>4. How do/should new media, including the blogosphere, change how scientists conduct and communicate their research?</b></p>
<p>The Climate Code Foundation takes no view on the use of new media in conducting research, except to welcome the possibility of increased public engagement through online citizen science projects such as <a href="http://oldweather.org/">Old Weather</a>.  Much science can and must continue to be done in a traditional way (e.g. a month digging up tree-stumps from permafrost, followed by several months of long lab hours processing them).</p>
<p>However, new media (email, the web, the blogosphere) do provide new opportunities for communicating research, both to other researchers and to the public.  Many of these changes have already taken place over the last few decades, and indeed early users of the internet were almost all researchers, and the web was invented precisely to allow better communication between scientists.</p>
<p>First, the usual medium of communication between scientists is now email.  It may seem trivial to mention this; I mention it to illustrate a potential problem with the question.  Email is a new medium, but it has become so completely integrated into our lives that it is hard to imagine science without it.</p>
<p>Secondly, the internet allows researchers to communicate results fully to each other: datasets, source code, and all.  Science communication is no longer constrained to the narrow bottleneck of the paper publication.  Modern science is entirely dependent on such uses of the internet.</p>
<p>Thirdly, the web provides an unfiltered channel for scientists to communicate their results directly to interested members of the public.  This might be by providing raw data, commentary, tools for data analysis, and linking to other information.  Many sites are tremendous resources for learning, exploration, and education.  In climate science, we encourage and assist with this process.</p>
<p>Certain web-based tools should also be included in &#8220;new media&#8221;, and these can provide direct benefits to scientists who are able to use them.  If access to your source code does not require a signed faxed agreement, then the department secretary can be taken out of the loop and the source code can be placed on one of the many freely available project management tools (GitHub, Google Code, KnowledgeForge, etc.). Individual researchers can even use the same tools to manage the code on a day-to-day basis, in the spirit of Open Notebook research; this may reduce the burden on departmental IT staff.  BitTorrent, and other similar tools, can be used to share data in a peer-to-peer fashion, reducing the burden on departmental datacentres, but again only if the grip on data is released slightly and it is made more openly available.  We doubt that the benefits are restricted to these particular examples.</p>
<p>Whether scientists engage in the discussions which take place in &#8216;the blogosphere&#8217; is a matter of choice; scientists should feel no obligation to take part.  In climate science, at least, it is not generally a polite environment, and large parts of it are completely hostile to almost any informed point of view.  It can be a boundless and toxic time-sink.</p>
<p><b>5. What additional challenges are there in making data usable by scientists in the same field, scientists in other fields, &#8220;citizen scientists&#8221; and the general public?</b></p>
<p>There are a few.  Data is not always accompanied by metadata, by a description of its format, its provenance, or its scope, or by information about limitations and errata.  This information may be understood by the originating scientists, but needs to be added to the data to make it usable by others.  However, this process is not compulsory, and should be encouraged simply by the rewards which automatically flow: a scientist whose data is more usable by others will receive more citations and other professional recognition.</p>
<p>That recognition does assume that mechanisms are in place to identify the effort: the researcher who does this work should of course be credited with it.</p>
<p><b>6 a) What might be the benefits of more widespread sharing of data for the productivity and efficiency of scientific research?</b></p>
<p>Simply put, more sharing of data will allow scientists to build more rapidly and reliably on each other&#8217;s results, causing science to progress more quickly in the discovery of truth and the construction of scientific knowledge.  It will reduce duplication of effort, it will result in more rapid discovery and correction of errors, and it will increase the speed with which new ideas and approaches are developed.</p>
<p>More sharing of code in particular will have the same effects, and will also allow the development of large, reliable, shared libraries and bodies of code, increasing the productivity and reliability of science across entire fields.</p>
<p>Sharing of data and code should help to &#8216;level the playing-field&#8217; for poorly-resourced scientists, and may also lead to higher levels of collaboration and to greater community-building.</p>
<p><b>6 b) What might be the benefits of more widespread sharing of data for new sorts of science?</b></p>
<p>Open data and code allow relatively novel types of science, such as crowd-sourced citizen science (for example <a href="http://zooniverse.org/">Zooniverse</a>) and especially research using automated data-mining and scraping systems, such as the various projects under Peter Murray-Rust&#8217;s &#8220;Blue Obelisk&#8221; banner. Without open data and code, these projects would be impossible. Without significant improvements in policies and statements of openness, they will be unable to progress.</p>
<p><b>6 c) What might be the benefits of more widespread sharing of data for public policy?</b></p>
<p>In climate science, very great indeed, and this is the raison d&#8217;etre of the Climate Code Foundation: by encouraging and enabling climate scientists to share and communicate their results more clearly and effectively, we work to improve public understanding of the science, and thus to allow the formation of public policy in an informed context.</p>
<p>In climate science, the unavailability of some code and data, and the impression that other code or data is either simply unavailable or more mysteriously &#8216;missing&#8217;, has been immensely damaging to public perceptions for several years, especially since the theft of emails from the University of East Anglia in 2009.  Those emails were grossly mis-represented as revealing a culture of secrecy, data abuse, and incompetence.</p>
<p>In fact practices in climate science are broadly in line with those in many other sciences &#8211; although there is a certain level of openness, some data and much source code is not freely available to the public. It may be available on request, but at the discretion of the researcher, and often not to a member of the public.  This inconsistent environment allows noxious allegations to flourish, and to be exploited in public policy debates.  Many times in the last decade, policy makers have repeated these allegations in debate, to cast doubt on the science, to avoid or delay important climate-related policies.  The negative effect of this overall atmosphere on policy related to climate change is hard to overstate.</p>
<p>Greatly increased openness in climate science would not draw all the poison from this discourse: positions are far too entrenched for that. Allegations of secrecy and corruption will continue.  But it would give climate scientists an unambiguous, consistent, and verifiable answer to all such questions.  Here is the data.  Here is the code. Here are the results.</p>
<p><b>6 d) What might be the benefits of more widespread sharing of data for other social benefits?</b></p>
<p>It is hard to estimate the general social benefits of a better-informed public.</p>
<p><b>6 e) What might be the benefits of more widespread sharing of data for innovation and economic growth?</b></p>
<p>The Climate Code Foundation has no view on this question as it is posed.</p>
<p><b>6 f) What might be the benefits of more widespread sharing of data for public trust in the processes of science?</b></p>
<p>See my answer to (6c).</p>
<p><b>7. How should concerns about privacy, security and intellectual property be balanced against the proposed benefits of openness?</b></p>
<p>There is no conflict.</p>
<p>Scientists have a right to privacy, and open-ness does not conflict with that.  We only advocate the sharing of research products: data, code, results, and publications.  None of those contain any private information.</p>
<p>&#8220;Security&#8221; is a catch-all word, but again, there is no conflict.</p>
<p>&#8220;Intellectual property&#8221; is a much-abused term.  I will take its use here to mean copyrights.  Copyright is not threatened by openness. Indeed, many frameworks for openness (such as the Creative Commons) depend on copyright to enforce sharing conditions such as attribution. &#8220;Intellectual property&#8221; is often used as a post-hoc argument to justify delaying or avoiding openness.</p>
<p>There is no copyright on data (although there may be database rights within the European Union), and datasets have been protected in the past by secrecy and embargos.  Such embargos are becoming a thing of the past, as publication policies change, and scientists are realising that they receive more credit for becoming the originators and curators of a widely-used, widely-published, foundational dataset, than they could garner by eking out a few more papers on their own.</p>
<p>Software may be protected by copyright, but the protected work is rarely of any value which could be realised.  Either the research is described by a publication which gives full details of the algorithm, or it is not.  In the latter case, we would argue that the publication is seriously flawed &#8211; the science depends on methods which are not published &#8211; and is not truly part of the collective enterprise of science.  In the former case, because algorithms per se cannot be protected by copyright, there is no &#8220;intellectual property&#8221; left to protect.</p>
<p>In a very few cases, research bodies may have come to rely on revenue generated by licensing software or data protected by copyright or database rights.  In the specific case of climate science, we hold that the public interest argument for openness is so strong that these bodies must restructure their business models, and funding agencies may have to allow for this.</p>
<p>In short, the &#8220;intellectual property&#8221; of science is a commons, to which all researchers contribute, and from which all of society benefits.</p>
<p><b>8. What should be expected and/or required of scientists (in companies, universities or elsewhere), research funders, regulators, scientific publishers, research institutions, international organisations and other bodies?</b></p>
<p>Public funders of research in climate science should require all research products to be made available to the public: publications should be open-access, data should be open data, software should be open-source.</p>
<p>The responsibilities of all other actors in climate science (scientists, institutions, publishers, etc) will follow naturally from this requirement.</p>
<p>Many stakeholders and powers-that-be have made positive statements or policies about openness, but they are often hedged around with phrases such as &#8220;where possible&#8221;, &#8220;when available&#8221;, or &#8220;subject to commercial constraints&#8221;.  No such hedges are viable in climate science.  The public policy issues are so serious, and the possible consequences of inaction so grave, that there must be no exceptions.</p>
<p><b>Other comments</b></p>
<p>The Climate Code Foundation is launching a &#8220;Science Code Manifesto&#8221;, on the specific subject of science software availability, and its consequences for science stakeholders (strongly related to your question 8).  Please see:</p>
<p><a href="http://code.google.com/p/climatecode/wiki/ScienceCodeManifesto">http://code.google.com/p/climatecode/wiki/ScienceCodeManifesto</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://climatecode.org/blog/2011/10/royal-society-submission/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Homogenization report</title>
		<link>http://climatecode.org/blog/2011/09/homogenization-report/</link>
		<comments>http://climatecode.org/blog/2011/09/homogenization-report/#comments</comments>
		<pubDate>Wed, 14 Sep 2011 21:07:21 +0000</pubDate>
		<dc:creator>Nick Barnes</dc:creator>
				<category><![CDATA[News]]></category>
		<category><![CDATA[daniel-rothenberg]]></category>
		<category><![CDATA[gsoc2011]]></category>

		<guid isPermaLink="false">http://climatecode.org/?p=583</guid>
		<description><![CDATA[This guest post is written by Daniel Rothenberg, who worked all summer on homogenization code, thanks to the excellent Google Summer of Code. This is his third post, here are the first and second. As you may recall, I spent &#8230; <a href="http://climatecode.org/blog/2011/09/homogenization-report/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><em>This guest post is written by <a title="Daniel Rothenberg" href="http://climatecode.org/about/activities/gsoc2011/rothenberg/" target="_blank">Daniel Rothenberg</a>, who worked all summer on homogenization code, thanks to the excellent Google Summer of Code.  This is his third post; here are the <a title="Welcome Daniel Rothenberg" href="http://climatecode.org/blog/2011/05/welcome-daniel-rothenberg/" target="_blank">first</a> and <a title="Homogenization project progress" href="http://climatecode.org/blog/2011/07/homogenization-project-progress/" target="_blank">second</a>.</em></p>
<p>As you may recall, I spent the past summer working on behalf of the Climate Code Foundation to port and revise the <a href="http://www.ncdc.noaa.gov/oa/climate/research/ushcn/#phas">Pairwise Homogenization</a> software utilized by the National Climatic Data Center to produce the <a href="http://www.ncdc.noaa.gov/oa/climate/research/ushcn/#intro">US Historical Climate Network</a> dataset. Since my last update in the middle of July, I successfully worked through my first pass at the remaining sections of the algorithm, and have arrived at a major milestone &#8211; a Python program which can arbitrarily look at networks in the USHCN raw data, and homogenize them based on pairwise comparisons.</p>
<p><img src="http://climatecode.org/wp-content/uploads/2011/09/homogenization_example.png"/><br />
<a href="http://climatecode.org/wp-content/uploads/2011/09/homogenization_example.png">Figure 1</a> illustrates the homogenization results for two stations which were passed into the algorithm with a random selection of 50 other stations from across the USHCN. This test illustrates that the new code does some things very well, but still needs some work. For starters, when investigating the diagnostic output log from running the code on this test case, it is clear that the code nearly exactly reproduces its Fortran parent&#8217;s results up through the final &#8220;CONFIRMFILT&#8221; stage of analysis. At this stage, the code attempts to condense a large number of suspected breakpoints into a best-fit over the data. There are still some discrepancies between my code and the original, which tend to suppress the final number of detected changepoints. A perfect example of this is in the NEW ULM plot in Figure 1; the Python code misses the first detected changepoint around the year 2000, while it successfully finds others that the Fortran code spots. By contrast, the Python code sometimes fails to remove extra changepoints &#8211; particularly around swaths of &#8216;deleted&#8217; data (data which cannot be analyzed in this algorithm, usually because there aren&#8217;t enough paired neighbors to provide supporting information to them); this is illustrated well by the COLFAX plot in Figure 1, around 1910.</p>
<p>Although this is the major glitch in the code at this point, there are some other issues which need to be ironed out. First, there are some numerical issues associated with calculating the standardized adjustments to apply at each changepoint. From my experience with other parts of the code, this is likely a sign error in the statistical test which calculates the final adjustment at each changepoint, so it should be simple to find and fix in the future. Second, the algorithm needs to be adjusted to accept external sources of documented changepoints &#8211; this will greatly improve its ability to find the &#8220;best&#8221; changepoints in the cloud of suspect ones it finds through the first half of the algorithm. Finally, I am still working on re-engineering the code in its existing form to work more in the fashion of an API so that it can be more easily used on various datasets in the future.</p>
<p>This project wouldn&#8217;t have been possible without the support and advice of David Jones and Nick Barnes of the Climate Code Foundation, and the help of Claude Williams and Matt Menne at the National Climatic Data Center. Hannah and Filipe &#8211; my GSoC compatriots &#8211; also provided great feedback and help throughout our code reviews and meetings. I&#8217;d like to thank them for all their time and effort over the summer!</p>
<p>Finally, I&#8217;m excited to continue working on this project &#8211; especially over the next few months and leading up to the 2012 Annual Meeting of the American Meteorological Society, where I will hopefully be presenting a talk entitled &#8220;Lessons From Deploying the USHCN Pairwise Homogenization Algorithm in Python&#8221; as part of the 2nd Symposium on Advances in Modeling and Analysis Using Python. There is much work to continue with in the future:</p>
<ul>
<li>Refinement of the Python homogenization code, including addressing known bugs in the CONFIRMFILT process.</li>
<li>Further collaboration with Menne/Williams to improve the code and thoroughly see how it differs from the Fortran homogenization code.</li>
<li>A possible project with David Jones, looking at applying this algorithm to data from the Canadian climate record.</li>
</ul>
<p>If you&#8217;re interested in helping continue this project, please contact me or the Climate Code Foundation &#8211; we&#8217;d love to have you on board!</p>
]]></content:encoded>
			<wfw:commentRss>http://climatecode.org/blog/2011/09/homogenization-report/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Common Climate Project demonstration</title>
		<link>http://climatecode.org/blog/2011/09/common-climate-project-demonstration/</link>
		<comments>http://climatecode.org/blog/2011/09/common-climate-project-demonstration/#comments</comments>
		<pubDate>Sat, 10 Sep 2011 16:33:56 +0000</pubDate>
		<dc:creator>Nick Barnes</dc:creator>
				<category><![CDATA[News]]></category>
		<category><![CDATA[gsoc2011]]></category>
		<category><![CDATA[hannah-aizenman]]></category>

		<guid isPermaLink="false">http://climatecode.org/?p=576</guid>
		<description><![CDATA[This guest post is written by Hannah Aizenman, who worked all summer on the Common Climate Project, thanks to the excellent Google Summer of Code. The summer is now past, and Hannah has built a useful demonstrator website. This is &#8230; <a href="http://climatecode.org/blog/2011/09/common-climate-project-demonstration/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><em>This guest post is written by <a title="Hannah Aizenman" href="http://climatecode.org/about/activities/gsoc2011/aizenman/" target="_blank">Hannah Aizenman</a>, who worked all summer on the Common Climate Project, thanks to the excellent Google Summer of Code. The summer is now past, and Hannah has built a useful demonstrator website. This is her third post, here are the <a title="Welcome Hannah Aizenman" href="http://climatecode.org/blog/2011/05/welcome-hannah-aizenman/" target="_blank">first</a> and <a title="First code for Common Climate Project" href="http://climatecode.org/blog/2011/07/first-code-for-common-climate-project/" target="_blank">second</a>.</em></p>
<p><em></em>Since the last blog post, the project has gained a (very) barebones web interface, so you can test out the functionality <a href="http://134.74.146.36/hannah/ccpviz.html">here</a>. So please go and play with the project, and if you end up thinking &#8220;hmm, this could work for a dataset I have&#8221;, <a href="https://code.google.com/p/ccp-viz-toolkit/source/checkout">grab the code</a>. And if instead you think &#8220;hmm, this could work, but &#8230;&#8221; <a href="https://code.google.com/p/ccp-viz-toolkit/issues/list">file a ticket</a>, <a href="https://code.google.com/p/ccp-viz-toolkit/source/browse/">contribute a patch</a>, or email the <a href="http://groups.google.com/group/ccp-viz-toolkit-discuss"> mailing list</a> and see if it can get sorted out. The project has been tested with GISTEMP and CCSM-C, and should work out of the box on any NetCDF dataset using float coordinates and unidate time (assuming an &#8220;x since y&#8221; unit is given for time). I&#8217;m currently sorting out how to support non-gridded data, with a focus on the <a href="http://www.ncdc.noaa.gov/paleo/pubs/mann2008/mann2008.html">Mann temperature reconstructions</a>.</p>
<p>I hope that by giving people a simple little toolset to play with datasets, it&#8217;ll simplify dataset exploration so that anyone (even somebody who doesn&#8217;t understand climate/geophysical dataset conventions) can just join in and play. All the plotting is still handled server side using ccplib, so if you&#8217;ve got local data and don&#8217;t really want or need the web aspect, just grab that code, run setup.py, and go. There&#8217;s one <a href="https://code.google.com/p/ccp-viz-toolkit/source/browse/#hg%2Fscripts"> demo</a> already, and I&#8217;d love contributions of more!</p>
<p>I&#8217;ve also added support for making time series graphs, and I hope to add support for more visualization tasks as this project grows. I&#8217;m also including an example of a spatial graph for anyone who missed the last blog post:</p>
<p><img title="CCP example time series chart with GISTEMP data" src="http://climatecode.org/wp-content/uploads/2011/09/ccp-demo-gistemp-time.png" alt="CCP example time series chart with GISTEMP data" width="500" height="400" /><img title="CCP example map using GISTEMP data" src="http://climatecode.org/wp-content/uploads/2011/09/ccp-demo-gistemp-map.png" alt="CCP example map using GISTEMP data" width="500" height="400" /></p>
<p>I had originally intended this project to lay down the framework for building a flexible toolkit for visualizing data, and I hope I&#8217;ve at least accomplished that much. Adding new visualizations and file types mostly boils down to hooking into an existing class (CCPData or Graph respectively) and the client side HTML, CSS, and JavaScript are heavily separated so that the form can easily be styled to fit within a larger website. Adding more content to the web is a matter of adding another JavaScript function, HTML element, and pyramid view. The backend web architecture tries to be RESTful (though it doesn&#8217;t yet conform to the HTML RFC), so the URLs for the images contain all the user defined attributes of the graph and the graphs can be created and manipulated directly from the URL. I hope to maintain the flexibility of the project so that it can grow into something really useful for all sorts of scientists. I very much hope my project will simplify the current chore of getting data on the web, because I think making the data public friendly is key to improving public data literacy.</p>
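<p>As a rough illustration of the idea that a URL can carry every user-defined attribute of a graph, here is a minimal sketch using only the Python standard library. The host, the path, and the function names are hypothetical and are not the toolkit&#8217;s actual URL scheme; the attribute names are borrowed from the earlier demo script.</p>
<pre>from urllib.parse import urlencode, urlsplit, parse_qs


def graph_url(base, attrs):
    """Build a URL whose query string holds every graph attribute."""
    # Sorting keeps the URL stable for the same set of attributes.
    return base + "?" + urlencode(sorted(attrs.items()))


def graph_attrs(url):
    """Recover the attribute dictionary from such a URL."""
    query = urlsplit(url).query
    return {key: values[0] for key, values in parse_qs(query).items()}


# Hypothetical endpoint, for illustration only.
attrs = {"projection": "moll", "cmap": "gist_heat_r", "title": "gistemp demo"}
url = graph_url("http://example.org/graphs/spatial.png", attrs)
print(url)
print(graph_attrs(url) == attrs)</pre>
<p>Because the whole graph description lives in the URL, changing one query parameter by hand is enough to request a different image.</p>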
]]></content:encoded>
			<wfw:commentRss>http://climatecode.org/blog/2011/09/common-climate-project-demonstration/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ccc-gistemp summer project update</title>
		<link>http://climatecode.org/blog/2011/09/ccc-gistemp-summer-project-update/</link>
		<comments>http://climatecode.org/blog/2011/09/ccc-gistemp-summer-project-update/#comments</comments>
		<pubDate>Fri, 09 Sep 2011 19:37:43 +0000</pubDate>
		<dc:creator>Nick Barnes</dc:creator>
				<category><![CDATA[News]]></category>
		<category><![CDATA[filipe-fernandes]]></category>
		<category><![CDATA[gsoc2011]]></category>

		<guid isPermaLink="false">http://climatecode.org/?p=543</guid>
		<description><![CDATA[This guest post is written by Filipe Fernandes, who worked all summer on our ccc-gistemp project, thanks to the excellent Google Summer of Code. The summer is now past, although Filipe continues to work on our code. This is his &#8230; <a href="http://climatecode.org/blog/2011/09/ccc-gistemp-summer-project-update/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><em>This guest post is written by <a href="http://climatecode.org/about/activities/gsoc2011/fernandes/">Filipe Fernandes</a>, who worked all summer on our ccc-gistemp project, thanks to the excellent Google Summer of Code. The summer is now past, although Filipe continues to work on our code. This is his third post, here are the <a title="Welcome Filipe Fernandes" href="http://climatecode.org/blog/2011/05/welcome-filipe-fernandes/" target="_blank">first</a> and <a title="Making ccc-gistemp more user friendly" href="http://climatecode.org/blog/2011/07/making-ccc-gistemp-more-user-friendly/" target="_blank">second</a>.<br />
</em></p>
<p>We finally arrived at the final stage of the Google Summer of Code program. I&#8217;m happy I could be part of such an interesting and exciting project. Most of all I&#8217;m glad I ended up working with the Climate Code Foundation (CCF). I won&#8217;t read or write code the same way after being mentored by CCF&#8217;s David Jones.</p>
<p>In fact, both David Jones and Nick Barnes were a fantastic duo to work with. They live in a &#8220;hybrid world&#8221; of computer and climate scientists. That unique experience makes them the perfect mentors for people like me, who had little training in software. Developing the project under the CCF supervision was a great learning opportunity for me.</p>
<p>My project was focused on the ccc-gistemp (the CCF implementation of NASA GISTEMP). The project changed a little bit during the GSoC, but its general ideas survived:</p>
<ul>
<li>Make ccc-gistemp more user-friendly;</li>
<li>Improve ccc-gistemp running time using NumPy arrays;</li>
<li>Transform ccc-gistemp into an accessible piece of software for end users.</li>
</ul>
<p>Since the midterm I have implemented a few improvements towards those goals:</p>
<ul>
<li>Comma Separated Value alternative for ccc-gistemp outputs. Now the GUI can open the results directly in a spreadsheet program like Excel.</li>
<li>GUI support for a rudimentary &#8220;project management system.&#8221; It means that the user can make multiple runs with different options and compare them later.</li>
<li>A SUSE studio appliance with pypy+ccc-gistemp+data+GUI interface that can be executed as a virtual machine, live CD or Amazon EC2 instance. [1]</li>
<li>A NumPy alternative for step 3.</li>
</ul>
<p>The last one turned out to be more challenging than expected, and it is still a work in progress that I wish to continue pursuing after GSoC.</p>
<p>There are also several things left to be done:</p>
<ul>
<li>More (elaborated) graphics and plotting output for the GUI;</li>
<li>Project management via an ini-like file;</li>
<li>Full NumPy support (from steps 1-5);</li>
</ul>
<p>I would like to thank my mentor David Jones for all the wisdom he shared with me during the GSoC. I also would like to extend my thanks to all CCF/CCP mentors (Nick, Julien, Kevin and Jason) who promptly helped all the students. Finally, I would like to thank my colleagues Daniel and Hannah for their valuable opinions and feedback on my work. I&#8217;m going to miss our Monday meetings and Friday code reviews.</p>
<p>[1] <a title="SUSE studio appliance including ccc-gistemp" href="http://susegallery.com/a/YfJVDT/ccc-gistemp?#appliance-downloads">http://susegallery.com/a/YfJVDT/ccc-gistemp?#appliance-downloads</a><br />
[2] <a title="PyPI project page for ccc-gistemp" href="http://pypi.python.org/pypi/ccc-gistemp/" target="_blank">http://pypi.python.org/pypi/ccc-gistemp/</a></p>
]]></content:encoded>
			<wfw:commentRss>http://climatecode.org/blog/2011/09/ccc-gistemp-summer-project-update/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Homogenization project progress</title>
		<link>http://climatecode.org/blog/2011/07/homogenization-project-progress/</link>
		<comments>http://climatecode.org/blog/2011/07/homogenization-project-progress/#comments</comments>
		<pubDate>Sun, 17 Jul 2011 06:00:36 +0000</pubDate>
		<dc:creator>Nick Barnes</dc:creator>
				<category><![CDATA[News]]></category>
		<category><![CDATA[daniel-rothenberg]]></category>
		<category><![CDATA[gsoc2011]]></category>

		<guid isPermaLink="false">http://climatecode.org/?p=473</guid>
		<description><![CDATA[This guest post is written by Daniel Rothenberg, one of our Google Summer of Code students, who is working on a library for climate record homogenization. His previous post introduced his project. At the halfway point in my Google Summer &#8230; <a href="http://climatecode.org/blog/2011/07/homogenization-project-progress/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><em>This guest post is written by <a href="http://climatecode.org/about/activities/gsoc2011/rothenberg/">Daniel Rothenberg</a>, one of our Google Summer of Code students, who is working on a library for climate record homogenization. His <a href="http://climatecode.org/blog/2011/05/welcome-daniel-rothenberg/">previous post</a> introduced his project.</em></p>
<p>At the halfway point in my Google Summer of Code project, I am happy to report that a great deal of progress has been made. A few weeks ago, I set out to re-write the Pairwise Homogenization Algorithm (PHA) [1] used by the United States Historical Climatology Network (USHCN) [2]. While there is a published version of this algorithm available online [3], the code is written in Fortran and complicated to read, understand, and use. My project aimed to de-obfuscate this code, and use it to build a library of similar codes that people could use in the future to explore surface temperature homogenizations and reconstructions.</p>
<p>The first task in this project was to port the PHA from its current form (in Fortran) to something more accessible and maintainable. Thus, I&#8217;ve spent the majority of my time slogging through complex, dense Fortran subroutines, working out the array-traversing logic that governs the mechanics of the algorithm. While there is a published, high-level description of this algorithm and its logic [1], nothing quite compares to seeing how the code is actually implemented. By far, the most difficult obstacle in this project so far has been translating existing code (like the semi-hierarchical splitting algorithm used to identify possible undocumented changepoints) into a Pythonic, easy-to-understand form.</p>
<p>So far, I&#8217;ve had a lot of success overcoming this sort of obstacle, but it&#8217;s never an easy feat. The original code, by Claude Williams and Matt Menne, has useful comments and documentation, but is written in an older style of Fortran (Fortran 77), and uses many conventions which are avoided in modern programming. A good example is copious use of <code>goto</code> statements. In a nutshell, these tell a program to jump over a block of code—even out of other control structures like <code>for</code>-loops. While they&#8217;re useful for some tasks, they result in the creation of what&#8217;s often derided as “spaghetti code”— code which goes every which way but loose, and is hard to understand.</p>
<p>There are good ways to untangle spaghetti code, though. For instance, I&#8217;ve been developing my code on a test set of data which I repeatedly run through the Fortran routines. By manipulating where the original Fortran code ends, I can liberally sprinkle debugging information like <code>print</code> statements throughout the code, and re-compile/run it in seconds when needed. This lets me track how variables like <code>for</code>-loop index counters change over time, and allows me to investigate exactly where looping code breaks and to where it jumps.</p>
<p>Then, once I understand how it works, I can translate it into Python—but only with a few clever tricks! You see, Python doesn&#8217;t have a <code>goto</code> statement. However, it does include ways to break prematurely out of loops—the <code>continue</code> statement, which skips to the next iteration of a loop, and the <code>break</code> statement, which immediately exits the looping scenario. These are useful, but have caveats; for instance, the <code>break</code> statement only breaks out of one loop, so if you&#8217;re looping over two indices, you can&#8217;t exit out of the “master” loop structure. Other nifty Python tools let you overcome this, though. For instance, you can use <code>zip</code> to “zip up” two lists of values into a single one, which often lets you condense some complex nested <code>for</code>-loops from Fortran into simpler and easier to understand loops in Python.</p>
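<p>To make these loop tools concrete, here is a small sketch. It is not taken from the homogenization code; the function names and data are invented for illustration. It shows <code>continue</code> skipping an iteration, a <code>return</code> inside a helper function leaving two nested loops at once (one common way around the single-loop limit of <code>break</code>), and <code>zip</code> walking two lists in step.</p>
<pre># Illustrative sketch only; the function names and data are hypothetical.

def first_match(grid, target):
    """Return (row, col) of the first cell equal to target, else None."""
    for i, row in enumerate(grid):
        for j, value in enumerate(row):
            if value is None:
                continue          # skip a missing value, keep looping
            if value == target:
                return (i, j)     # return leaves both loops at once
    return None


def first_divergence(series_a, series_b, tolerance):
    """Return the first index where paired values differ by more than tolerance."""
    found = None
    # zip walks both lists in step, so no separate index counters are needed.
    for index, (a, b) in enumerate(zip(series_a, series_b)):
        if abs(a - b) > tolerance:
            found = index
            break                 # exits this single loop only
    return found


grid = [[1.0, None, 3.0], [4.0, 5.0, 6.0]]
print(first_match(grid, 5.0))
print(first_divergence([0.1, 0.2, 0.9], [0.1, 0.25, 0.2], 0.1))</pre>
<p>Wrapping a nested search in a function and returning from it is often the clearest replacement for a Fortran <code>goto</code> that jumps out of several loops.</p>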
<p>Other obstacles have sometimes involved the uncovering of bugs in the original PHA code. To date, I&#8217;ve found three significant bugs in the code which could potentially change some of the detection of changepoints in the algorithm. Two of these relate to a form of linear regression called the Kendall-Theil robust line fit. In this method, you form pairwise estimates of slopes from all the values you have in your data, and estimate the linear regression using the median slope you find. One bug I found involved the two-phase regression form of this code (used if you hypothesize that the slope on one half of a segment of data is different from the slope on the other half) used in the changepoint detection tests with the Bayesian Information Criterion. A second bug was the inadvertent overwriting of a variable describing median values found. I have reported these bugs back to the authors.</p>
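<p>For readers unfamiliar with the Kendall-Theil fit, the basic single-line form described above can be sketched in a few lines of Python. This is not the PHA implementation and it does not cover the two-phase variant; the function name is made up, and the intercept convention used here (median of y minus slope times median of x) is an assumption, one of several in common use.</p>
<pre>from itertools import combinations
from statistics import median


def kendall_theil_fit(xs, ys):
    """Return (slope, intercept) of a median-of-pairwise-slopes line fit."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1  # pairs with equal x have no defined slope
    ]
    slope = median(slopes)
    # Assumed intercept convention: median(y) - slope * median(x).
    intercept = median(ys) - slope * median(xs)
    return slope, intercept


years = [0, 1, 2, 3, 4]
temps = [1.0, 3.0, 5.0, 40.0, 9.0]   # y = 2x + 1 with one wild outlier
print(kendall_theil_fit(years, temps))</pre>
<p>Even with the outlier at the fourth point, the fit recovers a slope of 2 and an intercept of 1, which is why the method is called robust.</p>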
<p><img src="http://climatecode.org/wp-content/uploads/2011/07/inhomogeneities.png" alt="" /></p>
<p>These bugs bring me full-circle back to the original intent of this project, which is to de-obfuscate code and make it as easy-to-understand and transparent as possible. The only way to have caught these bugs is to have dug through the code and kept detailed accounts of what values various loops take on during their execution. These sorts of runtime errors in your code can be very hard to catch, especially in complex codes which are hard to understand. There are a few tried-and-true methods to alleviate them, however. First, you can use large sets of simple test cases to catch the various corner cases and bugs that can creep into your codes. I use this method to validate my auxiliary methods, such as correlation computations. However, sometimes it&#8217;s just not tenable to generate test cases: the Kendall-Theil code is complicated enough that while you could theoretically work out simple cases by hand, you really need some sort of numerical code to perform the method.</p>
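<p>As an example of what simple test cases for an auxiliary routine can look like, here is a sketch of a Pearson correlation function checked against inputs whose answers are known in advance. The function is written for this post as an illustration and is not the project&#8217;s own correlation code.</p>
<pre>from math import isclose, sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Cases whose answers are known without running any code.
assert isclose(pearson([1, 2, 3, 4], [2, 4, 6, 8]), 1.0)    # perfect positive
assert isclose(pearson([1, 2, 3, 4], [8, 6, 4, 2]), -1.0)   # perfect negative
assert isclose(pearson([-1, 0, 1], [1, 0, 1]), 0.0, abs_tol=1e-12)  # no linear relation
print("all correlation checks passed")</pre>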
<p>It is in cases like this where it is so important to practice good software development and engineering principles, like organized code, strong documentation, and iterative development. The truth is that a great deal of numerical codes used in scientific programming are dense and complicated; reducing the complexity of code and writing in as clear and transparent a manner as possible helps both the end user of the code and the writer himself ensure that it is valid and does what it is supposed to do.</p>
<p>With that said, although I&#8217;ve accomplished much in my project so far [4], there is still a great deal to do. Expect a second post on this soon, which will also recap a recent trip I made to the National Climatic Data Center.</p>
<p>1) <a href="ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/v2/monthly/menne-williams2009.pdf">ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/v2/monthly/menne-williams2009.pdf</a><br />
2) <a href="http://www.ncdc.noaa.gov/oa/climate/research/ushcn/#phas">http://www.ncdc.noaa.gov/oa/climate/research/ushcn/#phas</a><br />
3) <a href="http://www.ncdc.noaa.gov/oa/climate/research/ushcn/#homogeneity">http://www.ncdc.noaa.gov/oa/climate/research/ushcn/#homogeneity</a><br />
4) <a href="http://code.google.com/p/ccf-homogenization/">http://code.google.com/p/ccf-homogenization/</a></p>
]]></content:encoded>
			<wfw:commentRss>http://climatecode.org/blog/2011/07/homogenization-project-progress/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Making ccc-gistemp more user-friendly</title>
		<link>http://climatecode.org/blog/2011/07/making-ccc-gistemp-more-user-friendly/</link>
		<comments>http://climatecode.org/blog/2011/07/making-ccc-gistemp-more-user-friendly/#comments</comments>
		<pubDate>Sat, 16 Jul 2011 06:00:15 +0000</pubDate>
		<dc:creator>Nick Barnes</dc:creator>
				<category><![CDATA[News]]></category>
		<category><![CDATA[filipe-fernandes]]></category>
		<category><![CDATA[gsoc2011]]></category>

		<guid isPermaLink="false">http://climatecode.org/?p=475</guid>
		<description><![CDATA[This guest post is written by Filipe Fernandes, one of our Google Summer of Code students, who is working on our ccc-gistemp project. His previous post introduced his project. Hello, my name is Filipe Fernandes and I&#8217;m a Google Summer &#8230; <a href="http://climatecode.org/blog/2011/07/making-ccc-gistemp-more-user-friendly/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><i>This guest post is written by <a href="http://climatecode.org/about/activities/gsoc2011/fernandes/">Filipe Fernandes</a>, one of our Google Summer of Code students, who is working on our ccc-gistemp project.  His <a href="http://climatecode.org/blog/2011/05/welcome-filipe-fernandes/">previous post</a> introduced his project.</i></p>
<p>Hello, my name is Filipe Fernandes and I&#8217;m a Google Summer of Code (GSoC) student for the Climate Code Foundation (CCF).</p>
<p>I&#8217;ve worked mostly on packaging and cross-distribution of <i>ccc-gistemp</i>.</p>
<p>The current <i>ccc-gistemp</i> code is a program for people with at least intermediate computer skills. It must be run from a command-line terminal, and it is difficult to make multiple runs and comparisons.</p>
<p>We want to change that, making <i>ccc-gistemp</i> available to a broader audience.  The progress I have made towards that goal is:</p>
<ul>
<li>Added a Command Line Interface (CLI) that unifies all calls to run/vischeck;</li>
<li>Packaged ccc-gistemp via a standard Python setup.py;</li>
<li>Registered the code at PyPI;</li>
<li>Implemented py2exe (Windows) and py2app (Mac) for a frozen version of the CLI;</li>
<li>Started a Graphical User Interface (GUI).</li>
</ul>
<p><img src="http://climatecode.org/wp-content/uploads/2011/07/ccc-gistemp-GUI.png"/></p>
<p>The <a href="http://pypi.python.org/pypi/ccc-gistemp/0.6.1">PyPI package</a> has 63 downloads so far (as of 2011-07-10), which is quite impressive, since there was no advertisement. A Linux package was also added to the Open Build Service (OBS), but the number of downloads is not available.</p>
<p>Via the OBS one can create live CDs with the code or virtual machines that run on Virtual Box or Amazon EC2 making the code even more accessible.</p>
<p>I decided to tackle the GUI early in the project schedule due to its importance and higher difficulty (tackle largest risk first). I had never used wxPython before, but I&#8217;m pleased with the results so far.</p>
<p>The GUI is still under development, but the current version already runs <i>ccc-gistemp</i> much as the CLI does. We are working on ways to visualize the results and compare different runs.</p>
<p>For the second half of the GSoC period I&#8217;ll be working with the GUI and implementing an alternative core to <i>ccc-gistemp</i> using NumPy.</p>
<p>My original proposal has changed a little bit: I&#8217;m favoring the GUI over the NumPy implementation. I believe that a good user interface is crucial to achieving the Foundation&#8217;s goals.</p>
]]></content:encoded>
			<wfw:commentRss>http://climatecode.org/blog/2011/07/making-ccc-gistemp-more-user-friendly/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>First code for Common Climate Project</title>
		<link>http://climatecode.org/blog/2011/07/first-code-for-common-climate-project/</link>
		<comments>http://climatecode.org/blog/2011/07/first-code-for-common-climate-project/#comments</comments>
		<pubDate>Fri, 15 Jul 2011 06:00:24 +0000</pubDate>
		<dc:creator>Nick Barnes</dc:creator>
				<category><![CDATA[News]]></category>
		<category><![CDATA[gsoc2011]]></category>
		<category><![CDATA[hannah-aizenman]]></category>

		<guid isPermaLink="false">http://climatecode.org/?p=481</guid>
		<description><![CDATA[This guest post is written by Hannah Aizenman, one of our Google Summer of Code students, who is working on a web-based visualisation tool for reconstructions of late Holocene temperatures, for the Common Climate Project (CCP). Since my last blog &#8230; <a href="http://climatecode.org/blog/2011/07/first-code-for-common-climate-project/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><i>This guest post is written by <a href="http://climatecode.org/about/activities/gsoc2011/aizenman/">Hannah Aizenman</a>, one of our Google Summer of Code students, who is working on a web-based visualisation tool for reconstructions of late Holocene temperatures, for the Common Climate Project (CCP).</i></p>
<p>Since <a href="http://climatecode.org/blog/2011/05/welcome-hannah-aizenman/">my last blog post</a> my project has progressed enough that the code can actually be used to make graphs of some sort, mostly plots of GISTEMP anomalies. The code is separated into two distinct parts: code that handles the data and code that handles the graphs.</p>
<p>The data part is pretty straightforward; give the code the path to the data and it&#8217;ll try to unpack the data and pull all the metadata out of the file. This is achieved as follows:</p>
<pre>file_path = '../data/fields/gistemp_sat_anom_2.5deg.nc'
data_obj = unpack.CCPDataFromNetCDF(file_path, field = 'field')</pre>
<p>Then the data can be pulled out of the file using:</p>
<pre>im = data_obj.get_all_data()</pre>
<p>Note: there is also a get_data() function for doing online calculations over multiple files.</p>
<p>Once the data is pulled out of the file, it&#8217;s up to the user to run it through whatever algorithm he or she chooses and spit out data to be graphed. For this post, I just took the standard deviation of all the observations:</p>
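<pre>import numpy as np

missing = data_obj.missing_value
# Mask cells holding the dataset's missing-value flag
mask = (im == missing)
im_masked = np.ma.masked_array(im, mask)
# Standard deviation along the first axis, ignoring masked cells
masked = im_masked.std(0)</pre>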
<p>Next it&#8217;s time to set up the graph, starting off with setting the attributes and creating the object:</p>
<pre>graph_attrs = dict( projection = 'moll',
                          title = 'gistemp_sat_anom_2.5deg',
                          cmap = 'gist_heat_r',
                          xlabel = 'longitude',
                          ylabel = 'latitude',
                          cblabel = 'std dev of temp anomalies')
graph_obj = spatial.SpatialGraph(data_obj, **graph_attrs)</pre>
<p>And to finish off and create the image:</p>
<pre>graph_obj.ccpfig(masked, 'gistemp_demo')</pre>
<p><img src="http://climatecode.org/wp-content/uploads/2011/07/ccp-vis-demo-2.png" /></p>
<p>This code is also available for download at: <a href='https://code.google.com/p/ccp-viz-toolkit/source/browse/scripts/demo_gis.py' target='_blank'>https://code.google.com/p/ccp-viz-toolkit/source/browse/scripts/demo_gis.py</a></p>
<p>Besides generating a figure, this code also lays out the rough structure of the project at the moment. There are three main, and somewhat independent, parts to the library: handling the data, doing some number crunching, and making pretty pictures. The plan for the web interface, which is my current focus, is to glue this library to some JavaScript or similar, so that anybody can go on to the commonclimate website, pick their data, throw in some attributes, and generate their own figures.</p>
]]></content:encoded>
			<wfw:commentRss>http://climatecode.org/blog/2011/07/first-code-for-common-climate-project/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Science as a Public Enterprise</title>
		<link>http://climatecode.org/blog/2011/06/science-as-a-public-enterprise/</link>
		<comments>http://climatecode.org/blog/2011/06/science-as-a-public-enterprise/#comments</comments>
		<pubDate>Tue, 14 Jun 2011 06:00:56 +0000</pubDate>
		<dc:creator>Nick Barnes</dc:creator>
				<category><![CDATA[News]]></category>

		<guid isPermaLink="false">http://climatecode.org/?p=441</guid>
		<description><![CDATA[The Royal Society is conducting a policy study entitled &#8216;Science as a Public Enterprise&#8217;, focused on public engagement with science. This goes far beyond the traditional notions of &#8216;engagement&#8217;, in which the high priesthood of science may offer occasional public &#8230; <a href="http://climatecode.org/blog/2011/06/science-as-a-public-enterprise/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>The Royal Society is conducting a policy study entitled <a href="http://royalsociety.org/policy/sape/">&#8216;Science as a Public Enterprise&#8217;</a>, focused on public engagement with science.  This goes far beyond the traditional notions of &#8216;engagement&#8217;, in which the high priesthood of science may offer occasional public lectures and open days, write pop-science books, or contribute to TV documentaries.  There is a growing realisation across science that modern communication media allow much more direct involvement: the public can see, grasp, and take part in scientific research to a much greater extent than has ever been possible before.  There is also a sense that there are practical arguments for increased transparency &#8211; that it would benefit scientists as well as the public &#8211; as well as a moral case (the public purse funds most research, and the public are often profoundly affected even by private science &#8211; for instance medical science, or models of oil dispersal in deep-water blowouts).  The Climate Code Foundation, of course, welcomes this study, which relates directly to our goal of improving the public understanding of climate science.</p>
<p>The study group is led by Geoffrey Boulton, an eminent geologist.  As part of the study, there was a Town Hall Meeting on Wednesday (2011-06-08), looking specifically at &#8216;Open Science&#8217;, which David Jones and I (Nick Barnes) attended.  It was divided into two panel sessions, &#8220;<i>Why</i> should science be open?&#8221; and &#8220;<i>How</i> should science be open?&#8221;  The meeting was addressed by Paul Nurse, president of the Royal Society, by Mark Walport, director of the Wellcome Trust, and by Philip Campbell, editor-in-chief of <i>Nature</i>.  Many more of the great and good of UK science were in attendance, either on the panels or contributing from the floor.  The discussion was interesting, and for the most part was both constructive and well-informed.</p>
<p>Mark Walport described the case for open science as &#8220;obvious and powerful&#8221;, and summarised arguments for and against.  He dismissed many of the arguments against as weak and insubstantial, but identified the following as stronger: </p>
<ul>
<li>there are no incentives for greater openness;</li>
<li>the global equity question: is free access necessarily fair access?</li>
<li>(especially in medical science) what about the confidentiality of the subjects?</li>
<li>what about privately-funded science, or science with national-security ramifications?</li>
<li>competitiveness: won&#8217;t groups or countries practising open science be disadvantaged?</li>
</ul>
<p>He emphasized that even these last two arguments can&#8217;t stand in the way of an urgent and necessary change: negative results of medical trials <i>must</i> be published.</p>
<p>Geoffrey Boulton contrasted his first ever science publication, which had six data points, with a more recent paper of his which has six <i>billion</i>.  Many modern papers cannot include all their data, and act instead almost as an advertisement for the dataset, where the real science value lies.  I would argue that the same metaphor applies to papers on computational science: the paper cannot include a precise description of the computational methods, and should act as a pointer to the underlying code.</p>
<p>Stephen Emmott, head of computational science at Microsoft Research, said that we need a revolutionary change to maintain reproducibility and falsifiability in a world of model-based science.  He emphasized the importance of open code: much research cannot be reproduced without the code.  He referred to a genomics study (possibly the same one described in <a href="http://www.nature.com/nature/journal/v470/n7334/full/470305b.html">this Nature editorial</a>) in which the findings of most studies could not be reproduced due to a lack of openness.</p>
<p>Geoffrey Boulton rounded off the first session by encouraging us to ask of open science &#8220;Is it worth the candle?&#8221;, suggesting that the answer is decidedly yes, and pointing out that we will probably have to do it anyway.</p>
<p>The &#8220;How?&#8221; session was introduced by Philip Campbell, who emphasized three key questions:</p>
<ul>
<li><i>Credit:</i> How can the systems of acknowledgement, reward, professional advancement, and institutional assessment in science be evolved to properly recognise contributions other than the traditional peer-reviewed paper?  Creating and curating datasets, writing and maintaining code, promoting public engagement, all must be recognised and rewarded.</li>
<li><i>Cost:</i> Creating and especially curating datasets is expensive, especially in fields such as particle physics and metagenomics where data volumes are enormous. Who is going to pay? Funding agencies need to step up for this.  Opening, curating, and maintaining software resources also costs money (although much less) and funding agencies have failed to provide for it.</li>
<li><i>Community:</i> Each scientific community must decide on the appropriate level of openness.  For example, data embargo times might vary from field to field according to the personal and institutional investment made in obtaining data.  In many fields, openness is increasing.  In genomics, researchers who wanted data embargoes have been persuaded to accept credit instead: <i>open science wins citations.</i></li>
</ul>
<p>Timo Hannay, of Digital Science (a division of Macmillan publishing), is working to provide better software tools to working scientists.  He pointed out that almost all scientists have better software tools for managing their music collection or family snapshots than they do for managing their data and other digital resources.</p>
<p>From the floor, Peter Murray-Rust expressed the view that some groups can have valuable vested interests in the status quo, and be opposed to openness regardless of the interests of society or the views of scientists.  Sometimes gradual &#8220;evolution&#8221; is possible, but sometimes a &#8220;fracture&#8221; is necessary.</p>
<p>The last comment I recorded was from Cameron Neylon, a biophysicist and open research expert who sits on our advisory committee (as does Peter Murray-Rust).  He said that funding bodies should demand progress, but can&#8217;t move out in front of their scientific communities.  So communities have to believe that providing research outputs adds value.  However, institutions and agencies &#8220;should <i>never</i> spend money <i>restricting</i> access&#8221; to scientific data or information.</p>
<p>In the coffee break after the meeting, I met Philip Campbell, who invited me to attend a meeting to discuss journal software publication policies.  I very much look forward to that.  Geoffrey Boulton encouraged us to make a submission to the study group, which we will certainly do.  I also spoke briefly to Nick von Behr of the Royal Society, and to Timo Hannay, and hope to be able to meet each of them again in future.</p>
<p>One last point was raised, although I can&#8217;t recall by whom: access to science ought not to be limited according to perceived interest.  Almost any scientific topic is of interest to some proportion of the public, and modern technology &#8211; in particular the web &#8211; allows those specific people to engage directly in the science, without the wasted effort and limits that traditional &#8216;broadcasting&#8217; media would impose.</p>
<p>This has a direct bearing on citizen science &#8211; another important aspect of &#8216;Science as a Public Enterprise&#8217;, not really touched on by this meeting.  There are dozens of amazing citizen science projects, covering <a href="http://galaxyzoo.org/">astrophysics</a>, <a href="http://climateprediction.net/">climate prediction</a>, <a href="http://malariacontrol.net/">malaria control</a>, and <a href="http://oldweather.org/">historical climatology</a>, among many other topics.  Some simply allow the public to donate the spare computational power of their own machines.  In others, participants contribute their own intelligence (for instance, to discriminate between different galaxy types, or to read and transcribe old hand-written ships&#8217; logs).  In either case, a large amount of excellent science is being done with the help and participation of the public, which would not be possible in any other way.</p>
<p>Overall, a constructive and interesting meeting.  I look forward to future activities of the study, and to seeing its conclusions.  It is easy to be impatient at the pace of change in large organisations or communities, but this change, however much delayed, is definitely coming.</p>
<p><small><i>More information about the Royal Society study <a href="http://royalsociety.org/policy/sape/">here</a>.</i></small></p>
]]></content:encoded>
			<wfw:commentRss>http://climatecode.org/blog/2011/06/science-as-a-public-enterprise/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

