Short description: I propose a web-based tool supporting dynamic visualization of spatially and temporally variable surface temperature data from ccc-gistemp. The tool will support dynamic client-side interaction including navigation of climate data and visualization at variable resolutions and throughout history.
I am proficient in Matlab and Mathematica. I greatly prefer Python and regularly use numpy and scipy through Python. I have extensive experience in data mining and numerical processing – most of it in relation to computational genetics and its massive reams of sequence and genotype data.
I am both a producer and consumer of open-source software. I am most familiar with and favor Linux and its suite of open-source applications.
I have published extensively in graduate school. Please see my CV for a full list. The publication most pertinent to this project is “Dynamic Visualization and Comparative Analysis of Multiple Collinear Genomic Data.”
I have not had any formal education in climate science outside what one learns in introductory college-level science courses. I’ve developed a personal interest based on what I’ve read and heard. I’ve performed some rudimentary data mining and analyses based on publicly available data to assess climate trends in the past century and a half. I’m very much the kind of person who wants to prove something for themselves rather than taking statements at face value. This led me to want to explore climate change for myself given the, perhaps unfounded, controversy.
I intend to treat the Google Summer of Code as an independent summer internship. Climate change is a field I am interested in and one where I think I can apply my skills, and of course I would love to be paid so that I can afford the time to do it. I am really excited about the proposed project and I want to make sure it’s done and done right. I enjoy making dynamic web-based tools like this, and I think it will prove to be a great way to expose these kinds of climate data and problems to lay people and climate scientists alike.
Aside from GSoC (if I am so lucky), my plans this summer include vacations with my family and my fiance’s family totaling two weeks. I also plan to continue progress toward my degree, including a part-time research assistantship. I will not be taking any courses, and my primary focus for the summer will be GSoC and this project. I have completed similar projects (see my CV) and I am confident this project can be completed to a high standard in the given time.
I propose a web-based tool supporting dynamic visualization of spatially and temporally variable surface temperature data. Time-permitting, the tool may be made extensible to the generic set of geographical (GIS) data.
Many web-based visualizations and tools exist pertaining to climate and/or geographical data, most of which offer a map-based UI conceptually similar to what I propose. However, each of these has limited functionality, operates on a fixed data set, and often performs poorly. To my knowledge, there is no generic online GIS viewer, much less one which is open source and can be easily extended to the format specific to GISTEMP. I plan to create (1) a tool exposing GISTEMP data, (2) a more dynamic, higher-performance, and more responsive tool that allows fluid user interaction, (3) a greater range and flexibility of data and visualization than is possible with what is essentially a gridded image (the technique used by Google Maps and many existing viewers), and (4) if time permits, an open-source tool for visualization of generic GIS data.
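As a rough illustration of how data could be served at a resolution matched to the zoom level, rather than as a fixed image, the sketch below averages blocks of a fine grid into a coarser one. The function name `coarsen`, the list-of-lists grid layout, and the use of `None` for missing cells are assumptions made for this example; they are not taken from the GISTEMP formats. The mean is unweighted, so a real version for a latitude-longitude grid would probably need to weight cells by area.

```python
def coarsen(grid, factor):
    """Average factor x factor blocks of a 2-D grid (hypothetical layout).

    grid   -- list of rows, each a list of floats; None marks a missing cell
    factor -- number of fine cells per coarse cell along each axis
    Blocks at the edge may be smaller; blocks with no data yield None.
    """
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            # Collect the non-missing values that fall inside this block.
            block = [
                grid[i][j]
                for i in range(r, min(r + factor, rows))
                for j in range(c, min(c + factor, cols))
                if grid[i][j] is not None
            ]
            coarse_row.append(sum(block) / len(block) if block else None)
        coarse.append(coarse_row)
    return coarse
```

A server could precompute a few such levels and return the one closest to the client’s current zoom, which keeps each response small.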
April 23rd – May 21st
Exploration and parsing of GISTEMP data and/or any data that will be exposed by the tool.
May 21st – June 11th (1st quarter)
Setup of server architecture and data, server-side data mining, filtering, and exposure by JSON/AJAX
June 11th – July 9th (2nd quarter)
Client-side map-based representation including panning and zooming, JSON/AJAX interface with server
July 9th – Midterm evaluation
At this stage, we should at least meet the baseline of existing, static, map-based visualization tools
July 9th – July 30th (3rd quarter)
Advanced client-side manipulation including off-map annotation, “side” widgets, and navigation through time
July 30th – August 13th (4th quarter)
Polishing UI, user manual (if I’ve done my job, it won’t be long), documentation, and cross-browser testing
August 13th – August 20th
Final testing and documentation
August 20th – Final evaluation
Data-mining, filtering, and exposure via a JSON/AJAX interface to web-based client.
While the server architecture and setup will be largely industry-standard, the primary server-side deliverable will be the Python code that serves client requests (JSON/AJAX), mines the requisite data, and returns a response.
Visualization widgets, primarily dynamic map-based display as described above
Data mining and filtering components of the server will interface with existing data files/formats and code generating annotation or processed/post-analysis data. Client software consisting of the website/visualization tool(s) will be largely independent and may stand alone. It may also be integrated into the existing site at climatecode.org or any other location as a stand-alone widget.
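To make the server-side deliverable more concrete, here is a minimal sketch of the request, filter, and JSON response cycle described above. Everything in it is an assumption for illustration: the parameter names (`south`, `north`, `west`, `east`, `year`, `month`), the record layout, and the placeholder values in `SAMPLE_CELLS`, which are made up and are not real GISTEMP data. The actual code would read from the existing data files and sit behind a web server.

```python
import json

# Placeholder records (made-up values, NOT real GISTEMP data).
# A real implementation would parse these from the existing data files.
SAMPLE_CELLS = [
    {"lat": 45.0, "lon": -93.0, "year": 1990, "month": 1, "anomaly": 0.42},
    {"lat": 45.0, "lon": -93.0, "year": 1990, "month": 2, "anomaly": 0.10},
    {"lat": -30.0, "lon": 25.0, "year": 1990, "month": 1, "anomaly": -0.05},
]


def filter_cells(cells, south, north, west, east, year, month):
    """Return the cells inside a bounding box for one year and month."""
    return [
        cell
        for cell in cells
        if south <= cell["lat"] <= north
        and west <= cell["lon"] <= east
        and cell["year"] == year
        and cell["month"] == month
    ]


def handle_request(params, cells=SAMPLE_CELLS):
    """Build a JSON response body from query parameters (all strings)."""
    selected = filter_cells(
        cells,
        south=float(params["south"]),
        north=float(params["north"]),
        west=float(params["west"]),
        east=float(params["east"]),
        year=int(params["year"]),
        month=int(params["month"]),
    )
    return json.dumps({"count": len(selected), "cells": selected})
```

The client would issue an AJAX request with these parameters whenever the map is panned, zoomed, or moved through time, and redraw from the returned cells.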
The tool/website will promote the goals of the Climate Code Foundation by providing an intuitive and informative interface exposing the GISTEMP data at a level comprehensible and usable for anyone from lay persons with a vague interest in climate change to climate scientists.
This project requires no travel.
The most pressing concern is simple exposure and availability of the data. I expect accessing and parsing the existing data to be the biggest hurdle to the success of this project. I will use the month before the official start on May 21st to familiarize myself with the data and the difficulties of accessing and parsing it, so that these issues are resolved early rather than surfacing later on.
The ideal mentor will be a scientist familiar with the GISTEMP data, both its technical format and its interpretation in climate research, who knows which data sets and visualizations would be most useful.
Aside from regular mentoring, this project will require the GISTEMP data we plan to include in the tool and a server, physical or virtual, on which to develop the server-side components. I am most familiar with a LAMP setup (Linux, Apache, MySQL, and Python rather than PHP), which is widely available and understood. Of course the GISTEMP data is not a simple MySQL database, so the exact setup will depend on the existing data formats. If the foundation does not have these resources, a more or less LAMP-like architecture is available via Google App Engine or Amazon Web Services (EC2). I can also certainly negotiate a development environment, if not a production server, from the university.