How hard a problem could you solve in only three days? Who would you choose to help you do it?
The ICFP Programming Contest is an international programming competition organized in conjunction with the International Conference on Functional Programming, which is an annual academic conference about programming languages. Each year, teams from around the world compete in the ICFP contest to demonstrate the supremacy of their favorite programming language by solving a challenging problem over a 72-hour period.
The contest is organized by a different institution every time, typically a university, and the organizers work well in advance and sometimes take years to prepare the contest problem, which is kept secret until it’s time for the contest to begin. On the appointed day, which is usually a Friday in June or July, the organizers unveil an elaborate problem description on the Internet. When the problem is released — at a time that may be convenient or wildly inconvenient, depending on the difference between the organizers’ time zone and the time zone one’s team happens to be in — the teams go to work, using whatever tools they like to solve the task at hand. Past contests have challenged competitors to control a Mars rover that has to get to a home base while avoiding hostile aliens, design an ant colony capable of defending itself from invaders, and decode a string of letters resembling DNA (not coincidentally, the letters chosen for the bases were “I”, “C”, “F”, and “P”) and “resequence” it to draw a picture. The problems often include strange and hilarious twists, and the problem descriptions may be filled with in-jokes and cute asides.
The ICFP contest has been going on for twelve years, but the first time I heard of it was in 2007. I was living in Portland, Oregon at the time, and that summer, four of my Portland friends competed in the contest and were quite successful, coming in 28th out of 869 teams worldwide. (The contest scoring works differently each year and is determined by the organizers, but in 2007 the scoring procedure happened to be instantaneous and automatic. Teams could see how they were doing as the contest progressed by uploading their solutions to a website where they were automatically scored.)
It was great to see Jesse, Josh, Kim, and Paul do well in the contest, but it was also a painful wake-up call for me. They had invited me to work with them in the early hours of the contest (literally, the early hours: the difference in time zones was such that it started at 3 a.m. for us that year), and I had tried, I really had. Before long, though, it was clear that they were going to leave me in the dust. Part of it was that they’d worked as a group before and knew each other’s styles well. The rest of it was me: I knew that I was a smart person, but as a programmer, I couldn’t keep up with them.
I was tired of being a mediocre programmer. It was time to get to work. That Christmas, my boyfriend, Alex, gave me the book Structure and Interpretation of Computer Programs, a classic computer science textbook, and we started reading the book together and doing all the exercises. My goal was to finish all 46 of the exercises from the first chapter of the book before I attempted the ICFP contest again. All through March, April, May, and June of 2008, I worked on the exercises, and on the day before the contest started in early July, I finally finished the last SICP exercise that had been giving me trouble. Alex and I were ready to compete. We dubbed ourselves Team K&R, a name which alluded to the first letters of our last names as well as to another well-known programming book, and for 72 hours, we furiously wrote code (with breaks for eating, sleeping, and occasional bouts of smooching).
That year, there was no automatic scoring; instead, there was a series of heats held after the 72-hour submission period ended. After each heat, the competitors were ranked by their scores, and some fraction advanced to the next heat, until a single winner remained. How did we do? Well, it took eleven heats in all, and our team was eliminated in the fourth heat; another way of looking at it is that we came in 174th out of a total of 282 teams, or that 108 teams were eliminated before we were. Not terrible, but not exactly anything to write home about, either. Nevertheless, I was ecstatic. In just one year, I had gone from feeling incapable of participating at all to actually submitting a contest entry that had been moderately competitive! At the time, I was just about to move across the country and start graduate school for computer science, and the ICFP contest gave me the jolt of self-assurance that I needed to feel confident returning to school after four years of rather middle-of-the-road code-monkey jobs. I got to give a talk at Code n’ Splode, a monthly social event attended by many of my programmer friends in Portland, about how much fun I’d had doing the contest, and when I arrived at school, I was able to hit the ground running in my classes, because some of the things I needed to know were already fresh in my mind from having been in the contest. It was awesome.
Since then, for better or for worse, the ICFP contest has become a small but significant part of my identity. So, when it came time for this year’s contest, I had high expectations for myself. I wanted to make as big a leap from 2008 to 2009 as I had from 2007 to 2008. I knew that a year of grad school had made me a better programmer. I already felt embarrassed by the code I’d written in the contest in 2008, and by how much time and effort it had taken to do things that now seemed easy, which was an excellent sign! And I was excited about the chance to try an ICFP-scale problem on for size again and see how I did now that I had a year of school under my belt. So I took a long weekend off from my summer research project to fly to Atlanta, where Alex was living, and spend the weekend working on the contest with him.
As usual, the contest spanned 72 hours, from a Friday afternoon (June 26th, in this case) to the following Monday afternoon. Alex took Friday afternoon off work, and we sat around his kitchen table drinking coffee, waiting for the problem specification to be released online at the appointed hour, and getting extremely fidgety. When the moment arrived, we tried to download the problem spec and, of course, failed, because thousands of other would-be competitors were simultaneously trying to do precisely the same thing and overwhelming the poor web server with requests. Undaunted, Team K&R mashed the “reload” buttons on our browsers (exacerbating the problem for everyone else, I’m sure) and finally had the problem spec a few minutes later.
The spec told us that for this year’s contest, we would be writing programs to control (simulated) satellites moving through (simulated) outer space. Our satellites would have to perform a series of four increasingly difficult tasks: move from lower to higher orbit around a planet, meet up with another orbiting satellite, and so on, all of which were intended as “training missions” to prepare us for the eventual “Operation Clear Skies”, in which we would program a satellite to clean up space debris in orbit around Earth. So far, this was all sounding like an appropriately nerdy ICFP contest problem. We read on.
For each of the four tasks we had to complete, we had been provided with a rather enigmatic binary file. We couldn’t read the data in these files, but from reading the spec, we came to understand that they contained little computer programs — series of instructions that could simulate the physics of the satellites and celestial bodies for each task, accept input from us to control the satellites, and eventually produce a score according to how well or poorly we had accomplished the task. These little simulator programs were going to be indispensable — but we weren’t going to be able to run them directly on our computers. Instead, we would have to write our own program that would know how to run the programs that were encoded in the four binary files. Before we could start using the simulators to help us work out how to move things around in space, we would have to implement a virtual machine, or VM, and then run the simulators on that. Now things were getting interesting.
Alex spent the first few hours of the contest digging into the four binary files and figuring out how to decompile them into a format that we humans might be able to understand (and also, he hoped, that our as-yet-nonexistent VM would be able to run). Thankfully, we’d been provided a specification for the virtual machine we were supposed to build, and it was quite straightforward and complete. While Alex worked, I was busy absorbing the fifteen-page spec. Before long, I was having one of those incredibly trite revelations that I suspect I’m doomed to keep having for the rest of my programming career. Let me explain.
Everyone who uses computers deals with binary files all the time, but few of us open them up to peer at their insides. Many programmers have had the experience of accidentally opening up a binary file in a program designed to edit text. Typically, we see garbage, feel momentarily disoriented, realize our mistake, and quickly close the file again. Afterward, we may not think twice about it. If we do, we may have a sense of having seen something we shouldn’t have, of having blundered behind a curtain.
In such situations, the file we actually intended to open was, in all likelihood, a “plain text” file. Wikipedia speaks of plain text as being “unformatted”, but the scare quotes are there for a reason; in fact, the data in a plain text file is highly organized. We may think of plain text files as being fundamentally different from binary files, but as Wikipedia says, the distinction is arbitrary. A plain text file is made of data encoded in binary numerals, just like any other file on your computer. (We could go into how “file” is an abstraction, too, but I don’t really want to touch that one right now.) The only thing special about a “text” file is that it happens to be in a format that programs called text editors can process easily and turn into something that’s convenient for (some) humans to read.
So, when you open up that binary file in your text editor and see a bunch of garbage, it’s not that your text editor is doing something wrong. To the contrary, it’s faithfully doing its job, displaying the data it’s been told to display in the only way it knows how! It doesn’t know, can’t know, that what you’re seeing is garbage.
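To make the point concrete, here is a minimal Python sketch of the same bytes being read two ways. It is only an illustration: the value 3.0, the little-endian layout, and the variable names are my own assumptions, not anything taken from the contest files.

```python
import struct

# Eight bytes encoding one double-precision float, little-endian.
# (The value 3.0 and the byte order are arbitrary choices for illustration.)
raw = struct.pack("<d", 3.0)

# What a text editor does: treat each byte as a character and display it.
as_text = raw.decode("latin-1")

# What a program that knows the format does: decode the same bytes as a number.
(as_number,) = struct.unpack("<d", raw)

# The reverse also works: four bytes of "text" can be read as an integer.
(text_as_int,) = struct.unpack("<I", b"ICFP")

print(repr(as_text))   # mostly unprintable characters
print(as_number)       # the float we started with
print(text_as_int)     # a perfectly valid, if meaningless, integer
```

Neither reading is more "correct" than the other. The bytes are the same; only the interpretation differs.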
Some readers, I’m sure, are rolling their eyes and thinking “Obviously!” right now, and in retrospect, I am, too. But this realization hit me like a sledgehammer during ICFP. It took me an embarrassingly long time to understand what the hell Alex was doing, simply because it hadn’t occurred to me that we could write a program to process binary data just as well as we could write a program to process text, and, moreover, that there was absolutely no reason why we shouldn’t do such a thing. In fact, my misunderstanding had been not so much technical as psychological. I couldn’t follow what Alex was doing because I didn’t really believe that binary files were hackable, despite the fact that I had been attempting to preach the all-things-are-hackable gospel for years. When I saw that processing binary data was not only possible, but in fact utterly natural and unexceptional and reasonable, I caught a glimpse of enlightenment for a second.
Once I had made that leap, it followed that as long as we were processing binary data, we would do well to turn it into text, because while the computer didn’t care about the difference, it would certainly help us understand the data and therefore be able to write the rest of the VM more easily. By the time I had wrapped my head around all of this and started reading about how to deal with binary input and output in Scheme, which is my programming language of choice and the one that we had been planning to use for the whole contest, Alex was already under way with his implementation of the decompiler in C — a language which, notwithstanding our team name, I was pretty terrible at.
This part of Friday evening was the worst part of the whole contest for me. It had taken me a lot of effort to understand something that had been immediately apparent to Alex, which was a blow to my confidence, and to top it off, he’d written a bunch of code that I couldn’t really help with at all. This was not exactly my idea of a good time. But Alex, anticipating my frustration, started a much-friendlier-to-me Python version that we could work on together, and between us, we finished the decompiler. (The “obf” in its name stands for “Orbit binary format”, “Orbit” being the name of the still-nonexistent-at-this-point VM on which the simulators would eventually need to run.) Then, at Alex’s urging, we went outside for a quick three-mile run around nearby Piedmont Park. If he hadn’t suggested going out for a run, I think I would have sat there and tried to keep coding, and I probably would’ve gotten frustrated all over again. But running cleared my head and dissolved my remaining disgruntlement, and by the time we got back, I was pretty sure of how I was going to write the VM.
What followed was one of the best parts of the contest. Alex had gotten the decompiler to turn the binary files into something that looked remarkably similar to parenthesized assembly language. From there, I realized, all I had to do was write a little interpreter — something I’d had a lot of practice with. Interpreters were the bread and butter of what I had been studying in school for the past year, and I was delighted to have a chance to apply something I had learned. In short order, I had the VM up and running, and we were successfully crashing simulated satellites into the simulated earth! We were elated.
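For readers who haven't written one, a "little interpreter" over parenthesized, assembly-like instructions can be very short. The sketch below is in Python and is not our contest code. The opcodes, the rule that each instruction stores its result at its own address, and the sample program are all hypothetical, made up for illustration rather than taken from the Orbit VM spec.

```python
import math


def run(program, memory, inputs):
    """Execute one pass over `program`, a list of (opcode, *operands) tuples.

    Hypothetical convention: instruction i stores its result in memory[i],
    and operands refer to memory addresses (or port numbers for I/O).
    """
    outputs = {}
    for addr, (op, *args) in enumerate(program):
        if op == "noop":
            continue
        elif op == "input":
            memory[addr] = inputs[args[0]]
        elif op == "add":
            memory[addr] = memory[args[0]] + memory[args[1]]
        elif op == "mult":
            memory[addr] = memory[args[0]] * memory[args[1]]
        elif op == "sqrt":
            memory[addr] = math.sqrt(memory[args[0]])
        elif op == "output":
            outputs[args[0]] = memory[args[1]]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return outputs


# A made-up program: distance from the origin, given x and y on input ports.
program = [
    ("input", 0),      # memory[0] = x
    ("input", 1),      # memory[1] = y
    ("mult", 0, 0),    # memory[2] = x * x
    ("mult", 1, 1),    # memory[3] = y * y
    ("add", 2, 3),     # memory[4] = x*x + y*y
    ("sqrt", 4),       # memory[5] = sqrt(x*x + y*y)
    ("output", 0, 5),  # output port 0 = memory[5]
]

memory = [0.0] * len(program)
result = run(program, memory, {0: 3.0, 1: 4.0})
print(result)
```

The whole thing is one loop and one dispatch on the opcode, which is why having the instructions in a readable, text-like form made the job feel so manageable.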
(Next time: the thrilling conclusion! To be continued in a future post.)