Tuesday, August 26, 2008

Reflections on Google Summer of Code 2008

Background

Back in April, we applied Hackystat to the 2008 Google Summer of Code program. We didn't know too much about it, other than that it provided a chance for students to be funded by Google to work on open source projects for the summer.

With great glee, we learned in March that Hackystat was accepted as one of the 140 projects sponsored by Google. The next step was to solicit student applications, which we did by sending email to the Hackystat discussion lists. We ended up with around 20 applications. There were a few that were totally off the wall from people who had no clue what Hackystat was, and a few others that were disorganized, incomplete, or otherwise indicative of a student who would probably not be successful. But, a good dozen of the 20 applications appeared quite promising and deserving of funding.

Google then posted the number of "slots" for each project--the maximum number of students that they would support. Hackystat got 4 slots. The number of slots is apparently based partially on the number of applications received by the project, and partially on the organization's past track record with GSoC. Hackystat had no prior track record, and couldn't compete with the number of applications for, say, the Apache Foundation. The GSoC Program Administrator answered the anguished pleas of new organizations who got less slots than they wanted by basically saying, "Look, we don't want to give you a zillion slots and then have half a zillion projects fail. Do a good job this year with the slots you were given and reapply next year." Sound advice, actually.

We then started ranking the applications to figure out which four students should be funded. It was difficult and frustrating, because there were many good applications. At the end, we came up with four students who we felt had a combination of interesting project ideas and a good chance of success based on their skills and situations.

We were right. Three out of four of the students successfully completed their projects, and the fourth student had to drop out of the program due to sudden illness, which no one could have foreseen.

GSoC requires each student to have a mentor. This summer, Greg Wilson of the University of Toronto and I each took two students. Greg's students were physically sited at the University of Toronto, so he was able to have face-to-face interactions. My students were in China.

Student support took several forms over the summer. First, there was email and the Hackystat developer mailing lists. At the beginning of the summer, I received a few emails from students that I redirected to the mailing list, so that other project developers could respond, and also because the question asked was of general Hackystat interest. Fairly quickly, the students caught on, and started posting most of their general-interest questions to the list. I think this was one conceptual hurdle for the students to get over: they were not in a relationship just with me or Greg, but also with the entire Hackystat developer and user community. While there were certainly issues pertaining to the GSoC program that they discussed privately with their mentors, they were also "real" Hackystat developers and needed to learn how to interact with the wider community. All of the students acclimated to their new role.

We also requested that the students maintain a blog and post an entry at least once a week that would summarize what they'd been working on, what problems they'd run into, and what they were planning to do next. This was also pretty successful. You can see Shaoxuan's, Eva's, and Matthew's blogs for details. Interestingly, the Chinese students found they could not access their (U.S. created) blogs once they were in China, and so had to use Wiki pages.

Finally, I also set up weekly teleconferences via Skype with the two students I was mentoring in China. This was a miserable failure, probably due to my own lameness. Despite the fact that I live in a timezone (HST) shared by very few of my software engineering colleagues, and thus have lots of experience with multi-timezone teleconferencing, the Hawaii-China difference just totally threw me. The international dateline did not help matters. At any rate, we simply fell back to asynchronous communication via blogs and email and that worked fine.

For source code and documentation hosting, we used two mechanisms. The Hackystat project uses Google Project Hosting, and so the students I mentored used this service. Greg is the force behind Dr. Project, and so the students he mentored used that service. As part of the wrapup activities, his students ported their work to Google Project Hosting to conform to the Hackystat conventions.

Results

So, what did they actually accomplish? Matthew Bassett created a sensor for Microsoft Team Foundation server. Here's a screen shot of one page where the user can customize the events the sensor collects:

The sensor itself is available at: http://code.google.com/p/hackystat-sensor-tfs/.

Eva Wong worked on a data visualization system for Hackystat based on Flare.

Her project is available at: http://code.google.com/p/hackystat-ui-sensordatavisualizer/.

Finally, Shaoxuan Zhang worked on multi-project analysis mechanisms for Hackystat using Wicket. Here is a screen shot of the Portfolio page:

His project is available at: http://code.google.com/p/hackystat-ui-wicket/.

Reflections

So, what makes for a successful GSoC?

First, and most obviously, it's important to have good students. "Good", with respect to GSoC, seems to boil down to two essential attributes: a passion for the problem, and the ability to be self-starting. (As long as the student "starts", the mentors and other developers can help them "finish"). It was delightful to read Matthew's blog entries about Team Foundation Server: he obviously likes the technology and enjoyed digging into its internals. At one point in the summer, Shaoxuan sent me an email in which he apologized that he had not been working much for the past week because he just got married, but he'd work extra hard the next week to catch up! We clearly had passionate students.

It also helps to have good mentors. In the Hackystat project, we have an embarrassment of riches on this front, since the project includes a large number of academics who mentor as part of their day jobs. In the end, we only needed two active mentors for the four students, but we easily had mentoring capacity for a couple dozen students.

Establishing effective communication systems is critical. Part of this is technological. We found that email and blogs worked well. Skype did not work well for me, but that was probably operator error on my part. Greg had the additional opportunity to use face-to-face communication, which is certainly helpful but not at all necessary to success. The other part is social. Most of our students needed to learn over the summer to: (a) request help quickly when they ran into problems, and (b) direct their question to the appropriate forum: either the Hackystat developer mailing list or privately to a mentor via email. This wasn't particularly difficult or anything, it was just a part of the process of understanding the Hackystat project culture.

I think I would have more insightful "lessons learned" had any of the student projects crashed and burned, but fortunately for the students (and unfortunately for this blog posting), that simply didn't happen.

For the Hackystat project, participation in GSoC this summer has had many benefits. Clearly, we'll benefit from the code that the students have created and which now is publically visible in the Hackystat Component Directory. We are crossing our fingers that the students will continue to remain active members of the Hackystat community.

GSoC has also helped to create a new "center" of Hackystat expertise at the University of Toronto. We hope to build upon that in the future.

GSoC also catalyzed a number of discussions within the Hackystat developer community about the direction of the project and how students could most effectively participate. These insights will have long term value to the project.

I believe we are now significantly more skillful at mentoring students. I hope we get a chance to participate in GSoC 2009, and that we can build upon our experiences this summer next year.

No comments: