
Tuesday, August 26, 2008

Reflections on Google Summer of Code 2008

Background

Back in April, we applied on behalf of Hackystat to the 2008 Google Summer of Code (GSoC) program. We didn't know too much about it, other than that it provided a chance for students to be funded by Google to work on open source projects for the summer.

With great glee, we learned in March that Hackystat was accepted as one of the 140 projects sponsored by Google. The next step was to solicit student applications, which we did by sending email to the Hackystat discussion lists. We ended up with around 20 applications. There were a few that were totally off the wall from people who had no clue what Hackystat was, and a few others that were disorganized, incomplete, or otherwise indicative of a student who would probably not be successful. But, a good dozen of the 20 applications appeared quite promising and deserving of funding.

Google then posted the number of "slots" for each project--the maximum number of students that they would support. Hackystat got 4 slots. The number of slots is apparently based partially on the number of applications received by the project, and partially on the organization's past track record with GSoC. Hackystat had no prior track record, and couldn't compete with the number of applications for, say, the Apache Foundation. The GSoC Program Administrator answered the anguished pleas of new organizations who got fewer slots than they wanted by basically saying, "Look, we don't want to give you a zillion slots and then have half a zillion projects fail. Do a good job this year with the slots you were given and reapply next year." Sound advice, actually.

We then started ranking the applications to figure out which four students should be funded. It was difficult and frustrating, because there were many good applications. At the end, we came up with four students who we felt had a combination of interesting project ideas and a good chance of success based on their skills and situations.

We were right. Three out of four of the students successfully completed their projects, and the fourth student had to drop out of the program due to sudden illness, which no one could have foreseen.

GSoC requires each student to have a mentor. This summer, Greg Wilson of the University of Toronto and I each took two students. Greg's students were physically located at the University of Toronto, so he was able to have face-to-face interactions. My students were in China.

Student support took several forms over the summer. First, there was email and the Hackystat developer mailing lists. At the beginning of the summer, I received a few emails from students that I redirected to the mailing list, so that other project developers could respond, and also because the question asked was of general Hackystat interest. Fairly quickly, the students caught on, and started posting most of their general-interest questions to the list. I think this was one conceptual hurdle for the students to get over: they were not in a relationship just with me or Greg, but also with the entire Hackystat developer and user community. While there were certainly issues pertaining to the GSoC program that they discussed privately with their mentors, they were also "real" Hackystat developers and needed to learn how to interact with the wider community. All of the students acclimated to their new role.

We also requested that the students maintain a blog and post an entry at least once a week that would summarize what they'd been working on, what problems they'd run into, and what they were planning to do next. This was also pretty successful. You can see Shaoxuan's, Eva's, and Matthew's blogs for details. Interestingly, the Chinese students found they could not access their (U.S. created) blogs once they were in China, and so had to use Wiki pages.

Finally, I also set up weekly teleconferences via Skype with the two students I was mentoring in China. This was a miserable failure, probably due to my own lameness. Despite the fact that I live in a timezone (HST) shared by very few of my software engineering colleagues, and thus have lots of experience with multi-timezone teleconferencing, the Hawaii-China difference just totally threw me. The international dateline did not help matters. At any rate, we simply fell back to asynchronous communication via blogs and email and that worked fine.

For source code and documentation hosting, we used two mechanisms. The Hackystat project uses Google Project Hosting, and so the students I mentored used this service. Greg is the force behind Dr. Project, and so the students he mentored used that service. As part of the wrapup activities, his students ported their work to Google Project Hosting to conform to the Hackystat conventions.

Results

So, what did they actually accomplish? Matthew Bassett created a sensor for Microsoft Team Foundation Server. Here's a screen shot of one page where the user can customize the events the sensor collects:

The sensor itself is available at: http://code.google.com/p/hackystat-sensor-tfs/.

Eva Wong worked on a data visualization system for Hackystat based on Flare.

Her project is available at: http://code.google.com/p/hackystat-ui-sensordatavisualizer/.

Finally, Shaoxuan Zhang worked on multi-project analysis mechanisms for Hackystat using Wicket. Here is a screen shot of the Portfolio page:

His project is available at: http://code.google.com/p/hackystat-ui-wicket/.

Reflections

So, what makes for a successful GSoC?

First, and most obviously, it's important to have good students. "Good", with respect to GSoC, seems to boil down to two essential attributes: a passion for the problem, and the ability to be self-starting. (As long as the student "starts", the mentors and other developers can help them "finish"). It was delightful to read Matthew's blog entries about Team Foundation Server: he obviously likes the technology and enjoyed digging into its internals. At one point in the summer, Shaoxuan sent me an email in which he apologized that he had not been working much for the past week because he just got married, but he'd work extra hard the next week to catch up! We clearly had passionate students.

It also helps to have good mentors. In the Hackystat project, we have an embarrassment of riches on this front, since the project includes a large number of academics who mentor as part of their day jobs. In the end, we only needed two active mentors for the four students, but we easily had mentoring capacity for a couple dozen students.

Establishing effective communication systems is critical. Part of this is technological. We found that email and blogs worked well. Skype did not work well for me, but that was probably operator error on my part. Greg had the additional opportunity to use face-to-face communication, which is certainly helpful but not at all necessary to success. The other part is social. Most of our students needed to learn over the summer to: (a) request help quickly when they ran into problems, and (b) direct their question to the appropriate forum: either the Hackystat developer mailing list or privately to a mentor via email. This wasn't particularly difficult or anything, it was just a part of the process of understanding the Hackystat project culture.

I think I would have more insightful "lessons learned" had any of the student projects crashed and burned, but fortunately for the students (and unfortunately for this blog posting), that simply didn't happen.

For the Hackystat project, participation in GSoC this summer has had many benefits. Clearly, we'll benefit from the code that the students created, which is now publicly visible in the Hackystat Component Directory. We are crossing our fingers that the students will continue to remain active members of the Hackystat community.

GSoC has also helped to create a new "center" of Hackystat expertise at the University of Toronto. We hope to build upon that in the future.

GSoC also catalyzed a number of discussions within the Hackystat developer community about the direction of the project and how students could most effectively participate. These insights will have long term value to the project.

I believe we are now significantly more skillful at mentoring students. I hope we get a chance to participate in GSoC 2009, and that we can build upon this summer's experiences next year.

Thursday, January 17, 2008

Hackystat Version 8

Hackystat Version 8 is now in public release. Hackystat is an open source framework for collection, analysis, visualization, interpretation, annotation, and dissemination of software development process and product data. This eighth major redesign of the system is intended to retain the advantages of previous versions while incorporating significant new capabilities:

RESTful web service architecture. The most significant change in Version 8 is its re-implementation as a set of web services communicating using REST principles. This architecture facilitates several of the features noted below, including scalability, openness, and platform/language neutrality.

Sensor-based. A primary means of data collection in Hackystat is through "sensors": small software plugins to development tools that unobtrusively collect and transmit low-level data to the Hackystat sensor data repository service called the "SensorBase".

Extensible. Hackystat can be extended to support new development tools by the creation of new sensors. It can also be extended to support new analyses by the creation of new services.

Open. All sensors and services communicate via HTTP PUT, GET, POST, and DELETE, according to RESTful web service principles. This "open" API has two advantages: (1) it makes it easy to extend the Hackystat Framework with new sensors and services; and (2) it makes it easy to integrate Hackystat sensors and services with other information services. Hackystat can participate as just one part of an "ecosystem" of information services for an organization.

High performance. The default version of the SensorBase uses an embedded Derby RDBMS for its back-end data store. Initial performance evaluation of this repository, in combination with our multi-threaded client-side SensorShell, has been quite encouraging: we have achieved sustained transmission rates of approximately 1.2 million sensor data instances per hour. The SensorBase is designed to allow "pluggable" back-end data stores. One organization, for example, is using Microsoft SQL Server as the back-end data store.

Scalable. A natural outcome of a web service architecture is scalability: one can distribute services across multiple servers or aggregate them on a single server depending upon load and resource availability. Hackystat is also scalable because each organization can run its own local SensorBase, or even multiple SensorBases if required. Finally, Hackystat can exploit HTTP caching as yet another scalability mechanism.

Secure. While Hackystat maintains a "public" SensorBase and associated services for use by the community, we expect that most organizations adopting Hackystat will choose to install and run the SensorBase and associated services locally and internally. This facilitates data security and privacy for organizations who do not wish sensitive product or process information to go beyond their corporate firewalls.

Platform and language neutrality. Hackystat's implementation as a set of RESTful web services makes it language and platform neutral. For example, a sensor implemented in .NET and running on Windows might send information to a SensorBase written in Java running on a Macintosh, which is queried by a user interface written as a Ruby on Rails web application hosted on a Linux machine.

Open Source. Hackystat is hosted at Google Project Hosting, and distributed among approximately a dozen individual projects. The "umbrella" Hackystat project includes a Component Directory page with links to all of the related subprojects. Since most subprojects correspond to independent Hackystat services, they are typically free to choose their own open source license, though most have chosen the GNU GPL v2.

Out of box support for process and product data collection and analysis. The standard Hackystat distribution includes a variety of process and product data collection mechanisms and analyses, including: editor events and developer editing time, coverage, unit test invocations, build invocations, code issues discovered through static analysis tools, size metrics, complexity metrics, churn, and commits. Of course, the Open API makes it possible to extend this list further.

When we began work on Hackystat in 2001, we thought of it primarily as a software metrics framework. Seven years later, we find that vision limiting, because it tends to focus one on the collection and display of numbers. Our vision for Hackystat now is broader: we believe that the collection and display of numbers is just the first step in an ongoing process of collaborative sense-making within a software development organization. An organization needs numbers, but it also needs ways to get those numbers to the right people at the right time. More importantly, it needs ways to incrementally interpret, re-interpret, and annotate those numbers over time to build up a collective consensus as to their meaning and implications for the organization. Our goal for Hackystat Version 8 is to be an effective infrastructure for participation in the broader knowledge gathering and refinement processes of an organization, or even the software development community as a whole. If successful, it can play a role in creating new mechanisms for improving the collective intelligence of a software development group.

Friday, December 14, 2007

If Collective Intelligence is the question, then dashboards are not the answer

I've been thinking recently about collective intelligence and how it applies to software engineering in general and software metrics in particular.

My initial perspective on collective intelligence is that it provides an organizing principle for a process in which a small "nugget" of information is somehow made available to one or more people, who then refine it, communicate it to others, or discard it as the case may be. Collective Intelligence results when mechanisms exist to prevent these nuggets from being viewed as spam (and the people communicating them from being viewed as spammers), along with mechanisms to support the refinement and elaboration of the initial nugget of information into an actual "chunk" of insight or knowledge. Such chunks, as they grow over time, enable long-term organizational learning from collective intelligence processes.

Or something like that. What strikes me about the software metrics community and literature is that when it comes to how "measures become knowledge", the most common approach seems to be:
  1. Management hires some people to be the "Software Process Group"
  2. The SPG goes out and measures developers and artifacts somehow.
  3. The SPG returns to their office and generates some statistics and regression lines.
  4. The SPG reports to management about the "best practices" they have discovered, such as "The optimal code inspection review rate is 200 LOC per hour".
  5. Management issues a decree to developers that, from now on, they are to review code at the rate of 200 LOC per hour.
I'm not sure what to call this, but collective intelligence does not come to mind.

When we started work on Hackystat 8, it became clear that there were new opportunities to integrate our system with technologies like Google Gadgets, Twitter, Facebook, and so forth. I suspect that some of these integrations, like Google Gadgets, will turn out to be little more than neat hacks with only minor impact on the usability of the system. My conjecture is that the Hackystat killer app has nothing to do with "dashboards"; most modern metrics collection systems provide dashboards and people can ignore them just as easily as they ignore their web app predecessors.

On the other hand, integrating Hackystat with a communication facility like Twitter or with a social networking application has much more profound implications, because these integrations have the potential to create brand new ways to coordinate, communicate, and annotate an initial "nugget" generated by Hackystat into a "chunk" of knowledge of wider use to developers. It could also work the other way: an anecdotal "nugget" generated by a developer ("Hey folks, I think that we should all run verify before committing our code to reduce continuous integration build failures") could be refined into institutional knowledge (a telemetry graph showing the relationship between verify-before-commit and build success), or discarded (if the telemetry graph shows no relationship).

Thursday, December 6, 2007

Social Networks for Software Engineers

I've been thinking lately about social networks, and what kind of social network infrastructure would attract me as a software engineer. Let's assume, of course, that my development processes and products can be captured via Hackystat and made available in some form to the social network. Why would this be cool?

The first reason would be because the social network could enable improved communication and coordination by providing greater transparency into the software development process. For example:
  • Software project telemetry would reveal the "trajectory" of development with respect to various measures, helping to reveal potential bottlenecks and problems earlier in development.
  • Integration with Twitter could support automated "tweets" informing the developers when events of interest occur.
  • An annotated Simile/Timeline representation of the project history could help developers understand and reflect upon a project and what could be done to improve it.

I'm not sure, however, that this is enough for the average developer. Where things get more interesting is when you realize that Hackystat is capable of developing a fairly rich representation of an individual developer's skills and knowledge areas.

As a simple example, when a Java programmer edits a class file, the set of import statements reveals the libraries being used in that file, and thus the libraries that this developer has some familiarity with, because he or she is using those libraries to implement the class in question. When a Java programmer edits a class file, they are also using some kind of editor (Emacs, Eclipse, IDEA, NetBeans), and are thus revealing some level of expertise with that environment. Indeed, Hackystat sensors can not only capture knowledge like "I've used the Eclipse IDE over 500 hours during the past year", but even things like "I know how to invoke the inspector and trace through functions in Eclipse", or "I've never once used the refactoring capabilities." Of course, Hackystat sensors can also capture information about what languages you write programs in, what operating systems you are familiar with, what other development tools you know about, and so forth. Shoots, Hackystat could even keep a record of the kinds of exceptions your code has generated.
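To make the import-statement idea concrete, here is a small sketch of the kind of extraction a sensor could perform. The ImportExtractor class and its method are purely illustrative; they are not part of any existing Hackystat sensor, and a real sensor would work off the editor's buffer rather than a file path.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch: infer library usage from the import statements in a Java source file. */
public class ImportExtractor {

  /** Returns the set of packages imported by the file at the given path. */
  public static Set<String> importedPackages(String path) throws IOException {
    Set<String> packages = new TreeSet<String>();
    BufferedReader reader = new BufferedReader(new FileReader(path));
    try {
      String line;
      while ((line = reader.readLine()) != null) {
        line = line.trim();
        if (line.startsWith("import ") && line.endsWith(";")) {
          // "import org.restlet.Client;" contributes the package "org.restlet".
          String name = line.substring("import ".length(), line.length() - 1).trim();
          int lastDot = name.lastIndexOf('.');
          packages.add(lastDot > 0 ? name.substring(0, lastDot) : name);
        }
      }
    }
    finally {
      reader.close();
    }
    return packages;
  }
}

Aggregated over months of editing sessions, even a crude extraction like this starts to look like a skills profile rather than a single data point.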

Let's assume that all of this information can be processed and made available to you as, say, a Facebook application. And, you can edit the automatically generated profile to remove any skills you don't want revealed. You might also be able to annotate it to provide explanatory detail. You can provide details about yourself, such as "Student" or "Professional", and also your level of "openness" to the community. After all that's done, you press "Publish" and this becomes part of your Facebook or OpenSocial profile.

So what?

Well, how about the following scenarios:

[1] I'm a student and just encountered a weird Exception. I search the community for others with experience with this Exception. I find three people, send them an IM, and shortly thereafter one of them gives me a tip on how to debug it.

[2] I'm interested in developing a Fortress mode for Emacs, but don't want to do it alone. I search the community for developers with both expertise in Fortress and Emacs, and contact them to see if they want to work with me on such a mode.

[3] I'm an employer and am interested in developers with a demonstrated experience with compiler development for a short-term, well paying consulting position. I need people who don't require any time to come up to speed on my problem; I don't want to hire someone who took compilers 10 years ago in college and hasn't thought about it since. I search the community, and find a set of candidates who have extensive, recent experience using Lex, YACC, and JavaCC. I contact them to see if they would be interested in a consulting arrangement.

[4] I'm a student who has been participating in open source projects and making extensive contributions, but has never had a "real" job. I need a way to convince employers that I have significant experience. I provide a pointer in my resume to my profile, showing that I have thousands of hours of contributions to the Apache web server and Python language projects.

Hackystat is often thought of as a measurement system, and indeed all the above capabilities result from measurement. However, the above doesn't feel like measurement; it feels like social coordination and communication of a relatively sophisticated and efficient nature.

Sunday, October 7, 2007

Hackystat and Crap4J

The folks at Agitar, who clearly have a sense of humor in addition to being excellent hackers, have recently produced a plug-in for Eclipse called Crap4J that calculates a measure of your code's "crappiness".

From their web page:

There is no fool-proof, 100% objective and accurate way to determine if a particular piece of code is crappy or not. However, our intuition – backed by research and empirical evidence – is that unnecessarily complex and convoluted code, written by someone else, is the code most likely to elicit a “This is crap!” response. If the person looking at the code is also responsible for maintaining it going forward, the response typically changes into “Oh crap!”

Since writing automated tests (e.g., using JUnit) for complex code is particularly hard to do, crappy code usually comes with few, if any, automated tests. The presence of automated tests implies not only some degree of testability (which in turn seems to be associated with better, or more thoughtful, design), but it also means that the developers cared enough and had enough time to write tests – which is a good sign for the people inheriting the code.

Because the combination of complexity and lack of tests appear to be good indicators of code that is potentially crappy – and a maintenance challenge – my Agitar Labs colleague Bob Evans and I have been experimenting with a metric based on those two measurements. The Change Risk Analysis and Prediction (CRAP) score uses cyclomatic complexity and code coverage from automated tests to help estimate the effort and risk associated with maintaining legacy code. We started working on an open-source experimental tool called “crap4j” that calculates the CRAP score for Java code. We need more experience and time to fine tune it, but the initial results are encouraging and we have started to experiment with it in-house.

Here's a screenshot of Crap4J after a run over the SensorShell service code:


Immediately after sending this link to the Hackystat Hackers, a few of us started playing with it. While the metric seems intuitively appealing (and requires one to use lots of bad puns when reporting on the results), its implementation as an Eclipse plugin is quite limiting. We have found, for example, that the plugin fails on the SensorBase code, not through any fault of the SensorBase code (whose unit tests run quite happily within Eclipse and Ant) but seemingly because of some interaction with Agitar's auto-test invocation or coverage mechanism.

Thus, this seems like an opportunity for Hackystat. If we implement the CRAP metric as a higher level analysis (for example, at the Daily Project Data level), then any combination of tools that send Coverage data and FileMetric data (that provides cyclomatic complexity) can produce CRAP. Hackystat can thus measure CRAP independently of Eclipse or even Java.
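As a sketch of what the core of such a Daily Project Data analysis might look like, here is the formula the Agitar folks published for CRAP expressed as a Java method. The class name and signature are illustrative only; the per-method complexity would come from FileMetric data and the per-method coverage from Coverage data.

/** Illustrative sketch of a DPD-level CRAP computation. */
public class CrapAnalysis {

  /**
   * Returns the CRAP score for a method, using Agitar's published formula:
   * CRAP(m) = comp(m)^2 * (1 - cov(m))^3 + comp(m), where cov(m) is in [0, 1].
   */
  public static double crap(int complexity, double coverage) {
    double uncovered = 1.0 - coverage;
    return (complexity * complexity * uncovered * uncovered * uncovered) + complexity;
  }
}

For example, a method with complexity 10 and no coverage scores 110, while the same method with full coverage scores 10, which matches the intuition that tests sharply reduce the maintenance risk of complex code.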

The Agitar folks go on to say:

We are also aware that the CRAP formula doesn’t currently take into account higher-order, more design-oriented metrics that are relevant to maintainability (such as cohesion and coupling).

Here is another opportunity for Hackystat: it would be trivial, once we have a DPD analysis that produces CRAP, to provide a variant CRAP calculation that factors in Dependency sensor data (which provides measures of coupling and cohesion).

Then we could do a simple case study in which we run these two measures of CRAP over a code base, order the classes in the code base by their level of crappiness according to the two measures, and ask experts to assess which ordering appears to be more consistent with the code's "True" crappiness.

I think such a study could form the basis for a really crappy B.S. or M.S. Thesis.

Friday, October 5, 2007

Fast Fourier Telemetry Transforms

Dan Port, who has a real talent for coming up with interesting Hackystat research projects, sent me the following in an email today:

Some while back I had started thinking about automated ways to analyze telemetry data and it occurred to me that maybe we should be looking at independent variables other than time. That is, we seem to look at some metric vs. time for the most part. This time series view is very tough to analyze and especially automate the recognition of interesting patterns. While dozing off at a conference last week something hit me (no, not my neighbor waking me up). What if we transformed the time-series telemetry data stream into.... a frequency-series. That is, do a FFT on the data. This would be like looking at the frequency spectrum of an audio stream (although MUCH simpler).

I am extremely naive about FFT, but my sense is that this approach basically converts a telemetry stream into a 'fingerprint' which is based upon oscillations. My hypothesis is that this fingerprint could represent, in some sense, a development 'process'.

If that's so, then the next set of questions might be:
  • Is this 'process' representation stable? Would we get the same/similar FFT later or from a different group?
  • Is this process representation meaningful? Are the oscillations indicative of anything useful/important about the way people work? Does this depend upon the actual kind of telemetry data that is collected?
It would be nice to come up with some plausible ideas for telemetry streams that might exhibit 'meaningful' oscillations as a next step in investigating this idea.
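As a concrete starting point, here is a minimal sketch of the transform itself, applied to a single telemetry stream such as a daily build count. It uses a naive discrete Fourier transform rather than a true FFT, which is adequate for the short series telemetry produces; a real implementation would presumably plug in a proper FFT library.

/** Illustrative sketch: compute the magnitude spectrum of a telemetry stream. */
public class TelemetrySpectrum {

  /** Returns the magnitude of each frequency component of the input series. */
  public static double[] magnitudes(double[] series) {
    int n = series.length;
    double[] magnitude = new double[n];
    for (int k = 0; k < n; k++) {
      double re = 0.0;
      double im = 0.0;
      for (int t = 0; t < n; t++) {
        double angle = 2.0 * Math.PI * k * t / n;
        re += series[t] * Math.cos(angle);
        im -= series[t] * Math.sin(angle);
      }
      magnitude[k] = Math.sqrt(re * re + im * im);
    }
    return magnitude;
  }
}

A strong spike at the frequency corresponding to seven days, for instance, would be an unsurprising sign of a weekly work rhythm; the more interesting question is whether streams from different groups produce distinguishably different fingerprints.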

Thursday, September 27, 2007

Representation of Developer Expertise

Developer expertise is generally represented by "years of experience", which is generally useless. Does someone have 10 years of experience, or 1 year of experience 10 times over?

The process and product data collected by Hackystat has the potential to provide a much richer and more meaningful representation of developer expertise. Let's restrict ourselves to the domain of Java software development, for the sake of discussion. First, let's consider the kinds of data we could potentially collect as a developer works:
  • When they are working on a Java system.
  • The packages imported by the class that they are editing.
  • The IDE they are using.
  • The IDE features (debugger, build system, refactoring, configuration management) they are using.
  • Their behaviors within the IDE (writing tests, writing production code, running the system, running the debugger, running tests, invoking the build system, etc.)
  • Their configuration management behavior: frequency of commits, degree of churn, and conflicts.
  • The number of developers associated with their current project.
  • The level of interdependency between developers: are files worked on by multiple developers? Has the developer ever worked on code written by someone else? Has someone else ever worked on code written by this developer?
I believe that capturing this kind of information can provide a much richer and more useful representation of developer expertise. It can provide information useful for determining:
  • Who has experience with a particular tool/technique?
  • Who has complementary skills to my own?
  • Who has recent experience with a given tool/technique?
At a higher level, this information could also be useful in forming groups, by helping identify developers with similar and/or complementary skills.

Important research issues include how to provide the developer with "control" over their profile, how to display this information, and how to perform queries.

Tuesday, September 18, 2007

Twitter, Hackystat, and solving the context switching problem

A recent conversation with Lorin Hochstein at ISERN 2007 has led me to wonder if Twitter + Hackystat solves what we found in prior research to be a fundamental show stopper for manual metrics collection techniques like the Personal Software Process (PSP): the "context switching problem".

Back in the day when we were building automated support for PSP, a basic problem we couldn't solve was developer hatred for having to constantly stop what they were doing in order to tell the tool what they were doing. We called this the "context switching problem", and we came to believe that no amount of tool support for the kind of data that the PSP wants to collect can overcome it, because PSP data intrinsically requires ongoing developer interruption.

I believe a central design win in Twitter is that it does not force a context switch: the messages that you write are so constrained in size and form that they do not interrupt the "flow" of whatever else you're doing. This is fundamentally different from a phone call, a blog entry, an email, or a PSP log entry.

What makes Twitter + Hackystat so synergistic (and, IMHO, so compelling) is that Hackystat sensors can provide a significant amount of useful context to a Twitter message. For example, suppose Developer Joe twitters:

"Argh, I'm so frustrated with JUnit!"

Joe's recent Hackystat event stream could reveal, for example, that he is struggling to resolve a compiler error involving annotations. Developer Jane could see that combination of Twitter and Hackystat information, realize she could help Joe in a couple of minutes, and IM to the rescue.

A second very cool hypothesis is that this combination overcomes the need to "ask questions the smart way". Indeed, Developer Joe is not asking a question or even asking anyone explicitly for help: he is merely expressing an emotion. The additional Hackystat data turns it into a "smart question" that Developer Jane decides to volunteer to answer.

So, how do we create this mashup? I can see at least two different user interfaces:

(a) Hackystat-centric. Under this model, developers in a group use Twitter in the normal way, but Hackystat will also be a member of that community and subscribe to the feed. Then, we create a widget in, say, NetVibes that displays the Twitter messages augmented with (possibly abstractions of) the raw sensor data that provides the additional interesting context. Developers then use this UI to monitor the HackyTwitter conversation.

(b) Twitter-centric. In this case, Hackystat abstractions of events are posted to Twitter, which thus becomes part of the normal Twitter feed. People use Twitter in the normal way, but now they are getting an additional stream of Twitter messages representing info from Hackystat.

How might we test these hypotheses? As an initial step, I propose a simple pilot study in which a software engineering class works on a group project in the "normal" way for a month, then installs Hackystat+Twitter and continues on. After the second month, a questionnaire is administered to get feedback from the students on how their communication and coordination changed from month one to month two, and what benefits and problems the introduction of Hackystat + Twitter created.

If this pilot study is successful, then we refine our experimental method and move on to an industrial case study with more longitudinal data collection. For example, we could build upon the case study by Ko, DeLine, and Venolia to see if Hackystat+Twitter reduces the effort developers must expend to find out what their co-workers on a project are currently doing.

What would we call this? I still like the term Project Proprioception.

Sunday, September 16, 2007

Panopticode

I came across the Panopticode project today. It is an interesting approach to metrics aggregation. They motivate their approach by listing the following limitations of current Java metrics tools:
  1. Installation and configuration can be difficult and time consuming
  2. Most only measure the status at a point in time and have no concept of historical data
  3. Each tool represents its data in a proprietary format
  4. It is very difficult to correlate data from different tools
  5. It can be difficult to switch between competing tools that provide similar functions
  6. Most have extremely limited reporting and visualization capabilities
Of course, I agree absolutely. Panopticode provides a way to simplify the download, installation, and invocation of the following tools so far: Emma, JDepend, Checkstyle, and JavaNCSS. It also provides an interesting visualization of the results using TreeMaps.

There are some substantial differences between their approach and ours in Hackystat:
  • Panopticode limits itself to the world of Java code built using Ant. This is the simplifying assumption they use to achieve (1). Hackystat is agnostic with respect to language and build technology.
  • Current reports do not appear to include history, so I don't know how they plan to provide (2). Hackystat includes a domain specific language for analysis and visualization of project history called Software Project Telemetry. This also provides a general solution to problem (4) of correlating data from different tools. Panopticode does not appear to provide a solution to (4), at least from perusing the gallery and documentation pages. I will also be interested to see how they create a scalable solution as the toolset grows to, say, 30 or 40. This is a hard problem that the Telemetry DSL addresses.
  • While I agree with statement (6) that current reporting tools have an extremely limited reporting and visualization capability, Panopticode seems to currently suffer from that same problem :-) Hackystat, at least with Version 8, will break out of the Java JSP WebApp prison with an architecture well suited to a variety of reporting and visualization approaches, including Ambient devices, Ruby on Rails, GWT, and so forth. Finally, while TreeMaps are certainly sexy, I don't really see how they are fundamentally better than the unsexy HTML reports of JavaNCSS, Emma, etc. (at least, I don't see it given the way Panopticode uses TreeMaps at present). If I am trying to find low coverage, Emma's HTML interface gets me there just about as easily as the TreeMap does. TreeMaps are cute and all, but they feel more like syntactic sugar than some fundamental interface paradigm shift.
The project is in its bootstrapping phases, so in some sense it's not fair to compare it to Hackystat, which is in its 6th year of public release. I also think it's an interesting decision to limit oneself to Java/Ant, which can really simplify certain problems that Hackystat must face in order to appeal to a broader audience. I look forward to seeing how this project progresses in the future.

Tuesday, August 28, 2007

Grid Computing, Java, and Hackystat

I just got finished watching a really interesting screencast called "Grid Application in 15 minutes" that features GridGain, a new open source grid computing framework in Java. See their home page for the link to the screencast.

Things I found interesting while watching the screencast:
  • It uses some advanced Java features (Generics, Annotations, AOP) to dramatically reduce the number of lines of code required to grid-enable a conventional application.
  • It is a nice example of how to use the Eclipse framework to maximize the amount of code that Eclipse writes for you and minimize the amount that you have to type yourself.
I think there are some really interesting opportunities in Hackystat for grid computing. Many computations related to DailyProjectData and Telemetry (for example) are "embarrassingly parallel" and GridGain seems like the shortest path to exploiting this domain attribute.
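To illustrate the "embarrassingly parallel" structure, independent of GridGain itself, here is a sketch using plain java.util.concurrent. The computeDailyValue method is just a placeholder for a real DailyProjectData or Telemetry computation; GridGain's contribution would be to farm the same per-day tasks out to a grid of machines rather than a local thread pool.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustrative sketch: each day's value is independent, so the per-day work can be farmed out. */
public class ParallelTelemetry {

  public static double[] dailyValues(int numDays) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
    List<Future<Double>> futures = new ArrayList<Future<Double>>();
    for (int day = 0; day < numDays; day++) {
      final int d = day;
      futures.add(pool.submit(new Callable<Double>() {
        public Double call() {
          return computeDailyValue(d); // independent per-day computation
        }
      }));
    }
    double[] values = new double[numDays];
    for (int day = 0; day < numDays; day++) {
      values[day] = futures.get(day).get();
    }
    pool.shutdown();
    return values;
  }

  /** Placeholder for a real per-day analysis. */
  private static double computeDailyValue(int day) {
    return day;
  }
}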

Thursday, August 23, 2007

Project Proprioception

In the latest issue of Wired Magazine, there is an interesting article in defense of Twitter. One thing the author says is that you can't really understand Twitter unless you actually do it (which might explain why I don't really understand Twitter).

He goes on to say that the benefit of Twitter is "social proprioception":

When I see that my friend Misha is "waiting at Genius Bar to send my MacBook to the shop," that's not much information. But when I get such granular updates every day for a month, I know a lot more about her. And when my four closest friends and worldmates send me dozens of updates a week for five months, I begin to develop an almost telepathic awareness of the people most important to me.

It's like proprioception, your body's ability to know where your limbs are. That subliminal sense of orientation is crucial for coordination: It keeps you from accidentally bumping into objects, and it makes possible amazing feats of balance and dexterity.

Twitter and other constant-contact media create social proprioception. They give a group of people a sense of itself, making possible weird, fascinating feats of coordination.

Aha! That makes a lot more sense to me, and also suggests the following hypothesis:

Hackystat's fine-grained data collection capabilities can support "project proprioception": the ability for a group of developers to have "a sense of themselves" within a given software development project.

I think that DevEvents and Builds and so forth support a certain level of project proprioception without any further interaction with the developer. But, what if Hackystat had a kind of "Twitter sensor", in which developers could post small nuggets of information about what they were thinking about or struggling with that could be combined with the DevEvents:

  • "Trying to figure out the JAXB newInstance API"
  • "WTF is with this RunTime Exception?"
  • "General housecleaning for the Milestone Release"
  • "Pair Programming With Pavel"
  • "Reviewing the Ant Sensor"
  • "Upgrading Tomcat"
Now imagine these messages being combined with the other Hackystat DevEvents and being visualized using something like Simile/Timeline. Further, imagine the timeline being integrated into a widget with a near-real-time nature like the Sensor Data Viewer, such that you could see the HackyTwitter information along with occurrences of builds, tests, and commits scrolling by on a little window in the corner of your screen. Would this enable "weird, fascinating feats of coordination" within a software development project?

Sounds cool to me.

Wednesday, July 11, 2007

Empirical Software Engineering and Web 3.0

I came across two interesting web pages today that started me thinking about empirical software engineering in general and Hackystat in particular with respect to the future of web technology.

The first page contains an interview with Tim Berners-Lee on the Semantic Web. In his response to the request to describe the Semantic Web in simple terms, he talks about the lack of interoperability between the data in your mailer, PDA calendar, phone, etc. and pages on the web. The idea of the Semantic Web, I guess, is to add sufficient semantic tagging to the Web to provide seamlessness between your "internal" data and the web's "external" data. So, for example, any web page containing a description of an event would contain enough tagging that you could, say, right click on the page and have the event added to your calendar.

There is a link on that page to another article by Nova Spivack on Web 3.0. It contains a visualization of the web's evolution.

To me, what's interesting about this is the transition we're now in between Web 2.0, which is primarily focused on user-generated, manual "tagging" of pages, and Web 3.0, where this kind of "external" tagging will be augmented by "internal" tagging that provides additional semantics about the content of the web document.

It seems to me that the combination of internal and external tagging can provide interesting new opportunities for empirical software engineering. Let's go back to Tim Berners-Lee's analogy for a second: it's easy to transpose this analogy to the domain of software development. Currently, a development project produces a wide range of artifacts: requirements documents, source code, documentation, test plans, test results, coverage, defect reports, and so forth. All of these evolve over time, all are related to each other, and I would claim that all use (if anything) a form of "external" tagging to show relationships. For example, a configuration management system enables a certain kind of "tagging" between artifacts which is temporal in nature. Some issue management systems, like Jira, will parse commit messages looking for Issue IDs and use that to generate linkages between Issues and the modifications to the code base related to them.
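As a tiny example of this style of external tagging, here is a sketch of the kind of commit message parsing an issue tracker might do; the Jira-style key format is an assumption for illustration.

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch: extract issue keys (e.g. "HACK-123") from a commit message. */
public class IssueLinker {

  private static final Pattern ISSUE_KEY = Pattern.compile("\\b([A-Z][A-Z0-9]+-\\d+)\\b");

  /** Returns the issue keys mentioned in the given commit message. */
  public static List<String> issueKeys(String commitMessage) {
    List<String> keys = new ArrayList<String>();
    Matcher matcher = ISSUE_KEY.matcher(commitMessage);
    while (matcher.find()) {
      keys.add(matcher.group(1));
    }
    return keys;
  }
}

The point is that the linkage lives outside the artifacts themselves; the Web 3.0 approach would push more of that meaning into the artifacts' own representations.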

Nova Spivack adds a few other technologies to the Web 3.0 mix besides the Semantic Web and its "internal" tagging:
  • Ubiquitous connectivity
  • Software as service
  • Distributed computing
  • Open APIs
  • Open Data
  • Open Identity
  • Open Reputation
  • Autonomous Agents
The "Open" stuff is especially interesting to me in light of the recent emergence of "evaluation" sites for open source software such as Ohloh, SQO-OSS, and Coverity/Scan. Each of these sites is trying to address the issue of how to evaluate open source quality. Each of them is more or less trying to do it within the confines of Web 2.0.

Evaluation of open source software is an interesting focus for the application of Web 3.0 to empirical software engineering, because open source development is already fairly transparent and accessible to the research community, and also because increasing amounts of open source software are becoming mission-critical to organizational and governmental infrastructure. The Coverity/Scan effort was financed by the Department of Homeland Security, for example.

Back to Hackystat. It seems to me that Hackystat sensors are, in some sense, an attempt to take a software engineering artifact (the sensor data "Resource" field, in Hackystat 8 terminology), and retrofit Web 3.0 semantics on top of it (the SensorDataType field being a simple example). The RESTful Hackystat 8 services are then a way to "republish" aspects of these artifacts in a Web 3.0 format (i.e. as Resources with a unique URI and an XML representation). What is currently lacking in Hackystat 8 is the ability to obtain a resource in RDF representation rather than our home-grown XML, but that is a very small step from where we are now.

There is a lot more thinking I'd like to do on this topic (probably enough for an NSF proposal), but I need to stop this post now. So, I'll conclude with three research questions at the intersection of Web 3.0 and empirical software engineering:

  • Can Web 3.0 improve our ability to evaluate the quality/security/etc. of open source software development projects?
  • Can Web 3.0 improve our ability to create a credible representation of an open source programmer's skills?
  • Can Web 3.0 improve our ability to create autonomous agents that can provide more help in supporting the software development process?

Friday, May 11, 2007

Sample Restlet Application for Hackystat

I decided to get my feet wet with Restlet by building a small server with the following API:

http://localhost:9876/samplerestlet/file/{filename}

The idea is that it will retrieve and display {filename}, which is an instance of the "file" resource.

This was a nice way to wade into the Restlet framework; more than a Hello World app, lacking stuff we don't need (like Virtual Hosts), and requiring stuff we do (URL-based dispatching).

To see what I did, check out the following 'samplerestlet' module from SVN:

svn://www.hackystat.org/csdl/samplerestlet

To build and run it:
  1. Download Restlet-1.0, unzip, and point a RESTLET_HOME env variable at it.
  2. Build the system with "ant jar"
  3. Run the result with "java -jar samplerestlet.jar"
This starts up an HTTP server that listens on port 9876.

Try retrieving the following in your browser: http://localhost:9876/samplerestlet/file/build.xml

Now look through the following three files, each only about 50 LOC:
  1. build.xml, which shows what's needed to build the executable jar file.
  2. Server.java, which creates the server and dispatches to a FileResource instance to handle URLs satisfying the file/{filename} pattern.
  3. FileResource.java, which handles those GET requests by reading the file from disk and returning a string representation of it.
If this looks confusing, the Restlet Tutorial is a reasonable introduction. There's also a pretty good Powerpoint presentation that introduces both REST architectural design and the Restlet Framework at the Restlet Wiki Home Page. It comes with some decent sample code as well.
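For readers who just want the flavor without checking out the module, here is a rough sketch of the server side using the Restlet 1.0 API. The class name and the echo behavior are illustrative; the actual FileResource reads the requested file from disk and returns its contents.

import org.restlet.Restlet;
import org.restlet.Router;
import org.restlet.Server;
import org.restlet.data.MediaType;
import org.restlet.data.Protocol;
import org.restlet.data.Request;
import org.restlet.data.Response;

/** Rough sketch of a Restlet 1.0 server exposing a file/{filename} resource. */
public class SampleServer {

  public static void main(String[] args) throws Exception {
    // Handle GET requests by echoing the requested file name.
    // (The real FileResource returns the file's contents instead.)
    Restlet fileHandler = new Restlet() {
      @Override
      public void handle(Request request, Response response) {
        String filename = (String) request.getAttributes().get("filename");
        response.setEntity("You requested: " + filename, MediaType.TEXT_PLAIN);
      }
    };

    // Route the URL pattern to the handler and listen for HTTP requests on port 9876.
    Router router = new Router();
    router.attach("/samplerestlet/file/{filename}", fileHandler);
    new Server(Protocol.HTTP, 9876, router).start();
  }
}

The overall shape, a Router dispatching URL patterns to handlers, is the same in the real samplerestlet code, just split across Server.java and FileResource.java.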

Next step is to add this kind of URL processing to SensorBase.

Tuesday, May 8, 2007

JAXB for Hackystat for Dummies

I spent today working through the XML/Java conversion process for SensorBase resources, and it occurred to me near the end that my struggles could significantly shorten the learning curve for others writing higher level services that consume SensorBase data (such as the UI services being built by Alexey, Pavel, and David.)

So, I did a quick writeup on the approach, in which I refer to a library jar file I have made available as the first SensorBase download.

After so many years using JDOM, which was nice in its own way, it is great to move onward to an even faster, simpler, and easier approach.
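For the curious, the basic round trip looks roughly like the following. The SensorData class is assumed to be one of the classes JAXB generates from the SensorBase XSDs, and the file name is made up.

import java.io.File;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.Marshaller;
import javax.xml.bind.Unmarshaller;

/** Rough sketch of the JAXB 2.0 XML/Java round trip for a SensorBase resource. */
public class JaxbRoundTrip {

  public static void main(String[] args) throws Exception {
    // The context is built from classes generated from the XML Schema.
    JAXBContext context = JAXBContext.newInstance(SensorData.class);

    // XML -> Java
    Unmarshaller unmarshaller = context.createUnmarshaller();
    SensorData data = (SensorData) unmarshaller.unmarshal(new File("sensordata.example.xml"));

    // Java -> XML
    Marshaller marshaller = context.createMarshaller();
    marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
    marshaller.marshal(data, System.out);
  }
}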

Friday, May 4, 2007

SensorBase coding has begun!

To my great delight (given that the May 15 milestone is rapidly approaching) I have committed my first bit of SensorBase code today.

Some interesting tidbits:

First, I am continuing to observe the Hackystat tradition of always including a reference to an Issue in the SVN commit message. In this case, the reference looks like:

http://code.google.com/p/hackystat-sensorbase-uh/issues/detail?id=3

Second, to my surprise, I am coding 100% in a TDD style, not out of any philosophical commitment or moral imperative, but simply out of the sense that this is just the most natural way to start to get some leverage on the SensorBase implementation. The REST API specification turns out to form a very nice specification of the target behavior, and so I just picked the first URI in the table (GET host/hackystat/sensordatatypes) which is supposed to return a list of sensordatatype resource references, and wrote a unit test which tries that call on a server. Of course, the test fails, because I haven't written the server yet.

Third, to my relief, the Restlet framework makes that test case wicked easy to write. In fact, here it is:

@Test public void getSdtIndex() {
  // Set up the call.
  Method method = Method.GET;
  Reference reference = new Reference("http://localhost:9090/hackystat/sensordatatypes");
  Request request = new Request(method, reference);

  // Make the call.
  Client client = new Client(Protocol.HTTP);
  Response response = client.handle(request);

  // Test that the request was received and processed by the server OK.
  assertTrue("Testing for successful status", response.getStatus().isSuccess());

  // Now test that the response is OK.
  XmlRepresentation data = response.getEntityAsSax();
  assertEquals("Checking SDT", "SampleSDT", data.getText("SensorDataTypes/SensorDataType/@Name"));
}
There's a couple of rough edges (I can't hard code the server URI, and my XPath is probably bogus), but the code runs and does the right thing (i.e. fails at the getStatus call with a connection not found error.)

I'm sure things won't be this easy forever, but it's nice to get off to a smooth start.

Thursday, May 3, 2007

Minimize library imports to your Google Projects

As we transition to Google Project Hosting, one thing we need to be particularly careful about is uploading third party libraries into SVN. In general, try to avoid doing this. There are two reasons for this. First, there is limited disk space in Google Project Hosting, and it's easy to burn up your space with libraries (remember that since SVN never deletes, libraries that need updating frequently will burn through your space quickly). Second, different services will often share the same library. For example, most of our Java-based services will probably want to use the Restlet framework. It is generally better to install that in one place as a developer.

To avoid uploading libraries to SVN, you can generally do one of the following alternatives:

  1. Instruct your developers in the installation guide to download the library to a local directory, create an environment variable called {LIBRARY}_HOME, and point to those jar files from your Eclipse classpath or Ant environment variable.
  2. For files that need to be in a specific location in your project, such as GWT, download the GWT to a local directory, then copy the relevant subdirectories into your project.
Binary distributions of releases are a different situation. In that case, we will typically want to bundle the libraries into the binary distribution. That will cause its own difficulties, since Google Project Hosting limits us to 10MB files for the download section, but we'll cross that bridge when we come to it.

Monday, April 30, 2007

Xml Schema definition for dummies

Today I defined my first batch of Xml Schemas for Version 8. The results of my labors are now available at http://hackystat-sensorbase-uh.googlecode.com/svn/trunk/xml/schema/

For each XSD file, I also provide a couple of "example" XML files, available in http://hackystat-sensorbase-uh.googlecode.com/svn/trunk/xml/examples/

To test that the XSD validates the XML, I used the online DecisionSoft Xml Validator. Provide it with an XSD schema definition file, and an XML file to validate against it, and away it goes. The error messages were a little obtuse, but good enough for my purposes.

It's possible to include a reference to the XSD file within the XML instance, which is probably what we want to do in practice.

The next step is to parse the XML. Here's a nice example of using JAXB 2.0 to parse XML efficiently (both in terms of code size and execution time).

Saturday, April 28, 2007

Version 8 Design Progress

Lots of progress this week on the design of Version 8. There is a milestone on May 15, just over two weeks away, and I've been fleshing out a bunch of pages in order to add some details and direction as to what we might want to try to accomplish by then.

Finally, I gave a talk on REST in ICS 414 yesterday, and noticed the following blog entry by Josh about REST in general and the implications for Hackystat understandability and usability in particular. This gives me hope that we're heading in the right direction!

Friday, April 20, 2007

Near real time communication using Restlet

Some services (such as a UI service to watch the arrival of sensor data at a SensorBase) want "near real time" communication, using something like Jabber. There is a new project that integrates XMPP and Restlet which might be quite useful for this:

http://permalink.gmane.org/gmane.comp.java.restlet/1944

David might want to check this out.

Wednesday, April 18, 2007

Why REST for Hackystat?

Cedric asked a really good question on the hackystat mailing list today, and I thought it was worth posting to this blog:

> Probably my question is too late since you have already decide use REST, but I want to
> know the rationale behind it.
>
> Since you are still returning data in xml format, what makes you decide not to publish
> a collection of WSDL and go along with more industrial standard web service calls?

Excellent question! No, it's not too late at all. This is exactly the right time to be discussing this kind of thing.

It turns out that when I started the Version 8 design process, I was still thinking in terms of a monolithic server and was heading down the SOAP/WSDL route. I was, for example, investigating Glassfish as an alternative to Tomcat due to its purportedly better support for web services.

Then the Version 8 design process took an unexpected turn, and the monolithic server fragmented into a set of communicating services: SensorBase services for raw sensor data, Analysis services that would request data from SensorBases and provide higher level abstractions, and UI services that would request data from SensorBases and Analyses and display it with a user interface.

What worried me about this design initially was that every Analysis service would have to be able to both produce and consume data (kind of like being a web server and a web browser at the same time), and that Glassfish might be overkill for this situation. So, I started looking for a lightweight Java-based framework for producing/consuming web services, and came upon the Restlet Framework (http://www.restlet.org/), which then got me thinking more deeply about REST.

It's hard to quickly sum up the differences between REST and WSDL, but here's a few thoughts to get you started. WSDL is basically based upon the remote procedure call architectural style, with HTTP used as a "tunnel". As a result, you generally have a single "endpoint", or URL, such as /soap/servlet/messagerouter, that is used for all communication. Every single communication with the service, whether it is to "get" data from the service, "put" data to the service, or modify existing data is always implemented (from an HTTP perspective) in exactly the same way: an HTTP POST to a single URL. From the perspective of HTTP, the "meaning" of the request is completely opaque.

In REST, in contrast, you design your system so that your URLs actually "mean" something: they name a "resource". Furthermore, the type of HTTP method also "means" something: GET means "get" a representation of the resource named by the URL, "POST" means create a new resource which will have a unique URL as its name, DELETE means "delete" the resource named by the URL, and so forth.

For example, in Hackystat Version 7, to send sensor data to the server, we use Axis, SOAP, and WSDL to send an HTTP POST to http://hackystat.ics.hawaii.edu/hackystat/soap/rpcrouter, and the content of the message indicates that we want to create some sensor data. All sensor data, of all types, for all users, is sent to the same URL in the same way. If we wanted to enable programmatic access to sensor data in Version 7, we would tell clients to continue to use HTTP POST to http://hackystat.ics.hawaii.edu/hackystat/soap/rpcrouter, but tell them that the content of the POST could now invoke a method in the server to obtain data.

A RESTful interface does it differently: to request data, you use GET with an URL that identifies the data you want. To put data, you use POST with an URL that identifies the resource you are creating on the server. For example:

GET http://hackystat.ics.hawaii.edu/hackystat/sensordata/x3fhU784vcEW/Commit/1176759070170

might return the Commit sensor data with timestamp 1176759070170 for user x3fhU784vcEW. Similarly,

POST http://hackystat.ics.hawaii.edu/hackystat/sensordata/x3fhU784vcEW/Commit/1176759070170

would contain a payload with the actual Commit data contents that should be created on the server. And

DELETE http://hackystat.ics.hawaii.edu/hackystat/sensordata/x3fhU784vcEW/Commit/1176759070170

would delete that resource. (There are authentication issues, of course.)

In fact, REST asserts a direct correspondence between the CRUD (create/read/update/delete) DB operations and the POST, GET, PUT, and DELETE methods for resources named by URLs.

Now, why do we care? What's so good about REST anyway? In the case of Hackystat, I think there are two really significant advantages of a RESTfully designed system over an RPC/SOAP/WSDL designed system:

(1) Caching can be done by the Internet. If you obey a few more principles when designing your system, then you can use HTTP techniques as a way to cache data rather than build in your own caching system. It's exactly the same way that your browser avoids going back to Amazon to get the logo files and so forth when you move between pages. In the case of Hackystat, when someone invokes a GET on the SensorBase with a specific URL, the results can be transparently cached to speed up future GETs of the same URL, since that represents the same resource. (There are cache expiration issues, which I'm pretty sure we can deal with.)

In Hackystat Version 7, there is a huge amount of code that is devoted to caching, and this code is also a huge source of bugs and concurrency issues. With a REST architecture, it is possible that most, perhaps all, of this code can be completely eliminated without a performance hit. Indeed, performance might actually be significantly better in Version 8.

(2) A REST API is substantially more "accessible" than a WSDL API. One thing I want from Hackystat Version 8 is a substantially simpler, more accessible interface, that enables outsiders to quickly learn how to extend Hackystat for their own purposes with new services and/or extract low-level or high-level data from Hackystat for their own analyses. To do this with a RESTful API, it's straightforward: here are some URLs, here's how they translate into resources, invoke GET and you are on your way. Pretty much every programming language has library support for invoking an HTTP GET with an URL. One could expect a first semester programming student to be able to write a program to do that. Shoots, you can do it in a browser. The "barrier to entry" for this kind of API is really, really low.
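For example, here is essentially all the code a client needs in plain Java, using only the JDK. Authentication and error handling are omitted, and the URL is the example resource from above.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

/** Minimal sketch: retrieve the XML representation of a sensor data resource. */
public class GetSensorData {

  public static void main(String[] args) throws Exception {
    URL url = new URL("http://hackystat.ics.hawaii.edu/hackystat/sensordata/x3fhU784vcEW/Commit/1176759070170");
    BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
    String line;
    while ((line = in.readLine()) != null) {
      System.out.println(line); // the XML representation of the Commit resource
    }
    in.close();
  }
}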

Now consider a WSDL API. All of a sudden, you need to learn about SOAP, and you need to find out how to do Web Services in your chosen programming language, and you have to study the remote procedure calls that are available, and so forth. The "barrier to entry" is suddenly much higher: there are incompatible versions of SOAP, there's way more to learn, and I bet more than a few people will quickly decide to just bail and request direct access to the database, which cuts them out of 90% of the cool stuff in Hackystat.

So, by my reckoning, if we decided to use Axis/SOAP/WSDL in Version 8, we'd (1) continue to need to do all our own caching with all of the headaches that entails, and (2) we'd be stuck with a relatively complex interface to the data.

I want to emphasize that a RESTful architecture is more subtle than simply using GET, POST, PUT, and DELETE. For example, the following is probably not RESTful:

GET http://foo/bar/baz?action=delete

For more details, http://en.wikipedia.org/wiki/Representational_State_Transfer has a good intro with pointers to other readings.

Your email made another interesting assertion:

> what makes you decide not to publish
> a collection of WSDL and go along with more industrial standard web service calls?

Although I agree that WSDL is an "industry standard", this doesn't mean that REST isn't one as well. Indeed, my sense after a few weeks of research on the topic is that most significant industrial players have already moved to REST or offer REST as an alternative to WSDL: eBay, Google, Yahoo, Flickr, and Amazon all have REST-based services. I recall reading that the REST API gets far more traffic than the corresponding WSDL API for at least some of these services.

Finally, no architecture is a silver bullet, and REST is no exception. For example, if you can't effectively model your domain as a set of resources, or if the CRUD operations aren't a good fit with the kinds of manipulations you want to do, then REST isn't right. Another REST requirement is statelessness, which can be a problem for some applications. So far in my design process, however, I haven't run into any showstoppers for the case of Hackystat.

Version 8 is still in the early stages, and the advantages of REST are still hypothetical, so I'm really happy to have this conversation. There are no hard commitments to anything yet, and if there turns out to be a showstopping problem with REST, then we can of course make a change. The more we talk about it, the greater the odds we'll figure out the right thing.

Cheers,
Philip