Saturday, September 13, 2008

Why emacs can survive without automated tests

Automated tests are an accepted best practice in software engineering, and it's hard to find a recently-formed, well-run software organization these days that doesn't have some suite of small automated tests that runs against its codebase.

The argument for automated tests is relatively simple: without them, there is no easy way to verify that a release of a project works. Manual testing is slow and expensive, and all the work you put into it is lost when the next release comes out. Furthermore, with frequently run automated tests, you can find out very quickly which changes caused which bugs, and so revert or fix those changes.

Given all that, how does emacs survive with a huge codebase and no tests?

The best thing it has going for it is that it's a desktop app. That means that users have to upgrade from one version to the next, as opposed to web apps, in which all users are upgraded when the server version changes. Since users have to upgrade, and are in control of which version they get, users' versions range from very old to a build of yesterday's CVS head. That means that a certain percentage of users are going to be on the bleeding edge. Those users are either contributors to the project or people who can tolerate increased bugginess.

Another key is that emacs is a tool people use every day. In fact, it's fair to say that everyone who is contributing to emacs is probably using emacs as their development environment. So those contributing are using the CVS head, and are actively using the latest version.

Let's assume that 1% of the emacs-using population is running off of a recent version of CVS head. So, 1% is exposed to the latest code, and since these same people use it all the time, to do a diverse variety of things, they are testing it continuously, while getting their work done at the same time. Not only are they testing it, but since these are power users, they know how to fix most problems they encounter.

For example, imagine two developers, each working off of CVS head. One checks in a change that breaks dired. The second will get the latest changes, build emacs, run it, work in it, notice the error, know how to fix it, and be able to submit a fix immediately.

Eventually, some time after any particular change is made, it can be assumed to be safe, because enough emacs users have been running systems with this change in and have not had a problem.
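To make that vetting argument a little more concrete, here is a rough back-of-the-envelope sketch in Python. It is only a toy model, and everything in it is my own assumption rather than anything measured: it pretends that each bleeding-edge user independently has the same small chance per day of tripping over a given bug, and the function name and parameters are made up for illustration.

```python
def chance_bug_unnoticed(users: int, days: int, p_hit_per_user_day: float) -> float:
    """Probability that nobody has hit a given bug after `days` days.

    Toy model (an assumption, not data): each of `users` people
    independently hits the bug with probability `p_hit_per_user_day`
    on each day, so the bug survives only if all users * days
    independent chances miss it.
    """
    if users < 0 or days < 0:
        raise ValueError("users and days must be non-negative")
    if not 0.0 <= p_hit_per_user_day <= 1.0:
        raise ValueError("p_hit_per_user_day must be between 0 and 1")
    return (1.0 - p_hit_per_user_day) ** (users * days)


if __name__ == "__main__":
    # Hypothetical numbers: 1,000 people on CVS head, and a bug that any
    # one of them has a 1-in-10,000 chance of hitting on a given day.
    for days in (7, 30, 90):
        print(days, chance_bug_unnoticed(1000, days, 0.0001))
```

Under these made-up numbers the chance of the bug going unnoticed shrinks quickly as the days pile up, which is the intuition behind treating an old change as safe. It also shows the weak spot: a bug in a rarely used corner has a much smaller per-day chance of being hit and can hide for a long time.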

However, this implies that if emacs ever wants a stable version, it has to stop non-bugfix checkins, otherwise there will always be a change that hasn't been fully vetted yet. And this is what emacs does - for a new release there is a long phase in which no new functionality is put in.
It's worth noting that this phase is particularly long. Emacs release cycles for major versions can last years.

Let's imagine a world in which emacs did have automated tests of reasonable coverage. What would change?

First, the codebase would be more stable overall, so if you are an emacs contributor, you can expect a less buggy environment. This certainly would make contributing more efficient, since you can concentrate on functionality, not fixing bugs.

Secondly, the feature freeze stage could be significantly shortened. If the unit tests were extremely comprehensive, one could imagine that no feature freeze would be necessary.

All in all, then, in exchange for the considerable effort of developing unit tests for emacs, we can expect a codebase that can change much more rapidly and radically. Whether this is a tradeoff that makes sense I can only guess. Emacs is a mature product, not going through many radical changes anymore. But part of the reason radical changes are not happening is not that they aren't needed, but that they are frankly just too hard. It would take a very brave developer to take on making emacs multithreaded, or switching to Scheme as a basis for elisp. That could change with unit tests. The tasks would still be hard, but they would not be so destabilizing anymore. And automated tests wouldn't take much bravery to write, so I think automated tests may be a path to allow the more radical changes in emacs's future.

What these tests would look like is a subject for another post.

1 comment:

Heath M. said...

You missed the biggest reason why emacs has survived without tests.
It's written in a functional language (elisp). Doing this makes testing anything you write dirt simple because functions should be 'side-effect free'. I don't think elisp is purely functional so that statement may not apply to the entire codebase, but I would wager a significant amount of it would be purely functional.
Most developers who wrote elisp code tested it in their REPL, saw that it worked, and checked it in. This is unfortunate for regression testing, but you can see how just being a functional language in the first place would decrease bugs (if they write pure functions).
Not to say unit testing isn't still needed... just a good reason why something as big as emacs could get where they are without them.