Saturday, November 29, 2008

Why I've abandoned Eclipse for Emacs

Maybe this isn't a big surprise. I've always used emacs, even when using Eclipse. I'm used to emacs, and I use it for absolutely everything: shell, mail, task planner and more. For a while, though, I did not do most of my Java coding in it. There I mostly used Eclipse, unless I was just doing something quick. After all, Eclipse has great tools for refactoring, automatic imports, quick detection and correction of errors, and more. It's a really great product. Not only that, but it's unsurpassed at quick unit test cycles, mostly requiring no complication.

Emacs has none of that.  But it does have a number of things going for it, and because of these things, I've stopped using Eclipse entirely, and have continued to be productive over the last few months using only emacs.  Here's why:
  • Editing.  Emacs is a great editor, unquestionably better at editing than Eclipse. For example, I can do comment formatting extremely quickly in emacs (Meta-q). It's possible to do this in Eclipse as well, but it's less flexible and requires more interaction. There are many similar minor things that make emacs a better code editor, as far as textual edits go.
  • Lack of projects.  Emacs is unconstrained. Editing a file for Project A, I can switch to a file in Project B as quickly as I can switch to another file in Project A. Emacs does not care about projects, which means it can't give you all the goodies Eclipse can, but it also doesn't tie you down in the same way Eclipse does. 
  • Speed.  This is probably the most important reason I've been doing everything in emacs. There are many different kinds of speed, and most of them are important. Emacs, being lightweight in comparison to Eclipse, has greater speed of editing. Eclipse has greater speed in the change/compile/test cycle, which is very important. But the one that seems most important to me is that Emacs has no large pauses, where Eclipse does. I switch projects a lot, and every time I do it seems like Eclipse has to build itself for 10 minutes. IntelliJ doesn't have this problem, but that's because it does less up-front work; everything else is slower. When I go work with someone who uses Eclipse, we often spend the first 10 minutes just waiting for Eclipse to build the project. It's very frustrating. If we were using emacs, we would already be on our way.
  • Reliability.  Eclipse is a large moving part, and it is one that seems prone to failure. Once you get burned a couple of times by problems that exist only in Eclipse, getting that bit of unpredictability out of the system is a time-saver.  Emacs can get that way too, especially if you use something like jdee-mode.  I like to keep it relatively simple in emacs, though.
  • It's emacs. This is somewhat tautological, but the truth is that once emacs is handling several important tasks for you, you get more and more benefit by bringing activity into it. This is why I can read a mail in gnus (using nnimap), store it in my to-do list in org-mode, and then find some relevant piece of code in emacs and link it to the task in org-mode. It all fits together beautifully.  And whatever you need is there for you, somewhere.  Writing a javadoc and want to put a diagram in there?  Emacs can make it easy; use artist-mode.  There's so much stuff to discover here.

For programming in Java, I just use java-mode, but I've written a few things to make my life easier. I use anything-mode to quickly go to any Java file, ffap so I can quickly open up include files, and use M-x compile so I can jump to errors. I've also written a little import management system that I'll post on this blog shortly. So I find that I'm reasonably efficient.

All in all, my startup time is much quicker for emacs, but I lose time on editing. I'd recommend emacs over Eclipse if you tend to work on a variety of tasks and you use emacs now. If you spend most of your day writing Java, Eclipse is probably going to be better suited to that.

Saturday, September 27, 2008

Git in the office

I've been using git recently. I'm not alone: lots of open-source projects have started using it. Right now I use it on github, to keep track of my emacs customizations.

Git is a very interesting source control system to use, mainly because it makes it extremely easy to do branches and patching. Linus has a very interesting video on the rationale behind creating git, but to sum up his point in a few sentences: distributed source control, by allowing local modifications and local patches that can be easily merged, avoids the problems of latency and namespace pollution you get using centralized source control systems. Also, he says git is the best distributed one out there, because it is very fast. I really can't comment on whether it is the best; I haven't used darcs or any of the other distributed source control systems that have become popular.

Git can be useful for the corporate developer. When you have the ability to do local commits and local branches, development becomes more efficient. You can commit changes at notable points, so if you've spent the last few hours fruitlessly exploring an interesting idea, you can revert back to your last good commit and proceed without having to manually unwind your changes. Or, if you need to make a quick bugfix, you can just branch off of the current head branch, make a fix, then check it in and return to your previous branch. The ability to always stay in one directory tree can be an efficiency gain as well. And you can work offline, doing commits and everything you can do with a centralized source control system. When you get online and you are ready, you just push all those commits to some other git repository.

So, how can software companies use git? Companies have traditionally used centralized source control systems. I've used CVS (ugh), Perforce (not bad), StarTeam (somewhat yucky, but tolerable), and Visual SourceSafe (flaky) at various companies. It's easy to comprehend a centralized system like this; you have your code, it's in one place, and that's all. Git can be like that too, in a way. There could be one central git repository, and each developer would have their own local repository. Developers could do their own commits on their machine, and when they have something worth checking into the main repository, they do a "git push" to push it there. Or, instead, someone in charge of the next release specifically pulls relevant changelists from developers' machines to create a coherent release. This is basically what open-source projects do. This would work for small companies, which don't need anything more complicated than that.

If small companies could use git like a centralized repository system, large companies can use it in more interesting ways.

Large companies often have distributed development offices where the pain of using a centralized source repository system is fairly large. Large companies also have lots of source, which they tend to put in one repository. After a certain point, these repositories get slow. Repository slowness plus network lag for remote offices creates a real issue for distributed teams. Git can help by allowing each team to have its own local repository. Changes are aggregated from each developer's git repository to a local team repository. When it's time to do a release, or at regular intervals, each team's repository is merged with the release repository. Similarly, each team has to make sure their repository is in sync with the release repository, because that one has all relevant changes.

There is a disadvantage to this: with distributed repositories, merges don't happen on every checkin. Merges get put off, and when they happen the conflicts are typically worse. In such situations, it should be the responsibility of the individual teams to fix any merge problems that happen when either pushing or pulling their changes.

There would be some great advantages. If you run a continuous build, you only see your own build breakages, not others'. The release branch can merge only repositories that are passing their tests, and therefore it always gets changes that are good. Of course, it should really run its own continuous build. So you need more build machines, but you get more developer productivity, since any developer's local build is not affected by other developers' build breakages.

From what I've been told, Microsoft has a similar tiered source control system for Windows. For any sufficiently complicated project, such a system is in fact a necessity.

Could this work? I've never seen it happen, but git and distributed source control in general are relatively new. The problem is that the benefits are greatest when the company is large, but the decision on what system to use is made when the company is small, where the benefits are less obvious. It may be that some companies will adopt it for certain projects where it makes sense, but not universally.

Saturday, September 13, 2008

Why emacs can survive without automated tests

Automated tests are an accepted best practice in software engineering, and it's hard to find a recently-formed, well-run software organization these days that doesn't have some form of automated series of small tests run against their codebase.

The argument for automated tests is relatively simple: without them, there is no easy way to verify that a release of a project works. Manual testing is slow and expensive, and all the work you put into it is lost when the next release comes out. Furthermore, with frequently run automated tests, you can find out very quickly which changes caused which bugs, and so revert or fix those changes.

Given all that, how does emacs survive with a huge codebase and no tests?

The best thing it has going for it is that it's a desktop app. That means that users have to upgrade from one version to the next, as opposed to web apps in which all users are upgraded when the server version changes. Since users have to upgrade, and are in control of which version they get, users' versions range from very old to a build of yesterday's CVS head. That means that a certain percentage of users are going to be on the bleeding edge. Those users are either contributors on the project, or those that can tolerate increased bugginess.

Another key is that emacs is a tool people use every day. In fact, it's fair to say that everyone who is contributing to emacs is probably using emacs as their development environment. So those contributing are using the CVS head, and are actively using the latest version.

Let's assume that 1% of the emacs-using population is running off of a recent version of CVS head. So, 1% is exposed to the latest code, and since these same people use it all the time, to do a diverse variety of things, they are testing it continuously, while getting their work done at the same time. Not only are they testing it, but since these are power users, they know how to fix most problems they encounter.

For example, imagine two developers, each working off of CVS head. One checks in a change that breaks dired. The second will get the latest changes, build it, run it, work on it, notice the error, know how to fix it, and can submit a fix immediately.

Eventually, at some time period after any particular change is made, it can be assumed to be safe, because enough emacs users have been running systems with this change in, and have not had a problem.

However, this implies that if emacs ever wants a stable version, it has to stop non-bugfix checkins, otherwise there will always be a change that hasn't been fully vetted yet. And this is what emacs does - for a new release there is a long phase in which no new functionality is put in.
It's worth noting that this phase is particularly long. Emacs release cycles for major versions can last years.

Let's imagine a world in which emacs did have automated tests of reasonable coverage. What would change?

First, the codebase would be more stable overall, so if you are an emacs contributor, you can expect a less buggy environment. This certainly would make contributing more efficient, since you can concentrate on functionality, not fixing bugs.

Secondly, the feature freeze stage could be significantly shortened. If the unit tests were extremely comprehensive, one could imagine that no feature freeze would be necessary.

All in all, then, in exchange for the considerable effort of developing unit tests for emacs, we can expect a codebase that can change much more rapidly and radically. Whether this is a tradeoff that makes sense I can only guess. Emacs is a mature product, not going through many radical changes anymore. But part of the reason that radical changes are not happening is not that they aren't needed, but that they are frankly just too hard. It would take a very brave developer to take on making emacs multithreaded, or switching to Scheme as a basis for elisp. That could change with unit tests. The tasks would still be hard, but they would not be so destabilizing anymore. And automated tests wouldn't take much bravery to write, so I think automated tests may be a path to allow the more radical changes in emacs's future.

What these tests would look like is a subject for another post.

Wednesday, August 06, 2008

Code is the easy part: Preventing data corruptions

Coding, coding, coding. There's an awful lot of attention paid to coding in the software engineering world. I read reddit's programming category a lot these days, and most software engineering posts are specifically about code. I think this is not because coding is the most important part of being a programmer, but because it is the most interesting and fun part. Code is easy to change. If you have a bug, you reproduce it and fix it, and then it goes away. Sometimes it's difficult, but even if it is, you fix it, make sure it doesn't happen again by writing test cases (preferably before you fix it), and then you forget about it.

Let's contrast the fixing of bugs in code with fixing corrupted data. Corrupted data in any reasonable system is probably going to be caused by a software bug. So, you first have to find out why the data is corrupted.  Of course, maybe the corruption is old, and the bug has been fixed.  Hard to say.  You have to investigate.  You could try and reproduce it, if you can figure out a likely scenario.  Most likely you will fail to reproduce it, and even more likely is that you know you will fail to reproduce it, so you don't try.  So instead you look at the code, seeing how it could happen.  This is a useful practice.  You can sometimes find the cause here.  Then you can write a test for it and fix it.  After you fix it, your work isn't done, of course.  You have to actually fix the corruptions.  This could be as simple as running a SQL query, or as complicated as writing a script to patch things up using a code library to do the work.  Of course, sometimes you never can find out why your data is corrupted, so you just have to fix it, if you can figure out how.

Some amount of corrupted data is an inevitability, and in fact some may come from design decisions.  For example, some database systems cannot do two-phase commits, and if you need to hook into one, you may have to accept a certain amount of data corruption due to not having atomicity in your transactions.  If the error rate is very low, some corruption may be a fair price to pay for whatever benefits the second system is getting you.  Even so, this is dangerous, and a low error rate today may be a severe error rate tomorrow, leaving you with a lot of angry customers with corrupted data, and a few dejected developers who have to clean it all up.
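To make concrete what atomicity buys you when everything does live in one transactional system, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the CHECK constraint, and the helper names (make_accounts_db, transfer, balances) are all made up for illustration; this is not code from any real system.

```python
import sqlite3


def make_accounts_db():
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " id INTEGER PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Credit dst, then debit src; either both updates are kept or neither is."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # If this debit would make the balance negative, the CHECK
            # constraint raises IntegrityError and the credit is undone too.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {account_id: balance} for inspection."""
    return dict(conn.execute("SELECT id, balance FROM accounts"))
```

If the credit and the debit lived in two separate systems with no shared transaction, a failure between the two steps would leave exactly the kind of half-applied state described above.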

There are a few best practices you can follow to avoid data corruption:
  • Use one transactional system with ACID properties, and use it correctly.
  • When using SQL, use foreign key references whenever possible. 
  • Before saving data, assert its correctness as a precondition.  This includes both the values stored and the relationship of the data to other data, both data that links to it and data it links to.
  • Create a data checker that will run a series of tests on your data.  This is basically like unit-tests for data, but you can run it not only after a series of tests, but in your production system too.  Run this program regularly, and pay attention to the output.  You want to be alerted to any changes in the sanity of your data.  Like unit tests, the more invariants you encode into this tool, the better off you will be.  When changing or adding data, modify the data checker code.  After each QA cycle, run the data checker.  Any errors should be entered as bugs.
  • If your data can be repaired, have an automated data repairer.  This shouldn't be run regularly, because you don't want to get too complacent about your corruptions.  Instead, if you notice that the data checker has picked up some new errors, then you run this, modifying it first if the errors are of a new type.
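As a rough sketch of the data checker idea from the list above, here is one possible shape for it in Python with sqlite3. The customers and orders tables, the two invariants, and the function names are assumptions invented for this example; a real checker would encode whatever invariants your own data has.

```python
import sqlite3

# Each check pairs a description with a query that returns the ids of
# rows violating the invariant. An empty result means the data is sane.
CHECKS = [
    (
        "order refers to a missing customer",
        "SELECT o.id FROM orders o"
        " LEFT JOIN customers c ON c.id = o.customer_id"
        " WHERE c.id IS NULL ORDER BY o.id",
    ),
    (
        "order total is negative",
        "SELECT id FROM orders WHERE total < 0 ORDER BY id",
    ),
]


def check_data(conn):
    """Run every invariant; return a list of (description, offending_ids)."""
    problems = []
    for description, query in CHECKS:
        bad_ids = [row[0] for row in conn.execute(query)]
        if bad_ids:
            problems.append((description, bad_ids))
    return problems


def make_example_db():
    """Hypothetical data with two deliberate corruptions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
        INSERT INTO customers VALUES (1, 'alice');
        INSERT INTO orders VALUES (10, 1, 250);   -- healthy row
        INSERT INTO orders VALUES (11, 99, 100);  -- orphan: no customer 99
        INSERT INTO orders VALUES (12, 1, -5);    -- negative total
        """
    )
    return conn
```

Calling check_data(make_example_db()) reports order 11 as an orphan and order 12 as having a negative total. The example schema deliberately omits a foreign key constraint so the orphan can exist; with the constraint declared and enforced, that first check should never fire, which is the point of using foreign keys wherever possible.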

Doing a good job on all these tips should prevent most data corruption, but not all.  Like bugs, even the best preventative measures will not guarantee success.

Having clean data is extremely important.  This data is not yours; it is your customers', and they trust you with it.  You need to protect it, and it isn't easy, but preventing data corruptions is always the right thing to do.

In org-mode, abandoning GTD

I've already mentioned the problem with contexts in the GTD system. In using org-mode, I've come to realize that another important concept of GTD is either flawed or redundant: the next action.

Next actions seem good at first. Every project has a next action, a short, doable task that will advance the state of the project. Of course, some projects have several actions you can do in parallel, so there are several next actions. As the number of projects you have grows, you get more and more next actions. At some point, the number of next actions becomes too much to keep track of. Some people might say this goes away with proper contexts. However, I've never found a good use for contexts, because in truth at any point I can do any of my next actions. Another strategy for getting rid of excessive projects is to move some to a someday/maybe folder. I think this is reasonable, but sometimes you know that in a particular week you are just going to work on a few different projects, perhaps because of deadlines or other prioritization. What then?

In org-mode, I solved this problem by using the agenda, and scheduling my next actions that I wanted to work on for the current day. I would then see a list of the next actions I had to accomplish that day. If I didn't get them done that day, the next day I'd move them up a day, to the current day. It ended up being a daily-planner-like system a lot like Tom Limoncelli recommends in Time Management for System Administrators. But with next actions.

Last week I had a revelation: scheduled items, for me, were the same as next actions! So I removed all next actions, just having states TODO, WAITING, and DONE. Like next actions, when I finish a scheduled item, the next item in the project becomes scheduled. I like my new system. It combines the flexibility of Limoncelli's day-planner system with the project-planning of GTD.

I spend my day as before, taking tasks from the day view of org-mode, and using a weekly review to schedule or de-schedule items from that view. I also add notes to tasks all the time, which proves to be helpful. I think I've benefitted from next actions, since they force you to think of actions at a level of granularity such that each task is specific, concrete, and immediately actionable. I treat tasks the exact same way now, even without next actions themselves.

For more info about org-mode, check out the talk on org-mode which happened after I invited Carsten Dominik to give a talk at Google about his system.