Steve: Developing on the Edge - Future build systems
Thoughts on development, Web-services, technology and mountains.
Tue 7 Feb 2006
Future build systems

Leo Simons is thinking of a better build system. Leo is the Keeper of the Gump, the man in charge of the nightly build of everything, so he knows a lot about what does and doesn't work. He also knows which projects break a lot and whose fault it is (mine, usually).

I await his design with interest.

I've also been thinking a bit about build systems recently, possibly from a different perspective. Here are my initial observations:

  1. Build tools exist to implement the build processes of the time, on the languages of the time.
  2. Different problems result in radically different tools.
  3. Mostly, they are a form of damage limitation.
  4. End users often end up seeing them, for late-binding build stages on the target boxes.

Examples

Shell scripts

Easy to learn, very flexible. Rebuild times are awful, and the scripts are rarely portable. If you come across someone building Java code from a .sh file, run. No scalability.

Make

Make is one of the great innovations in Unix land. The authors deserved that ACM award they got, though they should then have had it taken away for using an invisible bit of whitespace effectively as a keyword in the language. Make was built to minimise .c rebuilds, and to that end it lets you declare the dependencies; the little inference engine will work out the transition graph from what you have to what you want, and get on with it. There is some scalability and makefile reuse, though the makefiles usually get pretty scary.
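
As a rough illustration of that idea (not how Make itself is implemented), here is a Python sketch of a dependency-driven rebuild. The function name `out_of_date`, the `rules` table, the file names and the timestamps are all hypothetical; modification times are passed in as plain numbers so the example runs without touching the filesystem.

```python
def out_of_date(target, rules, mtimes, plan=None):
    """Return the ordered list of targets that need rebuilding.

    rules  maps a target to the list of things it depends on.
    mtimes maps a file name to its modification time; a missing
           entry means the file does not exist yet.
    """
    if plan is None:
        plan = []
    deps = rules.get(target, [])
    # Visit dependencies first, so they land in the plan before us.
    for dep in deps:
        out_of_date(dep, rules, mtimes, plan)
    if target not in rules:
        return plan  # a plain source file: nothing to build
    stale = (
        target not in mtimes                      # never built
        or any(d in plan for d in deps)           # a dependency will be rebuilt
        or any(mtimes.get(d, 0) > mtimes[target] for d in deps)  # a dependency is newer
    )
    if stale and target not in plan:
        plan.append(target)
    return plan


# Hypothetical project: two C files linked into one program.
rules = {
    "app": ["main.o", "util.o"],
    "main.o": ["main.c", "util.h"],
    "util.o": ["util.c", "util.h"],
}
# util.c was edited (time 50) after util.o was built (time 20).
mtimes = {
    "main.c": 10, "util.c": 50, "util.h": 10,
    "main.o": 20, "util.o": 20, "app": 30,
}
plan = out_of_date("app", rules, mtimes)
print(plan)  # ['util.o', 'app']
```

Only util.o and the final link are redone; main.o is left alone, which is the whole point of declaring the dependencies.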

imake

An X10/X11 preprocessor for make, written to make X compile across many platforms. You can decouple platform-specific details (compiler options) from the build target/dependency data. In damage limitation terms, it supports a dev model in which a gold code image, the MIT/Athena X source, gets ported to various boxes by their vendor teams.

Microsoft NMake

This is an aberration. I can't see the point of this thing at all. It's worst-in-class and probably exists to keep the FTEs in building 10 or wherever in Redmond compiling the OS. In damage limitation terms, it's an ugly hack that exists because the IDE doesn't scale or go low level.

Autoconf+Gnu make

This is the state of the art in Make. It not only has the platform independence of imake, it can probe the local machine and work out what features do and don't work there. It exists to solve the problem of building the GNU toolchain across platforms, without any human intervention on the target systems. This is why it is so good in Unix-land, where every box can be different. In Linux we have more consistency of underlying CPU (usually 32- or 64-bit x86), but wide variation in compiler versions and other OS features. Although really a developer tool, it's the usual way to install stuff in Linux that doesn't come packaged for you.
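
The probing idea can be sketched in a few lines of Python. This is only an analogy, assuming that "feature" means "program on the PATH"; a real configure script compiles and runs small test programs as well. The helper names `probe` and `choose_compiler` and the preference order are my own inventions; `shutil.which` is the standard library call that does the lookup.

```python
import shutil
import sys


def probe(programs, which=shutil.which):
    """Check which of the named programs exist on this machine's PATH.

    `which` is injectable so the probe can be exercised without
    depending on what happens to be installed locally.
    """
    found = {}
    for name in programs:
        found[name] = which(name)  # None when the program is missing
    return found


def choose_compiler(found, preferred=("gcc", "clang", "cc")):
    """Pick the first available compiler from a preference list."""
    for name in preferred:
        if found.get(name):
            return name
    return None


if __name__ == "__main__":
    found = probe(["gcc", "clang", "cc", "make"])
    # The resulting settings depend entirely on the machine this runs on.
    config = {
        "compiler": choose_compiler(found),
        "have_make": found["make"] is not None,
        "platform": sys.platform,
    }
    print(config)
```

The build then reads the generated settings instead of hard-coding them, so nobody has to edit anything by hand on the target system.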

IDE hosted builds, such as Visual Studio

These came into being from the CASE tools of the eighties: the idea that an integrated editor+builder+debugger was what we needed. They are usually vendor-specific things that support individuals, occasionally small teams (badly). There is minimal portability between users, let alone across platforms. An IDE-based project usually ends up having a build box that is configured to do good builds. The problem they were trying to address was ease of development.

Ant

Written because James Duncan Davidson could not make the Sun assumption: that everyone had a Unix box with make underneath. It originally got a toehold in OSS by being cross-platform, in a world where you couldn't mandate what IDE to use. It also soon evolved to meet the current needs of the apps: there was HTTP integration, it would run JUnit and create reports, package things into JARs, and do basic deployment. Ant's and JUnit's successes probably went hand in hand: Ant made running JUnit easy; having JUnit test cases was a reason to use Ant. So we have moved the build process from simple "build+package" to "build+package+test+report". Although really a developer tool, it gets bundled with many things as a boot loader, startup script or helper engine.

Maven

Maven is a rethink of the build process from Ant, one where the problems are scale and repetition. Library management is handled for you, because all projects have similar needs. You also have similar goals: build, test, package, report. It's a very open-source-centric tool, which generates the web site with reports alongside the deliverables. It's also strongly network aware; if your local box lacks a plugin or JAR, Maven will get it. This is very powerful.
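
The "if it isn't here, go and get it" behaviour is simple to sketch. In this Python illustration, plain dictionaries stand in for the local cache directory and the remote server, and the artifact names are made up; it shows the lookup order only, not Maven's actual resolution logic.

```python
def resolve(artifact, local_repo, remote_repo):
    """Return the artifact, fetching and caching it if it is missing locally."""
    if artifact not in local_repo:
        if artifact not in remote_repo:
            raise LookupError(f"{artifact} not found in any repository")
        # Stand-in for the download: copy from remote into the local cache.
        local_repo[artifact] = remote_repo[artifact]
    return local_repo[artifact]


# Hypothetical repositories.
remote = {"junit-3.8.1.jar": b"jar bytes", "log4j-1.2.jar": b"more bytes"}
local = {}

resolve("junit-3.8.1.jar", local, remote)   # first use: fetched and cached
resolve("junit-3.8.1.jar", local, remote)   # second use: served from the cache
print(sorted(local))  # ['junit-3.8.1.jar']
```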

apt

Debian's apt framework is more of a configuration management (CM) tool than a build tool. But if you want something on your box and it needs compiling, apt-get will do that for you too. It has crossed the gulf from a build tool for developers to one for end users. You ask for what you want; if it's available as a compatible binary, you get that; if it needs compiling, that is done for you. In damage limitation terms, what it addresses is the problem that you usually need to rebuild stuff for your debian-unstable box, so why not integrate that with downloading and installation?

If you look at the trends, then, the tools are going beyond building things into testing and installation, and are also becoming something end users encounter. This means the tools should really think about error messages for end users, internationalised text, etc.

But the other point is that the tools reflect the languages. C/C++ apps were rarely portable across OS families, so a tool that was Unix-only or Windows-only was fine, as long as it handled the problem of getting header and library dependencies right. Java apps were portable, so the build tool needed to be too. Header file dependencies were a non-issue, and because javac is itself quite fast, you don't need to worry about other dependencies so much; just do a clean build and be done with it. Instead Ant and Maven can focus on testing and packaging.

So where next with build tools? Well, they aren't 'build tools'; they are really download-test-install tools. They are delivery tools, or they are configuration management tools. Once we move (back) to languages that don't need compilation, there is no need to compile code before distribution. Instead you redistribute your source with something that declares other dependencies, tests to run, and how to deploy the thing. Whenever someone wants to install an app, they give the runtime a pointer to the program; the runtime will pull down the source+descriptor, run through the local tests (validating that the installation is good), then "install" it for other dependents to use. If you are doing iterative dev, all you need is the test suite running, continually.
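
To make that concrete, here is a minimal Python sketch of such a runtime. Everything in it is hypothetical: the descriptor layout (`depends`, `tests`), the package names and the `install` function are invented for illustration, and the "tests" are stand-ins that always pass.

```python
def install(name, registry, installed, log=None):
    """Install `name` and everything it depends on, testing as we go.

    registry  maps a package name to its descriptor.
    installed is the set of packages already available locally.
    Returns the names installed by this call, in installation order.
    """
    if log is None:
        log = []
    if name in installed:
        return log
    descriptor = registry[name]
    # Dependencies are installed (and tested) before the package itself.
    for dep in descriptor["depends"]:
        install(dep, registry, installed, log)
    # Run the package's own tests against the local installation.
    for test in descriptor["tests"]:
        if not test():
            raise RuntimeError(f"{name}: local test failed, not installing")
    installed.add(name)
    log.append(name)
    return log


# Hypothetical descriptors, as they might be pulled down with the source.
registry = {
    "webapp": {"depends": ["templates", "db-driver"], "tests": [lambda: True]},
    "templates": {"depends": [], "tests": [lambda: True]},
    "db-driver": {"depends": [], "tests": [lambda: True]},
}
installed = {"templates"}
order = install("webapp", registry, installed)
print(order)  # ['db-driver', 'webapp']
```

A package only becomes visible to its dependents after its tests have passed on this particular box, which is the validation step a prebuilt binary cannot give you.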

With releases being stuff that gets rebuilt by the recipients, your release process changes. A release is just a tag in the SCM repository that is registered (somewhere) as a release; the downstream things handle the work.

And there, the deployment engine is the build tool, or at least choreographs it. If your CM system continually strives to keep the system in the declared state, then changing the state declaration to a new version of a dependency will trigger a download, checkout and redeploy. That is, continual integration and continual deployment become the same thing. Gump just becomes a deployment using CVS_HEAD and SVN_HEAD versions of code.
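
A sketch of that convergence loop, in Python, under the assumption that the declared and actual states are simple name-to-version tables. The `reconcile` function, the component names and the version strings are illustrative only, and the `deploy` callback stands in for the real download, checkout and redeploy.

```python
def reconcile(declared, actual, deploy):
    """Bring `actual` in line with `declared`, one component at a time.

    declared and actual map component names to version strings.
    deploy(name, version) stands in for download, checkout and redeploy.
    Returns the list of (name, version) pairs that were redeployed.
    """
    changed = []
    for name, version in sorted(declared.items()):
        if actual.get(name) != version:
            deploy(name, version)
            actual[name] = version
            changed.append((name, version))
    return changed


declared = {"ant": "1.6.5", "junit": "3.8.1"}
actual = {"ant": "1.6.5", "junit": "3.8.1"}

# Nothing to do while the system matches its declaration.
print(reconcile(declared, actual, lambda n, v: None))  # []

# Editing the declaration is the only step needed to trigger a redeploy.
declared["junit"] = "HEAD"
print(reconcile(declared, actual, lambda n, v: None))  # [('junit', 'HEAD')]
```

Run that in a loop against declarations that all say HEAD and you have something that looks a lot like a nightly build of everything.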

Comments

On 9 February 2006 at 02:50 tom commented:
You may also want to look at "jam" from the folks at www.perforce.com. I am no expert in build tools, but those who know more than me say it is a well-designed step forward from make.