Leo Simons is thinking of a better
build system. Leo is the Keeper of the Gump, the man in charge
of the nightly build of everything, so he knows a lot about what
does and doesn't work. He also knows which projects break a lot
and whose fault it is (mine, usually).
I await his design with interest.
I've also been thinking a bit about build systems recently,
possibly from a different perspective. Here are my initial
observations:
- Build tools exist to implement the build processes of the time,
on the languages of the time.
- Different problems result in radically different tools
- Mostly, they are a form of damage limitation
- End users often end up seeing them, for late-binding build
stages on the target boxes.
Examples
Shell scripts
Easy to learn, very flexible. Rebuild times are awful; rarely
portable. If you come across someone building Java code from a .sh
file, run. No scalability.
Make
Make is one of the great innovations in Unix land; the authors
deserved that ACM award they got, though they should have then had
it taken away for using an invisible bit of whitespace effectively
as a keyword in the language. Make was built to minimise .c
rebuilds, and to that end it lets you declare the dependencies; the
little inference engine will work out the transition graph from
what you have to what you want, and get on with it. Some
scalability and makefile reuse, though usually they get pretty
scary.
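To make the "little inference engine" concrete, here is a minimal
sketch of the decision Make takes, written in Python rather than
Make's own syntax. The names plan_rebuild, rules and mtimes are mine,
not anything from Make, and the timestamps are made-up integers. It
assumes the dependency graph has no cycles and treats a missing
source file as infinitely old, where the real tool would stop with
an error.

```python
def plan_rebuild(goal, rules, mtimes):
    """Return the out-of-date targets, in the order they must be rebuilt.

    rules:  target -> list of prerequisites (files with no rule are sources)
    mtimes: file -> modification time (absent means the file does not exist)
    """
    plan = []
    will_rebuild = {}  # memo: name -> True if it ends up in the plan

    def visit(name):
        if name in will_rebuild:
            return will_rebuild[name]
        prereqs = rules.get(name)
        if prereqs is None:
            # A plain source file: nothing to build.
            will_rebuild[name] = False
            return False
        # Visit every prerequisite first, so they are planned before us.
        rebuilt = [visit(p) for p in prereqs]
        stale = (
            name not in mtimes                 # target is missing
            or any(rebuilt)                    # a prerequisite gets rebuilt
            or any(mtimes.get(p, 0) > mtimes[name] for p in prereqs)
        )
        will_rebuild[name] = stale
        if stale:
            plan.append(name)
        return stale

    visit(goal)
    return plan


rules = {
    "app": ["main.o", "util.o"],
    "main.o": ["main.c", "util.h"],
    "util.o": ["util.c", "util.h"],
}
# util.c was edited after util.o was last built; everything else is current.
mtimes = {"main.c": 10, "util.c": 30, "util.h": 5,
          "main.o": 20, "util.o": 20, "app": 25}
print(plan_rebuild("app", rules, mtimes))
```

Only util.o and then app end up in the plan; main.o is left alone,
which is the whole point of declaring the dependencies.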
imake
X10/X11 preprocessor for make. Written for making X compile
across many platforms. You can decouple platform-specific details
(compiler options) from the build target/dependency data. In damage
limitation terms, it supports a dev model in which a gold code
image, the MIT/Athena X source, gets ported to various boxes by their
vendor teams.
Microsoft NMake
This is an aberration. I can't see the point of this thing at
all. It's worst-in-class and probably exists to keep the FTEs in
building 10 or wherever in Redmond compiling the OS. In damage
limitation terms, it's an ugly hack that exists because the IDE
doesn't scale/go low level.
Autoconf+GNU make
This is state of the art in Make. It not only has the platform
independence of imake, it can probe the local machine and work out
what features do and don't work there. It exists to solve the
problem of building the GNU toolchain across platforms, without any
human intervention on the target systems. This is why it is so good
in Unix-land, where every box can be different. In Linux we have
more consistency of underlying CPU (usually 32 or 64 bit x86), but
wide variation in compiler versions and other OS features.
Although really a developer tool, it's the usual way to install
stuff in Linux that doesn't come packaged for you.
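The probing idea can be sketched in a few lines. This is not how
Autoconf is implemented (the real thing is m4 and shell, and mostly
works by compiling tiny test programs); it is a Python illustration
that only checks whether a tool is on the PATH. probe_tools and
render_config are hypothetical names of mine; the HAVE_ prefix copies
the convention you see in generated config headers.

```python
import shutil


def probe_tools(candidates, which=shutil.which):
    """Map each tool name to True/False depending on whether it is found.

    `which` can be swapped out so the probe is testable off the real box.
    """
    return {name: which(name) is not None for name in candidates}


def render_config(features):
    """Render the probe results as C-style HAVE_* defines, sorted by name."""
    lines = []
    for name, found in sorted(features.items()):
        macro = "HAVE_" + name.upper().replace("-", "_")
        lines.append(f"#define {macro} {1 if found else 0}")
    return "\n".join(lines)


if __name__ == "__main__":
    # The result depends entirely on the machine this runs on.
    print(render_config(probe_tools(["gcc", "cc", "make"])))
```

The build then reads the generated settings instead of hard-coding
what the author's own box happened to have.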
IDE hosted builds, such as Visual Studio
These came into being from the CASE tools of the eighties, the
idea that an integrated editor+builder+debugger was what we needed.
They are usually vendor-specific things that support individuals,
occasionally small teams (badly). Minimal portability between
users, let alone across platforms. An IDE-based project usually
ends up having a build box that is configured to do good builds.
The problem that it was trying to address was ease of
development.
Ant
Written because JDR could not make the Sun assumption: that
everyone had a Unix box with make underneath. It originally got a
toehold in OSS by being cross platform, where you couldn't mandate
what IDE to use. It also soon evolved to meet the current needs of
the apps: there was HTTP integration, it would run JUnit and create
reports, package things to JARs, do basic deployment. Ant's and
JUnit's successes probably went hand in hand: Ant made running
JUnit easy; having JUnit test cases was a reason to use Ant. So we
have moved the build process, from simple "build+package" to
"build+package+test+report". Although really a developer tool, it
gets bundled with so many things as a boot loader, startup script
or helper engine.
Maven
Maven is a rethink of the build process from Ant, one where the
problems are scale and repetition. Library management is handled
for you, because all projects have similar needs. You also have
similar goals: build, test, package, report. It's a very open-source
centric tool, which generates the web site with reports alongside
the deliverables. It's also strongly network aware; if your local
box lacks a plugin or JAR, Maven will get it. This is very
powerful.
apt
Debian's apt framework is more of a configuration management
(CM) tool than a build tool. But if you want something on your box
and it needs compiling, apt-get will do that for you too. It has
crossed the gulf from a build tool for developers to one for end
users. You ask for what you want; if it's available as a compatible
binary, you get that; if it needs compiling, that is done for you.
In damage limitation terms, what it addresses is the problem that
you usually need to rebuild stuff for your debian-unstable box, so
why not integrate that with downloading and installation?
If you look at the trends, then, the tools are going beyond
building things to testing and installation, and are also becoming
something end users encounter too. Which means the tools should
really think about error messages for end users, internationalised
text, etc.
But the other point is that the tools reflect the languages.
C/C++ apps were rarely portable across OS families, so a tool that
was Unix only or Windows only was fine, as long as it handled the
problem of getting header and library dependencies right. Java apps
were portable, so the build tool needed to be. Header file
dependencies were a non-issue, and because javac is itself quite
fast, you don't need to worry about other dependencies so much;
just do a clean build and be done with it. Instead Ant and Maven
can focus on testing and packaging.
So where next with build tools? Well, they aren't 'build tools',
they are really download-test-install tools; they are delivery
tools. Or they are configuration management tools. Once we move
(back) to languages that don't need compilation, there is no need
to compile code before distribution. Instead you redistribute your
source with something that declares other dependencies, tests to
run, how to deploy the thing. Whenever someone wants to install an
app, they give the runtime a pointer to the program; the runtime
will pull down
the source+descriptor, run through the local tests (validating the
installation is good), then "install" it for other dependents to
use. If you are doing iterative dev, all you need is the test suite
running, continually.
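Here is a rough sketch of what that descriptor-driven install could
look like. Nothing here is an existing tool: Descriptor, install and
the in-memory repository dict are all invented for illustration, the
"download" is just a dictionary lookup, and the tests are plain
callables that return True when the local installation looks good.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Descriptor:
    """What ships alongside the source: dependencies and tests to run."""
    name: str
    dependencies: List[str] = field(default_factory=list)
    tests: List[Callable[[], bool]] = field(default_factory=list)


def install(name: str, repository: Dict[str, Descriptor],
            installed: List[str]) -> None:
    """Install `name`, pulling in its dependencies first.

    A package is only registered in `installed` once its own tests pass.
    """
    if name in installed:
        return
    descriptor = repository[name]  # stands in for "pull down source+descriptor"
    for dep in descriptor.dependencies:
        install(dep, repository, installed)
    if not all(test() for test in descriptor.tests):
        raise RuntimeError(f"local tests failed for {name}")
    installed.append(name)  # now available for other dependents to use


repository = {
    "app": Descriptor("app", dependencies=["lib"], tests=[lambda: True]),
    "lib": Descriptor("lib"),
}
installed: List[str] = []
install("app", repository, installed)
print(installed)
```

The dependency goes in first and the app only gets registered after
its tests pass; a failing test leaves it uninstalled, which is the
"validating the installation is good" step.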
With releases being stuff that gets rebuilt by the recipients,
your release process changes. A release is just a tag in the SCM
repository that is registered (somewhere) as a release; the
downstream things handle the work.
And there, the deployment engine is the build tool, or at least
choreographs it. If your CM system continually strives to keep the
system in the declared state, then changing the state declaration
to a new version of a dependency will trigger a download, checkout
and redeploy. That is, continual integration and continual deployment
become the same thing. Gump just becomes a deployment using
CVS_HEAD and SVN_HEAD versions of code.
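The "strive towards the declared state" loop is simple enough to
sketch. Again this is an illustration, not any real CM system:
reconcile and apply_actions are names I made up, the state is just a
dict of component name to version string, and "deploy" here only
updates that dict where a real engine would download, check out and
redeploy.

```python
def reconcile(declared, actual):
    """Compare declared and actual state; return the actions needed."""
    actions = []
    for name, version in sorted(declared.items()):
        if actual.get(name) != version:
            actions.append(("deploy", name, version))
    for name in sorted(actual):
        if name not in declared:
            actions.append(("remove", name, actual[name]))
    return actions


def apply_actions(actions, actual):
    """Carry out the actions against the (in-memory) actual state."""
    for verb, name, version in actions:
        if verb == "deploy":
            actual[name] = version
        else:
            del actual[name]


declared = {"ant": "HEAD", "junit": "3.8"}
actual = {"ant": "1.6", "old-lib": "1.0"}

actions = reconcile(declared, actual)
print(actions)
apply_actions(actions, actual)
print(reconcile(declared, actual))  # nothing left to do once converged
```

Changing one version string in the declaration is all it takes to
trigger a redeploy on the next pass, and pointing every entry at HEAD
is more or less the Gump case.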