Lovely set of slides, Programming
the Virtual Infrastructure, by Paul Anderson of Edinburgh
Uni. This was his keynote at LISA last year.
The key premise is that when the first computers were built,
they were wired up and programmed at the machine code level; HLLs
were written so that others could work with them, to improve
productivity, and later for portability.
Paul then argues that system configuration should take the same
path, as with a highly virtualised infrastructure, you are now
telling the infrastructure what to create and where. Instead of
going MiniMRCluster mrCluster=new MiniMRCluster(), as
you do in Hadoop to create an in-process cluster for testing, you
can go
Cluster cluster=new Cluster();
cluster.add(new NameNode());
cluster.add(new DataNode());
cluster.add(new DataNode());
cluster.add(new DataNode());
cluster.start();
And then you are off with four virtualised hosts forming a small
distributed filesystem.
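To make that snippet a bit more concrete, here is a minimal sketch of what such a Cluster class could look like. Everything in it is an assumption for illustration: Cluster, Node, NameNode and DataNode here are hypothetical stand-ins and not the Hadoop classes of the same name, and start() only pretends to allocate hosts (the "vmN" names are made up) rather than talking to any real infrastructure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node model: illustrative types only, not Hadoop's classes.
interface Node {
    String role();
}

final class NameNode implements Node {
    @Override public String role() { return "namenode"; }
}

final class DataNode implements Node {
    @Override public String role() { return "datanode"; }
}

// Hypothetical cluster description: collects nodes, then "starts" them.
final class Cluster {
    private final List<Node> nodes = new ArrayList<>();
    private boolean started = false;

    // Nodes can only be added before the cluster has been started.
    Cluster add(Node node) {
        if (started) {
            throw new IllegalStateException("cluster already started");
        }
        nodes.add(node);
        return this;
    }

    // A real implementation would ask the infrastructure for one virtual
    // host per node. This sketch only returns a made-up host name for each.
    List<String> start() {
        started = true;
        List<String> hosts = new ArrayList<>();
        for (int i = 0; i < nodes.size(); i++) {
            hosts.add("vm" + i + ":" + nodes.get(i).role());
        }
        return hosts;
    }
}
```

With the four add() calls above, start() would hand back four entries, one per requested host. The interesting part is that the program only says what should exist; where the hosts come from is left to whatever sits behind start().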
The next issue becomes: what language to use? Paul (and
we) are clearly in the declarative world; Sun's Project Caroline
is more procedural. The declarative approach is good for monitoring;
the procedural one is easier for classic imperative programmers to
pick up. Where Paul does go a bit further is in arguing that
everything deployed in the infrastructure is in fact an agent, and
that instead of describing its state, you have to consider
describing/modelling its behaviour. That's a fairly interesting way
to look at things.
Having dabbled in Agent stuff two decades ago (!), what I remember
from those days is that agents have beliefs that have to be viewed
relative to that agent and the time they held them. There are no
facts, only beliefs. That is probably a more accurate view of the
world (my laptop thinks it needs a proxy; that belief was valid at
work, but not now that it is at home), but it also complicates a lot.
Question is, is it a necessary step?
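To show what "beliefs, not facts" might mean in code, here is a rough sketch under my own assumptions: Belief and BeliefLog are hypothetical names, not anything from Paul's slides or from an agent framework. Every statement is tagged with the agent that holds it and the time it was held, so a query has to say whose belief it wants and as of when.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical model: a claim held by one agent, recorded at one time.
final class Belief {
    final String agent;
    final String claim;
    final boolean value;
    final Instant heldAt;

    Belief(String agent, String claim, boolean value, Instant heldAt) {
        this.agent = agent;
        this.claim = claim;
        this.value = value;
        this.heldAt = heldAt;
    }
}

// Hypothetical store: there is no global truth, only per-agent history.
final class BeliefLog {
    private final List<Belief> beliefs = new ArrayList<>();

    void record(Belief belief) {
        beliefs.add(belief);
    }

    // Most recent belief the agent held about the claim at or before asOf.
    Optional<Belief> latest(String agent, String claim, Instant asOf) {
        Belief best = null;
        for (Belief b : beliefs) {
            boolean matches = b.agent.equals(agent) && b.claim.equals(claim);
            boolean inRange = !b.heldAt.isAfter(asOf);
            if (matches && inRange
                    && (best == null || b.heldAt.isAfter(best.heldAt))) {
                best = b;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

In the laptop example, the log would hold "needs-proxy is true" recorded at work and "needs-proxy is false" recorded later at home, and the answer depends on the time you ask about. Even this toy version shows the extra bookkeeping: every lookup now carries an agent and a timestamp.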