I did some Prolog programming last year, code to walk the Apache
artifact repository to look for problems. It was kind of fun going
back, though I need to finish the work with the full dataset. What
was not so good was picking up a copy of The Art of Prolog from 15
years ago and finding some of my undergraduate coursework as one of
the examples. That made me feel old, as did the idea of having
studied CS twenty years ago, which should be a long time in the
history of computing. But
what's changed? C-derived languages are still mainstream, x86 CPUs
are everywhere, and PCs still crash on a regular basis. The big,
profound change is the Web, but CS has yet to come round and
embrace REST as the one reliable way of building large-scale
distributed systems. If you spend time in the grid world, you can
see that distributed objects are still highly regarded, despite
some people's best efforts.
Which brings me to Erlang. Is it (a) a step forwards, or (b) a
step back?
I haven't been doing much in terms of Erlang-play, as I've been
distracted at work (new release) and at home (stolen MTB, major
injuries). But I have toyed with ErlyBird, the NetBeans-based IDE,
and the basics of parsing a few megabytes worth of bluetooth device
IDs and timestamps.
Tim Bray immediately hit the brick wall of "This is not a C-derived
language"; it has much of the Prolog punctuation, and its model of
unification of variables and atoms. But Erlang is not Prolog: it
doesn't backtrack except in the condition clauses in case
statements, all of which are restricted to functions that are
side-effect free. Someone in
the Erlang team's clearly been burned by the classic Prolog problem
of an error sending the logic engine off into some corner of your
assertion-space, backtracking wildly.
There's a price for this change, which is that Erlang is less
graph-centric than Prolog. Erlang gives you lists, tuples and
object instances (Processes) to which you can send such datatypes,
but less of the depth-first-search unification algorithm which
makes working with [some] graph structures so trivial. "Some" is
important: breadth-first searches and cyclic graphs are still
troublesome. So what you get from Erlang is a bit of the
unification, less of the graph, and more of the functional
programming world view, albeit without Standard ML's polymorphic
type inference, questions on how to implement that being the
nightmare of my CS degree exams.
For many of those developers who didn't study under Milner,
Functional Programming is going to be new, even if they struggled
through STL C++, which, IMO, is often the worst of both worlds: an
attempt to retrofit functional programming to a language that has
already had OO patched in badly. But what you gain from Erlang is
distribution, either within a host or across machines. Which is
always cute.
Tim: the way to build up data structures in a functional
language is to pass down something that gets built up, either on
the way down or the way back. If you can write functions that take
a fragment of the log and aggregate that, and another function to
merge the results from the analysis of two such fragments, then you
are on your way to doing map reduction, without having to set up
Hadoop.
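To make that concrete, here is a minimal sketch of the aggregate-and-merge
idea. It is in Python rather than Erlang, and the log format (one
"device-id timestamp" pair per line) and the function names aggregate,
merge and analyse are my own assumptions for illustration, not anything
from Tim's code. The accumulator is passed along and a new one is returned
at each step, which is how you would thread state through a fold in a
functional language.

```python
from functools import reduce


def aggregate(fragment):
    """Fold over one fragment of the log, counting sightings per device.

    Assumes each line looks like "<device-id> <timestamp>" (hypothetical).
    """
    def step(counts, line):
        parts = line.split()
        if len(parts) != 2:
            return counts  # skip malformed lines, accumulator unchanged
        device_id, _timestamp = parts
        # Return a new accumulator instead of mutating the old one.
        return {**counts, device_id: counts.get(device_id, 0) + 1}

    return reduce(step, fragment, {})


def merge(left, right):
    """Combine the results from the analysis of two fragments."""
    return {key: left.get(key, 0) + right.get(key, 0)
            for key in left.keys() | right.keys()}


def analyse(fragments):
    """Map aggregate over every fragment, then reduce with merge."""
    return reduce(merge, map(aggregate, fragments), {})
```

Because merge is associative and aggregate only ever sees its own
fragment, the map step could be handed out to separate processes or
machines without changing the result.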
What is interesting is the review of the overall experience from
Tim and especially Cedric Beust. Yes, even with some NetBeans
integration, working with
Erlang felt a bit like going back in time to a Prolog or SML shell,
manually running things and looking at the results.
That may integrate well under Emacs, but it misses out on the
main improvement of modern software development: test-centric
development with IDEs and build tools that have embraced the same
methodology.
This is tractable, and possibly tractable in ways that are so
profound that they even make TestNG look primitive. Take, for
example, PLUnit,
the test framework of SWI Prolog. PLUnit lets you turn every clause
resolution into an assertion, with the ability to declare that a
clause should pass once, pass multiple times, or fail with a
specific cause. It also does coverage analysis and can build up a
test suite from the command line.
Erlang needs something like that, or, if it already has it, it
needs to be embraced to become the central way of developing.
Instead of running your commands in the shell, you write your tests
and keep an eye on the build status.
This is something I've been thinking of for a while: how
test-centric, deployment-centric development with interpreted
languages should change how you code. Right from the outset you'd
write some tests that describe the functional behaviour of the
deployed system, then you'd set something up to run those tests
all the time. Current IDEs compile and analyse your code in the
background, and flag errors. But imagine if they also continually
highlighted which clauses didn't have coverage, if they
were triggering deployment and test runs on five different
(virtualized) machines, and could flag which functions were failing
on the Mac but not Windows. There'd be a console if you needed it,
but it would be as secondary to the development process as actually
manually viewing the application/web site is in today's
test-centric teams. You look at the site to make sure it is pretty,
not that it works, because that is the job of tools like Selenium.
I don't know where I'd get started with Erlang on this. I'd
probably go for an Erlang component for SmartFrog that continually
monitors a directory and rebuilds and retests whenever something
changes. I'd feed the results into the test result framework, a
framework that is intended to support multiple underlying test
frameworks and serialise the results over RMI, then into XML or
XHTML. So whenever you saved a file, after a short pause
(CruiseControl-style), I'd trigger a retest. You'd point your web
browser at the results page, hit reload and get the latest results.
Yes, that would be the way to build. The reason for doing this in
SmartFrog is that production deployment would follow on naturally:
if it worked locally, you could deploy it to the
staging systems and test there next. Or you'd deploy something on
the production site to run the same workflow every night, against
the SVN repository. There wouldn't be a build process any more,
just a deploy-and-test, with policies in place to go live on the
production site only when staging worked.
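The watch-and-retest loop itself is simple enough to sketch. What follows
is not a SmartFrog component, just a rough Python stand-in under my own
assumptions: it polls file modification times instead of using any OS
notification API, and run_tests is a hypothetical callback for whatever
rebuilds and retests the code. The quiet_period parameter is the short
CruiseControl-style pause: a retest fires only once the directory has
stopped changing for that long.

```python
import os
import time


def snapshot(root):
    """Map every file under root to its last-modified time."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                state[path] = os.stat(path).st_mtime
            except OSError:
                pass  # file vanished between listing and stat
    return state


def watch(root, run_tests, quiet_period=2.0, poll_interval=0.5, max_runs=None):
    """Call run_tests() after root changes and then stays quiet.

    run_tests is a hypothetical callback; max_runs exists only so the
    loop can be stopped in a demo or test (None means run forever).
    """
    previous = snapshot(root)
    changed_at = None  # time of the most recent observed change, if any
    runs = 0
    while max_runs is None or runs < max_runs:
        time.sleep(poll_interval)
        current = snapshot(root)
        now = time.monotonic()
        if current != previous:
            # Something was saved, added or deleted: restart the quiet timer.
            changed_at = now
            previous = current
        elif changed_at is not None and now - changed_at >= quiet_period:
            run_tests()
            runs += 1
            changed_at = None
```

Something like watch("src", run_tests) would then sit in the background
while you edit; publishing the results to a page you can reload in the
browser is left out of the sketch.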