Steve: Developing on the Edge - Languages and built-test-processes
Steve: Developing on the Edge
Thoughts on development, Web-services, technology and mountains.
Sunday, 23 September 2007
Languages and built-test-processes

I did some Prolog programming last year, code to walk the Apache artifact repository to look for problems. It was kind of fun going back, though I need to finish the work with the full dataset. What was not so good was picking up a copy of The Art of Prolog from 15 years ago, and finding some of my undergraduate coursework as one of the examples. That made me feel old. Just the idea of having studied CS twenty years ago, which should have been a long time ago in the history of computing. But what's changed? C-derived languages are still mainstream, x86 CPUs are everywhere, and PCs still crash on a regular basis. The big profound change is the Web, but CS still has to come round and embrace REST as the one reliable way of building large-scale distributed systems...if you spend time in the grid world, you can see that distributed objects are still highly regarded, despite some people's best efforts.

Which brings me to Erlang. Is it (a) a step forwards, or (b) a step back?

I haven't been doing much in terms of Erlang-play, as I've been distracted at work (new release) and at home (stolen MTB, major injuries). But I have toyed with ErlyBird, the NetBeans-based IDE, and the basics of parsing a few megabytes' worth of Bluetooth device IDs and timestamps.

Tim Bray immediately hit the brick wall of "This is not a C-derived language"; it has much of the Prolog punctuation, and its model of unification of variables and atoms. But Erlang is not Prolog: it doesn't backtrack except in the condition clauses in case statements, all of which are restricted to functions that are side-effect free. Someone in the Erlang team has clearly been burned by the classic Prolog problem of an error sending the logic engine off into some corner of your assertion-space, backtracking wildly.

There's a price for this change, which is that Erlang is less graph-centric than Prolog. Erlang gives you lists, tuples and object instances (Processes) to which you can send such datatypes, but less of the depth-first-search unification algorithm which makes working with [some] graph structures so trivial. The "some" is important: breadth-first searches and cyclic graphs are still troublesome. So what you get from Erlang is a bit of the unification, less of the graph, and more of the functional programming world view, albeit without Standard ML's polymorphic type inference, questions on how to implement that being the nightmare of my CS degree exams.

For many of those developers who didn't study under Milner, Functional Programming is going to be new, even if they struggled through STL C++, which, IMO, is often the worst of both worlds: an attempt to retrofit functional programming to a language that has already had OO patched in badly. But what you gain from Erlang is distribution, either within a host or across machines. Which is always cute.

Tim: the way to build up data structures in a functional language is to pass down something that gets built up, either on the way down or the way back. If you can write functions that take a fragment of the log and aggregate that, and another function to merge the results from the analysis of two such fragments, then you are on your way to doing map reduction, without having to set up Hadoop.
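To make that concrete, here is a minimal sketch of the aggregate-then-merge idea, written in Python rather than Erlang purely for brevity. The log format (a device ID followed by a timestamp on each line) and the names count_line, aggregate, merge and analyse are assumptions made up for illustration; nothing here is Tim's code or an Erlang API.

```python
from collections import Counter
from functools import reduce


def count_line(counts, line):
    """Fold step: return a new tally with this line's device counted once."""
    fields = line.split()
    if not fields:
        return counts  # blank line: pass the accumulator through unchanged
    return counts + Counter({fields[0]: 1})


def aggregate(lines):
    """Analyse one fragment of the log by threading an accumulator through it."""
    return reduce(count_line, lines, Counter())


def merge(left, right):
    """Combine the results from the analysis of two fragments."""
    return left + right


def analyse(fragments):
    """Map aggregate over every fragment, then reduce the results with merge."""
    return reduce(merge, map(aggregate, fragments), Counter())
```

Because merge is associative and the empty tally is its identity, the fragments can be analysed in any grouping, which is what lets the same pair of functions be spread across processes or machines. Building a fresh Counter for every line is wasteful; it is done here only to keep the accumulator immutable, as it would be in Erlang.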

What is interesting is the review of the overall experience from Tim and especially Cedric Beust. Yes, even with some NetBeans integration, working with Erlang felt a bit like going back in time to a Prolog or SML shell, manually running things and looking at the results.

That may integrate well under Emacs, but it misses out on the main improvement of modern software development: test-centric development with IDEs and build tools that have embraced the same methodology.

This is tractable, and possibly tractable in ways that are so profound that they even make TestNG look primitive. Take, for example, PLUnit, the test framework of SWI-Prolog. PLUnit lets you turn every clause resolution into an assertion, with the ability to declare that a clause should pass once, pass multiple times, or fail with a specific cause. It also does coverage analysis and can build up a test suite from the command line.

Erlang needs something like that, or, if it already has it, it needs to be embraced to become the central way of developing. Instead of running your commands in the shell, you write your tests and keep an eye on the build status.

This is something I've been thinking of for a while: how test-centric, deployment-centric development with interpreted languages should change how you code. Right from the outset you'd write some tests that describe the functional behaviour of the deployed system, then you'd set something up to run those tests all the time. Current IDEs compile and analyse your code in the background, and flag errors. But imagine if they also continually highlighted which clauses didn't have coverage, if they were triggering deployment and test runs on five different (virtualized) machines, and could flag which functions were failing on the Mac but not on Windows. There'd be a console if you needed it, but it would be as secondary to the development process as actually manually viewing the application/web site is in today's test-centric teams. You look at the site to make sure it is pretty, not that it works, because that is the job of tools like Selenium.
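The two things such an IDE would flag boil down to set arithmetic over whatever the test runs report back. A rough sketch in Python, assuming a made-up result shape (a dict of platform name to per-test pass/fail) and made-up function names; no real IDE or test framework is being described:

```python
def uncovered(all_clauses, executed_clauses):
    """Clauses that no test run touched: the ones to highlight in the editor."""
    return set(all_clauses) - set(executed_clauses)


def failing_only_on(results, platform):
    """Tests that fail on `platform` but on no other platform.

    `results` maps a platform name to a dict of test name -> passed (bool).
    """
    failed_here = {name for name, ok in results[platform].items() if not ok}
    failed_elsewhere = set()
    for other, outcomes in results.items():
        if other != platform:
            failed_elsewhere |= {name for name, ok in outcomes.items() if not ok}
    return failed_here - failed_elsewhere
```

A test that fails everywhere is a plain bug; one that shows up in failing_only_on(results, "mac") is the platform-specific kind worth flagging differently.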

I don't know where I'd get started with Erlang on this. I'd probably go for an Erlang component for SmartFrog that continually monitors a directory and rebuilds and retests whenever something changes. I'd feed the results into the test result framework, a framework that is intended to support multiple underlying test frameworks and serialise the results over RMI, then into XML or XHTML. So whenever you saved a file, after a short pause (CruiseControl-style), I'd trigger a retest. You'd point your web browser at the results page, hit reload and get the latest results. Yes, that would be the way to build. The reason for doing this in SmartFrog is that production deployment would be something you'd follow on with...if it worked locally, you could deploy it to the staging systems and test there next. Or you'd deploy something on the production site to run the same workflow every night, against the SVN repository. There wouldn't be a build process any more, just a deploy-and-test, with policies in place to go live on the production site only when staging worked.
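The monitor-pause-retest loop itself is small. Here is a sketch in Python that polls modification times and waits for a quiet period before rerunning the tests. It is not a SmartFrog component and uses no SmartFrog API; the function names, the polling approach and the default intervals are all assumptions for illustration.

```python
import os
import subprocess
import time


def snapshot(root):
    """Map every file under root to its last-modified time in nanoseconds."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                state[path] = os.stat(path).st_mtime_ns
            except OSError:
                pass  # file vanished between listing and stat
    return state


def wait_for_change(root, previous, poll=1.0, quiet=3.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Block until the tree differs from `previous` and has then stayed
    unchanged for `quiet` seconds (the short pause after a save)."""
    last = previous
    changed_at = None
    while True:
        current = snapshot(root)
        if current != last:
            last = current
            changed_at = clock()  # still changing: restart the quiet timer
        elif changed_at is not None and clock() - changed_at >= quiet:
            return current
        sleep(poll)


def watch(root, test_command):
    """Rerun `test_command` in `root` whenever the tree changes. Runs forever."""
    state = snapshot(root)
    while True:
        wait_for_change(root, state)
        result = subprocess.run(test_command, cwd=root)
        print("tests", "passed" if result.returncode == 0 else "FAILED")
        # Re-snapshot so files written by the test run do not retrigger it.
        state = snapshot(root)
```

Calling watch on a source directory with whatever command runs your tests (given as an argument list, as subprocess.run expects) gives the save-pause-retest cycle; publishing the results as XML or XHTML, and the follow-on deployment to staging, are left out.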

Comments

test-centric development process for Erlang
On 26 September 2007 at 10:51 James Abley commented:
+1 - I've had discussions with Lisp-ers about this, and they've stated that playing with the REPL is their testing. But to my mind, that misses one of the big gains of test-centric processes, in that you have a reproducible permanent record of how the code should behave. Just doing stuff in the REPL doesn't give you a tangible thing that you can use to help you when you go back to the same code after a long break, or pick up someone else's code.