Steve: Developing on the Edge
Thoughts on development, Web-services, technology and mountains.
Wed 29 Apr 2009
GPFS: A Shared-Disk File System for Large Computing Clusters

I am currently reading the paper GPFS: A Shared-Disk File System for Large Computing Clusters. I got the opportunity to visit our city's local installation of GPFS. Everyone working on HDFS ought to have a look through this paper.

There are some different hardware design decisions (SAN, RAID) that make this a premium filesystem, but it delivers something that GFS and HDFS don't: POSIX semantics on a filesystem that scales to a few hundred TB of physical disk, a filesystem with no SPOF on a par with the namenode, and the ability for disk IO to scale with the number of physical disks, even for apps that aren't distributed across the computation servers.

MapReduce makes a lot of these hardware options obsolete in favour of commodity SATA disks close to the (power efficient) CPUs, but that results in filesystems that are unsuited to other application needs.

I wonder what it would take to run Hadoop over GPFS? You could run HDFS over it, but you'd have the worst of both worlds. You really want to work directly with files in GPFS, and let it handle security and (maybe) replication. Perhaps we need a "posix://" FS binding which can work with any POSIX FS and let the filesystem handle its side of the problem.
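
A rough sketch of what such a binding might look like, in Python rather than as a real Hadoop FileSystem implementation. The class name PosixBinding and its methods are hypothetical; the only idea taken from the post is that "posix://" URIs map straight onto paths under a mounted POSIX filesystem (GPFS or anything else), with permissions and replication left entirely to that filesystem.

```python
import os
from urllib.parse import urlparse


class PosixBinding:
    """Hypothetical binding: maps posix:// URIs onto paths under a mount point."""

    def __init__(self, mount_point):
        # The mount point is wherever the shared filesystem is mounted locally.
        self.mount_point = os.path.realpath(mount_point)

    def resolve(self, uri):
        """Translate a posix:// URI into a local path inside the mount point."""
        parsed = urlparse(uri)
        if parsed.scheme != "posix":
            raise ValueError("unsupported scheme: " + parsed.scheme)
        path = os.path.realpath(
            os.path.join(self.mount_point, parsed.path.lstrip("/")))
        # Refuse paths that climb out of the mount point via "..".
        if os.path.commonpath([self.mount_point, path]) != self.mount_point:
            raise ValueError("path escapes mount point: " + uri)
        return path

    def write(self, uri, data):
        """Create or overwrite a file; the OS enforces permissions."""
        path = self.resolve(uri)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)

    def read(self, uri):
        """Return the whole file as bytes."""
        with open(self.resolve(uri), "rb") as f:
            return f.read()

    def list_status(self, uri):
        """List the names in a directory, sorted for stable output."""
        return sorted(os.listdir(self.resolve(uri)))
```

The binding does no block placement, replication or access control of its own; every call is a thin delegation to the operating system, which is the point of the proposal.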

Fri 24 Apr 2009
Bringing Wisdom to the Cloud

William Vambenepe has an absolutely wonderful article looking at past attempts to standardise management APIs for infrastructure on demand. He is also careful to separate the Model from the API. The API is how you talk to the far end; the model is what you can ask the far end for. It's notable that you don't always need a model -EC2 doesn't have one, you just ask for some machines with a given image. Though very limited, that does avoid model politics. Just as wire protocols have their eternal battles over the best way to talk between machines, in the CM space we have eternal battles over the best way to represent things, including lifecycle, configuration descriptions, etc. William also mentions the CDDLM work at the Global Grid Forum, which provided a WS-RF based long-haul API to deployment servers, and an XML alternative to the SmartFrog language. The two were orthogonal: you can use the API to deploy SF models, and you can use the CDL language to describe stuff you want deployed from the command line.

I spent many years at standards meetings on this problem, so I recognised everything that William was talking about. Here are my views.

First, the protocol

  1. WS-RF and WSDM were fundamentally wrong -an attempt to impose an object model on URLs. Savas and Jim may have something to say on that item.
  2. As an attempt to completely rethink/rewrite the WS-* world view, not only was it doomed politically, but it made something bad (WS-*) worse.
  3. Some bits of it were nice; I think its extended error format was slightly better. But as SOAP stacks retain the right to return classic SOAP Faults whenever they feel like it, attempts by the WSDM people to mandate the format for all errors an endpoint could raise were unrealistic.
  4. The WS-A attempt to rethink what an endpoint was -not a URL, but a URL with XML metadata- was and is equally wrong. The fact that it went to 1.0 without a single test case, and there are two incompatible versions that share the same XSD namespace merely makes the fundamental flaw worse.
  5. The WS-Notification protocol never delivered its goal, through-firewall messaging.
  6. The fact that WS-N used a different draft of WS-Addressing from other bits of WS-RF shows how everything was pushed out in a rush. Better to evolve working systems than mandate in standards bodies.
  7. Tests would be nice, but are viewed as a dirty implementation detail by all too many people who work at the standards layer.

From the CDDLM work, I have my own XOM-based SOAP stack, Alpine, with WSDM support, and client and server code to let me do long-haul deployment of things. But as it is based on WS-RF, the whole thing is of very little use except with the three other implementations of the API -one of which is now owned by Microsoft, amusingly. Far better to have done a decent RESTy API from the outset, with Atom polling, XMPP or Twitter for notifications. Which is why I like the Sun one -fairly easy to implement, were I to sit down and do it, either on a descendant of Alpine, or on a JAX-RS implementation.
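
As an illustration of how little machinery Atom polling needs on the client side, here is a minimal sketch in Python. It assumes a deployment server that publishes one Atom entry per event, which is an invented layout and not part of any of the APIs discussed here; the function name new_entries is hypothetical, and fetching the feed over HTTP is left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Atom Syndication Format.
ATOM = "{http://www.w3.org/2005/Atom}"


def new_entries(feed_xml, seen_ids):
    """Return (id, title) pairs for entries not seen before.

    seen_ids is a set owned by the caller; it is updated in place so that
    the next poll only reports entries added since this one.
    """
    fresh = []
    for entry in ET.fromstring(feed_xml).iter(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id")
        if entry_id not in seen_ids:
            seen_ids.add(entry_id)
            fresh.append((entry_id, entry.findtext(ATOM + "title")))
    return fresh
```

A client would call this on a timer with the latest copy of the feed. There is no subscription state on the server and nothing to tunnel through a firewall beyond an ordinary GET.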

Now, the model

  1. Any attempt to define a single model of everything is doomed. It may be possible, but the politics get in the way. By the time you incorporate the world views of everyone in the standards groups, you have something that is unusable.
  2. Even if you work within a standards body that believes in testing, it's very hard to define tests for a model.

There's actually something to be said for adopting, say, RDF as your machine-level language for describing things, but that leads to a separate problem: whereas Prolog works in the Horn-clause subset of logic with a closed-world view (not provable as true implies false), RDF generally assumes an open world, in which not resolvable as true means not known. That may be more valid in a big soup of facts, but it makes working with those facts trickier.
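
The difference is easy to see with a toy fact base. This Python sketch is a deliberate simplification: it does no Horn-clause resolution and is not an RDF store, and the facts and function names are made up. It only shows how the same missing fact gets two different answers.

```python
# Facts as (subject, predicate, object) triples; all names are invented.
FACTS = {
    ("node1", "runs", "datanode"),
    ("node2", "runs", "tasktracker"),
}


def closed_world(fact, facts=FACTS):
    """Prolog-style negation as failure: not provable means False."""
    return fact in facts


def open_world(fact, facts=FACTS):
    """RDF-style reading: not found means unknown (None), never False."""
    return True if fact in facts else None
```

Asking whether node1 runs a namenode gives False in the first case and None in the second. Code consuming the open-world answer has to handle three outcomes instead of two, which is where the extra work comes from.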

Wed 22 Apr 2009
Back in the country

I am now back from a two-week trip to Cuba -without a laptop. I did take 32GB worth of SD card, so I will be uploading some photos over the next few days/weeks.

Thu 2 Apr 2009
Amazon Elastic Hadoop

Someone in Amazon Web Services has been running round placing elephant stickers on their servers, as they have announced Elastic MapReduce today, though as it is a PAYG Hadoop, it should really be called Elastic Hadoop.

I never believe anything I read or hear on April 1, but it seems legit, even if the getting started guide is a 404 page.

They are working directly with S3, charging a premium of $0.015/hour on the base workers, $0.12/hour for the premium CPU instances.
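
To put those surcharges in context, here is a back-of-envelope calculation in Python. The two rates are the ones quoted above; the function name, the dictionary keys and the example cluster (20 instances for 10 hours) are invented for illustration, and the underlying EC2 and S3 charges are not included.

```python
from decimal import Decimal

# Elastic MapReduce surcharge in US dollars per instance-hour, as quoted above.
SURCHARGE = {
    "base": Decimal("0.015"),
    "premium_cpu": Decimal("0.12"),
}


def mapreduce_surcharge(instance_type, instances, hours):
    """Surcharge only: what is paid on top of the normal instance price."""
    return SURCHARGE[instance_type] * instances * hours


# A hypothetical 20-instance job running for 10 hours.
base_cost = mapreduce_surcharge("base", 20, 10)            # 3 dollars
premium_cost = mapreduce_surcharge("premium_cpu", 20, 10)  # 24 dollars
```

Decimal is used rather than float so that the cents come out exact.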

I've not seen any AWS presence on the Hadoop lists, which is a shame as I am sure they could have found some bugs, and got some stability into the bits of the service they depend on. As it is, the team is free to diverge to meet the needs of different users. Still, nice to see Amazon chose the foundational APIs of Apache Cloud Computing Edition rather than do their own. It will reduce the costs of migrating off their infrastructure.

It also shows an interesting problem for companies building stuff on top of another vendor's infrastructure, like Cloudera. It is a rerun of the old Windows extension ecosystem. Companies started off making a living adding new features to make Windows usable, but the OS vendor adopted those things that were good, leaving every add-on vendor to struggle to stay ahead. Who remembers hWnd now? The way to cope is to move up the food chain, focus on algorithms and datamining, or to play in an area that the platform provider is fundamentally incapable of working in. The Windows security tool business is the only Windows add-on segment still surviving, due to structural issues with Windows itself.

Sat 28 Mar 2009
Back from ApacheCon

I left the conference just after 18:00, got the train to the airport and was home by 22:30 UK time. There's no point readjusting to GMT as we switch to summer time tomorrow.

I had a very intense conference; I had three official presentations, and gave repeat showings of various presentations and demos, so I was pretty worn out by yesterday. I am having a rest now.

It was good fun meeting everyone, and there was lots of interesting stuff to hear about. I will try and write about it in more detail, but for now:

  1. Wendy Smoak on Apache Continuum: integrating the release process with the CI tooling is a nice idea. I fear our release days, primarily for the half hour wasted with SourceForge's upload process and form hell.
  2. Carlos Sanchez on using EC2 hosted VMs to run different browsers and Selenium tests. I could imagine someone providing Selenium-as-a-service on these VMs, as you don't want to bring up a new VM for every test run, not for $0.10 a shot.
  3. The CouchDB talks. I need to play with this. We had a talk afterwards about the difference between Erlang OTP and Java EE. Erlang OTP: a framework for configuring, starting and stopping code, making sure it is up, with no restriction on what you run. Java EE: one use case -O/R mapped objects bonded to JSP pages or CORBA calls- with no real thought on lifecycle management, configuration, and all the stuff needed to keep your work working.
  4. Apache DS. This directory service not only has revision control and change tracking, it implements NFS and can serve up text files dynamically created from data in the registry. This would let you define log4j.properties or hadoop-site.xml from a directory service. Nice.

Thu 26 Mar 2009
Application Architecture for the Cloud

This is my little proposal for an Apache Cloud Computing Edition.

There was a good discussion afterwards; Robert Burrell Donkin already has clearance to set up the mailing list, which should be live before the end of ApacheCon.

Wed 25 Mar 2009
Dynamic Hadoop Clusters

My slides are live. They've been in SVN for a while, but now they are officially available for viewing.

One little fact from the presentation is that some time in May, we hope to switch the SmartFrog license from LGPL to Apache 2.0. This will make it easy for people in Apache-land to use it, and so provide some very interesting test and deployment options for everyone.

The demo I gave was

  1. Bringing up Hadoop as separate processes on a single physical host
  2. Submitting a simple test job
  3. Pinging bits of the cluster to show they check their health
  4. Killing a node's process and showing that it was detected and triggered a termination of the whole Hadoop app.

I think a better demo of the kill would be to deploy the policy in a container that restarted the Hadoop cluster whenever it died, using the Retry component. Next time. People can ask me for demos of this tomorrow, after I have done my Apache Cloud Computing Edition sales pitch.
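
The restart-on-failure idea can be sketched in a few lines of Python. This is not SmartFrog's Retry component or its description language, just a plain stand-in for the behaviour described above: the names run_with_retry and WorkloadFailed are hypothetical, and start is assumed to be a callable that blocks until the workload either finishes or dies.

```python
class WorkloadFailed(Exception):
    """Raised by a workload to signal that it has died."""


def run_with_retry(start, max_restarts=3):
    """Run start() and restart it whenever it fails.

    Gives up and re-raises the failure once max_restarts restarts have
    been used. Returns the number of restarts that were needed.
    """
    restarts = 0
    while True:
        try:
            start()
            return restarts
        except WorkloadFailed:
            if restarts >= max_restarts:
                raise
            restarts += 1
```

Capping the number of restarts matters: a cluster that dies on every start should eventually be reported as failed rather than restarted forever.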

Wed 25 Mar 2009
Apache Cloud Computing Edition -IDEs

My talk on Thursday will be to spread the idea of having an Apache Cloud Computing Edition, a software stack that combines core Apache technology with a bridge to whichever VM-hosting infrastructure you choose to run on. That way, nobody owns your application.

One issue is going to be development and testing. Historically people have brought up something like Tomcat locally, then pushed out to WebSphere or WebLogic in staging/production. For a cloud-based application, your developers need their own isolated copy of the server stack and the data. This could be local, but it could be just as effective on a remote server farm. What is key is that you need to be able to run tests and set breakpoints in the cloud, because developers do need to step through the code and work out why things aren't working. If you deploy 20 copies of your front end, you may want the breakpoint to be set on all of them, and the IDE to hook you up to whichever box hits that breakpoint first.

Which IDE? Eclipse. NetBeans is doomed, and IntelliJ IDEA -while I love it- has a hard time competing with the ubiquity of Eclipse. It is hard to build a business model based on plugins for IDEA, whereas one on Eclipse is possibly viable. Watch Borland to see if it works.

With a focus on Eclipse, one of the key drivers behind command line build tools -Ant, Maven- and (scarily) testing goes away. Now you have a single IDE for everyone to use again, and the ability to debug your code without writing unit tests. People could get lazy. Hopefully we've made enough progress in test-centric development that developers won't return to the Visual Studio ethos of debug-first development.

Returning to the cloud, we are going to need IDEs that hook into the infrastructure, to let you work with it as if it were a server on the LAN. Today -and I don't think it is a coincidence that Amazon have timed it for my presentation- Amazon have announced the AWS Toolkit for Eclipse. This will let you set breakpoints on Apache code -Tomcat- hosted in their datacentres. That's pretty profound.