Steve: Developing on the Edge. Thoughts on development, Web-services, technology and mountains.
Wed 29 Apr 2009 | GPFS: A Shared-Disk File System for Large Computing Clusters
I am currently reading the paper
GPFS: A Shared-Disk File System for Large Computing
Clusters. I got the opportunity to visit our city's local
installation of this. Everyone working on HDFS ought to have a look
through this paper.
There are some different hardware design decisions (SAN, RAID)
that make this a premium filesystem, but it delivers something that
GFS and HDFS don't: POSIX semantics on a filesystem that scales to
a few hundred TB of physical disk, a filesystem with no SPOF on a
par with the namenode, and the ability for disk IO to scale with
the number of physical disks, even for apps that aren't distributed
across the computation servers.
MapReduce makes a lot of these hardware options obsolete in
favour of commodity SATA disks close to the (power efficient) CPUs,
but that results in filesystems that are unsuited to other
application needs.
I wonder what it would take to run Hadoop over GPFS? You could
run HDFS over it, but you'd have the worst of both worlds. You
really want to work directly with files in GPFS, and let it handle
security and (maybe) replication. Perhaps we need a "posix://" FS
binding which can work with any POSIX FS and let the filesystem
handle its side of the problem.
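A rough sketch of what such a binding might look like, in Python rather than Hadoop's Java to keep it short. Everything here is an assumption for illustration: the `PosixBinding` class and its methods are hypothetical and not part of Hadoop. It just maps `posix://` URIs onto paths under a mount point and leaves permissions and replication to whatever filesystem is mounted there.

```python
import os
import tempfile
from urllib.parse import urlparse


class PosixBinding:
    """Hypothetical binding: maps posix:// URIs to paths under a mount point."""

    def __init__(self, mount_point):
        self.mount_point = os.path.abspath(mount_point)

    def resolve(self, uri):
        parsed = urlparse(uri)
        if parsed.scheme != "posix":
            raise ValueError("unsupported scheme: %r" % parsed.scheme)
        # Normalise the path and refuse anything that escapes the mount point.
        relative = parsed.path.lstrip("/")
        path = os.path.normpath(os.path.join(self.mount_point, relative))
        if os.path.commonpath([self.mount_point, path]) != self.mount_point:
            raise ValueError("path escapes mount point: %r" % uri)
        return path

    def open(self, uri, mode="rb"):
        # Security, locking and replication are the filesystem's problem.
        return open(self.resolve(uri), mode)

    def listdir(self, uri):
        return sorted(os.listdir(self.resolve(uri)))


# Usage: a temp directory stands in for a mounted POSIX filesystem.
root = tempfile.mkdtemp()
fs = PosixBinding(root)
with fs.open("posix:///part-0000", "wb") as out:
    out.write(b"hello")
```

The point of the design is that the binding does almost nothing: no block placement, no replication logic, only URI-to-path mapping.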
Posted by steve at 11:57 | comments [1] | trackbacks [0]
Fri 24 Apr 2009 | Bringing Wisdom to the Cloud
William Vambenepe has
an absolutely wonderful article looking at past attempts to standardise
management APIs for infrastructure on demand. He is also careful
to separate the Model from the API. The API is what you need to
talk to the far end; the model is what you can ask the far end
for. It's notable that you don't always need a model: EC2
doesn't have one, you just ask for some machines with a given
image. Though very limited, that does avoid model politics. Just
as wire protocols have their eternal battles over the
best way to talk between machines, in the CM space we have eternal
battles about the best way to represent things, including
lifecycle, configuration descriptions, etc. William also mentions
the CDDLM work at the Global Grid Forum, which provided a
WS-RF based long-haul API to deployment servers, and an XML
alternative to the SmartFrog language. The two were orthogonal: you
can use the API to deploy SF models, and you can use the CDL language
to describe stuff you want deployed over the command line.
I spent many years at standards meetings on this problem, so I
recognised everything that William was talking about. Here are my
views.
First, the protocol
- WS-RF and WSDM were fundamentally wrong: an attempt to impose
an object model on URLs. Savas and Jim may have something to say on
that item.
- As an attempt to completely rethink/rewrite the WS-* world
view, not only was it doomed politically, but it made something bad
(WS-*) worse.
- Some bits of it were nice; I think its extended error format
was slightly better. But as SOAP stacks retain the right to return
classic SOAP Faults whenever they feel like it, attempts by the WSDM
people to mandate the format for all errors an endpoint could raise
were unrealistic.
- The WS-A attempt to rethink what an endpoint was (not a URL,
but a URL with XML metadata) was and is equally wrong. The fact
that it went to 1.0 without a single test case, and that there are two
incompatible versions sharing the same XSD namespace, merely
makes the fundamental flaw worse.
- The WS-Notification protocol never delivered its goal,
through-firewall messaging.
- The fact that WS-N used a different draft of WS-Addressing from
other bits of WS-RF shows how everything was pushed out in a rush.
Better to evolve working systems than mandate in standards
bodies.
- Tests would be nice, but are viewed as a dirty implementation
detail by all too many people who work at the standards layer.
From the CDDLM work, I have my own Xom-based SOAP stack, Alpine,
with WSDM support, and client and server code to let me do long-haul
deployment of things. But as it is based on WS-RF, the whole thing is
of very little use except with the three other implementations of
the API, one of which is now owned by Microsoft, amusingly. Far
better to have done a decent RESTy API from the outset, with Atom
polling, XMPP or Twitter for notifications. Which is why I like the
Sun API: fairly easy to implement, were I to sit down and do it,
either on a descendant of Alpine, or on a JAX-RS
implementation.
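To make the Atom polling idea concrete, here is a minimal sketch in Python. It is not taken from any of the APIs discussed: the feed contents, the `urn:example:` ids and the `new_entries` helper are all invented for illustration, and a real client would fetch the feed over HTTP on a timer rather than parse a string.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed a deployment server might publish.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>deployment events</title>
  <entry><id>urn:example:event:1</id><title>deployed</title></entry>
  <entry><id>urn:example:event:2</id><title>started</title></entry>
</feed>"""


def new_entries(feed_xml, seen_ids):
    """Return (id, title) for entries not seen on earlier polls.

    seen_ids is updated in place, so calling this again with the same
    feed yields nothing new.
    """
    root = ET.fromstring(feed_xml)
    fresh = []
    for entry in root.findall(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id")
        if entry_id not in seen_ids:
            seen_ids.add(entry_id)
            fresh.append((entry_id, entry.findtext(ATOM + "title")))
    return fresh


seen = set()
first_poll = new_entries(SAMPLE_FEED, seen)
second_poll = new_entries(SAMPLE_FEED, seen)
```

The client keeps all the state (the set of ids it has seen), which is what lets the server side stay a plain feed with no subscription management.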
Now, the model
- All attempts to define a single model of everything are doomed.
It may be possible, but the politics get in the way. By the time
you incorporate the world views of everyone in the standards
groups, you have something that is unusable.
- Even if you work within a standards body that believes in
testing, it's very hard to define tests for a model.
There's actually something to be said for adopting, say, RDF as
your machine-level language for describing things, but that leads
to a separate problem: whereas Prolog works in the
Horn-clause-subset of facts (not provable as true implies false),
RDF generally assumes that not resolvable as true means not known,
which may be more valid in a big soup of facts, but makes working
with those facts trickier.
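A toy illustration of that difference, in Python. This is neither real Prolog nor an RDF library; the facts and the function names are made up. The first query treats anything it cannot find as false (the closed-world view), the second treats it as unknown (the open-world view).

```python
# Hypothetical facts, written as (subject, predicate, object) triples.
FACTS = {
    ("host1", "runs", "namenode"),
    ("host2", "runs", "datanode"),
}


def closed_world(triple, facts):
    """Prolog-style negation as failure: not provable means false."""
    return triple in facts


def open_world(triple, facts):
    """RDF-style: a missing statement is unknown (None), not false."""
    return True if triple in facts else None


query = ("host3", "runs", "datanode")
closed_answer = closed_world(query, FACTS)
open_answer = open_world(query, FACTS)
```

Code built on the second form has to handle three outcomes instead of two everywhere it asks a question, which is where the extra work comes from.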
Posted by steve at 11:36 | comments [0] | trackbacks [0]
Wed 22 Apr 2009 | Back in the country
I am now back from a two week trip to Cuba, without a laptop. I
did take 32GB worth of SD card, so will be uploading some photos
over the next few days/weeks.
Posted by steve at 09:15 | comments [0] | trackbacks [0]
Thu 2 Apr 2009 | Amazon Elastic Hadoop
Someone in Amazon Web Services has been running round placing
elephant stickers on their servers, as they have announced Elastic
MapReduce today, though as it is a PAYG Hadoop, it should
really be called Elastic Hadoop.
I never believe anything I read or hear on April 1, but it seems
legit, even if
the getting started guide is a 404 page.
They are working directly with S3, charging a premium of
$0.015/hour on the base workers, $0.12/hour for the premium CPU
instances.
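As a worked example of what those premiums add up to, here is a small Python calculation. The job size (20 instances for 10 hours) is an assumption picked for illustration, and the figure covers only the premium quoted above, not the underlying EC2 instance charge.

```python
from decimal import Decimal

# Premiums quoted in the post, in dollars per instance-hour.
BASE_PREMIUM = Decimal("0.015")
PREMIUM_CPU_PREMIUM = Decimal("0.12")


def premium_cost(instances, hours, rate):
    """Surcharge only; the normal EC2 instance charge is not included."""
    return instances * hours * rate


# Hypothetical job: 20 instances running for 10 hours.
base_total = premium_cost(20, 10, BASE_PREMIUM)
cpu_total = premium_cost(20, 10, PREMIUM_CPU_PREMIUM)
print(base_total, cpu_total)
```

That works out to $3 of premium on base workers and $24 on the premium CPU instances for the same 200 instance-hours.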
I've not seen any AWS presence on the Hadoop lists, which is a
shame as I am sure they could have found some bugs, and got some
stability into the bits of the service they depend on. As it is,
the team is free to diverge to meet the needs of different users.
Still, nice to see Amazon chose the foundational APIs of Apache
Cloud Computing Edition rather than do their own. It will
reduce the costs of migrating off their infrastructure.
It also shows an interesting problem for companies building
stuff on top of another vendor's infrastructure, like Cloudera. It is a rerun of the
old Windows extension ecosystem. Companies started off making a
living adding new features to make Windows usable, but the OS
vendor adopted those things that were good, leaving every add-on
vendor to struggle to stay ahead. Who remembers hWnd now? The way
to cope is to move up the food chain, focus on algorithms and
datamining, or to play in an area that the platform provider is
fundamentally incapable of working in. The Windows security tool
business is the only Windows add-on segment still surviving, due to
structural issues with Windows itself.
Posted by steve at 11:59 | comments [0] | trackbacks [0]
Sat 28 Mar 2009 | Back from ApacheCon
I left the conference just after 18:00, got the train to the
airport and was home by 22:30 UK time. There's no point readjusting
to GMT as we switch to summer time tomorrow.
I had a very intense conference; I had three official
presentations, and gave repeat showings of various
presentations and demos of things, so I was pretty worn out by
yesterday. I am having a rest now.
It was good fun meeting everyone, and there was lots of
interesting stuff to hear about. I will try and write about it in
more detail, but for now:
- Wendy Smoak on Apache Continuum: integrating the release process
with the CI tooling is a nice idea. I fear our release days,
primarily for the half hour wasted with SourceForge's upload
process and form hell.
- Carlos Sanchez on using EC2 hosted VMs to run different
browsers and Selenium tests. I could imagine someone providing
Selenium-as-a-service on these VMs, as you don't want to bring up a
new VM for every test run, not for $0.10 a shot.
- The CouchDB talks. I need to play with this. We had a talk
afterwards about the difference between Erlang OTP and Java EE.
Erlang OTP: a framework for configuring, starting and stopping
code, making sure it is up, with no restriction on what you run.
Java EE: one use case (O/R mapped objects bonded to JSP pages or
CORBA calls), with no real thought on lifecycle management,
configuration, and all the stuff needed to keep your work
working.
- Apache DS. This directory service not only has revision control
and change tracking, it implements NFS and can serve up text files
dynamically created from data in the registry. This would let you
define log4j.properties or hadoop-site.xml from a directory
service. Nice.
Posted by steve at 11:44 | comments [0] | trackbacks [0]
Thu 26 Mar 2009 | Application Architecture for the Cloud
This is my little proposal for an Apache Cloud Computing
Edition.
There was a good discussion afterwards; Robert Burrell Donkin
already has clearance to set up the mailing list, which should be
live before the end of ApacheCon.
Posted by steve at 13:40 | comments [0] | trackbacks [0]
Wed 25 Mar 2009 | Dynamic Hadoop Clusters
My slides are live. They've
been in SVN for a while, but now they are officially available
for viewing.
One little fact from the presentation is that some time in May,
we hope to switch the SmartFrog license from LGPL to Apache 2.0.
This will make it easy for people in Apache-land to use it, and so
provide some very interesting test and deployment options for
everyone.
The demo I gave was:
- Bringing up Hadoop as separate processes on a single physical
host
- Submitting some simple test job
- Pinging bits of the cluster to show they check their
health
- Killing a node's process and showing that it was detected and
triggered a termination of the whole Hadoop app.
I think a better demo of the kill would be to deploy the policy
in a container that restarted the Hadoop cluster whenever it died;
use the Retry component. Next time. People can ask me for demos of
this tomorrow, after I have done my Apache Cloud Computing Edition
sales pitch.
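The restart-on-death idea can be sketched in a few lines of Python. To be clear, this is not SmartFrog's Retry component or its API: `run_with_retry` and `FlakyCluster` are hypothetical names, and a cluster that "dies" is simulated by a method that raises a fixed number of times.

```python
def run_with_retry(start, max_restarts=3):
    """Call start() again each time it dies, up to max_restarts restarts.

    Returns (attempts, result) on success; re-raises the last failure
    once the restart budget is used up.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return attempts, start()
        except RuntimeError:
            if attempts > max_restarts:
                raise


class FlakyCluster:
    """Stand-in for a deployed cluster that dies a fixed number of times."""

    def __init__(self, failures):
        self.failures = failures

    def start(self):
        if self.failures > 0:
            self.failures -= 1
            raise RuntimeError("node process died")
        return "cluster up"


# Dies twice, so the third attempt succeeds.
result = run_with_retry(FlakyCluster(failures=2).start)
```

Capping the number of restarts matters: a cluster that can never come up should eventually report failure rather than loop forever.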
Posted by steve at 18:19 | comments [0] | trackbacks [0]
Wed 25 Mar 2009 | Apache Cloud Computing Edition -IDEs
My talk on Thursday will be to spread the idea of having an
Apache Cloud Computing edition, a software stack that combines core
Apache technology with a bridge to whichever VM-hosting
infrastructure you choose to run on. That way, nobody owns your
application.
One issue is going to be development and testing. Historically
people have brought up something like Tomcat locally, then pushed
out to WebSphere or WebLogic in staging/production. For a
cloud-based application, your developers need their own isolated
copy of the server stack and the data. This could be local, but it
could be just as effective on a remote server farm. What is key is
that you need to be able to run tests and set breakpoints in the
cloud, because developers do need to step through the code and work
out why things aren't working. If you deploy 20 copies of your
front end, you may want the breakpoint to be set on all of them,
and the IDE to hook you up to whichever box hits that breakpoint
first.
Which IDE? Eclipse. NetBeans is doomed, and IntelliJ IDEA (while
I love it) has a hard time competing with the ubiquity of Eclipse.
It is hard to build a business model based on plugins for IDEA,
whereas one on Eclipse is possibly viable. Watch Borland to see if
it works.
With a focus on Eclipse, one of the key drivers behind command
line build tools (Ant, Maven) and, scarily, testing goes away. Now
you have a single IDE for everyone to use again, and the ability to
debug your code without writing unit tests. People could get lazy.
Hopefully we've made enough progress in test-centric development
that developers won't return to the Visual Studio ethos
of debug-first development.
Returning to the cloud, we are going to need IDEs that hook into
the infrastructure, to let you work with it as if it were a server
on the LAN. Today (and I don't think it is a coincidence that
Amazon have timed it for my presentation) Amazon have announced the
AWS Toolkit for
Eclipse. This will let you set breakpoints on Apache code
(Tomcat) hosted in their datacentres. That's pretty profound.
Posted by steve at 09:03 | comments [0] | trackbacks [0]