Steve: Developing on the Edge
Thoughts on development, Web-services, technology and mountains.
Thu 28 Jan 2010
Private Clouds, good or bad?

James Hamilton -who I have a lot of respect for- has a big posting, Private Clouds are not the future.

His Arguments

  1. You don't get the scale in hardware purchases
  2. Only the big datacentres can justify the investment in free-air cooled, low-power servers, negotiate low cost power from PNW hydro facilities, etc.
  3. "Cloud computing providers have some of the best distributed systems specialists in the world. They also have open source experts and depend deeply upon both open source and internally produced software."
  4. Costs of keeping High Availability are high, best outsourced

Interesting, but I don't agree with all of them

  1. If you are doing something private you don't get the economy of scale of a brand new rack-in-container setup somewhere near Yakima or Eastern Oregon, and yes, your power budget may be higher. But you don't need any upfront investment in your own hardware: you contact your favourite server vendor and tell them how many you want, where and when.
  2. You don't need brand new datacentre facilities. If you can get away with what you have, there is less capital outlay. AWS and Facebook, meanwhile, are spending $$$, and that has to be paid for somehow.
  3. Yes, the providers do have some of the experts. But here's the thing: a lot of that experience can feed back into the source, be it open or closed. When we get some weird DNS bug or something, that gets patched, and the app is better at working in those situations -or at least recognising them. Amazon may think they are gaining a strategic edge by not contributing back any of their bug fixes to the big applications, but all they are doing is forking their code away from everyone else's. In open source, regardless of the license, if you keep your patches closed, you gain a short-term advantage, but risk the long term. And if you roll-your-own app from the ground up (SimpleDB) then anyone who uses it is locked into your platform forever.
  4. HA is best outsourced. Maybe so, but I note that apps on EC2 aren't necessarily HA, as the task of keeping the application alive still belongs to the ops team. Only now, if something is wrong, you don't get access to the datacentre, to its routers, to find out why things are wrong.

I don't see why any infrastructure shouldn't have an API that lets me create VMs from my remote command line, web UI, build tools. Something that lets me share infrastructure with other people, rather than have machines dedicated to individual apps. Because in a sufficiently large organisation, there are always some old under-used apps floating around, and those apps that are used have varying demand. Exactly the kind of thing you need an agile infrastructure for.

Thu 28 Jan 2010
Who are you and what are you doing in my room in the middle of the night? I'm a paramedic

Had a fun event on Tuesday night: I come round to some lights on and some noise in the bedroom, and it turns out there is a paramedic there and an ambulance outside. For me. In fact, the paramedic had been there for a while talking to me, but I don't remember that bit at all.

Apparently I woke my wife up by having some kind of seizure, maybe related to my ongoing Illness that is not Bolivian Haemorrhagic Fever. Anyway, Bina can't bring me round and calls the ambulance; they come round and I do regain my usual half-awake-in-the-morning-consciousness, then go to sleep.

The next day I nip in to the doctors with the paramedic report, they make a couple of calls and send me down to this GP-referral-unit at the hospital, which is kind of a second-line support facility. You get coffee and somewhere quiet to sit while the two doctors on duty send you off for tests and bring in people who know more about the subject.

More tests planned next week: MRI and cardio scans; root cause is not known. It may be due to my ongoing illness, as the breathing is still below normal. But maybe not. Nobody knows. They know this though: I am not allowed to drive for the next 12 months unless the cause is discovered and addressed.

I am down for a trip to Berlin Monday and Tuesday, giving a talk with some screen shots: New Roles in the Cloud. Anyone in the area should still plan to attend this seminal presentation, though it may be that the medical staff will tell me that I can't fly (altitude) or spend a couple of nights on my own. In which case, no talk. Watch this space.

Wed 20 Jan 2010
Datamining the bug database

My stance on developer surveys is known: go ask the tools.

I am pleased to see someone has done this with a lovely presentation on 10 years' worth of Firefox bugzilla bugreps.

See that? Far more information than a survey provides, results of interest to developers, rather than just annoyance.

Wed 20 Jan 2010
Making java server apps suspend/resume aware

I'm looking for an easy way for a server-side Java app to recognise that it has just resumed, and should handle that event (wait for the network to return, reconnect to things).

I don't see any easy way to do this, other than to have a shell script run on power resume which somehow notifies the app (touches a file, probably; queues something). The Java app has to look for this file and react (polling, bad), or when it encounters network problems look at this file and say "did we just resume? Maybe the LAN isn't live".
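
A minimal sketch of the second option, checking the marker only when something goes wrong, assuming a power-resume hook script (not shown) touches a marker file. The class name `ResumeMarker`, the marker path and the two-minute window are all made up for illustration; nothing here is a standard API for power events.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper: asks "did the machine resume recently?" via a marker file. */
final class ResumeMarker {
    private final Path marker;      // file the resume hook is assumed to touch
    private final Duration window;  // how long after a resume we stay suspicious

    ResumeMarker(Path marker, Duration window) {
        this.marker = marker;
        this.window = window;
    }

    /** True if the marker exists and was touched no earlier than (now - window). */
    boolean resumedRecently(Instant now) throws IOException {
        if (!Files.exists(marker)) {
            return false; // hook has never run, or the marker was cleaned up
        }
        Instant touched = Files.getLastModifiedTime(marker).toInstant();
        return !touched.isBefore(now.minus(window));
    }
}
```

The app would call `resumedRecently(Instant.now())` from its network-failure handler and, if it returns true, wait and reconnect rather than treat the failure as fatal. It avoids a polling thread, at the price of only noticing a resume once something has already broken.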

Life would be simpler if Java apps could subscribe to power/LAN events.

Tue 19 Jan 2010
Cloud MapReduce

Someone forwards me a link to an AWS Blog entry, Cloud MapReduce, which looks at Accenture's prototype MR engine built directly on top of the AWS stack: S3, SimpleDB, the message queue, etc.

The paper is worth a read; they are very pleased about how few LOC it took to get working, and how it can be 60X faster than Hadoop.

My Initial thoughts

  1. A lot of the speedup comes from not shuffling; in Hadoop, shuffling/sorting stuff is optional. If you need to do it, and you do it in the right place, it pays off.
  2. Physical Hadoop clusters normally bottleneck at the disk IO rates. Perhaps they are comparing Hadoop-on-EC2-VM performance with their Cloud MRs.
  3. The LOC comparison is somewhat flawed as it doesn't include the lines of code needed for S3 (Java based, I believe), EC2 infrastructure, the message queue, database, etc.
  4. Those LOC are the lines that Accenture get to maintain; from a maintenance cost perspective, the low number of lines is better.
  5. The SPOFs have not gone, just moved. No, the namenode may not fail, but all I have to do is report your credit card stolen to the bank and your cluster goes offline.

It's interesting in that it shows that a base cloud "stack" is important, and there is more than just VM hosting. You do benefit from a highly available, highly scalable filestore with direct (no need for a VM) remote access. Some kind of database is good, as is a message queue. Sometimes. And yes, you can write/rewrite applications that only work in this single environment. And once you do that, whoever owns the stack owns your code and your data.

Wed 13 Jan 2010
Breaking and Entering

The BBC is showing Jude Law and others in Anthony Minghella's "Breaking and Entering". It's amusing that the dodgy-inner-city housing estate where the teenage burglar that is part of the story lives is somewhere some of my school-friends grew up in, and is off Abbey Road, one of the most famous roads in the city. Either St John's Wood has got rougher in 20 years, now that Maida Vale has been gentrified, or the film crew were scared to go to any part of London that they'd actually get robbed in.

Wed 13 Jan 2010
Debugging Linux Power Management

I like working on OSS projects: good teams, good problems. But I don't like being made to get involved with something way off my line-of-research just because it isn't working. Like my Linux laptop's power management behaviour.

Yesterday: plug it in to AC power and an emergency hibernate is triggered, one it doesn't recover from. There's no point looking in /var/log/pm-suspend.log for what happened, because when the machine crashed: no log. Seen that before when logging laptop behaviour on Win98 boxes. They sleep, but they don't wake up.

One finding so far: pulse-audio can't suspend unless you have the machine name set to 127.0.0.1 in /etc/hosts, which may be the Ubuntu default, but is not much good for Hadoop or RMI on the laptop. It's sudo's fault, and I may go for a network-unfriendly hostname just to see if that is the cause. That or try out Windows 7 for a while.
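
To see which way a given machine is set up, a small check along these lines would do. This is only a sketch: the class name `HostnameCheck` is made up, and the output depends entirely on the local /etc/hosts and resolver configuration. It asks Java what the local hostname resolves to and whether that is a loopback address, which is the case that tends to upset services that advertise their own address to remote peers.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper: reports whether the local hostname maps to loopback. */
final class HostnameCheck {

    /** True for any loopback address (127.0.0.0/8 or ::1). */
    static boolean isLoopback(InetAddress addr) {
        return addr.isLoopbackAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        // Resolves the machine's own hostname using the system resolver,
        // which normally consults /etc/hosts first on Linux.
        InetAddress local = InetAddress.getLocalHost();
        System.out.println(local.getHostName() + " -> " + local.getHostAddress()
                + (isLoopback(local) ? " (loopback)" : " (not loopback)"));
    }
}
```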

Thu 7 Jan 2010
PDF vulnerabilities

The computing industry has got so used to criticising Windows for being insecure that it's got complacent. Anything that opens content from an untrusted source -which means any email, any URL- is vulnerable and has to be kept secure. The big problem with Windows and OS X is that neither platform has support for updating all your installed apps, keeping them secure.

Which is a problem given how ubiquitous PDF readers are in the enterprise. SANS has a good analysis up of a malware attack in a PDF for which Acrobat Reader does not have a patch. The only way to secure it is to turn JavaScript off.

This raises a question. For me, the primary use of Acroread is to read PDF files. No scripting, no browser integration. So why does acroread have these features? It's feature creep for the benefit of Adobe: "Let's make acrobat a platform! No need for HTML and web forms! We can do it all in PDF!". This benefit imposes a cost on all users: we have to keep our systems up to date, worry about every Windows VM, add something else to the weekly Linux updates. And until such 0-day exploits are fixed, worry a lot.