Way back in 2003, when we were really just starting 1060, I
wrote a paper trying to put into perspective the Web, Web-services,
XML and NetKernel. It no longer features on the new site so I thought I'd post
it here as an historical record.
How has time treated it? Well, it's dated, that's for sure. Things
it gets right: AJAX can be seen as an inevitable consequence of the
thesis that web-applications are the original web-service. AJAX can
be considered the fine-grained composition of web 'pages' from
data-services, and on that point the paper is spot on. It's still true that
XML is too damned hard to do with object-oriented libraries, but the paper
probably over-emphasises declarative design and so misses the
broader rise of 'dynamic languages'.
Also, it was written before we really appreciated and understood
the significance of URI addressing and RESTful composition. I think
at the time I wrote this we hadn't even booted the first NetKernel
kernel and we were still working with the original HP codebase
(that didn't have a microkernel).
Finally, I'll let you decide if the modest final paragraph holds
true ;-)
NetKernel
From Websites to Internet Operating
Systems. A perspective on the evolution of the web and a rethink of
web-services.
Peter Rodgers
(C) 2003, 1060 Research Limited
This whitepaper presents thinking that crystallised during the
development of the 1060 NetKernel from its conception in
Hewlett-Packard to its realization at 1060 Research. Many of the
concepts will be familiar. Perhaps the general patterns that are
presented will offer a novel perspective. I am indebted to my
former colleagues Dr Russell Perry and Dr Royston Sellman for
nucleating ideas and proving concepts that made NetKernel
possible.
In the recent past we have seen a number of technologies that
promise to expose service interfaces using XML. Take, for example,
HP's e-speak, XML-RPC and SOAP. Whilst these technologies have their
merits, they each neglect the declarative power of the web and
instead adopt a procedural model for service composition. Each is
therefore a point solution to what is a universal requirement to
take a generalized approach to Internet resource processing.
This whitepaper begins by offering a history of the evolution of
the web and web-technologies and attempts to offer a perspective on
this mysterious XML stuff. It probably misses many important
details but it does firmly assert, to paraphrase, XML is the
message, XML is not the medium.
History
The Web is a stunning application. It is scalable to huge
numbers of concurrent requests, it enables the presentational
integration of content and of databases and seamlessly combines
with foundational internet technologies such as email; whilst at
the same time it is essentially very simple: a declarative language
for presentational layout, a rendering engine, a stateless
client-server protocol.
During its rapid development the Web has undergone several
overlapping epochs, each characterized by a different driver. For
the sake of argument let's take for granted the explosive expansion
of the internet itself and its exponential increase in value as
represented by Metcalfe's Law.
Initially the Web was driven by the web-browser rendering
engine. Improvements in client processing made the presentational
layer more capable and created a pull which drove the evolution of
the HTML standard and latterly ancillary standards such as
Cascading Stylesheets.
The next discernible epoch was the move to the stateful,
dynamic Web, characterized by CGI, cookies and the database-backed
web-server. It's arguable, but this technological phase was driven
by or drove the South Sea Bubble of commercial frenzy that the
birth of the web created. SSL, or the armour plating of HTTP,
emerged as an ancillary supporting standard.
J2EE saw the evolution of the web-server to today's scalable,
stateful middleware integration platforms and marked the
encroachment of the presentational Web front-end into the back-end
systems of the enterprise.
Lastly we have the generalization of HTML into XML. With less
perspective we can guess at a tenuous driver: client diversity,
motivated by accessibility, WAP and other mobile internet devices,
speech, and probably dominantly driven by the inconsistencies in
HTML implementations of browser rendering engines. With XML came
XSLT transforms that allowed any XML dataset to be transformed into
an HTML/XHTML representation.
Each phase added more capabilities, each was compatible with
the previous and each was essentially oriented towards enhancing the
central client-server presentational Web application model. Until
it was realized that XML didn't have to be about presentation...
The XML Revolution
XML is very simple. XML is super-ASCII, and yet we are in the
process of undergoing an XML industrial revolution. What makes XML
so powerful?
Web-ness
The Web may be characterized as that domain which observes
uniform adherence to a common presentational markup language.
Historically that markup language has been HTML but increasingly
this is XHTML. This definition of the Web removes the network and
transport protocols, the client and the server, it neglects links;
it simply states that something is Web-like if it processes
HTML/XHTML. This is a gross simplification in comparison to
Fielding's REST architecture for example, but it provides an
interesting axiom to explore.
Suppose now we assert that the historical driver, the
presentational application, is only one application in a wider
context. The definition of the Web can be recast as: that domain
which observes uniform adherence to a common markup language. This
can be expressed more succinctly as the Web is XML, or perhaps, XML
will be the Web.
So data expressed in XML acquires the capability to be
consumed in the Web and processed by web-technologies. If web-sites
are pervasive and web communication is mandatory in business it
makes sense to ensure data is encapsulated in XML. It is not the
XML itself that is powerful; it is the potential offered by the universal
set of applications that can process XML data. It is
XML's communicability that makes XML powerful and which is fuelling
the XML-transition.
XML-Services
Before XML the presentational HTML-application demanded a tight
binding between web-server and web-browser. XML liberates this
relationship. Applications which consume XML can be
non-presentational; they can even be non-networked.
The XML-transition is leading to a loosely coupled interplay
between XML-services and XML-clients. It is this rich relationship
between XML-providers and XML-consumers which the industry is
frenetically labelling 'web-services'. More correctly we should
call them XML-services. In XML-services we see that the
presentational web-site is simply a particular class of
XML-service; that is, the Web is the original
web-service.
We should also contextualize SOAP. SOAP is a protocol. SOAP
services are dominated by Remote Procedure Calls (RPC) over XML and
are a special class of more general XML-services. SOAP is
essentially independent of the presentational XHTML web
application; it offers hard-wired client-server applications.
SOAP is not backwards compatible with the Web.
SOAP-RPC services generally endeavour to cast XML to local
object types so that procedural applications can transparently
interact over an HTTP transport. This is cool, it is CORBA-lite;
but as we will discuss below, by casting XML into the procedural
domain it freezes the fluid communicability promised by XML.
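The casting of XML into local object types can be sketched with Python's standard xmlrpc.client module. This is an illustration only: it uses XML-RPC, one of the technologies named earlier, in place of SOAP because the standard library supports it directly, and the method name "add" and its arguments are invented for the example.

```python
import xmlrpc.client

# Marshal a procedure call into an XML-RPC request document.
# The method name "add" and the arguments are hypothetical.
request_xml = xmlrpc.client.dumps((2, 3), methodname="add")

# The receiving side unmarshals the XML straight back into native
# objects. The XML exists only on the wire; neither program ever
# treats it as a document in its own right.
params, method = xmlrpc.client.loads(request_xml)

print(request_xml)
print(method, params)
```

The application on either side sees only a tuple and a method name, so any consumer that is not built around this exact procedure signature gains nothing from the fact that XML was used in transit.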
Why would a technology such as SOAP that has attracted so much
attention relegate XML to the transport medium? It's because
procedural XML processing is too hard...
XML Processing
Procedural XML
Procedural processing of XML, that is, writing code to process
XML, is complex, fragile, verbose and hard to maintain,
irrespective of the API or platform chosen for the XML processing
application. It is the difficulty of processing XML that has driven
the procedural RPC-object binding seen in SOAP and it is the
difficulty that has minimized the extent to which XML can be
combined with custom applications.
XML processing has three stages: Parsing, Transformation and
Serialization. XML's Web heritage has subconsciously influenced the
processing patterns associated with standard XML technologies at
each of these stages.
Parsing XML is a basic pre-requisite of any XML application. An
application requires an in-memory representation of an XML
resource; this could be an event stream, as in SAX, or an object
model as in DOM, or it could be a combination using a pull-parser
object model.
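As a rough illustration of these three in-memory representations, the sketch below extracts the same values with Python's standard xml.sax, xml.dom.minidom and xml.dom.pulldom modules. The sample document and the ItemCollector class are invented for the example.

```python
import xml.sax
import xml.dom.minidom
import xml.dom.pulldom

# Hypothetical sample document.
DOC = "<order id='42'><item>tea</item><item>milk</item></order>"


class ItemCollector(xml.sax.ContentHandler):
    """Event-stream style: state must be tracked by hand across callbacks."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_item = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "item":
            self._in_item = True
            self._buffer = []

    def characters(self, content):
        if self._in_item:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "item":
            self.items.append("".join(self._buffer))
            self._in_item = False


# 1. Event stream (SAX).
handler = ItemCollector()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_items = handler.items

# 2. Object model (DOM): the whole tree is built before any query runs.
dom = xml.dom.minidom.parseString(DOC)
dom_items = [node.firstChild.data for node in dom.getElementsByTagName("item")]

# 3. Pull parser: stream events, expanding only the subtrees of interest.
pull_items = []
events = xml.dom.pulldom.parseString(DOC)
for event, node in events:
    if event == xml.dom.pulldom.START_ELEMENT and node.tagName == "item":
        events.expandNode(node)
        pull_items.append(node.firstChild.data)

print(sax_items, dom_items, pull_items)
```

Even for this trivial extraction the three styles share no code, which gives a feel for how much of an application ends up bound to the choice of parser model.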
Unfortunately the parsing stage has been overloaded with other
XML processing considerations. Validation, both DTD and Schema, is
a parse-time process. Validation technologies allow external
resources to be requested and incorporated at parse-time, so that
parsing requires a full-blown resource management
infrastructure. Equally, namespace support is a configurable parser
option which leaves the developer unsure if a namespace is a
mandatory semantic feature of the XML document or an ignorable
formatting option.
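The namespace point can be seen in Python's standard xml.sax module, where namespace processing is a feature flag on the parser. In this sketch the sample document, the urn:example:inventory namespace and the NameRecorder class are assumptions made for illustration.

```python
import io
import xml.sax
from xml.sax.handler import feature_namespaces

# Hypothetical document using an invented namespace URI.
DOC = b"<inv:order xmlns:inv='urn:example:inventory'><inv:item/></inv:order>"


class NameRecorder(xml.sax.ContentHandler):
    """Records element names exactly as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        # Called when namespace processing is switched off.
        self.names.append(name)

    def startElementNS(self, name, qname, attrs):
        # Called when namespace processing is on; name is (uri, localname).
        self.names.append(name)


def element_names(namespace_aware):
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, namespace_aware)
    handler = NameRecorder()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(DOC))
    return handler.names


print(element_names(False))
print(element_names(True))
```

The same bytes yield plain prefixed strings in one configuration and (URI, local name) pairs in the other, so code written against one setting does not work against the other.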
Transformation is any act of manipulating the XML tree
structure. XSLT is the standard transform processing technology
though direct manipulation of the DOM is also a transform. Again,
in XSLT, external entities can be referenced and incorporated into
a transform making it a full-blown resource management task.
The DOM is a complete representation of a document and yet it
offers a baroque API which makes any relatively complex
manipulation of an XML structure into an exercise in pointer
arithmetic that a C hacker would run away from. The DOM's uniform
treatment of nodes breaks down with an arbitrary distinction
between elements and attributes or between the document node and
the root element, all of which require elaborate and error-prone
corner-case code to be developed.
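A few of these corner cases can be shown with Python's standard xml.dom.minidom, used here only as a convenient DOM implementation. The sample document is invented.

```python
import xml.dom.minidom

# Hypothetical sample document.
doc = xml.dom.minidom.parseString("<order id='42'><item>tea</item></order>")
root = doc.documentElement

# The document node and the root element are different nodes of
# different types, although both sit at the "top" of the tree.
doc_is_document = doc.nodeType == doc.DOCUMENT_NODE
root_is_element = root.nodeType == root.ELEMENT_NODE
root_parent_is_doc = root.parentNode is doc

# Attributes are nodes, yet they are not children of their element
# and have no parent; they are reached through a separate API.
attr = root.getAttributeNode("id")
attr_is_child = any(child is attr for child in root.childNodes)
attr_parent = attr.parentNode
attr_owner_is_root = attr.ownerElement is root

# Reading one text value means stepping element -> element -> text node.
item_text = root.firstChild.firstChild.data

print(doc_is_document, root_is_element, root_parent_is_doc)
print(attr_is_child, attr_parent, attr_owner_is_root, item_text)
```

Generic tree-walking code therefore needs separate branches for the document node, for elements and for attributes, which is where much of the corner-case code comes from.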
Serialization, though a lesser issue, requires careful
consideration of whitespace and is an all-or-nothing operation.
Related to serialization is the lack of a model, other than a
simple linear transform chain in XSLT, to pipeline intermediate
processing results together.
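The whitespace concern can be illustrated with a round trip through Python's standard xml.dom.minidom. The sample document is invented, and pretty-printing is only one of several ways in which serialization can alter a tree.

```python
import xml.dom.minidom

# Hypothetical sample document with no whitespace between elements.
doc = xml.dom.minidom.parseString(
    "<order><item>tea</item><item>milk</item></order>"
)
children_before = len(doc.documentElement.childNodes)

# Serialize with indentation, then parse the result again.
pretty = doc.toprettyxml(indent="  ")
reparsed = xml.dom.minidom.parseString(pretty)
children_after = len(reparsed.documentElement.childNodes)

print(children_before, children_after)
```

The indentation added for human readers comes back as extra text nodes, so the reparsed tree is no longer the tree that was serialized, and any code that counts or indexes child nodes is affected.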
Finally there exists a set of XML processing technologies that
place infrastructural demands on the procedural domain that cannot
be met in a typical application. Take for example XPointer,
XInclude and XML Base, all of which may reference or incorporate
resources from arbitrary URI-specified locations, a task which
really requires a general universal resource resolver
infrastructure. We need not even discuss application level
technologies such as XML Signature or XACML.
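To make the resolver requirement concrete, here is a small sketch using the XInclude support in Python's standard xml.etree.ElementInclude. The "mem:" URI scheme, the RESOURCES table and the documents are hypothetical; the point is that the application has to supply its own resolver (the loader function) before inclusion can work for anything other than local files.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical resource space keyed by URI. The "mem:" scheme is invented.
RESOURCES = {"mem:/address.xml": "<address>1 Example Street</address>"}


def loader(href, parse, encoding=None):
    """Application-supplied resolver: maps a URI to a parsed resource."""
    text = RESOURCES[href]
    if parse == "xml":
        return ET.fromstring(text)
    return text


DOC = """<customer xmlns:xi="http://www.w3.org/2001/XInclude">
  <name>Ada</name>
  <xi:include href="mem:/address.xml"/>
</customer>"""

root = ET.fromstring(DOC)
# Replace each xi:include element with the resource the loader returns.
ElementInclude.include(root, loader=loader)

print(ET.tostring(root, encoding="unicode"))
```

Every application that wants inclusion from arbitrary locations ends up writing some variant of this loader, along with the caching and error handling that a real one would need.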
In summary, procedural XML processing is complex, partially due
to the APIs and design decisions of processing models but mostly
because XML is extensible, flexible and intrinsically declarative
in nature.
More History
The picture we have painted of the current state of XML
processing has historical parallels. The Relational Database
Management System (RDBMS) is a pervasive technology that functions
on top of a declarative abstraction of the underlying
implementation. The SQL database today offers a clean separation of
code, data model and administration. It was not always this
way.
Early database systems were bespoke. Each database could have
several custom client applications; for users, for administrators,
for backup. Databases were written as programmatic procedures; they
were complex, difficult to maintain and fragile. There is an
impedance mismatch between databases and procedural code. There is
an impedance mismatch between XML and procedural code.
Declarative Processing
The development of Web-sites and software services has
transformed the traditional software development cycle. Web-sites
are constantly in flux. Services are subject to almost daily
maintenance and upgrade schedules. The X in XML is not for nothing:
it stands for extensible but it means change.
As we've hinted above, procedural XML processing is not
responsive to change. XML applications require an adaptable,
maintainable processing model that readily combines with other
processes and applications. Just as with the historical analogy of
the declarative SQL approach to database development, XML
processing is considerably lower-cost, more robust, adaptable and
maintainable in the declarative domain.
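The contrast can be sketched with Python's standard xml.etree.ElementTree, which supports a limited subset of XPath. The document and both function names are invented for the example; the two functions answer the same question, one by hand-written traversal and one by a declarative path expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = """<orders>
  <order status="open"><item>tea</item></order>
  <order status="closed"><item>milk</item></order>
  <batch><order status="open"><item>sugar</item></order></batch>
</orders>"""

root = ET.fromstring(DOC)


def open_items_procedural(node):
    """Walk the tree by hand; the document's shape is baked into the code."""
    found = []
    for child in node:
        if child.tag == "order" and child.get("status") == "open":
            for item in child:
                if item.tag == "item":
                    found.append(item.text)
        else:
            found.extend(open_items_procedural(child))
    return found


def open_items_declarative(node):
    """State what is wanted and leave the traversal to the engine."""
    return [item.text for item in node.findall(".//order[@status='open']/item")]


print(open_items_procedural(root))
print(open_items_declarative(root))
```

When the document structure changes, the declarative version usually needs only its path expression edited, whereas the procedural version needs its control flow reworked.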
Anyone who's authored a web-page or written an XSLT transform
has created a declarative application. HTML is a set of declarative
instructions to combine resources into a rendered representation of
a web-page. Declarative processing is very adaptable to change, has
low maintenance costs and is readily amenable to development tools.
Declarative processing is a natural fit with XML. What's needed is
a declarative XML processing platform ...
XML Kernel
This whitepaper is not just an overblown history and critique of
the Web and XML technologies. No indeed, it is blatant marketing
propaganda for the 1060 NetKernel.
Brutal honesty aside, we have developed the NetKernel concept
and implementation over several years. Initially we were motivated
by the early boom phase of XML adoption that saw hundreds of
specialist industry groups set up to develop XML languages to
express the data interchanges in their fields. Languages were
created for Insurance, Medical, Geography, Farming ... you name it,
there's probably a DTD or Schema for it.
Our observation was that the desire for industry domain
languages is exciting but economically naive. The cost of
implementing a custom run-time application for every sector would
be more than each industry's participants could bear and certainly,
given XML's slippery ability to morph as people add good ideas, far
too costly to maintain.
Concurrently we had seen the initial precursors to an XML
processing platform in efforts such as the Apache Cocoon XML web
publishing system. However, we felt that, worthy as Cocoon is, its
driver was as a presentational Web server.
Additionally we had HP contacts involved in RosettaNet, still
the largest XML business integration platform. This gave great
insight into the general issues of implementing XML messaging
between IT businesses; the maintenance and scalability of the XML
messages, the associated code and the economics of
implementation.
We felt we needed a generalized XML architecture that could
incorporate any XML technology and present an infrastructure upon
which arbitrary XML applications could be developed. This led to
our development of a common XML platform. Something we called
Dexter - Declarative XML Transform Engine.
Dexter
Dexter was developed with a set of simple concepts in mind.
- Declarative processes. Any XML process must be easily expressed
in a declarative language which itself should be processable
XML.
- Universal Resource Infrastructure. All low-level resource I/O
such as sourcing, parsing, caching and serializing must be handled
transparently.
- Everything is a Document. Everything is expressible as an XML
document including low-level system resources and components.
- References are URIs. All resources are addressed through
URIs.
- Clean Modular Procedural Code Separation. New XML technologies,
custom XML transformation and procedural XML processes must be
easily added as modules.
- Transport Independent. HTTP is just one of many transports;
treat all transports equally.
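To give a feel for how the concepts in this list fit together, here is a toy sketch in Python. It is not Dexter's or NetKernel's actual language or API: the pipeline vocabulary, the "mem:" URI scheme, the RESOURCES table and the module names are all invented for illustration. The process is itself an XML document, resources are addressed by URI, and procedural code is confined to small pluggable modules.

```python
import xml.etree.ElementTree as ET

# Hypothetical resource space: every resource is addressed by a URI.
RESOURCES = {
    "mem:/orders.xml": "<orders><order>tea</order><order>milk</order></orders>",
}

# Hypothetical declarative process, itself expressed as processable XML.
PIPELINE = """<pipeline source="mem:/orders.xml">
  <step module="select" path="order"/>
  <step module="upper"/>
  <step module="wrap" tag="result"/>
</pipeline>"""


def resolve(uri):
    """Resource infrastructure: sourcing and parsing are hidden from modules."""
    return ET.fromstring(RESOURCES[uri])


def select(doc, step):
    """Keep only the children of doc that match the step's path."""
    out = ET.Element(doc.tag)
    out.extend(doc.findall(step.get("path")))
    return out


def upper(doc, step):
    """Upper-case the text content of every element."""
    for element in doc.iter():
        if element.text:
            element.text = element.text.upper()
    return doc


def wrap(doc, step):
    """Wrap the document in a new root element named by the step."""
    out = ET.Element(step.get("tag"))
    out.append(doc)
    return out


# Procedural code is added as modules; a new one is one more table entry.
MODULES = {"select": select, "upper": upper, "wrap": wrap}


def run(pipeline_xml):
    pipeline = ET.fromstring(pipeline_xml)
    doc = resolve(pipeline.get("source"))
    for step in pipeline.findall("step"):
        doc = MODULES[step.get("module")](doc, step)
    return ET.tostring(doc, encoding="unicode")


print(run(PIPELINE))
```

Changing the process means editing the pipeline document rather than the code, and nothing in the modules knows whether the source document arrived over HTTP, from disk or from memory.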
During the development of Dexter the original web-services star
rose and fell within HP. Clever technology, too complex and not
related to the declarative Web model. Following this we observed
the rise of SOAP - clever technology, cool but over-hyped, not
related to the declarative Web model. For both we were bemused by
the relegation of XML to the transport layer, especially given the
evidence of people's desire to express real exchanges in XML.
As Dexter has evolved and been completed in version 2, now known
as the 1060 NetKernel, we have discovered the freedom to manipulate
XML in any manner we can conceive. The abstraction provided by a
Universal Resource Infrastructure presents a clean and consistent
separation of standard XML technologies, custom code and
declarative application components. It is only after having created and
built XML applications which execute billions of XML operations on
the NetKernel that we have understood the challenges faced by
Web-service developers constrained to procedurally processing
XML.
We are biased: we built the 1060 NetKernel and we want you to
try it, but we honestly think it is the technology that liberates
XML from the presentational web application and which truly can
provide the basis for the explosion of XML applications that will
come ...