Uncooked First Draft, 3 September 1999, some revision 18 September. All comments and suggestions are welcome. Many of the ideas in this document were inspired by Walter Perry's editorial at the 1999 XML Developer Days, as well as David Megginson's presentation on RDF and assorted conversations throughout the conference. All blame for this essay, of course, is mine.
We are designing this, not investigating it as a natural phenomenon.
- Tim Berners-Lee, on xml-dev.
A common thread runs through most of the existing literature on XML and SGML schema development: 'Put all the experts in a room. Let them combine their skills and their knowledge to create information models that best reflect the material being described. Then disseminate that knowledge widely, binding the rest of the world to the agreement reached by the experts.' This recipe has dominated both the recommendations the XML community typically makes for using XML's schema and DTD facilities and the process by which XML and its supporting standards have themselves been created. While it does soothe the nerves of those troubled by XML's apparent potential for anarchy, this recipe for top-down development creates its own problems, and obscures other development styles that may be more appropriate to Internet applications.
Exploring the problems that exist in expert-based systems is difficult, unless of course one has the benefit of doing a post-mortem. The XML 'Experts Community' is very much alive and doing business, so critiquing it is both difficult and dangerous. This discussion of problems will remain brief and general, in order to move discussion to the alternative models and avoid antagonizing people who undoubtedly (and often rightfully) believe that they are making substantial contributions.
A number of simple problems combine to create significant difficulties for experts building systems that must be shared by large numbers of users:
Experts and users often have different perspectives on key issues like 'what constitutes ease of use' and 'how should new technology fit with existing usage'. Experts are experts because they've demonstrated their ability to solve problems on some level. Unfortunately, that doesn't mean that the experts have learned to solve problems of the kinds that afflict ordinary users. 'Gurus' who are accustomed to using a particular set of tools and are experienced in using those tools often have difficulty understanding users who value those tools differently (or not at all). An expert's 'solution' to a problem may simply be another problem to an ordinary user.
Experts often come from a different community than the users whose problems they are setting out to solve. While their different viewpoint may in fact be important and constructive, in many cases the clashes between the expectations of the expert's 'home' community and the community for which the expert is working make implementation of the 'best' solution difficult.
In most technological communities, not all members are equal. In many cases, there are community members for whom a particular set of technological decisions is critical, but who cannot easily commit the money or time needed to participate in the organizations that develop standards. In other cases, these members have learned from experience that 'some animals are more equal than others', and are dubious about the value of any such process.
Companies and other organizations that rely on hierarchies like to assume that managers have an understanding of the work that goes on below them. In fact, what managers typically deal with are simplifications of the work done for them, with simplifications continuing up the chain of command. Similar simplifications may obscure the 'real' nature of complex situations when a group of experts attempts to create a single solution for a large set of issues.
Everyone likes to do a good job, and leaving behind unfinished business may seem to tarnish the quality of the work. When a group sits down to solve a problem, especially in data modeling, the result is often a complete specification that attempts to cover all bases. The nature of XML validation encourages such an approach, providing an automated way for developers to make sure an entire (not a partial) document structure conforms to a schema. The need to build such complete solutions also often reduces the chances of developing multi-part solutions, in which multiple interdependent components may contribute to different aspects of solving a particular problem. Because they tackle large problems (and work in committees), experts often have to compromise by making contested features optional, and that 'options' approach can lead to incompatible implementations that purport to be the same solution.
If the expert model used by standards organizations has important negative side effects, are there any good ways to build XML models that don't produce those effects? Most of the alternatives, though threatening 'anarchy', involve bringing the decision-making process to larger groups of people.
Distributed decision-making has always had a bad reputation in the computing world, where a small core of experts has long had to tend the needs of much larger groups of people. While power users solve some problems, they usually create others - and typically the new problems are more challenging to the computing staff. Keeping business people out of computing and computing people out of business is a common management goal, with support from both sides of the divide. When it comes to higher-level issues, like data structures and network architectures, complex processes designed to bridge these gaps by assessing needs and returning with solutions have typically been used to keep real work on such issues in the 'experts' camp.
If you're willing to take the chance and ask which people really understand what happens in a company, you may be able to encourage those people to describe - and even model - the information they send and receive. Asking employees to model all of the information their employer works with may not make sense, at least to the employer. But showing employees how to get their work done more easily, without inflicting any major costs on them (you'll need to promise not to fire them), can reduce the overall costs to the employer. More efficient transfer of information - even without reduced headcount - can help an organization run more smoothly.
Distributing decision-making in this way is going to require both a change in mind-set and a change in infrastructure. Users are going to need a way to describe and use their documents, as well as tools for accepting other people's documents and establishing relationships. These tools are going to have to present a friendlier face than the current XML syntax, and need to be able to support format conversions (transformations between different XML vocabularies) automatically and easily. 'Letting a thousand flowers bloom' requires a thousand flowerpots, if it is to be successful. These tools are not yet available.
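As a rough illustration of the simplest kind of format conversion mentioned above, the following Python sketch renames elements from one vocabulary into another. The element names, the NAME_MAP table, and the convert function are all hypothetical, invented for this example; real conversions between vocabularies would usually need to restructure content as well as rename it.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from one group's element names to another's.
# None of these names come from a real vocabulary.
NAME_MAP = {"bill": "invoice", "buyer": "customer", "sum": "total"}


def convert(source_xml, name_map):
    """Rename every element that has an entry in name_map.

    Elements without an entry are passed through unchanged, so
    information the mapping does not know about is not lost.
    """
    root = ET.fromstring(source_xml)
    for element in root.iter():
        element.tag = name_map.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


source = "<bill><buyer>Acme</buyer><sum>42.00</sum><note>rush</note></bill>"
converted = convert(source, NAME_MAP)
print(converted)
```

Note that the unmapped note element survives the conversion untouched. A friendlier tool would hide the table and the syntax behind an interface, but the underlying operation need not be much more complicated than this for vocabularies that differ mainly in naming.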
If we can stand to distribute decision-making, we can take advantage of features that are readily available in systems like paper and telephones but not readily available in computers. People can adapt the printed page and the telephone to carry amazingly different types of information. While most of us stick to common approaches, and 'stay inside the lines' while filling out forms, it's possible to break out of the mold. If an invoice needs to have extra information scrawled on it, there are always the margins or some kind of whitespace, and the mere act of writing 'outside the lines' draws attention to exactly that content.
By treating XML as unformed plastic, amorphous, and moldable, rather than trying to distribute the same pieces to everyone, we can bring that same flexibility to XML and let users adapt it to new situations. Rather than simply providing 'escape clause' areas where people can add notes if necessary (though always with a warning that they may be misunderstood or simply discarded), we can let people express through structures as well as language the meaning they are trying to convey. Building extensibility into documents and document descriptions allows the documents to carry information on an as-needed basis.
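One way to picture this kind of extensibility is a reader that processes the fields it recognizes and sets aside, rather than rejects or discards, any structure it does not. The Python sketch below is only an illustration under assumed names: the invoice vocabulary, the KNOWN_FIELDS set, and the read_invoice function are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of fields this particular reader understands.
KNOWN_FIELDS = {"customer", "total"}


def read_invoice(xml_text):
    """Split an invoice's children into known fields and extras.

    Extras are kept as whole elements, structure intact, so they can
    be shown to a person or handed to another tool.
    """
    root = ET.fromstring(xml_text)
    known, extras = {}, []
    for child in root:
        if child.tag in KNOWN_FIELDS:
            known[child.tag] = (child.text or "").strip()
        else:
            extras.append(child)
    return known, extras


doc = """<invoice>
  <customer>Acme</customer>
  <total>42.00</total>
  <deliveryWindow><from>09:00</from><to>11:00</to></deliveryWindow>
</invoice>"""

known, extras = read_invoice(doc)
print("understood:", known)
for extra in extras:
    # Like a note in the margin, unexpected structure gets flagged.
    print("needs attention:", ET.tostring(extra, encoding="unicode").strip())
```

Here the sender has added a structured deliveryWindow element instead of a free-text note. The reader still processes what it understands, and the extra structure is preserved for whoever needs it.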
Leaving XML 'unformed' and letting people create their own structures ensures that XML applications won't be trapped when needs change and old standards are no longer up to the task. It builds in a flexibility that isn't possible with the current approach of centralized schemas and DTDs, allowing XML applications - even applications distributed across thousands of desktops - to grow and evolve into new forms without the need for large consulting contracts and another call to Central Planning.
Experts and standards organizations still have an important role to play even in the brave new world of distributed decision-making. Although many levels of standards development can be handed over to user organizations, centralized decision-making bodies can still contribute. Small standards, which are easy to use and enhance interoperability, can still provide a foundation on which more diverse standards can be built. XML 1.0 and the earliest versions of HTML and HTTP are all examples of these kinds of standards, which enabled large numbers of people to go out and do their own thing. As those standards age, they undoubtedly become more complex, but they have the benefit of experience on their side.
As more and more people create vocabularies, a certain amount of standardization will no doubt emerge, based on the convenience factor it promises. While mapping information between schemas may not be terribly difficult, common vocabularies promise to reduce the need to do such work at all. Rather than starting with a complete vocabulary, however, a distributed approach would let people build their own vocabularies and gradually map their intersections into 'suggested' conventions.
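A very rough sketch of how such intersections might be discovered: collect the element names used in independently created documents and report the names that a large enough share of them have in common. Everything here is an assumption made for illustration, including the sample documents, the suggested_conventions function, and the min_share threshold. Matching on names alone also ignores the harder question of whether two groups mean the same thing by the same name.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def element_names(xml_text):
    """Return the set of element names used anywhere in a document."""
    return {element.tag for element in ET.fromstring(xml_text).iter()}


def suggested_conventions(samples, min_share=0.5):
    """Names used by at least min_share of the sample documents."""
    counts = Counter()
    for xml_text in samples:
        counts.update(element_names(xml_text))
    threshold = min_share * len(samples)
    return sorted(name for name, n in counts.items() if n >= threshold)


# Hypothetical documents from three groups working independently.
samples = [
    "<invoice><customer>Acme</customer><total>42.00</total></invoice>",
    "<bill><customer>Bolt</customer><amount>7.50</amount></bill>",
    "<invoice><customer>Cog</customer><total>9.00</total><po>17</po></invoice>",
]

suggested = suggested_conventions(samples, min_share=0.6)
print(suggested)
```

With these samples, the names shared by at least two of the three documents are customer, invoice, and total. Those would be candidates for a 'suggested' convention, while bill, amount, and po remain local choices.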
While this approach might take longer than an expert community developing standards, it might also better reflect the needs of all involved. Experts might well have a role in exploring intersections and developing solutions that will be optimal, for a time, but the point is to leave final decision-making with users rather than strapping them into a straitjacket someone else built.
Copyright 1999 by Simon St.Laurent.