today we live in a world with ever more digital data from an ever increasing number of sources, and a world in which all of this data is ever more connected via the web. information technology, no longer controlled by an ordained elite with the power to decide by whom, how, and wherefore information is created, processed, and distributed, is now largely in the hands of "the people", who are using the means at their disposal to create massive amounts of data with an unprecedented level of freedom and ease, driving unprecedented levels of creativity and innovation, as well as noise. several important open standards for how this data is represented and distributed have been critical in setting this tidal wave of information in motion - TCP/IP, HTTP, and HTML being chief among them. the philosophy of "open source" computer code has been important as well.
okay, we know all this, i hear you saying. get to the point, you say. we're gettin there ...
by and large the data in this tidal wave is unstructured. HTML being in large part a standard for marking up unstructured text, this makes sense. while Google does an admirable job of helping you harvest this sea of unstructured data, it can't help you with all that structured data out there, much of which is locked up in relational databases behind firewalls and only presented to the outside world in chopped up, regurgitated, mixed-with-HTML form. what's missing is a standard for structured data that will scale to the broad, decentralized, and open nature of the web. old models of data that worked well within isolated, well-controlled domains will not scale to meet the requirements of a massive, global web of data.
but i misspoke. we do have such a model of data, and anyone interested enough to read this far probably knows what i'm about to say: RDF. in RDF, everything has an identifier, called a URI, which is global in scope. more importantly, RDF's structural properties give it the flexibility to accommodate all of the world's structured data in one big structured database - the fabled "Semantic Web" - that could be queried with a language as powerful as SQL is for relational databases. don't underestimate the gravity and presumption of this statement. all of the data now locked up in relational database silos, and in non-relational ones, with the great multitude of world views, concepts, and prejudices that the schemas underlying those databases embody, could be united into one giant database. and then, at any time, anything, anywhere, could be related to anything anywhere else in the world, in any way, by merely creating a labeled pointer, and a query involving the relationship between these two things could then be executed. the phrases "at any time" and "in any way" are key here. in RDF the relationships are dynamic, rather than being predefined by a schema as they are in the relational world.
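to make the "labeled pointer" idea concrete, here is a minimal sketch in python. it is a toy in-memory store, not a real RDF library; the `TripleStore` class, its `match` method, and every URI in it are made up for illustration, and literals are treated as plain strings for brevity.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class TripleStore:
    """Toy store: a set of statements, with no schema anywhere."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()

    def add(self, s: str, p: str, o: str) -> None:
        # a statement is just a labeled pointer from s to o
        self.triples.add((s, p, o))

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> Iterator[Triple]:
        # None acts as a wildcard, roughly like a variable in a query pattern
        for t in self.triples:
            if ((s is None or t[0] == s)
                    and (p is None or t[1] == p)
                    and (o is None or t[2] == o)):
                yield t


# hypothetical URIs from two unrelated sources
ALICE = "http://example.org/people/alice"
PAPER = "http://example.net/papers/42"
REVIEWED = "http://example.org/vocab/reviewed"

store = TripleStore()
store.add(ALICE, "http://example.org/vocab/name", "Alice")
store.add(PAPER, "http://example.net/vocab/title", "A Paper")

# later, relate the two things with a predicate nobody planned for
store.add(ALICE, REVIEWED, PAPER)

# and query on the new relationship right away
reviewed = sorted(store.match(s=ALICE, p=REVIEWED))
print(reviewed)
```

nothing had to be declared before the third `add` call: the new relationship is queryable as soon as the statement exists, which is the "at any time, in any way" property in miniature.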
"wow - data integration nirvana!", some who have worked in enterprise data integration might say. but then they would scratch their heads and say, "it's not so simple as that". there are all kinds of issues surrounding how data from different sources was modeled, the meanings of the different fields and tables and such, formatting issues, and all that dirty data out there. but this would only underscore RDF's unique potential as a model of structured data for the web. these sorts of problems have perenially plagued those working in the trenches of enterprise data integration efforts. many of these problems are in large part due to the fact that there is no perfect schema; the corporate data model is a myth; or as clay shirky would say: "ontologies are overrated". and rather than going away, these problems are only magnified exponentially when you scale out to the web. the genius of RDF is that it doesn't see resolving all of these "ontological" issues as a prerequisite for integration (that is, unless you're in the ontology-oriented RDF camp, in which case you see the use of ontologies modeled in languages like OWL as a key component of the semantic web. i actually believe that the dissonance in the discourse about RDF and the semantic web, between discussions of its fundamental flexibility on the one hand and very esoteric discussions about ontologies on the other, is largely responsible for the confusion surrounding it, and for how slow RDF has been on the uptake). we can unify and connect all of the world's structured data even though it's all quite messy, complicated, and multi-faceted. and even as there is ever more data produced, and the lines we draw in the data are continually erased and redrawn, RDF accomodates all of this roiling diversity, change, instability, and uncertainty quite well.
of course, the class model is rarely perfect and often changes. iterative development styles and refactoring techniques arose to address this reality. more recently, reflection-based techniques and dynamic byte-code manipulation are the rage, allowing for programs that are more robust and flexible in the face of variability in class structures. but these techniques are rather cumbersome to use, and seem like a big ugly patch on a language that is fundamentally statically typed. prototype-based languages, on the other hand, start out with the assumption that you cannot predefine a perfect class model. there are no classes of data, only instances. some of those instances may serve as prototypes for other instances, but by and large the language is much more empirically oriented than formally oriented.
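the prototype idea can be sketched in a few lines of python. this is only an illustration of delegation between instances, not how any particular prototype-based language implements it; the `Proto` class and its `get`, `set`, and `clone` methods are invented for the example.

```python
class Proto:
    """An instance with named slots that delegates missing slots to a parent instance."""

    def __init__(self, parent=None, **slots):
        self.parent = parent
        self.slots = dict(slots)

    def get(self, name):
        # look in our own slots first, then walk up the prototype chain
        obj = self
        while obj is not None:
            if name in obj.slots:
                return obj.slots[name]
            obj = obj.parent
        raise KeyError(name)

    def set(self, name, value):
        # slots can be added to any instance at any time
        self.slots[name] = value

    def clone(self, **overrides):
        # a new instance that uses this one as its prototype
        return Proto(self, **overrides)


animal = Proto(sound="...", legs=4)
dog = animal.clone(sound="woof")
bird = animal.clone(sound="tweet", legs=2)

# the prototype changes after the clones exist, and the clones see the change
animal.set("alive", True)

print(dog.get("sound"), dog.get("legs"), dog.get("alive"))  # prints: woof 4 True
```

there is no class model to get wrong here: `animal` is just another instance that happens to serve as a prototype, and its shape can keep changing while the program runs.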