Metadata and vagueness

posted: Oct 31, 2003 7:58:53 PM

I am a huge fan of David Weinberger's writings and blog, but his recent Metadata and Desire article seems pretty vague. In general, the idea that data and metadata are similar, but different, is often confusing, and I think David's arguments about the limitations of metadata fail to clarify what he really means by metadata.

In a closed system of data, say a table listing first name and last name, there can be metadata—for example, a catalog which lists that this table is called NameTable and that it lives on my C drive. The table is data, the table definition is metadata. Clean boundary.
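As a rough sketch of that boundary, here is the same idea using Python's built-in sqlite3 module. The in-memory database, the column names, the sample row, and the function name describe_name_table are all assumptions for illustration. The rows come back from the table itself, while the table's name and columns come back from SQLite's own catalog (sqlite_master and PRAGMA table_info).

```python
import sqlite3


def describe_name_table():
    """Return (rows, catalog entries, column names) for a tiny example table."""
    conn = sqlite3.connect(":memory:")  # throwaway database, illustration only
    conn.execute("CREATE TABLE NameTable (first_name TEXT, last_name TEXT)")
    conn.execute("INSERT INTO NameTable VALUES ('Ada', 'Lovelace')")

    # Data: the rows stored in the table.
    rows = conn.execute(
        "SELECT first_name, last_name FROM NameTable"
    ).fetchall()

    # Metadata: what the database's catalog says about the table.
    catalog = conn.execute(
        "SELECT type, name FROM sqlite_master WHERE name = 'NameTable'"
    ).fetchall()

    # More metadata: column names (second field of each table_info row).
    columns = [row[1] for row in conn.execute("PRAGMA table_info(NameTable)")]

    conn.close()
    return rows, catalog, columns
```

Note that, in this particular system, the catalog is itself read with an ordinary query, so the boundary is clean but the metadata is still reachable in the same way the data is.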

If you don't have a strictly closed system, you can still draw boundaries and say what falls inside (data) and what falls outside (metadata). But if you don't make clear where your boundaries are (and, where there are multiple ways to draw them, which set of boundaries you mean), then claims that metadata has issues different from data's end up pretty vague.

For example, David writes:

But the most important fact about metadata is also its limitation. For metadata to work, it has to be part of a "schema," a set of categories and relationships that computers can make sense of.

How is this different from data? Is the numeral 23 not part of a "schema"? It is, in fact, part of a schema called a number system. Same with alphabets, dates, addresses (at least to some degree), etc.
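A small illustration of the number-system point, as a sketch only: the function name read_numeral is made up for this example. The same two characters, "2" and "3", yield different values depending on which positional system (which schema) you read them under.

```python
def read_numeral(numeral, base):
    """Interpret a string of digits under the given positional number system."""
    return int(numeral, base)


decimal_value = read_numeral("23", 10)  # read as base 10
octal_value = read_numeral("23", 8)     # read as base 8
hex_value = read_numeral("23", 16)      # read as base 16
```

Without knowing the base, "23" is just two marks; the schema is what turns it into a datum.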

I think what David is getting at is that, in an open system like the Internet, creating new metadata schemas and sharing them is not easy. But, I think the same can be said for data schemas as well—and I am not sure there is that big a difference.

Is it really a bigger philosophical dilemma if people try to use FOAF than if they try to use a two-digit year date format? Is it really a bigger philosophical issue converting from one metadata schema to another than converting from one date format to another?

One of the reasons boundaries are drawn between data and metadata is that, typically, data is the stuff that is easy to convert and manipulate, and metadata is less so. In a database, it is easier to display a date in a different format than to restructure the database. In some other systems, the opposite may be true.
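To make the "easy to convert" side concrete, here is a minimal sketch using Python's standard datetime module. The function name reformat_date and the sample dates are assumptions for illustration. Redisplaying a date is a one-line conversion, although even here a schema decision is hiding in the two-digit year: Python's %y follows the POSIX convention of reading 69 through 99 as 1969 through 1999 and 00 through 68 as 2000 through 2068.

```python
from datetime import datetime


def reformat_date(text, source_format, target_format):
    """Parse a date string in one format and render it in another."""
    return datetime.strptime(text, source_format).strftime(target_format)


# A two-digit-year US-style date redisplayed as an ISO-style date.
iso_2003 = reformat_date("10/31/03", "%m/%d/%y", "%Y-%m-%d")

# The same conversion for year "69" lands in the previous century,
# because of the pivot convention baked into %y.
iso_1969 = reformat_date("10/31/69", "%m/%d/%y", "%Y-%m-%d")
```

Restructuring the table that holds those dates (splitting columns, renaming them, changing their types) has no comparable one-liner, which is roughly where the customary data/metadata line gets drawn.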

In an open system, if people work with metadata like it's data, well, then it is like data to them. I think folks who query RDF data using ontologies to map between schemas basically treat schemas like data (from the point of view of the ontology metadata).

To me, the real issue is that all metadata should be allowed to be data, and made accessible as data. The semantic web might envision something like this, except it seems to keep requiring more and more abstractions—in other words, in order for something to be more metadata, it has to be expressed in a more abstract system than the data it describes.

I think David's critique is actually in reaction to this. As he aptly concludes his article:

But that means that metadata, an abstraction of an abstraction, is directly and intimately tied to human projects and human desire. And what's desire? Nothing but the way we're pulled into the world, over and over, against our will and in ways that constantly surprise us. So, the increasing need for metadata pulls us out of the world as our desire continuously pulls us into the world.

Again, I think he could say the same about data. But I also think that the reason for the increasing abstraction is perhaps the desire for increasing abstraction, rather than a need for metadata itself.


Comments and Trackbacks

Comment by: David Weinberger
posted: Nov 1, 2003 8:10:24 AM

Jay, this is really helpful. Thanks. You're right that I'm vague about the diff between data and metadata (md). That's somewhat on purpose since I think it - like all useful terms - is loosely defined. MD is necessarily vague because there's no such thing as a datum on its own; it only becomes a datum if it's part of an information scheme, with the row/col headers counting as sorta kinda md. I suspect, however, that you disagree that md is loosely defined; that could be the core of our disagreement. You're of course right that md isn't new and I didn't mean to say otherwise. What's new is the amount of it and our increasing need to manage it, and its effect on privacy, and the weird set of beliefs (from my POV) about md as exemplified by the adherents of the grand version of the Semantic Web. I disagree with you about date formats vs. FOAF. Once we agree on the categories we need for FOAF then the discussion becomes like the date format discussion; it's the discussions before that that are interesting and important and fraught with impossibility. I do think that there are societal forces bringing us to greater abstraction; the cowardly urge for irony is one such force. But the explosion of information and its consequent need for more/better md, combined with our cowardly decision to swap real freedom for the illusion of security, combined with the worldwide opportunity to mistake human voice for information, all directly feed our desire for md...which then in turn increases the abstractness with which we live.

trackback from: A Networked World
posted: Nov 3, 2003 12:12:07 AM
title: Metadata and the Joy of Vague Boundaries

Dave Weinberger points with approval to Jay Fienberg's reflexive post on Metadata and vagueness. I agree with most of what both of them are saying, so I wonder why all this stuff about metadata is such a dog's breakfast. Why

trackback from: the iCite net development blog
posted: Nov 11, 2003 3:57:48 PM
title: Semantic web, my last word?

Since I have been talking too much about this already, I thought I should declare my "last words" on the matter, and move on to some other topics. So, my last words are this:

