by Jay Fienberg

Introducing WTF (Whoopdedoo Text Format)

posted: Apr 10, 2005 9:54:30 PM

Note: my original plan was to post this on April 1st so as to further suggest the joke in this otherwise very serious development!

In the process of developing the LinkList for blojsom (aka LL4blojsom), I settled on a non-XML data file format that I've found convenient enough that I've decided to release it as its own sub-project of the iCite net. The format and project are called WTF, which is an acronym for "Whoopdedoo Text Format".

I'll be releasing a full "spec" for WTF, but if you read its description in the LL4blojsom documentation, or look at this "powered by blojsom" list in WTF, you can see that it's pretty simple.

So, along with the spec, I'll be releasing a Java library that parses WTF and also does various data operations on Java collections loaded from / saved to WTF sources. I'd also like to (eventually) develop a way for blojsom to use WTF as an alternative format for its blog entries (and comments, etc.).

There is also a more complex part of WTF that I haven't fully documented, but which I believe will allow WTF to fully represent RDF (i.e., it accounts for namespaces). I'm currently working on the transform of my blogroll into FOAF as an example.

(Note also: WTF supports internationalization and different character sets by requiring that character set info about a file be accessible outside of the file, e.g., in the content-type of an HTTP header or in [meta]data in a referencing file—both methods are implemented in LL4blojsom.)
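To make the HTTP case concrete, here is a minimal Java sketch of reading the character set from a Content-Type header value and using it to decode a file's bytes. It is not code from LL4blojsom or from the WTF library: the class name ExternalCharset, its two methods, and the UTF-8 fallback are all assumptions made for illustration. Since WTF requires the charset to be declared outside the file, a real reader might well reject a file with no declared charset instead of falling back.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not part of LL4blojsom or any WTF library.
final class ExternalCharset {

    // Returns the charset named in a Content-Type value such as
    // "text/plain; charset=ISO-8859-1", or the fallback if none is usable.
    static Charset fromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = param.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            if (!name.equals("charset")) {
                continue;
            }
            String value = param.substring(eq + 1).trim();
            // Parameter values may be quoted, e.g. charset="utf-8".
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            try {
                return Charset.forName(value);
            } catch (IllegalArgumentException e) {
                // Malformed or unsupported charset name.
                return fallback;
            }
        }
        return fallback;
    }

    // Decodes raw file bytes using the externally declared charset.
    // The UTF-8 fallback is an assumption of this sketch, not a WTF rule.
    static String decode(byte[] body, String contentType) {
        return new String(body, fromContentType(contentType, StandardCharsets.UTF_8));
    }
}
```

The same lookup could be pointed at metadata in a referencing file instead of an HTTP header; only the source of the charset name would change, not the decoding step.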

WTF can also function as an alternative to a number of XML formats. For example, I'm using it for my blogroll as an alternative to OPML (though note: I will be generating OPML as a transform from WTF too), and I also plan to use it as an alternative to other popular XML formats.

Finally, there are additional (schema related) data format features, outside of XML and RDF, that I need for the iCite net's data structures, and I've been developing WTF with this in mind.


It seems useful to note here that there are a number of non-XML data formats being developed that I would recommend to anyone looking at alternatives to XML.

JSON (JavaScript Object Notation) is a data interchange format based on JavaScript notation. It's usable directly within many languages, and I first came across it via Michal Migurski's post on his own JSON PHP implementation.

I've mentioned YAML (YAML Ain't Markup Language) here before. One of the main developers is Brian Ingerson, whom I've been fortunate to meet since I moved to Seattle. YAML can be used simply, but it can also be used in very sophisticated ways.

Also, I recently read Lion Kimbro's latest spec for his Local Names format, which is currently not an XML format either, fyi.


Finally, I should note here, for those who don't know my background, that I've been using SGML / XML for more than ten years and I've more than once been described as one of those people who "likes angle brackets". And, I do.

But, my basic two-liner on why I'm not using XML here is that I like to use markup languages for situations (e.g., textual formatting) that essentially require markup. And, more and more, I'm inclined to suggest that data storage, interchange and querying don't generally require markup—markup adds a conceptual overhead around syntax and data structure that I find to be burdensome for just data.

