news and thoughts on and around the development of the iCite net
by Jay Fienberg

De Gan Filters, Bloom Filters, Peoples DNS, and CNS

posted: Apr 27, 2004 8:24:00 PM

Danny Ayers has an interesting post, De Gan Filters (see also some more background from Marc Canter), about Joel De Gan's De Gan Filter, which extends the concept of Bloom Filters. These are hash-based structures for highly compressed indexes: a lookup can, with small probability, report an item that was never added (a false positive), but it never misses an item that was added.
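For anyone who hasn't run into Bloom filters before, here is a minimal sketch in Python of the plain Bloom filter idea (not Joel's De Gan Filter, whose details aren't covered here). The class name, the default sizes, and the choice of salted SHA-256 as the hash family are all my own assumptions for illustration.

```python
import hashlib


class BloomFilter:
    """A minimal Bloom filter: a fixed-size bit array plus k hash functions."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions by salting one SHA-256 hash with an index
        # (an assumed hash family; any set of independent hashes would do).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True only means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Hypothetical sample data: index a few names in 128 bytes.
people = BloomFilter()
for name in ["joel", "danny", "marc"]:
    people.add(name)
```

Every name that was added is always reported as possibly present. A name that was never added is usually rejected, but can occasionally slip through when its bit positions happen to collide with bits set by other names. That is the false-positive-only error margin.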

Joel is working on a PeoplesDNS project, and came up with this technique as a way to create an efficient index for people look-ups.

Danny's post has a P.S. which I almost totally agree with:

I [sic] just occurred to me that the big DB backing store needn't actually be local - it could just be the web itself. Shift everything down a step, so your local cache is an unreliable map of a whole section of the web (e.g. the FOAFworld). When you do a query, that map tells you if you've got a chance of the getting the result. If there is a chance, you rescutter the actual data of interest.
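As a rough sketch of the lookup flow Danny describes, assuming the local map is any structure that can answer "possibly there" or "definitely not" (a Bloom filter would be one choice), and assuming a `fetch` function that stands in for re-fetching the real data from the web. The function names, URLs, and sample data here are hypothetical.

```python
from typing import Callable, Optional


def lookup(
    key: str,
    might_contain: Callable[[str], bool],
    fetch: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Consult the local, unreliable map first; go to the web only on a "maybe"."""
    if not might_contain(key):
        return None  # Definitely not out there: skip the network round trip.
    # Possibly out there: re-fetch the real data. A false positive in the
    # local map simply shows up here as a fetch that finds nothing.
    return fetch(key)


# Hypothetical stand-ins: a tiny "web" and a local map with one false positive.
web = {"http://example.org/danny/foaf": "Danny's FOAF data"}
local_map = set(web) | {"http://example.org/stale/foaf"}
fetched = []  # records which URLs actually triggered a fetch


def fetch_from_web(url: str) -> Optional[str]:
    fetched.append(url)
    return web.get(url)
```

Calling `lookup(url, local_map.__contains__, fetch_from_web)` only touches the "web" for the two URLs the local map knows about. An unknown URL is answered locally, without any fetch.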

Part of the beauty of DNS is how it actually does both (centralized and decentralized / local and remote lookup databases). I think the root servers and all the distributed DNS servers together form quite a model ecosystem for distributed information. DNS servers can act peer-to-peer, but they also have recourse to higher-up / root servers.
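Here is a deliberately simplified sketch of that "answer locally, otherwise ask higher up" pattern. It is not how real DNS resolution works in detail (real root servers delegate to other servers rather than holding every record), and the `Resolver` class, the names, and the addresses are assumptions made up for illustration.

```python
from typing import Dict, Optional


class Resolver:
    """A name server that answers locally when it can and otherwise asks upstream."""

    def __init__(
        self,
        records: Optional[Dict[str, str]] = None,
        upstream: Optional["Resolver"] = None,
    ) -> None:
        self.records = dict(records or {})  # data this server holds itself
        self.cache: Dict[str, str] = {}     # answers learned from upstream
        self.upstream = upstream            # higher-up server; None at the root

    def resolve(self, name: str) -> Optional[str]:
        if name in self.records:
            return self.records[name]
        if name in self.cache:
            return self.cache[name]
        if self.upstream is None:
            return None  # nobody higher up to ask
        answer = self.upstream.resolve(name)
        if answer is not None:
            self.cache[name] = answer  # keep a local copy for next time
        return answer


# Hypothetical two-level setup: one root and one local server below it.
root = Resolver({"example.org": "192.0.2.1"})
local = Resolver({"intranet.example": "192.0.2.53"}, upstream=root)
```

Here `local.resolve("intranet.example")` is answered entirely locally, while `local.resolve("example.org")` goes up to the root once and is served from the local cache afterwards.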

I have tried to follow this model in my CNS development for the iCite net. There are trade-offs in having any of it centralized, but I think the positives (essentially third-party verification) outweigh the negatives (the possibility of exploitative behavior by a domain name registry).

Also though, I think it is important that DNS-type registries (like CNS) and search be kept autonomous. In other words, DNS registers the location of servers, CNS registers the location of "content", but a search service for looking up information on those servers or in that content functions in parallel to (and, actually, a layer above) the function of the DNS or CNS registries.

(I guess the main distinction is that the index and the query interface to the index can be operated as two autonomous systems.)
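To make that separation concrete, here is a small sketch in which the registry and the search service are two independent objects, and search only talks to the registry through its lookup call. This is not CNS's actual interface; the class names, methods, and sample data are hypothetical.

```python
from typing import Dict, List, Optional


class Registry:
    """Maps a registered name to a location, and nothing more (the DNS / CNS role)."""

    def __init__(self) -> None:
        self._locations: Dict[str, str] = {}

    def register(self, name: str, location: str) -> None:
        self._locations[name] = location

    def locate(self, name: str) -> Optional[str]:
        return self._locations.get(name)


class SearchService:
    """A separate system, one layer above the registry, that indexes content by keyword."""

    def __init__(self, registry: Registry) -> None:
        self._registry = registry
        self._index: Dict[str, List[str]] = {}

    def index(self, name: str, text: str) -> None:
        for word in set(text.lower().split()):
            self._index.setdefault(word, []).append(name)

    def search(self, word: str) -> List[str]:
        # Search finds names, then asks the registry where each one lives.
        names = self._index.get(word.lower(), [])
        locations = (self._registry.locate(n) for n in names)
        return [loc for loc in locations if loc is not None]


# Hypothetical sample data.
registry = Registry()
registry.register("post-42", "http://example.org/posts/42")
search = SearchService(registry)
search.index("post-42", "Bloom filters and DNS")
```

The registry knows nothing about keywords, and the search service stores no locations of its own, so either one could be replaced or run by a different party without changing the other.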


Comments and Trackbacks

Comment by: Joel De Gan ·
posted: May 3, 2004 3:15:03 PM

Hey, I actually just updated a bit on the peoplesdns system to clarify some of the project goals.
"The pDNS system will be opensource, non-centralized and will be a series of independent servers operated by individuals and companies. will act as 'Node1' and distribute 'zone files' much the same way as the DNS system currently works."
This distributes load and allows us to integrate foaf between all "registrars" such as all the current social networking sites. This in effect acts as a bridge between all networks and really creates a system that displays the true nature of how people interact.
Anyway, good post.

Comment by: Jay Fienberg ·
posted: May 3, 2004 5:21:08 PM

Thanks Joel, and glad to see that you are following this distributed model. Sounds good!

I almost commented on your blog earlier today, in response to your post about adding a module for pDNS (DNS-like) information within FOAF files. My own approach has been very different and I'm generally skeptical about using files directly as structural containers for distributed data. But, then I decided: vive la difference!

If you are interested, you might search on "files" in this site's search engine and look at some of the posts I've made in the past about FOAF and other semantic information altogether being stuck in "files".

Comment by: Joel ·
posted: May 4, 2004 1:50:49 AM

Hey, as a note..
The zone files would only be downloaded by sites that are acting as pdns servers. Mostly mysql data in compressed form to save on bandwidth usage.
I am a little funky on "files" myself. I.e. I hate that the standard is "foaf.rdf"... what if I don't want to have a foaf.rdf file on my server? What if I want a foaf.php so it is dynamic? Or just "foaf" or... or..

Comment by: Morten Frederiksen ·
posted: May 4, 2004 4:54:20 AM

Re foaf.rdf vs. foaf.php: There is no "standard" for naming FOAF files, just as there is no standard for URIs in general - you can call your file what you want.

That said, quite a few people use that convention, because, well, it conveys some information on what to expect, just like most HTML pages use .html as the extension - one should not assume too much though...

Comment by: Jay Fienberg ·
posted: May 4, 2004 11:13:59 AM

Thanks for your further comments Joel, and your comment Morten.

Some more about my own comments:

In DNS, the zone files (locally) persist the data available through the DNS service, and DNS servers don't need to use files per se: they just have a (local) method for storing the data. This is why I would say that DNS is not designed around files. It is really a database / service protocol designed around transferring data.

So, the thing that I wonder about some RDF / FOAF usage is: often, you have data in a database, you turn it into RDF / FOAF and stick it in a file, and then someone writes an application that reads that file and extracts the data back into a database.

When I think about that kind of interaction, I think RDF is designed around transferring data. But, rather than being accessed by a database / service protocol, it is commonly limited to using files.

What I have been trying to do with the iCite net is build a database / service protocol that, among other things, is compatible with RDF data interchange. This is meant to more or less supplant the need to think in terms of files.

So, with FOAF, ideally, I would like to have my FOAF property data and manage / publish / share that: not manage / publish / share my FOAF file(s).

trackback from: the iCite net development blog
posted: May 3, 2004 7:52:49 PM
title: Simple suggestion for pDNS and FOAF (based on CNS)

Based on my work developing CNS for the iCite net, I wanted to suggest a way to think about how a PeoplesDNS (pDNS) based on FOAF might function
