How should the relationships between tags be defined, and by whom?

IT Conversations has an audio recording of 'Ontology is Overrated', the entertaining and thought-provoking presentation that Clay Shirky gave at the O'Reilly Emerging Technology Conference in March this year (update: text version here). Chris May has a useful summary and highlights one of Shirky's two key points from the talk:

"Hierarchical ontologies are fundamentally not suitable for non-physical information because they're predicated on an object being in one place at one time - which isn't true"

Mark Taylor develops the other:

"Shirky believes that, ultimately, the choice between structured or unstructured metadata boils down to a philosophical question: does the world make sense, or do we make sense of the world? I prefer to believe that the question is: is everyone equally capable of making sense of the world, or are some better qualified to do it than others?"

Continuing Mark's line of questioning, I'd like to point out how Technorati and Del.icio.us have enabled anyone to make sense of their world through tags, and have added a further layer of value by providing an interface into the relationships between tags: related tags.

But I have my own related question.

In Technorati's implementation of related tags, I can start by searching for posts tagged with 'programming' and then browse through the resulting post titles and excerpts. The results page also provides a list of related tags: technology; Computer; software; Computers; Web; PHP; .Net; Ruby; Java. When I click the 'software' related tag, the results page returns posts tagged with 'software', but the related tags listed there don't include 'programming'. So, as a post by 'A Consuming Experience' puts it:

"if A's related tags are said to be B and C, B's related tags aren't necessarily A and C."
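One way this kind of asymmetry can arise is sketched below. This is purely an illustration under my own assumptions, not a description of how Technorati actually computes related tags: the sample posts are made up, and `cooccurrence` and `related_tags` are hypothetical helpers. The idea is that if "related" means "the top few tags that most often appear alongside this one", then a popular tag can crowd a smaller tag out of its own list while still appearing in the smaller tag's list.

```python
from collections import Counter
from itertools import permutations

# Made-up sample data: each post is the set of tags attached to it.
POSTS = [
    {"programming", "software"},
    {"programming", "software"},
    {"programming", "php"},
    {"software", "technology"},
    {"software", "technology"},
    {"software", "technology"},
    {"software", "web"},
    {"software", "web"},
    {"software", "web"},
]


def cooccurrence(posts):
    """Count, for every tag, how often each other tag shares a post with it."""
    counts = {}
    for tags in posts:
        for a, b in permutations(tags, 2):
            counts.setdefault(a, Counter())[b] += 1
    return counts


def related_tags(counts, tag, limit):
    """Return the `limit` tags that co-occur most often with `tag`.

    Ties are broken alphabetically so the result is deterministic.
    """
    neighbours = counts.get(tag, Counter())
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:limit]]


COUNTS = cooccurrence(POSTS)
from_programming = related_tags(COUNTS, "programming", 2)
from_software = related_tags(COUNTS, "software", 2)

print("programming ->", from_programming)
print("software    ->", from_software)
```

With this sample data, 'software' shows up in the list for 'programming', but 'programming' is pushed out of the list for 'software' by tags that co-occur with it more often. The cutoff alone is enough to make the relationship one-way.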

The following diagram shows the relationship between the various tags on Technorati. Most are unidirectional:

In Del.icio.us the relationship between the same tags, 'software' and 'programming', is bi-directional, and each has relationships to more tags:

In both examples, the relationships between tags are absolute: either tags are related (in one direction or the other) or they are not. There is no notion of one tag being more or less related to another, something which could change according to context.
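To show what a graded notion of relatedness might look like, here is a minimal sketch that scores a pair of tags between 0 and 1 using Jaccard similarity: the number of posts carrying both tags divided by the number carrying either. The choice of Jaccard, the sample posts and the `relatedness` function are all my own assumptions for illustration; neither Technorati nor Del.icio.us is known to work this way.

```python
# Made-up sample data: each post is the set of tags attached to it.
SAMPLE_POSTS = [
    {"programming", "software"},
    {"programming", "software"},
    {"programming", "php"},
    {"software", "technology"},
    {"software", "technology"},
    {"software", "technology"},
    {"software", "web"},
]


def relatedness(posts, tag_a, tag_b):
    """Jaccard similarity of two tags: shared posts / posts using either tag."""
    with_a = {i for i, tags in enumerate(posts) if tag_a in tags}
    with_b = {i for i, tags in enumerate(posts) if tag_b in tags}
    either = with_a | with_b
    if not either:
        return 0.0  # neither tag is used anywhere
    return len(with_a & with_b) / len(either)


for pair in [("software", "technology"),
             ("software", "programming"),
             ("programming", "web")]:
    print(pair, round(relatedness(SAMPLE_POSTS, *pair), 3))
```

A score like this is symmetric by construction, so direction would have to come from somewhere else (for example, a per-tag cutoff on how many related tags to show). Swapping in a different scoring function, or restricting the posts to one user or one time window, is one way the strength of a relationship could vary with context.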

Technorati and Del.icio.us tags are designed to help us 'make sense of the world' by letting us tag the world as we see it, without the limitations of a taxonomy. You can create any tag you like and attach it to any article. But as you can see, in the area of 'related tags' both Technorati and Del.icio.us are imposing rules that define a 'related tag' (i.e. the strengths and the directions of relationships between tags, whether one points to the other, etc.). I don't have any issues with their implementations - they are better than having no related tags at all.

The question I'm asking here is: how should the relationships between tags be defined, and by whom?

My instinct tells me that in the case of related tags, an approach that allows users to define the relationships will succeed (i.e. be used more) over one where a single entity is the sole arbiter. This is where the magic of WS/APIs could potentially come in. I'm willing to bet that if Technorati and Del.icio.us opened up APIs onto the necessary data so developers could create their own interpretations of what makes a tag related, we'd see some amazing stuff. These could manifest as tools, settings or more APIs that others (including Technorati and Del.icio.us) can connect to and tweak to their own liking. I'd also bet that whatever the winning algorithm (or combination of algorithms) might be, it would be beautifully simple in terms of its rules while allowing emergent smartness (usefulness) to shine through.

(Technorati Tags: technorati, del.icio.us, tags, related tags, ontology, folksonomy, tech, web)
