Taxonomy in a Digital World Part 2
Continuing the notes I have made on Everything is Miscellaneous...
Chapter 3 - The Geography of Knowledge
This chapter examines the Dewey Decimal system of classification. It shows how the system is skewed by the 19th-century American Christian views of its creator. The implication is that by trying to create classification systems for knowledge we will always end up with a system slanted by the cultural and political norms of the day, which limits its usefulness for finding information in the future.
Dewey's vision was that the floor plan of a library would be a map of ideas. He wanted to spatialize ideas.
Dewey's system has a limited number of top-level categories. Q. What happens when something new and important comes along? Can we fix the system for the 21st Century?
"The Dewey Decimal System can't be fixed because knowledge itself is unfixed. Knowledge is diverse, changing, imbued with the cultural values of the moment. The world is too diverse for any single classification system to work for everyone in every culture at every time."
[Does the same apply to those of us creating taxonomies for cataloguing information in our corporations around the world...? How fixed is the information people create in documents? Do documents constrain us, and will knowledge become less fixed as more Web 2.0 techniques are applied in the Enterprise?]
Weinberger contrasts this with how Amazon organises information.
- Collaborative filtering - based on other users' actions.
- Designed to introduce you quickly to relevant information you didn't know you wanted
- Customer reviews enable them to "sell more of what people like"
- Makes use of network effects - the usefulness of the system increases the more people use the system
- Looks for patterns in the text of books, pulling out statistically interesting phrases so that similar books can be grouped together.
- Personal organisation for each user, rather than Dewey's single universal system. (Based on my history, and enabling me to customise which purchases are used in generating this organisation.)
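To make the collaborative filtering idea concrete, here is a minimal sketch in Python. It is not Amazon's actual method, which the book does not spell out; it is a simple "people who bought this also bought" count. The purchase histories, user names, titles and function names are all made up for illustration.

```python
from collections import Counter
from itertools import permutations

# Hypothetical purchase histories (invented for this sketch).
purchases = {
    "ann": {"Title A", "Title B", "Title C"},
    "bob": {"Title A", "Title B"},
    "cat": {"Title A", "Title C", "Title D"},
}

def co_purchase_counts(histories):
    """Count how often each ordered pair of titles appears in the same history."""
    counts = {}
    for titles in histories.values():
        for a, b in permutations(titles, 2):
            counts.setdefault(a, Counter())[b] += 1
    return counts

def recommend(user, histories, limit=3):
    """Suggest titles bought alongside the user's titles, skipping ones already owned."""
    counts = co_purchase_counts(histories)
    owned = histories[user]
    scores = Counter()
    for title in owned:
        for other, n in counts.get(title, {}).items():
            if other not in owned:
                scores[other] += n
    return [title for title, _ in scores.most_common(limit)]

print(recommend("bob", purchases))
```

Every extra purchase history adds to the counts, which is the network effect in miniature: the suggestions get better the more people use the system, and each user gets a different list based on their own history.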
"The fundamental problem with Dewey's system is not that he was an eccentric or that his early education was provincial. The real problem is that any map of knowledge assumes that knowledge has a geography, that is a top-down view. That assumption makes sense in the 1st and 2nd orders of order. It unnecessarily inhibits the useful miscellaneousness of the 3rd."
Chapter 4 - Lumps and Splits
Weinberger introduces the basic concepts we use without thinking to categorize things. Lists - the most basic concept - have the inbuilt assumption that they are about something. Then trees and nesting: nesting, he says, is the fundamental technique of human understanding. He explains how Aristotle made the leap in human understanding to conceive tree structures.
But trees come with embedded assumptions:
- A well-constructed tree gives each thing a place. If too many items get shoved in the miscellaneous pile, the tree is not doing its job.
- Each thing gets only one place
- No one category should be too big or too small. [7 +/- 2 rule again? for branches]
- It should be obvious what the defining principle of each category is.
"...our knowledge of the world has assumed the shape of a tree because that knowledge has been shackled to the physical. Now that the digitizing of information is allowing us to go beyond the physical in ways Aristotle could not have dreamed the shape of knowledge is changing."
Lump and split are technical terms among indexers. A lumper takes things that seem disparate and combines them because they are similar. A splitter takes things that have been lumped together and separates them into smaller categories.
Trees without paper - in a digital world we want a tree that arranges itself to your way of thinking and then changes the next day when you need to view the world in a different way. That is, a faceted classification system that dynamically constructs a browsable branching tree that immediately meets your needs. This kind of system was first invented in the 1930s by an Indian librarian, S. R. Ranganathan. In such a system no facet has to be assumed to be the root.
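A rough sketch of what "a tree that arranges itself" could look like in code, written in Python. The catalogue, the facet names (subject, format, decade) and the build_tree function are all invented for illustration. The point is only that the same items can be nested under whichever facets you pick, in whichever order.

```python
# Hypothetical catalogue: each item carries a value for several facets.
items = [
    {"title": "Item 1", "subject": "birds", "format": "book", "decade": "1990s"},
    {"title": "Item 2", "subject": "birds", "format": "film", "decade": "2000s"},
    {"title": "Item 3", "subject": "maps", "format": "book", "decade": "2000s"},
]

def build_tree(records, facet_order):
    """Nest records by the given facets, in the given order; leaves are lists of titles."""
    if not facet_order:
        return sorted(r["title"] for r in records)
    first, rest = facet_order[0], facet_order[1:]
    groups = {}
    for r in records:
        groups.setdefault(r[first], []).append(r)
    return {value: build_tree(group, rest) for value, group in groups.items()}

# The same items, browsed two different ways; neither facet is "the" root.
by_subject = build_tree(items, ["subject", "format"])
by_format = build_tree(items, ["format", "decade"])
print(by_subject)
print(by_format)
```

Nothing is stored as a tree here. The tree is built on demand from the facet values, so tomorrow's view can start from a different facet without reclassifying anything.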
"In the third order of order, a leaf can hang on many branches, it can hang on different branches for different people and it can change branches for the same person if she decides to look at the subject differently...In the third order of order, knowledge doesn't have a [single] shape."
Taxonomy in a Digital World Part 3
Technorati tags: David Weinberger, Everything Is Miscellaneous, Taxonomy, Faceted Classification, Collaborative Filtering
Comments
Anonymous
June 16, 2007
Hi Mark; I'm one of those people who has struggled with many of the conceptual underpinnings of information systems for a long time. Parent::child, hierarchies, trees, and object:entity class relations - the whole shootin' match has caused me to doubt my intelligence and sanity almost since day one. I'm convinced that a large part of the disconnect between 'business' and IT is due to a sort of linguistic impedance which manifests itself in all sorts of strange ways. I agree that the primary difficulty with applications - from directory structures to SAP - is the relative rigidity of the concepts listed above when contrasted with the fluidity of reality. Thanks for describing the problem and some possible solutions in a much more elegant way than I can. By way of being part of the solution, I have developed what I hope will become a more natural, multi-dimensional knowledge navigation framework than is currently available. It uses faceted classification which on one hand makes use of predictable dimensions and relationships to help the lumpers, but supports combinations of existing values to keep the splitters happy as well. Discovery and retrieval are all based on very natural categories. Great site - I now have it bookmarked and will visit often! John O'
Anonymous
June 17, 2007
John - I won't take the credit here. I am just the summarizer. Thanks for the compliments anyway.
Anonymous
June 25, 2007
Amazon delivered me a copy of David Weinberger's Everything Is Miscellaneous at the weekend. I love the