Desktop Search: Solving the Wrong Problem as Quickly as Possible

Derek has a post entitled Search is not Search where he alludes to conversations we had about my post Apples and Oranges: WinFS and Google Desktop Search. His post reminds me why I'm so disappointed that the benefits of adding structured metadata capabilities to file systems are being equated with desktop search tools that are a slightly better incarnation of the Unix grep command. Derek wrote

I was reminded of that conversation today, when catching up on a recent-ish publication from MIT's Haystack team: The Perfect Search Engine is Not Enough: A Study of Orienteering Behavior in Directed Search. One of the main points of the paper is that people tend not to use 'search' (think Google), even when they have enough information for search to likely be useful. Often they will instead go to a known location from which they believe they can find the information they are looking for.

For me the classic example is searching for music. While I tend to store my mp3s in a consistent directory structure such that the song's filename is the actual name of the song, I almost never use a generic directory search to find a song. I tend to think of songs as "song name: Conga Fury, band: Juno Reactor", or something like that, so when I'm looking for Conga Fury, I am more likely to walk the album subdirectories under my Juno Reactor directory, than I am to search for file "Conga Fury.mp3". The above paper talks a bit about why, and I think another key aspect that they don't mention is that search via navigation leverages our brain's innate fuzzy computation abilities. I may not remember how to spell "Conga Fury" or may think that it was "Conga Furvor", but by navigating to my solution, such inaccuracies are easily dealt with.

As Derek's example shows, comparing the scenarios enabled by a metadata based file system against those enabled by desktop search is like comparing navigating one's music library using iTunes versus using Google Desktop Search or the MSN Desktop Search to locate audio files.
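
To make the contrast concrete, here is a minimal sketch in Python. It assumes a tiny in-memory music library; the records, the album name, the file paths, and the helper names are all made up for illustration and do not come from any real tool. A grep-style filename match needs the title spelled correctly, whereas narrowing by artist metadata first leaves a short list in which a misremembered title is easy to resolve.

```python
import difflib

# Hypothetical library: each record carries the metadata a tag would hold.
LIBRARY = [
    {"artist": "Juno Reactor", "album": "Example Album", "title": "Conga Fury",
     "path": "music/Juno Reactor/Example Album/Conga Fury.mp3"},
    {"artist": "Juno Reactor", "album": "Example Album", "title": "Another Track",
     "path": "music/Juno Reactor/Example Album/Another Track.mp3"},
    {"artist": "Some Other Band", "album": "Other Album", "title": "Unrelated Song",
     "path": "music/Some Other Band/Other Album/Unrelated Song.mp3"},
]


def filename_search(library, query):
    """Grep-style search: literal substring match against the file path."""
    return [r["path"] for r in library if query.lower() in r["path"].lower()]


def browse_by_artist(library, artist):
    """Navigation: narrow the library by a metadata field."""
    return [r for r in library if r["artist"] == artist]


def find_title(library, artist, guess):
    """Navigate to the artist, then tolerate a misspelled title."""
    titles = [r["title"] for r in browse_by_artist(library, artist)]
    return difflib.get_close_matches(guess, titles, n=1, cutoff=0.6)


if __name__ == "__main__":
    print(filename_search(LIBRARY, "Conga Furvor"))             # literal match
    print(find_title(LIBRARY, "Juno Reactor", "Conga Furvor"))  # metadata + fuzzy
```

The fuzzy step here stands in for what a person does by eye when scanning a short list; a real media library would expose the same narrowing through its user interface rather than through a function call.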

Joe Wilcox (of Jupiter Research) seems to have reached a similar conclusion based on my reading of his post Yes, We're on the Road to Cairo where he wrote

WinFS could have anchored Microsoft's plans to unify search across the desktop, network and the Internet. Further delay creates opportunity for competitors like Google to deliver workable products. It's now obvious that rather than provide placeholder desktop search capabilities until Longhorn shipped, MSN will be Microsoft's major provider on the Windows desktop. That's assuming people really need the capability. Colleague Eric Peterson and I chatted about desktop search on Friday. Neither of us is convinced any of the current approaches hit the real consumer need. I see that as making more meaningful disparate bits of information and complex content types, like digital photos, music or videos.

WinFS promised to hit that need, particularly in Microsoft public demonstrations of Longhorn's capabilities. Now the onus and opportunity will fall on Apple, which plans to release metadata search capabilities with Mac OS 10.4 (a.k.a. "Tiger") in 2005. Right now, metadata holds the best promise of delivering more meaningful search and making sense of all the digital content piling up on consumers' and Websites' hard drives. But there are no standards around metadata. Now is the time for vendors to rally around a standard. No standard is a big problem. Take for example online music stores like iTunes, MSN Music or Napster, which all tag metadata slightly differently. Digital cameras capture some metadata about pictures, but not necessarily the same way. Then there are consumers using photo software to create their own custom metadata tags when they import photos.

I agree with his statements about where the real consumer need lies but disagree when he states that no standards around metadata exist. Music files have ID3 and digital images have EXIF. The problem isn't a lack of standards but a lack of support for these standards, which is a totally different beast.
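
As an illustration of how little it takes to consume one of these standards, here is a minimal sketch of an ID3v1 reader in Python. ID3v1 stores a fixed 128-byte block at the end of the MP3 file: the marker "TAG", then title, artist and album (30 bytes each), a 4-byte year, a 30-byte comment, and a 1-byte genre code. The function name and the sample values are assumptions for the example, and the sketch ignores the newer ID3v2 format, which is stored at the start of the file and is laid out differently.

```python
def read_id3v1(data: bytes):
    """Return the ID3v1 fields found in the last 128 bytes, or None."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None  # no ID3v1 block present

    def text(raw: bytes) -> str:
        # Fields are padded with NUL bytes (sometimes spaces).
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(tag[97:127]),
        "genre": tag[127],  # numeric genre code; 255 is commonly "unset"
    }


def build_id3v1(title, artist, album, year, comment=""):
    """Build a 128-byte ID3v1 block (hypothetical helper for the demo)."""
    def field(value: str, size: int) -> bytes:
        return value.encode("latin-1")[:size].ljust(size, b"\x00")

    return (b"TAG" + field(title, 30) + field(artist, 30) + field(album, 30)
            + field(year, 4) + field(comment, 30) + bytes([255]))


# Fake "audio" bytes followed by a tag; the album and year are invented.
SAMPLE = b"\x00" * 64 + build_id3v1("Conga Fury", "Juno Reactor",
                                    "Example Album", "2005")

if __name__ == "__main__":
    print(read_id3v1(SAMPLE))
```

For a real file, the same function can be applied to the bytes read from disk, for example `read_id3v1(open(path, "rb").read())`.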

I was gung ho about WinFS because it looked like Microsoft was going to deliver a platform that made it easy for developers to build applications that took advantage of the rich metadata inherent in user documents and digital media. Of course, this would require applications that created content (e.g. digital cameras) to actually generate such metadata which they don't today. I find it sad to read posts like Robert Scoble's Desktop Search Reviewers' Guide where he wrote

2) Know what it can and can't do. For instance, desktop search today isn't good at finding photos. Why? Because when you take a photo the only thing that the computer knows about that file is the name and some information that the camera puts into the file (like the date it was taken, the shutter speed, etc). And the file name is usually something like DSC0050.jpg so that really isn't going to help you search for it. Hint: put your photos into a folder with a name like "wedding photos" and then your desktop search can find your wedding photos.

What is so depressing about this post is that it costs very little for the digital camera or its associated software to tag JPEG files with comments like 'wedding photos' as part of the EXIF data which would then make them accessible to various applications including desktop search tools. 
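
A rough sketch of how cheap such tagging can be, in Python. Note the assumption: writing a real EXIF field means editing the TIFF structure inside the APP1 segment, which is more involved, so this example uses the simpler JPEG comment (COM) segment as a stand-in. The function names and the fake file bytes are made up for illustration.

```python
import struct
from typing import List


def add_jpeg_comment(data: bytes, comment: str) -> bytes:
    """Insert a COM segment after the leading APPn segments of a JPEG."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    payload = comment.encode("utf-8")
    if len(payload) > 65533:
        raise ValueError("comment too long for a single COM segment")
    # The 2-byte length counts itself plus the payload.
    segment = b"\xff\xfe" + struct.pack(">H", len(payload) + 2) + payload

    pos = 2  # skip SOI, then any APP0..APP15 segments (JFIF, EXIF, ...)
    while (pos + 4 <= len(data) and data[pos] == 0xFF
           and 0xE0 <= data[pos + 1] <= 0xEF):
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        pos += 2 + length
    return data[:pos] + segment + data[pos:]


def read_jpeg_comments(data: bytes) -> List[str]:
    """Collect COM segments that appear before the image scan data."""
    comments = []
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):  # start of scan / end of image
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == 0xFE:
            body = data[pos + 4:pos + 2 + length]
            comments.append(body.decode("utf-8", "replace"))
        pos += 2 + length
    return comments


# Minimal fake JPEG for the demo: SOI, a 16-byte APP0 segment, EOI.
FAKE_JPEG = (b"\xff\xd8" + b"\xff\xe0" + struct.pack(">H", 16)
             + b"JFIF\x00" + b"\x00" * 9 + b"\xff\xd9")

if __name__ == "__main__":
    tagged = add_jpeg_comment(FAKE_JPEG, "wedding photos")
    print(read_jpeg_comments(tagged))
```

Camera upload software could apply something like this to every file in a batch as it copies them off the device, asking the user for the label once.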

Perhaps the solution isn't expending resources to build a metadata platform that will be ignored by applications that create content today but instead giving these applications an incentive to generate this metadata. For example, once I bought an iPod I became very careful to ensure that the ID3 information on the MP3s I'd load on it was accurate, since I had a poor user experience otherwise.

I wonder what the iPod for digital photography is going to be. Maybe Microsoft should be investing in building such applications instead of boiling the oceans with efforts like WinFS which aim to ship everything including the kitchen sink in version 1.0.

Comments

  • Anonymous
    January 16, 2005
    Yes yes yes. You are spot on with your comments. I use google desktop search because it does do something useful. It combines the search of the internet and the desktop. It is not perfect, like it should search my pictures using the metadata or my mp3s using metadata... but it does work. Joe is wrong and in fact so wrong it is sad. Google meets my need perfectly and a simple download makes it worth even more. If the guys at MS believe that idiot... they are in trouble.

    Searching the computer is a hard thing, because search means a lot of different things to a lot of different people. Tags at the file level will help but that is not really the final fix. I am not sure what is though. If google will use EXIF tags on jpegs... that will be a start.
  • Anonymous
    January 16, 2005
    I agree completely. I've installed pretty well every desktop search app out there, mainly just out of curiosity, but the main problem is that I just don't use them. None of them actually integrate nicely into the Windows UI, and they only solve the search side of things, not the additional problem of actually adding metadata to files. I wrote a little more about this here:

    http://blog.jeffperrin.com/testosteles/posts/209.aspx
  • Anonymous
    January 16, 2005
    "What is so depressing about this post is that it costs very little for the digital camera or its associated software to tag JPEG files with comments like 'wedding photos' as part of the EXIF data which would then make them accessible to various applications including desktop search tools."

    But how? Will there be a tiny keyboard attached to the camera or something? How can you attach a cheap and reliable input device to something as compact as regular digital camera?
  • Anonymous
    January 16, 2005
    Microsoft is investing in one WinFS metadata format to rule them all---ID3, EXIF, legacy Office document properties are all meant to be forgotten---just like XHTML is meant to be overshadowed by XAML and/or WordProcessingML.

    In the meantime I figured out how to load the W3C XHTML schema into Word 2003 and use it as an alternative to WordProcessingML:

    http://songhaysystem.com/document.php?cmd=getDoc&get=24
  • Anonymous
    January 16, 2005
    Hrm. I'm sure you could easily extend MSN Desktop Search by using an IFilter to search EXIF and ID3 metadata.

    I just looked at the Channel9 Wiki for MSN Desktop Search, and I see an MP3 (ID3) IFilter listed, as well as a JPEG (EXIF) IFilter.

    IFilters can be built for any metadata type.

    Maybe these IFilters need to come bundled with the search? Or perhaps better marketing of it?
  • Anonymous
    January 17, 2005
    Alex,
    Digital cameras have software for managing uploading photos from the camera to your computer. This is where the metadata editing capabilities should exist.

    William,
    To me the problem is that the metadata isn't there, not that desktop search tools don't search ID3 or EXIF metadata.
  • Anonymous
    January 17, 2005
    Dare,

    So essentially we're talking about post-processing pictures on the PC, which is not a whole lot different from some tools available today, like Picasa, or simply naming your target folder on the PC appropriately.

    If there were a way to build in some input device, or perhaps a speech-to-text recognition system, so that the user could comment on the picture while taking it, and that would go into the EXIF tags, that would be a step forward. If you're just interrupting the user during the upload process by asking them to name the folder, it's just something they will get annoyed at.

    I think.
  • Anonymous
    January 17, 2005
    Alex,
    If the user feels the expected payoff is enough then they won't view annotating their pictures as an interruption. For example, people who use http://www.flickr.com don't consider the fact that they have to tag their pictures an interruption of the photo sharing experience.