Differences in vocabulary

The workshop I attended last week at Harvard was a really great learning experience for me. I spend most of my time focused on technology, and less on the policies and practices that need to be put in place to best leverage those technologies. Over the two-day workshop I had the chance to listen to a number of technology leaders in the public sector. They have a big focus on interoperability to help make communication more efficient, and there was a big focus on archival as well.

One interesting thing I noticed right away was that we don't always use the same terminology. A couple of terms really stood out to me because of how often I hear folks talk about them in my current job:

Interoperability

For a number of folks in the public sector, interoperability primarily meant radio communication. For example, the police want to make sure everyone can quickly and efficiently communicate with each other, which is why communication standards are so important. When I mentioned interoperability in relation to document formats and semantics, there was often a little confusion at first because of this. The principles are the same, but I usually use the term more generally: interoperability is about creating an efficient way to share information.

Custom-defined schemas are the key to interoperability in terms of documents. Some of the folks at the session referred to this as "Web 3.0". I'm not sure it needs that grand a name, as it's a pretty basic concept. It just means that documents are no longer documents in the traditional sense, but instead collections of data. Another term people use for this basic concept is "micro formats." It doesn't really matter which term you use, though. What's important is to realize that in order to get truly quick, efficient sharing of data, you need the ability to structure that data within your documents.

If the office documents that your target users are creating go beyond simply specifying display information and also directly call out all the semantic information, this takes you to a completely different level of interoperability. When I think of interoperability, I think of documents interacting with systems and processes in ways no one is really doing right now.
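
To make the idea of a document as a collection of data a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the namespace URL, the element names (officer, date, incidentType), and the toy document layout are hypothetical and do not come from any real schema or from the Office formats themselves. It only shows the general pattern, where display markup and custom-schema data live side by side and a system pulls out just the data it understands.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for an industry-defined schema (not a real standard).
CASE_NS = "http://example.org/schemas/case-report"

# A toy document: <doc> and <p> carry display structure, while elements in
# the CASE_NS namespace mark up the semantic data embedded in the text.
DOCUMENT = f"""
<doc xmlns:c="{CASE_NS}">
  <p>Report filed by <c:officer>J. Smith</c:officer> on
     <c:date>2007-03-21</c:date>.</p>
  <p>Incident type: <c:incidentType>traffic</c:incidentType></p>
</doc>
"""


def extract_fields(xml_text: str, namespace: str) -> dict[str, str]:
    """Collect the text of every element that belongs to the given namespace."""
    root = ET.fromstring(xml_text.strip())
    prefix = "{" + namespace + "}"
    fields = {}
    for element in root.iter():
        # ElementTree expands namespaced tags to "{namespace}localname".
        if element.tag.startswith(prefix):
            name = element.tag[len(prefix):]
            fields[name] = (element.text or "").strip()
    return fields


if __name__ == "__main__":
    print(extract_fields(DOCUMENT, CASE_NS))
```

The point of the sketch is that the receiving system never has to parse the prose or the layout. It only needs to know the schema, which is defined independently of whoever produced the file format.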

Open Source

This was a really big eye-opener for me. Many people were talking about open source specifically as a content sharing model. In many folks' minds, Wikipedia and Open Source could be thought of as analogous. I think I've adopted a view of open source that is much too narrow. There were folks from the defense department, for example, who said they wanted to set up an open source model for sharing intelligence information within their organization. I had previously thought about open source more in terms of the licensing model chosen. Obviously the folks from the defense department weren't thinking of putting all the content under the GPL. Instead, they wanted a system where people could easily share information within their targeted community. This is something I believe strongly in. Easy collaboration and sharing of content was one of the big scenarios we were going after with the XML formats in Office. If the server technology is able to interpret the document content, you can build some powerful solutions.

Other Observations

There are a bunch of other things I wanted to write down about the conference, and hopefully I'll get to them in the next couple of days. There were some IBM folks there as well, including Bob Sutor. Bob led a discussion around a case study where folks realized they had to bring more emotion back to certain IT systems that had become too rigid and form-like. The study focused on child welfare case workers who weren't being encouraged to really think about each case, and were forced to use a system that basically just consisted of a series of checkboxes to go through. This moved folks away from focusing on the true need, which is what's in the child's best interest. So they made the move to change up the system and to standardize around a separate color scheme for all sites relating to children. At first there was pushback because folks would say: "We can't do that, we have standards, and everything has to be blue." Eventually, though, they were able to break away from the rigid system and build something smarter and more targeted. It was an interesting talk.

I was also really interested in the document archival case studies. I worked with the British Library and the Library of Congress on the OpenXML standardization last year, so I've already had a lot of exposure to this issue. It's a very important issue that governments are now dealing with. You have content coming in all sorts of formats (documents, video, audio, e-mail, pictures, etc.), and it's important to maintain them all for the public good. The case study that was discussed at the workshop last week was with the State of Washington digital archives. I'll write up a separate post on that as I believe it's a really important topic.

-Brian

Comments

  • Anonymous
    March 29, 2007
    Hey Brian, With regard to Open Source, I think your notions are (were) correct. The term is evolving. The first time I heard this new definition of Open Source was on NPR's Radio Open Source (http://www.radioopensource.org/). At first, I thought they were co-opting the term "Open Source", but it's clear that the term has taken on a broader meaning. The best synonym I have for it now is "transparent".

  • Anonymous
    March 29, 2007
    Dennis, I should be clear that when I say custom content, I'm speaking from the Office productivity suite point of view. That custom content may be a schema that is defined by an industry group and applies to millions of people. The reason we call it custom content, though, is that those schemas shouldn't be influenced by the producers of the file formats. They are at a higher level. Does that change your view? -Brian

  • Anonymous
    March 29, 2007
    Brian wrote "The case study that was discussed at the workshop last week was with the State of Washington digital archives." Ok, I am going to go google the heck out of that, but I am very interested to learn more about the discussion of the Washington State case and how it applies to archival and interoperability. I work for a county gov. office in Washington and we have been talking about data and document management. Thanks for the tip.

  • Anonymous
    April 02, 2007
    Keith, unless there's something new, the case study Brian may be talking about is located at the following link:
    Case Study State of Washington Digital Archives Project http://www.microsoft.com/casestudies/casestudy.aspx?casestudyid=49160
    and more info:
    Digital Archives Background and History http://www.digitalarchives.wa.gov/Content.aspx?txt=background
    Q&A: Washington State Introduces Digital Archive Solution, First of its Kind and Based on Microsoft Technology http://www.microsoft.com/presspass/features/2004/oct04/10-04DigitalArchives.mspx

  • Anonymous
    April 03, 2007
    Keith and n4cer: I think it is better to start at the top to see how the Digital Archives are used and how they are shaping up: http://www.digitalarchives.wa.gov/default.aspx There's also this, also under the Secretary of State: http://www.digitalarchives.wa.gov/Content.aspx?txt=intro It is interesting how bit rot is characterized, and the common litany about how documents become orphaned by the disappearance of the software that supports their electronic formats. There is an older inter-governmental initiative that has been going on in WA and I would hope that all county folk are aware of it, since there is rule making around accessibility and deposit with the state, as I recall.  Oh, I'm thinking of GILS and WAGILS: http://orcmid.com/blog/2002_10_27_lair-chive.asp#83794140