Extracting Metadata from Windows Media files

In this article, I'll describe how to use the Windows Media Format SDK to access the metadata embedded in Windows Media files for cataloguing purposes. Also included are two managed classes written in C# that vastly simplify the use of this SDK.

Download MediaCatalog 1.0 (35KB)

Introduction

Over the last year, I've been gradually filling a spare hard disk with rips of all my CDs. It's fantastic to be able to play any CD from my catalogue so easily, and it means I can hide the CDs themselves away from my young daughter's sticky fingers! The problem is that as my digital collection has accumulated, it's getting harder to see what I've got. I've painstakingly tagged all my CDs with metadata, but Windows doesn't currently provide any easy mechanism to sort or manipulate that metadata. So I thought I'd follow Duncan Mackenzie's example and hack together a media cataloguing application.

The trouble is that it's difficult to extract the metadata from a Windows Media file. The Windows Media Player SDK provides a nice interop library you can use to embed Windows Media Player in your managed application and drive it programmatically, but I definitely wanted to avoid driving GUI controls, given the number of files I want to catalogue. Instead, I fired up MSDN Library and discovered the Windows Media Format SDK, a low-level API into the file format itself.

This SDK isn't easy to program against from managed code, however - it's pretty grungy COM interop. Fortunately, with the aid of MSDN, Adam Nathan's .NET and COM book and a quick look at some pretty dodgy samples, I was able to build a fairly clean managed wrapper that provides a straightforward interface into the SDK. Ironically, I haven't finished writing the graphical front-end catalogue application that generated the itch in the first place, but I thought the managed library was interesting enough in its own right to share.

Using MediaCatalog

I've divided the managed library into a low-level API and a high-level API. The low-level API is a class that allows you to open a media file, examine its attributes by index or name, and enumerate through them using a foreach loop. The high-level API builds on that class and provides methods for recursive or non-recursive iteration through a directory structure, creating a strongly-typed DataSet object that contains the most common attributes of the audio files it finds. You could bind the output to a Windows Forms DataGrid, for example, and indeed the sample test harness included with the code does exactly that.

Low-Level API

To access the low-level API, you instantiate an object of type MetadataEditor, passing the constructor the filename of the media file you're interested in. You can then either enumerate through the object using a foreach statement, or query it by name or using an indexer. The object supports an int-based indexer or alternatively an enum-based indexer that simplifies access using common attributes. The following C# code sample demonstrates each of these choices.

    using (MetadataEditor md = new MetadataEditor("britney.wma"))
    {
        // Enumerate through each of the attributes in the file
        foreach (Attribute attr in md)
        {
            Console.Write(attr.Name);
            Console.Write(": ");
            Console.WriteLine(attr.Value.ToString());
        }

        // Set author to the lead performer of the media file (the ID3 TPE1 frame)
        string author = md.GetAttributeByName("ID3/TPE1");
        // Set d to the duration of the media file (e.g. 3m 45s)
        TimeSpan d = md[MediaMetadata.Duration];
    }

Remember that since this object uses unmanaged resources, it's important to call the MetadataEditor.Dispose() method when you've finished using it in order to release the underlying resources. Alternatively, wrap it inside a using statement as demonstrated above.
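To make the point about Dispose() concrete, here is a minimal sketch of what a using statement does on your behalf. FakeEditor is a hypothetical stand-in invented for this illustration, not the MediaCatalog MetadataEditor; it only records whether it was disposed, so the example runs without the SDK installed.

```csharp
using System;

// Hypothetical stand-in for a class that wraps an unmanaged resource.
// It is NOT the MediaCatalog MetadataEditor; it only tracks its own state.
public sealed class FakeEditor : IDisposable
{
    public bool IsDisposed { get; private set; }

    public string Read()
    {
        if (IsDisposed) throw new ObjectDisposedException(nameof(FakeEditor));
        return "some value";
    }

    public void Dispose()
    {
        // A real wrapper would release its unmanaged resource here.
        IsDisposed = true;
    }
}

public static class DisposeDemo
{
    // The using statement calls Dispose() when the block exits,
    // whether it exits normally or through an exception.
    public static FakeEditor WithUsing()
    {
        FakeEditor editor = new FakeEditor();
        using (editor)
        {
            editor.Read();
        }
        return editor; // returned only so the caller can inspect IsDisposed
    }

    // The equivalent long-hand form: try/finally with an explicit Dispose().
    public static FakeEditor WithTryFinally()
    {
        FakeEditor editor = new FakeEditor();
        try
        {
            editor.Read();
        }
        finally
        {
            editor.Dispose();
        }
        return editor;
    }
}
```

Both methods leave the object disposed; the using form is simply harder to get wrong, which is why it is the safer default when opening many files in a loop.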

High-Level API

This API contains three main methods that can be used to extract album information across multiple directories if necessary:

Method | Description
RetrieveTrackInfo | Retrieves structured property information for the given media file. Returns a TrackInfo object containing commonly-used fields.
RetrieveSingleDirectoryInfo | Retrieves media information for a single directory. Returns a MediaData object (a strongly-typed DataSet).
RetrieveRecursiveDirectoryInfo | Recursively trawls through a directory structure for media files, using them to build a DataSet of media metadata. Returns a MediaData object.

As a quick example, the following C# code snippet binds a Windows Forms DataGrid to the output of RetrieveRecursiveDirectoryInfo:

    MediaDataManager mdm = new MediaDataManager();
    musicData = mdm.RetrieveRecursiveDirectoryInfo(@"\\timserver\music");
    mediaInfo.DataSource = musicData;
    mediaInfo.DataMember = "Track";
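For readers curious about the general shape of a recursive trawl, the following is a rough sketch that uses only System.IO and System.Data. It is not how MediaDataManager is implemented: the class name DirectoryWalkSketch, the CollectFiles method, and the two column names are assumptions made up for this example, and it records file names only rather than reading any metadata.

```csharp
using System;
using System.Data;
using System.IO;

// Hypothetical helper, not part of MediaCatalog.
public static class DirectoryWalkSketch
{
    // Builds a table with one row per *.wma file found under root.
    // When recursive is false, only the top-level directory is searched.
    public static DataTable CollectFiles(string root, bool recursive)
    {
        DataTable table = new DataTable("Track");
        table.Columns.Add("FileName", typeof(string));
        table.Columns.Add("Directory", typeof(string));

        SearchOption option = recursive
            ? SearchOption.AllDirectories
            : SearchOption.TopDirectoryOnly;

        foreach (string path in Directory.GetFiles(root, "*.wma", option))
        {
            // A real cataloguer would open the file here and read its attributes.
            table.Rows.Add(Path.GetFileName(path), Path.GetDirectoryName(path));
        }

        return table;
    }
}
```

The resulting DataTable could be bound to a grid in the same way as the DataSet above, with each row standing in for a track.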

The MediaDataManager object also exposes an event that can be used to track progress (particularly useful during a long recursive directory search). Use the following syntax to enable it:

    mdm.TrackAdded += new MediaDataManager.TrackAddedEventHandler(mdm_TrackAdded);
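The subscription line above follows the standard .NET event pattern with a delegate type nested inside the class that raises the event. The sketch below shows that pattern end to end. ScannerSketch, TrackAddedEventArgs, and the Title property are hypothetical names chosen for illustration; the actual handler signature in MediaDataManager may well differ, so check the downloaded source before writing your own mdm_TrackAdded method.

```csharp
using System;

// Hypothetical event data; the real library's event arguments may differ.
public class TrackAddedEventArgs : EventArgs
{
    public TrackAddedEventArgs(string title) { Title = title; }
    public string Title { get; private set; }
}

// Hypothetical scanner that raises an event per item. Not MediaDataManager.
public class ScannerSketch
{
    // Delegate type nested in the class, mirroring the
    // MediaDataManager.TrackAddedEventHandler naming used above.
    public delegate void TrackAddedEventHandler(object sender, TrackAddedEventArgs e);

    public event TrackAddedEventHandler TrackAdded;

    public void Scan(string[] titles)
    {
        foreach (string title in titles)
        {
            // Copy to a local so an unsubscribe on another thread
            // cannot null the event between the check and the call.
            TrackAddedEventHandler handler = TrackAdded;
            if (handler != null)
            {
                handler(this, new TrackAddedEventArgs(title));
            }
        }
    }
}
```

A subscriber would attach with `scanner.TrackAdded += new ScannerSketch.TrackAddedEventHandler(OnTrackAdded);` and update a progress indicator inside OnTrackAdded.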

Things To Do

The wrappers aren't complete by any means, and I'd love to hear your suggestions of how they might be improved (or even some code!). Several things on my own personal list:

  • Improve the intuitiveness of some of the class names
  • Add setters for the attributes to allow metadata to be modified
  • Add greater flexibility to the recursive searches to allow them to execute on a background thread
  • Write a decent cataloguing engine that takes advantage of the tools!

mediacatalog10.zip

Comments

  • Anonymous
    March 30, 2004
    LSN WebLog » getting meta data from WMA files

  • Anonymous
    June 26, 2004
    Hello.
    I have a very large list of links to WMV streams. However, the files are very large and there's a great number of them, so I need a way to extract meta information from them without a full download, much as Media Player gets the meta information from the first few K when it loads a stream.
    Do you have any idea how to implement this?
    The key assumption here is that we don't have full files. They are not downloaded.

  • Anonymous
    August 29, 2005
    Great article but the link to the Windows Media Format SDK doesn't work. I arrived here b/c I'm looking to download that SDK b/c according to Microsoft it includes a sample file for accessing and editing WM file metadata. However, I can't find the SDK to download. Has it been renamed and bundled with the Windows Media 9 Series SDK?

  • Anonymous
    September 01, 2005
    The WMFSDK can be found at http://msdn.microsoft.com/windowsmedia/downloads/default.aspx

  • Anonymous
    November 14, 2005
    Hi, your code was quite interesting, but I didn't manage to retrieve the WM/VideoHeight and WM/VideoWidth attributes from various wmv files; the calls always failed with ASF_E_NOTFOUND. Apparently the video resolution isn't stored as an attribute? Any comment on this issue would be great.

  • Anonymous
    February 17, 2006
    Hmm.... I'm still getting errors when reading a second media file. I added the CloseStream function and I'm calling it before the end of my using block, as well. :-(

  • Anonymous
    March 06, 2006
    hi,
    a great article.
    i would appreciate if you could help me in getting attributes of other files like .mpeg, .mov, *.rm.
    Any pointers to these would be very helpful.

    Regards,
    Sama

  • Anonymous
    March 24, 2006
    "getting attributes of other files like .mpeg, .mov, *.rm" can be done with the DirectX SDK, which is very simple and much easier to use than the Windows Media Format SDK.

    "to retrieve WM/VideoHeight and WM/VideoWidth Attributes from various wmv files" can be done by accessing the video WMVIDEOINFOHEADER, which gives you everything you want from a wmv. That's in the WMF SDK as well.

  • Anonymous
    May 24, 2006
    Very good article

  • Anonymous
    June 13, 2006
    Interesting. I never knew I could do this.

    Is there any complete software for this?

  • Anonymous
    June 20, 2006
    There's a bug in the COM wrapper vtable definition for IWMMetadataEditor2. You have Flush and Close in the wrong order and you left off OpenEx as well. Here is the correct code (sorry if the formatting doesn't look good):

    public interface IWMMetadataEditor2
    {
        // HRESULT Open(const WCHAR* pwszFilename);
        void Open([In, MarshalAs(UnmanagedType.LPWStr)] string pwszFilename);

        // HRESULT Close();
        void Close();

        // HRESULT Flush();
        void Flush();

        // HRESULT OpenEx(const WCHAR* pwszFilename, DWORD dwDesiredAccess, DWORD dwShareMode);
        void OpenEx([In, MarshalAs(UnmanagedType.LPWStr)] string pwszFilename, [In] uint dwDesiredAccess, [In] uint dwShareMode);
    }

  • Anonymous
    September 19, 2006
    Hi
    I need something that will help me get the lyrics of a song as I play it, so that I can do karaoke to any song. Is there any metadata that could be used for this?


  • Anonymous
    October 30, 2006
    Hi, I have a wmv file. Now I want to add application-specific data to the wmv file. Is this possible and allowed? How can I achieve this? Regards, Hemant. hemant_kulkarni@persistent.co.in

  • Anonymous
    November 06, 2006
    The following code generates an error when the Main window is closed. What i do is simply click on the Attributes windows (after having changed all the paths so that the code points to a valid file). I'm using Vista with VS 2005 (SP1 Beta). Any ideas as to why? The exception I get is:

    System.Runtime.InteropServices.InvalidComObjectException was unhandled
      Message="COM object that has been separated from its underlying RCW cannot be used."
      Source="Microsoft.Samples.MediaCatalog"
      StackTrace:
        at Microsoft.Samples.MediaCatalog.MetadataEditor.Dispose(Boolean disposing)
        at Microsoft.Samples.MediaCatalog.MetadataEditor.Finalize()

    protected virtual void Dispose(bool disposing)
    {
        if (!isDisposed)
        {
            if (disposing)
            {
                // No managed resources to clear up...
            }
            // Clear up unmanaged resources
            ((IWMMetadataEditor2)header).Close();
        }
        isDisposed = true;
    }

  • Anonymous
    December 06, 2006
    Is there any way to retrieve a thumbnail that represents a .wmv file? I am looking for something like what will be displayed in explorer if you are viewing files as "thumbnails".

  • Anonymous
    December 12, 2006
    Could you please provide an example of how to embed an image in wma files? I've been trying to find out for ages, to no avail.