Importing your XML into a spreadsheet

I'm going to try to get back on top of all the e-mails folks have been sending me. If you've e-mailed me over the past 6 months the odds are that I didn't reply yet, and I'm sorry about that. I'll start reading through the e-mails again and rather than just replying directly I'll actually try to post it online so that other folks can take advantage of it as well (I'll keep your name and e-mail confidential unless you specify otherwise). Since there is far too much to get through if I look back too far, I'm going to start with June and move forward. If you e-mailed me before that and didn't hear back, please submit it again (sorry).

The first question was one dealing with how to import custom XML data into Excel for editing (Excel allows you to import and export your own custom XML data):

I have an issue mapping an Excel spreadsheet to an XML schema. The problem is that I have to accomodate data for both English and French customers. So, I might have an XML schema such as:

<SoftwareTitles>
  <Entry>
    <Title>Microsoft Office 2007</Title> 
    <desc lang="EN">...</desc> 
    <desc lang="FR">...</desc> 
  </Entry>
  <Entry>
    <Title>Visual Studio .NET</Title>
    <desc lang="EN">...</desc>
    <desc lang="FR">...</desc> 
  </Entry>
</SoftwareTitles>

(where "desc" refers to the "description" of the software).

The plan is to have these "desc" tags correspond to two separate columns in my spreadsheet (description - English, and Description - French, respectively).

I guess my question for you is, is there an easy way to map this type of XML schema in Excel? Any help would be much appreciated.

Basically the goal is to get a table that looks like this:

Title

Description (EN)

Description (FR)

Microsoft Office 2007

...

...

Visual Studio .NET

...

...

While the schema used above is totally fine (and in fact it's probably the best way to represent this info), it doesn't work well with the XML mapping functionality. The mapping functionality tries to turn the data into a table, and it does this by looking at the element name. It treats each node as a map-able value, so attribute values are used more as data, and not as a way of classifying the date. So if you wanted to map this into a grid, the best approach would be to first transform it into a more "table like" structure.

The easiest way to get this into Excel would be to transform the data into something more like this:

<

SoftwareTitles>
<Entry>
<Title>Microsoft Office 2007</Title>
<descEN>...</descEN>
<descFR>...</descFR>
</Entry>
<Entry>
<Title>Visual Studio .NET</Title>
<descEN>...</descEN>
<descFR>...</descFR>
</Entry>
</SoftwareTitles>

It's a pretty simple XSLT to go back and forth between these two schemas, and this will then give you a really easy schema to map into the spreadsheet for editing. I think an XSLT like this would probably allow you to go either way:

<?

xml version="1.0" encoding="utf-8"?>

<

xsl:stylesheet version="1.0" xmlns:xsl="https://www.w3.org/1999/XSL/Transform">

<xsl:template match="*">
<xsl:copy>
<xsl:copy-of select="@*" />
<xsl:apply-templates />
</xsl:copy>
</xsl:template>

<xsl:template match="desc[@lang='EN']">
<descEN>
<xsl:apply-templates/>
</descEN>
</xsl:template>

<xsl:template match="desc[@lang='FR']">
<descFR>
<xsl:apply-templates/>
</descFR>
</xsl:template>

<xsl:template match="descEN">
<desc lang="EN">
<xsl:apply-templates/>
</desc>
</xsl:template>

<xsl:template match="descFR">
<desc lang="FR">
<xsl:apply-templates/>
</desc>
</xsl:template>

</

xsl:stylesheet>

-Brian

Comments

  • Anonymous
    July 17, 2007
    Dear Brian One problem I really have, as a small developer, is that the string values are kept separately from the spreadsheet itself. It takes a lot of developer  time and effort to try to reconcile worksheet1.xml with sharedstring.xml. Its ok for big software company to spend this effort, not for small fry like me. My question is: is there any cross platform way, may be XSLT that can reconcile SharedString.xml and the spreadsheet itself for me? Thanks, Cinly

  • Anonymous
    July 17, 2007
    Cinly, Are you more concerned with reading or writing the spreadsheet? If you are generating the spreadsheet, you can write the strings locally. If it's for consumption, then we'd need to give you an easy way to running some tool to put the strings inline. That's something that the SDK (http://blogs.msdn.com/brian_jones/archive/2007/06/04/open-xml-api-tech-preview.aspx) will hopefully add at some point, but let me ask around and see if anyone in the OpenXMLDeveloper community has already done so. -Brian

  • Anonymous
    July 17, 2007
    Cinly, Here is a fragment of SpreadsheetML for inline strings: <sheetData>  <row>    <c t="inlineStr">      <is>        <t>Paris</t>      </is>    </c>  </row> </sheetData>

  • Anonymous
    July 18, 2007
    Brian, Both. Mauricio (thanks) had given me a solution for writing. Reading is more important because it means we can use Excel to manipulate data that is more valuable. In particular, we can work around column/row headers in table (by ignoring them), but not true for file paths/string literal inside the table which is of interest. Thanks, Cinly