Get worksheet information from an Open XML package

This topic shows how to use the classes in the Open XML SDK for Office to programmatically retrieve information from a worksheet in a Spreadsheet document.

Basic structure of a spreadsheetML document

The basic document structure of a SpreadsheetML document consists of the Sheets and Sheet elements, which reference the worksheets in the workbook. A separate XML file is created for each worksheet. For example, the SpreadsheetML for a Workbook that has two worksheets name MySheet1 and MySheet2 is located in the Workbook.xml file and is shown in the following code example.

    <?xml version="1.0" encoding="UTF-8" standalone="yes" ?> 
    <workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
        <sheets>
            <sheet name="MySheet1" sheetId="1" r:id="rId1" /> 
            <sheet name="MySheet2" sheetId="2" r:id="rId2" /> 
        </sheets>
    </workbook>

The worksheet XML files contain one or more block level elements such as SheetData represents the cell table and contains one or more Row elements. A row contains one or more Cell elements. Each cell contains a CellValue element that represents the value of the cell. For example, the SpreadsheetML for the first worksheet in a workbook, that only has the value 100 in cell A1, is located in the Sheet1.xml file and is shown in the following code example.

    <?xml version="1.0" encoding="UTF-8" ?> 
    <worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
        <sheetData>
            <row r="1">
                <c r="A1">
                    <v>100</v> 
                </c>
            </row>
        </sheetData>
    </worksheet>

Using the Open XML SDK, you can create document structure and content that uses strongly-typed classes that correspond to SpreadsheetML elements. You can find these classes in the DocumentFormat.OpenXML.Spreadsheet namespace. The following table lists the class names of the classes that correspond to the workbook, sheets, sheet, worksheet, and sheetData elements.

SpreadsheetML Element Open XML SDK Class Description
<workbook/> DocumentFormat.OpenXML.Spreadsheet.Workbook The root element for the main document part.
<sheets/> DocumentFormat.OpenXML.Spreadsheet.Sheets The container for the block level structures such as sheet, fileVersion, and others specified in the ISO/IEC 29500 specification.
<sheet/> DocumentFormat.OpenXml.Spreadsheet.Sheet A sheet that points to a sheet definition file.
<worksheet/> DocumentFormat.OpenXML.Spreadsheet. Worksheet A sheet definition file that contains the sheet data.
<sheetData/> DocumentFormat.OpenXML.Spreadsheet.SheetData The cell table, grouped together by rows.
<row/> DocumentFormat.OpenXml.Spreadsheet.Row A row in the cell table.
<c/> DocumentFormat.OpenXml.Spreadsheet.Cell A cell in a row.
<v/> DocumentFormat.OpenXml.Spreadsheet.CellValue The value of a cell.

How the Sample Code Works

After you have opened the file for read-only access, you instantiate the Sheets class.

Sheets? sheets = mySpreadsheet.WorkbookPart?.Workbook?.Sheets;

You then you iterate through the Sheets collection and display OpenXmlElement and the OpenXmlAttribute in each element.

foreach (OpenXmlElement sheet in sheets)
{
    foreach (OpenXmlAttribute attr in sheet.GetAttributes())
    {
        Console.WriteLine("{0}: {1}", attr.LocalName, attr.Value);
    }
}

By displaying the attribute information you get the name and ID for each worksheet in the spreadsheet file.

Sample code

The following is the complete code sample in both C# and Visual Basic.

static void GetSheetInfo(string fileName)
{
    // Open file as read-only.
    using (SpreadsheetDocument mySpreadsheet = SpreadsheetDocument.Open(fileName, false))
    {
        Sheets? sheets = mySpreadsheet.WorkbookPart?.Workbook?.Sheets;

        if (sheets is not null)
        {
            // For each sheet, display the sheet information.
            foreach (OpenXmlElement sheet in sheets)
            {
                foreach (OpenXmlAttribute attr in sheet.GetAttributes())
                {
                    Console.WriteLine("{0}: {1}", attr.LocalName, attr.Value);
                }
            }
        }
    }
}