Unable to Add Duplicate Values in Table Column for Profit and Loss Statement

Armani Hammer 5 Reputation points
2025-02-08T20:04:38.7333333+00:00

I am trying to label data for a profit and loss statement. For the field type, I chose the table element, since it seems to be the only field type that can do what I need.

I need to retrieve three things:

  • Date
  • Account
  • Value

The documents are structured such that the dates are column headers, the leftmost value in a row represents the account, and the values correspond to each date column. My goal is to add the information in the table field for each row. However, I am unable to do this because the system does not allow me to use the same value (from the PDF) more than once.

I would like to be able to label the table like so:

Date         Account       Value
June 2023    Total Income  1,234,567$
July 2023    Total Income  1,432,765$
August 2023  Total Income  1,222,333$

However, the system forces the table to appear as follows:

Date         Account       Value
June 2023    Total Income  1,234,567$
July 2023                  1,432,765$
August 2023                1,222,333$

This formatting does not work for my needs. Is there a way to enable the use of duplicate values in this field, or am I overlooking a solution? It seems like it should be possible.

Thank you in advance.

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

1 answer

  1. Sina Salam 17,176 Reputation points
    2025-02-10T09:35:28.7266667+00:00

    Hello Armani Hammer,

    Welcome to Microsoft Q&A, and thank you for posting your question here.

    I understand that you are unable to add duplicate values to a table column when labeling a profit and loss statement in Azure AI Document Intelligence.

    To resolve the issue with minimal overhead, follow these steps:

    Step 1:

    1. Label Tables Explicitly:
    • Treat each row as a separate entry, even if the "Account" value repeats.
    • Label all cells in every row (including duplicates like "Total Income") during model training. Do not rely on implicit inheritance from previous rows.
    2. Map Column Headers to Rows: If dates are column headers (e.g., "June 2023"), label them as headers. Azure will associate values under each date column with the correct header.

    Step 2:

    Call the prebuilt-layout model to extract structured table data. Table extraction is part of the layout output by default, so no additional features value is required:

    POST https://{endpoint}/documentintelligence/documentModels/prebuilt-layout:analyze?api-version=2024-02-29-preview
    {
      "urlSource": "{document-url}"
    }
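
In the layout response, each table is typically reported as a flat list of cells that carry rowIndex and columnIndex values rather than as nested rows. The sketch below, which assumes that shape, arranges the cells into a simple grid. The function name table_to_grid and the hand-written sample_table are illustrative assumptions, not part of the service or its SDK.

```python
from typing import Any


def table_to_grid(table: dict[str, Any]) -> list[list[str]]:
    """Arrange a table's flat cell list into a row-by-column grid of strings."""
    # Start with an empty grid so that cells missing from the response stay "".
    grid = [[""] * table["columnCount"] for _ in range(table["rowCount"])]
    for cell in table["cells"]:
        grid[cell["rowIndex"]][cell["columnIndex"]] = cell.get("content", "")
    return grid


# Hypothetical fragment standing in for one entry of analyzeResult["tables"].
sample_table = {
    "rowCount": 2,
    "columnCount": 3,
    "cells": [
        {"rowIndex": 0, "columnIndex": 1, "content": "June 2023", "kind": "columnHeader"},
        {"rowIndex": 0, "columnIndex": 2, "content": "July 2023", "kind": "columnHeader"},
        {"rowIndex": 1, "columnIndex": 0, "content": "Total Income"},
        {"rowIndex": 1, "columnIndex": 1, "content": "1,234,567$"},
        {"rowIndex": 1, "columnIndex": 2, "content": "1,432,765$"},
    ],
}

grid = table_to_grid(sample_table)
for row in grid:
    print(row)
```

Working from a grid keeps the later steps independent of the exact response format, so only this one function needs to change if the output shape differs from the assumption.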
    

    Step 3:

    If the raw JSON output skips duplicates, write a script to fill gaps using the last non-empty value for Account:

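    # Example Python script to fill in missing "Account" values.
    # extracted_table_rows holds one dict per row; sample data is shown here.
    extracted_table_rows = [
        {"Date": "June 2023", "Account": "Total Income", "Value": "1,234,567$"},
        {"Date": "July 2023", "Account": "", "Value": "1,432,765$"},
    ]
    current_account = ""
    for row in extracted_table_rows:
        if row.get("Account"):
            # Remember the most recent non-empty account name.
            current_account = row["Account"]
        else:
            # Carry the last seen account name forward into the gap.
            row["Account"] = current_account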
    

    NOTE: This post-processing step is only needed if the output omits the repeated values.
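
Because the dates are column headers and the account is the leftmost cell of each row, another option is to unpivot the table so that every value is paired with its date and account directly. This repeats the account on every record by construction. The sketch below assumes the table has already been arranged as a list of rows with the header row first; grid and unpivot are hypothetical names used only for illustration.

```python
def unpivot(grid: list[list[str]]) -> list[dict[str, str]]:
    """Turn a wide grid (dates across the top, accounts down the side)
    into one Date/Account/Value record per non-empty value cell."""
    if not grid:
        return []
    header, *body = grid
    dates = header[1:]  # the first header cell sits above the account column
    records = []
    for row in body:
        account, values = row[0], row[1:]
        for date, value in zip(dates, values):
            if value:  # skip cells with no extracted content
                records.append({"Date": date, "Account": account, "Value": value})
    return records


# Sample grid mirroring the layout described in the question.
grid = [
    ["", "June 2023", "July 2023", "August 2023"],
    ["Total Income", "1,234,567$", "1,432,765$", "1,222,333$"],
]

records = unpivot(grid)
for record in records:
    print(record)
```

With this approach the account name never has to be labeled more than once, since it is copied from the row's first cell during reshaping.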

    Step 4:

    Ensure the final JSON includes every value explicitly, as in the simplified example below:

    {
      "tables": [
        {
          "rows": [
            {
              "cells": [
                { "content": "June 2023", "role": "columnHeader" },
                { "content": "Total Income", "role": "rowHeader" },
                { "content": "1,234,567$" }
              ]
            },
            {
              "cells": [
                { "content": "July 2023", "role": "columnHeader" },
                { "content": "Total Income", "role": "rowHeader" },
                { "content": "1,432,765$" }
              ]
            }
          ]
        }
      ]
    }
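
One way to confirm that every value is present is a small check over the final records before they are saved. The sketch below assumes the data has been reduced to a list of dicts with Date, Account, and Value keys; REQUIRED_FIELDS and find_incomplete are hypothetical names, not part of any Azure SDK.

```python
REQUIRED_FIELDS = ("Date", "Account", "Value")


def find_incomplete(records: list[dict]) -> list[tuple[int, list[str]]]:
    """Return (index, missing_fields) for each record lacking a required value."""
    problems = []
    for index, record in enumerate(records):
        missing = []
        for field in REQUIRED_FIELDS:
            value = record.get(field)
            # Treat absent keys, None, and blank strings as missing.
            if value is None or not str(value).strip():
                missing.append(field)
        if missing:
            problems.append((index, missing))
    return problems


# Sample records: the second one has lost its repeated Account value.
records = [
    {"Date": "June 2023", "Account": "Total Income", "Value": "1,234,567$"},
    {"Date": "July 2023", "Account": "", "Value": "1,432,765$"},
]

problems = find_incomplete(records)
for index, missing in problems:
    print(f"Record {index} is missing: {', '.join(missing)}")
```

An empty result means every record carries all three fields; any reported index points at a row that still needs a value filled in.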
    

    In summary, for your profit and loss statement:

    • Use Document Intelligence Studio to label tables with duplicates explicitly.
    • Avoid queryFields for tabular data; use table extraction instead.
    • Fill gaps programmatically if Azure skips duplicates.

    I hope this is helpful! Do not hesitate to let me know if you have any other questions.


    Please don't forget to close out the thread by upvoting and accepting this as the answer if it was helpful.

