此內容適用於: v4.0 (GA) | 舊版: 
v3.1 (GA) ::moniker-end
注意
除了名片模型以外,所有模型內均提供附加元件功能。
功能
文件智慧支援更複雜和模組化的分析功能。 使用附加元件功能來擴充結果以包含從文件中擷取的更多功能。 某些附加元件功能會產生額外的成本。 可以根據文件擷取的場景來啟用和停用這些選用功能。 若要啟用功能,請將相關聯的功能名稱新增至 features
查詢字串屬性。 您可以透過提供以逗號分隔的功能清單來根據要求啟用多個附加元件功能。 下列附加元件功能可用於 2023-07-31 (GA)
和更新版本。
注意
並非所有模型或Microsoft Office 檔類型都支援附加元件功能。 如需詳細資訊,請參閱模型資料擷取。
版本可用性
附加元件功能 |
附加元件/免費 |
2024-11-30 (GA) |
2023-07-31 (GA) |
2022-08-31 (GA) |
v2.1 (GA) |
條碼擷取 |
免費 |
✔️ |
✔️ |
n/a |
n/a |
語言偵測 |
免費 |
✔️ |
✔️ |
n/a |
n/a |
索引鍵/值組 |
免費 |
✔️ |
n/a |
n/a |
n/a |
可搜尋 PDF |
免費 |
✔️ |
n/a |
n/a |
n/a |
字型屬性擷取 |
附加元件 |
✔️ |
✔️ |
n/a |
n/a |
公式擷取 |
附加元件 |
✔️ |
✔️ |
n/a |
n/a |
高解析度擷取 |
附加元件 |
✔️ |
✔️ |
n/a |
n/a |
查詢欄位 |
附加元件 |
✔️ |
n/a |
n/a |
n/a |
✱ 附加元件 - 查詢欄位的價格與其他附加元件功能不同。 如需詳細資料,請參閱定價。
** 附加元件 - 可搜尋的 pdf 僅適用於讀取模型作為附加元件功能。
目前不支援 ✱ Microsoft Office 檔案。
從大型文件 (如工程繪圖) 中識別小型文字是一項挑戰。 文字通常與其他圖形元素混合,並具有不同的字型、大小和方向。 此外,文字可以被分成不同部分或與其他符號相連。 文件智慧現在支援使用 ocr.highResolution
功能從這些類型的文件中擷取內容。 透過啟用此附加功能,您可以提高從 A1/A2/A3 文件中擷取內容的品質。
{your-resource-endpoint}.cognitiveservices.azure.com/documentintelligence/documentModels/prebuilt-layout:analyze?api-version=2024-11-30&features=ocrHighResolution
# Analyze a document at a URL:
formUrl = "https://github.com/Azure-Samples/document-intelligence-code-samples/blob/main/Data/add-on/add-on-highres.png?raw=true"
poller = document_intelligence_client.begin_analyze_document(
"prebuilt-layout",
AnalyzeDocumentRequest(url_source=formUrl),
features=[DocumentAnalysisFeature.OCR_HIGH_RESOLUTION], # Specify which add-on capabilities to enable.
)
result: AnalyzeResult = poller.result()
# [START analyze_with_highres]
if result.styles and any([style.is_handwritten for style in result.styles]):
print("Document contains handwritten content")
else:
print("Document does not contain handwritten content")
for page in result.pages:
print(f"----Analyzing layout from page #{page.page_number}----")
print(f"Page has width: {page.width} and height: {page.height}, measured with unit: {page.unit}")
if page.lines:
for line_idx, line in enumerate(page.lines):
words = get_words(page, line)
print(
f"...Line # {line_idx} has word count {len(words)} and text '{line.content}' "
f"within bounding polygon '{line.polygon}'"
)
for word in words:
print(f"......Word '{word.content}' has a confidence of {word.confidence}")
if page.selection_marks:
for selection_mark in page.selection_marks:
print(
f"Selection mark is '{selection_mark.state}' within bounding polygon "
f"'{selection_mark.polygon}' and has a confidence of {selection_mark.confidence}"
)
if result.tables:
for table_idx, table in enumerate(result.tables):
print(f"Table # {table_idx} has {table.row_count} rows and " f"{table.column_count} columns")
if table.bounding_regions:
for region in table.bounding_regions:
print(f"Table # {table_idx} location on page: {region.page_number} is {region.polygon}")
for cell in table.cells:
print(f"...Cell[{cell.row_index}][{cell.column_index}] has text '{cell.content}'")
if cell.bounding_regions:
for region in cell.bounding_regions:
print(f"...content on page {region.page_number} is within bounding polygon '{region.polygon}'")
"styles": [true],
"pages": [
{
"page_number": 1,
"width": 1000,
"height": 800,
"unit": "px",
"lines": [
{
"line_idx": 1,
"content": "This",
"polygon": [10, 20, 30, 40],
"words": [
{
"content": "This",
"confidence": 0.98
}
]
}
],
"selection_marks": [
{
"state": "selected",
"polygon": [50, 60, 70, 80],
"confidence": 0.91
}
]
}
],
"tables": [
{
"table_idx": 1,
"row_count": 3,
"column_count": 4,
"bounding_regions": [
{
"page_number": 1,
"polygon": [100, 200, 300, 400]
}
],
"cells": [
{
"row_index": 1,
"column_index": 1,
"content": "Content 1",
"bounding_regions": [
{
"page_number": 1,
"polygon": [110, 210, 310, 410]
}
]
}
]
}
]
{your-resource-endpoint}.cognitiveservices.azure.com/formrecognizer/documentModels/prebuilt-layout:analyze?api-version=2023-07-31&features=ocrHighResolution
# Analyze a document at a URL:
url = "(https://github.com/Azure-Samples/document-intelligence-code-samples/blob/main/Data/add-on/add-on-highres.png?raw=true"
poller = document_analysis_client.begin_analyze_document_from_url(
"prebuilt-layout", document_url=url, features=[AnalysisFeature.OCR_HIGH_RESOLUTION] # Specify which add-on capabilities to enable.
)
result = poller.result()
# [START analyze_with_highres]
if any([style.is_handwritten for style in result.styles]):
print("Document contains handwritten content")
else:
print("Document does not contain handwritten content")
for page in result.pages:
print(f"----Analyzing layout from page #{page.page_number}----")
print(
f"Page has width: {page.width} and height: {page.height}, measured with unit: {page.unit}"
)
for line_idx, line in enumerate(page.lines):
words = line.get_words()
print(
f"...Line # {line_idx} has word count {len(words)} and text '{line.content}' "
f"within bounding polygon '{format_polygon(line.polygon)}'"
)
for word in words:
print(
f"......Word '{word.content}' has a confidence of {word.confidence}"
)
for selection_mark in page.selection_marks:
print(
f"Selection mark is '{selection_mark.state}' within bounding polygon "
f"'{format_polygon(selection_mark.polygon)}' and has a confidence of {selection_mark.confidence}"
)
for table_idx, table in enumerate(result.tables):
print(
f"Table # {table_idx} has {table.row_count} rows and "
f"{table.column_count} columns"
)
for region in table.bounding_regions:
print(
f"Table # {table_idx} location on page: {region.page_number} is {format_polygon(region.polygon)}"
)
for cell in table.cells:
print(
f"...Cell[{cell.row_index}][{cell.column_index}] has text '{cell.content}'"
)
for region in cell.bounding_regions:
print(
f"...content on page {region.page_number} is within bounding polygon '{format_polygon(region.polygon)}'"
)
"styles": [true],
"pages": [
{
"page_number": 1,
"width": 1000,
"height": 800,
"unit": "px",
"lines": [
{
"line_idx": 1,
"content": "This",
"polygon": [10, 20, 30, 40],
"words": [
{
"content": "This",
"confidence": 0.98
}
]
}
],
"selection_marks": [
{
"state": "selected",
"polygon": [50, 60, 70, 80],
"confidence": 0.91
}
]
}
],
"tables": [
{
"table_idx": 1,
"row_count": 3,
"column_count": 4,
"bounding_regions": [
{
"page_number": 1,
"polygon": [100, 200, 300, 400]
}
],
"cells": [
{
"row_index": 1,
"column_index": 1,
"content": "Content 1",
"bounding_regions": [
{
"page_number": 1,
"polygon": [110, 210, 310, 410]
}
]
}
]
}
]
ocr.formula
功能擷取 formulas
集合中所有已識別的公式,如數學方程,作為 content
下的頂端物件。 在 content
內部,偵測到的公式表示為 :formula:
。 此集合中的每個項目都表示一個公式,其包括作為 inline
或 display
的公式類型、作為 value
的 LaTeX 表示及其 polygon
座標。 最初,公式顯示在每頁的末尾。
{your-resource-endpoint}.cognitiveservices.azure.com/documentintelligence/documentModels/prebuilt-layout:analyze?api-version=2024-11-30&features=formulas
# Analyze a document at a URL:
formUrl = "https://github.com/Azure-Samples/document-intelligence-code-samples/blob/main/Data/add-on/layout-formulas.png?raw=true"
poller = document_intelligence_client.begin_analyze_document(
"prebuilt-layout",
AnalyzeDocumentRequest(url_source=formUrl),
features=[DocumentAnalysisFeature.FORMULAS], # Specify which add-on capabilities to enable
)
result: AnalyzeResult = poller.result()
# [START analyze_formulas]
for page in result.pages:
print(f"----Formulas detected from page #{page.page_number}----")
if page.formulas:
inline_formulas = [f for f in page.formulas if f.kind == "inline"]
display_formulas = [f for f in page.formulas if f.kind == "display"]
# To learn the detailed concept of "polygon" in the following content, visit: https://aka.ms/bounding-region
print(f"Detected {len(inline_formulas)} inline formulas.")
for formula_idx, formula in enumerate(inline_formulas):
print(f"- Inline #{formula_idx}: {formula.value}")
print(f" Confidence: {formula.confidence}")
print(f" Bounding regions: {formula.polygon}")
print(f"\nDetected {len(display_formulas)} display formulas.")
for formula_idx, formula in enumerate(display_formulas):
print(f"- Display #{formula_idx}: {formula.value}")
print(f" Confidence: {formula.confidence}")
print(f" Bounding regions: {formula.polygon}")
"content": ":formula:",
"pages": [
{
"pageNumber": 1,
"formulas": [
{
"kind": "inline",
"value": "\\frac { \\partial a } { \\partial b }",
"polygon": [...],
"span": {...},
"confidence": 0.99
},
{
"kind": "display",
"value": "y = a \\times b + a \\times c",
"polygon": [...],
"span": {...},
"confidence": 0.99
}
]
}
]
{your-resource-endpoint}.cognitiveservices.azure.com/formrecognizer/documentModels/prebuilt-layout:analyze?api-version=2023-07-31&features=formulas
# Analyze a document at a URL:
url = "https://github.com/Azure-Samples/document-intelligence-code-samples/blob/main/Data/add-on/layout-formulas.png?raw=true"
poller = document_analysis_client.begin_analyze_document_from_url(
"prebuilt-layout", document_url=url, features=[AnalysisFeature.FORMULAS] # Specify which add-on capabilities to enable
)
result = poller.result()
# [START analyze_formulas]
for page in result.pages:
print(f"----Formulas detected from page #{page.page_number}----")
inline_formulas = [f for f in page.formulas if f.kind == "inline"]
display_formulas = [f for f in page.formulas if f.kind == "display"]
print(f"Detected {len(inline_formulas)} inline formulas.")
for formula_idx, formula in enumerate(inline_formulas):
print(f"- Inline #{formula_idx}: {formula.value}")
print(f" Confidence: {formula.confidence}")
print(f" Bounding regions: {format_polygon(formula.polygon)}")
print(f"\nDetected {len(display_formulas)} display formulas.")
for formula_idx, formula in enumerate(display_formulas):
print(f"- Display #{formula_idx}: {formula.value}")
print(f" Confidence: {formula.confidence}")
print(f" Bounding regions: {format_polygon(formula.polygon)}")
"content": ":formula:",
"pages": [
{
"pageNumber": 1,
"formulas": [
{
"kind": "inline",
"value": "\\frac { \\partial a } { \\partial b }",
"polygon": [...],
"span": {...},
"confidence": 0.99
},
{
"kind": "display",
"value": "y = a \\times b + a \\times c",
"polygon": [...],
"span": {...},
"confidence": 0.99
}
]
}
]
ocr.font
功能擷取 styles
集合中擷取之文字的所有字型屬性,作為 content
下的頂端物件。 每個樣式物件都指定單一字型内容、它所套用的文字範圍及其相應的信賴度分數。 現有樣式屬性擴充了更多字型屬性,例如針對文字字型的 similarFontFamily
,針對斜體和一般等樣式的 fontStyle
,針對粗體或一般的 fontWeight
,針對文字色彩的 color
,針對文字週框方塊色彩的 backgroundColor
。
{your-resource-endpoint}.cognitiveservices.azure.com/documentintelligence/documentModels/prebuilt-layout:analyze?api-version=2024-11-30&features=styleFont
# Analyze a document at a URL:
formUrl = "https://github.com/Azure-Samples/document-intelligence-code-samples/blob/main/Data/receipt/receipt-with-tips.png?raw=true"
poller = document_intelligence_client.begin_analyze_document(
"prebuilt-layout",
AnalyzeDocumentRequest(url_source=formUrl),
features=[DocumentAnalysisFeature.STYLE_FONT] # Specify which add-on capabilities to enable.
)
result: AnalyzeResult = poller.result()
# [START analyze_fonts]
# DocumentStyle has the following font related attributes:
similar_font_families = defaultdict(list) # e.g., 'Arial, sans-serif
font_styles = defaultdict(list) # e.g, 'italic'
font_weights = defaultdict(list) # e.g., 'bold'
font_colors = defaultdict(list) # in '#rrggbb' hexadecimal format
font_background_colors = defaultdict(list) # in '#rrggbb' hexadecimal format
if result.styles and any([style.is_handwritten for style in result.styles]):
print("Document contains handwritten content")
else:
print("Document does not contain handwritten content")
return
print("\n----Fonts styles detected in the document----")
# Iterate over the styles and group them by their font attributes.
for style in result.styles:
if style.similar_font_family:
similar_font_families[style.similar_font_family].append(style)
if style.font_style:
font_styles[style.font_style].append(style)
if style.font_weight:
font_weights[style.font_weight].append(style)
if style.color:
font_colors[style.color].append(style)
if style.background_color:
font_background_colors[style.background_color].append(style)
print(f"Detected {len(similar_font_families)} font families:")
for font_family, styles in similar_font_families.items():
print(f"- Font family: '{font_family}'")
print(f" Text: '{get_styled_text(styles, result.content)}'")
print(f"\nDetected {len(font_styles)} font styles:")
for font_style, styles in font_styles.items():
print(f"- Font style: '{font_style}'")
print(f" Text: '{get_styled_text(styles, result.content)}'")
print(f"\nDetected {len(font_weights)} font weights:")
for font_weight, styles in font_weights.items():
print(f"- Font weight: '{font_weight}'")
print(f" Text: '{get_styled_text(styles, result.content)}'")
print(f"\nDetected {len(font_colors)} font colors:")
for font_color, styles in font_colors.items():
print(f"- Font color: '{font_color}'")
print(f" Text: '{get_styled_text(styles, result.content)}'")
print(f"\nDetected {len(font_background_colors)} font background colors:")
for font_background_color, styles in font_background_colors.items():
print(f"- Font background color: '{font_background_color}'")
print(f" Text: '{get_styled_text(styles, result.content)}'")
"content": "Foo bar",
"styles": [
{
"similarFontFamily": "Arial, sans-serif",
"spans": [ { "offset": 0, "length": 3 } ],
"confidence": 0.98
},
{
"similarFontFamily": "Times New Roman, serif",
"spans": [ { "offset": 4, "length": 3 } ],
"confidence": 0.98
},
{
"fontStyle": "italic",
"spans": [ { "offset": 1, "length": 2 } ],
"confidence": 0.98
},
{
"fontWeight": "bold",
"spans": [ { "offset": 2, "length": 3 } ],
"confidence": 0.98
},
{
"color": "#FF0000",
"spans": [ { "offset": 4, "length": 2 } ],
"confidence": 0.98
},
{
"backgroundColor": "#00FF00",
"spans": [ { "offset": 5, "length": 2 } ],
"confidence": 0.98
}
]
{your-resource-endpoint}.cognitiveservices.azure.com/formrecognizer/documentModels/prebuilt-layout:analyze?api-version=2023-07-31&features=styleFont
# Analyze a document at a URL:
url = "https://github.com/Azure-Samples/document-intelligence-code-samples/blob/main/Data/receipt/receipt-with-tips.png?raw=true"
poller = document_analysis_client.begin_analyze_document_from_url(
"prebuilt-layout", document_url=url, features=[AnalysisFeature.STYLE_FONT] # Specify which add-on capabilities to enable.
)
result = poller.result()
# [START analyze_fonts]
# DocumentStyle has the following font related attributes:
similar_font_families = defaultdict(list) # e.g., 'Arial, sans-serif
font_styles = defaultdict(list) # e.g, 'italic'
font_weights = defaultdict(list) # e.g., 'bold'
font_colors = defaultdict(list) # in '#rrggbb' hexadecimal format
font_background_colors = defaultdict(list) # in '#rrggbb' hexadecimal format
if any([style.is_handwritten for style in result.styles]):
print("Document contains handwritten content")
else:
print("Document does not contain handwritten content")
print("\n----Fonts styles detected in the document----")
# Iterate over the styles and group them by their font attributes.
for style in result.styles:
if style.similar_font_family:
similar_font_families[style.similar_font_family].append(style)
if style.font_style:
font_styles[style.font_style].append(style)
if style.font_weight:
font_weights[style.font_weight].append(style)
if style.color:
font_colors[style.color].append(style)
if style.background_color:
font_background_colors[style.background_color].append(style)
print(f"Detected {len(similar_font_families)} font families:")
for font_family, styles in similar_font_families.items():
print(f"- Font family: '{font_family}'")
print(f" Text: '{get_styled_text(styles, result.content)}'")
print(f"\nDetected {len(font_styles)} font styles:")
for font_style, styles in font_styles.items():
print(f"- Font style: '{font_style}'")
print(f" Text: '{get_styled_text(styles, result.content)}'")
print(f"\nDetected {len(font_weights)} font weights:")
for font_weight, styles in font_weights.items():
print(f"- Font weight: '{font_weight}'")
print(f" Text: '{get_styled_text(styles, result.content)}'")
print(f"\nDetected {len(font_colors)} font colors:")
for font_color, styles in font_colors.items():
print(f"- Font color: '{font_color}'")
print(f" Text: '{get_styled_text(styles, result.content)}'")
print(f"\nDetected {len(font_background_colors)} font background colors:")
for font_background_color, styles in font_background_colors.items():
print(f"- Font background color: '{font_background_color}'")
print(f" Text: '{get_styled_text(styles, result.content)}'")
"content": "Foo bar",
"styles": [
{
"similarFontFamily": "Arial, sans-serif",
"spans": [ { "offset": 0, "length": 3 } ],
"confidence": 0.98
},
{
"similarFontFamily": "Times New Roman, serif",
"spans": [ { "offset": 4, "length": 3 } ],
"confidence": 0.98
},
{
"fontStyle": "italic",
"spans": [ { "offset": 1, "length": 2 } ],
"confidence": 0.98
},
{
"fontWeight": "bold",
"spans": [ { "offset": 2, "length": 3 } ],
"confidence": 0.98
},
{
"color": "#FF0000",
"spans": [ { "offset": 4, "length": 2 } ],
"confidence": 0.98
},
{
"backgroundColor": "#00FF00",
"spans": [ { "offset": 5, "length": 2 } ],
"confidence": 0.98
}
]
ocr.barcode
功能會將 barcodes
集合中所有已識別的條碼擷取為 content
底下的最上層物件。 在 content
內部,偵測到的條碼表示為 :barcode:
。 此集合中的每個項目表示一個條碼,包括條碼類型 kind
和內嵌的條碼內容 value
及其 polygon
座標。 最初,條碼顯示在每頁的末尾。
confidence
會被硬編碼為 1。
支援的條碼類型
條碼類型 |
範例 |
QR Code |
|
Code 39 |
|
Code 93 |
|
Code 128 |
|
UPC (UPC-A & UPC-E) |
|
PDF417 |
|
EAN-8 |
|
EAN-13 |
|
Codabar |
|
Databar |
|
展開了 Databar |
|
ITF |
|
Data Matrix |
|
{your-resource-endpoint}.cognitiveservices.azure.com/documentintelligence/documentModels/prebuilt-layout:analyze?api-version=2024-11-30&features=barcodes
# Analyze a document at a URL:
formUrl = "https://github.com/Azure-Samples/document-intelligence-code-samples/blob/main/Data/add-on/add-on-barcodes.jpg?raw=true"
poller = document_intelligence_client.begin_analyze_document(
"prebuilt-read",
AnalyzeDocumentRequest(url_source=formUrl),
features=[DocumentAnalysisFeature.BARCODES] # Specify which add-on capabilities to enable.
)
result: AnalyzeResult = poller.result()
# [START analyze_barcodes]
# Iterate over extracted barcodes on each page.
for page in result.pages:
print(f"----Barcodes detected from page #{page.page_number}----")
if page.barcodes:
print(f"Detected {len(page.barcodes)} barcodes:")
for barcode_idx, barcode in enumerate(page.barcodes):
print(f"- Barcode #{barcode_idx}: {barcode.value}")
print(f" Kind: {barcode.kind}")
print(f" Confidence: {barcode.confidence}")
print(f" Bounding regions: {barcode.polygon}")
----Barcodes detected from page #1----
Detected 2 barcodes:
- Barcode #0: 123456
Kind: QRCode
Confidence: 0.95
Bounding regions: [10.5, 20.5, 30.5, 40.5]
- Barcode #1: 789012
Kind: QRCode
Confidence: 0.98
Bounding regions: [50.5, 60.5, 70.5, 80.5]
{your-resource-endpoint}.cognitiveservices.azure.com/formrecognizer/documentModels/prebuilt-layout:analyze?api-version=2023-07-31&features=barcodes
# Analyze a document at a URL:
url = "https://github.com/Azure-Samples/document-intelligence-code-samples/blob/main/Data/add-on/add-on-barcodes.jpg?raw=true"
poller = document_analysis_client.begin_analyze_document_from_url(
"prebuilt-layout", document_url=url, features=[AnalysisFeature.BARCODES] # Specify which add-on capabilities to enable.
)
result = poller.result()
# [START analyze_barcodes]
# Iterate over extracted barcodes on each page.
for page in result.pages:
print(f"----Barcodes detected from page #{page.page_number}----")
print(f"Detected {len(page.barcodes)} barcodes:")
for barcode_idx, barcode in enumerate(page.barcodes):
print(f"- Barcode #{barcode_idx}: {barcode.value}")
print(f" Kind: {barcode.kind}")
print(f" Confidence: {barcode.confidence}")
print(f" Bounding regions: {format_polygon(barcode.polygon)}")
----Barcodes detected from page #1----
Detected 2 barcodes:
- Barcode #0: 123456
Kind: QRCode
Confidence: 0.95
Bounding regions: [10.5, 20.5, 30.5, 40.5]
- Barcode #1: 789012
Kind: QRCode
Confidence: 0.98
Bounding regions: [50.5, 60.5, 70.5, 80.5]
語言偵測
將 languages
功能新增至 analyzeResult
要求可以預測每個文字行偵測到的主要語言以及 analyzeResult
下的 languages
集合中的 confidence
。
{your-resource-endpoint}.cognitiveservices.azure.com/documentintelligence/documentModels/prebuilt-layout:analyze?api-version=2024-11-30&features=languages
# Analyze a document at a URL:
formUrl = "https://github.com/Azure-Samples/document-intelligence-code-samples/blob/main/Data/add-on/add-on-fonts_and_languages.png?raw=true"
poller = document_intelligence_client.begin_analyze_document(
"prebuilt-layout",
AnalyzeDocumentRequest(url_source=formUrl),
features=[DocumentAnalysisFeature.LANGUAGES] # Specify which add-on capabilities to enable.
)
result: AnalyzeResult = poller.result()
# [START analyze_languages]
print("----Languages detected in the document----")
if result.languages:
print(f"Detected {len(result.languages)} languages:")
for lang_idx, lang in enumerate(result.languages):
print(f"- Language #{lang_idx}: locale '{lang.locale}'")
print(f" Confidence: {lang.confidence}")
print(
f" Text: '{','.join([result.content[span.offset : span.offset + span.length] for span in lang.spans])}'"
)
"languages": [
{
"spans": [
{
"offset": 0,
"length": 131
}
],
"locale": "en",
"confidence": 0.7
},
]
{your-resource-endpoint}.cognitiveservices.azure.com/formrecognizer/documentModels/prebuilt-layout:analyze?api-version=2023-07-31&features=languages
# Analyze a document at a URL:
url = "https://github.com/Azure-Samples/document-intelligence-code-samples/blob/main/Data/add-on/add-on-fonts_and_languages.png?raw=true"
poller = document_analysis_client.begin_analyze_document_from_url(
"prebuilt-layout", document_url=url, features=[AnalysisFeature.LANGUAGES] # Specify which add-on capabilities to enable.
)
result = poller.result()
# [START analyze_languages]
print("----Languages detected in the document----")
print(f"Detected {len(result.languages)} languages:")
for lang_idx, lang in enumerate(result.languages):
print(f"- Language #{lang_idx}: locale '{lang.locale}'")
print(f" Confidence: {lang.confidence}")
print(f" Text: '{','.join([result.content[span.offset : span.offset + span.length] for span in lang.spans])}'")
"languages": [
{
"spans": [
{
"offset": 0,
"length": 131
}
],
"locale": "en",
"confidence": 0.7
},
]
可搜尋 PDF
可搜尋 PDF 功能可讓您將類比 PDF (例如掃描影像 PDF 檔案) 轉換為具有內嵌文字的 PDF。 內嵌文字可在 PDF 擷取的內容中啟用深層文字搜尋,方法是將偵測到的文字實體重疊在影像檔案上。
重要
- 目前,只有讀取模型
prebuilt-read
支援可搜尋的 PDF 功能。 使用此功能時,請將 指定 modelId
為 prebuilt-read
。
- 可搜尋的 PDF 隨附於 (GA)
prebuilt-read
模型,2024-11-30
一般 PDF 耗用量不具使用量成本。
使用可搜尋 PDF
若要使用可搜尋 PDF,請使用 Analyze
作業提出 POST
要求,並將輸出格式指定為 pdf
:
POST /documentModels/prebuilt-read:analyze?output=pdf
{...}
202
完成 Analyze
作業之後,請提出 GET
要求來擷取 Analyze
作業結果。
成功完成時,可以擷取 PDF 並下載為 application/pdf
。 此作業允許直接下載 PDF 的內嵌文字格式,而不是 Base64 編碼 JSON。
// Monitor the operation until completion.
GET /documentModels/prebuilt-read/analyzeResults/{resultId}
200
{...}
// Upon successful completion, retrieve the PDF as application/pdf.
GET /documentModels/prebuilt-read/analyzeResults/{resultId}/pdf
200 OK
Content-Type: application/pdf
索引鍵/值組
在舊版 API 中 prebuilt-document
,模型會從表單和檔擷取機碼/值組。 透過將 keyValuePairs
功能新增至預先建置的版面配置,版面配置模型現在可以產生相同的結果。
索引鍵/值組是文件內的特定範圍,其識別標籤或索引鍵,及其相關的回應或值。 在結構化表單中,這些組別可能是標籤,以及使用者為該欄位輸入的值。 在非結構化文件中,它們可能是根據段落中文字內容而得的合約執行日期。 AI 模型已經過定型,可以根據各種不同的文件類型、格式和結構來擷取可識別的索引鍵和值。
若模型偵測到索引鍵存在,且沒有相關聯的值或處理選用欄位時,索引鍵也可以單獨存在。 例如,在某些情況下,表單上的中間名欄位可以留空。 索引鍵/值組是文件中所包含的文字範圍。 若是文件對相同的值有不同的描述方式,例如客戶/使用者,則相關聯的關鍵為客戶或使用者,視前後文而定。
REST API
{your-resource-endpoint}.cognitiveservices.azure.com/documentintelligence/documentModels/prebuilt-layout:analyze?api-version=2024-11-30&features=keyValuePairs
查詢欄位
查詢欄位是一項附加元件功能,可擴充從任何預先建置模型中擷取的結構描述,或在索引鍵名稱可變時定義特定的索引鍵名稱。 若要使用查詢欄位,請將功能設定為 queryFields
,並在 queryFields
屬性中提供以逗號分隔的欄位名稱清單。
文件智慧現在支援查詢欄位擷取。 使用查詢欄位擷取,您即可使用查詢要求將欄位新增至擷取流程,而不需要新增訓練。
當您需要擴充預先建置或自訂模型的結構描述,或需要使用版面配置輸出來擷取一些欄位時,請使用查詢欄位。
查詢欄位是一項進階附加元件功能。 為了獲得最佳結果,請使用駝峰式大小寫或帕斯卡式大小寫欄位名稱 (對於多重單字欄位名稱) 來定義要擷取的欄位。
查詢欄位支援每個要求最多 20 個欄位。 如果文件包含欄位的值,則會傳回欄位和值。
此版本具有查詢欄位功能的新實作方式,其價格低於先前的實作方式且應該經過驗證。
注意
除了美國稅務模型 W2、1098 和 1099 以外,檔 Intelligence Studio 查詢欄位擷取目前可使用版面配置和預先建置模型 2024-11-30
(GA) API。
針對查詢欄位擷取,請指定您要擷取的欄位,而文件智慧會據以分析文件。 以下是範例:
如果您要在 Document Intelligence Studio 中處理合約,請使用 2024-11-30 (GA) 版本:
您可以傳遞欄位標籤清單,例如 Party1
、Party2
、TermsOfUse
、PaymentTerms
、PaymentDate
和 TermEndDate
作為 analyze document
要求的一部分。
文件智慧能夠分析和擷取欄位資料並以結構化 JSON 輸出傳回值。
除了查詢欄位之外,回應還會包含文字、資料表、選取標記和其他相關資料。
{your-resource-endpoint}.cognitiveservices.azure.com/documentintelligence/documentModels/prebuilt-layout:analyze?api-version=2024-11-30&features=queryFields&queryFields=TERMS
# Analyze a document at a URL:
formUrl = "https://github.com/Azure-Samples/document-intelligence-code-samples/blob/main/Data/invoice/simple-invoice.png?raw=true"
poller = document_intelligence_client.begin_analyze_document(
"prebuilt-layout",
AnalyzeDocumentRequest(url_source=formUrl),
features=[DocumentAnalysisFeature.QUERY_FIELDS], # Specify which add-on capabilities to enable.
query_fields=["Address", "InvoiceNumber"], # Set the features and provide a comma-separated list of field names.
)
result: AnalyzeResult = poller.result()
print("Here are extra fields in result:\n")
if result.documents:
for doc in result.documents:
if doc.fields and doc.fields["Address"]:
print(f"Address: {doc.fields['Address'].value_string}")
if doc.fields and doc.fields["InvoiceNumber"]:
print(f"Invoice number: {doc.fields['InvoiceNumber'].value_string}")
Address: 1 Redmond way Suite 6000 Redmond, WA Sunnayvale, 99243
Invoice number: 34278587
下一步