Document AI Konfuzio

Use the Konfuzio Document AI connector to OCR, extract, and retrieve structured information from PDFs, images, handwriting, and scans. Register for free on https://app.konfuzio.com to OCR PDFs and images; the supported file types are listed at https://dev.konfuzio.com/web/api.html#supported-file-types. You can also train a custom Document AI to extract structured information from documents by following the tutorial at https://help.konfuzio.com/tutorials/quickstart/.

This connector is available in the following products and regions:

Service | Class | Regions
Logic Apps | Standard | All Logic Apps regions except the following:
     -   Azure Government regions
     -   Azure China regions
     -   US Department of Defense (DoD)
Power Automate | Premium | All Power Automate regions except the following:
     -   US Government (GCC)
     -   US Government (GCC High)
     -   China Cloud operated by 21Vianet
     -   US Department of Defense (DoD)
Power Apps | Premium | All Power Apps regions except the following:
     -   US Government (GCC)
     -   US Government (GCC High)
     -   China Cloud operated by 21Vianet
     -   US Department of Defense (DoD)

Contact

Name | Helm & Nagel GmbH
URL | https://help.konfuzio.com
Email | info@konfuzio.com

Connector Metadata

Publisher | Helm & Nagel GmbH
Website | https://konfuzio.com
Privacy policy | https://konfuzio.com/de/impressum/
Categories | Data;Content and Files

Connect to Konfuzio with this connector to OCR, extract, validate, and process the information in documents. Konfuzio can be trained to understand any file-based carrier of information, which helps automate complex back-office processes and makes more data available for analysis. Business users can use the connector to categorize documents and extract information. Optionally, data scientists can use Konfuzio as an automated text and image labeling tool with a user-friendly web interface to maintain high-quality data sets, integrate their own AI, and improve it with humans in the loop.

Prerequisites

  1. Register for free on app.konfuzio.com. The free plan provides a limited set of features. The paid plans offer additional features; to upgrade, contact us via info@konfuzio.com.

  2. Set up a project as described in the quickstart tutorial on help.konfuzio.com.

How to get credentials

To use the connector, sign in with the username and password of the user account you created on app.konfuzio.com.
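
Outside of Power Automate, the same credentials can be passed as HTTP Basic authentication, which is what the Python example for the feedback action in this document does. The following is a minimal sketch that reads them from environment variables instead of hard-coding them. The variable names KONFUZIO_USER and KONFUZIO_PASSWORD and the helper name konfuzio_auth_from_env are assumptions made for illustration, not names defined by Konfuzio.

```python
import os

from requests.auth import HTTPBasicAuth


def konfuzio_auth_from_env(env=os.environ):
    """Build HTTP Basic credentials from environment variables.

    KONFUZIO_USER and KONFUZIO_PASSWORD are assumed variable names,
    chosen for this sketch only.
    """
    try:
        return HTTPBasicAuth(env["KONFUZIO_USER"], env["KONFUZIO_PASSWORD"])
    except KeyError as missing:
        # Fail early with a readable message instead of sending empty credentials.
        raise RuntimeError(f"Missing environment variable: {missing}") from None
```

The returned object can be passed as the auth argument of any requests call. Keeping the password out of source code avoids committing it by accident.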

Get started with your connector

Find the latest step-by-step process for getting started with your connector on help.konfuzio.com.

Known issues and limitations

  • The connector supports approximately 70 languages; see dev.konfuzio.com.
  • The connector supports various file types; see dev.konfuzio.com for a detailed list.
  • The connector supports data normalization for strings that represent numbers, percentages, dates, and booleans; see dev.konfuzio.com.
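
To make the idea of data normalization concrete, the sketch below converts a few string forms into typed Python values. It only illustrates the concept and is not Konfuzio's implementation: the German-style number format (as in the value '123,45' used in the feedback example), the day.month.year date format, and the yes/no word list are all assumptions.

```python
from datetime import date, datetime
from decimal import Decimal


def normalize_number(text: str) -> Decimal:
    """Parse a number written with '.' for thousands and ',' for decimals."""
    return Decimal(text.strip().replace(".", "").replace(",", "."))


def normalize_percentage(text: str) -> Decimal:
    """Turn a string such as '12,5 %' into a fraction."""
    return normalize_number(text.replace("%", "")) / 100


def normalize_date(text: str) -> date:
    """Parse a day.month.year date such as '31.12.2021'."""
    return datetime.strptime(text.strip(), "%d.%m.%Y").date()


def normalize_boolean(text: str) -> bool:
    """Map a few yes/no words to booleans; the word list is an assumption."""
    words = {"ja": True, "yes": True, "nein": False, "no": False}
    return words[text.strip().lower()]
```

Decimal is used instead of float so that monetary amounts keep their exact decimal representation.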

Common errors and remedies

Contact us via support@konfuzio.com in case of errors.

FAQ

FAQs about the Power Automate connector are published on help.konfuzio.com.

Creating a connection

The connector supports the following authentication types:

Default | Parameters for creating connection. | All regions | Not shareable

Default

Applicable: All regions

Parameters for creating connection.

This connection is not shareable. If the power app is shared with another user, that user will be prompted to create a new connection explicitly.

Name | Type | Description | Required
username | securestring | The username for this API | True
password | securestring | The password for this API | True

Throttling Limits

Name | Calls | Renewal Period
API calls per connection | 100 | 60 seconds
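
A client that calls the service in a loop can stay under this limit by pacing its own requests. The following is a minimal sketch of a sliding-window limiter using the 100 calls per 60 seconds from the table as defaults. The class name SlidingWindowLimiter and its parameters are hypothetical, and the throttling itself is enforced by the platform regardless of what the client does.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls calls in any window of period seconds."""

    def __init__(self, max_calls=100, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._calls = deque()    # timestamps of recent calls, oldest first

    def wait(self):
        """Block until another call is allowed, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have left the window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Sleep until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._calls[0]))
```

Call wait() immediately before each request. The clock and sleep arguments exist only so the limiter can be exercised without real delays.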

Actions

Delete a document

Delete a document.

Give feedback to the extraction results of a document

You can give feedback by sending the adapted extraction result dictionary.

Retrieve the extraction results for a document

Get all information for a document using its ID. The extraction results are available once the processing has finished.

Upload a new document

Upload a new document.

Delete a document

Delete a document.

Parameters

Name | Key | Required | Type | Description
Document ID | doc | True | string | ID of the document you want to delete
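
For readers calling the service from Python rather than through the connector, a deletion might look like the sketch below. It assumes that the per-document URL shown in the feedback example in this document also accepts the DELETE method; that is an assumption, not something this page confirms. The function names are hypothetical.

```python
import requests
from requests.auth import HTTPBasicAuth

# Assumed base URL, taken from the feedback example in this document.
API_DOCS_URL = "https://app.konfuzio.com/api/v2/docs/"


def build_delete_request(doc_id, user, password):
    """Prepare (but do not send) a DELETE request for one document."""
    request = requests.Request(
        method="DELETE",
        url=f"{API_DOCS_URL}{doc_id}/",
        auth=HTTPBasicAuth(user, password),
    )
    return request.prepare()


def delete_document(doc_id, user, password):
    """Send the DELETE request and raise on an HTTP error status."""
    with requests.Session() as session:
        response = session.send(
            build_delete_request(doc_id, user, password), timeout=30
        )
    response.raise_for_status()
    return response.status_code
```

Splitting preparation from sending makes it possible to inspect the method, URL, and headers before anything irreversible is sent.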

Give feedback to the extraction results of a document

You can give feedback by sending the adapted extraction result dictionary.

Example Python code:

import requests
from requests.auth import HTTPBasicAuth

# Placeholders: replace with your own document ID and Konfuzio credentials.
DOC_ID = 123
KONFUZIO_USER = "user@example.com"
KONFUZIO_PASSWORD = "your-password"

url = f"https://app.konfuzio.com/api/v2/docs/{DOC_ID}/"
auth = HTTPBasicAuth(KONFUZIO_USER, KONFUZIO_PASSWORD)

# Fetch the current extraction results for the document.
response = requests.get(url=url, auth=auth)
response.raise_for_status()
data = response.json()

# Mark the first extraction of the label as correct.
data['labels']['Bruttozahlweise']['extractions'][0]['correct'] = True

# Add a new extraction that was not in the result list.
data['labels']['Bruttozahlweise']['extractions'].append({'value': '123,45'})

# Send the adapted result dictionary back as JSON.
r = requests.patch(url=url, json=data, auth=auth)
r.raise_for_status()

Parameters

Name | Key | Required | Type | Description
Document ID | doc | True | string | ID of the document you want to patch

Retrieve the extraction results for a document

Get all information for a document using its ID. The extraction results are available once the processing has finished.

Parameters

Name | Key | Required | Type | Description
Document ID | doc | True | string | ID of your document in the project
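
Because results only become available once processing has finished, a caller typically has to retry the retrieval. The sketch below shows a generic polling loop. This page does not describe how a finished document can be recognized in the response, so both the fetch function and the is_ready check are supplied by the caller; the function name and its parameters are hypothetical.

```python
import time


def wait_for_results(fetch, is_ready, interval=2.0, max_attempts=30,
                     sleep=time.sleep):
    """Call fetch() until is_ready(result) is true or attempts run out.

    fetch: callable returning the latest response for the document.
    is_ready: callable deciding whether that response is complete.
    """
    for attempt in range(max_attempts):
        result = fetch()
        if is_ready(result):
            return result
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"Results not ready after {max_attempts} attempts")
```

In practice, fetch would wrap the GET request for the document, and is_ready would inspect whichever field of the response indicates completed processing.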

Upload a new document

Upload a new document

Parameters

Name | Key | Required | Type | Description
File | data_file | True | file | File to send to the host.
Project ID | project | True | integer | The ID of your project.
Synchronous response | sync | | boolean | Default is False
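
The parameters above map naturally onto a multipart form upload. The following sketch prepares such a request with the requests library, using the keys data_file, project, and sync from the table. The upload URL and the encoding of the boolean as the strings "true" and "false" are assumptions, since neither is stated on this page, and the function name is hypothetical.

```python
import requests
from requests.auth import HTTPBasicAuth

# Assumed upload URL; only the per-document URL appears in this document.
UPLOAD_URL = "https://app.konfuzio.com/api/v2/docs/"


def build_upload_request(file_name, file_bytes, project_id,
                         user, password, sync=False):
    """Prepare (but do not send) a multipart upload request."""
    request = requests.Request(
        method="POST",
        url=UPLOAD_URL,
        # The file goes into the data_file part of the form.
        files={"data_file": (file_name, file_bytes)},
        # Plain form fields; the boolean encoding is an assumption.
        data={"project": str(project_id), "sync": "true" if sync else "false"},
        auth=HTTPBasicAuth(user, password),
    )
    return request.prepare()
```

The prepared request can be sent with requests.Session().send(prepared). Preparing it first allows the form fields to be checked before any file leaves the machine.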

Returns

Name | Path | Type | Description
Data file | data_file | uri |
ID | id | integer |
Project | project | integer |
Data file name | data_file_name | string |
Callback URL | callback_url | uri |
Sync | sync | boolean |
Extraction url | extraction_url | string |
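
When handling the upload response in Python, the returned fields can be collected into a small typed container. This is a sketch only: the class name UploadResult is hypothetical, and because this page does not say which fields are always present, every field is treated as optional.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class UploadResult:
    """Fields listed in the Returns table; all treated as optional here."""

    data_file: Optional[str] = None
    id: Optional[int] = None
    project: Optional[int] = None
    data_file_name: Optional[str] = None
    callback_url: Optional[str] = None
    sync: Optional[bool] = None
    extraction_url: Optional[str] = None

    @classmethod
    def from_json(cls, payload: Mapping[str, Any]) -> "UploadResult":
        """Keep the known fields and ignore any others in the payload."""
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in payload.items() if k in known})
```

Ignoring unknown keys keeps the parser tolerant if the service returns additional fields that are not listed here.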