How to fix AzureMLException: AzureMLException: Message: Execution failed in operation 'to_pandas_dataframe' for Dataset(id='id', name='TEMP_DATA_ASSET', version=1, error_code=ScriptExecution.Database.Unexpected,error_message and ErrorCode: ScriptExecution

Suyash Bahrani 0 Reputation points
2025-03-11T19:58:08.94+00:00

Issue: AzureMLException When Consuming Datastore in AzureML via to_pandas_dataframe()

Description of the Setup

I have manually set up a datastore in AzureML, which is linked to an Azure Synapse Analytics server. During the creation of the datastore, I selected:

Datastore Type: Azure SQL Database

Authentication Type: SQL Authentication

Inputs Provided: Server Name, Database Name

To consume this datastore, I created a data asset using the following SQL query:

SELECT * FROM TABLE_NAME

The dataset contains 100,000 rows, and the schema appeared to be correct in the preview. After confirming the schema, I successfully created a data asset named TEMP_DATA_ASSEST.

Issue Encountered

I use my company (organization) Azure ML studio. I use kernels: Python 3.10 - SDK v2, Python 3.8-AzureML

I have replaced some variables with temp values due to security reasons:session_id with session_idid with iddata asset name with TEMP_DATA_ASSESTserver name with server-name

I attempted to consume the dataset in a Python script using the AzureML Pandas SDK 1 available in the consume tab of data asset (azureml-core and azureml-dataprep[pandas]) with the following code:

Required Libraries

from azureml.core import Workspace, Dataset

AzureML Workspace Details

subscription_id = 'my_subscription_id'

resource_group = 'my_resource_group'

workspace_name = 'my_workspace'

Connect to the AzureML Workspace

workspace = Workspace(subscription_id, resource_group, workspace_name)

Load the Dataset

dataset = Dataset.get_by_name(workspace, name='TEMP_DATA_ASSEST')

Convert to Pandas DataFrame

df = dataset.to_pandas_dataframe()

When executing the to_pandas_dataframe() function, I encountered the following AzureMLException:

Error Message

AzureMLException: AzureMLException:

Message: Execution failed in operation 'to_pandas_dataframe' for Dataset(id='id', name='TEMP_DATA_ASSEST', version=1,

error_code=ScriptExecution.Database.Unexpected,

error_message=Database execution failed with "SQLError(Server(TokenError { code: 103010, state: 1, class: 16,

message: "Parse error at line: 1, column: 22: Incorrect syntax near 'stmt'.", server: "server-name", procedure: "", line: 1 }))".

Observations

The SQL syntax error suggests an issue with query parsing, but the query was accepted when creating the data asset.

The schema preview showed all fields correctly, and the dataset creation was successful.

The issue occurs only when calling to_pandas_dataframe(), preventing me from loading the data into a Pandas DataFrame.

The error message mentions "Incorrect syntax near 'stmt'", which is unclear because no direct query is provided in the function call.

Troubleshooting Attempts

Checked the SQL Query in Azure Synapse:

The query SELECT * FROM TABLE_NAME executes fine in Synapse, returning all rows.

Verified the Dataset Schema in AzureML Studio:

The dataset schema was detected correctly in the AzureML UI.

Attempted an Alternative Dataset Load Method:

df = dataset.to_pandas_dataframe(sample_size=1000)

Still fails with the same SQL parsing error.

Checked SDK Versions:

Installed versions:

azureml-core: 1.54.0

azureml-dataprep[pandas]: 1.40.0

Upgraded to the latest version:

pip install --upgrade azureml-core azureml-dataprep[pandas]

Issue persists after upgrading.

Explicitly Set SQL Query Instead of Dataset Name:

dataset = Dataset.Tabular.from_sql_query(workspace, "SELECT * FROM TABLE_NAME")

df = dataset.to_pandas_dataframe()

Same error message appears.

Questions for the Community

What could be causing this SQL parsing error when calling to_pandas_dataframe()? How do I resolve this?

Does AzureML modify the query internally when executing the dataset retrieval?

Are there any known limitations when consuming large datasets (~100,000 rows) via to_pandas_dataframe() from an Azure SQL Datastore?

Is there an alternative approach to directly load this dataset into a Pandas DataFrame?

Azure Machine Learning
Azure Machine Learning
An Azure machine learning service for building and deploying models.
3,177 questions
0 comments No comments
{count} votes

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.