Connect Local Machine to Synapse Lake Database using Python

Haria, Neel 0

Hi,

I am trying to develop tooling for my use case, where I am using PySpark/Spark SQL on Synapse Analytics to create tables in a Lake database. Sample code below:

spark.sql("CREATE TABLE abc AS SELECT * FROM view")

The table abc is created in the Lake database called default in Synapse Analytics, and I am able to access it from Synapse Notebooks. However, I need to access it from my local machine using an Azure SDK or a connection method that would enable me to run queries and retrieve results within my local environment.

phemanth 13,900 Reputation points Microsoft Vendor

2025-02-17T21:00:19.3933333+00:00
Hi @Haria, Neel

Welcome to Microsoft Q&A platform.

Thanks for your response!

Regarding SynapseClient

You're right, SynapseClient seems to be deprecated, and it's not the best approach. Instead, for accessing Lake Database tables, we need to determine whether you're using Spark or SQL-based access (since pyodbc only works with SQL endpoints).

For using pyodbc, how do I find my username and password for tables in Lake Databases?

If you're using SQL Authentication, your username and password would be the ones you set up when creating the Synapse SQL Pool.

If you're using Azure Active Directory (AAD) Authentication, you might need to use Azure Managed Identity or a Service Principal instead of direct username/password.
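As a rough illustration of the service principal option, the sketch below builds a pyodbc connection string that uses the ODBC driver's Authentication=ActiveDirectoryServicePrincipal keyword. The function name build_sp_connection_string and every argument value are hypothetical. It assumes a recent ODBC Driver 17 (or 18) and a service principal that has already been granted access to the SQL endpoint.

```python
def build_sp_connection_string(server, database, client_id, client_secret,
                               driver="ODBC Driver 17 for SQL Server"):
    """Build an ODBC connection string for Azure AD service principal auth.

    Assumption: the installed Microsoft ODBC driver supports the
    'ActiveDirectoryServicePrincipal' authentication mode.
    Values containing ';' or braces would need ODBC escaping, which this
    sketch does not handle.
    """
    parts = [
        ("DRIVER", "{" + driver + "}"),
        ("SERVER", server),
        ("DATABASE", database),
        ("UID", client_id),        # application (client) ID of the service principal
        ("PWD", client_secret),    # client secret of the service principal
        ("Authentication", "ActiveDirectoryServicePrincipal"),
        ("Encrypt", "yes"),
    ]
    return "".join(f"{key}={value};" for key, value in parts)
```

With pyodbc installed, the returned string would be passed to pyodbc.connect(...). Whether the tables are reachable this way still depends on the endpoint exposing them, as discussed in this thread.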

If your data is in a Lake Database (Spark-based tables), pyodbc won't work because these tables aren't directly accessible via TDS connections. Instead, you would need to use Spark pools or access the underlying ADLS storage.

Could you confirm whether you have a Synapse SQL pool, or are you working only with Spark tables in the Lake Database? That will help determine the best way to connect.
phemanth 13,900 Reputation points Microsoft Vendor

2025-02-18T18:51:12.15+00:00

@Haria, Neel We haven’t heard back from you on the last response and are just checking in to see whether you have a resolution yet. If you do, please share it with the community, as it can be helpful to others. Otherwise, respond with more details and we will try to help.
Haria, Neel 0 Reputation points

2025-02-19T00:09:16.8166667+00:00

I am using Spark tables, not Synapse SQL pools.
Smaran Thoomu 20,295 Reputation points Microsoft Vendor

2025-02-19T07:07:31.6033333+00:00
Hi @Haria, Neel
Thanks for the clarification! Since you're working with Spark tables in a Synapse Lake Database, pyodbc won’t work because these tables are not accessible through standard SQL connections. However, you can access them from your local machine using one of the following approaches:

Approach 1: Using Synapse Spark Notebooks (Remote Execution)
If you just need to run queries remotely, you can execute a Synapse Spark notebook from Python on your local machine using Azure Synapse Pipelines or the Synapse REST API. This way, your Spark code still runs inside Synapse, but you trigger it from your machine.
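A minimal sketch of the REST route, assuming a Synapse pipeline that wraps the notebook already exists. The pipeline name, the workspace name and the helper names are hypothetical. It uses the pipeline createRun operation on the workspace development endpoint with API version 2020-12-01, and azure-identity to obtain the token. The call only starts the run; the notebook would still have to write its results somewhere you can read them.

```python
import json
import urllib.parse
import urllib.request

API_VERSION = "2020-12-01"  # assumed API version for the Synapse pipeline endpoints


def build_create_run_request(workspace_name, pipeline_name, token, parameters=None):
    """Build (but do not send) the POST request that starts a pipeline run."""
    url = (
        f"https://{workspace_name}.dev.azuresynapse.net"
        f"/pipelines/{urllib.parse.quote(pipeline_name)}/createRun"
        f"?api-version={API_VERSION}"
    )
    body = json.dumps(parameters or {}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, method="POST", headers=headers)


def trigger_pipeline(workspace_name, pipeline_name, parameters=None):
    """Start the pipeline and return its run ID (requires azure-identity)."""
    # Imported lazily so the request builder above works without the Azure SDK.
    from azure.identity import DefaultAzureCredential

    scope = "https://dev.azuresynapse.net/.default"
    token = DefaultAzureCredential().get_token(scope).token
    request = build_create_run_request(workspace_name, pipeline_name, token, parameters)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["runId"]
```

For example, trigger_pipeline("<your-synapse-workspace>", "<your-pipeline-name>", {"table": "abc"}) would start a run and pass table as a pipeline parameter, provided the signed-in identity has permission to run pipelines in the workspace.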

Approach 2: Direct Access via ADLS

Since Spark tables in Synapse are stored as files in ADLS (as Delta tables if they were created in Delta format), you can read them directly using libraries such as azure-storage-file-datalake or delta-rs (the deltalake Python package).

Example using Pandas and Delta:

from deltalake import DeltaTable

table_path = "abfss://<container>@<storage-account>.dfs.core.windows.net/<path-to-delta-table>"
df = DeltaTable(table_path).to_pandas()
print(df)

For this to work, you would need Azure credentials (like Managed Identity or SAS Token).
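One possible way to supply those credentials, as a sketch only: deltalake accepts a storage_options dictionary when opening a table. The helper names below are hypothetical, and the option keys (azure_storage_account_name, azure_storage_sas_token, azure_storage_account_key, azure_use_azure_cli) are the ones I understand delta-rs to accept for Azure, so please check them against the version you have installed.

```python
def build_storage_options(account_name, sas_token=None, account_key=None):
    """Build a storage_options dict for deltalake on ADLS Gen2.

    Assumption: the key names match those accepted by the installed
    deltalake version. Falls back to the Azure CLI login when neither a
    SAS token nor an account key is given.
    """
    options = {"azure_storage_account_name": account_name}
    if sas_token:
        options["azure_storage_sas_token"] = sas_token
    elif account_key:
        options["azure_storage_account_key"] = account_key
    else:
        options["azure_use_azure_cli"] = "true"
    return options


def read_delta_table(table_path, storage_options):
    """Load a Delta table into a pandas DataFrame (requires deltalake and pandas)."""
    # Imported lazily so the helper above can be used without deltalake installed.
    from deltalake import DeltaTable

    return DeltaTable(table_path, storage_options=storage_options).to_pandas()
```

Usage would look like read_delta_table(table_path, build_storage_options("<storage-account>", sas_token="<sas-token>")). This only applies if the table was actually written in Delta format.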

Let me know which approach works for you or if you need further guidance!
Rakesh Govindula 5 Reputation points Microsoft Vendor

2025-02-20T09:46:39.6166667+00:00
Hi @Haria, Neel,

You can use pyodbc and pandas to achieve your requirement. For this, your tables should be external tables.

The username and password are the same as the ones you specified when creating the Synapse workspace.

import pandas as pd
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<server_endpoint>;"
    "DATABASE=<databasename>;"
    "UID=<username>;"
    "PWD=<password>;"
)

query = "SELECT TOP 100 * FROM table2"
df = pd.read_sql(query, conn)
print(df.head())
conn.close()

Let me know which approach works for you or if you need further guidance!

1 answer

Amira Bedhiafi 28,766 Reputation points

2025-02-17T16:26:41.7933333+00:00
Prerequisites

You need an Azure Synapse Analytics workspace that is already set up.

You need to have the necessary permissions to access the Synapse workspace. You can use a service principal or Azure AD authentication.

Python installed on your local machine.

You need to install the necessary Python libraries. You can do this using pip:

pip install pyodbc
pip install azure-identity
pip install azure-synapse

If you are using Azure AD authentication, you can use the azure-identity library to authenticate.

from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()
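To connect such a credential to pyodbc, one commonly used pattern is to request an access token and hand it to the ODBC driver through the SQL_COPT_SS_ACCESS_TOKEN connection attribute (1256). The sketch below assumes the Microsoft ODBC driver and the token scope https://database.windows.net/.default. The helper names are hypothetical and this has not been tested against a Lake database.

```python
import struct

# Connection attribute defined by the Microsoft ODBC driver for access tokens.
SQL_COPT_SS_ACCESS_TOKEN = 1256


def pack_access_token(token):
    """Pack a token as the driver expects: 4-byte little-endian length + UTF-16-LE bytes."""
    token_bytes = token.encode("utf-16-le")
    return struct.pack(f"<I{len(token_bytes)}s", len(token_bytes), token_bytes)


def connect_with_azure_ad(server, database, driver="ODBC Driver 17 for SQL Server"):
    """Open a pyodbc connection using a token from DefaultAzureCredential."""
    # Imported lazily so pack_access_token can be used without these packages.
    import pyodbc
    from azure.identity import DefaultAzureCredential

    scope = "https://database.windows.net/.default"  # assumed scope for SQL endpoints
    token = DefaultAzureCredential().get_token(scope).token
    # No UID/PWD here: the identity comes from the access token.
    conn_str = f"DRIVER={{{driver}}};SERVER={server};DATABASE={database};"
    return pyodbc.connect(
        conn_str,
        attrs_before={SQL_COPT_SS_ACCESS_TOKEN: pack_access_token(token)},
    )
```

Calling connect_with_azure_ad("<your-synapse-workspace>.sql.azuresynapse.net", "default") would return a connection that can be used like the one in the pyodbc example in this answer, assuming the signed-in identity has access to that endpoint.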

You can use the pyodbc library to connect to Azure Synapse Analytics. Below is an example of how to set up the connection and run a query.

import pyodbc

# Connection parameters
server = '<your-synapse-workspace>.sql.azuresynapse.net'
database = 'default'
username = '<your-username>'
password = '<your-password>'
driver = '{ODBC Driver 17 for SQL Server}'

# Connection string
conn_str = f'DRIVER={driver};SERVER={server};DATABASE={database};UID={username};PWD={password}'

# Establish the connection
conn = pyodbc.connect(conn_str)

# Create a cursor
cursor = conn.cursor()

# Execute a query
query = "SELECT * FROM abc"
cursor.execute(query)

# Fetch and print the results
rows = cursor.fetchall()
for row in rows:
    print(row)

# Close the connection
cursor.close()
conn.close()

If you prefer using the Azure Synapse SDK, you can use the azure-synapse library to interact with your Synapse workspace.

from azure.synapse import SynapseClient
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()
synapse_client = SynapseClient(credential, '<your-subscription-id>', '<your-resource-group>', '<your-synapse-workspace>')

query = "SELECT * FROM abc"
result = synapse_client.sql_pools.execute('<your-sql-pool-name>', query)
for row in result:
    print(row)
Haria, Neel 0 Reputation points

2025-02-17T16:53:17.7966667+00:00

I tried using SynapseClient and am getting the following errors:
SynapseClient does not take 5 parameters, only 1 (credentials).

TypeError: SynapseClient.__init__() takes from 2 to 4 positional arguments but 5 were given

SynapseClient does not have sql_pools as an attribute.

AttributeError: 'SynapseClient' object has no attribute 'sql_pools'

Edit: Support for SynapseClient seems to be deprecated. Please comment with tested solutions and refrain from copy-pasting answers from ChatGPT prompts. Thank you!

Haria, Neel 0 Reputation points

2025-02-17T16:56:16.0766667+00:00

For using pyodbc, how do I find my username and password for tables in Lake Databases?
