Azure ML | Support for ADLS Gen2 Datastore with SAS Token Authentication

Pål Hannus 0 Reputation points
2024-11-15T08:37:53.9266667+00:00

Hello,

I’m currently working on an application where we’re connecting various data sources—such as file shares and ADLS Gen2—to Azure Machine Learning. While we can create file-share datastores with SAS token authentication, I noticed that this option isn’t available for ADLS Gen2 datastores in Azure ML. Interestingly, this type of SAS-based authentication seems to be supported in Azure Databricks. Could you clarify if there’s a particular reason for this limitation in Azure ML? Enabling SAS token support for ADLS Gen2 would simplify our integration significantly.

Our use case involves utilizing ADLS Gen2 for blob storage, but we don’t have direct access to the ADLS resource via service principals. Instead, the data lake is primarily accessed through an API, and we are only allowed to generate SAS tokens. Would it be possible to create custom datastores in Azure ML that communicate with the data lake via the API?

Additionally, is it feasible to configure datastores using non-root-level SAS tokens? For instance, if a SAS token provides access only to a specific directory within a blob (e.g., root/some/path/), can we use it to create a datastore scoped to that directory?

Thank you for your assistance!

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
Azure Machine Learning
An Azure machine learning service for building and deploying models.

1 answer

  1. Sina Salam 12,816 Reputation points
    2024-11-15T11:59:00.43+00:00

    Hello Pål Hannus,

    Welcome to Microsoft Q&A, and thank you for posting your question here.

    I understand that you need to create an Azure ML datastore for ADLS Gen2 that uses SAS token authentication.

    This may be more complex than expected, but I will outline the best available approach and include links for further reading.

    To address your questions in order:

    1. Currently, Azure Machine Learning does not support SAS token authentication for ADLS Gen2 datastores. This limitation is likely due to security and management considerations, since SAS tokens can be less secure than other authentication methods such as service principals or managed identities. Azure Databricks, on the other hand, does support SAS tokens for ADLS Gen2, possibly because of its different security model and use cases. For background on SAS tokens in ADLS Gen2, see https://knowledge.informatica.com/s/article/FAQ-What-is-Shared-Access-Signature-Token-in-ADLS-Gen2 and, for a Databricks example, https://github.com/easonlai/sas_access_to_adls_databricks
    2. Although Azure ML does not natively support SAS tokens for ADLS Gen2, you can potentially build a custom solution. One approach is to use Azure Functions or Azure Logic Apps as an intermediary that generates SAS tokens and accesses the data through the API. This would involve setting up a service that handles authentication and data access, and then connecting that service to your Azure ML workspace.
    3. Yes, it is feasible to configure datastores using non-root-level SAS tokens. If your SAS token grants access only to a specific directory within a container, you can scope your datastore to that directory. When creating the datastore, you would specify the path within the blob storage that the SAS token can access, which limits the scope of access to just the necessary data. See https://learn.microsoft.com/en-us/azure/machine-learning/how-to-access-data-interactive?view=azureml-api-2
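
Before scoping anything to a directory (point 3), it can help to confirm what a given SAS token actually grants. The following is a minimal sketch using only the Python standard library. It assumes the token follows the standard Azure Storage service SAS query-string layout (`sr` for the signed resource, `sdd` for directory depth, `sp` for permissions, `se` for expiry). The helper name `describe_sas` and the example token are hypothetical and are not part of any Azure SDK.

```python
from datetime import datetime
from urllib.parse import parse_qs

# Meaning of the `sr` (signed resource) field in a service SAS.
SIGNED_RESOURCE = {
    "b": "blob",
    "c": "container",
    "d": "directory",
    "f": "file",
    "s": "share",
}


def describe_sas(sas_token: str) -> dict:
    """Summarize the scope of a SAS token without contacting Azure."""
    # Accept tokens with or without the leading "?".
    fields = {k: v[0] for k, v in parse_qs(sas_token.lstrip("?")).items()}
    expiry = fields.get("se")
    return {
        # Tokens without a recognized `sr` value are reported as "unknown".
        "resource": SIGNED_RESOURCE.get(fields.get("sr", ""), "unknown"),
        # `sdd` is only expected on directory-scoped tokens (sr=d).
        "directory_depth": int(fields["sdd"]) if "sdd" in fields else None,
        "permissions": fields.get("sp", ""),
        "expires": (
            datetime.fromisoformat(expiry.replace("Z", "+00:00")) if expiry else None
        ),
    }


# Hypothetical directory-scoped token; the signature is a placeholder.
example = "?sv=2020-02-10&sr=d&sdd=2&sp=rl&se=2030-01-01T00:00:00Z&sig=PLACEHOLDER"
info = describe_sas(example)
print(info["resource"], info["directory_depth"], info["permissions"])
```

If the reported resource is `directory`, reads outside that directory would be expected to fail regardless of how the datastore is defined, so it is worth checking this before debugging on the Azure ML side.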

    The links above cover the detailed steps for setting up a custom solution and other specific aspects of your integration.

    I hope this is helpful! Do not hesitate to let me know if you have any other questions.


    Please don't forget to close the thread by upvoting and accepting this as the answer if it is helpful.

