Copy files from a SharePoint Online site to Azure Data Lake Storage

Sourav 105 Reputation points
2024-11-06T22:12:34.5+00:00

Hello

We are trying to set up a flow that copies files from a SharePoint Online site to Azure Data Lake Storage.

As I understand it, there are two options:

  1. Using ADF to pull the files, as described in the link below:

https://learn.microsoft.com/en-us/azure/data-factory/connector-sharepoint-online-list?tabs=data-factory

Here I have a few questions:

  • We are not sure when the files will arrive in SharePoint, and sometimes they may be delayed by up to 60 days, so a fixed schedule in ADF may not make sense. Is that understanding correct, or is there another way?
  • If we use a service principal and secret, do we need any license? Also, does the service principal require specific API permissions in Entra/AD?
  • Apart from the SharePoint permissions mentioned, do we need any permissions on the storage account? If yes, how can we scope them to a specific container/folder, and what would be the least-privilege permission?
  2. Using a Power Automate flow with two connections: the SharePoint connector and the Azure Storage connector.
    • Can we use a service principal and secret to authenticate to SharePoint Online and Azure Storage?
    • Do we need any licenses (SharePoint, AD/Entra, Power Automate)?
    • Can we set up the pipeline in ADF as in the first option and just trigger the ADF pipeline from Power Automate?
    • What permissions are required: AD/Entra (API permissions for SharePoint, etc.), Power Automate, SharePoint site, Azure Storage?

Thanks in advance

Regards

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
SharePoint
A group of Microsoft Products and technologies used for sharing and managing content, knowledge, and applications.

1 answer

  1. Keshavulu Dasari 1,830 Reputation points Microsoft Vendor
    2024-11-06T23:31:00.34+00:00

    Hi Sourav,
    Welcome to Microsoft Q&A Forum, thank you for posting your query here!
    When setting up a flow to copy files from SharePoint Online to Azure Data Lake Storage, you can consider two approaches, each with its own considerations.
    1. Using Azure Data Factory (ADF)

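    Scheduling and File Arrival: If files can be delayed by up to 60 days, running an ADF pipeline on a fixed schedule may not be efficient. Instead, you can use an event-driven approach. Note that ADF's built-in storage event triggers react to blob events in Azure Storage, not to changes in SharePoint, so the usual pattern is to combine ADF with Azure Logic Apps: the Logic App checks SharePoint for new files periodically and triggers the pipeline only when new files are detected.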
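The "trigger only when new files are detected" check can be sketched as a watermark comparison. This is a minimal illustration, not the Logic Apps implementation: the `listing` data, the field names, and both helper functions are hypothetical stand-ins for whatever the SharePoint query actually returns.

```python
from datetime import datetime, timezone


def select_new_files(listing, watermark):
    """Return entries modified strictly after the watermark, oldest first."""
    fresh = [f for f in listing if f["modified"] > watermark]
    return sorted(fresh, key=lambda f: f["modified"])


def advance_watermark(listing, watermark):
    """The new watermark is the latest modification time seen so far."""
    return max([watermark] + [f["modified"] for f in listing])


# Hypothetical listing, standing in for the result of a SharePoint query.
listing = [
    {"name": "sales_q1.csv", "modified": datetime(2024, 9, 1, tzinfo=timezone.utc)},
    {"name": "sales_q2.csv", "modified": datetime(2024, 11, 5, tzinfo=timezone.utc)},
]
watermark = datetime(2024, 10, 1, tzinfo=timezone.utc)

to_copy = select_new_files(listing, watermark)
print([f["name"] for f in to_copy])

# Persist the advanced watermark so the next check skips files already copied.
watermark = advance_watermark(listing, watermark)
```

Because the pipeline starts only when `to_copy` is non-empty, a 60-day gap between files costs a few cheap checks rather than repeated empty pipeline runs.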
    For Reference:
    https://learn.microsoft.com/en-us/answers/questions/1513111/how-to-copy-folders-and-files-from-a-sharepoint-si

    Licensing and Permissions:
    Service Principal and Secret:
    You do not need additional licenses to use a service principal. However, the service principal needs specific API permissions in Microsoft Entra ID (formerly Azure Active Directory) to access SharePoint Online, such as Sites.Read.All for reading files from SharePoint.
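For orientation, a service principal with a secret typically authenticates through the OAuth 2.0 client credentials flow. The sketch below only builds the token request and does not send it. It assumes that Sites.Read.All was granted as an application permission on Microsoft Graph (hence the Graph `.default` scope); the tenant ID, client ID, and secret are placeholders, and `build_token_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant_id, client_id, client_secret,
                        scope="https://graph.microsoft.com/.default"):
    """Build (but do not send) a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    }).encode("ascii")
    return Request(
        url,
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Placeholder values; real ones come from the app registration.
req = build_token_request("contoso-tenant-id", "app-client-id", "app-secret")
print(req.get_method(), req.full_url)
```

Sending the request (for example with `urllib.request.urlopen(req)`) would return a JSON document containing an `access_token`. In practice the secret should come from a secure store such as Azure Key Vault rather than from source code.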

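    Storage Account Permissions: You will need to grant the service principal access to the Azure Data Lake Storage account. The least-privilege role for writing files is Storage Blob Data Contributor, assigned at the container scope rather than on the whole storage account. Azure role assignments cannot be scoped below a container, so restricting access to a specific folder requires access control lists (ACLs) on that directory in Data Lake Storage. You can assign the role in the Azure portal under the Access Control (IAM) section of the container or storage account.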
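As a sketch of what a container-scoped assignment looks like, the helper below builds the resource ID of a container and the matching `az role assignment create` command. The subscription, resource group, account, container, and object ID values are placeholders, and both function names are hypothetical. The command is only printed here, not executed.

```python
import shlex


def container_scope(subscription_id, resource_group, account, container):
    """Resource ID of a single blob container, usable as a role assignment scope."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
        f"/blobServices/default/containers/{container}"
    )


def role_assignment_command(principal_object_id, scope,
                            role="Storage Blob Data Contributor"):
    """Azure CLI arguments that assign `role` to the principal at `scope`."""
    return [
        "az", "role", "assignment", "create",
        "--assignee", principal_object_id,
        "--role", role,
        "--scope", scope,
    ]


# Placeholder identifiers for illustration only.
scope = container_scope("0000-sub-id", "rg-data", "mydatalake", "landing")
cmd = role_assignment_command("1111-sp-object-id", scope)
print(shlex.join(cmd))
```

Scoping to the container instead of the storage account means the service principal cannot read or write any other container in the same account.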

    2. Using Power Automate

    Authentication:

    Service Principal and Secret: Power Automate can use service principal authentication for both SharePoint Online and Azure Storage. You will need to register an app in Microsoft Entra ID (formerly Azure AD) and grant it the necessary permissions for both services.

    Licensing:

    SharePoint and AD/Entra: No additional licenses are required beyond what you already have for SharePoint and Azure AD.
    Power Automate: Depending on the complexity and volume of your flows, you might need a premium Power Automate license.

    Triggering ADF from Power Automate: You can set up a Power Automate flow to trigger an ADF pipeline. This can be useful if you want to combine the flexibility of Power Automate with the data integration capabilities of ADF.
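One way a flow (or any HTTP client) can start a pipeline is the Data Factory `createRun` REST operation. The sketch below only constructs that request. The subscription, resource group, factory, and pipeline names are placeholders, the `sourceFolder` parameter is a hypothetical pipeline parameter, and the bearer token is assumed to have been obtained separately for `https://management.azure.com/`.

```python
import json
from urllib.request import Request


def build_create_run_request(subscription_id, resource_group, factory,
                             pipeline, bearer_token, parameters=None):
    """Build (but do not send) a request that starts an ADF pipeline run."""
    url = (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.DataFactory/factories/{factory}"
        f"/pipelines/{pipeline}/createRun?api-version=2018-06-01"
    )
    # The request body carries the pipeline parameters as a JSON object.
    body = json.dumps(parameters or {}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder names; "sourceFolder" is a hypothetical pipeline parameter.
run_req = build_create_run_request(
    "0000-sub-id", "rg-data", "my-factory", "CopySharePointToLake",
    bearer_token="<token>", parameters={"sourceFolder": "incoming"},
)
print(run_req.get_method(), run_req.full_url)
```

The identity making this call needs permission to run pipelines in the factory, for example through the Data Factory Contributor role.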

    If you have any other questions or are still running into issues, let me know in the comments and I will be happy to help.


    Please do not forget to "Accept the answer" and "up-vote" wherever the information provided helps you; this can be beneficial to other community members.

