How to Pass File Name from Copy Data Activity to Next Transform Activity?

Rohant Agrawal 20 Reputation points
2024-11-29T10:21:42.5633333+00:00

Hi,

In a Copy Data activity, I generate a file by extracting data from a DB source and sink it to Azure Blob storage as a newly generated file with the naming convention "<activityA>__<date><time>._json". I want to pass this file name to the next Transform Activity so that the file can be used for further transformation. I am not sure how to do this in Azure Data Factory. Can someone please help me with this?

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

5 answers

  1. Deepanshukatara-6769 11,625 Reputation points
    2024-11-29T11:21:04.4733333+00:00

    Hello Rohant, Welcome to MS Q&A

    To pass a dynamically generated file name from a Copy Data activity to a Transform Activity in Azure Data Factory, you can use the following approach:

    1. Generate the File Name in the Copy Data Activity:
      • In the Copy Data activity, use the dynamic content feature to generate the file name. For example, you can use expressions to include the activity name, date, and time in the file name.
    2. Store the File Name in a Variable:
      • Create a pipeline variable to store the generated file name. You can use the Set Variable activity to assign the generated file name to this variable.
    3. Pass the Variable to the Transform Activity:
      • In the Transform Activity, use the variable containing the file name as a parameter. You can reference the variable in the activity's settings or script.

    Example Pipeline JSON:

    {
        "name": "ExamplePipeline",
        "properties": {
            "variables": {
                "FileName": {
                    "type": "String"
                }
            },
            "activities": [
                {
                    "name": "CopyDataActivity",
                    "type": "Copy",
                    "inputs": [
                        {
                            "referenceName": "SourceDataset",
                            "type": "DatasetReference"
                        }
                    ],
                    "outputs": [
                        {
                            "referenceName": "SinkDataset",
                            "type": "DatasetReference",
                            "parameters": {
                                "fileName": {
                                    "value": "@concat('CopyDataActivity', '__', formatDateTime(utcnow(), 'yyyyMMddHHmmss'), '._json')",
                                    "type": "Expression"
                                }
                            }
                        }
                    ],
                    "typeProperties": {
                        "sink": {
                            "type": "JsonSink",
                            "storeSettings": {
                                "type": "AzureBlobStorageWriteSettings"
                            },
                            "formatSettings": {
                                "type": "JsonWriteSettings"
                            }
                        }
                    }
                },
                {
                    "name": "SetFileNameVariable",
                    "type": "SetVariable",
                    "dependsOn": [
                        {
                            "activity": "CopyDataActivity",
                            "dependencyConditions": [
                                "Succeeded"
                            ]
                        }
                    ],
                    "typeProperties": {
                        "variableName": "FileName",
                        "value": {
                            "value": "@concat('CopyDataActivity', '__', formatDateTime(utcnow(), 'yyyyMMddHHmmss'), '._json')",
                            "type": "Expression"
                        }
                    }
                },
                {
                    "name": "TransformActivity",
                    "type": "ExecuteDataFlow",
                    "dependsOn": [
                        {
                            "activity": "SetFileNameVariable",
                            "dependencyConditions": [
                                "Succeeded"
                            ]
                        }
                    ],
                    "userProperties": [],
                    "typeProperties": {
                        "dataflow": {
                            "referenceName": "YourDataFlow",
                            "type": "DataFlowReference",
                            "parameters": {
                                "FileName": {
                                    "value": "'@{variables('FileName')}'",
                                    "type": "Expression"
                                }
                            }
                        }
                    }
                }
            ]
        }
    }
    
    
    
    

    This example demonstrates how to generate a file name dynamically, store it in a variable, and pass it to a subsequent Transform Activity in Azure Data Factory.

    Please let us know if you have any questions.

    Kindly accept the answer if it helps.

    Thanks

    Deepanshu


  2. Rohant Agrawal 20 Reputation points
    2024-11-29T11:45:53.5233333+00:00

    With this approach, the file name created in the Copy Data activity differs from the one derived when setting the variable, because utcnow() is evaluated separately in each activity and returns a different value each time.

    For example, the file name in the Copy Data activity was "ABC_29Nov2024_13:23:28_landing.json", whereas the file name derived by the variable was "ABC_29Nov2024_13:23:35_landing.json".


  3. Rohant Agrawal 20 Reputation points
    2024-11-29T12:22:45.1333333+00:00

    But the two places capture different timestamps, so the derived file name differs from the one used in the Copy Data activity.
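
    One way to keep the two names identical is to evaluate the timestamp once, before the copy runs, and reuse the stored value everywhere. The fragment below is a minimal sketch of that ordering, not a tested pipeline. The variable `FileName`, the `fileName` parameter on `SinkDataset`, the `ABC_` prefix, and the activity names are assumptions, and the Copy activity's source and sink settings are left out.

    ```json
    {
        "name": "ExamplePipeline",
        "properties": {
            "variables": {
                "FileName": { "type": "String" }
            },
            "activities": [
                {
                    "name": "SetFileNameVariable",
                    "description": "Assumed name. Evaluates utcnow() exactly once and stores the result.",
                    "type": "SetVariable",
                    "typeProperties": {
                        "variableName": "FileName",
                        "value": {
                            "value": "@concat('ABC_', formatDateTime(utcnow(), 'yyyyMMddHHmmss'), '_landing.json')",
                            "type": "Expression"
                        }
                    }
                },
                {
                    "name": "CopyDataActivity",
                    "description": "Source and sink settings omitted. SinkDataset is assumed to expose a fileName parameter.",
                    "type": "Copy",
                    "dependsOn": [
                        { "activity": "SetFileNameVariable", "dependencyConditions": ["Succeeded"] }
                    ],
                    "outputs": [
                        {
                            "referenceName": "SinkDataset",
                            "type": "DatasetReference",
                            "parameters": {
                                "fileName": { "value": "@variables('FileName')", "type": "Expression" }
                            }
                        }
                    ]
                }
            ]
        }
    }
    ```

    Because every later activity reads `@variables('FileName')` instead of calling `utcnow()` again, they all see the same string.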


  4. Ganesh Gurram 1,825 Reputation points Microsoft Vendor
    2024-12-03T10:41:19.1933333+00:00

    @Rohant Agrawal - Thanks for the question and using MS Q&A forum.

    Great to know you got the file name from the variable working in the Copy Data Activity! For the Data Flow Transform Activity, follow these steps to pass the variable to the source settings:

    1. Define Parameters in Data Flow:
      • First, ensure that your Data Flow has a parameter defined for the file name. You can do this by opening your Data Flow and adding a parameter (e.g., FileName).
    2. Pass the Variable to the Data Flow:
      • In your pipeline, when you configure the Data Flow activity, you can pass the variable value to the Data Flow parameter. You will do this in the Parameters section of the Data Flow activity settings.
    3. Use the Parameter in Data Flow:
      • Inside your Data Flow, you can reference the parameter (e.g., FileName) in your source settings. This allows you to dynamically use the file name that was generated in the Copy Data Activity.

    Example Steps

    Here’s a step-by-step guide:

    1. In Your Data Flow:
      • Open your Data Flow and go to the Parameters tab.
      • Add a new parameter named FileName (or whatever you prefer).
    2. In Your Pipeline:
      • In the Data Flow activity settings, go to the Parameters section.
      • Set the FileName parameter to the variable you created earlier.
    3. In Your Data Flow Source:
      • When configuring the source in your Data Flow, you can use the FileName parameter to specify the file path or name dynamically.
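
    In pipeline JSON, the pipeline side of these steps might look roughly like the fragment below. It is a sketch under assumptions: the activity names, the data flow name `YourDataFlow`, and the variable `FileName` are placeholders, and the single quotes around the interpolated value follow the form usually used for string parameters in a data flow, so check them against the JSON your own pipeline generates.

    ```json
    {
        "name": "TransformActivity",
        "description": "Assumed names throughout. Passes the pipeline variable into the data flow's FileName parameter.",
        "type": "ExecuteDataFlow",
        "dependsOn": [
            { "activity": "CopyDataActivity", "dependencyConditions": ["Succeeded"] }
        ],
        "typeProperties": {
            "dataflow": {
                "referenceName": "YourDataFlow",
                "type": "DataFlowReference",
                "parameters": {
                    "FileName": {
                        "value": "'@{variables('FileName')}'",
                        "type": "Expression"
                    }
                }
            }
        }
    }
    ```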

    By following these steps, you should be able to successfully pass the dynamically generated file name from your Copy Data Activity to the Data Flow Transform Activity. This approach ensures that the file name remains consistent throughout your pipeline.

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, please click `Accept Answer` and `Yes` for "Was this answer helpful".


  5. Pinaki Ghatak 5,230 Reputation points Microsoft Employee
    2024-12-03T11:19:24.74+00:00

    Hello @Rohant Agrawal

    To pass the file name generated in the Copy Activity to the next Transform Activity, you can make use of the output of an earlier activity. In the Copy Activity, the sink is the newly generated file with the naming convention "<activityA>__<date><time>._json".

    You can then reference an activity's output in the input of the next Transform Activity by using the dynamic content feature of Azure Data Factory. For example, the expression @{activity('CopyActivityName').output.firstRow.fileName} references a fileName column in the output of the activity named 'CopyActivityName'. Note that output.firstRow is the output shape of a Lookup activity. A Copy activity's output reports run metrics such as rowsCopied and filesWritten rather than the sink file name, so this expression only resolves when the referenced activity is a Lookup that returns a fileName column.
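
    If the file name has to be discovered from storage rather than carried in a variable, one option is a Get Metadata activity, whose output does list file names. The fragment below is a rough sketch under assumptions: `LandingFolderDataset` is a hypothetical dataset pointing at the sink folder, and the activity names are made up. It only identifies the new file reliably when that folder holds a single file.

    ```json
    {
        "name": "GetLandingFiles",
        "description": "Hypothetical activity. Lists the items in the folder the Copy activity wrote to.",
        "type": "GetMetadata",
        "dependsOn": [
            { "activity": "CopyActivityName", "dependencyConditions": ["Succeeded"] }
        ],
        "typeProperties": {
            "dataset": {
                "referenceName": "LandingFolderDataset",
                "type": "DatasetReference"
            },
            "fieldList": ["childItems"]
        }
    }
    ```

    A downstream activity could then read the first listed name with an expression such as @activity('GetLandingFiles').output.childItems[0].name.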

    I hope this helps!

