Migrate from Dataflow Gen1 to Dataflow Gen2

This article targets Power BI dataflow creators. It provides them with guidance and rationale to help migrate their dataflows to Dataflow Gen2 in Data Factory for Microsoft Fabric.

Note

Dataflow Gen2 is a new generation of dataflows that delivers new features and improved experiences. Gen2 dataflows reside alongside Power BI dataflows, which are now known as Dataflow Gen1.

To understand the differences between Dataflow Gen1 and Dataflow Gen2, see Getting from Dataflow Generation 1 to Dataflow Generation 2.

Background

Microsoft Fabric has evolved into an integrated platform for both self-service and IT-managed enterprise data. With exponential growth in data volumes and complexity, Fabric customers demand enterprise solutions that scale, are secure and easy to manage, and are accessible to all users across even the largest organizations.

In recent years, Microsoft has taken great strides to deliver scalable cloud capabilities to Fabric capacity. To that end, Data Factory in Fabric empowers a large ecosystem of data integration developers and data integration solutions that have been built up over decades. It offers features and capabilities that go beyond the comparable functionality available in previous generations.

Naturally, customers are now asking whether there's an opportunity to consolidate their data integration solutions by hosting them within Fabric. They often ask questions like:

  • Does all the dataflow functionality we depend on work in Dataflow Gen2?
  • What capabilities are available only in Dataflow Gen2?
  • How do we migrate existing dataflows to Dataflow Gen2?
  • What's Microsoft's roadmap for enterprise data ingestion?

This article answers many of these questions.

Note

The decision to migrate to Fabric capacity depends on the requirements of each customer. Customers should carefully evaluate the benefits in order to make an informed decision. We expect to see organic migration to Dataflow Gen2 over time, and our intention is that it happens on terms that the customer is comfortable with.

To be clear, there are currently no plans to deprecate Power BI dataflows or Power Platform dataflows. However, investment is prioritized for Dataflow Gen2 for enterprise data ingestion, so the value provided by Fabric capacity will increase over time. Customers that choose Fabric capacity can expect to benefit from alignment with the Microsoft Fabric product roadmap.

Convergence of self-service and enterprise data integration

The consolidation of items in Fabric simplifies discovery, collaboration, and management by co-locating resources. It allows central IT teams to more easily adopt and integrate popular self-service items. At the same time, it allows operationalizing mission-critical data movement and transformation services aligned with corporate standards, including data lineage and monitoring.

To support the collaborative and scalable needs of creators, Dataflow Gen2 in Fabric introduces fast copy, which enables efficient ingestion of large data volumes by using Fabric's backend infrastructure to store and process intermediate data during transformation. It can handle terabytes of data. Dataflow creators can also specify data destinations for their transformed data, such as a Fabric lakehouse, warehouse, eventhouse, or Azure SQL Database, which facilitates better data management and accessibility. In addition, the recent integration of generative AI through Copilot enhances the data preparation experience by generating code and automating repetitive tasks, which provides an easier and faster path to creating complex solutions.

Using a common platform streamlines the workflow, which enhances collaboration between the business and IT. Organizations are therefore empowered to scale their data solutions to enterprise levels, ensuring high performance, flexibility, and efficiency in managing vast volumes of data.

Fabric capacity

Thanks to its distributed architecture, Fabric capacity is less sensitive to overall load, temporal spikes, and high concurrency. By consolidating capacities to larger Fabric capacity SKUs, customers can achieve increased performance and throughput.

Feature comparison

The following table compares feature support in Power BI dataflows (Dataflow Gen1) and Fabric Dataflow Gen2.

| Feature | Power BI Dataflow Gen1 | Fabric Dataflow Gen2 |
| --- | --- | --- |
| Connectivity | | |
| Support for all Power Query data sources | Yes | Yes |
| Connect to, and load data from, dataflows in Power BI Desktop, Excel, or Power Apps | Yes | Yes |
| Scalability | | |
| Fast copy, which supports large-scale data ingestion, utilizing the data pipeline Copy activity within dataflows | No | Yes |
| Scheduled refresh, which keeps data current | Yes | Yes |
| Incremental refresh, which uses policies to automate incremental data load and can help deliver near real-time reporting | Yes | Yes |
| Data pipeline orchestration, which allows you to add a Dataflow activity to a data pipeline and create orchestrated conditional events | No | Yes |
| Artificial intelligence | | |
| Copilot for Data Factory, which provides intelligent code generation to transform data with ease, and generates code explanations to help better understand complex tasks | No | Yes |
| Cognitive Services, which use artificial intelligence (AI) to apply different algorithms from Azure Cognitive Services to enrich self-service data preparation | Yes | No (note 1) |
| Automated machine learning (AutoML), which enables business analysts to train, validate, and invoke machine learning (ML) models directly in Fabric | Deprecated (note 2) | |
| Azure Machine Learning integration, which exposes custom models as dynamic Power Query functions that users can invoke in the Power Query Editor | Yes | No (note 1) |
| Content management | | |
| Data lineage view, which helps users understand and assess dataflow item dependencies | Yes | Yes |
| Deployment pipelines, which manage the lifecycle of Fabric content | Yes | Yes |
| Platform scalability and resiliency | | |
| Premium capacity architecture, which supports increased scale and performance | Yes | Yes |
| Multi-Geo support, which helps multinational customers address regional, industry-specific, or organizational data residency requirements | Yes (note 3) | Yes |
| Security | | |
| Virtual network (VNet) data gateway connectivity, which allows Fabric to work seamlessly in an organization's virtual network | No | Yes |
| On-premises data gateway connectivity, which allows for secure access of data between an organization's on-premises data sources and Fabric | Yes | Yes |
| Azure service tags support, where a service tag is a defined group of IP addresses that's automatically managed to minimize the complexity of updates or changes to network security rules | Yes | Yes |
| Governance | | |
| Content endorsement, to promote or certify valuable, high-quality Fabric items | Yes | Yes |
| Microsoft Purview integration, which helps customers manage and govern Fabric items | Yes | Yes |
| Microsoft Information Protection (MIP) sensitivity labels and integration with Microsoft Defender for Cloud Apps for data loss prevention (DLP) | Yes | Yes |
| Monitoring and diagnostic logging | | |
| Enhanced refresh history, which allows you to evaluate in detail what happened during the refresh of your dataflow | No | Yes |
| Monitoring hub, which provides monitoring capabilities for Fabric items | No | Yes |
| Microsoft Fabric Capacity Metrics app, which provides monitoring capabilities for Fabric capacity | Yes | Yes |
| Audit log, which tracks user activities across Fabric and Microsoft 365 | Yes | Yes |

1 To learn how to create custom functions that call Azure AI API endpoints, see Tutorial: Extract key phrases from text stored in Power BI.

2 Automated Machine Learning (AutoML) has been deprecated. For more information, see this official announcement.

3 To configure Power BI dataflow storage to use Azure Data Lake Storage (ADLS) Gen2, see this article.
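
When planning a migration, it can help to check which features your existing dataflows rely on against the comparison above. The following Python sketch is illustrative only: it transcribes a subset of the table rows by hand, and the names `Feature`, `FEATURES`, `migration_blockers`, and `gen2_only` are hypothetical helpers, not part of any Microsoft tool or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    gen1: bool  # supported in Power BI Dataflow Gen1, per the table
    gen2: bool  # supported in Fabric Dataflow Gen2, per the table


# Hand-transcribed subset of the feature comparison table (assumption:
# rows not listed here are supported in both generations).
FEATURES = [
    Feature("Fast copy", gen1=False, gen2=True),
    Feature("Scheduled refresh", gen1=True, gen2=True),
    Feature("Incremental refresh", gen1=True, gen2=True),
    Feature("Data pipeline orchestration", gen1=False, gen2=True),
    Feature("Copilot for Data Factory", gen1=False, gen2=True),
    Feature("Cognitive Services", gen1=True, gen2=False),
    Feature("Azure Machine Learning integration", gen1=True, gen2=False),
    Feature("VNet data gateway connectivity", gen1=False, gen2=True),
    Feature("Enhanced refresh history", gen1=False, gen2=True),
    Feature("Monitoring hub", gen1=False, gen2=True),
]


def migration_blockers(used_features):
    """Return the features in use that the table lists as Gen1 only.

    Raises KeyError for a name that is not in FEATURES.
    """
    by_name = {f.name: f for f in FEATURES}
    return sorted(
        name
        for name in used_features
        if by_name[name].gen1 and not by_name[name].gen2
    )


def gen2_only():
    """Return the features the table lists as available only in Gen2."""
    return [f.name for f in FEATURES if f.gen2 and not f.gen1]


if __name__ == "__main__":
    in_use = {"Scheduled refresh", "Cognitive Services"}
    print("Needs rework before migrating:", migration_blockers(in_use))
    print("Newly available in Gen2:", gen2_only())
```

A feature reported by `migration_blockers` doesn't necessarily prevent migration. For the AI features, note 1 points to custom functions that call Azure AI API endpoints as an alternative.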

Considerations

There are other considerations to factor into your planning before migrating to Dataflow Gen2.

Licensing

You require a Pro or Premium Per User (PPU) license to publish or manage Power BI dataflows (Dataflow Gen1). In contrast, you only require a Microsoft Fabric (Free) license to author a Dataflow Gen2 in a Premium capacity workspace.

Migration scenarios

When you migrate your dataflows, it's important to think beyond simply copying existing solutions. Instead, we recommend modernizing your solutions by taking advantage of the latest innovations and capabilities of Dataflow Gen2. This approach ensures that your solutions can support the growing demands of the business.

The migration scenarios article describes several methods for upgrading, taking inventory, and using accelerators such as Power Query templates. These methods can help to ensure a seamless upgrade for your projects.

Roadmap

The Microsoft Fabric release plan announces the latest updates and timelines as features are prepared for future release, including what's new and planned for Data Factory in Microsoft Fabric.

For more information about this article, check out the following resources: