Migrate from Dataflow Gen1 to Dataflow Gen2
This article targets Power BI dataflow creators. It provides them with guidance and rationale to help migrate their dataflows to Dataflow Gen2 in Data Factory for Microsoft Fabric.
Note
Dataflow Gen2 is a new generation of dataflows that delivers new features and improved experiences. Gen2 dataflows reside alongside Power BI dataflows, which are now known as Dataflow Gen1.
To understand the differences between Dataflow Gen1 and Dataflow Gen2, see Getting from Dataflow Generation 1 to Dataflow Generation 2.
Background
Microsoft Fabric has evolved into an integrated platform for both self-service and IT-managed enterprise data. With exponential growth in data volumes and complexity, Fabric customers demand enterprise solutions that scale, are secure and easy to manage, and are accessible to all users across even the largest organizations.
In recent years, Microsoft has taken great strides to deliver scalable cloud capabilities to Fabric capacity. To that end, Data Factory in Fabric instantly empowers a large ecosystem of data integration developers and data integration solutions that have been built up over decades. It leverages the full set of features and capabilities that go far beyond comparable functionality available in previous generations.
Naturally, customers are now asking whether there's an opportunity to consolidate their data integration solutions by hosting them within Fabric. They often ask questions like:
- Does all the dataflow functionality we depend on work in Dataflow Gen2?
- What capabilities are available only in Dataflow Gen2?
- How do we migrate existing dataflows to Dataflow Gen2?
- What's Microsoft's roadmap for enterprise data ingestion?
This article answers many of these questions.
Note
The decision to migrate to Fabric capacity depends on the requirements of each customer. Customers should carefully evaluate the benefits in order to make an informed decision. We expect to see organic migration to Dataflow Gen2 over time, and our intention is that it happens on terms that the customer is comfortable with.
To be clear, there are currently no plans to deprecate Power BI dataflows or Power Platform dataflows. However, investment is focused on Dataflow Gen2 for enterprise data ingestion, so the value provided by Fabric capacity will increase over time. Customers that choose Fabric capacity can expect to benefit from alignment with the Microsoft Fabric product roadmap.
Convergence of self-service and enterprise data integration
The consolidation of items in Fabric simplifies discovery, collaboration, and management by co-locating resources. It allows central IT teams to more easily adopt and integrate popular self-service items. At the same time, it allows operationalizing mission-critical data movement and transformation services aligned with corporate standards, including data lineage and monitoring.
To support the collaborative and scalable needs of creators, Dataflow Gen2 in Fabric introduces fast copy, which enables efficient ingestion of large data volumes by using Fabric's backend infrastructure to store and process intermediate data during transformation. It can handle terabytes of data seamlessly. Dataflow creators can specify data destinations for their transformed data, such as a Fabric lakehouse, warehouse, eventhouse, or Azure SQL Database, which facilitates better data management and accessibility. In addition, the recent integration of generative AI through Copilot enhances the data preparation experience with intelligent code generation and automation of repetitive tasks, which makes complex solutions easier and faster to create.
A common platform streamlines the workflow, which enhances collaboration between the business and IT. Organizations can therefore scale their data solutions to enterprise levels while maintaining high performance, flexibility, and efficiency in managing vast volumes of data.
Fabric capacity
Thanks to its distributed architecture, Fabric capacity is less sensitive to overall load, temporal spikes, and high concurrency. By consolidating capacities to larger Fabric capacity SKUs, customers can achieve increased performance and throughput.
Feature comparison
The following table compares feature support in Power BI Dataflow Gen1 and Fabric Dataflow Gen2.
Feature | Power BI Dataflow Gen1 | Fabric Dataflow Gen2 |
---|---|---|
Connectivity | ||
Support for all Power Query data sources | Yes | Yes |
Connect to, and load data from, dataflows in Power BI Desktop, Excel, or Power Apps | Yes | Yes |
Scalability | ||
Fast copy, which supports large-scale data ingestion, utilizing the data pipeline Copy activity within dataflows | No | Yes |
Scheduled refresh, which keeps data current | Yes | Yes |
Incremental refresh, which uses policies to automate incremental data load and can help deliver near real-time reporting | Yes | Yes |
Data pipeline orchestration, which allows you to add a Dataflow activity to a data pipeline and create orchestrated conditional events | No | Yes |
Artificial intelligence | ||
Copilot for Data Factory, which provides intelligent code generation to transform data with ease, and generates code explanations to help better understand complex tasks | No | Yes |
Cognitive Services, which use artificial intelligence (AI) to apply different algorithms from Azure Cognitive Services to enrich self-service data preparation | Yes | No 1 |
Automated machine learning (AutoML), which enables business analysts to train, validate, and invoke machine learning (ML) models directly in Fabric | Deprecated 2 | |
Azure Machine Learning integration, which exposes custom models as dynamic Power Query functions that users can invoke in the Power Query Editor | Yes | No 1 |
Content management | ||
Data lineage view, which helps users understand and assess dataflow item dependencies | Yes | Yes |
Deployment pipelines, which manage the lifecycle of Fabric content | Yes | Yes |
Platform scalability and resiliency | ||
Premium capacity architecture, which supports increased scale and performance | Yes | Yes |
Multi-Geo support, which helps multinational customers address regional, industry-specific, or organizational data residency requirements | Yes 3 | Yes |
Security | ||
Virtual network (VNet) data gateway connectivity, which allows Fabric to work seamlessly in an organization's virtual network | No | Yes |
On-premises data gateway connectivity, which allows for secure access of data between an organization's on-premises data sources and Fabric | Yes | Yes |
Azure service tags support, which uses defined groups of IP addresses that are automatically managed to minimize the complexity of updates or changes to network security rules | Yes | Yes |
Governance | ||
Content endorsement, to promote or certify valuable, high-quality Fabric items | Yes | Yes |
Microsoft Purview integration, which helps customers manage and govern Fabric items | Yes | Yes |
Microsoft Information Protection (MIP) sensitivity labels and integration with Microsoft Defender for Cloud Apps for data loss prevention (DLP) | Yes | Yes |
Monitoring and diagnostic logging | ||
Enhanced refresh history, which allows you to evaluate in detail what happened during the refresh of your dataflow | No | Yes |
Monitoring hub, which provides monitoring capabilities for Fabric items | No | Yes |
Microsoft Fabric Capacity Metrics app, which provides monitoring capabilities for Fabric capacity | Yes | Yes |
Audit log, which tracks user activities across Fabric and Microsoft 365 | Yes | Yes |
1 To learn how to create custom functions that call Azure AI API endpoints, see Tutorial: Extract key phrases from text stored in Power BI.
2 Automated Machine Learning (AutoML) has been deprecated. For more information, see this official announcement.
3 To configure Power BI dataflow storage to use Azure Data Lake Storage (ADLS) Gen2, see this article.
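The differences in the table can be turned into a quick pre-migration check. The following Python sketch is illustrative only: the feature names are transcribed from the rows above where Gen1 and Gen2 support differs, and the function name `assess_migration` and the shape of its result are assumptions made for this example, not part of any Microsoft tool.

```python
# Rows from the feature comparison table where support differs between
# Dataflow Gen1 and Dataflow Gen2. Features supported in both are omitted.
GEN1_ONLY = {
    "Cognitive Services",
    "Azure Machine Learning integration",
}
DEPRECATED = {
    "Automated machine learning (AutoML)",
}
GEN2_ONLY = {
    "Fast copy",
    "Data pipeline orchestration",
    "Copilot for Data Factory",
    "Virtual network (VNet) data gateway connectivity",
    "Enhanced refresh history",
    "Monitoring hub",
}


def assess_migration(features_in_use):
    """Classify the features a Gen1 dataflow depends on (hypothetical helper)."""
    used = set(features_in_use)
    return {
        # Used today, but marked "No" for Gen2 in the table.
        "needs_rework": sorted(used & GEN1_ONLY),
        # Used today, but deprecated according to the table.
        "deprecated": sorted(used & DEPRECATED),
        # Marked "Yes" for Gen2 only, so available after migration.
        "gained_in_gen2": sorted(GEN2_ONLY),
    }


report = assess_migration(["Incremental refresh", "Cognitive Services"])
print(report["needs_rework"])  # ['Cognitive Services']
```

A dataflow that lands in `needs_rework` isn't necessarily blocked. Footnote 1 points to custom functions that call Azure AI API endpoints as one possible replacement.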
Considerations
There are other considerations to factor into your planning before migrating to Dataflow Gen2.
Licensing
You need a Power BI Pro or Premium Per User (PPU) license to publish or manage Power BI dataflows (Dataflow Gen1). In contrast, you need only a Microsoft Fabric (Free) license to author a Dataflow Gen2 in a Premium capacity workspace.
Migration scenarios
When you migrate your dataflows, it's important to think beyond simply copying existing solutions. Instead, we recommend modernizing your solutions by taking advantage of the latest innovations and capabilities of Dataflow Gen2. This approach ensures that your solutions can support the growing demands of the business.
The migration scenarios article describes several methods for upgrading, taking inventory, and using accelerators like Power Query templates. These methods can help to ensure a seamless upgrade for your projects.
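As a rough illustration of the inventory step, the following Python sketch lists the Gen1 dataflows in one workspace. It isn't taken from the migration scenarios article. It assumes the Power BI REST API "Get Dataflows" operation at `/groups/{groupId}/dataflows` and the response fields `objectId`, `name`, and `configuredBy`, so verify both against the current REST API reference before relying on it. The function names are hypothetical, and acquiring the access token is out of scope here.

```python
import json
import urllib.request

# Assumed base URL of the Power BI REST API.
API_ROOT = "https://api.powerbi.com/v1.0/myorg"


def fetch_dataflows(workspace_id, access_token):
    """Request the dataflows in a workspace and return the parsed JSON body."""
    request = urllib.request.Request(
        f"{API_ROOT}/groups/{workspace_id}/dataflows",
        headers={"Authorization": f"Bearer {access_token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def build_inventory(payload):
    """Reduce the API response to one row per dataflow, sorted by name."""
    rows = []
    for item in payload.get("value", []):
        rows.append({
            "id": item.get("objectId"),        # assumed field name
            "name": item.get("name"),
            "owner": item.get("configuredBy"), # assumed field name
        })
    return sorted(rows, key=lambda row: row["name"] or "")
```

Keeping the HTTP call separate from `build_inventory` means the summarizing logic can be tried against a saved response without network access or credentials.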
Roadmap
The Microsoft Fabric release plan announces the latest updates and timelines as features are prepared for future release, including what's new and planned for Data Factory in Microsoft Fabric.
Related content
For more information about this article, check out the following resources: