Condividi tramite


Azure Data Architecture Guide – Blog #8: Data warehousing

In our eighth blog in this series, we'll continue to explore the Azure Data Architecture Guide. The previous entries for this blog series are:

Like the previous post, we'll work from a technology implementation seen directly in our customer engagements. The example can help lead you to the ADAG content to make the right technology choices for your business.

Data warehousing

Here we see store data coming from multiple sources into Azure Data Lake Storage, in their native format. (Azure Data Lake Storage Gen 2 is recommended.) Azure SQL Data Warehouse directly queries against the data with a combination of external tables and schema on read capabilities through PolyBase. Use Azure Data Factory to store the data you need within your warehouse, and quickly analyze and visualize the combined data with Power BI.

Data warehousing

Highlighted services

 

 

See Also

The Reference Architecture, Enterprise BI in Azure with SQL Data Warehouse, implements an extract, load, and transform (ELT) pipeline that moves data from an on-premises SQL Server database into SQL Data Warehouse and transforms the data for analysis.

The Azure Architecture Center also features two related example scenarios that demonstrate solutions using these technologies:

In addition, the Modern Data Warehouse solution ingests the data sources through Azure Data Factory, combining your data to Azure Blob Storage. It uses Azure Databricks to prep and train cleansed and transformed data, to be moved to Azure SQL Data Warehouse (which acts as the data hub). Like the Reference Architecture above, this solution leverages Azure Analysis Services for data modeling (then on to Power BI for your visualizations).

For more information about the data movement, see the following articles:

For more information about stream processing and Azure Data Bricks, see our reference architecture, Create a stream processing pipeline with Azure Databricks, and our ADAG article, Choosing a stream processing technology in Azure.

For more information about Azure Analysis Services and advanced analytics, see our last blog post in this blog series, Azure Data Architecture Guide – Blog #6: Business intelligence, our ADAG article, Choosing an analytical data store in Azure, our two Reference Architectures, Enterprise BI in Azure with SQL Data Warehouse and Automated enterprise BI with SQL Data Warehouse and Azure Data Factory, our Example Scenario Data warehousing and analytics for sales and marketing, and our two Solutions, Advanced analytics on big data and Real-time analytics.

 

Please peruse ADAG to find a clear path for you to architect your data solution on Azure:

 

Azure CAT Guidance

"Hands-on solutions, with our heads in the Cloud!"

Comments

  • Anonymous
    December 29, 2018
    Hi AzureCAT Team,Do you recommend DirectQuery connection to SQL Data Warehouse from Power BI (as it looks on the diagram in the Data warehousing section)? As I tested this approach it practically never works, because of the Power BI's nature as a self-service BI tool (many concurrent queries sent at the same time after a single click in the report containing many visuals). I think only hub & spoke approach works here (having additional layer between SQLDW and Power BI - either Analysis Services or SQL Database). Can you give some comment on that?Thank you in advance.
    • Anonymous
      January 02, 2019
      Pawel, you tried to connect Power BI to SQL Data Warehouse as your data source, per https://docs.microsoft.com/en-us/power-bi/desktop-data-sources? (Azure category data connections)That connection doesn't work at all?Thanks!
      • Anonymous
        January 10, 2019
        The comment has been removed
        • Anonymous
          January 15, 2019
          Thank you, Pawel.So in an ideal world, what features would you want to see added to Power BI and SQL Data Warehouse to address this concurrency issue? Thanks again!
  • Anonymous
    January 02, 2019
    Guys, this is conflicting with the modern DW architecture https://azure.microsoft.com/en-gb/solutions/architecture/modern-data-warehouse/ . How do you move data through the lake? Where is Databricks??? You need to clarify with most customers aware of the ADLA roadmap.
    • Anonymous
      January 02, 2019
      @Wesdev: +1!!!This blog entry is really inconsistent with the architecture you mentioned and also it references Azure Data Lake Store Gen1, while all other recommendations are based either on Blob Storage or ADLS Gen2.I'd really love to see Microsoft fellows sending a single universal message about modern DW architecture to the market, and not architecture relying on definitely NOT bulletproof approach (Power BI connection to SQLDW) and based on old services (ADLS Gen1). Another topic is I'd love to have a clear roadmap for every single Azure service. ADLA and Azure Data Catalog to mention as good examples.Consistency and clear vision are two important factors when client is about to decide whether to go into PaaS solutions or keep to the well-known IaaS implementations based on SQL Server stack.
      • Anonymous
        January 02, 2019
        Thank you for the feedback. I added the references to Gen 2.See my comment below about the newest roadmap and "Azure Updates" features.
    • Anonymous
      January 02, 2019
      The comment has been removed
  • Anonymous
    January 02, 2019
    In November, the Azure Updates site was refreshed with new features: https://techcommunity.microsoft.com/t5/Azure/Staying-up-to-date-with-the-Microsoft-Azure-roadmap/m-p/284886#M3661