Jumpstarting Big Data Projects / Architectural Considerations of HDInsight Applications @ OOP 2015, TechEd Europe 2014 and PASS Summit 2014
Last week, my esteemed colleague Alexei Khalyako (from AzureCAT, the Azure Customer Advisory Team) and I spoke at OOP 2015 (Software meets Business) in Munich on jumpstarting Big Data projects. In fact, this session was also the foundation of our talks at TechEd Europe 2014 and PASS Summit 2014.
In the session, we walked through the architectural considerations and decisions made in building an HDInsight solution. As a short reminder: HDInsight is a Hadoop implementation offered as a platform as a service (PaaS) on Microsoft Azure. The HDInsight solution was built to drive visitor experience and provide a personalised view using recommendations.
The session is structured along the typical Data Warehouse workflow:
As we go through every step, we highlight the agony of choice between the various technologies (both open source and Microsoft Azure services), especially in the big data space:
We then map the technology options to each step within the data warehouse workflow:
And the final implementation workflow:
You can also find the presentation here on SlideShare.