Data Otaku
A seemingly random collection of data- and database-related posts
Load Azure Storage Data into Cloudera HDFS
How do I get data from an Azure Storage account into an Azure-deployed Cloudera cluster running...
Date: 10/30/2017
Working with the HBase Import and Export Utility
As mentioned in a couple other posts, I am working with a customer to move data between two Hadoop...
Date: 12/21/2016
Pushing Data from a Hortonworks Cluster to an Azure HDInsight Cluster
I have a scenario where a customer wishes to explore a move from an existing Hortonworks (HDP)...
Date: 12/09/2016
Goofing around with the Cognitive Services Translator API
Recently, I was asked to get familiar with the Translator API in Azure Cognitive Services. Not...
Date: 11/05/2016
A Fixed-Width Extractor for Azure Data Lake Analytics
I have a fixed-width text file I'd like to use with Azure Data Lake Analytics (ADLA). ADLA reads...
Date: 10/27/2016
Data Sources for Business Analysts
I commonly get asked where to find data. For most Business Analysts, your best data sources are...
Date: 07/04/2016
Split a Large Row-Formatted Text File using PowerShell
I move a lot of large, row-formatted text files into Azure Storage for work I do with HDInsight and...
Date: 06/24/2016
Configuration of HBase on Azure HDInsight as a Drill Data Source
NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud. In a previous...
Date: 06/20/2016
Configuration of Hive on Azure HDInsight as a Drill Data Source
NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud. With my Drill...
Date: 06/13/2016
Enabling SSL Encryption on the Drill Web Console
NOTE This is a continuation of my series on the deployment of Apache Drill on Azure. That said,...
Date: 06/10/2016
Setting up Basic User Authentication in Drill
NOTE This is a continuation of my series on the deployment of Apache Drill on Azure. That said,...
Date: 06/10/2016
A Script for Replicating Database Backup Files to the Azure Cloud
Let's say we have a backup process that creates backup files on a file server. For disaster...
Date: 06/07/2016
Connecting to the Drill Cluster from a Client App
NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud. Drill...
Date: 06/01/2016
Connect Drill to an Azure SQL Database
NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud. The intent of...
Date: 05/31/2016
Configuration of Azure Blob Storage (aka WASB) as a Drill Data Source
NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud. Azure Storage...
Date: 05/30/2016
Configuration of the Drill Cluster
NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud. With our...
Date: 05/29/2016
Configuration of the ZooKeeper Ensemble
NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud. With the VMs...
Date: 05/28/2016
The Deployment Mechanics for the Drill Infrastructure
NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud. In my last...
Date: 05/28/2016
An Overview of an Apache Drill Topology in Azure
NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud. Apache Drill...
Date: 05/27/2016
Deploying Apache Drill on Azure
I have a customer interested in leveraging Apache Drill for interactive queries on data resident in...
Date: 05/27/2016
Provision an HDInsight Cluster with Tez as Default Hive Execution Engine
There is a lot you can do with Hadoop but I primarily use it to store data I want to loosely...
Date: 12/04/2014
Creating a Demo Power BI Data Gateway using an Azure Virtual Machine
PLEASE READ CAREFULLY Before getting into the topic of this post, I want to clarify that what I am...
Date: 03/05/2014
Hadoop for .NET Developers: Working with HDInsight on Azure GA
This last week, HDInsight on Azure became generally available. This is great but with the roll out...
Date: 11/02/2013
Hadoop for the .NET Developer: Troubleshooting with the MapReduce Job Logs
NOTE This post is one in a series on Hadoop for .NET Developers. Despite your best efforts, you will...
Date: 09/14/2013
Hadoop for .NET Developers: Unit-Testing with the .NET SDK
NOTE This post is one in a series on Hadoop for .NET Developers. Data are problematic and code...
Date: 09/13/2013
Hadoop for .NET Developers: Implementing a (Slightly) More Complex MapReduce Job
NOTE This post is one in a series on Hadoop for .NET Developers. In our first MapReduce exercise, we...
Date: 09/12/2013
Hadoop for .NET Developers: Understanding Hadoop Streaming
NOTE This post is one in a series on Hadoop for .NET Developers. In the last post, we built a simple...
Date: 09/09/2013
Hadoop for .NET Developers: Implementing a Simple MapReduce Job
NOTE This post is one in a series on Hadoop for .NET Developers. In this exercise, we will write and...
Date: 09/07/2013
Hadoop for .NET Developers: Understanding MapReduce
NOTE This post is one in a series on Hadoop for .NET Developers. In Hadoop, data processing is...
Date: 09/04/2013
Hadoop for .NET Developers: Programmatically Loading Data to AVS
NOTE This post is one in a series on Hadoop for .NET Developers. As mentioned in an earlier post, the...
Date: 08/27/2013
Hadoop for .NET Developers: Understanding Azure Vault Storage
NOTE This post is one in a series on Hadoop for .NET Developers. My explanation of Hadoop storage in...
Date: 08/27/2013
Hadoop for .NET Developers: Programmatically Loading Data to HDFS
NOTE This post is one in a series on Hadoop for .NET Developers. In the last blog post in this...
Date: 08/26/2013
Hadoop for .NET Developers: Manually Loading Data to Hadoop
NOTE This post is one in a series on Hadoop for .NET Developers. To manually load a file to Hadoop,...
Date: 08/26/2013
Hadoop for .NET Developers: Understanding HDFS
NOTE This post is one in a series on Hadoop for .NET Developers. From a data storage perspective, you...
Date: 08/26/2013
Hadoop for .NET Developers: Obtaining the Sample Data Sets
NOTE This post is one in a series on Hadoop for .NET Developers. In the exercises that follow, we...
Date: 08/15/2013
Hadoop for .NET Developers: Setting Up an Azure Cluster
NOTE This post is one in a series on Hadoop for .NET Developers. For rapid provisioning and lack of...
Date: 08/14/2013
Hadoop for .NET Developers: Setting Up a Desktop Development Environment
NOTE This post is one in a series on Hadoop for .NET Developers. If you are a .NET developer, you...
Date: 08/14/2013
Hadoop for .NET Developers: Basic Architecture
NOTE This post is one in a series on Hadoop for .NET Developers. Hadoop is implemented as a set of...
Date: 08/14/2013
Hadoop for .NET Developers: Understanding Hadoop
NOTE This post is one in a series on Hadoop for .NET Developers. Big Data has been a source of...
Date: 08/14/2013
Hadoop for .NET Developers
Well, it’s Summer again and time for some new blog entries. This Summer, I’ve had some...
Date: 08/14/2013
Presenting Actuals and Forecast Concurrently in a Write-Enabled Cube
I have written a series of entries on writeback applications and wanted to add this last entry...
Date: 01/26/2013
UPDATED: Getting the Timeline Filter (Slicer) in Excel 2013 to Work with an Analysis Services OLAP Cube
In Excel 2013, there is a new Timeline filter (slicer) that allows you to easily select a range of...
Date: 11/30/2012
Writeback Application Code Samples
In order to help folks get started with writeback applications, I'm posting here an Analysis...
Date: 07/20/2012
Managing Writeback Cubes
NOTE This is part of a series of entries on the topic of Building Writeback Applications with...
Date: 07/20/2012
Writeback to a Regular Dimension
NOTE This is part of a series of entries on the topic of Building Writeback Applications with...
Date: 07/20/2012
Writeback to a Parent-Child Dimension
NOTE This is part of a series of entries on the topic of Building Writeback Applications with...
Date: 07/19/2012
Introducing Dimension Writeback
NOTE This is part of a series of entries on the topic of Building Writeback Applications with...
Date: 06/20/2012
Allocation across a Parent-Child Hierarchy
NOTE This is part of a series of entries on the topic of Building Writeback Applications with...
Date: 06/19/2012
Understanding Allocations
NOTE This is part of a series of entries on the topic of Building Writeback Applications with...
Date: 06/16/2012