Data Otaku

A seemingly random collection of data- and database-related posts

Load Azure Storage Data into Cloudera HDFS

How do I get data from an Azure Storage account into an Azure-deployed Cloudera cluster running...

Date: 10/30/2017

Working with the HBase Import and Export Utility

As mentioned in a couple of other posts, I am working with a customer to move data between two Hadoop...

Date: 12/21/2016

Pushing Data from a Hortonworks Cluster to an Azure HDInsight Cluster

I have a scenario where a customer wishes to explore a move from an existing Hortonworks (HDP)...

Date: 12/09/2016

Goofing around with the Cognitive Services Translator API

Recently, I was asked to get familiar with the Translator API in Azure Cognitive Services.  Not...

Date: 11/05/2016

A Fixed-Width Extractor for Azure Data Lake Analytics

I have a fixed-width text file I'd like to use with Azure Data Lake Analytics (ADLA). ADLA reads...

Date: 10/27/2016

Data Sources for Business Analysts

I commonly get asked where to find data.  For most Business Analysts, your best data sources are...

Date: 07/04/2016

Split a Large Row-Formatted Text File using PowerShell

I move a lot of large, row-formatted text files into Azure Storage for work I do with HDInsight and...

Date: 06/24/2016

Configuration of HBase on Azure HDInsight as a Drill Data Source

NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud. In a previous...

Date: 06/20/2016

Configuration of Hive on Azure HDInsight as a Drill Data Source

NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud. With my Drill...

Date: 06/13/2016

Enabling SSL Encryption on the Drill Web Console

NOTE This is a continuation of my series on the deployment of Apache Drill on Azure. That said,...

Date: 06/10/2016

Setting up Basic User Authentication in Drill

NOTE This is a continuation of my series on the deployment of Apache Drill on Azure. That said,...

Date: 06/10/2016

A Script for Replicating Database Backup Files to the Azure Cloud

Let's say we have a backup process that creates backup files on a file server.  For disaster...

Date: 06/07/2016

Connecting to the Drill Cluster from a Client App

NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud. Drill...

Date: 06/01/2016

Connect Drill to an Azure SQL Database

NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud. The intent of...

Date: 05/31/2016

Configuration of Azure Blob Storage (aka WASB) as a Drill Data Source

NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud. Azure Storage...

Date: 05/30/2016

Configuration of the Drill Cluster

NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud. With our...

Date: 05/29/2016

Configuration of the ZooKeeper Ensemble

NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud. With the VMs...

Date: 05/28/2016

The Deployment Mechanics for the Drill Infrastructure

NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud. In my last...

Date: 05/28/2016

An Overview of an Apache Drill Topology in Azure

NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud. Apache Drill...

Date: 05/27/2016

Deploying Apache Drill on Azure

I have a customer interested in leveraging Apache Drill for interactive queries on data resident in...

Date: 05/27/2016

Provision an HDInsight Cluster with Tez as Default Hive Execution Engine

There is a lot you can do with Hadoop but I primarily use it to store data I want to loosely...

Date: 12/04/2014

Creating a Demo Power BI Data Gateway using an Azure Virtual Machine

PLEASE READ CAREFULLY Before getting into the topic of this post, I want to clarify that what I am...

Date: 03/05/2014

Hadoop for .NET Developers: Working with HDInsight on Azure GA

This last week, HDInsight on Azure became generally available. This is great but with the roll out...

Date: 11/02/2013

Hadoop for the .NET Developer: Troubleshooting with the MapReduce Job Logs

NOTE This post is one in a series on Hadoop for .NET Developers. Despite your best efforts, you will...

Date: 09/14/2013

Hadoop for .NET Developers: Unit-Testing with the .NET SDK

NOTE This post is one in a series on Hadoop for .NET Developers. Data are problematic and code...

Date: 09/13/2013

Hadoop for .NET Developers: Implementing a (Slightly) More Complex MapReduce Job

NOTE This post is one in a series on Hadoop for .NET Developers. In our first MapReduce exercise, we...

Date: 09/12/2013

Hadoop for .NET Developers: Understanding Hadoop Streaming

NOTE This post is one in a series on Hadoop for .NET Developers. In the last post, we built a simple...

Date: 09/09/2013

Hadoop for .NET Developers: Implementing a Simple MapReduce Job

NOTE This post is one in a series on Hadoop for .NET Developers. In this exercise, we will write and...

Date: 09/07/2013

Hadoop for .NET Developers: Understanding MapReduce

NOTE This post is one in a series on Hadoop for .NET Developers. In Hadoop, data processing is...

Date: 09/04/2013

Hadoop for .NET Developers: Programmatically Loading Data to AVS

NOTE This post is one in a series on Hadoop for .NET Developers. As mentioned in an earlier post, the...

Date: 08/27/2013

Hadoop for .NET Developers: Understanding Azure Vault Storage

NOTE This post is one in a series on Hadoop for .NET Developers. My explanation of Hadoop storage in...

Date: 08/27/2013

Hadoop for .NET Developers: Programmatically Loading Data to HDFS

NOTE This post is one in a series on Hadoop for .NET Developers. In the last blog post in this...

Date: 08/26/2013

Hadoop for .NET Developers: Manually Loading Data to Hadoop

NOTE This post is one in a series on Hadoop for .NET Developers. To manually load a file to Hadoop,...

Date: 08/26/2013

Hadoop for .NET Developers: Understanding HDFS

NOTE This post is one in a series on Hadoop for .NET Developers. From a data storage perspective, you...

Date: 08/26/2013

Hadoop for .NET Developers: Obtaining the Sample Data Sets

NOTE This post is one in a series on Hadoop for .NET Developers. In the exercises that follow, we...

Date: 08/15/2013

Hadoop for .NET Developers: Setting Up an Azure Cluster

NOTE This post is one in a series on Hadoop for .NET Developers. For rapid provisioning and lack of...

Date: 08/14/2013

Hadoop for .NET Developers: Setting Up a Desktop Development Environment

NOTE This post is one in a series on Hadoop for .NET Developers. If you are a .NET developer, you...

Date: 08/14/2013

Hadoop for .NET Developers: Basic Architecture

NOTE This post is one in a series on Hadoop for .NET Developers. Hadoop is implemented as a set of...

Date: 08/14/2013

Hadoop for .NET Developers: Understanding Hadoop

NOTE This post is one in a series on Hadoop for .NET Developers. Big Data has been a source of...

Date: 08/14/2013

Hadoop for .NET Developers

Well, it’s Summer again and time for some new blog entries. This Summer, I’ve had some...

Date: 08/14/2013

Presenting Actuals and Forecast Concurrently in a Write-Enabled Cube

I have written a series of entries on writeback applications and wanted to add this last entry...

Date: 01/26/2013

UPDATED: Getting the Timeline Filter (Slicer) in Excel 2013 to Work with an Analysis Services OLAP Cube

In Excel 2013, there is a new Timeline filter (slicer) that allows you to easily select a range of...

Date: 11/30/2012

Writeback Application Code Samples

In order to help folks get started with writeback applications, I'm posting here an Analysis...

Date: 07/20/2012

Managing Writeback Cubes

NOTE This is part of a series of entries on the topic of Building Writeback Applications with...

Date: 07/20/2012

Writeback to a Regular Dimension

NOTE This is part of a series of entries on the topic of Building Writeback Applications with...

Date: 07/20/2012

Writeback to a Parent-Child Dimension

NOTE This is part of a series of entries on the topic of Building Writeback Applications with...

Date: 07/19/2012

Introducing Dimension Writeback

NOTE This is part of a series of entries on the topic of Building Writeback Applications with...

Date: 06/20/2012

Allocation across a Parent-Child Hierarchy

NOTE This is part of a series of entries on the topic of Building Writeback Applications with...

Date: 06/19/2012

Understanding Allocations

NOTE This is part of a series of entries on the topic of Building Writeback Applications with...

Date: 06/16/2012
