Querying Hive tables with Windows PowerShell

patterns & practices Developer Center

From: Developing big data solutions on Microsoft Azure HDInsight

To query a Hive table in a PowerShell script, you can use the New-AzureHDInsightHiveJobDefinition and Start-AzureHDInsightJob cmdlets, or you can use the Invoke-AzureHDInsightHiveJob cmdlet (which can be abbreviated to Invoke-Hive). Generally, when the purpose of the script is simply to retrieve and display the results of a Hive SELECT query, the Invoke-Hive cmdlet is the preferred option because it requires significantly less code.
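For comparison, the longer job-definition approach might look something like the following. This is a minimal sketch rather than code from the guide: it assumes the Azure PowerShell module is installed and a subscription is already selected, the cluster name and query are placeholders, and the one-hour timeout is an arbitrary choice. It also uses Wait-AzureHDInsightJob and Get-AzureHDInsightJobOutput, which the text above does not mention, to wait for the job and fetch its results.

```powershell
# Sketch only: the cluster name and query are placeholder values.
$clusterName = "cluster-name"
$hiveQL = "SELECT obs_date, avg(temperature) FROM observations GROUP BY obs_date;"

# Define the Hive job, then submit it to the cluster.
$jobDef = New-AzureHDInsightHiveJobDefinition -Query $hiveQL
$job = Start-AzureHDInsightJob -Cluster $clusterName -JobDefinition $jobDef

# Block until the job finishes (the timeout is an assumed, arbitrary value).
Wait-AzureHDInsightJob -Job $job -WaitTimeoutInSeconds 3600

# Retrieve the query results from the job's standard output.
Get-AzureHDInsightJobOutput -Cluster $clusterName -JobId $job.JobId -StandardOutput
```

Defining, starting, waiting for, and reading the job are four separate steps here, which is the extra code that Invoke-Hive avoids when the script only needs to display query results.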

The Invoke-Hive cmdlet can be used with a Query parameter to specify a hard-coded HiveQL query, or with a File parameter that references a HiveQL script file stored in Azure blob storage. The following code example uses the Query parameter to execute a hard-coded HiveQL query.

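# Replace with the name of your HDInsight cluster.
$clusterName = "cluster-name"
$hiveQL = "SELECT obs_date, avg(temperature) FROM observations GROUP BY obs_date;"
# Select the cluster that subsequent Invoke-Hive calls will run against.
Use-AzureHDInsightCluster $clusterName
# Run the query and display the results.
Invoke-Hive -Query $hiveQL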

Figure 1 shows how the results of this query are displayed in the Windows PowerShell ISE.

Figure 1 - Using the Invoke-Hive cmdlet in the Windows PowerShell ISE
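
The File parameter can be used in much the same way as the Query parameter. The sketch below assumes that a HiveQL script has already been uploaded to the blob storage used by the cluster. The container name, storage account name, and script path are hypothetical placeholders, and the wasb:// address format is an assumption about how the script is referenced.

```powershell
# Replace with the name of your HDInsight cluster.
$clusterName = "cluster-name"

# Hypothetical location of a HiveQL script file in Azure blob storage.
$scriptFile = "wasb://container-name@storage-account.blob.core.windows.net/scripts/query.hql"

# Select the cluster, then run the script file instead of an inline query.
Use-AzureHDInsightCluster $clusterName
Invoke-Hive -File $scriptFile
```

Keeping the HiveQL in a script file means the query can be changed without editing the PowerShell script that runs it.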
