Azure HDInsight: PowerShell Commands
When writing PowerShell scripts, make it a habit to store values such as the cluster name and the storage account name in meaningfully named variables, and reuse those variables wherever the values are needed.
- To associate a Windows Azure Subscription:
- Add-AzureAccount
- To see the accounts that are already associated:
- Get-AzureAccount
- To define a variable and assign a value to it:
- $<VariableName> = "Value"
- For example, $subscriptionName = "Free Trial"
- To see your subscriptions:
- Get-AzureSubscription
- To assign the primary storage account key to a variable (this assumes $storageAccountName already holds the storage account name):
- $storageAccountKey = Get-AzureStorageKey $storageAccountName | %{ $_.Primary }
- To select the subscription to work with, which is mainly needed when you have added multiple subscriptions:
- Select-AzureSubscription $subscriptionName
- To define a MapReduce job, call the New-AzureHDInsightMapReduceJobDefinition cmdlet with the required parameters and assign the result to a variable. A sample is shown below:
- $MapReduceDefinition = New-AzureHDInsightMapReduceJobDefinition -JarFile "<location of JAR file (MapReduce)>" -Arguments "<Path to source file>", "<Path to output folder>"
- To execute the MapReduce job defined above, use the following command:
- $executeJob = Start-AzureHDInsightJob -Cluster <clustername> -JobDefinition $MapReduceDefinition
- To wait for the job to finish (or for the timeout to expire) and view its status, use:
- Wait-AzureHDInsightJob -Job $executeJob -WaitTimeoutInSeconds 3600
- To run a Hive job through PowerShell, first store the HiveQL statements in a variable:
- $CreateTableVar = "DROP TABLE WordCount;" + "CREATE EXTERNAL TABLE WordCount(c1 string, c2 int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' STORED AS TEXTFILE LOCATION '<Path to the source / part file>';"
- To define the Hive job from that query, use:
- $DefiningHive = New-AzureHDInsightHiveJobDefinition -Query $CreateTableVar
- To execute the Hive job, use:
- $hiveJob = Start-AzureHDInsightJob -Cluster <name of the cluster> -JobDefinition $DefiningHive
- Once the table is created, you can start querying its contents. Invoke-Hive runs against the cluster previously selected with Use-AzureHDInsightCluster. Use the following commands:
- Query without getting the status displayed
- Invoke-Hive -Query "SELECT * FROM WordCount LIMIT 10;"
- Query with the status getting displayed
- Invoke-Hive -Query @"
SELECT * FROM WordCount LIMIT 10;
"@
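The individual commands above can be chained into one script. What follows is only a minimal sketch, assuming the classic Azure PowerShell module used in this article is installed and Add-AzureAccount has already been run in the session. The function names New-WordCountTableQuery and Invoke-WordCountSample and their parameters are hypothetical helpers introduced for illustration; only the Azure cmdlets themselves come from the list above.

```powershell
# Hypothetical helper: builds the HiveQL used above for a given data location.
function New-WordCountTableQuery {
    param([Parameter(Mandatory = $true)] [string] $Location)

    "DROP TABLE WordCount;" +
    "CREATE EXTERNAL TABLE WordCount(c1 string, c2 int) " +
    "ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' " +
    "STORED AS TEXTFILE LOCATION '$Location';"
}

# Hypothetical wrapper: runs the MapReduce job, then creates the Hive table
# over its output. Nothing is sent to Azure until this function is called.
function Invoke-WordCountSample {
    param(
        [Parameter(Mandatory = $true)] [string] $SubscriptionName,
        [Parameter(Mandatory = $true)] [string] $ClusterName,
        [Parameter(Mandatory = $true)] [string] $JarFile,
        [Parameter(Mandatory = $true)] [string] $SourcePath,
        [Parameter(Mandatory = $true)] [string] $OutputPath
    )

    # Pick the subscription that owns the cluster.
    Select-AzureSubscription $SubscriptionName

    # Define, start, and wait for the MapReduce job (timeout: one hour).
    $mapReduceDefinition = New-AzureHDInsightMapReduceJobDefinition -JarFile $JarFile -Arguments $SourcePath, $OutputPath
    $mapReduceJob = Start-AzureHDInsightJob -Cluster $ClusterName -JobDefinition $mapReduceDefinition
    Wait-AzureHDInsightJob -Job $mapReduceJob -WaitTimeoutInSeconds 3600

    # Define, start, and wait for the Hive job that creates the table.
    $hiveDefinition = New-AzureHDInsightHiveJobDefinition -Query (New-WordCountTableQuery -Location $OutputPath)
    $hiveJob = Start-AzureHDInsightJob -Cluster $ClusterName -JobDefinition $hiveDefinition
    Wait-AzureHDInsightJob -Job $hiveJob -WaitTimeoutInSeconds 3600
}
```

Keeping the values in parameters follows the advice at the top of this article: the cluster name and paths are supplied once and reused, instead of being retyped in every command.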
This is a consolidated list of PowerShell commands, compiled from the Azure HDInsight guides.
The list of HDInsight PowerShell commands goes on and on; this article aims to cover as many HDInsight-specific commands as possible.