Building an HQL IDE for Apache Hive

My new team (https://www.hadooponazure.com/) works on making
Windows Server & Windows Azure the best environment for hosting Hadoop.
As one of my new challenges, I’ve been tasked with building an editor / job manager for Apache Hive.
Shown below is my first draft of a terminal for running and saving Hive jobs using HQL.

Please direct all comments/feedback to PhaniRaj AT Microsoft DOT COM

This is what the editor looks like:

[Screenshot: HiveQueryEditor]

Let’s start by creating a connection to your local Hive server.

[Screenshot: CreateNewConnection]

Once the connection is created, we connect to the Hive server using the Microsoft ODBC Driver for Hive.
We’ll soon move the console over to Apache’s new Templeton APIs for metadata access and job submission.
You can learn more about Templeton here: https://people.apache.org/~thejas/templeton_doc_latest/

[Screenshot: LoadingMetadata]
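
To give a feel for metadata access over Templeton, here is a rough Python sketch that only builds the REST URL for listing the tables in a database; it makes no network call. The URL path and the default port 50111 are taken from the Templeton documentation as best understood here and should be checked against your installation. The host name and user name are placeholders. This is an illustration, not the editor's code, which talks to Hive over ODBC today.

```python
from urllib.parse import quote, urlencode


def templeton_tables_url(host, user, database="default", port=50111):
    """Build a Templeton URL that lists the tables in one Hive database.

    Assumptions: the /templeton/v1/ddl/database/<db>/table path and the
    default port 50111 come from the Templeton docs; verify them locally.
    """
    # Percent-encode the database name so it is safe inside a URL path.
    path = f"/templeton/v1/ddl/database/{quote(database, safe='')}/table"
    # Templeton identifies the caller through the user.name query parameter.
    query = urlencode({"user.name": user})
    return f"http://{host}:{port}{path}?{query}"


# Placeholder host and user, for illustration only.
print(templeton_tables_url("localhost", "hadoop"))
```

A client would issue an HTTP GET against this URL and parse the JSON response to fill in the metadata view.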

Support for basic Hive Metadata visualization

We visualize the Hive metadata as a hierarchical tree view.

[Screenshot: metadata_tree]
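
One simple way to think about such a tree is as nested dictionaries: database, then table, then columns. The Python sketch below groups flat metadata rows into that shape and prints it as an indented outline. The row layout and the sample table and column names are made up for illustration; the editor's real data model is not described in this post.

```python
from collections import defaultdict


def build_metadata_tree(rows):
    """Group (database, table, column, type) rows into nested dicts."""
    tree = defaultdict(lambda: defaultdict(list))
    for database, table, column, col_type in rows:
        tree[database][table].append((column, col_type))
    # Convert to plain dicts so missing keys raise instead of being created.
    return {db: dict(tables) for db, tables in tree.items()}


def render_tree(tree):
    """Render the tree as an indented text outline, one node per line."""
    lines = []
    for db in sorted(tree):
        lines.append(db)
        for table in sorted(tree[db]):
            lines.append(f"  {table}")
            for column, col_type in tree[db][table]:
                lines.append(f"    {column} ({col_type})")
    return "\n".join(lines)


# Hypothetical sample metadata.
rows = [
    ("default", "hivesampletable", "clientid", "string"),
    ("default", "hivesampletable", "querytime", "string"),
    ("default", "weblogs", "url", "string"),
]
tree = build_metadata_tree(rows)
print(render_tree(tree))
```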

A query editor for HQL that supports syntax coloring, auto completion and other fun activities.

The IDE hosts an editor that supports auto completion & syntax coloring for HQL keywords & functions.
You can find HQL’s language specification here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual

Clicking on a table name in the tree view above produces a sample query that selects the first 10 rows from the table.

[Screenshot: SampleQueryFromTable]

You can edit this query or clear it and start over.
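
Generating that kind of sample query is a small exercise in string building. The Python sketch below shows one way to do it; the function name and the identifier check are assumptions made for this example, and the exact query text the editor produces may differ from this output.

```python
import re

# Accept only plain identifiers so a table name cannot inject extra HQL.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def sample_query(table, database=None, limit=10):
    """Return an HQL query that selects the first `limit` rows of a table."""
    names = [table] if database is None else [database, table]
    for name in names:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"not a plain identifier: {name!r}")
    return f"SELECT * FROM {'.'.join(names)} LIMIT {int(limit)};"


# Hypothetical table name.
print(sample_query("hivesampletable"))
```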

HQL Code Snippets for common tasks in Hive

We’ve seeded the editor with some code snippets for common tasks.
Below is an example of the “Create External Table” code snippet.

[Screenshot: CodeSnippets1]

[Screenshot: CodeSnippets2]
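
Since the snippet itself is only visible in the screenshots, here is an approximation of what a "Create External Table" template can expand to, written as a small Python helper that fills in the blanks. The statement follows the Hive DDL syntax for delimited text files. The helper name, table name, columns, and path are invented for illustration and are not the editor's actual snippet.

```python
def create_external_table(table, columns, location, delimiter=","):
    """Return a CREATE EXTERNAL TABLE statement for delimited text files.

    `columns` is a list of (name, hive_type) pairs.
    """
    column_defs = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        f"  {column_defs}\n"
        f")\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{location}';"
    )


# Hypothetical table layout and HDFS path.
ddl = create_external_table(
    "weblogs", [("ip", "STRING"), ("hits", "INT")], "/data/weblogs"
)
print(ddl)
```

Because the table is external, dropping it in Hive removes only the table definition and leaves the files at the given location in place.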

Auto completion support for Hive functions in the editor.

[Screenshot: HiveFunctionsAutocompletion]

Auto completion support for Hive keywords in the editor.

The editor supports about 163 HQL keywords.

[Screenshot: HiveKeywordAutocompletion]
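
Keyword completion of this kind usually comes down to a case-insensitive prefix match against a sorted word list. The Python sketch below shows the idea with a binary search; it uses a small subset of HQL keywords rather than the full list, and it is an illustration, not the editor's implementation.

```python
import bisect

# A small subset of HQL keywords, kept sorted so binary search works.
HQL_KEYWORDS = sorted([
    "BY", "CLUSTERED", "CREATE", "DISTRIBUTE", "EXTERNAL", "FROM", "GROUP",
    "INSERT", "JOIN", "LIMIT", "LOCATION", "ORDER", "OVERWRITE",
    "PARTITIONED", "SELECT", "SORT", "TABLE", "WHERE",
])


def complete_keyword(prefix, keywords=HQL_KEYWORDS):
    """Return every keyword that starts with `prefix`, ignoring case."""
    prefix = prefix.upper()
    if not prefix:
        return []
    # Jump to the first keyword that is >= prefix, then scan forward
    # while the keywords still share the prefix.
    start = bisect.bisect_left(keywords, prefix)
    matches = []
    for keyword in keywords[start:]:
        if not keyword.startswith(prefix):
            break
        matches.append(keyword)
    return matches


print(complete_keyword("c"))
```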

 

Metadata-sensitive auto completion support for queries.

We inject the metadata we gleaned from the Hive server into the editor so that
you can use IntelliSense on column names in your queries.

[Screenshot: Metadata_autocompletion]
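
As a simplified illustration of the idea, the Python sketch below finds the tables named after FROM or JOIN in the query text and offers their columns as completions. The regular expression, function names, and sample metadata are assumptions made for this example; it handles only unqualified table names and does not attempt real HQL parsing.

```python
import re

# Matches the identifier that follows FROM or JOIN (unqualified names only).
_TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_]\w*)", re.IGNORECASE)


def referenced_tables(query):
    """Return the lower-cased table names that follow FROM or JOIN."""
    return [name.lower() for name in _TABLE_REF.findall(query)]


def complete_column(query, prefix, metadata):
    """Suggest columns of the referenced tables that start with `prefix`.

    `metadata` maps a lower-cased table name to its list of column names.
    """
    prefix = prefix.lower()
    suggestions = []
    for table in referenced_tables(query):
        for column in metadata.get(table, []):
            if column.lower().startswith(prefix) and column not in suggestions:
                suggestions.append(column)
    return suggestions


# Hypothetical metadata gathered from the server.
metadata = {"hivesampletable": ["clientid", "country", "querytime", "devicemake"]}
print(complete_column("SELECT c FROM hivesampletable", "c", metadata))
```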

Once you have the query written, hit “Run Query” to kick off query execution.
We submit the Hive job and poll periodically until it completes, then fetch the results.

[Screenshot: QueryRunning]
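
The submit-then-poll pattern can be sketched in a few lines of Python. Here `get_status` stands in for whatever call reports the job state; it, the status strings, and the default interval and timeout are all hypothetical choices for this example rather than details of the editor.

```python
import time


def wait_for_job(get_status, poll_interval=2.0, timeout=600.0):
    """Poll `get_status` until the job succeeds, fails, or times out.

    `get_status` is a caller-supplied function returning a status string.
    The names "SUCCEEDED" and "FAILED" are assumptions for this sketch.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in ("SUCCEEDED", "FAILED"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status} after {timeout} seconds")
        # Wait before asking again so the server is not hammered.
        time.sleep(poll_interval)


# Simulated job that reports RUNNING twice and then finishes.
statuses = iter(["RUNNING", "RUNNING", "SUCCEEDED"])
result = wait_for_job(lambda: next(statuses), poll_interval=0.0)
print(result)
```

Returning the final status instead of raising on failure lets the caller decide which icon to show next to the query.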

Once the query finishes successfully, you will see the query icon light up.

[Screenshot: QuerySucceeded]

Clicking the query icon brings up the results in tabular format.

[Screenshot: QueryResults]

If, on the other hand, the query fails, you will see a “query failed” icon next to the query.

[Screenshot: QueryFailed]

Clicking this icon brings up a window with links to further details about why the query failed.

[Screenshot: QueryFailed_details]

In conclusion, I hope this is something that you find useful.
As always, your feedback and comments are welcome at the email address mentioned above.
There are many other features that we’re planning for this editor, and we will post regular updates as the work progresses.