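How to access HDInsight Hive Server from a Java program through JDBC
Many HDInsight users need a Java client to run statements against the cluster's Hive Server or to submit jobs. The Azure portal already documents how to access Hive Server through ODBC (configuring the ODBC connection for the global environment, and configuring the ODBC connection for the China environment). This article describes how a Java program can access HDInsight Hive Server through JDBC.
First, note that for security reasons HDInsight uses SSL connections and listens on port 443. The following are example connection strings, the first for global Azure and the second for Azure China: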
jdbc:hive2://myclustername.azurehdinsight.net:443/default;ssl=true?hive.server2.transport.mode=http;hive.server2.thrift.http.path=/hive2
jdbc:hive2://myclustername.azurehdinsight.cn:443/default;ssl=true?hive.server2.transport.mode=http;hive.server2.thrift.http.path=/hive2
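The two connection strings differ only in the cluster name and the domain suffix, so they can be assembled from those two values. The following is a minimal sketch of that idea; the class name HiveJdbcUrl and the method build are hypothetical helpers introduced for illustration, not part of the Hive JDBC driver.

```java
// Hypothetical helper (not part of any library): assembles the HDInsight
// Hive JDBC URL from the cluster name and the Azure domain suffix.
final class HiveJdbcUrl {
    // Fixed parts taken from the example connection strings in this article.
    private static final String PREFIX = "jdbc:hive2://";
    private static final String SUFFIX =
        ":443/default;ssl=true?hive.server2.transport.mode=http;hive.server2.thrift.http.path=/hive2";

    private HiveJdbcUrl() {
    }

    // domainSuffix is "azurehdinsight.net" for global Azure
    // and "azurehdinsight.cn" for Azure China.
    static String build(String clusterName, String domainSuffix) {
        return PREFIX + clusterName + "." + domainSuffix + SUFFIX;
    }

    public static void main(String[] args) {
        System.out.println(build("myclustername", "azurehdinsight.net"));
        System.out.println(build("myclustername", "azurehdinsight.cn"));
    }
}
```

Running main prints the same two strings shown above, which makes it easier to switch one program between the global and the China environment.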
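The following steps use a Maven project as an example of accessing Hive Server:
1. Assuming Eclipse and Maven are already installed, run the following command to generate a project from the Maven template: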
mvn archetype:generate -DgroupId=com.microsoft.css -DartifactId=HiveJdbcTest -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
2. Edit pom.xml: the default POM file does not include the JAR dependencies that HDInsight requires, so add the following dependencies manually.
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>jdk.tools</groupId>
<artifactId>jdk.tools</artifactId>
<version>1.8</version>
<scope>system</scope>
<systemPath>${JAVA_HOME}/lib/tools.jar</systemPath>
</dependency>
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>0.14.0</version>
</dependency>
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-exec</artifactId>
<version>0.14.0</version>
</dependency>
<dependency>
<groupId>org.apache.calcite</groupId>
<artifactId>calcite-avatica</artifactId>
<version>0.9.2-incubating</version>
</dependency>
<dependency>
<groupId>org.apache.calcite</groupId>
<artifactId>calcite-core</artifactId>
<version>0.9.2-incubating</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.2.0</version>
</dependency>
</dependencies>
3. The following Java code queries the sample Hive table on HDInsight:
package HiveTest2.MyJDBCTest;

import java.sql.*;

public class App {
    public static void main(String[] args) throws SQLException {
        Connection conn = null;
        Statement stmt = null;
        ResultSet res = null;
        boolean failed = false;
        try {
            // Load the Hive JDBC driver.
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            // Note that HDInsight uses port 443 for SSL secure connections; the port forwarder
            // listening on 443 directs the traffic to hiveserver2 on port 10001.
            String connectionUrl = "jdbc:hive2://HDInclusterName.azurehdinsight.net:443/default;ssl=true?hive.server2.transport.mode=http;hive.server2.thrift.http.path=/hive2";
            conn = DriverManager.getConnection(connectionUrl, "HDIClusterUserName", "HDIUserPassword");
            stmt = conn.createStatement();
            String sql = "SELECT * FROM hivesampletable LIMIT 3";
            System.out.println("Running: " + sql);
            res = stmt.executeQuery(sql);
            while (res.next()) {
                // Print the first six columns of each row, separated by tabs.
                System.out.println(res.getString(1) + "\t" + res.getString(2) + "\t" + res.getString(3) + "\t" + res.getString(4) + "\t" + res.getString(5) + "\t" + res.getString(6));
            }
            System.out.println("Hive queries completed successfully!");
        } catch (Exception e) {
            // Covers SQLException as well as ClassNotFoundException from Class.forName.
            e.printStackTrace();
            failed = true;
        } finally {
            // Release JDBC resources in reverse order of creation.
            if (res != null) res.close();
            if (stmt != null) stmt.close();
            if (conn != null) conn.close();
        }
        // Exit only after cleanup: calling System.exit inside the catch block would skip the finally block.
        if (failed) System.exit(1);
    }
}
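The sample above prints exactly six columns by position. For tables with a different number of columns, the column count can be read from the result set metadata instead. The following is a minimal sketch using only standard java.sql interfaces; the class name RowFormatter and its two methods are hypothetical helpers introduced for illustration.

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: formats the current row of any ResultSet as a
// tab-separated line without hard-coding the number of columns.
final class RowFormatter {
    private RowFormatter() {
    }

    // Joins the values with tabs; a null value is rendered as the text "null",
    // which matches what string concatenation does in the sample above.
    static String joinColumns(List<String> values) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                sb.append('\t');
            }
            sb.append(values.get(i));
        }
        return sb.toString();
    }

    // Reads every column of the current row; JDBC column indexes start at 1.
    static String formatRow(ResultSet rs) throws SQLException {
        ResultSetMetaData meta = rs.getMetaData();
        List<String> values = new ArrayList<String>();
        for (int i = 1; i <= meta.getColumnCount(); i++) {
            values.add(rs.getString(i));
        }
        return joinColumns(values);
    }
}
```

With this helper, the body of the while loop in the sample could become System.out.println(RowFormatter.formatRow(res)); and would keep working if the query selected a different set of columns.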
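4. The following shows the results of running the code above against HiveServer on global Azure and on Azure China:
Note that for Azure China, use an OpenJDK build such as Azul JDK, as follows:
1. Download Azul JDK from the following address (on Windows, choose the installer listed under "Windows and Microsoft Azure"):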
https://www.azulsystems.com/products/zulu/downloads#Windows
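2. In the Eclipse development environment, reference Azul JDK from the following path:
Eclipse->Window->Preferences->Java->Installed JREs
If the Zulu package is not listed, go to the following path and search for the JRE: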
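To confirm which JDK a run configuration actually uses after changing the Installed JREs setting, the JVM can report its own vendor, version, and installation path through standard system properties. The following is a minimal sketch; the class name JvmInfo is a hypothetical name chosen for illustration, and the exact vendor text depends on the installed build.

```java
// Hypothetical helper: reports the JVM that is running this program, using
// the standard java.vendor, java.version, and java.home system properties.
final class JvmInfo {
    private JvmInfo() {
    }

    static String describe() {
        return System.getProperty("java.vendor")
            + " " + System.getProperty("java.version")
            + " (" + System.getProperty("java.home") + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the printed installation path does not point to the Zulu directory, the project is still running on a different JRE.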
Click the attachment below to get the complete Java Maven sample code.