Need to connect to IMB DB2 from Azure databricks workspace

veerabhadra reddy kovvuri 181 Reputation points
2023-08-22T04:58:34.3733333+00:00

Need to connect to IMB DB2 (DB version IBM DB2 for i 7.3) form azure databricks workspace, the DB2 is hosted is AS400 server and DB port 8471. Please suggest on the step by step procedure to connect to DB and query a table, like installing required libraries/drivers in the cluster and commands to importing them and then to connect to DB using a userID and password.

Also let us know whatever suggestion that you provide will work for AS400 virtual machine as well or not.

Azure Databricks
Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
2,365 questions
{count} votes

Accepted answer
  1. Bhargava-MSFT 31,216 Reputation points Microsoft Employee
    2023-09-05T16:03:08.48+00:00

    Hello veerabhadra reddy kovvuri,

    I'm glad that you were able to resolve the issue, and thank you for posting the detailed steps so that others experiencing the same can easily reference this! Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others ", I'll repost your solution if you'd like to "Accept " the answer.

    Issue:

    How to connect to IMB DB2 hosted on AS400 from Azure data bricks workspace

    Solution:

    Maven library JT400.jar (net.sf.jt400:jt400:20.0.2) is available in the Library tab of Cluster (UI). Once this is installed, users can query the DB2 database in the AS400 server using the below commands.

    # Databricks notebook source
    from py4j.java_gateway import java_import
    java_import(spark._jvm, "com.ibm.as400.access.AS400JDBCDriver")
    
    # COMMAND ----------
    
    from pyspark.sql import SparkSession
    from py4j.java_gateway import java_import
    spark = SparkSession.builder.appName("DB2Connect").getOrCreate()
    
    properties = {     "user": "<USER_ID>",     "password": "<password>",     "driver": "com.ibm.as400.access.AS400JDBCDriver" }
    
    # Set up the JDBC URL
    url = "jdbc:as400://<server>:<port>/<database_name>"
    
    # COMMAND ----------
    df = spark.read.jdbc(url=url, table="<schema_name>.<table_name>", properties=properties)
    display(df)
    

    Thank you again for your time and patience throughout this.

    Please remember to "Accept Answer", so that others in the community facing similar issues can easily find the solution.

    0 comments No comments

1 additional answer

Sort by: Most helpful
  1. veerabhadra reddy kovvuri 181 Reputation points
    2023-09-01T03:42:54.8066667+00:00

    JT400.jar library (net.sf.jt400:jt400:20.0.2) is available in Library tab of Cluster (UI), which is a maven library. Once installed I was able to query DB2 database which is in AS400 server, successfully using below commands.

    #python notebook commands
    
    from py4j.java_gateway import java_import 
    java_import(spark._jvm, "com.ibm.as400.access.AS400JDBCDriver")
    
    from pyspark.sql import SparkSession 
    from py4j.java_gateway import java_import 
    spark = SparkSession.builder.appName("DB2Connect").getOrCreate() 
    
    properties = { "user": "<USER_ID>", "password": "<password>", "driver": "com.ibm.as400.access.AS400JDBCDriver" } 
    url = "jdbc:as400://<server>:<port>/<database_name>"
    
    df = spark.read.jdbc(url=url, table="<schema_name>.<table_name>", properties=properties)
    display(df)
    
    1 person found this answer helpful.

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.