Improving Import Performance of MIIS Extensible Management Agent (extMA)
This posting is provided "AS IS" with no warranties, and confers no rights. Use of included script samples are subject to the terms specified at https://www.microsoft.com/info/cpyright.htm
I am assuming that the reader is familiar with the process and concepts behind creation of an extMA. If not, please, take a look at my blog Walkthrough: How to build an extensible management agent for MIIS This blog should provide you with the necessary background on the concepts involved in building your own connector for MIIS.
During the early stages of developing a new extMA performance is not probably the first thing on your mind, you are mainly concerned with getting the basic functionality of an extMA working properly, and there is nothing wrong with this approach. But as you move your extMA out of your development environment and start testing it against some real production scenarios, the issues of performance become more important. Things that worked just fine in you lab environment with a dozen or a hundred test accounts suddenly don’t go as smoothly anymore when your extMA needs to handle for example 30,000 accounts. Of course, performance tuning is more of an art then a science and highly dependant on the system that you are connecting to. But there are some universal performance improving principals, one of which is minimizing number of round trips to database/directory in order to reduce the impact of network latencies incurred by each trip. While working on performance tuning of my extMA for Oracle Security Principals I was reminded of the importance of this principal and would like to share some of that experience in this blog.
Before you even start tuning, it is important to establish a reference point with respect to acceptable performance of your extMA.
In the case of extMA for Oracle Principals we are extracting information from several Oracle system views, for example DBA_USERS, DBA_ROLES, etc. So in order to establish a benchmark I simply created a SQL query that extracts similar information that would be consumed by MIIS and timed its execution while starting the query from Oracle SQL PLUS Worksheet. Of course prior to doing this you need to setup a test environment with a significantly large number of test accounts (in my case I have 10,000 user objects in Oracle and the query would complete in under 1 minutes). Based on this test I set my goal to have the extMA to perform full import stage only operation within 2-3 minutes. Obviously, it would not be realistic to expect our extMA to perform better then the native tools, but we should strive to be in the same range at least.
So when I ran the Oracle extMA against the same test data and the import took over 30 minutes, I knew that this extMA had some room for improvement.
Note: extMA always use file based import process, meaning that data from the remote system is first dumped into a text file (comma delimited, AVP, etc) and then MIIS uses that file to bring the data into the connector space. As a developer of an extMA you only have control over how quickly the data from the remote system ends up in the dump file, after that MIIS does its magic and you have little control over it. So when I said that my extMA took 30 minutes to process 10000 records most of those 30 minutes (about 25) were spent getting data into the dump file.
Good Practice: Always create a benchmark for your extMA performance by using the native tools of the system in question to extract the data to be consumed by MIIS.
So let’s take a look at the original (poorly performing) logic used to extract user information from Oracle and we will try to find some potential performance issues with this approach.
Public Shared Function enumerateUsers(ByVal dbConnection As OracleConnection, _
ByVal exportFile As StreamWriter)
Dim sqlDBA_USERS As String
sqlDBA_USERS = "select * from dba_users"
Dim cmdDBA_USERS As New OracleCommand(sqlDBA_USERS, dbConnection)
cmdDBA_USERS.CommandType = CommandType.Text
Dim rdDBA_USERS As OracleDataReader = cmdDBA_USERS.ExecuteReader
While rdDBA_USERS.Read
exportFile.WriteLine("ObjType: User")
exportFile.WriteLine("ID: " & rdDBA_USERS.Item("USERNAME"))
exportFile.WriteLine("USER_ID: " & rdDBA_USERS.Item("USER_ID"))
. . . . . .
Getting remaining user attributes, Removed here to conserve space
. . . .
Dim sqlProxy_User As String
sqlProxy_User = "select proxy from proxy_users where client = " & "'" & rdDBA_USERS("USERNAME") & "'"
Dim cmdProxy_User As New OracleCommand(sqlProxy_User, dbConnection)
cmdProxy_User.CommandType = CommandType.Text
Dim rdProxy_User As OracleDataReader = cmdProxy_User.ExecuteReader
While rdProxy_User.Read
exportFile.WriteLine("PROXY: " & rdProxy_User("PROXY"))
End While
rdProxy_User.Close()
. . . .
Getting quota information, very similar in concept to getting proxy info,
Removed to concerve space
. . . . .
exportFile.WriteLine()
End While
rdDBA_USERS.Close()
End Function
Let me explain the main points of this function:
Dim sqlDBA_USERS As String
sqlDBA_USERS = "select * from dba_users"
Dim cmdDBA_USERS As New OracleCommand(sqlDBA_USERS, dbConnection)
cmdDBA_USERS.CommandType = CommandType.Text
Dim rdDBA_USERS As OracleDataReader = cmdDBA_USERS.ExecuteReader
These 5 lines of code use the previously established connection to an Oracle server in order to retrieve a cursor which would contain entries from the DBA_USERS system view. So after completion of this portion of the function we get rdDBA_USERS cursor which contains information on Oracle users.
After we setup a loop to enumerate through the rdDBA_USERS cursor and dump information on Oracle Users into an AVP file later to be consumed by MIIS
While rdDBA_USERS.Read
exportFile.WriteLine("ObjType: User")
exportFile.WriteLine("ID: " & rdDBA_USERS.Item("USERNAME"))
exportFile.WriteLine("USER_ID: " & rdDBA_USERS.Item("USER_ID"))
. . . . . .
End While
Unfortunately, not all information pertaining to Oracle users is stored in DBA_USERS. For example information on Quota and Proxy is stored in other system views. Since we need to get this information as well, inside the loop we create another query to get the remaining information for the user currently being processed as part of the loop. Here is the portion of the function that extracts Oracle Proxy authentication information. Similar work needs to be done for Quota information and potentially other aspects of Oracle user security settings.
Dim sqlProxy_User As String
sqlProxy_User = "select proxy from proxy_users where client = " & "'" & rdDBA_USERS("USERNAME") & "'"
Dim cmdProxy_User As New OracleCommand(sqlProxy_User, dbConnection)
cmdProxy_User.CommandType = CommandType.Text
Dim rdProxy_User As OracleDataReader = cmdProxy_User.ExecuteReader
While rdProxy_User.Read
exportFile.WriteLine("PROXY: " & rdProxy_User("PROXY"))
End While
rdProxy_User.Close()
At first this function seems to be quite logical and at least on the surface there does not appear anything substantially wrong with it, but yet this peace of code produces results that were completely off the benchmark expectations that we established using native Oracle tools. So certainly there must be something we can do to improve this function.
After doing some reading on the behavior of Oracle cursors in ODP.NET, I realized that by default OracleDataReader brings only one record at a time from the database. See section on “Controlling the Number of Rows Fetched in One Database Round Trip” in Oracle Data Provider for .NET Developers Guide for more details. In other words, in the case of the 10,000 users the extMA will make 10,000 trips to the database, plus at least additional 10,000 for each query executed inside the loop. So it is no wonder that the import process would take such a long time. But luckily there is an easy way to change this behavior by manipulating the FetchSize property of the OracleDataReader object. So based on this my initial thought was be to modify my code like this:
Dim sqlDBA_USERS As String
sqlDBA_USERS = "select * from dba_users"
Dim cmdDBA_USERS As New OracleCommand(sqlDBA_USERS, dbConnection)
cmdDBA_USERS.CommandType = CommandType.Text
Dim rdDBA_USERS As OracleDataReader = cmdDBA_USERS.ExecuteReader
Dim rdDBA_USERS As OracleDataReader = cmdDBA_USERS.ExecuteReader
rdDBA_USERS.FetchSize = cmdDBA_USERS.RowSize() * 1000
By setting the FetchSize to cmdDBA_USERS.RowSize() * 1000 I was hoping to achieve some major performance gains, since I reduced the number of trips to the database for the DBA_USER data by 1000. Unfortunately, after running this modified code the results were almost the same, still about 30 min for full import. At this point I started to realize that the issue is probably related to the fact that I am performing queries from inside the loop, which still would cause at least 10000 trips to the database for each additional attribute not stored in DBA_USERS (for example Proxy and Quota data). Just as a test after commenting out the queries from the inside the loop the full import completed in just a few minutes, of course this was at a price of loosing some important information about Oracle user. I could not apply the same logic of using larger FetchSize when querying proxy_users, since such query would typically return only a few rows because we were limiting it to entries for a specific user.
sqlProxy_User = "select proxy from proxy_users where client = " & "'" & rdDBA_USERS("USERNAME") & "'"
Good Practice
At this point I realized that I needed to change my approach. Instead of querying multiple system views once per each user I created a join statement which would pull all of the information from Oracle in one of several trips.
Here is the modified portion of the function that brings all of the data in one cursor and does this in just a few trips:
Dim sqlDBA_USERS As String
sqlDBA_USERS = "select T1.*, T2.proxy, T3.tablespace_name, T3.MAX_BYTES
from dba_users T1, proxy_users T2, dba_ts_quotas T3
where T1.USERNAME = T2.client(+) and T1.USERNAME = T3.USERNAME(+) order by T1.USERNAME"
Dim cmdDBA_USERS As New OracleCommand(sqlDBA_USERS, dbConnection)
cmdDBA_USERS.CommandType = CommandType.Text
Dim rdDBA_USERS As OracleDataReader = cmdDBA_USERS.ExecuteReader
rdDBA_USERS.FetchSize = cmdDBA_USERS.RowSize() * 1000
So the resulting cursor will hold information in the following format (I omitted some attributes for clarity):
USERNAME PROXY TABLESPACE_NAME MAX_BYTES
------------------------------ ------------------------------ ------------------------------ ----------
AALFORT AARMITAGE SYSTEM 125829120
AALFORT AARMITAGE USERS 314572800
AALFORT AAPPLEBY SYSTEM 125829120
AALFORT AAPPLEBY USERS 314572800
AALFORT AALSOP SYSTEM 125829120
AALFORT AALSOP USERS 314572800
AALFORT AANGUS SYSTEM 125829120
AALFORT AANGUS USERS 314572800
AALFORT AALLENDORF SYSTEM 125829120
AALFORT AALLENDORF USERS 314572800
AALFORT AANNAN SYSTEM 125829120
AALFORT AANNAN USERS 314572800
Now that I had all the data in once cursor all I need is a way to properly enumerate through this cursor and dump the data into the export file.
Here is the code that accomplishes this:
Dim parentRecordID As String = String.Empty
While rdDBA_USERS.Read
Dim oraUser As oraUser
Dim oraUserID As String = rdDBA_USERS.Item("USERNAME").ToUpper
If Not oraUserID.Equals(parentRecordID.ToUpper) Then
parentRecordID = rdDBA_USERS.Item("USERNAME")
oraUser = New oraUser(rdDBA_USERS.Item("USERNAME"), exportFile)
oraUser.USER_ID = rdDBA_USERS.Item("USER_ID")
… some attributes removed to conserve space …
If Not IsDBNull(rdDBA_USERS("PROXY")) Then
oraUser.addProxy(rdDBA_USERS("PROXY"))
End If
If Not IsDBNull(rdDBA_USERS("MAX_BYTES")) Then
oraUser.addQuota(rdDBA_USERS("TABLESPACE_NAME"), rdDBA_USERS("MAX_BYTES"))
End If
If Not IsDBNull(rdDBA_USERS("PROXY")) Then
oraUser.addProxy(rdDBA_USERS("PROXY"))
End If
Else
If Not IsDBNull(rdDBA_USERS("MAX_BYTES")) Then
oraUser.addQuota(rdDBA_USERS("TABLESPACE_NAME"), rdDBA_USERS("MAX_BYTES"))
End If
If Not IsDBNull(rdDBA_USERS("PROXY")) Then
oraUser.addProxy(rdDBA_USERS("PROXY"))
End If
End If
End While
Let me explain the main logic points of this peace of code.
Dim parentRecordID As String = String.Empty
While rdDBA_USERS.Read
Dim oraUser As oraUser
Dim oraUserID As String = rdDBA_USERS.Item("USERNAME").ToUpper
If Not oraUserID.Equals(parentRecordID.ToUpper) Then
parentRecordID = rdDBA_USERS.Item("USERNAME")
oraUser = New oraUser(rdDBA_USERS.Item("USERNAME"), exportFile)
oraUser.USER_ID = rdDBA_USERS.Item("USER_ID")
As before we will go through each record stored in the cursor, but this time multiple records in the cursor may belong to the same user (see sample data shown earlier), hence the introduction of concept of a parent record. For example
AALFORT SYSTEM 125829120
AALFORT USERS 314572800
User AALFORT may have 2 quota entries; therefore this user will have 2 rows in the resulting cursor. Once we start processing the first record for AALFORT we would set parentRecord = AALFORT, so that when we get to the second entry for this user we will know to append the second quota entry to the same Oracle User object. Once we are done processing AALFORT and move to the next user the parentRecordID will now point to the Username of this new user. By the way the oraUser object is an abstraction of Oracle User record that I created in the extMA to simplify storage of the information while processing a record.
Else
If Not IsDBNull(rdDBA_USERS("MAX_BYTES")) Then
oraUser.addQuota(rdDBA_USERS("TABLESPACE_NAME"), rdDBA_USERS("MAX_BYTES"))
End If
If Not IsDBNull(rdDBA_USERS("PROXY")) Then
oraUser.addProxy(rdDBA_USERS("PROXY"))
End If
End If
In the case of the Else, we know that we are still processing the same record. During our first pass we already gathered everything from DBA_USERS view, so now we only need to deal with multi-value attributes such as Quota and Proxy.
Note: since the data in the cursor is stored in this format:
USERNAME PROXY TABLESPACE_NAME MAX_BYTES
------------------------------ ------------------------------ ------------------------------ ----------
AALFORT AARMITAGE SYSTEM 125829120
AALFORT AARMITAGE USERS 314572800
AALFORT AAPPLEBY SYSTEM 125829120
AALFORT AAPPLEBY USERS 314572800
We need to take measures in order not to add duplicate entries into the Proxy or Quota attributes. Therefore the ora.addProxy and ora.addQuota are implemented in such a way as not to allow duplicates. Here is a how addProxy is implemented
Private _PROXIES As New ArrayList
Sub addProxy(ByVal proxy As String)
If Not _PROXIES.Contains(proxy) Then
_PROXIES.Add(proxy)
_exportFile.WriteLine("PROXY: " & proxy)
End If
End Sub
With the above described modifications the extMA now performs full import stage only operation in under 3 minutes, most of which are spent by MIIS actually bringing the data into the CS. The input file is being produced in a matter of seconds. As you can see, by applying this fundamental performance guidance of reducing the number of round-trips to database we were able to achieve dramatic improvement in our extMA.
You can find the complete source code for this extMA at https://workspaces.gotdotnet.com/oraSecPrincMA
Simply ask to join the workspace and I will be happy to provide you with access to source code.
Comments
- Anonymous
June 16, 2009
PingBack from http://topalternativedating.info/story.php?id=15396