Behavior Change When Handling Character Conversions SQL Server’s ODBC Driver (SQL 2012 – Version 11.xx)
Summary
The SQL Server 2012 ODBC Driver (SQLNCLI11.dll) updated handling of SQL_WCHAR* (NCHAR/NVARCHAR/NVARCHAR(MAX)) and SQL_CHAR* (CHAR/VARCHAR/NARCHAR(MAX)) conversions. The change was made to accommodate surrogate pair behavior as outlined at: https://technet.microsoft.com/en-us/library/ms143179(v=SQL.110).aspx
“In versions of SQL Server prior to SQL Server 2012, string functions did not recognize surrogate pairs as a single character. Some string operations – such as string length calculations and substring extractions – returned incorrect results. SQL Server 2012 now fully supports UTF-16 and the correct handling of surrogate pairs.”
ODBC functions, such as SQLGetData, SQLBindCol, SQLBindParameter, may return (-4) SQL_NO_TOTAL as the length/indicator parameter when using the SQL Server 2012 driver when prior versions of the SQL Server ODBC driver returned a length value (which may not have been correct in all cases).
More Information
SQLGetData Behavior
Many Windows APIs allow you to call the API with a buffer size of 0 and the returned length is the size needed to store the data returned from the API. The following pattern is common for Win32 programmers (minus error handling for clarity).
int iSize = 0;
BYTE * pBuffer = NULL;
GetMyFavoriteAPI(pBuffer, &iSize); // Returns needed size in iSize
pBuffer = new BYTE[iSize]; // Allocate buffer
GetMyFavoriteAPI(pBuffer, &iSize); // Retrieve actual data
Some applications made the assumption that SQLGetData provides the same capabilities.
int iSize = 0;
WCHAR * pBuffer = NULL;
SQLGetData(hstmt, SQL_W_CHAR, ...., (SQLPOINTER*)0x1, 0, &iSize); // Get storage size needed
pBuffer = new WCHAR[(iSize/sizeof(WCHAR)) + 1]; // Allocate buffer
SQLGetData(hstmt, SQL_W_CHAR, ...., (SQLPOINTER*)pBuffer, iSize, &iSize); // Retrieve data
Note: The ODBC specification clearly outlines that SQLGetData can only be called to retrieve chunks of actual data. Calling it using this paradigm is relying on bug behavior.
Let’s look at a specific example of the driver change if you are using the incorrect logic shown above. The application is specifically querying a varchar column and binding as Unicode (SQL_UNICODE/SQL_WCHAR)
Query: select convert(varchar(36), ‘123’)
SQLGetData(hstmt, SQL_WCHAR, ….., (SQLPOINTER*) 0x1, 0 , &iSize); // Attempting to determine storage size needed
Driver Version |
Length or Indicator Outcome |
Description |
< 11.xxx |
6 |
The driver made the incorrect assumption that converting CHAR to WCHAR could be accomplished as length * 2. |
>= 11.xxx |
-4 (SQL_NO_TOTAL) |
The driver logic no longer makes the assumption that converting from CHAR to WCHAR or WCHAR to CHAR is a (multiply) *2 or (divide)/2 action is valid. As such, calling SQLGetData will no longer return the length of the expected conversion as prior driver versions may have provided. The driver detects the conversion between CHAR to WCHAR or WCHAR to CHAR and returns (-4) SQL_NO_TOTAL instead of the *2 or /2 behavior that could be incorrect. |
The proper way to use SQLGetData would be to retrieve the chunks of the data. Pseudo code shown below.
while( (SQL_SUCCESS or SQL_SUCCESS_WITH_INFO) == SQLFetch(...) )
{
SQLNumCols(...iTotalCols...)
for(int iCol = 1; iCol < iTotalCols; iCol++)
{
WCHAR* pBufOrig, pBuffer = new WCHAR[100];
SQLGetData(.... iCol … pBuffer, 100, &iSize); // Get original chunk
while(NOT ALL DATA RETREIVED (SQL_NO_TOTAL, ...) )
{
pBuffer += 50; // Advance buffer for data retreived
// May need to realloc the buffer when you reach current size
SQLGetData(.... iCol … pBuffer, 100, &iSize); // Get next chunk
}
}
}
SQLBindCol Behavior
Query: select convert(varchar(36), ‘1234567890’)
SQLBindCol(… SQL_W_CHAR, …) Only bound a buffer of WCHAR[4] – Expecting String Data Right Truncation behavior
Driver Version |
Length or Indicator Outcome |
Description |
< 11.xxx |
20 |
· SQLFetch reports String Data Right Truncation · Length indicates the complete length of data returned, not what was stored (assumes *2 CHAR to WCHAR conversion which may not be correct for glyphs) · Data stored in buffer is 123\0 - Buffer is guaranteed to be NULL terminated |
>= 11.xxx |
-4 (SQL_NO_TOTAL) |
· SQLFetch reports String Data Right Truncation · Length indicates -4 (SQL_NO_TOTAL) because the rest of the data was not converted · Data stored in buffer is 123\0 - Buffer is guaranteed to be NULL terminated |
SQLBindParameter (OUTPUT Parameter Behavior)
Query: create procedure spTest @p1 varchar(max) OUTPUT
select @p1 = replicate(‘B’, 1234)
SQLBindParameter(… SQL_W_CHAR, …) // Only bind up to first 64 characters
Driver Version |
Length or Indicator Outcome |
Description |
< 11.xxx |
2468 |
· SQLFetch returns no more data available · SQLMoreResults returns no more data available · Length indicates the size of the data returned from server, not stored in buffer · Original buffer contains 63 B’s and NULL terminator - Buffer is guaranteed to be NULL terminated |
>= 11.xxx |
-4 (SQL_NO_TOTAL) |
· SQLFetch returns no more data available · SQLMoreResults returns no more data available · Length indicates (-4) SQL_NO_TOTAL because the rest of the data was not converted · Original buffer contains 63 B’s and NULL terminator - Buffer is guaranteed to be NULL terminated |
Handling CHAR and WCHAR Conversions
The SQL Server 2012 ODBC Driver provides you with several ways to accommodate this change. In fact, if you are handling blobs (varchar(max), nvarchar(max), …) the logic is similar.
· If you bind (SQLBindCol or SQLBindParameter) the data is saved or truncated into the specified buffer
· If you do NOT bind you can retrieve the data in chunks using SQLGetData, SQLParamData, … functions
I asked ' Kamil Sykora - Senior Escalation Engineer' to help me explain this behavior.
Drivers MAY return the total remaining length or they MAY NOT – driver specific. If the driver returns a length, the app is free to use that to allocate the buffer of an appropriate size. If it returns SQL_NO_TOTAL it should chunk it with a fixed size buffer. https://msdn.microsoft.com/en-us/library/windows/desktop/ms715441(v=vs.85).aspx
1. Places the length of the data in *StrLen_or_IndPtr. If StrLen_or_IndPtr was a null pointer, SQLGetData does not return the length.
· For character or binary data, this is the length of the data after conversion and before truncation due to BufferLength. If the driver cannot determine the length of the data after conversion, as is sometimes the case with long data, it returns SQL_SUCCESS_WITH_INFO and sets the length to SQL_NO_TOTAL.
I tested this with the new driver, it supports ODBC 3.8 extensions allowing us to stream output parameter data using SQLGetData. This is not directly related to this change but allows us to chunk the output parameter without having to know the size ahead of time. Here’s a sample with some output:
OdbcHandle& dbc = getDbc(handles);
wstring driverOdbcVersion(32, L' ');
dbc.check(SQLGetInfo(dbc.h, SQL_DRIVER_ODBC_VER, &driverOdbcVersion[0] , driverOdbcVersion.size(), NULL));
bool use38 = false;
if (wstring::npos != driverOdbcVersion.find(L"3.8")) {
use38 = true;
}
val.resize(100, L' ');
int paramToken = 222; //app-defined identifier, can be anything
if (use38) {
//binding as SQL_WCHAR allows the converted remaining length to be returned
//stmt.check(SQLBindParameter(stmt.h, 1, SQL_PARAM_OUTPUT_STREAM, SQL_C_WCHAR, SQL_WCHAR, 1234, 0, (SQLPOINTER) paramToken, 0, &ind));
stmt.check(SQLBindParameter(stmt.h, 1, SQL_PARAM_OUTPUT_STREAM, SQL_C_WCHAR, SQL_CHAR, 1234, 0, (SQLPOINTER) paramToken, 0, &ind));
} else {
stmt.check(SQLBindParameter(stmt.h, 1, SQL_PARAM_OUTPUT, SQL_C_WCHAR, SQL_CHAR, 1234, 0, &val[0], val.size(), &ind));
}
stmt.check(ret = SQLExecDirect(stmt.h, &sqlProc[0], SQL_NTS), "Convert SQLExecDirect 3 failed");
wcout << L"SQLExecDirect: " << mRet.str(ret) << endl;
ret = SQLFetch(stmt.h); //this might fail if there are no actual results returned by the stored proc
wcout << L"SQLFetch: " << mRet.str(ret) << endl;
if (SQL_SUCCESS != ret) {
stmt.handleError(ret);
}
stmt.check(ret = SQLMoreResults(stmt.h));
wcout << L"SQLMoreResults: " << mRet.str(ret) << endl;
if (use38 && SQL_PARAM_DATA_AVAILABLE == ret) {
int outParamToken = NULL;
stmt.check(ret = SQLParamData(stmt.h, (SQLPOINTER*) &outParamToken));
wcout << L"SQLParamData: " << mRet.str(ret) << endl;
if (outParamToken == paramToken ) { //for multiple params, this needs to be checked so we retrieve the right parameter
stmt.check(ret = SQLGetData(stmt.h, 1, SQL_C_WCHAR, &val[0], val.size(), &ind), "Convert SQLGetData (parameter, initial) failed");
while (SQL_SUCCESS == ret || SQL_SUCCESS_WITH_INFO == ret) {
wcout << L" Indicator: " << mInd.str(ind) << L", Value: " << val.c_str() << endl;
stmt.check(ret = SQLGetData(stmt.h, 1, SQL_C_WCHAR, &val[0], val.size(), &ind), "Convert SQLGetData(parameter, loop) failed");
}
}
} else {
wcout << val.c_str() << endl;
}
SQLExecDirect: SQL_SUCCESS
SQLFetch: SQL_SUCCESS
SQLMoreResults: SQL_PARAM_DATA_AVAILABLE
SQLParamData: SQL_PARAM_DATA_AVAILABLE
Indicator: SQL_NO_TOTAL, Value: BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
Indicator: SQL_NO_TOTAL, Value: BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
…
Indicator: SQL_NO_TOTAL, Value: BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
Indicator: 18, Value: BBBBBBBBB
I also inadvertently found that if you bind the parameter as SQL_C_WCHAR/SQL_WCHAR (see commented out in code above), you still get the remaining length. I believe this length should be correct even with surrogate pairs since binding it this way causes the parameter to be sent as Unicode by SQL Server, thus the correct length is available to the driver:
SQL Profiler event with SQL_C_WCHAR/SQL_WCHAR binding:
declare @p1 nchar(1234)
set @p1=N'BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB'
exec spTestOutVarchar @p1 output
select @p1
SQLExecDirect: SQL_SUCCESS
SQLFetch: SQL_SUCCESS
SQLMoreResults: SQL_PARAM_DATA_AVAILABLE
SQLParamData: SQL_PARAM_DATA_AVAILABLE
Indicator: 2468, Value: BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
…
Indicator: 116, Value: BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
Indicator: 18, Value: BBBBBBBBB
Bob Dorr - Principal SQL Server Escalation Engineer