Hi @Peter Bishop , Welcome to Microsoft Q&A,
Your problem should be caused by encoding. The DT_TEXT
data type in SSIS is usually stored in Unicode (UTF-16) format. When you extract JSON data as DT_TEXT
, it is most likely stored as UTF-16, where each character is represented by two bytes (hence there are empty CHAR(0)
between each visible character)
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
BlobColumn blobColumn = Row.jsonData;
int blobLength = Convert.ToInt32(blobColumn.Length);
byte[] blobData = blobColumn.GetBlobData(0, blobLength);
// Convert from UTF-16 (default) to UTF-8
string stringData = Encoding.Unicode.GetString(blobData); // SSIS default encoding
byte[] utf8Bytes = Encoding.UTF8.GetBytes(stringData); // Convert to UTF-8
string utf8String = Encoding.UTF8.GetString(utf8Bytes); // Ensure it's a proper UTF-8 string
// Assign cleaned-up string back to the output column
Row.CleanJsonData = utf8String;
}
Best Regards,
Jiale
If the answer is the right solution, please click "Accept Answer" and kindly upvote it. If you have extra questions about this answer, please click "Comment".
Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.