Accessing your Data on Remote Servers for DQS Operations by Linking Servers
Data Quality Services (DQS) supports performing data quality operations on your source data only if it is available in:
- SQL Server table/view in the same SQL Server instance as the DQS databases
- Microsoft Excel or CSV file
What if your source data is available on a remote SQL Server database or in some third-party databases such as Oracle or DB2 on a remote computer? Well, you can use the Linked Server feature in SQL Server to access data from these remote sources for DQS operations. The Linked Server feature in SQL Server enables you to remotely access data from diverse sources. For detailed information, see Linked Servers.
DISCLAIMER: Remote access of source data for DQS operations is not tested, and therefore officially not supported by Microsoft. You can use the information in this article at your own risk to use your source data in remote server for DQS operations.
In this article, we will illustrate how you can access data in a table in a remote SQL Server database for DQS operations by using linked servers.
IMPORTANT: For accessing your data from a remote SQL Server database for DQS operations using linked server, it is assumed that a DQS role is assigned to your Windows account and your Windows account has access to the source data in the remote SQL Server instance.
Step 1: Define Remote Server as a Linked Server on DQS Computer
This section provides information about defining the remote server as a linked server in the SQL Server instance where DQS databases are available. To do so:
- Start SQL Server Management Studio and connect to the SQL Server instance where DQS databases are available (that is where you installed Data Quality Server).
- Expand your SQL Server instance, expand Server Objects, right-click the Linked Servers folder, and then click New Linked Server.
- In the New Linked Server dialog box, provide the network name of the remote server (that has the SQL Server instance containing your source data) in the Linked server field, and select SQL Server as the server type.
- Next, click the Security page on the left, and then click the Be made using the login’s current security context radio button on the right. This is required to ensure that the DQS client token credential (Windows authentication) is used to connect to the linked server.
- Click OK in the New Linked Server dialog box to create the linked server.
Step 2: Create a View Pointing to Source Table on the Remote Server
In this section, we will create a view in the DQS_STAGING_DATA to point to a table containing source data in a remote SQL Server instance that we linked to in step 1.
- Start SQL Server Management Studio and connect to the SQL Server instance where DQS databases are available (that is where you installed Data Quality Server).
- Expand your SQL Server instance, expand Databases, right-click the DQS_STAGING_DATA database, and then click New Query.
- In the Query Editor window, type the following SQL statements:
CREATE VIEW ViewName
AS
SELECT * FROM [RemoteServerName].[DatabaseName].[SchemaName].[TableName] - Press F5 to execute the statements. This will create a view with the specified name in the DQS_STAGING_DATA database.
- Next, test the view by running the following command against the DQS_STAGING_DATA database to validate if its returning the same data as the underlying table on the remote server:
SELECT * FROM ViewName
Step 3: Perform DQS Activities against the Linked View
While performing knowledge discovery, cleansing, or matching activities using the Data Quality Client application, select the DQS_STAGING_DATA as your source database and the linked view that you created in step 3 as your source view in the mapping stage: