Compartir a través de


SQL Server Yukon DTS: Success in cleaning CRM data

I was just reading an article at TechnologyEvaluation.com about the difficulty in maintaining data quality for CRM applications.  The extremely volatile nature of CRM activity leads to degradation in the data such as:

  • Customer details that are incorrect or inconsistent with other data
  • Duplicate records
  • Multiple database synchronization problems

Duplicate records is noted as being the most common issue.  This can be caused, for example, by spelling variations resulting in duplicate records for the same customer.  As I was reading this I recalled what I had learned about the new Data Transformation Services (DTS) feature of SQL Server Yukon.  Yukon's DTS will include data cleaning technology such as fuzzy lookup and fuzzy grouping.  Fuzzy lookup involves an input row of data whose columns are mapped to the columns of a given table of existing data.  So you may have an existing Customer table and now you have a new candidate row to insert into the Customer table.  Fuzzy lookup involves scanning the candidate row's values and comparing them against existing data to check for similarities so that the data can be “fixed up” so that they are consistent.  For example, if the candidate Customer's company name is equal to “Micrsooft” and it finds that there exists a current Customer with a company name of “Microsoft”, it will classify that as a high-scoring match all allow the user to take the appropriate actions either manually or through an automatic action.  Fuzzy grouping allows you to find fuzzy duplicates in a given result set in a similar way to fuzzy lookups.  This allows you to fix up data that already exists.  As you can see, these DTS features match well with solving the data quality issues of CRM.  I'm sure Yukon will be taken advantage of in this sector.

This posting is provided "AS IS" with no warranties, and confers no rights.

Comments

  • Anonymous
    April 15, 2004
    I believe that CRM data problems are due to a couple of reasons:

    1. Lack of database referential integrity rules. There is a reason these things exist yet we continue to choose not use these SQL Server capabilities citing "performance" as a reason. Most customer I know prefer somethng that is a little slower if they don't have to solve complicated referential integrity problems. Ask support which problems are some of the most haunting and they'll tell you it is referential integrity. When the RI goes south it goes south.

    2. Most CRM applications are implemented with a disconnected client. The disconnected client will have replication conflicts that users are forced to resolve. Most users are not qualified to resolve these issues and choose the wrong option (this a new record).

    This fuzzy grouping mentioned above seems be a good step in the right direction to address #2 above. But the real problem is that we need to be smarter in a disconnected (also called autonomous) client so the user doesn't run into replication conflicts and does not even have the oppurtunity to make a mistake.

  • Anonymous
    June 18, 2004
    At TechEd I attended a general session called "whistle stop tour of SQL Server 2005" (Yukon). DTS was positioned as just one element in a stack of subsystems supporting Yukon based Business Intelligence applications. Others included Analysis Services and Reporting Services.

    Two particular examples related to the fuzzy grouping functionality were called out:

    One was a new form of validation logic for user input: Not hard constraints such as the format of a ZIP code or Social Security Number, but soft constraints such as 15 years of professional experience at age 20. One can envision many such plausibility checks in business applications like CRM to further increase data quality.

    Another was the use of cluster analysis or segmentation to drive customer campaigns. The more you understand different customer profiles relative to your entire customer population the more successful you will be running targeted campaigns.

    An overview of Business Intelligence and Data Warehousing in SQL Server 2005 can be found <a href="http://www.microsoft.com/technet/prodtechnol/sql/yukon/evaluate/dwsqlsy.mspx">here</a>.

  • Anonymous
    January 07, 2005
    Some new blogs and articles...

  • Anonymous
    November 26, 2007
    PingBack from http://feeds.maxblog.eu/item_545900.html

  • Anonymous
    March 15, 2008
    PingBack from http://bordercrossingstatsblog.info/from-the-depths-sql-server-yukon-dts-success-in-cleaning-crm-data/

  • Anonymous
    May 29, 2009
    PingBack from http://paidsurveyshub.info/story.php?title=from-the-depths-sql-server-yukon-dts-success-in-cleaning-crm-data