SQL Script: Finding duplicates on supposedly unique values
Unsupported: Use at your own risk
Sometimes you load data into metaverse that is supposed to be unique and you do not want to use FindMVEntries technique due to performance impact.
This SQL scripts allow you to find out directly from DB if you have unexpected duplicates so you can talk to data store owners to go and fix their data.
The following example looks for repeated mailNickName attribute values in metaverse. Just replace “mailNickName” with “SAMAccountName”, “uid” or whatever attribute you have that should be unique:
SELECT mailNickName, COUNT(mailNickName) AS NumOccurrences
FROM FIMSynchronizationService.dbo.mms_metaverse with (nolock)
where object_type = 'person'
GROUP BY mailNickName
HAVING ( COUNT(mailNickName) > 1 )
order by NumOccurrences desc
You could also use this technique to find the opposite, that is, records with a particular value that occur exactly once:
SELECT email
FROM FIMSynchronizationService.dbo.mms_metaverse with (nolock)
GROUP BY email
HAVING ( COUNT(email) = 1 )
However, the previous queries give just details about the number of records that you have duplicates or unique. Finding out more details about the metaverse records require more elaborated SQL query.
In the following example, we first find duplicated sAMAccountName values and store in temporary “tblDups” table, that then we use to join with full metaverse (“MV”) and show “displayName” and “object_id” attributes.
select tblDups.sAMAccountName, MV.displayName, MV.object_id
FROM
(
SELECT sAMAccountName, COUNT(sAMAccountName) AS NumOccurrences
FROM FIMSynchronizationService.dbo.mms_metaverse with (nolock)
where object_type = 'person'
GROUP BY sAMAccountName
HAVING ( COUNT(sAMAccountName) > 1 )
) as tblDups
INNER JOIN
FIMSynchronizationService.dbo.mms_metaverse AS MV WITH (nolock)
ON
tblDups.sAMAccountName = MV.sAMAccountName
You can also add WHERE sentences at the end of the query to find specific values, such as those sAMAccountNames duplicates that belong to HR department:
select tblDups.sAMAccountName, MV.displayName, MV.object_id
FROM
(
SELECT sAMAccountName, COUNT(sAMAccountName) AS NumOccurrences
FROM FIMSynchronizationService.dbo.mms_metaverse with (nolock)
where object_type = 'person'
GROUP BY sAMAccountName
HAVING ( COUNT(sAMAccountName) > 1 )
) as tblDups
INNER JOIN
FIMSynchronizationService.dbo.mms_metaverse AS MV WITH (nolock)
ON
tblDups.sAMAccountName = MV.sAMAccountName
WHERE MV.department = 'HR'