다음을 통해 공유


T-SQL: Random Equal Distribution

 

Introduction

Recent question in MSDN Transact-SQL forum Update value in one table with value from looping through another table prompted writing of this article and exploration of the equal random distribution problem.

Problem Definition

In the aforementioned thread the topic starter wanted to randomly distribute claims among the personnel. Each user needed to receive an equal (or nearly equal) amount of claims from the unassigned pool of claims.

In order to simulate the environment I created the following tables and populated them with this code:

IF OBJECT_ID(N'Claims', N'U') IS  NOT NULL  DROP TABLE  dbo.Claims;
 
CREATE TABLE  [dbo].[Claims] (
    [ClaimID] INT  identity(1, 1)
    ,[PaidAmt] [money] NOT NULL
    ,[PaidDate] [datetime] NULL
    ,[Notes] [varchar](max) NULL
    ,[CompleteUser] INT  NULL
    ,[CompleteDate] [datetime] NULL
    ,[Tracking] [bit] NULL
    ,[FollowUpUser] INT  NULL
    ,[FollowUpDate] [datetime] NULL
    ,CONSTRAINT [PK_Claims] PRIMARY KEY  CLUSTERED ([ClaimID] ASC) WITH  (
        PAD_INDEX = OFF
        ,STATISTICS_NORECOMPUTE = OFF
        ,IGNORE_DUP_KEY = OFF
        ,ALLOW_ROW_LOCKS = ON
        ,ALLOW_PAGE_LOCKS = ON
        ) ON  [PRIMARY]
    ) ON  [PRIMARY]
 
DECLARE @Loop INT = 1;
 
WHILE @Loop <= 520
BEGIN
    INSERT INTO  dbo.Claims (PaidAmt)
    VALUES (@Loop * 10)
 
    SET @Loop = @Loop + 1;
END
IF OBJECT_ID(N'Staff', N'U') IS  NOT NULL  DROP TABLE  dbo.Staff;
 
CREATE TABLE  dbo.Staff (
    UserID INT  identity(1, 1)
    ,FirstName VARCHAR(25)
    ,LastName VARCHAR(30)
    ,CONSTRAINT [PK_Staff] PRIMARY KEY  CLUSTERED ([UserID] ASC) WITH  (
        PAD_INDEX = OFF
        ,STATISTICS_NORECOMPUTE = OFF
        ,IGNORE_DUP_KEY = OFF
        ,ALLOW_ROW_LOCKS = ON
        ,ALLOW_PAGE_LOCKS = ON
        ) ON  [PRIMARY]
    ) ON  [PRIMARY]
 
SET @Loop = 1;
 
WHILE @Loop <= 30
BEGIN
    INSERT INTO  dbo.Staff (
        FirstName
        ,LastName
        )
    VALUES (
        'FName ' + cast(@Loop AS  VARCHAR(10))
        ,'LName ' + cast(@Loop AS  VARCHAR(10))
        );
 
    SET @Loop = @Loop + 1;
END

As you see, with the code above we created Staff and Claims tables where Staff has 30 rows and Claims has 520 rows. You may want to use different numbers for Staff and Claims tables, the solution still will be the same.

Now, given the above tables we want to update FollowUpUser and FollowUpDate columns assigning random claims to the users (for users we would order them in the UserID order, so first 10 users will get one extra claim).

Solution

To achieve random distribution of claims we would order them based on NewID() function. This is well-known trick of producing random order. I would also mention that we can not use RAND() function directly to achieve random order as in case of RAND() function every row will get the same number.

DECLARE @TotalUsers INT = (
        SELECT count(*)
        FROM dbo.Staff
        );
 
WITH cteClaims
AS (
    SELECT *
        ,row_number() OVER (
            ORDER BY  newID()
            ) AS  Rn
    FROM Claims
    )
    ,cteUsersOrdered
AS (
    SELECT *
        ,row_number() OVER (
            ORDER BY  UserId
            ) AS  Rn
    FROM dbo.Staff
    )
MERGE cteClaims AS  Target
USING cteUsersOrdered AS  Source
    ON (Target.Rn - 1) % @TotalUsers + 1 = Source.Rn
WHEN MATCHED
    THEN
        UPDATE
        SET FollowUpUser = Source.UserID, 
        FollowUpDate = CURRENT_TIMESTAMP;

Using this formula: (ClaimsRow - 1) % Number Of Users + 1 we achieve the desired result where all claims are distributed.

In order to verify our results we can use the following select statement:

SELECT COUNT(C.ClaimID)  as  claimsCount, S.UserID,
S.FirstName, S.LastName from  dbo.Claims C
INNER JOIN dbo.Staff S  ON  C.FollowUpUser = S.UserID
GROUP BY  S.UserID, S.FirstName, S.LastName
ORDER by  claimsCount DESC;

which produced desired result:


Conclusion

This article demonstrated solution for a common problem of random equal distribution using set-based approach. 


See Also


This article participated in the TechNet Guru for May 2014 competition and won the Gold Prize.