Share via


Fixing Missing Data Based on Prior Row Information

One of the commonly asked problems in the Transact-SQL forum is how to provide missing information based on the information in the first prior row that has data (or alternatively in the next row (by date)). One of the examples where this problem was discussed is this thread.

In this thread the original poster was kind enough to provide DDL and the DML (data sample), so it was easy to define a solution based on the OUTER APPLY

CREATE TABLE  [dbo].[test_assign] (
    [name] [varchar](25) NULL
    ,[datestart] [date] NULL
    ,[dateEnd] [date] NULL
    ,[assign_id] [int] IDENTITY(1, 1) NOT NULL
    ,CONSTRAINT [PK_test_assign] PRIMARY KEY  CLUSTERED ([assign_id] ASC) WITH  (
        PAD_INDEX = OFF
        ,STATISTICS_NORECOMPUTE = OFF
        ,IGNORE_DUP_KEY = OFF
        ,ALLOW_ROW_LOCKS = ON
        ,ALLOW_PAGE_LOCKS = ON
        ) ON  [PRIMARY]
    ) ON  [PRIMARY]
 
CREATE TABLE  [dbo].[test_measure] (
    [name] [varchar](25) NULL
    ,[measurementDate] [date] NULL
    ,[measure_id] [int] IDENTITY(1, 1) NOT NULL
    ,CONSTRAINT [PK_test_measure] PRIMARY KEY  CLUSTERED ([measure_id] ASC) WITH  (
        PAD_INDEX = OFF
        ,STATISTICS_NORECOMPUTE = OFF
        ,IGNORE_DUP_KEY = OFF
        ,ALLOW_ROW_LOCKS = ON
        ,ALLOW_PAGE_LOCKS = ON
        ) ON  [PRIMARY]
    ) ON  [PRIMARY]
 
INSERT INTO  Test_Measure (
    NAME
    ,Measurementdate
    )
SELECT 'Adam'
    ,'1/1/2001'
 
INSERT INTO  Test_Measure (
    NAME
    ,Measurementdate
    )
SELECT 'Adam'
    ,'2/2/2002'
 
INSERT INTO  Test_assign (
    NAME
    ,DateStart
    ,DateEnd
    )
SELECT 'Adam'
    ,'1/15/2001'
    ,'12/31/2001'
 
INSERT INTO  Test_assign (
    NAME
    ,DateStart
    ,DateEnd
    )
SELECT 'Adam'
    ,'2/15/2002'
    ,'12/31/2002'
 
INSERT INTO  Test_assign (
    NAME
    ,DateStart
    ,DateEnd
    )
SELECT 'Adam'
    ,'3/15/2003'
    ,'12/31/2003'
 
 
-- Solution starts now
SELECT TA.*
    ,M.MeasurementDate
FROM Test_Assign TA
OUTER APPLY (
    SELECT TOP  (1) *
    FROM Test_Measure TM
    WHERE TM.NAME = TA.NAME
        AND TM.MeasurementDate <= TA.Datestart
    ORDER BY  TM.MeasurementDate DESC
    ) M

The idea of this solution is to use correlated OUTER APPLY subquery to get first measurement date that is prior the Start date of the main table.

A similar problem is also described in this thread and the solution will also be a variation of CROSS APPLY solution. So, you can see that this problem is very common.


See Also


This entry participated in the TechNet Guru contributions for June contest.