Share via


Transact-SQL: Get the first and last entry added per user

Introduction

In this article, we will explain two different queries to get the first and last entry added in a specific table per user.
We will also compare the 2 queries from the below perspectives

  • Result.
  • Disk Usage.
  • Execution time.

Scenario

Consider you have a [Transaction] table to store multiple transaction for different users daily!

 The [Transaction] table structure is simple like the below one:

Now, you need to

  • Get First transaction added per user.

  • Get Last transaction added per user.

SQL: Get the first transaction added per user | Inner Join Approach

First, we will get all required field from the main table [Transaction] as shown below:

SELECT [Transactions].UserName,
        [Transactions].TransDate,
        [Transactions].Amount
FROM [dbo].[Transactions]

The result is

Second, we will get all records aggregated with Minimum function on the [TransDate] grouped by [UserName] field.

SELECT [UserName],
       Min([TransDate]) AS  MinDate
FROM [dbo].[Transactions]
GROUP BY  [UserName]

The result is

Additionally, we will use "WITH" clause to give the above query a block name (MinTransTable)

WITH MinTranstable AS
  (SELECT [UserName],
  min([TransDate]) AS  MinDate
   FROM [dbo]. [Transactions]
   GROUP BY  [UserName])

Finally, we will perform inner join between main [Transaction] table and [MinTransTable] as the following:

WITH MinTranstable AS
  (SELECT [UserName],
      min([TransDate]) AS  MinDate
   FROM [dbo]. [Transactions]
   GROUP BY  [UserName])
SELECT [Transactions].UserName,
        [Transactions].TransDate,
        [Transactions].Amount
FROM [dbo].[Transactions]
INNER JOIN MinTranstable ON  [Transactions].username = MinTranstable.username
AND [Transactions].TransDate = MinTranstable.MinDate

The final result is

SQL: Get the first transaction added per user | row_number() Approach

Before we getting started, let's first know:

 What's the row_number() function ?
  • It's a function that assigns a sequential integer to each row within a partition of a result set.
  • You can use it with "PARTITION BY" to divide the result set into partitions per specific column, actually, it's an optional part, if you don't use it, the function will treat all rows of the query result set as a single group.

For more details, please check ROW_NUMBER (Transact-SQL).
Now, let's go through our query, 

Get all rows from [Transaction]  table with a sequential row number for each "[UserName]" ordered by "[Transaction Date]" Ascending.

SELECT row_number() over(PARTITION BY UserName;     ORDER BY  TransDate asc) AS  ID,
        [UserName],
        [TransDate],
        [Amount]
FROM [dbo].[Transactions] )

The result is

Use "WITH" clause to give the above query a block name (MinTransTable)

WITH MinTransTable AS
  (SELECT row_number() over(PARTITION BY UserName
        ORDER BY  TransDate ASC) AS  ID,
[UserName],
[TransDate],
           [Amount]
   FROM [dbo]. [Transactions])

Now, get all record filtered by the sequential number ID= 1 and the [Transaction date] is not null

WITH MinTransTable AS
  (SELECT row_number() over(PARTITION BY UserName
ORDER BY  TransDate ASC) AS  ID,
           [UserName],
         [TransDate],
           [Amount]
   FROM [dbo]. [Transactions])
SELECT *
FROM MinTransTable
WHERE MinTransTable.ID = 1
  AND MinTransTable.[TransDate] IS NOT NULL

**The final result is **

SQL: Get the last transaction added per user | Inner Join Approach

Actually, it's the same query with a bit different, instead of using Min() , we will use Max() on the date field!
So the final query should be 

WITH MaxTranstable AS
  (SELECT [UserName],
        MAX([TransDate]) AS  MaxDate
   FROM [dbo]. [Transactions]
   GROUP BY  [UserName])
SELECT [Transactions].UserName,
        [Transactions].TransDate,
        [Transactions].Amount
FROM [dbo].[Transactions]
INNER JOIN MaxTranstable ON  [Transactions].username = MaxTranstable .username
AND [Transactions].TransDate = MaxTranstable .MaxDate

The final result is

SQL: Get the last transaction added per user | row_number() Approach

Again, we will use the same query of getting the first entry, but we will order by the transaction date Descending.

WITH MaxTransTable AS
  (SELECT row_number() over(PARTITION BY UserName
ORDER BY  TransDate DESC) AS  ID,
[UserName],
[TransDate],
           [Amount]
   FROM [dbo]. [Transactions])
SELECT *
FROM MaxTransTable 
WHERE MaxTransTable .ID = 1
  AND MaxTransTable .[TransDate] IS NOT NULL

The final result is
**
**

Compare the performance between 2 queries

In this section, we will answer an important question in your head right now, Which query should you use?
Actually, there are different ways to measure the performance and execution time for your query

  1. The execution plan.
  2. SQL commands.

You can include actual execution plan during running your query from the tool bar as shown below:


Once you activate the "Actual execution plan", you will get a new tab in the result window that show in details the behavior and performance of your query as shown below:


You can save the execution plan and compare it with another execution plan by right-clicking on a blank area of the execution plan and click "Compare Showplan".

Note

 In this article, we don't cover the execution plan in details, but you can learn more about it at Execution Plans

You can also append a SQL commands to your query to display information about the IO and Time as the following:

SET STATISTICS IO ON

It displays information about the amount of disk activity generated by Transact-SQL statements.

SET STATISTICS TIME ON

It displays the number of milliseconds required to parse, compile, and execute each statement.

DBCC DROPCLEANBUFFERS

It cleans all the buffers from the buffer pool to make sure that no pages for this table are loaded in memory.

Let's now append the above commands to our query to measure the execution time as the following:

Query (1)
SET STATISTICS  IO ON
SET STATISTICS  TIME ON
DBCC DROPCLEANBUFFERS ;
 
WITH MinTranstable AS
  (SELECT [UserName],
min([TransDate]) AS  MinDate
   FROM [dbo]. [Transactions]
   GROUP BY  [UserName])
SELECT [Transactions].UserName,
        [Transactions].TransDate,
        [Transactions].Amount
FROM [dbo].[Transactions]
INNER JOIN MinTranstable ON  [Transactions].username = MinTranstable.username
AND [Transactions].TransDate = MinTranstable.MinDate
Query (1) Result
(5 row(s) affected)
Table 'Transactions'. Scan count 2, logical reads 12, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
 
(1 row(s) affected)
 
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 207 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.
 
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
Query (2)
SET STATISTICS  IO ON
SET STATISTICS  TIME ON
DBCC DROPCLEANBUFFERS 
;
WITH MinTransTable AS
  (SELECT row_number() over(PARTITION BY UserName
ORDER BY  TransDate ASC) AS  ID,
[UserName],
[TransDate],
           [Amount]
   FROM [dbo]. [Transactions])
SELECT *
FROM MinTransTable
WHERE MinTransTable.ID = 1
  AND MinTransTable.[TransDate] IS NOT NULL
Query (2) Result
(4 row(s) affected)
Table 'Transactions'. Scan count 1, logical reads 1, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
 
(1 row(s) affected)
 
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 78 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.
 
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

As per the above result, the second query is faster than the first query

Conclusion

In this article, we have explained how to

  • Get the first and last entry added per each user in SQL.
  • Measure the execution time for SQL query,
  • Compare two execution plans.

References

See Also