'How to optimally query the only first data row after a certain date?

I have table Clients with columns like this:

ClientId    LastChanged
671154  2012-07-21 14:35:40.780
671154  2012-07-21 14:35:41.630
671155  2012-07-21 14:37:24.213

I also have table ClientsHistory with columns like this:

ClientId    Changed                 Name                                    StateCode
671154      2012-07-21 14:35:40.780 91C6672EB2D4496AB34D3C                  22654624
671154      2012-07-21 14:35:41.630 55E345FE6A144B0E92A6026D8B461CEA        22654624
671154      2013-03-13 12:56:33.943 D32841A8EBAB44999C966D2F597DC240        22654624  

I want to query Clients And Only First Row in ClientsHistory after LastChanged date:

SELECT  (   SELECT  TOP 1
                    Name
            FROM    dbo.ClientsHistory
            WHERE   ClientId    = c.ClientId
            AND     Changed     > c.LastChanged
            ORDER   BY  [Changed]   ASC
        )   ,
        (   SELECT  TOP 1
                    StateCode
            FROM    dbo.СlientsHistory
            WHERE   ClientId    = c.ClientId
            AND     Changed     > c.LastChanged
            ORDER   BY  [Changed]   ASC
        )   ,
        c.ClientId,
        c.LastChanged
FROM    Clients c

How to rewrite this query to eliminate subquery for every field in ClientsHistory table?



Solution 1:[1]

You can use a CTE, the ranking function ROW_NUMBER and a LEFT JOIN:

WITH Hist AS
(
    SELECT c.ClientId, Changed, Name, StateCode, c.LastChanged,
           RN = ROW_NUMBER() OVER (PARTITION BY c.ClientId
                                   ORDER BY Changed DESC)
    FROM dbo.Clients c INNER JOIN dbo.ClientsHistory ch
         ON c.ClientId = ch.ClientId
    WHERE c.LastChanged > ch.Changed
)
SELECT c.ClientId, c.LastChanged, h.Changed, Name, StateCode
FROM dbo.Clients c LEFT JOIN Hist h
   ON c.ClientId = h.ClientId
   AND c.LastChanged = h.LastChanged  -- necessary since ClientId is not unique in hist
   AND h.RN = 1

Demo

Solution 2:[2]

To eliminate the need for a separate subquery for each column you could make use of APPLY like this:

SELECT ch.Name, 
        ch.StateCode, 
        c.ClientId,
        c.LastChanged
    FROM dbo.Clients c
    OUTER APPLY (SELECT TOP 1 * 
                 FROM  dbo.ClientsHistory
                 WHERE ClientId = c.ClientId 
                    AND Changed > c.LastChanged
                 ORDER BY Changed ASC) ch

Here, the subquery will be executed for each row in Clients.

Solution 3:[3]

This query must return the same result. Can you test that :

SELECT temp.Name,
       temp.StateCode,
       c.ClientId,
       c.LastChanged
FROM Clients c
INNER JOIN 
     (SELECT TOP 1 c2.ClientId 
             c2.Name, 
             c2.StateCode
      FROM Clients c2
      INNER JOIN dbo.ClientsHistory ch
          ON ch.ClientId = c2.ClientId 
          AND ch.Changed > c2.LastChanged
      ORDER   BY  [Changed]   ASC) temp 
  ON temp.ClientId = c.ClientId

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3 coytech