SQL Server query: SELECT MIN OVER PARTITION BY

I'm having a hard time understanding this SELECT ... OVER (PARTITION BY ...) statement. I have read about it but still don't get it. What's the benefit of using:

SELECT MIN(Field) OVER (PARTITION BY OtherField) as Value
FROM MYTABLE

Can you briefly describe what this does, when you would use it, and why? What is the benefit, and what are you trying to achieve when you write something like this?



Solution 1:[1]

The OVER (PARTITION BY OtherField) clause turns MIN() into a window function. The main idea is to aggregate within partitions without reducing the number of rows the query returns.

Window functions are often at least as fast as the equivalent join/aggregation query, although that depends on the indexes and the execution plan. This is a rather simple case, so the performance might be essentially the same.
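For comparison, a join/aggregation version of the query from the question might look like the sketch below. It reuses the names MYTABLE, Field and OtherField from the question; the aliases m and g are only illustrative, and it assumes OtherField contains no NULLs (the window version keeps NULL rows as one partition, while an equality join drops them).

```sql
-- Sketch: join/aggregation equivalent of
--   SELECT MIN(Field) OVER (PARTITION BY OtherField) AS Value FROM MYTABLE
-- Assumption: OtherField is never NULL, otherwise the join loses those rows.
SELECT g.Value
FROM MYTABLE AS m
JOIN (
    -- One row per OtherField with its minimum Field
    SELECT OtherField, MIN(Field) AS Value
    FROM MYTABLE
    GROUP BY OtherField
) AS g
    ON g.OtherField = m.OtherField;  -- copy the group minimum back onto every row
```

The window form says the same thing in a single pass over MYTABLE, without the derived table and the join back.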

Let me show an example:

CREATE TABLE dbo.Duration (startDate datetime2, endDate datetime2);

INSERT INTO dbo.Duration (startDate, endDate)  VALUES ('2007-05-06 16:42:09', '2007-05-07 11:10:08');  
INSERT INTO dbo.Duration (startDate, endDate)  VALUES ('2007-05-07 08:30:09', '2007-05-07 12:12:43');  
INSERT INTO dbo.Duration (startDate, endDate)  VALUES ('2007-05-07 11:35:09', '2007-05-07 17:13:39');  
INSERT INTO dbo.Duration (startDate, endDate)  VALUES ('2007-05-07 08:30:10', '2007-05-07 12:12:43');  
INSERT INTO dbo.Duration (startDate, endDate)  VALUES ('2007-05-07 08:30:11', '2007-05-07 12:12:43');  
INSERT INTO dbo.Duration (startDate, endDate)  VALUES ('2007-05-07 08:30:12', '2007-05-07 12:12:43');  

Now we can partition by the endDate field and get the MIN() of startDate within each endDate partition:

SELECT
    MIN(dr.startDate) OVER (PARTITION BY dr.endDate) AS MinDateByPartitionOfEndDate,
    dr.endDate
FROM dbo.Duration dr;

Pay attention to the rows with endDate 2007-05-07 12:12:43: every one of them shows 2007-05-07 08:30:09.0000000, the minimum startDate in that partition. The query result is not reduced; each row is still returned, but it carries the MIN value of its partition.

OUTPUT:

MinDateByPartitionOfEndDate    endDate
2007-05-06 16:42:09.0000000    2007-05-07 11:10:08.0000000
2007-05-07 08:30:09.0000000    2007-05-07 12:12:43.0000000
2007-05-07 08:30:09.0000000    2007-05-07 12:12:43.0000000
2007-05-07 08:30:09.0000000    2007-05-07 12:12:43.0000000
2007-05-07 08:30:09.0000000    2007-05-07 12:12:43.0000000
2007-05-07 11:35:09.0000000    2007-05-07 17:13:39.0000000

And here is a GROUP BY query to show the main difference between GROUP BY and PARTITION BY: GROUP BY reduces the query result to one row per group:

SELECT
    MIN(dr.startDate) AS MinDateByGroupBy,
    dr.endDate
FROM dbo.Duration dr
GROUP BY dr.endDate;

OUTPUT:

MinDateByGroupBy                    endDate
2007-05-06 16:42:09.0000000     2007-05-07 11:10:08.0000000
2007-05-07 08:30:09.0000000     2007-05-07 12:12:43.0000000
2007-05-07 11:35:09.0000000     2007-05-07 17:13:39.0000000
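As for when this is useful: because every row is kept, each row can be compared with the minimum of its own partition in the same SELECT. The sketch below runs against the dbo.Duration table above; the column alias SecondsAfterEarliestStart is a name made up for this illustration.

```sql
-- Sketch: how many seconds after the earliest startDate of its endDate
-- partition did each row start?
SELECT
    dr.startDate,
    dr.endDate,
    DATEDIFF(
        SECOND,
        MIN(dr.startDate) OVER (PARTITION BY dr.endDate),  -- per-partition minimum
        dr.startDate                                        -- this row's own value
    ) AS SecondsAfterEarliestStart
FROM dbo.Duration dr;
```

For the four rows ending at 2007-05-07 12:12:43 this gives 0, 1, 2 and 3 seconds, and 0 for the two rows that are alone in their partition. With GROUP BY alone the per-row startDate is no longer available, so the same comparison would need a join back to the table.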

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
