'Random sorting algorithm with spacing logic for duplicates
I need to sort some records based off of one field in SQL Server, which in isolation wouldn't be too difficult using NEW_ID() to generate a random value with which to sort. The complexity that makes this harder is that the business rule for any duplicate values states that they must be spaced at least 10 records apart (gap of 9) OR as far apart as is possible given the constraints of the current record set. Therefore if there are less than 10 records in total then the duplicates should be spaced as far apart as they can be. To further complicate matters there can be multiple duplicate values and a duplicate value can appear any number of times in the result set.
As for recordset size - anywhere between 5 and 100 records would the minimum and maximum range.
Now I understand that it will be very difficult in some scenarios for an algorithm to handle all scenarios and in some cases duplicates may end up being next to one another (e.g. 2 records, with a duplicate). There is the option of some user review after the algorithm has generated a new order so any conflicts can be dealt with by manually adjusting the final output. This should be the exception rather than the rule as manual review is time intensive and should only be needed in certain scenarios.
To illustrate some scenarios:
- A simple example would be
| Id | Name |
|---|---|
| 1 | Name A |
| 2 | Name B |
| 3 | Name C |
| 4 | Name D |
| 5 | Name E |
The routine must be able to randomise the Name column to produce something like:
(All output abbreviated to cut down question size - the result does not need to be comma separated just a row based resultset in SQL)
3,1,4,5,2
- With Duplicate
| Id | Name |
|---|---|
| 1 | Name A |
| 2 | Name B |
| 3 | Name A |
| 4 | Name C |
| 5 | Name E |
Should produce
1,5,2,4,3 where 1 and 3 are spaced as far apart as possible and the rest of the result set is randomised in the middle.
- More records (one duplicate)
| Id | Name |
|---|---|
| 1 | Name A |
| 2 | Name B |
| 3 | Name C |
| 4 | Name E |
| 5 | Name E |
| 6 | Name F |
| 7 | Name G |
| 8 | Name H |
| 9 | Name I |
| 10 | Name J |
| 11 | Name K |
| 12 | Name A |
| 13 | Name M |
| 14 | Name N |
| 15 | Name O |
| 16 | Name J |
Output something like:
3,1,5,16,11,4,7,14,8,9,2,12,15,10,6,13 (where 1 and 12 are spaced at least 10 apart)
Any ideas if this is possible in SQL Server and how it could be approached?
Solution 1:[1]
SELECT id ,
name,
ROW_NUMBER() OVER(partition by name order by id) as r
FROM table
ORDER BY r , id
example 2:
Id Name r
1 Name A 1
3 Name A 2
2 Name B 1
4 Name C 1
5 Name E 1
and it must be sorted by r and id
answer:
Id Name r
1 Name A 1
2 Name B 1
4 Name C 1
5 Name E 1
3 Name A 2
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | O'tkir Xo'jayev |
