'Regrouping groups in SQL Server

I am trying to solve the below problem, I have an existing dataset that is already grouped, but I need to it to be grouped further based on the common "SO_Number" in the dataset, example below:

Current Data:

Group_Key SO_Number
233738 SO21268046
233738 SO21269767
234129 SO21269767
234129 SO21274404
234129 SO21271542
234129 SO21274421
234421 SO21274421
234421 SO21276633
234421 SO21276877
88964 SO21276877
88964 SO21278203
88964 SO21278329
234727 SO21278329
234727 SO21279199
234727 SO21279542
91016 SO21279542
91016 SO21289940
88111 SO21289664
88111 SO21289665
88112 SO21289665
88112 SO21289677
88113 SO21289678

Expected Data output:

NewGroup_Key Group_key SO_number
233738,234129,234421,88964,234727,91016 233738 SO21268046
233738,234129,234421,88964,234727,91016 233738 SO21269767
233738,234129,234421,88964,234727,91016 234129 SO21269767
233738,234129,234421,88964,234727,91016 234129 SO21274404
233738,234129,234421,88964,234727,91016 234129 SO21271542
233738,234129,234421,88964,234727,91016 234129 SO21274421
233738,234129,234421,88964,234727,91016 234421 SO21274421
233738,234129,234421,88964,234727,91016 234421 SO21276633
233738,234129,234421,88964,234727,91016 234421 SO21276877
233738,234129,234421,88964,234727,91016 88964 SO21276877
233738,234129,234421,88964,234727,91016 88964 SO21278203
233738,234129,234421,88964,234727,91016 88964 SO21278329
233738,234129,234421,88964,234727,91016 234727 SO21278329
233738,234129,234421,88964,234727,91016 234727 SO21279199
233738,234129,234421,88964,234727,91016 234727 SO21279542
233738,234129,234421,88964,234727,91016 91016 SO21279542
233738,234129,234421,88964,234727,91016 91016 SO21289940
88111,88112 88111 SO21289665
88111,88112 88112 SO21289677
88113 88113 SO21289678

The expected data output I need should be in three groups instead of nine groups as they're all grouped by SO_Number - hence creating a new group key (NewGroup_Key) that will be used for new mapping of the data. Note that this is just a subset of the dataset so there are other groups that are involved as well. I have bolded the SO_number where each of the groups should be linked(grouped) in the "Current Data" table.

I have tried a few queries my end but didn't lead to anything reasonable or easy to follow using SQL. So any ideas would be helpful.



Solution 1:[1]

Hen you say 'clustering' are you referring to this?

declare @t TABLE (
   category int  NOT NULL 
  ,segment  NVARCHAR(50) NOT NULL
  ,payment  int  NOT NULL
);
INSERT INTO @t(category,segment,payment) VALUES (01,'A',1425);
INSERT INTO @t(category,segment,payment) VALUES (01,'B',7647);
INSERT INTO @t(category,segment,payment) VALUES (01,'A',6164);
INSERT INTO @t(category,segment,payment) VALUES (01,'B',3241);

--if you want payment column as string then use following

SELECT
category, 
segment,
    STRING_AGG(cast(payment as nvarchar(50)),'+') payment
FROM
   @t T
GROUP BY
    category,segment

Result:

enter image description here

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ASH