'Combine cell strings when condition is matched, and remove duplicates?

I'm over my head here, but hoping someone can help.

Would like C3 to display all the strings from C15 to C60 for rows where column E matches B3, and remove duplicates / only display uniques.

Also, I would like D3 to display the strings from D15 to D60 as the prior question, however the cells potentially contain comma separated values - is there a way I can do as above but treat the comma separated values as separate strings?

screenshot and spreadsheet to download.

Thanks in advance for anyone taking a look.

Joe



Solution 1:[1]

excel have a function to remove duplicate data, did you try it?

just copy your origin data then remove duplicate.

enter image description here

Solution 2:[2]

Retrieve Possibly Separated Unique Data

Usage:

Copy the code into a standard module and in Excel use the following formulas:

in Cell C3 =CritJoe(C$15:C$60,$E$15:$E$60,$B3)

in Cell D3 =CritJoe(D$15:D$60,$E$15:$E$60,$B3,",")

The Code

''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' Purpose:  Retrieves unique ((comma) separated) (ResultSeparator) data
'           determined by a criteria (Criteria) in a specified column
'           (CriteriaRange) from another specified column (SourceRange)
'           possibly containing (comma) separated (StringSeparator) strings.
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Function CritJoe(SourceRange As Range, CriteriaRange As Range, _
  Criteria As String, Optional StringSeparator As String = "", _
  Optional ResultSeparator As String = ", ") As String

    Dim vntS            ' Source Array (1-based, 2-dimensional)
    Dim vntC            ' Criteria Array (1-based, 2-dimensional)
    Dim vntSS           ' Source String Array (0-based, 1-dimensional)
    Dim vntR            ' Resulting Array (0-based, 1-dimensional)
    Dim i As Long       ' Source & Criteria Array Elements Counter
    Dim j As Long       ' Resulting Array Elements Counter
    Dim k As Long       ' Source String Array Elements Counter
    Dim UB As Long      ' Current Resulting Array's Upper Bound
    Dim strS As String  ' Current Source String
    Dim strR As String  ' Resulting String

    ' Check if SourceRange and CriteriaRange have the same number of rows and
    ' have the same first row number.
    If SourceRange.Rows.Count <> CriteriaRange.Rows.Count Or _
      SourceRange.Rows(1).Row <> CriteriaRange.Rows(1).Row Then GoTo RowsError
    ' Note:  The relevant data has to be in the first column of each range
    ' if (accidentally) more columns have been selected.

    ' Copy first column of the Ranges to Arrays.
    vntS = SourceRange.Cells(1).Resize(SourceRange.Rows.Count)
    vntC = CriteriaRange.Cells(1).Resize(CriteriaRange.Rows.Count)

    ' Write relevant data to Resulting Array.
    For i = 1 To UBound(vntS)
        ' To avoid "Case/Space issues", "parts" of the following can be used:
        ' If Upper(Trim(vntC(i, 1))) = Upper(Trim(Criteria)) Then
        ' ... instead of the following line:
        If vntC(i, 1) = Criteria Then
            strS = vntS(i, 1)
            If StringSeparator <> "" Then
                ' Write Resulting String to Resulting Array using
                ' StringSeparator.
                GoSub SplitString
            Else
                ' Write Resulting String to Resulting Array without
                ' using StringSeparator.
                GoSub StringToArray
            End If
        End If
    Next

    ' Write relevant data from Resulting Array to Resulting String.
    If IsArray(vntR) Then
        strR = vntR(0)
        If UBound(vntR) > 0 Then
            For j = 1 To UBound(vntR)
                strR = strR & ResultSeparator & vntR(j)
            Next
        End If
    End If

    CritJoe = strR

Exit Function

' Write Resulting String to Resulting Array using StringSeparator.
SplitString:
    vntSS = Split(strS, StringSeparator)
    For k = 0 To UBound(vntSS)
        strS = Trim(vntSS(k))
        GoSub StringToArray
    Next
    Return

' Write Resulting String to Resulting Array.
StringToArray:
    If IsArray(vntR) Then
        ' Handle all except the first element in Resulting Array.
        UB = UBound(vntR)
        For j = 0 To UB
            If vntR(j) = strS Then Exit For
        Next
        If j = UB + 1 Then
            ReDim Preserve vntR(j): vntR(j) = strS
        End If
    Else
        ' Handle only first element in Resulting Array.
        ReDim vntR(0): vntR(0) = strS
    End If
    Return

RowsError:
    CritJoe = "Rows Error!"

End Function
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

Solution 3:[3]

As I wrote, it's relatively simple with Power Query, available in Excel 2010+. It takes longer to type out the steps then to create it :-).

All can be done from the UI

  • Remove the Scene and Video Columns
  • Split the Cast column by the comma delimiter
  • Select the Location and Cast columns and Transform/Trim to eliminate unwanted spaces
  • Select all three columns and Remove Rows/Duplicates
  • Group By Recording Block Operation:= All Rows
    • Name the Column Grouped

enter image description here

  • Add Custom Column for the Locations: Formula =

    List.Distinct(Table.Column([Grouped],"Location"))

    • note the new column name in the dialog box

enter image description here

  • Add another Custom Column for the Cast: Formula =

    List.Distinct(Table.Column([Grouped],"Cast"))

enter image description here

  • Select the double-headed arrow at the top of the new columns

    • Extract Values from the drop down and select a delimiter -- I used the Custom --> comma space
  • Close and Load to wherever you want on the worksheet.

enter image description here

Of note, this workbook can be saved as an xlsx file.

For interest, here is the M-Code generated.

M Code

    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Video", Int64.Type}, {"Scene", Int64.Type}, {"Location", type text}, {"Cast", type text}, {"Recording Block", type text}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Video", "Scene"}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Removed Columns", {{"Cast", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Cast"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Cast", type text}}),
    #"Trimmed Text" = Table.TransformColumns(#"Changed Type1",{{"Cast", Text.Trim, type text}, {"Location", Text.Trim, type text}}),
    #"Removed Duplicates" = Table.Distinct(#"Trimmed Text"),
    #"Grouped Rows" = Table.Group(#"Removed Duplicates", {"Recording Block"}, {{"Grouped", each _, type table [Location=text, Cast=text, Recording Block=text]}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Locations", each List.Distinct(Table.Column([Grouped],"Location"))),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Cast", each List.Distinct(Table.Column([Grouped],"Cast"))),
    #"Extracted Values" = Table.TransformColumns(#"Added Custom1", {"Cast", each Text.Combine(List.Transform(_, Text.From), ", "), type text}),
    #"Extracted Values1" = Table.TransformColumns(#"Extracted Values", {"Locations", each Text.Combine(List.Transform(_, Text.From), ", "), type text})
in
    #"Extracted Values1"

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ManhND
Solution 2 VBasic2008
Solution 3 Ron Rosenfeld