'How get a field contains which at least two matching words from other table in clickhouse

I would like to get results from the fields (path) in table URL where the last directory in the path field from URL table contains at least two matching words from the same row in the Locations table.

Table URL

WITH arrayJoin([
    ('/directory1/paris-oslo-london.htm', 0 ),
    ('/directory1/paris-oslo-granada.htm', 1 ),
    ('/directory1/london-moscow.htm', 2),
    ('/directory1/barcelona-moscow.htm', 3),
    ('/directory2/keyword1-keyword2-lisboa-london.htm', 4 ),
    ('/directory3/ny-anykeyword-milan.htm', 5),
    ('/directory3/anykeyword-berlin-milan-moscow', 6)] )as data
SELECT data.1 as path,data.2 as value

Table

path value
/directory1/paris-oslo-london.htm 0
/directory1/paris-oslo-granada.htm 1
/directory1/london-moscow.htm 2
/directory1/barcelona-moscow.htm 3
/directory1/keyword1-keyword2-lisboa-london.htm 4
/directory3/ny-anykeyword-milan.htm 5

Table Locations

WITH arrayJoin([
    ('paris','row 1'),
    ('oslo','row 2'),
    ('granada','row 3'),
    ('ny-paris-milan-granada','row 4'),
    ('london-moscow','row 5'),
    ('lisboa','row 6'),
    ('ny-aaaa-milan','row 7'),
    ('berlin-moscow-keyword3','row 8')] )as data
SELECT data.1 as location_name,data.2 as row_number

Table Locations

location_name row_number
paris 1
oslo 2
granada 3
ny-paris-milan-granada 4
london-moscow 5
lisboa 6
ny-aaaa-milan 7
berlin-moscow-keyword3 8

My sql

I try this SQL but dowcounts each keyword if exists in any rows, not in the same row

WITH  arrayFlatten(groupUniqArray(splitByNonAlpha(arrayJoin([
    ('paris','row 1'),
    ('oslo','row 2'),
    ('granada','row 3'),
    ('ny-paris-milan-granada','row 4'),
    ('london-moscow','row 5'),
    ('lisboa','row 6'),
    ('ny-aaaa-milan','row 7'),
    ('berlin-moscow-keyword3','row 8')] ).1))) as location_namee
SELECT arrayJoin([
    ('/directory1/paris-oslo-london.htm', 0 ),
    ('/directory1/paris-oslo-granada.htm', 1 ),
    ('/directory1/london-moscow.htm', 2),
    ('/directory1/barcelona-moscow.htm', 3),
    ('/directory2/keyword1-keyword2-lisboa-london.htm', 4 ),
    ('/directory3/ny-anykeyword-milan.htm', 5),
    ('/directory3/anykeyword-berlin-milan-moscow', 6)] ).1 as path,
    splitByNonAlpha(replaceRegexpAll(path,'^\/(.*)\/(.*)\.htm$',    '\\2')) as words_in_directory,
    arrayFilter(x -> has(location_namee,x),words_in_directory) as matchedWords

Result of my sql (not result desired)

path words_in_directory matchedWords
/directory1/paris-oslo-london.htm ['paris','oslo','london'] ['paris','oslo','london']
/directory1/paris-oslo-granada.htm ['paris','oslo','granada'] ['paris','oslo','granada']
/directory1/london-moscow.htm ['london','moscow'] ['london','moscow']
/directory1/barcelona-moscow.htm ['barcelona','moscow'] ['moscow']
/directory2/keyword1-keyword2-lisboa-london.htm ['keyword1','keyword2','lisboa','london'] ['lisboa','london']
/directory3/ny-anykeyword-milan.htm ['ny','anykeyword','milan'] ['ny','milan']
/directory3/anykeyword-berlin-milan-moscow ['directory3','anykeyword','berlin','milan','moscow'] ['berlin','milan','moscow']

Results desired

url match number_of_elements_matching_in_url matchedWords row_matched
directory1/paris-oslo-granada.htm yes 2 'paris','granada' row 4
directory3/ny-anykeyword-otherkeyword-milan.htm yes 2 'ny','milan' row 7
directory3/anykeyword-berlin-milan-moscow yes 2 'berlin','moscow' row 8
directory1/barcelona-moscow.htm not 1 '-' -
directory1/london-moscow.htm yes 2 'london','moscow' row 5
directory1/paris-oslo-london. not 1 '-' -
directory1/keyword1-keyword2-lisboa-london.htm not 1 '-' -


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source