Finding the combination of strings that occurs the most in a given string

I need to find the combination of values that occurs the most in a given list of strings.

combination of values == value-value-value-value-value

So a combination needs to have 5 values, and values can repeat.

all_values = ["CONJ", "NUM", "ADV", "PRT", "ADP", "PRON", "VERB", "DET", "ADJ", "NOUN"]


all_strings = ["DET-VERB-PRON-PRON-VERB-VERB-ADP",
"DET-NOUN-DET-NOUN-CONJ-PRON-NOUN-NOUN-VERB-NOUN-NOUN",
"PRON-VERB-VERB-DET-NOUN-ADP-NUM-ADP-NOUN",
"NOUN-VERB-NOUN-VERB-ADV-ADJ-ADP-PRON-NOUN-VERB-ADV-ADV-VERB-ADV-VERB",
"ADJ-NOUN-NOUN-PRT-VERB-VERB-DET-NOUN-VERB-ADP-DET-NOUN-NOUN",
"NOUN-VERB-PRT-ADP-PRON-NOUN-DET-VERB-NUM-NOUN-ADP-ADV-VERB",
"NOUN-NOUN-ADP-DET-NOUN-ADP-NOUN-NOUN",
"NOUN-ADV-VERB-ADP-DET-NOUN-VERB-VERB-ADP-NUM-ADJ-NOUN",
"PRON-ADV-VERB-DET-NOUN-PRON-VERB-VERB-NOUN-ADP-DET-NOUN",
"PRON-VERB-ADP-PRON-ADP-PRON-ADJ-ADJ-NOUN",
"PRON-VERB-DET-NOUN-NOUN-VERB-NOUN-VERB-PRT-VERB-ADP-DET-NOUN",
"PRON-VERB-ADP-PRON-ADP-DET-NOUN-DET-VERB-PRON-VERB-VERB-NOUN-ADP-PRON",
"NOUN-VERB-ADP-DET-NOUN-ADJ-VERB-ADV-VERB-ADJ-ADP-PRON-VERB-NOUN",
"ADJ-NOUN-NOUN-ADV-ADJ-NOUN-VERB",
"NOUN-ADV-VERB-ADJ-PRON-ADJ-NOUN-VERB-VERB-NUM",
"NOUN-DET-NOUN-ADV-VERB-NOUN-VERB-ADV-DET-ADJ-NOUN",
"ADV-PRON-VERB-ADV-NUM-ADP-DET-NOUN-NOUN-ADJ-NOUN",
"PRON-VERB-DET-NOUN-ADP-PRON-NOUN",
"ADJ-ADP-PRON-NOUN-VERB-VERB-ADP-NOUN-NOUN",
"ADJ-NOUN-NOUN-VERB-DET-PRON-VERB-DET-NOUN",
"NOUN-VERB-PRT-VERB-DET-NOUN-PRT-VERB-ADP-PRON-ADJ-ADV-ADJ-ADV",
"PRON-NOUN-VERB-ADV-VERB-ADP-DET-NOUN",
"ADV-NOUN-VERB-ADV-VERB-NOUN-ADP-PRON-NOUN-VERB-NOUN-ADP-NOUN",
"PRON-VERB-ADP-DET-NOUN-NOUN-ADV-DET-VERB-VERB-PRT-VERB-PRON",
"DET-VERB-ADJ-NOUN-NOUN-ADP-NOUN-ADP-NOUN-VERB-DET-NOUN",
"ADJ-NOUN-VERB-VERB-PRON-NOUN-NOUN",
"ADP-PRON-VERB-PRON-NOUN-PRT-NOUN-CONJ-DET-NOUN-VERB-VERB",
"ADV-DET-ADJ-NOUN-PRON-VERB-ADJ-NUM-ADP-NOUN-NOUN-NOUN",
"VERB-PRON-NOUN-ADV-VERB-PRT-VERB-NOUN",
"ADV-ADP-PRON-VERB-ADP-ADV-VERB-PRON-VERB-DET-NOUN-ADP-NOUN",
"VERB-PRON-VERB-NOUN-ADP-NOUN-NOUN",
"DET-ADV-ADJ-NOUN-VERB-VERB-ADP-PRON",
"PRON-VERB-ADJ-ADJ-VERB-ADP-ADJ-ADJ-NOUN-NOUN-VERB",
"NOUN-NOUN-NOUN-ADP-DET-ADJ-NOUN-NOUN-PRT-VERB-PRON-ADP",
"PRON-VERB-DET-ADJ-NOUN-NOUN",
"ADJ-NOUN-NOUN-ADP-ADP-NOUN",
"DET-VERB-DET-NOUN-ADP-NOUN-PRT-NOUN",
"NOUN-NOUN-NOUN-CONJ-ADJ-NOUN-VERB-VERB-VERB",
"ADP-NOUN-PRON-VERB-VERB-PRON-NOUN-NOUN-CONJ-ADJ-PRT-ADJ-NOUN",
"PRON-VERB-PRON-PRON-VERB-ADV-ADV",
"NOUN-VERB-VERB-PRT-VERB-NOUN-ADP-NOUN-NOUN-ADP-DET-NOUN-NOUN",
"PRON-PRON-VERB-VERB-DET-NOUN-CONJ-PRON-VERB-VERB-ADP-DET",
"PRON-VERB-NOUN-ADP-DET-NOUN-CONJ-NOUN-CONJ-NOUN-PRON-VERB-ADP-VERB-PRON",
"NOUN-ADJ-NOUN-VERB-ADV-ADP-DET",
"DET-NOUN-VERB-ADP-ADP-NUM-ADJ-NOUN",
"PRON-VERB-VERB-PRT-VERB-VERB-PRT-ADP-DET-NOUN-ADP-DET-NOUN",
"CONJ-DET-VERB-DET-NOUN-NOUN-ADP-NOUN-NOUN",
"PRON-VERB-ADJ-NOUN-VERB-NOUN-ADP-ADJ-NOUN-ADP-NOUN",
"ADV-PRON-VERB-VERB-DET-ADJ-NOUN-ADP-NOUN-PRT-VERB-PRON-NOUN",
"VERB-PRON-VERB-DET-NOUN-ADP-DET-NOUN-PRT-VERB-VERB",
"PRON-VERB-ADP-DET-NOUN-NOUN-VERB-ADJ",
"ADV-VERB-DET-NOUN-ADP-DET-NOUN",
"ADV-ADP-VERB-ADV-PRON-VERB-VERB",
"NOUN-DET-NOUN-NOUN-NOUN-NOUN-VERB-ADJ-NOUN-ADP-DET-NOUN",
"PRON-VERB-NOUN-ADP-NOUN-ADP-NUM-NOUN-NOUN-PRT-VERB-NOUN",
"ADJ-NOUN-ADV-VERB-DET-VERB-VERB-ADJ-NOUN-ADP-PRON",
"NOUN-ADV-ADJ-ADV-ADJ-ADP-DET",
"NOUN-PRON-VERB-ADJ-VERB-ADV",
"VERB-PRT-DET-NOUN-VERB-ADP-DET-NOUN-VERB-ADV-NOUN-VERB-VERB-ADP-PRON",
"PRON-VERB-ADV-ADJ-NOUN-NOUN-NOUN-NOUN-NOUN-PRT-VERB-DET-NOUN",
"PRON-VERB-DET-VERB-VERB-DET-NOUN-NOUN-ADP-NOUN",
"NOUN-NOUN-NOUN-NOUN-NOUN-VERB-NUM-NUM",
"PRON-ADV-VERB-NUM-NOUN-ADV-ADJ-ADP-PRON-NOUN",
"NOUN-VERB-DET-NOUN-ADP-NOUN-ADV-NOUN-VERB-VERB-PRON-NOUN",
"NOUN-VERB-VERB-VERB-ADP-PRON",
"NOUN-VERB-DET-ADV-ADJ-NOUN",
"NOUN-ADV-VERB-PRON-DET-NOUN-NOUN-ADV",
"ADV-VERB-PRT-NUM-NOUN-NOUN-PRON-VERB-DET-NOUN-ADP-DET-NOUN",
"NOUN-VERB-ADV-NOUN-VERB-ADP-DET-NOUN-NOUN",
"PRON-VERB-ADJ-NOUN-DET-ADJ-NOUN",
"VERB-PRON-VERB-DET-ADJ-ADP-PRON",
"PRON-VERB-ADJ-NOUN-ADP-NOUN",
"NOUN-NOUN-NOUN-ADJ-VERB-NOUN-VERB-NOUN-ADP-PRON-ADV",
"NOUN-NOUN-VERB-NOUN-VERB-ADV-ADP-DET-ADJ-NOUN",
"NOUN-VERB-VERB-NOUN-NOUN-NOUN-ADV-VERB-PRON-PRT-VERB-NOUN"]

So I need to find out whether the most frequent combination is, for example, VERB-VERB-DET-NOUN-ADP or VERB-PRON-NOUN-ADV-VERB, or some other one.

I was thinking of generating all possible combinations of values from the all_values list, but I'm sure that there is a faster way. Of course, the full all_strings list has more than 50k values.
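
One possible faster approach, sketched here under an assumption the question does not state outright: that a "combination" means 5 consecutive values inside a single string, and that overlapping windows each count. Instead of enumerating all 10^5 candidate combinations and searching for each one, it slides a window of 5 over every string and tallies what it actually sees. The function name most_common_ngrams and its parameters are made up for illustration.

```python
from collections import Counter


def most_common_ngrams(strings, n=5, sep="-", top=1):
    """Count every run of n consecutive values and return the most frequent.

    Assumption: a combination is a contiguous window inside one string;
    strings with fewer than n values contribute nothing.
    """
    counts = Counter()
    for s in strings:
        tags = s.split(sep)
        # Slide a window of length n across the values of this string.
        for i in range(len(tags) - n + 1):
            counts[tuple(tags[i:i + n])] += 1
    # most_common returns (window, count) pairs, highest count first.
    return [(sep.join(gram), c) for gram, c in counts.most_common(top)]


# Small made-up sample, not taken from all_strings.
sample = ["A-B-C-D-E-F", "A-B-C-D-E", "X-A-B-C-D-E", "A-B"]
print(most_common_ngrams(sample))
```

This makes one pass over the data, so the work grows with the total number of values in all_strings rather than with the number of possible combinations. Calling most_common_ngrams(all_strings, top=10) would list the ten most frequent windows; ties are returned in whatever order Counter.most_common yields them.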



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
