Search for a header and specific letters in lines and print both (Python)

Search for the "Cluster" headers and for the initials st104 and pK in the lines below them (for example st104H_20170, pKH911_25081).

If the lines below a header contain both initials, st104 and pK, print the header and those lines.

input.txt
Cluster 1
0 673aa -st104P_06575
1 673aa -st104H_22488
3 673aa -pKH911_09284
4 673aa -pKP911_09288
Cluster 2
0 690aa -st104H_20170
1 690aa -KH911_25081
2 687aa -NE95031.1
3 685aa -TIG_004920
Cluster 3
0 685aa -st104H_27649
1 690aa -st104P_11877
2 685aa -pKP911_15300
Cluster 4
0 685aa -st104H_27649
1 690aa -st104P_11877

output
Cluster 1
0 673aa -st104P_06575
1 673aa -st104H_22488
3 673aa -pKH911_09284
4 673aa -pKP911_09288
Cluster 3
0 685aa -st104H_27649
1 690aa -st104P_11877
2 685aa -pKP911_15300

Tried:

with open("input.txt") as fh:
    result = ""
    cluster_content = ""
    for line in fh:
        if line.startswith("Cluster"):
            if all(initial in cluster_content for initial in ('st104', 'pK')):
                result += cluster_content
            cluster_content = ""
        cluster_content += line
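
Assuming the body of the attempt is meant to sit inside the `for` loop, two things appear to be missing: the last cluster is never checked, because no further header follows it, and `result` is never printed. A minimal sketch of a repair under that assumption is shown below. The function name `select_clusters` and the in-memory `sample` are hypothetical and only stand in for reading `input.txt`.

```python
import io

def select_clusters(lines, initials=("st104", "pK")):
    """Return the text of every cluster whose block mentions all initials."""
    result = ""
    cluster_content = ""
    for line in lines:
        if line.startswith("Cluster"):
            # A new header closes the previous cluster: keep it if it qualifies.
            if all(initial in cluster_content for initial in initials):
                result += cluster_content
            cluster_content = ""
        cluster_content += line
    # The last cluster has no following header, so check it after the loop.
    if all(initial in cluster_content for initial in initials):
        result += cluster_content
    return result

# Hypothetical excerpt of input.txt, used here instead of opening the file.
sample = io.StringIO(
    "Cluster 1\n"
    "0 673aa -st104P_06575\n"
    "3 673aa -pKH911_09284\n"
    "Cluster 4\n"
    "0 685aa -st104H_27649\n"
    "1 690aa -st104P_11877\n"
)
print(select_clusters(sample), end="")
```

Note that this keeps the substring test of the attempt: a cluster qualifies as soon as both initials occur somewhere in its block, even if some member lines carry neither of them.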


Solution 1:[1]

This prints only the clusters in which every member line carries st104 or pK and both initials occur.

# True if filter_str is the only label used, i.e. it occurs once on every member line
def check_alone(cluster_content, filter_str, cluster_split):
    return cluster_content.count(filter_str) == len(cluster_split) - 1

def cluster_filter(cluster_content):
    filters_labels = ['st104', 'pK']
    cluster_split = cluster_content.splitlines()  # header followed by the member lines
    if not cluster_split:  # nothing collected yet (call made at the first header)
        return

    # skip clusters whose members all carry the same label
    if check_alone(cluster_content, 'st104', cluster_split) or check_alone(cluster_content, 'pK', cluster_split):
        return

    # print the cluster only if every member line contains one of the labels
    if all(any(label in item for label in filters_labels) for item in cluster_split[1:]):
        print(cluster_content)


with open("input.txt") as fh:
    cluster_content = ""
    for line in fh:
        if line.startswith("Cluster"):
            cluster_filter(cluster_content)  # flush the previous cluster
            cluster_content = line
        else:
            cluster_content += line
    cluster_filter(cluster_content)  # flush the last cluster

Output:

Cluster 1
0 673aa -st104P_06575
1 673aa -st104H_22488
3 673aa -pKH911_09284
4 673aa -pKP911_09288

Cluster 3
0 685aa -st104H_27649
1 690aa -st104P_11877
2 685aa -pKP911_15300
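
As an alternative, the same rule can be sketched by first grouping the file into clusters and then testing each group with a set comparison, which avoids counting substrings. This is only a sketch under the assumption that each member line carries at most one of the two labels; the helper names `parse_clusters`, `label_of` and `keep` are hypothetical, and the sample lines are copied from the input above.

```python
def parse_clusters(lines):
    """Group lines into (header, members) pairs; a header starts with 'Cluster'."""
    clusters = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("Cluster"):
            clusters.append((line, []))
        elif line and clusters:
            clusters[-1][1].append(line)
    return clusters

def label_of(member, labels=("st104", "pK")):
    """Return the first label found in a member line, or None if there is none."""
    return next((label for label in labels if label in member), None)

def keep(members, labels=("st104", "pK")):
    """True if every member carries a label and each label occurs at least once."""
    found = [label_of(member, labels) for member in members]
    return None not in found and set(found) == set(labels)

# Sample lines copied from input.txt in the question.
sample = """Cluster 1
0 673aa -st104P_06575
1 673aa -st104H_22488
3 673aa -pKH911_09284
4 673aa -pKP911_09288
Cluster 2
0 690aa -st104H_20170
1 690aa -KH911_25081
2 687aa -NE95031.1
3 685aa -TIG_004920
Cluster 3
0 685aa -st104H_27649
1 690aa -st104P_11877
2 685aa -pKP911_15300
Cluster 4
0 685aa -st104H_27649
1 690aa -st104P_11877
""".splitlines()

for header, members in parse_clusters(sample):
    if keep(members):
        print(header)
        print("\n".join(members))
```

With the sample data this keeps Cluster 1 and Cluster 3. Cluster 2 is dropped because some of its members carry neither label, and Cluster 4 because it contains only st104 members.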

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1