Iteration over list in PySpark
I am not sure how I could reproduce this Python code in PySpark, any ideas? It iterates over a list of dicts and, whenever an idx_key appears again, appends the record under that key; it then compares the second list against that mapping. I am wondering whether an inner join could reproduce it, but wouldn't I get a cross (duplicated) result?
```python
def get_new_contacts(b2b_master_data, new_potential_contact_data):
    master_contact_data = {}
    # Generating a dictionary where key is the idx_key and values are
    # contacts associated with the idx_key
    for record in list(b2b_master_data):
        idx_key = record["idx_key"]
        if idx_key not in master_contact_data:
            master_contact_data[idx_key] = [record]
        else:
            master_contact_data[idx_key].append(record)
    for record in list(new_potential_contact_data):
        idx_key = record["idx_key"]
        if idx_key in master_contact_data:
            potential_matches = master_contact_data[idx_key]
    return potential_matches
```
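
For reference, here is roughly what I imagine the join-based version would look like. This is only a sketch, assuming both datasets are DataFrames with an idx_key column; the DataFrame names and example rows below are made up for illustration.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical example data; the real DataFrames are assumed to have an idx_key column.
b2b_master_df = spark.createDataFrame(
    [("k1", "alice@acme.com"), ("k1", "bob@acme.com"), ("k2", "carol@corp.com")],
    ["idx_key", "contact"],
)
new_potential_df = spark.createDataFrame(
    [("k1", "dave@acme.com"), ("k3", "eve@other.com")],
    ["idx_key", "contact"],
)

# Equivalent of building master_contact_data: collect the master contacts per idx_key.
master_grouped = b2b_master_df.groupBy("idx_key").agg(
    F.collect_list("contact").alias("potential_matches")
)

# Equivalent of the second loop: keep only new records whose idx_key exists in the master data.
matches = new_potential_df.join(master_grouped, on="idx_key", how="inner")

matches.show(truncate=False)
```

Because the master side is reduced to one row per idx_key before the join, each new record can match at most one grouped row, so (if I understand the join semantics correctly) this should avoid the cross-product duplication I was worried about.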
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow