PySpark: Delete rows in table one that match rows in table two

Problem Statement

Here's my use case: I have 2 tables, Today_data and Yesterday_data, for example:

Today_data:

Id    Value
1     1_data
2     2_data
3     3_data 

Yesterday_data:

Id    Value
2     2_data
4     4_data
8     8_data

I want to delete the rows of the Today_data DataFrame that match a row in Yesterday_data.

Expected Result

Id    Value
1     1_data
3     3_data 

Approach Taken

I thought this would be an easy left join with Today_data on the left. However, after reading through all the join operations in PySpark here: https://sparkbyexamples.com/pyspark/pyspark-join-explained-with-examples/#pyspark-join-types, I don't see one that solves my problem. Any ideas?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source