Difference Between Cartesian Join and BroadcastNestedLoop Join in Spark
I went through several articles but still could not figure out the exact difference between them. Both scan the tables for every record, in a cross-product manner. In BroadcastNestedLoop, the smaller table is said to be broadcast to all worker nodes. How does the shuffling happen in the case of a Cartesian join? Could you please explain what exactly the difference is between these two join strategies in Spark?
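To make the question concrete, here is a toy model in plain Python of how I understand the two strategies to distribute work. It is only a sketch, not Spark's implementation: partitions are modelled as lists of rows, and the function names `cartesian_join` and `broadcast_nested_loop_join` are hypothetical. It assumes that a Cartesian product pairs every partition of one side with every partition of the other, while a broadcast nested loop join copies the whole smaller side to each partition of the larger side.

```python
from itertools import product


def cartesian_join(left_parts, right_parts):
    """Toy Cartesian product: one task per (left partition, right partition) pair."""
    tasks = []
    for lp, rp in product(left_parts, right_parts):
        # Each task only sees one partition from each side.
        tasks.append([(l, r) for l in lp for r in rp])
    return tasks


def broadcast_nested_loop_join(big_parts, small_parts, condition=lambda l, r: True):
    """Toy broadcast nested loop: one task per partition of the larger side."""
    # The smaller side is collected into a single list (the "broadcast")...
    broadcast = [row for part in small_parts for row in part]
    tasks = []
    for bp in big_parts:
        # ...and every task loops over that full copy for each of its own rows.
        tasks.append([(l, r) for l in bp for r in broadcast if condition(l, r)])
    return tasks


# Hypothetical data: two partitions on each side.
left = [[1, 2], [3]]
right = [["a"], ["b", "c"]]

cart = cartesian_join(left, right)
bnl = broadcast_nested_loop_join(left, right)

print(len(cart), "tasks for the Cartesian model")
print(len(bnl), "tasks for the broadcast nested loop model")
```

In this model both approaches produce the same pairs when there is no join condition. What differs is how the rows reach each task: the Cartesian model creates one task per pair of partitions, whereas the broadcast model keeps the larger side's partitioning and ships a full copy of the smaller side to every task.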
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
