Spark SQL overwrite of a Hive table: why did duplicate records occur?
Duplicate records appeared when Spark SQL overwrote a Hive table. This happened when the Spark job had failed stages, even though the DataFrame itself had no duplicate records. When I ran the job again, the result was correct. This confuses me. Why does it happen?
For example: dataFrame.write().mode(SaveMode.Overwrite).insertInto("outputTable");
There are no duplicate records in dataFrame, but duplicate records exist in the Hive table outputTable.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
