Spark SQL overwrite of a Hive table: why do duplicate records occur?

Duplicate records appear in a Hive table after Spark SQL overwrites it. This happens when the Spark job has failed stages, even though the DataFrame itself contains no duplicate records. When I run the job again, the result is correct. This confuses me. Why does it happen?

Example: dataFrame.write().mode(SaveMode.Overwrite).insertInto("outputTable");

There are no duplicate records in dataFrame, but duplicate records exist in the Hive table outputTable.
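One way to confirm the observation is to count exact duplicate rows on both sides of the write. The following is only a minimal sketch, not part of the original job: the class name `DuplicateCheck`, the helper `duplicateCount`, and the table name `inputTable` (a stand-in for however dataFrame is actually built) are hypothetical. It assumes a Spark build with Hive support and counts only rows that are identical in every column.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class DuplicateCheck {

    // Number of surplus rows: total rows minus distinct rows.
    // Returns 0 when no row is an exact copy of another.
    static long duplicateCount(Dataset<Row> ds) {
        return ds.count() - ds.dropDuplicates().count();
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("duplicate-check")
                .enableHiveSupport()
                .getOrCreate();

        // Hypothetical source; replace with the real logic that builds dataFrame.
        Dataset<Row> dataFrame = spark.table("inputTable");

        // Read back what the overwrite actually left in the Hive table.
        Dataset<Row> written = spark.table("outputTable");

        System.out.println("duplicates in dataFrame:   " + duplicateCount(dataFrame));
        System.out.println("duplicates in outputTable: " + duplicateCount(written));
        System.out.println("row count dataFrame:   " + dataFrame.count());
        System.out.println("row count outputTable: " + written.count());

        spark.stop();
    }
}
```

If the two row counts differ only after a run with failed stages, that narrows the problem to the write step rather than to the data itself.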

apache-spark-sql

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
