Spark error: "ERROR Utils: Exception while deleting Spark temp dir:"
Edit: adding details as requested
While running a simple Spark program written in Scala, locally on Windows 7 64-bit in Administrator mode, execution always ends with the error detailed below. The code does write the output as expected (saveAsTextFile) before throwing the error.
(Based on a Google search, others have the same problem, but adding sc.stop() at the end of the code, as suggested on another board, does not help.)
The code:
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf

object Test {
  def main(args: Array[String]) {
    System.setProperty("hadoop.home.dir", "C:/prog/winutil/")
    val inputFile1 = "./textinput.txt"
    val conf = new SparkConf().setAppName("Testing")
    val sc = new SparkContext(conf)
    val data = sc.textFile(inputFile1)
    val outputFile = "./output"
    data.saveAsTextFile(outputFile)
    sc.stop()
  }
}
And the error message:
ERROR Utils: Exception while deleting Spark temp dir: [userpath]\AppData\Local\Temp\spark-a790ba3f-af1e-4d2b-80e8-4085caaad04b\userFiles-904e004e-4ca2-43a8-8689-684cc401b827
java.io.IOException: Failed to delete: [userpath]\AppData\Local\Temp\spark-a790ba3f-af1e-4d2b-80e8-4085caaad04b\userFiles-904e004e-4ca2-43a8-8689-684cc401b827
    at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:933)
    at org.apache.spark.util.Utils$$anon$4$$anonfun$run$1$$anonfun$apply$mcV$sp$2.apply(Utils.scala:181)
    at org.apache.spark.util.Utils$$anon$4$$anonfun$run$1$$anonfun$apply$mcV$sp$2.apply(Utils.scala:179)
    at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
    at org.apache.spark.util.Utils$$anon$4$$anonfun$run$1.apply$mcV$sp(Utils.scala:179)
    at org.apache.spark.util.Utils$$anon$4$$anonfun$run$1.apply(Utils.scala:177)
    at org.apache.spark.util.Utils$$anon$4$$anonfun$run$1.apply(Utils.scala:177)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1618)
    at org.apache.spark.util.Utils$$anon$4.run(Utils.scala:177)
Solution 1:[1]
This was fixed for me after moving count() and take() statements on the RDD so they run before saving with saveAsParquetFile. So try moving any rdd.take() call before you call saveAsParquetFile.
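Applied to the asker's program, the workaround above amounts to forcing an action on the RDD before the save call. A minimal sketch (untested against the asker's setup; the question uses saveAsTextFile rather than saveAsParquetFile, but the same idea applies):

```scala
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf

object Test {
  def main(args: Array[String]) {
    System.setProperty("hadoop.home.dir", "C:/prog/winutil/")
    val conf = new SparkConf().setAppName("Testing")
    val sc = new SparkContext(conf)
    val data = sc.textFile("./textinput.txt")

    // Force an action (take/count) on the RDD *before* saving,
    // per the workaround described in Solution 1.
    data.take(1)

    data.saveAsTextFile("./output")
    sc.stop()
  }
}
```

The intent is that the early action materializes the RDD before the save, which reportedly avoids the temp-dir deletion error on Windows for some users.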
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | spats |
