NotSerializableException: org.apache.hadoop.io.LongWritable

I know this question has been answered many times, but I have tried everything and still have not found a solution. The following code raises a NotSerializableException:

val ids : Seq[Long] = ...
ids.foreach { id =>
  sc.sequenceFile("file", classOf[LongWritable], classOf[MyWritable]).lookup(new LongWritable(id))
}

It fails with the following exception:

Caused by: java.io.NotSerializableException: org.apache.hadoop.io.LongWritable
Serialization stack:
...
org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:47)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:84)
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:301)

When creating the SparkContext, I do the following:

val sparkConfig = new SparkConf().setAppName("...").setMaster("...")
sparkConfig.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
sparkConfig.registerKryoClasses(Array(classOf[BitString[_]], classOf[MinimalBitString], classOf[org.apache.hadoop.io.LongWritable]))
sparkConfig.set("spark.kryoserializer.classesToRegister", "org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.LongWritable")

Looking at the Environment tab in the Spark UI, I can see these entries. However, I do not understand why:

  1. the Kryo serializer does not seem to be used (the stack trace does not mention Kryo), and
  2. LongWritable still cannot be serialized.

I'm using Apache Spark v1.5.1.



Solution 1:[1]

I'm new to Apache Spark, but I tried to solve your problem; please evaluate whether it helps. The serialization problem occurs because, for Spark, Hadoop's LongWritable and the other Writable classes are not serializable. The workaround is to convert each Writable to a plain value as soon as it is read, collect the records to the driver, and re-parallelize them:

val temp_rdd = sc.parallelize(
  sc.sequenceFile("file", classOf[LongWritable], classOf[LongWritable])
    .map { case (k, v) => (k.get, v.get) } // Writable -> primitive, so records are serializable
    .collect()                             // RDD.toArray is deprecated; use collect()
    .toSeq
)

ids.foreach(id => temp_rdd.lookup(id))
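
If the file is too large to collect to the driver, a minimal alternative sketch (same idea; the path and value type here are placeholders) is to do the Writable-to-primitive conversion on the RDD itself and look up against it directly:

import org.apache.hadoop.io.LongWritable

// Convert on the RDD instead of collecting: after the map, the pair RDD
// contains only Longs, so lookup's closure no longer captures a Writable.
val pairs = sc
  .sequenceFile("file", classOf[LongWritable], classOf[LongWritable])
  .map { case (k, v) => (k.get, v.get) }
  .cache() // the RDD is queried once per id below

ids.foreach(id => println(pairs.lookup(id)))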

Solution 2:[2]

Try this solution. It worked fine for me.

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.spark.SparkConf;

SparkConf conf = new SparkConf().setMaster("local[*]").setAppName("SparkMapReduceApp");

conf.registerKryoClasses(new Class<?>[]{
    LongWritable.class,
    Text.class
});
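
Note that registering classes with Kryo only affects how records are serialized for shuffles and caching; in Spark 1.x, task closures are always serialized with the Java serializer, which is why the stack trace above shows JavaSerializationStream even with spark.serializer set to Kryo. A minimal Scala sketch (master and app name are placeholders) that at least verifies the Kryo registration takes effect, by making Kryo fail fast on unregistered classes:

import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setMaster("local[*]")
  .setAppName("SparkMapReduceApp")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // Make Kryo throw on classes that were not registered, instead of
  // silently writing their full class names.
  .set("spark.kryo.registrationRequired", "true")

conf.registerKryoClasses(Array(classOf[LongWritable], classOf[Text]))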

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

[1] Solution 1: Kshitij Kulshrestha
[2] Solution 2: suraj vijayakumar