How to convert org.apache.spark.sql.Column to data types like Long or String

I am new to Scala and Spark. I am trying to load data from Spark SQL to build GraphX vertices, but I am getting an error that I don't know how to solve. This is the code:

val vRDD: RDD[(VertexId, String)] = spark.sparkContext.parallelize(Seq(spark.table("sw")))
                                    .map(row => (row("id"), row("title_value")))

And this is the error:

<console>:36: error: type mismatch;
 found   : org.apache.spark.sql.Column
 required: org.apache.spark.graphx.VertexId
    (which expands to)  Long
       val vRDD: RDD[(VertexId, String)] = spark.sparkContext.parallelize(Seq(spark.table("sw")))
                                           .map(row => (row("id"), row("title_value")))


Solution 1:[1]

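The error message is correct: you are getting columns returned. Seq(spark.table("sw")) is a one-element sequence holding the whole DataFrame, so row in the map is a DataFrame, and row("id") on a DataFrame returns a Column. You can pull the values out with the following: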

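// Convert the DataFrame to an RDD[Row] first, then read typed values from each Row.
// getAs[Long] assumes the id column is stored as a bigint (LongType).
spark.table("testme").rdd
.map(row => (row.getAs[Long]("id"), row.getAs[String]("name")))

or maybe:

// VertexId is a type alias for Long, so this reads the same value.
spark.table("testme").rdd
.map(row => (row.getAs[VertexId]("id"), row.getAs[String]("name")))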
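
For context, here is a minimal end-to-end sketch of one way to get typed values out of the table. It assumes the table has an id column that can be cast to bigint and a title_value string column, as in the question. It converts the DataFrame to an RDD of Row objects with .rdd and reads each field with Row.getAs, so the values are read per row rather than from a Column. The object name, the loadVertices helper, the sample rows, and the single edge are made up for illustration and are not from the question.

```scala
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object VerticesFromTable {
  // Hypothetical helper: reads (id, title_value) pairs from a table as GraphX vertices.
  def loadVertices(spark: SparkSession, table: String): RDD[(VertexId, String)] =
    spark.table(table)
      // Cast in SQL so the typed getters below match the column types.
      .selectExpr("cast(id as bigint) as id", "cast(title_value as string) as title_value")
      .rdd // DataFrame -> RDD[Row]
      .map(row => (row.getAs[Long]("id"), row.getAs[String]("title_value")))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("vertices-from-table")
      .getOrCreate()
    import spark.implicits._

    // Made-up stand-in for the "sw" table from the question.
    Seq((1L, "A New Hope"), (2L, "The Empire Strikes Back"))
      .toDF("id", "title_value")
      .createOrReplaceTempView("sw")

    val vRDD: RDD[(VertexId, String)] = loadVertices(spark, "sw")

    // The question defines no edges, so a single made-up edge is used here.
    val edges: RDD[Edge[String]] =
      spark.sparkContext.parallelize(Seq(Edge(1L, 2L, "followed_by")))

    val graph = Graph(vRDD, edges)
    graph.vertices.collect().foreach(println)

    spark.stop()
  }
}
```

Casting in selectExpr is a design choice rather than a requirement: if the table already stores id as a bigint, reading it with getAs[Long] directly should behave the same, while an integer id column would otherwise fail when unboxed as a Long.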

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Matt Andruff