Databricks: Convert Python DataFrame to Scala DataFrame
I have a DataFrame in Python, df, that I want to pass to a %scala cell so I can use it there.
I have tried:
%python
pyDf.createOrReplaceTempView("testDF")  # error message
Solution 1:[1]
It's not too difficult. Here is some sample code to try; it works in PyCharm or Databricks.
from pyspark.sql import SparkSession
import pandas as pd
spark = SparkSession.builder.master("local").appName("testing").getOrCreate()
data = [['venu', 50], ['renu', 45], ['anu', 54],['bhanu',14]]
# Create the pandas DataFrame
pdf= pd.DataFrame(data, columns = ['Name', 'Age'])
print(pdf)
# Convert the pandas DataFrame to a Spark DataFrame
sparkDF=spark.createDataFrame(pdf)
sparkDF.printSchema()
sparkDF.show()
Solution 2:[2]
Just query it with spark.sql:
val scalaDf = spark.sql("select * from testDF")
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Venu A Positive |
| Solution 2 | Alex Ott |

