How to call Avro SchemaConverters in PySpark

Although PySpark has Avro support, it does not expose the `SchemaConverters` class in its Python API. I may be able to reach it through Py4J, but I have never called a Java class from Python before.

This is the code I am using:

# Import SparkSession
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType


def _test():
    # Create SparkSession
    spark = SparkSession.builder \
        .master("local[1]") \
        .appName("sparvro") \
        .getOrCreate()

    avroSchema = spark._jvm.org.apache.spark.sql.avro.SchemaConverters.toAvroType(
        StructType([StructField("firstname", StringType(), True)]))


if __name__ == "__main__":
    _test()

However, I keep getting this error:

AttributeError: 'StructField' object has no attribute '_get_object_id'


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
