Category "aws-glue-spark"

Cast Issue with AWS Glue 3.0 - Pyspark

I'm using Glue 3.0 data = [("Java", "6241499.16943521594684385382059800664452")] rdd = spark.sparkContext.parallelize(data) df = rdd.toDF() df.show() df.select(

AWS glue NoClassDefFoundError on job.init()

Trying to debug AWS Glue scripts locally using Glue ETL library. I have installed aws-glue-libs and spark-3.1.1-amzn-0-bin-3.2.1-amzn-3.tgz. When I run job.init

Show Method for Dynamic Frame in AWS glue returns empty field

When I try to use the dyF.show() it returns an empty field, even though I checked the schema and count() and I know the table is populated. I transformed it int

Glue Dynamic Frame Parse text file with ¶ delimiter

I have a text file which look like below. HDR¶20200101 BDY¶1¶Jimmy BDY¶1¶Something TRL¶123 I would like to parse it to a Glue Dyn

Non-Partitioned Table Schema not updated with Glue ETL Job

We have an ETL job that uses the below code snippet to update the catalog table: sink = glueContext.getSink(connection_type='s3', path=config['glue_s3_path_bc']