I have a messy data source where some field values can come in with two different names but should map to one conformed field name on the output. e.g. data sour
I'm using Glue 3.0 data = [("Java", "6241499.16943521594684385382059800664452")] rdd = spark.sparkContext.parallelize(data) df = rdd.toDF() df.show() df.select(
Trying to debug AWS Glue scripts locally using the Glue ETL library. I have installed aws-glue-libs and spark-3.1.1-amzn-0-bin-3.2.1-amzn-3.tgz. When I run job.init
When I call dyF.show(), it returns an empty field, even though I checked the schema and count() and know the table is populated. I transformed it int
I have a text file that looks like this: HDR¶20200101 BDY¶1¶Jimmy BDY¶1¶Something TRL¶123 I would like to parse it to a Glue Dyn
We have an ETL job that uses the code snippet below to update the catalog table: sink = glueContext.getSink(connection_type='s3', path=config['glue_s3_path_bc']