Accessing datacatalog table in Glue properly

I created a table in Athena from an S3 source, without using a crawler. It shows up in my Glue Data Catalog. However, when I access it from a Python job in Glue ETL, the table appears to have no columns and no data. Accessing a column raises the following error: AttributeError: 'DataFrame' object has no attribute '<COLUMN-NAME>'.

I am reading the table as a dynamic frame, the usual Glue way:

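from awsglue.context import GlueContext
from pyspark.context import SparkContext

# Standard Glue job setup; glueContext wraps the job's Spark context.
glueContext = GlueContext(SparkContext.getOrCreate())

# Read the Data Catalog table as a DynamicFrame.
datasource = glueContext.create_dynamic_frame.from_catalog(
  database="datacatalog_database",
  table_name="table_name",
  transformation_ctx="datasource"
)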

print(f"Count: {datasource.count()}")
print(f"Schema: {datasource.schema()}")

The above logs Count: 0 and Schema: StructType([], {}), whereas the Athena table shows roughly 800,000 rows.
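
One way to check what the job actually sees is to read the table definition straight from the Data Catalog and compare it with what Athena reports. The following is only a minimal sketch: the helper names summarize_catalog_table and fetch_catalog_table are hypothetical, and it assumes boto3 is installed and the caller's credentials allow glue:GetTable. An empty column list, or a location that does not point at the S3 prefix holding the data, would be consistent with the empty schema above, though other causes are possible.

```python
def summarize_catalog_table(table: dict) -> dict:
    """Extract the fields of a Glue get_table "Table" entry that affect how a job reads it."""
    storage = table.get("StorageDescriptor", {})
    return {
        "table_type": table.get("TableType"),
        "location": storage.get("Location"),
        "columns": [col["Name"] for col in storage.get("Columns", [])],
        "partition_keys": [key["Name"] for key in table.get("PartitionKeys", [])],
        "classification": table.get("Parameters", {}).get("classification"),
    }


def fetch_catalog_table(database: str, table_name: str) -> dict:
    """Fetch the table definition from the Glue Data Catalog (needs AWS credentials)."""
    import boto3  # imported lazily so the summary helper works without boto3

    glue = boto3.client("glue")
    return glue.get_table(DatabaseName=database, Name=table_name)["Table"]


# Usage inside an environment with AWS access (names taken from the question):
#   table = fetch_catalog_table("datacatalog_database", "table_name")
#   print(summarize_catalog_table(table))
```

If the summary shows the expected columns and location, the table definition itself is probably not the problem, and the next thing to look at would be whether the job's role can read the underlying S3 objects.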

Side notes:

  • The ETL job in question has AWSGlueServiceRole attached.
  • I also tried the Glue Visual Editor. It listed the Data Catalog database and table in question, but produced the same error.


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
