Preface: I defined an Athena table in AWS, using S3 as the source (defined manually, without a Glue crawler). The files contain data from EventBridge, and each
When trying to refresh the partitions in an AWS Athena/Glue table, I am getting this error: line 1:1: mismatched input 'MSCK'. Expecting: 'ALTER', 'ANALYZE', 'CAL
I am trying to find the latest update of a particular row from a bunch of rows per uuid. For that we use row_number() over a partition as shown below, "row_numb
I have some data in an S3 location in JSON format. It has 4 columns: val, time__stamp, name and type. I would like to create an external Athena table from this dat
I have some data rows in AWS Athena table and I am trying to get the data from the last 1 hour. I am using awswrangler, I will post my snippet below. Basically,
I'm very new to Athena and I'm having a bit of a hard time understanding how partitioning works and whether it can work for me. I have files in S3 in the follow
I recently ran into the following error "AthenaQueryError: Athena query failed: "NOT_SUPPORTED: Unsupported Hive type", and for this I followed this stack overf
I have Glue DBs (db1 and db2) and tables (tbl1 and tbl2) available in different AWS regions (eu-west-1 and us-east-1) respectively. My Glue job in eu-west-1 needs
I have a table that is partitioned on one or more columns. I can do ... SHOW PARTITIONS table_db.table_1 which gives a list of all partitions like this, year=2
I understand that I can set the number or size of files using "Bucketing" method (Refer to this guide: https://aws.amazon.com/premiumsupport/knowledge-center/se
I am trying to find a percentage based on the id column. Issue: I am trying to use count(column)/select count(column) from table, which gives 'Zero' as output. Tab
I am running a query in AWS Athena where I want to get some total values. However, I am having issues getting a column where the values are null; this column somet
We have an S3 data lake in AWS (with Lake Formation, Glue, etc.). The end goal is to query the S3 data sources using SQL in Athena. When making the query in the A
I have an input file in an S3 bucket with .json.snappy compression, and I am trying to read it through an Athena table. I tried using different SerDes: 'org.apache.hive.hcatal
I am trying to create an external function in Athena using AWS Lambda function. I am able to do so and query successfully using Athena query editor. Code is bel
I need to find values in numeric_column (string) that don't contain '-' or '[0-9]' or '.'. I am a bit of a novice in Athena... so honestly don't
I have a folder containing files in Parquet format. I used a crawler to create a table in the Glue Data Catalog, which came to 2500+ columns. I want to create
I have a pandas DataFrame containing a date column ("2022-02-02"). I write this table to parquet using pyarrow. df[col] = df[col].astype(str) df.to_parquet(loc)
I have a Delta table, which can be read from Athena. When I try to get the data through a query from Spark, I get the following error: Caused by: org.apache.sp
I have a source bucket where small 5KB JSON files will be inserted every second. I want to use AWS Athena to query the files by using an AWS Glue Datasource and