Category "presto"

AWS Athena/Presto SQL: Having trouble getting null values

I am running a query in AWS Athena to get some total values, but I am having trouble with a column whose values are null. This column somet

Missing dates for specific identifiers without adding extra dates when this identifier is no longer in the database SQL

To put the problem in words, I have a massive table which includes subscribers and data for every day. If the subscriber no longer exists, then they will have n

Presto Weighted Moving Average Syntax Error

I'm trying to run the weighted moving average Silota query with similar data in a Presto database but am encountering an error. The same query in the Redshift d

Presto local file connector testing

I deployed Presto on my local machine and the server is up and running. I'm trying to access a local CSV file named "poc.csv" using the local file connector. I have

Convert string date format MM/dd/yyyy HH:mm:ss.SSSS to timestamp presto

I have a string that looks like: 2022-03-30 17:18:09.569000 I am trying to convert this to a timestamp as follows: select "date_parse"("date_format"('2022-03-3

replacing null value

If I am using select coalesce(firstname, surname, petname) as column_name, or select isnull(firstname, surname) as column_name, where all return the first non-

How to count occurrences of a character in a string in Presto?

I am trying to count how often a character occurs in a string in Presto. For example, given 129.11.20.0, I want to find the number of dots (.) in the string. Just wond
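One common way to approach the question above, as a minimal sketch rather than a definitive answer: remove the character with Presto's `replace` and compare string lengths. The literal is the example from the question; the alias `dot_count` is made up for illustration.

```sql
-- Count occurrences of '.' by measuring how much shorter the string
-- becomes once every '.' is removed.
SELECT length('129.11.20.0')
     - length(replace('129.11.20.0', '.', '')) AS dot_count;  -- hypothetical alias
```

This length trick only works as written for a single-character search string; for a longer substring the difference would have to be divided by the substring's length.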

Splitting an array into columns in Athena/Presto

I feel this should be simple, but I've struggled to find the right terminology, please bear with me. I have two columns, timestamp and voltages which is the a

Presto fails to import PARQUET files from S3

I have a Presto table that imports PARQUET files based on partitions from S3 as follows: create table hive.data.datadump ( tUnixEpoch varchar, tDateTi

Amazon Athena partition with colon(:) is not working

When creating a partition in Athena, I tried to use the date in the format (yyyy-MM-ddTHH:mm:ssZ), but then I am not able to query the data. Step 1: Create table CREA

How to pivot a table in Presto?

Let there be a table named data with columns time, sensor, value: I want to pivot this table on Athena (Presto) to get a new table like this one: To do so, one ca

How to create a DataFrame from a Presto DB table with an Array data type column using Spark

I am trying to create a Spark DataFrame from a Presto DB table that has a few columns of Array data type. I tried multiple ways but I am getting the same exception java.s

Way to check if two intervals overlap in Amazon Athena / Presto

I am wondering if there is a way to check if two date intervals overlap in Amazon Athena (when writing an Athena query). I can do this in R / Python using the int_overla
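For reference, the usual overlap test can be written in plain SQL without a dedicated function: two closed intervals overlap when each one starts no later than the other ends. The sketch below assumes inclusive end dates; the column names `start_a`, `end_a`, `start_b`, `end_b` and the sample dates are hypothetical.

```sql
-- Two closed intervals [start_a, end_a] and [start_b, end_b] overlap
-- exactly when start_a <= end_b AND start_b <= end_a.
SELECT start_a, end_a, start_b, end_b,
       (start_a <= end_b AND start_b <= end_a) AS intervals_overlap
FROM (
    VALUES
        (DATE '2022-01-01', DATE '2022-01-31', DATE '2022-01-15', DATE '2022-02-15'),
        (DATE '2022-01-01', DATE '2022-01-31', DATE '2022-02-01', DATE '2022-02-28')
) AS t (start_a, end_a, start_b, end_b);  -- hypothetical sample rows
```

If the end dates are meant to be exclusive, the comparisons would use `<` instead of `<=`.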

How to break datetime in 12 hour chunks and use it for aggregation in Presto SQL?

I have been trying to break the datetime into 12-hour chunks in Presto SQL but was unsuccessful. Raw data table: datetime Login 2022-05-08 07:10:00.000 1234 2022

Query (SQL like joins) remote CSV for data analysis

I would like to query (SQL with joins) CSV files sitting in a network folder for performing data analysis work. I'm not allowed to move the files out of the net

How can I convert an integer representing EPOCH time to a timestamp in Athena (Presto)?

I have a table where the datetime is stored as varchar but represents the EPOCH time (e.g. 1556895150). How can I get that value to be recognized as a timestamp
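A minimal sketch of one way to do this, assuming the varchar holds whole seconds since the Unix epoch as in the example value: cast the string to `bigint` and pass it to `from_unixtime`. The alias `event_time` is made up for illustration.

```sql
-- Cast the varchar epoch (seconds) to bigint, then convert it to a timestamp.
-- The literal stands in for the varchar column from the question.
SELECT from_unixtime(CAST('1556895150' AS bigint)) AS event_time;  -- hypothetical alias
```

If the stored values were milliseconds or microseconds instead of seconds, they would need to be divided down to seconds before the call.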

String to YYYY-MM-DD date format in Athena

So I've looked through documentation and previous answers on here, but can't seem to figure this out. I have a STRING that represents a date. A normal output l

Converting bigint to timestamp in presto

I have a column in my dataset that has a datatype of bigint: Col1 Col2 1 1519778444938790 2 1520563808877450 3 1519880608427160 4

How to merge rows by a similar column via levenshtein distance

I'm using AWS Athena and I'm trying to merge all the rows which have a specific column with levenshtein_distance value lower than 5 and sum the normalised perce