'Delete duplicate records without primary key in table in redshift

Need to delete duplicates from a table, but table does not have any primary key.

tried with Row_number,rowid but these function is not available in redshift.

can someone help on this.



Solution 1:[1]

You can use DISTINCT in a SELECT to create a new table without duplicates.

begin;
create table table_name_new as select distinct * from table_name;
alter table table_name rename to table_name_old;
alter table table_name_new rename to table_name;
drop table table_name_old;
commit;

NB ROW_NUMBER() does exist in Redshift. You could use the following syntax to delete duplicates. Replace identity columns with the columns which define uniqueness in your table.
Run the query with SELECT * FROM cte in the place of DELETE FROM cte first to check what will be deleted.

WITH cte AS
(SELECT
  *,
  ROW_NUMBER() OVER(PARTITION BY identiy_columns)
FROM table_name)
DELETE FROM cte 
WHERE ROW_NUMBER > 1;

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1