'Delete duplicate records without primary key in table in redshift
Need to delete duplicates from a table, but table does not have any primary key.
tried with Row_number,rowid but these function is not available in redshift.
can someone help on this.
Solution 1:[1]
You can use DISTINCT in a SELECT to create a new table without duplicates.
begin;
create table table_name_new as select distinct * from table_name;
alter table table_name rename to table_name_old;
alter table table_name_new rename to table_name;
drop table table_name_old;
commit;
NB ROW_NUMBER() does exist in Redshift.
You could use the following syntax to delete duplicates. Replace identity columns with the columns which define uniqueness in your table.
Run the query with SELECT * FROM cte in the place of DELETE FROM cte first to check what will be deleted.
WITH cte AS
(SELECT
*,
ROW_NUMBER() OVER(PARTITION BY identiy_columns)
FROM table_name)
DELETE FROM cte
WHERE ROW_NUMBER > 1;
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
