'UNION 2 tablesand remove duplicates based on some columns

I'm writing SQL query in Python and I'm trying to UNION 2 tables and remove duplicates(duplicates are considered rows with same values in fields "id" and "product_id"), I found answers that NOT EXISTS is helpful, but I couldn't make it work properly and I have no idea where is the problem. This is my SQL query:

        SELECT cc."id", cc."default_name", cc."slug", cpc."product_id", ce."value" 
        FROM 
            "catalog_category" AS cc, 
            "catalog_product_categories" AS cpc, 
            "catalog_categoryentitydetailvaluevarchar" AS ce
        WHERE 
            ( 
                cc."id" = cpc."category_id" 
                AND cpc."product_id" IN {objects_id} 
                AND cc."id" = ce."parent_object_id" 
                AND cpc."category_id" = ce."parent_object_id" 
                AND ce."shop" = \'shop_name\' 
            ) 
        UNION ALL 
        SELECT cc2."id", cc2."default_name", cc2."slug", cpc2."product_id", ce2."value" 
        FROM 
            "catalog_category" AS cc2, 
            "catalog_product_categories" AS cpc2, 
            "catalog_categoryentitydetailvaluevarchar" AS ce2 
        WHERE 
            ( 
                cc2."id" = cpc2."category_id" 
                AND cpc2."product_id" IN {objects_id} 
                AND cc2."id" = ce2."parent_object_id" 
                AND cpc2."category_id" = ce2."parent_object_id" 
                AND ce2."shop" = \'default\' 
                AND NOT EXISTS(
                                SELECT 1 
                                FROM 
                                    "catalog_category" AS cc, 
                                    "catalog_product_categories" AS cpc, 
                                    "catalog_categoryentitydetailvaluevarchar" AS ce 
                                WHERE 
                                    cc2."id"=cc."id" 
                                    AND cpc2."product_id"=cpc."product_id" 
                                )
            ) 

But result is that with NOT EXISTS statement I receive only values from first SELECT and without NOT EXISTS i receive all values from both tables with all duplicates.



Solution 1:[1]

You can modify the query as below:

;With cte1 as (SELECT cc."id", cc."default_name", cc."slug", cpc."product_id", ce."value" 
        FROM 
            "catalog_category" AS cc, 
            "catalog_product_categories" AS cpc, 
            "catalog_categoryentitydetailvaluevarchar" AS ce
        WHERE 
            ( 
                cc."id" = cpc."category_id" 
                AND cpc."product_id" IN {objects_id} 
                AND cc."id" = ce."parent_object_id" 
                AND cpc."category_id" = ce."parent_object_id" 
                AND ce."shop" = \'shop_name\' 
            ))
Select * from cte1 
        UNION 
        SELECT cc2."id", cc2."default_name", cc2."slug", cpc2."product_id", ce2."value" 
        FROM 
            "catalog_category" AS cc2, 
            "catalog_product_categories" AS cpc2, 
            "catalog_categoryentitydetailvaluevarchar" AS ce2 
        WHERE 
            ( 
                cc2."id" = cpc2."category_id" 
                AND cpc2."product_id" IN {objects_id} 
                AND cc2."id" = ce2."parent_object_id" 
                AND cpc2."category_id" = ce2."parent_object_id" 
                AND ce2."shop" = \'default\' 
                AND NOT EXISTS(
                                SELECT 1 
                                FROM 
                                    Cte1                                WHERE 
                                    cc2."id"=cte1."id" 
                                    AND cpc2."product_id"=cte1."product_id" )

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ahuemmer