'SQL Join is Returning Multiple Rows

Below is a Snowflake query that is trying to get the serial number that's associated with each account name (ACCT_NAME), account ID (ACCT_ID), SKU ID (SKU_ID_p2), and offer description (OFFER_DESC). However when I bring in the serial number column from the FACT_PRODUCT_OFFER_CUST_VALUE table, it goes from 3 rows to 500+ rows--any idea why? Here is my current query:

SELECT a.NAME AS ACCT_NAME,
       a.ID AS ACCT_ID,
       p2.NAME AS SKU_ID_p2,
       dpo.OFFER_DESC
FROM ASSET ast
    JOIN DB_A.SA.ACCOUNT a
        ON a.ID = ast.ACCOUNT_ID
    JOIN DB_A.SA.PRODUCT_2 p2
        ON p2.ID = ast.PRODUCT_2_ID
    JOIN DB_A.SA.OPPORTUNITY o
        ON o.ACCOUNT_ID = ast.ACCOUNT_ID
    JOIN DB_B.SB.DIM_PRODUCT_OFFER dpo
        ON dpo.PRODUCT_ID = p2.ID
    JOIN DB_B.SB.DIM_PRODUCT_OFFER_CUST_VALUE dpocv
        ON dpocv.OFFER_SKU_ID = dpo.OFFER_SKU_ID
    JOIN DB_B.SB.FACT_PRODUCT_OFFER_CUST_VALUE fpocv
        ON fpocv.SERIAL_NUM = dpocv.SERIAL_NUM
GROUP BY a.NAME,
         a.ID,
         p2.NAME,
         dpo.OFFER_DESC

And here's a sample result of this query:

ACCT_NAME ACCT_ID SKU_ID_P2 OFFER_DESC
Joe's Hardware Shop 9084785487488HUY JH-WEQ-3S-FER-KIR Plywood: 500M (Config B)
Joe's Hardware Shop 9084785487488HUY JH-WEQ-230j-FER-KIR Plywood: 1000M (Config D)
Joe's Hardware Shop 9084785487488HUY JH-ITK-OPKF-LFPL-KIR1 Plywood: 250M (Config A)

However when I bring in the Serial Number column from the FACT_PRODUCT_OFFER_CUST_VALUE table, the original query jumps from 3 rows to 500+ rows. The correct output should have a unique serial number for the three rows in the table above. Correct output below:

ACCT_NAME ACCT_ID SKU_ID_P2 OFFER_DESC SERIAL_NUM
Joe's Hardware Shop 9084785487488HUY JH-WEQ-3S-FER-KIR Washer: Model A (Config B) AUEJKLOR
Joe's Hardware Shop 9084785487488HUY JH-WEQ-230j-FER-KIR Washer: Model B (Config D) AGITJRD
Joe's Hardware Shop 9084785487488HUY JH-ITK-OPKF-LFPL-KIR1 Washer: Model E (Config A) AREJKEK

Updated query with SERIAL_NUM brought in as a column:

SELECT a.NAME AS ACCT_NAME,
       a.ID AS ACCT_ID,
       p2.NAME AS SKU_ID_p2,
       dpo.OFFER_DESC,
       fpocv.SERIAL_NUM
FROM ASSET ast
    JOIN DB_A.SA.ACCOUNT a
        ON a.ID = ast.ACCOUNT_ID
    JOIN DB_A.SA.PRODUCT_2 p2
        ON p2.ID = ast.PRODUCT_2_ID
    JOIN DB_A.SA.OPPORTUNITY o
        ON o.ACCOUNT_ID = ast.ACCOUNT_ID
    JOIN DB_B.SB.DIM_PRODUCT_OFFER dpo
        ON dpo.PRODUCT_ID = p2.ID
    JOIN DB_B.SB.DIM_PRODUCT_OFFER_CUST_VALUE dpocv
        ON dpocv.OFFER_SKU_ID = dpo.OFFER_SKU_ID
    JOIN DB_B.SB.FACT_PRODUCT_OFFER_CUST_VALUE fpocv
        ON fpocv.SERIAL_NUM = dpocv.SERIAL_NUM
GROUP BY a.NAME,
         a.ID,
         p2.NAME,
         dpo.OFFER_DESC,
         fpocv.SERIAL_NUM


Solution 1:[1]

The simple answer is: You have many rows in the new table with the same value as you are join on. This is the ONLY reason you ever get more rows.

As to the next question why do you have many value that are the same, that is harder to know. But given you have used the word FACT it's making me think of a SCD2 like table, and thus while there is only one Plywood: 500M (Config B) over time it has had different prices. And when you join to a FACT/DIMENSION table like price, it is also had to be with respect to time.

Which implies you ether want the latest price, OR you want the price as the time of the event to a looking at.

Normally a DIMENSION table will have start_time, end_time columns and often I have seen then with end_time being null if it is the current "price", so you will need more clauses on your JOIN ON to bind to the row you want.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Simeon Pilgrim