'Using lateral flatten logic in Snowflake

I have a table named table_1. It contains a column called ITEM_LIST which has below mentioned values:

;;;;;103=::info::0|112=::info::0|114=::info::0|170=::info::0|1237=::info::0|1293=::info::0
;;;;;103=::info::0|112=::info::0|114=::info::0|170=::info::0|1237=::info::0|1293=::info::0

SELECT X.value::string AS purchase_list
   FROM table_1 a,
   LATERAL flatten
   (INPUT=>split(a.ITEM_LIST, '|')) X 
   limit 100

When I run the above SQL query, I'm getting the result as mentioned below:

ITEM_LIST
;;;;;103=::info::0
;;;;;103=::info::0

I'm not getting the values after '|' symbol

I need to display the index value of second record.

When I run the below sample query, I'm getting records. But when I replace the dual table with original table, I'm not getting the correct results.

SELECT index,d.value::string AS purchase_list 
              FROM (
                        SELECT c.value::string AS purchase_list
                          FROM DUAL,LATERAL flatten(INPUT=>split(';960137161;10;9.49;20339=1.49|20340=1.00|20342=4.90|20341=0.95|20345=0.51;103=::info::0|112=::info::0|114=::info::0|170=available|1226=006-Employee Savings|1227=Associate Discount|1237=::info::0|1293=::info::0,;125300189;3;3.57;20339=4.49|20340=3.49|20342=3.00|20341=1.19|20343=6.72|20345=0.18;103=justforu|112=no-value|114=justforu:offer-details#product-grid#non-search#not-recommended#r02#s05|170=available|1226=003-001 EVoucher (elec)#009-PaperVoucher|1227=4 for $5.00#Salesperson Discount|1237=::info::0|1293=::info::0,;125300057;3;3.57;20339=4.49|20340=3.49|20342=3.00|20341=1.19|20343=6.72|20345=0.18;103=justforu|112=no-value|114=justforu:offer-details#product-grid#non-search#not-recommended#r06#s22|170=available|1226=003-J4U Store eCoupons (elec)#006-Employee Savings|1227=4 for $5.00#Salesperson Discount|1237=::info::0|1293=::info::0,;960113015;1;8.54;20339=9.99|20340=8.99|20342=1.00|20341=8.54|20345=0.45;103=search-results|112=no-value|114=search-results#product-grid#nutella#not-recommended#r01#s01|170=available|1293=::info::0,;970104604;1;5.21;20339=5.99|20340=5.49|20342=0.50|20341=5.21|20345=0.28;103=home|112=no-value|114=home#buy it again#non-search#past-purchase#r01#s06|170=available|1226=006-Salesperson promo|1237=::info::0|1293=::info::0', '|')) c 
                      ), LATERAL flatten(INPUT=>split(purchase_list, ';')) d
                      


Solution 1:[1]

SELECT X.value::string AS purchase_list
   FROM (
    SELECT ';;;;;103=::info::0|112=::info::0|114=::info::0|170=::info::0|1237=::info::0|1293=::info::0' as item_list
   ) a,
   LATERAL flatten
   (INPUT=>split(a.ITEM_LIST, '|')) X 
   limit 100

works.

PURCHASE_LIST
;;;;;103=::info::0
112=::info::0
114=::info::0
170=::info::0
1237=::info::0
1293=::info::0

So I going to take a wild guess, and say your limit 100 is return just the first 100 rows, and you are not seeing any of the later parts, because they are dropped by the limit??

Second Guess:

When you have only have one row of input you and split it you see the N rows of output for that row.

But when you have two split, and a read from the table, there are hundreds of table rows, and the two splits make to many rows.

So you add a limit 100, but it limiting the final output..

so if we limit on the original table, via a sub-select

SELECT X.value::string AS purchase_list
   FROM table_1 a,
   LATERAL flatten
   (INPUT=>split(a.ITEM_LIST, '|')) X 
   limit 100

goes to:

SELECT X.value::string AS purchase_list
   FROM (
       SELECT * 
       FROM table_1 
       LIMIT 2 -- one 1 even
   ) AS a,
   LATERAL flatten
   (INPUT=>split(a.ITEM_LIST, '|')) X 
   --limit 100

then you will only see one table rows worth of pipe splitting, and you could then also have the semi-colon splitting also.

Yesterday I was not seeing "why would prod code want a limit of 100 in it" but if I connect the dots that you are output the index from the second flatten. It would appear you are trying to check that semi-colon split "rule of grabbing the second value" is true, or you have already isolated the edge case where it's different, thus are trying to narrow in on that case. Aka data engineering, and which point this question makes sense.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1