Databricks Spark Snowflake DataFrame.toPandas() takes more memory and time

I have two identical tables (same rows and schema). One resides in an AZSQL Server database and the other in a Snowflake database. We have existing code that we want to migrate from AZSQL to Snowflake. When we create a pandas DataFrame from the Snowflake table with toPandas(), it takes 20 to 30 minutes (3.8 million records), while the same task against AZSQL takes 5 minutes. In addition, toPandas() on the AZSQL data ran on a very small single-node cluster, whereas for Snowflake we have to use a cluster with 56 GB of memory; otherwise we get an OOM error.

I have asked the same question in the Databricks community: https://community.databricks.com/s/feed/0D53f00001LKMhHCAX
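
One thing that may be worth checking, although nothing above confirms it is the cause here, is whether the two sources produce different Spark column types. Numeric columns that arrive as wide decimals are typically converted to Python Decimal objects in pandas, which costs far more time and memory than native integers or floats. The sketch below is a plain-Python helper that takes the list of (column name, type string) pairs a Spark DataFrame exposes through its `dtypes` attribute and suggests a cheaper target type for each decimal column. The helper name `suggest_casts` and the sample schema are hypothetical and only serve as an illustration.

```python
import re

# Matches Spark's simple type strings such as "decimal(38,0)".
_DECIMAL = re.compile(r"^decimal\((\d+),\s*(\d+)\)$")


def suggest_casts(dtypes):
    """Return {column_name: suggested_type} for decimal columns.

    `dtypes` is a list of (column_name, type_string) pairs, the shape
    returned by a Spark DataFrame's `dtypes` attribute.
    """
    casts = {}
    for name, type_string in dtypes:
        match = _DECIMAL.match(type_string.strip().lower())
        if match is None:
            continue  # not a decimal column, leave it alone
        scale = int(match.group(2))
        if scale == 0:
            # Assumption: the stored whole numbers fit in a 64-bit integer.
            casts[name] = "bigint"
        else:
            # Assumption: losing exact decimal precision is acceptable.
            casts[name] = "double"
    return casts


# Hypothetical schema, made up for this example.
sample_dtypes = [
    ("ID", "decimal(38,0)"),
    ("AMOUNT", "decimal(18,2)"),
    ("NAME", "string"),
]
print(suggest_casts(sample_dtypes))
# → {'ID': 'bigint', 'AMOUNT': 'double'}
```

In a notebook, the result could be applied before the conversion with `df = df.withColumn(name, df[name].cast(target))` for each entry, and comparing `df.dtypes` for the AZSQL and Snowflake DataFrames would show whether the schemas really match on the Spark side. Setting `spark.sql.execution.arrow.pyspark.enabled` to `true` is another option that is often suggested for speeding up toPandas(), assuming the cluster's Spark version supports it.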

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
