dplyr::tbl equivalent for pandas

I am currently trying to switch from R to Python. I am working with large tables for a university project. In R, I load the data as lazy Snowflake table references via the commands:

con <- nuvolos::get_connection()

db_mcc_desc <- dplyr::tbl(con, "Table")

"Table" is about 100GB, so I really like that I can use many dplyr functions on db_mcc_desc without loading it into memory. Whenever needed I can create smaller data frames and load them into memory using collect().

However, the following pandas command eagerly reads the whole table into a DataFrame, which exceeds my memory quite quickly.

import pandas as pd
from nuvolos import get_connection

con = get_connection()
# Eagerly pulls every row of the table into memory at once.
db_mcc_desc = pd.read_sql('SELECT * FROM "Table"', con=con)

Normal batching does not really help because the table is so big. Is there a similarly easy and lazy solution in pandas, like the one the dplyr package provides in R?
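
For clarity, by "normal batching" I mean chunked reading along these lines; the chunk size and the process_chunk step are placeholders, not part of my actual code:

import pandas as pd
from nuvolos import get_connection

con = get_connection()

# Stream the table in fixed-size chunks; each chunk arrives as its own
# DataFrame. This caps memory per chunk, but every filter or aggregation
# still runs locally in Python instead of being pushed down to Snowflake.
for chunk in pd.read_sql('SELECT * FROM "Table"', con=con, chunksize=100_000):
    process_chunk(chunk)  # placeholder for the per-chunk work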

Thanks a lot!



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
