How can I parameterize this batching function to control the batch size, ideally while keeping the same structure?

I have hard-coded the following algorithm for a task I want to accomplish: for each time step in a Pandas DataFrame column, return the 5 most recent consecutive values (the current one and the 4 before it). I would like to turn this into a function that takes a Pandas DataFrame and a batch size as input. In the example below the batch size is 5, but it should vary with the value of the parameter passed in.

The code currently follows this structure:

Last_5 = []

for i, j  in enumerate(Pandas_Dataframe['Column']):
    
    sublist = [i-4 , list(Pandas_Dataframe['Column'])[i-4],
               i-3 , list(Pandas_Dataframe['Column'])[i-3],
               i-2 , list(Pandas_Dataframe['Column'])[i-2],
               i-1 , list(Pandas_Dataframe['Column'])[i-1],
               i-0 , list(Pandas_Dataframe['Column'])[i-0]]
    
    Last_5.append(sublist[1::2])
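
To make the behavior of the loop above concrete, here is a minimal sketch on made-up data. A plain list stands in for list(Pandas_Dataframe['Column']), and the values are purely illustrative. Note that for the first four time steps the negative indices wrap around to the end of the data.

```python
# Toy stand-in for list(Pandas_Dataframe['Column']); the values are made up.
column = [10, 20, 30, 40, 50, 60, 70]

last_5 = []
for i in range(len(column)):
    # Same lookups as the hard-coded sublist: positions i-4 through i.
    last_5.append([column[i - k] for k in range(4, -1, -1)])

print(last_5[4])  # [10, 20, 30, 40, 50]
print(last_5[0])  # [40, 50, 60, 70, 10] (negative indices wrap to the end)
```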

Once parameterized, I would like it to follow this structure:

def Batcher(delta_t, n):
    ...
    ...
    ...
    return Last_n


Solution 1:[1]

I figured it out, and have since used it to generate, store, and analyze more than 400,000,000 data points without running into memory issues.

Solution:

from tqdm import tqdm

def Batcher(vector, delta_t, gap):
    # Convert once; calling list(vector) inside the loop copies the whole column on every lookup.
    data = list(vector)
    # Look-back offsets from each time step t: 0, gap, 2*gap, ... (all < delta_t).
    # Each batch therefore holds len(offsets) values, which equals delta_t only when gap == 1.
    offsets = range(0, delta_t, gap)

    indices = []
    values = []

    for t in tqdm(range(len(data)), total=len(data)):
        # Reversed so each batch runs oldest to newest and ends at t.
        # For small t the indices are negative and wrap around to the end of the data.
        batch_indices = [t - i for i in reversed(offsets)]
        indices.append(batch_indices)
        values.append([data[k] for k in batch_indices])

    # indices is only a bookkeeping aid and is not returned; its first delta_t entries are zeroed.
    indices[:delta_t] = [0] * delta_t

    return values
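
If the wrap-around in the first few batches is not wanted, a streaming variant is one possible alternative. The sketch below is not the Batcher above: iter_batches is a hypothetical name, it uses only the standard library, and it assumes that the first n - 1 time steps (which have no full history) should simply be skipped.

```python
from collections import deque

def iter_batches(values, n):
    """Yield the last n values at each time step, oldest first.

    Hypothetical helper: skips the first n - 1 steps instead of
    wrapping around to the end of the data.
    """
    window = deque(maxlen=n)  # the oldest value is dropped automatically
    for value in values:
        window.append(value)
        if len(window) == n:
            yield list(window)

batches = list(iter_batches([10, 20, 30, 40, 50, 60, 70], 5))
print(batches)
# [[10, 20, 30, 40, 50], [20, 30, 40, 50, 60], [30, 40, 50, 60, 70]]
```

Because it is a generator, batches can be consumed one at a time rather than held in a single list, which may help when the column is very long.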

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow