How can I pass a function with arguments as a parameter in a pandas pipe?
I want to write some functions to use with the pandas pipe method.
Like this:
import pandas as pd

def foo(df):
    df['A'] = 1
    return df

def goo(df):
    df['B'] = 2
    return df

def hoo(df, arg1):
    df[arg1] = 3
    return df

df = pd.DataFrame.from_dict({"A": [1, 2, 3],
                             "B": [4, 5, 6]})
print(df)
(df.pipe(foo)
   .pipe(goo)
   .pipe(hoo, arg1='Hello')
)
print(df)
The first print outputs:
   A  B
0  1  4
1  2  5
2  3  6
The second print outputs:
   A  B  Hello
0  1  2      3
1  1  2      3
2  1  2      3
This is toy code, but it is easy to understand.
There are many possible combinations of functions such as foo, goo, and hoo, so I need to abstract this pipe code.
import pandas as pd

def foo(df):
    df['A'] += 1
    return df

def goo(df):
    df['B'] += 2
    return df

def hoo(df, arg1):
    df[arg1] = 3
    return df

def pipe_line(df, func_list, kargs_list):
    for func, kargs in zip(func_list, kargs_list):
        df = func(df, **kargs)
    return df

df = pd.DataFrame.from_dict({"A": [1, 2, 3],
                             "B": [4, 5, 6]})
df = pipe_line(df,
               [foo, goo, hoo],
               [{}, {}, dict(arg1="HELLO")])
print(df)
But the pipe_line function is very ugly. How can I improve its readability?
Solution 1:[1]
pipe_line doesn't really have to do much at all: it just repeatedly applies each function to the return value of the previous one until it runs out of functions.
def pipe_line(df, fs):
    for f in fs:
        df = f(df)
    return df
The trick is to define appropriate functions that all take a single dataframe argument. functools.partial helps with that.
from functools import partial

df = pipe_line(df, [foo, goo, partial(hoo, arg1="HELLO")])
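Putting the pieces together, a minimal self-contained sketch could look like the following. It reuses foo, goo, and hoo from the question and the single-argument pipe_line from this answer; the variable name result is only for illustration.

```python
from functools import partial

import pandas as pd


def foo(df):
    df["A"] += 1
    return df


def goo(df):
    df["B"] += 2
    return df


def hoo(df, arg1):
    df[arg1] = 3
    return df


def pipe_line(df, fs):
    # Feed the return value of each function into the next one.
    for f in fs:
        df = f(df)
    return df


df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# partial(hoo, arg1="HELLO") behaves like a one-argument function of df,
# so every entry in the list can be called the same way.
result = pipe_line(df, [foo, goo, partial(hoo, arg1="HELLO")])
print(result)
```

Because partial binds arg1 ahead of time, pipe_line no longer needs a parallel list of keyword-argument dicts. Note that these functions modify the dataframe in place, so df and result refer to the same object here.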
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | chepner |
