Horizontal concatenation in PySpark
Is there an equivalent in PySpark that allows me to do an operation similar to this one in pandas?
pd.concat([df1, df2], axis=1)
I have tried several methods so far, but none of them seem to work. The concatenation they perform is vertical, and I need to concatenate multiple Spark DataFrames side by side into one DataFrame.
If I use union or unionAll, the DataFrames get stacked vertically, as one single column, which is not useful for my use case. I have also tried this example (it did not work either):
from functools import reduce
from pyspark.sql import DataFrame

def unionAll(*dfs):
    # Row-wise (vertical) union of all inputs; it does not add columns.
    return reduce(DataFrame.unionAll, dfs)
Any help will be greatly appreciated.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
