Convert list of strings to array of structs in PySpark
I have a PySpark dataframe with a single string column that looks like this: '00639,43701,00007,00632,43701,00007'
I need to convert the above string into an array of structs using withColumn, to have this:
[{"network_id":"00639","network_bic":"43701","network_seqr":"00007"},{"network_id":"00632","network_bic":"43701","network_seqr":"00007"}]
How can I achieve this using PySpark dataframes?
Solution 1:[1]
First, split the string into an array. Then pick each element of that array with element_at (its index is 1-based), name it with alias, and wrap each group of three elements in a struct.
from pyspark.sql import functions as F

# `spark` is assumed to be an existing SparkSession (as in the pyspark shell)
df = spark.createDataFrame([('00639,43701,00007,00632,43701,00007',)], ['col_str'])

# Split the comma-separated string into an array of strings
col_split = F.split('col_str', ',')

# Build one struct per group of three elements (element_at is 1-based)
df = df.withColumn('array_of_struct', F.array(
    F.struct(
        F.element_at(col_split, 1).alias('network_id'),
        F.element_at(col_split, 2).alias('network_bic'),
        F.element_at(col_split, 3).alias('network_seqr'),
    ),
    F.struct(
        F.element_at(col_split, 4).alias('network_id'),
        F.element_at(col_split, 5).alias('network_bic'),
        F.element_at(col_split, 6).alias('network_seqr'),
    )
))

df.show(truncate=0)
# +-----------------------------------+----------------------------------------------+
# |col_str                            |array_of_struct                               |
# +-----------------------------------+----------------------------------------------+
# |00639,43701,00007,00632,43701,00007|[{00639, 43701, 00007}, {00632, 43701, 00007}]|
# +-----------------------------------+----------------------------------------------+
df.printSchema()
# root
#  |-- col_str: string (nullable = true)
#  |-- array_of_struct: array (nullable = false)
#  |    |-- element: struct (containsNull = false)
#  |    |    |-- network_id: string (nullable = true)
#  |    |    |-- network_bic: string (nullable = true)
#  |    |    |-- network_seqr: string (nullable = true)
Solution 2:[2]
There is no exact string function available, but you can use CONCAT as follows:
SELECT CONCAT('T.P.', ' ', 'Bar') as author;
+---------------------+
| author              |
+---------------------+
| T.P. Bar            |
+---------------------+
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | ZygD |
| Solution 2 | rtenha |
