How to get the data from a column based on name, not the index number
I have a dataframe with a column abc whose values look like this:
[{note=Part 3 of 4; Total = $11,000, cost=2750, startDate=2021-11-01T05:00:00Z+0000}]
Now I want to extract data by name. For example, I want to extract cost and startDate and put each into a new column.
It needs to work by name because the order of these values might change.
I tried the code below, but because the order of the data changes I get the wrong data.
df_mod = df_mod.withColumn('cost', split(df_mod['costs'], ',').getItem(1)) \
.withColumn('costStartdate', split(df_mod['costs'], ',').getItem(2))
Solution 1:[1]
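That's because your data is not really comma-separated; it only looks that way. In the sample, the note value itself contains a comma ($11,000), so splitting on ',' does not line up with the fields, even before the order changes. Use regexp_extract to match each value by its key name instead.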
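A minimal sketch of that approach, assuming the column is named costs as in the question's code and that every field is written as key=value. The helper name add_cost_columns and the two patterns are illustrative assumptions, not part of the original answer. The patterns are also checked against the sample value with Python's re module, which shows both why split(',') goes wrong and that the key-based match works.

```python
import re

# Patterns keyed on the field name, so the order of fields does not matter.
# Both are assumptions based on the single sample value in the question.
COST_PATTERN = r"\bcost=(\d+(?:\.\d+)?)"
START_DATE_PATTERN = r"\bstartDate=([^,}\]]+)"


def add_cost_columns(df, source_col="costs"):
    """Hypothetical helper: add 'cost' and 'costStartdate' by key name."""
    # Imported here so the patterns above can be tried without a Spark install.
    from pyspark.sql import functions as F

    return (
        df.withColumn("cost", F.regexp_extract(F.col(source_col), COST_PATTERN, 1))
          .withColumn("costStartdate",
                      F.regexp_extract(F.col(source_col), START_DATE_PATTERN, 1))
    )


# Sample value copied from the question.
sample = ("[{note=Part 3 of 4; Total = $11,000, cost=2750, "
          "startDate=2021-11-01T05:00:00Z+0000}]")

# Splitting on ',' breaks inside "$11,000", so index 1 is not the cost field.
split_item_1 = sample.split(",")[1]

# Matching on the key name finds the right values regardless of position.
cost = re.search(COST_PATTERN, sample).group(1)
start_date = re.search(START_DATE_PATTERN, sample).group(1)
```

With a dataframe such as df_mod from the question, the call would be df_mod = add_cost_columns(df_mod). regexp_extract returns an empty string when the pattern does not match, so rows without a cost or startDate key get an empty value rather than an error. If the note text could itself contain "cost=" or "startDate=", the patterns would need to be tightened.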
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | pltc |
