Shortest path column access in PySpark
I am new to PySpark and am currently learning it. I have a sample dataset on which I have used the shortestPath method. The output looks something like this:
id | distance
1 [1 -> 0]
3 [1 -> 2]
2 [1 -> 1]
4 []
5 []
0 [1 -> 1]
I want to make a dictionary where the key is the id and the value is the distance, like so:
{1:0, 3:2, 2:1, 4:-1, 5:-1, 0:1}
but I don't know how to access the distance column. An empty map [] should become -1.
Solution 1:[1]
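First extract the map value, and use lit(-1) where the map is null or empty:
from pyspark.sql.functions import col, lit, map_values, size, when
# size() is 0 for an empty map and -1 (or null) for a null map, so both cases fall through to otherwise()
df = df.withColumn('key', when(size(col('distance')) > 0, map_values(col('distance'))[0]).otherwise(lit(-1)))
# create the dict on the driver
{row['id']: row['key'] for row in df.collect()}
{1: 0, 3: 2, 2: 1, 4: -1, 5: -1, 0: 1}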
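If the result is small enough to collect anyway, the empty-map case can also be handled in plain Python after `collect()`, since a map column arrives on the driver as a Python dict. The following is only a sketch under that assumption: `distances_to_dict` and its `landmark` parameter are hypothetical names, not part of PySpark, and plain dicts stand in for the `Row` objects so the snippet runs without a Spark session.

```python
def distances_to_dict(rows, landmark, missing=-1):
    """Map each row's id to its distance to `landmark`.

    `rows` is any iterable of objects supporting row["id"] and
    row["distance"], e.g. the list returned by df.collect().
    A null or empty distance map yields `missing`.
    """
    return {
        row["id"]: (row["distance"] or {}).get(landmark, missing)
        for row in rows
    }


# Plain dicts mimic the collected rows from the question (landmark vertex 1).
rows = [
    {"id": 1, "distance": {1: 0}},
    {"id": 3, "distance": {1: 2}},
    {"id": 2, "distance": {1: 1}},
    {"id": 4, "distance": {}},
    {"id": 5, "distance": {}},
    {"id": 0, "distance": {1: 1}},
]

result = distances_to_dict(rows, landmark=1)
print(result)  # {1: 0, 3: 2, 2: 1, 4: -1, 5: -1, 0: 1}
```

With a real DataFrame the call would presumably be `distances_to_dict(df.collect(), landmark=1)`. Looking the landmark up by key rather than taking the first map value also keeps the result well defined if more than one landmark was requested.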
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | wwnde |
