Spark dataframe from dictionary
I'm trying to create a Spark DataFrame from a dictionary whose data looks like this:
{'33_45677': 0, '45_3233': 25, '56_4599': 43524}, and so on.
dict_pairs={'33_45677': 0, '45_3233': 25, '56_4599': 43524}
df=spark.createDataFrame(data=dict_pairs)
It throws:
TypeError: can not infer schema for type: <class 'str'>
Is it because of the underscore in the keys of the dictionary?
Solution 1:[1]
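Wrap the dict in square brackets '[]' so that it becomes a one-element list. The error is not caused by the underscores in your keys: createDataFrame expects a collection of rows, and iterating over a bare dict yields only its keys, which are plain strings, so Spark ends up trying to infer a schema from a str.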
dict_pairs = {'33_45677': 0, '45_3233': 25, '56_4599': 43524}
df = spark.createDataFrame(data=[dict_pairs])  # one-element list: the dict becomes a single row
df.show()
or
dict_pairs = [{'33_45677': 0, '45_3233': 25, '56_4599': 43524}]  # already a list of one dict
df = spark.createDataFrame(data=dict_pairs)
df.show()
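To see why the brackets matter, here is a small plain-Python sketch (no Spark session needed). It assumes, as far as I understand it, that createDataFrame treats its input as an iterable of rows; the names as_iterated, wrapped and rows are only for illustration.

```python
dict_pairs = {'33_45677': 0, '45_3233': 25, '56_4599': 43524}

# Iterating over a bare dict yields only its keys, so each "row" is a str.
# This is the shape that leads to "can not infer schema for type: <class 'str'>".
as_iterated = list(dict_pairs)

# Wrapping the dict in a list gives one row whose fields are the dict's keys.
wrapped = [dict_pairs]

# Alternative shape: one (key, value) tuple per dictionary entry.
rows = list(dict_pairs.items())

print(as_iterated)
print(wrapped)
print(rows)
```

If you would rather have one row per key/value pair than one wide row, something like spark.createDataFrame(rows, ['key', 'value']) should work; the column names 'key' and 'value' are my own choice, not something from the question.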
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Sudhin |