Specifying a column with multiple datatypes in a Spark schema
I am trying to create a schema to parse JSON into a Spark DataFrame. The JSON has a `value` field which could be either a struct or a string:

```json
"value": {
    "entity-type": "item",
    "id": "someid",
    "numeric-id": 30
}
```

```json
"value": "SomePicture.jpg"
```

How can I specify that in the schema?
Solution 1:[1]
In JSON Schema, you can allow multiple types for a field:

```json
{
  "type": ["object", "string"],
  "properties": { ... }
}
```
https://json-schema.org/understanding-json-schema/index.html
Solution 2:[2]
Solved it using the approach below.

In JSON Schema we can declare multiple types as shown above, but a Spark schema has no union type, so that approach doesn't work when defining the DataFrame schema. Instead, I had to read `value` as a StringType, determine (based on certain conditions) whether the value represents a struct, and then use `from_json(value, new StructType(...))` to convert the string back into a struct column.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Ether |
| Solution 2 | |
