Accessing a MongoDB collection with a Spark DataFrame using a composite key
I want to obtain a DataFrame from a search within a MongoDB collection named test. The code below performs the search on a single column. First, the collection is loaded:
val mongoDf = MongoSpark.load(sparkSession, confMongoDb.getReadConfig("test"))
Then an array is built from the ID values to search for:
// Provides the Encoder[String] that .map needs on a DataFrame
import sparkSession.implicits._
val arrayOfValues = dfUsedForSearch
  .select("ID")
  .map(_.getString(0))
  .collect()
Finally, the IDs are looked up in the MongoDB collection with the filter syntax:
val dfFilter = mongoDf
  .filter(mongoDf.col("ID").isInCollection(arrayOfValues))
  .select("ID")
How can I do this kind of search with a composite key, for example (ID_A, ID_B, ID_C)? I was thinking of using a foldLeft function to make this type of search as general as possible.
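To make the question concrete, here is a minimal sketch of two ways such a composite-key search might be written. It assumes that both DataFrames contain the key columns under the same names; the object name CompositeKeyFilter and both method names are hypothetical. The first variant uses a left semi join instead of a collected array. The second follows the foldLeft idea and builds one OR-of-ANDs predicate from the collected key tuples. Neither has been run against a MongoDB collection, and whether the connector pushes either form down to MongoDB is not verified here.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.lit

// Hypothetical helper object, for illustration only.
object CompositeKeyFilter {

  // Variant 1: left semi join on the key columns.
  // Keeps the rows of mongoDf whose key tuple also appears in dfUsedForSearch,
  // without collecting anything to the driver.
  def filterByJoin(mongoDf: DataFrame,
                   dfUsedForSearch: DataFrame,
                   keys: Seq[String]): DataFrame = {
    val searchKeys = dfUsedForSearch.select(keys.head, keys.tail: _*).distinct()
    mongoDf.join(searchKeys, keys, "left_semi")
  }

  // Variant 2: foldLeft over the collected key tuples.
  // Builds (k1 = a1 AND k2 = b1 ...) OR (k1 = a2 AND k2 = b2 ...) OR ...
  // The predicate grows with the number of tuples, so this only suits small key sets.
  // Note: === is not null-safe, so tuples containing null never match.
  def filterByPredicate(mongoDf: DataFrame,
                        dfUsedForSearch: DataFrame,
                        keys: Seq[String]): DataFrame = {
    val tuples = dfUsedForSearch.select(keys.head, keys.tail: _*).distinct().collect()
    val condition = tuples.foldLeft(lit(false)) { (acc, row) =>
      val matchesRow = keys.zipWithIndex.foldLeft(lit(true)) {
        case (c, (key, i)) => c && (mongoDf.col(key) === row.get(i))
      }
      acc || matchesRow
    }
    mongoDf.filter(condition)
  }
}
```

With the names from the question, a call could look like CompositeKeyFilter.filterByJoin(mongoDf, dfUsedForSearch, Seq("ID_A", "ID_B", "ID_C")). The join variant keeps the work distributed, while the foldLeft variant stays closest to the isInCollection approach used for the single column.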
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
