Access a MongoDB collection with a Spark DataFrame using a composite key

I am trying to build a DataFrame from the result of a search in a MongoDB collection named test. With a single key column, I use the following code. First, the collection is loaded:

import com.mongodb.spark.MongoSpark

val mongoDf = MongoSpark.load(sparkSession, confMongoDb.getReadConfig("test"))

Then the ID values to search for are collected into an array:

import sparkSession.implicits._ // provides the String encoder required by map

val arrayOfValues = dfUsedForSearch
  .select("ID")
  .map(_.getString(0))
  .collect()

Finally, the IDs are looked up in the MongoDB collection with a filter:

val dfFilter = mongoDf.filter(mongoDf.col("ID").isInCollection(arrayOfValues))
  .select("ID")

How can I do the same kind of search with a composite key, for example (ID_A, ID_B, ID_C)? I was thinking of using foldLeft to make the search as general as possible.
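One possible direction, given only as a sketch: rather than collecting the key values to the driver, the key columns can be folded into a single join condition with foldLeft, and the collection filtered with a left semi join. The helper name filterByCompositeKey is hypothetical, and the sketch assumes that dfUsedForSearch contains the same key column names (ID_A, ID_B, ID_C) as the MongoDB collection.

```scala
import org.apache.spark.sql.{Column, DataFrame}

// Hypothetical helper (not part of Spark or the MongoDB connector):
// keeps the rows of `source` whose composite key also appears in `keys`.
// Assumes both DataFrames expose every column listed in `keyCols`.
def filterByCompositeKey(
    source: DataFrame,
    keys: DataFrame,
    keyCols: Seq[String]): DataFrame = {
  require(keyCols.nonEmpty, "at least one key column is required")

  // Keep only the key columns on the lookup side.
  val keyDf = keys.select(keyCols.head, keyCols.tail: _*)

  // Fold the key columns into one predicate:
  // source.ID_A = keyDf.ID_A AND source.ID_B = keyDf.ID_B AND ...
  val first: Column = source(keyCols.head) === keyDf(keyCols.head)
  val condition: Column = keyCols.tail.foldLeft(first) { (acc, name) =>
    acc && (source(name) === keyDf(name))
  }

  // A left semi join returns only the columns of `source` and does not
  // duplicate a source row when several key rows match it.
  source.join(keyDf, condition, "left_semi")
}

// Possible usage with the DataFrames from the question:
// val dfFilter = filterByCompositeKey(
//   mongoDf, dfUsedForSearch, Seq("ID_A", "ID_B", "ID_C"))
```

When the key columns have the same names on both sides, the shorter form mongoDf.join(dfUsedForSearch, Seq("ID_A", "ID_B", "ID_C"), "left_semi") should give the same rows. The foldLeft version is mainly useful if the condition has to be adapted per column. Note that === never matches null key values.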



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
