Spark MongoDB Connector unable to df.join - Unspecialised MongoConfig

Using the latest MongoDB connector for Spark (v10) and trying to join two dataframes yields the following unhelpful error.

Py4JJavaError: An error occurred while calling o64.showString.
: java.lang.UnsupportedOperationException: Unspecialised MongoConfig. Use `mongoConfig.toReadConfig()` or `mongoConfig.toWriteConfig()` to specialize
    at com.mongodb.spark.sql.connector.config.MongoConfig.getDatabaseName(MongoConfig.java:201)
    at com.mongodb.spark.sql.connector.config.MongoConfig.getNamespace(MongoConfig.java:196)
    at com.mongodb.spark.sql.connector.MongoTable.name(MongoTable.java:99)
    at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation.name(DataSourceV2Relation.scala:66)
    at org.apache.spark.sql.execution.datasources.v2.V2ScanRelationPushDown$$anonfun$pushDownFilters$1.$anonfun$applyOrElse$2(V2ScanRelationPushDown.scala:65)

The PySpark code simply pulls in two collections and runs a join:

# Read both collections, then join on the shared key
dfa = spark.read.format("mongodb").option("uri", "mongodb://127.0.0.1/people.contacts").load()
dfb = spark.read.format("mongodb").option("uri", "mongodb://127.0.0.1/people.accounts").load()
dfa.join(dfb, 'PKey').count()
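
As an aside, the v10 connector's documentation spells the read options as connection.uri, database, and collection rather than a single uri option. A sketch of the first read in that style, using the same local deployment as above (this alone does not make the join error go away):

dfa = (spark.read.format("mongodb")
       .option("connection.uri", "mongodb://127.0.0.1")
       .option("database", "people")
       .option("collection", "contacts")
       .load())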

SQL gives the same error:

dfa.createOrReplaceTempView("usr")
dfb.createOrReplaceTempView("ast")
spark.sql("SELECT count(*) FROM ast JOIN usr on usr._id = ast._id").show()

Document structures are flat.
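
To make "flat" concrete, a hypothetical pair of documents (only _id and PKey appear in the snippets above; the other fields are made up for illustration):

# Hypothetical flat documents; only _id and PKey are taken from the question
contact = {"_id": 1, "PKey": "k1", "name": "Alice"}
account = {"_id": 7, "PKey": "k1", "balance": 100}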



Solution 1:[1]

Have you tried using the latest version (10.0.2) of mongo-spark-connector? You can find it on Maven Central.

I had a similar problem and solved it by replacing 10.0.1 with 10.0.2.
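
If it helps, a minimal sketch of pinning the connector version when building the session, assuming the 10.0.x Maven coordinate org.mongodb.spark:mongo-spark-connector (the 10.0.x line is published without a Scala-version suffix; verify the coordinate for your setup):

from pyspark.sql import SparkSession

# Pin the connector to 10.0.2 via spark.jars.packages.
# Coordinate assumed for the 10.0.x release line; check your repository.
spark = (SparkSession.builder
         .appName("mongo-join")
         .config("spark.jars.packages",
                 "org.mongodb.spark:mongo-spark-connector:10.0.2")
         .getOrCreate())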

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution source
[1] Solution 1: FULLHOUSE