Odd number of partitions when reading data into a Spark RDD
I am working with 3 GB of data that I have read into a Spark RDD:
rdd = sc.textFile("data.json")
When I call rdd.getNumPartitions(), the number of partitions is 99! It is really odd.
Even if I use sc.textFile("data.json", 20), there are again 99 partitions! Also, I cannot change the number of partitions with rdd.repartition() or rdd.coalesce(); the count stays at 99.
I am really confused and I do not know why my data is split into 99 partitions for no apparent reason. Please advise.
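
For context, here is a minimal PySpark sketch of the behavior described above (the local SparkContext setup and app name are assumptions for illustration). It rests on two points: textFile's second argument, minPartitions, is only a lower bound, and repartition()/coalesce() return new RDDs rather than mutating the original, so the result must be captured:

from pyspark import SparkContext

# Hypothetical local setup; the question's environment is not specified.
sc = SparkContext("local[*]", "partition-demo")

# The second argument to textFile is minPartitions: a lower bound, not
# an exact count, so the input's split layout can still yield more (e.g. 99).
rdd = sc.textFile("data.json", 20)
print(rdd.getNumPartitions())

# repartition() returns a NEW RDD; RDDs are immutable, so calling it
# without capturing the result leaves the original partitioning untouched.
repartitioned = rdd.repartition(20)
print(repartitioned.getNumPartitions())  # 20

If the original code called rdd.repartition(20) without assigning the returned RDD, that alone would make the count appear stuck at 99.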
Source: Stack Overflow, licensed under CC BY-SA 3.0.