How do I take a column of string-type decimals in PySpark and round them to the nearest multiple of 50?

I have a column in a dataset called "X":

X
"2893.324"
"1058.112"
"5651.324"

I'm trying to convert these numbers to integer values, rounded to the nearest multiple of 50.

The output should look like this:

X results
"2893.324" 2900
"1058.112" 1050
"5651.324" 5650


Solution 1:[1]

I would suggest dividing by 50, rounding to the nearest integer, and then multiplying by 50 again.

There is no need for user-defined functions; the pyspark.sql.functions module has you covered. See the suggested code:

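from pyspark.sql import functions as F  # alias avoids shadowing Python's built-in round

# Cast the string column to double, round to the nearest multiple of 50,
# then cast to int so the result is 2900 rather than 2900.0.
df = df.withColumn(
    "results",
    (F.round(F.col("X").cast("double") / 50) * 50).cast("int")
)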
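To sanity-check the divide, round, multiply arithmetic without a Spark session, here is a minimal plain-Python sketch. The helper name round_to_step is hypothetical and not part of any library. It uses ROUND_HALF_UP from the decimal module on the assumption that you want to mirror Spark's round, which is documented as using HALF_UP; Python's built-in round uses banker's rounding instead, so the two can disagree on exact ties such as "1025".

```python
from decimal import Decimal, ROUND_HALF_UP


def round_to_step(value: str, step: int = 50) -> int:
    """Hypothetical helper: round a decimal string to the nearest multiple of `step`."""
    # Divide by the step so that "nearest multiple" becomes "nearest integer".
    quotient = Decimal(value) / Decimal(step)
    # Round half up to a whole number (assumption: this mirrors Spark's round).
    nearest = quotient.quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    # Multiply back to get the multiple of `step` as a plain int.
    return int(nearest * step)


samples = ["2893.324", "1058.112", "5651.324"]
results = [round_to_step(s) for s in samples]
print(results)  # [2900, 1050, 5650]
```

The three sample values from the question land on 2900, 1050, and 5650, matching the expected output above.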

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 walking