How do I take a column of String-type decimals in PySpark and round them to the nearest multiple of 50?
I have a column in a dataset called "X":
| X |
|---|
| "2893.324" |
| "1058.112" |
| "5651.324" |
I'm trying to convert these numbers into integer values rounded to the nearest 50.
The output should be this:
| X | results |
|---|---|
| "2893.324" | 2900 |
| "1058.112" | 1050 |
| "5651.324" | 5650 |
Solution 1:[1]
I would suggest dividing by 50, rounding to the nearest integer, and then multiplying by 50 again.
There is no need for user-defined functions; the `pyspark.sql.functions` module has you covered. See the suggested code:
```python
from pyspark.sql.functions import col, round

# Cast the string to double, scale into units of 50, round,
# scale back, and cast to int so the result matches the
# integer output shown above (2900, not 2900.0).
df.withColumn(
    "results",
    (round(col("X").cast("double") / 50) * 50).cast("int")
)
```
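For readers without a Spark session handy, the same divide-round-multiply arithmetic can be sketched in plain Python. The function name `round_to_nearest_50` is invented here for illustration. One caveat to hedge on: Python's built-in `round()` uses banker's rounding (ties go to the even integer), while Spark's `round()` rounds ties away from zero, so the two can differ on exact .5 ties, though not on the sample values above.

```python
def round_to_nearest_50(value: str) -> int:
    """Parse a decimal string and round it to the nearest multiple of 50.

    Mirrors the Spark expression: cast to float, divide by 50,
    round to the nearest integer, multiply back, cast to int.
    """
    return int(round(float(value) / 50) * 50)

for s in ["2893.324", "1058.112", "5651.324"]:
    print(s, "->", round_to_nearest_50(s))
# prints:
# 2893.324 -> 2900
# 1058.112 -> 1050
# 5651.324 -> 5650
```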
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | walking |
