'Replace all occurrences of a String in all columns in a dataframe in scala

I have a dataframe with 20 Columns and in these columns there is a value XX which i want to replace with Empty String. How do i achieve that in scala. The withColumn function is for a single column, But i want to pass all 20 columns and replace values that have XX in the entire frame with Empty String , Can some one suggest a way.

Thanks



Solution 1:[1]

You can gather all the stringType columns in a list and use foldLeft to apply your removeXX UDF to each of the columns as follows:

val df = Seq(
  (1, "aaXX", "bb"),
  (2, "ccXX", "XXdd"),
  (3, "ee", "fXXf")
).toDF("id", "desc1", "desc2")

import org.apache.spark.sql.types._

val stringColumns = df.schema.fields.collect{
  case StructField(name, StringType, _, _) => name
}

val removeXX = udf( (s: String) =>     
  if (s == null) null else s.replaceAll("XX", "")
)

val dfResult = stringColumns.foldLeft( df )( (acc, c) =>
  acc.withColumn( c, removeXX(df(c)) )
)

dfResult.show
+---+-----+-----+
| id|desc1|desc2|
+---+-----+-----+
|  1|   aa|   bb|
|  2|   cc|   dd|
|  3|   ee|   ff|
+---+-----+-----+

Solution 2:[2]

def clearValueContains(dataFrame: DataFrame,token :String,columnsToBeUpdated : List[String])={
  columnsToBeUpdated.foldLeft(dataFrame){
    (dataset ,columnName) =>
      dataset.withColumn(columnName, when(col(columnName).contains(token), "").otherwise(col(columnName)))
  }
}

You can use this function .. where you can put token as "XX" . Also the columnsToBeUpdated is the list of columns in which you need to search for the particular column.

dataset.withColumn(columnName, when(col(columnName) === token, "").otherwise(col(columnName)))

you can use the above code to replace on exact match.

Solution 3:[3]

We can do like this as well in scala.

//Getting all columns
val columns: Seq[String] = df.columns
//Using DataFrameNaFunctions to achieve this.
val changedDF = df.na.replace(columns, Map("XX"-> ""))

Hope this helps.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Leo C
Solution 2 Stephen Rauch
Solution 3 whoisthis