Scala - Dataframe - Select based on list of Columns AND apply function to some columns

I am using an Apache Spark DataFrame to process some files, and my requirements are as follows:

  1. I need to select specific columns from the DataFrame and create another DataFrame - this is simple
  2. I need to apply a function to specific columns of the DataFrame and create another DataFrame - I am able to do this
//For 1
dataframe.select("COLUMN1") 
//more specifically I use the below
dataframe.select(listOfCol.map(col): _*)

//For 2
dataframe.withColumn("newColumn", someUDF(col("COLUMN2")))

But I need both to be applied together in an elegant way. I came across this: Spark apply function to certain columns

Can I follow this, OR is there any better way? Any suggestion or direction would be a great help. Thanks in advance.

EDIT

There is a predefined map that specifies which functions (if any) are applied to which columns. A sample:

val colFunctionMap = Map(
  "COLUMN1" -> ("function1", true),
  "COLUMN2" -> ("function2", true),
  "COLUMN3" -> ("", false),
  "COLUMN4" -> ("", false),
...
)

So I need something like this - below is just pseudo code:

dataframe.select(
     for each (listOfCol)
       if function has to be applied -> Apply and select column
       else -> just select column 
)

Please let me know if you need any other details...



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
