Replace all occurrences of a string in all columns of a DataFrame in Scala





I have a DataFrame with 20 columns, and these columns contain the value XX, which I want to replace with an empty string. How do I achieve that in Scala? The withColumn function works on a single column, but I want to pass all 20 columns and replace every occurrence of XX in the entire DataFrame with an empty string. Can someone suggest a way?



Thanks




2 Answers



You can collect the names of all StringType columns and use foldLeft to apply a removeXX UDF to each of them, as follows:



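// Assumes a SparkSession named `spark` is in scope (as in spark-shell)
import spark.implicits._
import org.apache.spark.sql.functions.udf
import org.apache.spark.sql.types._

val df = Seq(
  (1, "aaXX", "bb"),
  (2, "ccXX", "XXdd"),
  (3, "ee", "fXXf")
).toDF("id", "desc1", "desc2")

// Names of all columns whose type is StringType
val stringColumns = df.schema.fields.collect {
  case StructField(name, StringType, _, _) => name
}

// Null-safe UDF that strips every "XX" from a string
val removeXX = udf( (s: String) =>
  if (s == null) null else s.replaceAll("XX", "")
)

// Apply the UDF to each string column in turn
val dfResult = stringColumns.foldLeft(df)( (acc, c) =>
  acc.withColumn(c, removeXX(acc(c)))
)

dfResult.show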
+---+-----+-----+
| id|desc1|desc2|
+---+-----+-----+
|  1|   aa|   bb|
|  2|   cc|   dd|
|  3|   ee|   ff|
+---+-----+-----+
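A possible alternative, not part of the original answer, is to skip the UDF and use Spark's built-in regexp_replace in a single select. The sketch below assumes a local SparkSession created just for the demo; removeToken is a hypothetical helper name. The token is quoted so that it is matched literally rather than as a regular expression.

```scala
import java.util.regex.Pattern
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, regexp_replace}
import org.apache.spark.sql.types.StringType

// Assumption: a local session for demonstration; in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("remove-token").getOrCreate()
import spark.implicits._

// Hypothetical helper: strips every occurrence of `token` from all string
// columns and passes the other columns through unchanged.
def removeToken(df: DataFrame, token: String): DataFrame = {
  val pattern = Pattern.quote(token) // match the token literally, not as a regex
  val projected: Array[Column] = df.schema.fields.map { f =>
    if (f.dataType == StringType) regexp_replace(col(f.name), pattern, "").as(f.name)
    else col(f.name)
  }
  df.select(projected: _*)
}

val df = Seq(
  (1, "aaXX", "bb"),
  (2, "ccXX", "XXdd"),
  (3, "ee", "fXXf")
).toDF("id", "desc1", "desc2")

removeToken(df, "XX").show()
```

Because regexp_replace is a built-in expression, Spark can optimize it, which is usually not possible with a UDF. Null values stay null.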





Another way to get the string columns is testDF.dtypes.filter(_._2 == "StringType").map(_._1)
– Sudheer Palyam
Mar 22 at 5:41


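import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, when}

// Replaces the whole value with "" in every listed column where it contains `token`
def clearValueContains(dataFrame: DataFrame, token: String, columnsToBeUpdated: List[String]): DataFrame = {
  columnsToBeUpdated.foldLeft(dataFrame) {
    (dataset, columnName) =>
      dataset.withColumn(columnName, when(col(columnName).contains(token), "").otherwise(col(columnName)))
  }
}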



You can use this function, passing "XX" as the token. columnsToBeUpdated is the list of columns in which to search for that value. Note that any value containing the token is replaced entirely with an empty string.


dataset.withColumn(columnName, when(col(columnName) === token, "").otherwise(col(columnName)))



Use the line above in place of the withColumn call to replace only on an exact match.
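To make the difference between the substring and exact-match variants concrete, here is a small sketch. It assumes a local SparkSession and made-up sample data; clearValueEquals is a hypothetical name for the exact-match version.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, when}

// Assumption: a local session for demonstration; in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("clear-value-demo").getOrCreate()
import spark.implicits._

// Substring variant: blanks the whole value when it contains `token`
def clearValueContains(dataFrame: DataFrame, token: String, columnsToBeUpdated: List[String]): DataFrame =
  columnsToBeUpdated.foldLeft(dataFrame) { (dataset, columnName) =>
    dataset.withColumn(columnName, when(col(columnName).contains(token), "").otherwise(col(columnName)))
  }

// Exact-match variant (hypothetical name): blanks the value only when it equals `token`
def clearValueEquals(dataFrame: DataFrame, token: String, columnsToBeUpdated: List[String]): DataFrame =
  columnsToBeUpdated.foldLeft(dataFrame) { (dataset, columnName) =>
    dataset.withColumn(columnName, when(col(columnName) === token, "").otherwise(col(columnName)))
  }

// Made-up sample data for illustration
val sample = Seq(
  (1, "XX", "aXXb"),
  (2, "cc", "XX")
).toDF("id", "desc1", "desc2")

val targetColumns = List("desc1", "desc2")

// id 1: desc1 and desc2 both become ""; id 2: desc1 stays "cc", desc2 becomes ""
clearValueContains(sample, "XX", targetColumns).show()

// id 1: desc1 becomes "", desc2 stays "aXXb"; id 2: desc1 stays "cc", desc2 becomes ""
clearValueEquals(sample, "XX", targetColumns).show()
```

Neither variant removes only the XX part of a longer string; both replace the entire cell.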





