How to subtract 2 string columns in a PySpark dataframe
Here is the scenario: consider a PySpark dataframe with 2 columns, like below:
{ fullname: facebook, lastname: book }
I want a new column firstname, obtained by subtracting lastname from fullname, like below:
{ firstname:face, lastname:book }
Solution 1:[1]
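from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [('facebook', 'book')],
    ['fullname', 'lastname'])
# Remove lastname from fullname (lastname is interpreted as a regex pattern)
df.withColumn('firstname', F.expr("regexp_replace(fullname, lastname, '')")).show()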
+--------+--------+---------+
|fullname|lastname|firstname|
+--------+--------+---------+
|facebook|    book|     face|
+--------+--------+---------+
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Luiz Viola |
