This is my dataset:

    from pyspark.sql import SparkSession, functions as F
    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([('2021-02-07',)