How to convert enumerate PySpark code into Scala code

Below is the PySpark code for matrix multiplication. I need the same logic in Scala, since this approach works well for large-volume datasets.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from functools import reduce

spark = SparkSession.builder.getOrCreate()
df = spark.sql("select * from tablename")

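# for each column c2, build a single-row DataFrame holding sum(c1 * c2) for every column c1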
colDFs = []
for c2 in df.columns:
    colDFs.append( df.select( [ F.sum(df[c1]*df[c2]).alias("op_{0}".format(i)) for i,c1 in enumerate(df.columns) ] ) )

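# stack the single-row DataFrames into one result: row i / column j is sum(col_i * col_j), i.e. the Gram matrix A^T A of the input columns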
mtx = reduce(lambda a,b: a.select(a.columns).union(b.select(a.columns)), colDFs )
mtx.show()


Solution 1

For enumerate you can use zipWithIndex, as in df.columns.zipWithIndex.
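
For instance, a quick sketch with made-up column names, just to show the pairing:

// Python's enumerate yields (index, element); Scala's zipWithIndex yields (element, index)
val cols = Array("colA", "colB", "colC")   // hypothetical column names
cols.zipWithIndex.foreach { case (name, i) =>
  println(s"op_$i -> $name")               // prints op_0 -> colA, op_1 -> colB, op_2 -> colC
}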

I didn't test it, but overall the code should be something like:

import org.apache.spark.sql.functions.{col, sum}

// one single-row DataFrame per column c2, holding sum(c1 * c2) for every column c1
val colsDf = df.columns.map { c2 =>
  df.select(df.columns.zipWithIndex.map { case (c1, i) =>
    sum(col(c1) * col(c2)).alias(s"op_$i")
  }: _*)
}

// stack the single-row DataFrames into the result matrix
val mtx = colsDf.reduce((a, b) => a.union(b.select(a.columns.map(col): _*)))
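
A minimal way to sanity-check the logic on a tiny DataFrame (untested sketch; assumes a local SparkSession, and the column names and values below are made up):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, sum}

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// a 3x2 matrix A; the result should be A^T * A = [[35, 44], [44, 56]]
val df = Seq((1.0, 2.0), (3.0, 4.0), (5.0, 6.0)).toDF("x", "y")

val colsDf = df.columns.map { c2 =>
  df.select(df.columns.zipWithIndex.map { case (c1, i) =>
    sum(col(c1) * col(c2)).alias(s"op_$i")
  }: _*)
}
val mtx = colsDf.reduce((a, b) => a.union(b.select(a.columns.map(col): _*)))
mtx.show()
// expected values (row order may vary): (35.0, 44.0) and (44.0, 56.0)

Since every column is aggregated with sum, each intermediate DataFrame has exactly one row, so the reduce only unions as many rows as there are columns.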

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
