How to assign key-value pairs to two matrices from a text file in a PySpark RDD using Python

I have a text file that looks like:

1 2 3 4
4 5 6 7
3 4 5 6

2 3 4
3 4 5
4 5 7
6 7 9

I want to create two matrices of shapes a×b and b×c (3×4 and 4×3 here) for further matrix operations, and represent their entries as key-value pairs in a PySpark RDD using Python.

I tried:

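from pyspark import SparkContext
sc = SparkContext.getOrCreate()
# The method is textFile (capital F); sc.textfile raises AttributeError
data = sc.textFile('file.txt')
# Drop blank lines, then split each remaining line into string tokens
data2 = data.filter(lambda x: x.strip()).map(lambda x: x.split(' '))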

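I don't understand the next step: how do I map key-value pairs onto the two matrices? Any function I apply only sees one row at a time, and once the blank line is filtered out nothing marks where the first matrix ends and the second begins.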
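One possible way forward, offered as a sketch rather than a definitive answer: keep the line positions with `zipWithIndex()` before filtering, find the index of the blank separator line, and use it to tag every entry with a matrix name plus row and column. The helper names `tag_entries` and `matrices_as_pairs`, the labels "A" and "B", and the `((matrix, row, col), value)` key layout are assumptions made for illustration. The sketch also assumes the file contains exactly one blank line, located between the two matrices.

```python
def tag_entries(indexed_line, sep_idx):
    """Turn one (line, line_index) pair into ((matrix, row, col), value) records.

    Hypothetical helper: lines before the separator belong to matrix "A",
    lines after it belong to matrix "B".
    """
    line, idx = indexed_line
    if idx < sep_idx:
        name, row = "A", idx
    else:
        name, row = "B", idx - sep_idx - 1  # restart row numbering after the blank line
    return [((name, row, col), float(tok)) for col, tok in enumerate(line.split())]


def matrices_as_pairs(sc, path):
    """Build an RDD of ((matrix, row, col), value) from a two-matrix text file.

    Assumes exactly one blank line in the file, separating the two matrices.
    """
    indexed = sc.textFile(path).zipWithIndex()  # RDD of (line, line_index)
    # Index of the blank separator line (an action, so this runs a Spark job)
    sep_idx = (indexed.filter(lambda p: not p[0].strip())
                      .map(lambda p: p[1])
                      .min())
    return (indexed
            .filter(lambda p: p[0].strip())  # drop the separator itself
            .flatMap(lambda p: tag_entries(p, sep_idx)))
```

With the sample file, the separator is at line index 3, so `matrices_as_pairs(sc, 'file.txt')` would yield the twelve entries of A (3×4) followed by the twelve entries of B (4×3). Keying by `(matrix, row, col)` keeps each value addressable on its own, which is convenient if the next step is a join-based multiplication (rekey A by column and B by row); a different key layout may suit other operations better.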



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
