Import a CSV file into an RDD in PySpark

We have a CSV file called survey.csv and we need to load it into an RDD.

We tried this:

rdd_test = survey_results.csv.map(lambda x: (x, 1)) 

It doesn't work. Can anyone help?



Solution 1:[1]

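SparkContext.textFile reads a text file and returns an RDD of strings, one element per line. The attempt in the question fails because survey_results.csv is parsed as a Python attribute lookup on an undefined name, not as a file path; the path has to be passed as a string to textFile: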

from pyspark import SparkContext

# Create the Spark context (the entry point for RDD operations)
sc = SparkContext()

# Read the CSV as plain text: the result is an RDD with one string per line
lines = sc.textFile("./survey.csv")
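
textFile only yields raw lines, so each line still has to be split into fields before a map like the one in the question is useful. Below is a minimal sketch of that step, not a definitive implementation. The names parse_line and load_csv_rdd are hypothetical helpers introduced here, and the code assumes that the first line of the file is a header row and that no field contains an embedded newline.

```python
import csv


def parse_line(line):
    """Split one CSV line into a list of fields, honoring quoted commas."""
    return next(csv.reader([line]))


def load_csv_rdd(sc, path):
    """Return an RDD of field lists, skipping the header row.

    `sc` is an existing SparkContext. Assumption: the first line of the
    file is a header, and every record fits on a single line.
    """
    lines = sc.textFile(path)
    header = lines.first()
    # Drop every line equal to the header, then parse the remaining lines
    return lines.filter(lambda line: line != header).map(parse_line)
```

With a context already created, rows = load_csv_rdd(sc, "./survey.csv") would give an RDD of lists, and rows.map(lambda fields: (fields[0], 1)) would then pair the first column with 1, which is close to what the original attempt was aiming for. Using the csv module instead of line.split(",") keeps quoted values that contain commas intact.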

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1