Import a CSV file into an RDD in PySpark
We have a CSV file called survey.csv and we need to load it into an RDD.
We tried this:
rdd_test = survey_results.csv.map(lambda x: (x, 1))
It doesn't work. Can anyone help?
Solution 1:[1]
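A file name is not an object with a map method, so the file has to be read into an RDD first. SparkContext.textFile reads a text file and returns an RDD of strings, one element per line: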
from pyspark import SparkContext
# create the Spark context (or reuse one that is already running)
sc = SparkContext.getOrCreate()
# read the CSV file into an RDD of strings, one element per line
lines = sc.textFile("./survey.csv")
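Each element of lines is still a raw string, so a map like the one in the question would pair whole lines with 1. A minimal sketch of one way to split the lines into fields first, assuming the file has a header row and no quoted field spans more than one line. The helper names parse_csv_line and load_csv_rdd are hypothetical, not part of PySpark, and sc is the SparkContext created above.

```python
import csv


def parse_csv_line(line):
    """Split one CSV line into a list of fields, honoring quoted commas."""
    return next(csv.reader([line]))


def load_csv_rdd(sc, path):
    """Return an RDD of field lists for the CSV at `path`, without the header.

    Assumptions: `sc` is an existing SparkContext, the first line of the file
    is a header, and no quoted field contains a line break.
    """
    lines = sc.textFile(path)
    header = lines.first()
    # Drops every line identical to the header, not only the first one.
    return lines.filter(lambda line: line != header).map(parse_csv_line)
```

With that in place, something like rdd_test = load_csv_rdd(sc, "./survey.csv").map(lambda row: (row[0], 1)) would pair the first column of each row with 1. Using the csv module instead of line.split(",") keeps commas inside quoted fields intact.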
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
