Why does changing a global variable not work in PySpark?
I have two Python scripts. The main script is:

    from testa import modify, see
    from pyspark import SparkContext

    if __name__ == '__main__':
        sc = SparkContext()
        modify(40)
        rdd = sc.parallelize([i for i in range(100)])
        rdd = rdd.map(see).collect()
        print(rdd)
And testa.py is:

    a = 1

    def modify(p):
        global a
        a = p

    def see(b):
        print(a)
        return a
But the result I get is [1, 1, 1, 1, 1, 1, ...], where 40 was expected instead of 1. I want to know why changing the global variable doesn't take effect inside the RDD.
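For context on what is going on: Spark serializes the `see` function and ships it to separate worker processes. Each worker re-imports `testa`, and at import time the module-level `a` is 1; the `modify(40)` call only mutated the copy of `a` living in the driver process, which the workers never see. One common workaround, sketched below without Spark so it runs standalone, is to capture the value in a closure so it travels with the function when it is pickled (the helper name `make_see` is illustrative, not from the original post; in a real job you could also distribute the value with `sc.broadcast`):

```python
def make_see(value):
    """Build a mapper that carries 'value' with it via its closure."""
    def see(b):
        # 'value' is part of the closure, so it is serialized together
        # with the function and is available on the workers.
        return value
    return see

# Driver-side: bake the desired value into the function before mapping.
see = make_see(40)

# rdd.map(see) would now return 40 on every element; locally:
print(see(0))  # prints 40
```

The key difference from the `global a` approach is that the closure value is frozen into the serialized function at the time `map` is called, whereas a module-level global is re-resolved on each worker from a fresh import of the module.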
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
