Why does changing a global variable not work in PySpark?

I have two Python scripts. The main script is:

from testa import modify, see
from pyspark import SparkContext

if __name__ == '__main__':
    sc = SparkContext()
    modify(40)  # sets testa.a = 40 (in the driver process)
    rdd = sc.parallelize([i for i in range(100)])
    rdd = rdd.map(see).collect()
    print(rdd)

And testa.py is:

a = 1


def modify(p):
    global a
    a = p


def see(b):
    print(a)
    return a

But the result I got is [1, 1, 1, 1, ...], where I expected every element to be 40, not 1. I want to know why changing a global variable doesn't take effect inside the RDD operation.
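What seems to be happening: `modify(40)` only changes `a` in the driver process. Spark serializes `see` with pickle/cloudpickle and ships it to worker processes; because `see` lives in an importable module (`testa`), it is serialized essentially by reference (module name plus function name), not with the current value of `a` baked in. Each worker then performs its own fresh `import testa`, where `a = 1`, so that is what every task sees. A minimal sketch (plain Python, no Spark needed) illustrating that pickling a function stores a reference and the global is resolved at call time, in whatever module state the unpickling process has:

```python
import pickle

a = 1


def see(b):
    # looks up the module-level global 'a' at call time, not at pickle time
    return a


payload = pickle.dumps(see)  # serialized by reference (module + name); the value of 'a' is not stored
a = 40                       # change the global after pickling
restored = pickle.loads(payload)
print(restored(0))           # prints 40: the global is resolved when the function runs
```

In Spark the unpickling happens in a separate worker process whose fresh import of `testa` still has `a = 1`, which is why every task returns 1. Common fixes are to pass the value explicitly (e.g. via `functools.partial` over a hypothetical two-argument `see(value, b)`), or to create a broadcast variable with `sc.broadcast(40)` and read its `.value` inside the mapped function.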



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
