How to add environment variables to an EMR cluster
How can I add environment variables to an EMR cluster?
Currently, I have put them in a .sh file and run that script with script-runner.jar. The script is:
#!/bin/bash
export PYSPARK_PYTHON=/home/hadoop/bin/python
export PYSPARK_DRIVER_PYTHON=/home/hadoop/bin/python
I submitted the script as a step like this:
aws emr add-steps \
--cluster-id j-2AXXXXXXGAPLF \
--steps Type=CUSTOM_JAR,Name="Run a script from S3 with script-runner.jar",ActionOnFailure=CONTINUE,Jar=s3://us-west-2.elasticmapreduce/libs/script-runner/script-runner.jar,Args=[s3://mybucket/my-script.sh]
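One plausible reason a step like this has no lasting effect: a script launched as its own process can only change its own environment, and those exports disappear when the script exits. The sketch below reproduces that behavior locally, without EMR. The temporary file and the variable names `after_run` and `after_source` are illustrative only and are not part of the original setup.

```shell
#!/bin/bash
# Write a throwaway script that only exports a variable,
# mirroring the .sh file from the question.
tmp_script="$(mktemp)"
cat > "$tmp_script" <<'EOF'
#!/bin/bash
export PYSPARK_PYTHON=/home/hadoop/bin/python
EOF

unset PYSPARK_PYTHON

# Running the script starts a child process; its exports end with it.
bash "$tmp_script"
after_run="${PYSPARK_PYTHON:-unset}"

# Sourcing executes the same lines in the current shell, so the value persists here.
. "$tmp_script"
after_source="${PYSPARK_PYTHON:-unset}"

echo "after running:  $after_run"
echo "after sourcing: $after_source"
rm -f "$tmp_script"
```

The first line printed reports `unset` and the second reports the exported path. Processes started later on the cluster (for example by spark-submit) are in the same position as the parent shell after the plain run: they never see the export.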
I have also tried command-runner.jar. Neither approach worked. Can you suggest another way to add environment variables to the cluster remotely (from an EC2 instance)?
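One approach that may fit here, sketched under assumptions: EMR lets you pass configuration classifications, and the `spark-env` classification with a nested `export` classification is the documented way to set variables for Spark. The sketch below only writes such a configuration file. The file name and location are hypothetical, the Python path is copied from the question, and this form is normally supplied when the cluster is created, so it may not apply unchanged to a cluster that is already running.

```shell
#!/bin/bash
# Hypothetical output path; any location the AWS CLI can read would do.
config_file="${TMPDIR:-/tmp}/spark-env-config.json"

# spark-env with a nested "export" classification sets environment
# variables for Spark. Values are taken from the question's script.
cat > "$config_file" <<'EOF'
[
  {
    "Classification": "spark-env",
    "Configurations": [
      {
        "Classification": "export",
        "Properties": {
          "PYSPARK_PYTHON": "/home/hadoop/bin/python",
          "PYSPARK_DRIVER_PYTHON": "/home/hadoop/bin/python"
        }
      }
    ]
  }
]
EOF

# Nothing is sent to AWS here; the file is meant to be passed to
# `aws emr create-cluster` together with the other options a cluster needs.
echo "Wrote $config_file"
echo "Pass it with: --configurations file://$config_file"
```

Because the file is plain JSON, it can be prepared and submitted from any machine with the AWS CLI configured, including an EC2 instance.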
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
