ModuleNotFoundError: No module named 'psycopg2' when a Redshift connection is added to the Glue job
I'm using Glue to load data from Redshift, and the Glue job is failing with a psycopg2 import error.
1) I tried the --additional-python-modules option to install psycopg2-binary. Glue installs the module, as the log shows:
Collecting psycopg2-binary
Downloading psycopg2_binary-2.9.3-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB)
Installing collected packages: psycopg2-binary
Successfully installed psycopg2-binary-2.9.3
But the job then fails with the error below:
redshift.amazonaws.com", port 5439 failed: Connection timed out Is the server running on that host and accepting TCP/IP connections?
2) I tried attaching a Redshift connection to the Glue job. This time it fails with ModuleNotFoundError: No module named 'psycopg2', and the log gives this reason:
Could not fetch URL https://pypi.org/simple/psycopg2-binary/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/psycopg2-binary/ (Caused by SSLError(SSLError(1, '[SSL: UNKNOWN_PROTOCOL] unknown protocol (_ssl.c:1091)'))) - skipping
Could not fetch URL https://pypi.org/simple/pip/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/pip/ (Caused by SSLError(SSLError(1, '[SSL: UNKNOWN_PROTOCOL] unknown protocol (_ssl.c:1091)'))) - skipping
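Both failures may be network reachability problems rather than packaging problems: without the connection the job times out on port 5439 of the cluster, and with the connection it cannot fetch from pypi.org on port 443. One way to narrow this down from inside the job is a plain TCP check. This is only a minimal sketch: can_connect is a hypothetical helper name, and it only shows whether the port can be opened, not whether TLS or authentication would succeed.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False
```

In the job you could call can_connect with your own cluster endpoint and 5439, and with "pypi.org" and 443, once with and once without the connection attached, and compare the results.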
I have also tried other options, such as zipping the psycopg2-binary module, uploading it to S3, and setting that path as the Python library path in Glue, but I still get the module-not-found error. Any help would be appreciated. Thanks in advance.
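On the zip approach: psycopg2 contains compiled C extensions, and Python cannot import compiled extension modules directly from a zip archive, so a zip on the library path is generally only suitable for pure-Python code. As far as I know, --additional-python-modules also accepts an S3 URI that points at a wheel file, which would avoid the call to pypi.org as long as the job can reach S3. The sketch below only builds the job argument. The bucket and prefix are hypothetical, build_default_arguments is a made-up helper name, and the wheel file name is the one pip reported in the log above.

```python
def build_default_arguments(wheel_s3_uri):
    """Return Glue job arguments that install a module from a wheel stored on S3."""
    if not wheel_s3_uri.startswith("s3://") or not wheel_s3_uri.endswith(".whl"):
        raise ValueError("expected an s3:// URI pointing at a .whl file")
    return {"--additional-python-modules": wheel_s3_uri}


# Hypothetical bucket and prefix; replace with wherever you uploaded the wheel.
WHEEL_URI = (
    "s3://my-glue-assets/wheels/"
    "psycopg2_binary-2.9.3-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)

args = build_default_arguments(WHEEL_URI)
print(args)
```

The resulting dictionary is meant to be merged into the job parameters of the Glue job. Whether this works in your setup depends on the Glue version and on the job having a route to S3.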
Solution 1:[1]
I was facing the same issue with a different set of libraries. Even though I had installed the libraries, the import was still throwing the error. I found this code snippet, added it to my code, and it worked for me. You can add it at the start of your script and see if it works for you.
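# Load all the custom modules which are required by the Glue script.
import sys
# Search the Glue installation directory first when resolving imports.
sys.path.insert(0, '/glue/lib/installation')
# Drop already-imported boto modules so that later imports are resolved again
# against the updated path.
keys = [k for k in sys.modules.keys() if 'boto' in k]
for k in keys:
    del sys.modules[k]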
Here is the link to the guide for installing additional Python modules in Glue: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-libraries.html
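To see whether the path change actually makes a module visible, the snippet above can be wrapped in a small helper and followed by an import check. This is a minimal sketch, assuming the same /glue/lib/installation directory as in the answer. prefer_path and is_importable are hypothetical helper names, not part of Glue.

```python
import importlib.util
import sys


def prefer_path(path, purge_substring=None):
    """Move `path` to the front of sys.path and optionally drop cached modules."""
    if path in sys.path:
        sys.path.remove(path)
    sys.path.insert(0, path)
    if purge_substring:
        # Copy the matching names first; sys.modules must not change while iterating.
        for name in [n for n in sys.modules if purge_substring in n]:
            del sys.modules[name]


def is_importable(module_name):
    """Return True if the import system can locate `module_name` (without importing it)."""
    return importlib.util.find_spec(module_name) is not None


prefer_path('/glue/lib/installation', purge_substring='boto')
print('psycopg2 importable:', is_importable('psycopg2'))
```

If the check prints False, the module was never installed where the job looks for it, and changing sys.path alone will not help.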
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
