How can I import external Python libraries in an AWS Glue Python shell job?

I have been trying to import an external Python library in an AWS Glue Python shell job.

  1. I uploaded the .whl file for pyodbc to S3.
  2. I referenced the S3 path in "Python library path" under the additional properties of the Glue job.
  3. I also tried setting the job parameter --extra-py-files to the S3 path of the .whl file.
  4. Whenever I write the line "from pyodbc import pyodbc as db" or just "import pyodbc", it always returns "ModuleNotFoundError: No module named 'pyodbc'".
  5. The logs are shown below:

Processing ./glue-python-libs-cq4p0rs8/pyodbc-4.0.32-cp310-cp310-win_amd64.whl
Installing collected packages: pyodbc
Successfully installed pyodbc-4.0.32

WARNING: The directory '/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.

File "/tmp/glue-python-scripts-g_mt5xzp/Glue-ETL-Dev.py", line 2, in
ModuleNotFoundError: No module named 'pyodbc'
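One detail in the log above that may matter: the installed wheel is tagged win_amd64, which under PEP 427 marks a Windows build, while Glue Python shell jobs run on Linux, so the compiled extension likely cannot be loaded even though pip reports a successful install. A quick way to check a wheel's platform tag from its filename (plain string parsing per the PEP 427 naming convention; the helper name here is my own):

```python
def wheel_platform_tag(filename: str) -> str:
    """Return the platform tag of a wheel filename.

    Wheel names follow PEP 427:
    {dist}-{version}(-{build})?-{python}-{abi}-{platform}.whl
    so the platform tag is the last dash-separated field.
    """
    stem = filename[: -len(".whl")]
    return stem.split("-")[-1]

tag = wheel_platform_tag("pyodbc-4.0.32-cp310-cp310-win_amd64.whl")
print(tag)  # win_amd64 -> a Windows-only build
# A wheel usable on a Linux worker would carry a tag like
# manylinux2014_x86_64 (or "any" for pure-Python packages).
print(tag.startswith(("manylinux", "linux", "any")))  # False
```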

I am downloading the wheel files from here: https://pypi.org/project/pyodbc/#files
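Rather than picking a file by hand from that page, pip itself can fetch a wheel built for a specific target platform via `pip download` with `--platform` and `--only-binary=:all:` (both are real pip options; `--only-binary=:all:` is required when `--platform` is given). A sketch, assuming the Glue worker is Linux x86_64 and the target Python is 3.10 as in the log above; the helper that assembles the command is my own:

```python
def linux_wheel_download_cmd(package: str, py: str = "310") -> list[str]:
    """Build a pip command that downloads a Linux-compatible wheel
    for `package`, instead of whatever wheel the local machine picks."""
    return [
        "pip", "download", package,
        "--only-binary=:all:",                  # wheels only, no sdists
        "--platform", "manylinux2014_x86_64",   # Linux tag, not win_amd64
        "--python-version", py,
        "--dest", ".",
    ]

print(" ".join(linux_wheel_download_cmd("pyodbc")))
```

The resulting .whl can then be uploaded to S3 and referenced from the job as in steps 1-3 above.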

No matter which version of the .whl file I reference in the Glue job, it always throws the same error.

Can anyone enlighten me as to where this is going wrong?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
