Docker-compose and extended Dockerfile: External providers not found
I've deployed Apache Airflow with docker-compose, using an extended image to include my requirements. I can see in the Airflow UI that the Snowflake provider package (apache-airflow-providers-snowflake) is installed, but for some reason I'm getting a DAG import error:
from airflow.providers.snowflake.hooks.snowflake import SnowflakeHook
ModuleNotFoundError: No module named 'airflow.providers.snowflake'
I logged into the webserver container, switched to the airflow user, and successfully imported this package there.
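To compare what each container actually sees, a small check like the one below could be run with the same Python interpreter that Airflow uses, in the scheduler and worker containers as well as the webserver, since DAG files are typically parsed outside the webserver. This is only a sketch: the function name describe_module is made up for illustration, and it reports importability for whichever interpreter runs it, nothing more.

```python
import importlib.util
import sys


def describe_module(name):
    """Report whether `name` is importable by the interpreter running this script."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. `airflow.providers`) is itself
        # missing or has no usable import spec.
        spec = None
    return {
        "module": name,
        "found": spec is not None,
        # File the module would be loaded from, if one is known.
        "origin": getattr(spec, "origin", None),
        # Interpreter that performed the lookup.
        "python": sys.executable,
    }


if __name__ == "__main__":
    print(describe_module("airflow.providers.snowflake"))
```

If the "found" or "python" values differ between containers, the services are probably not all running the same image or the same interpreter.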
Dockerfile:
FROM apache/airflow:2.3.0-python3.9
USER root
COPY requirements.txt .
RUN apt-get update && \
apt-get install -y apt-utils
USER airflow
RUN pip3 install --no-cache-dir -r requirements.txt
Output of docker ps and pip list from the webserver container:
[philip@ip-10-5-51-10 airflow_2]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0516db2653a0 airflow_2_airflow-webserver "/usr/bin/dumb-init …" About an hour ago Up 52 minutes (healthy) 0.0.0.0:8080->8080/tcp, :::8080->8080/tcp airflow_2_airflow-webserver_1
36142cc09a6b airflow_2_airflow-scheduler "/usr/bin/dumb-init …" About an hour ago Up 52 minutes (healthy) 8080/tcp airflow_2_airflow-scheduler_1
c090e86e9279 airflow_2_airflow-triggerer "/usr/bin/dumb-init …" About an hour ago Up 52 minutes (healthy) 8080/tcp airflow_2_airflow-triggerer_1
ade8996d759f airflow_2_airflow-worker "/usr/bin/dumb-init …" About an hour ago Up 52 minutes (healthy) 8080/tcp airflow_2_airflow-worker_1
8915e4fb4ffe airflow_2_flower "/usr/bin/dumb-init …" About an hour ago Up 52 minutes (healthy) 0.0.0.0:5555->5555/tcp, :::5555->5555/tcp, 8080/tcp airflow_2_flower_1
353f2fa5552f postgres:13 "docker-entrypoint.s…" 4 hours ago Up 52 minutes (healthy) 5432/tcp airflow_2_postgres_1
1bafa42ece77 redis:latest "docker-entrypoint.s…" 4 hours ago Up 52 minutes (healthy) 6379/tcp airflow_2_redis_1
408ad15538ce denny/monitor-docker-slack:latest "/monitor-docker-sla…" 12 months ago Up 3 days monitor-docker-slack
[philip@ip-10-5-51-10 airflow_2]$ docker exec -u 0 -it airflow_2_airflow-webserver_1 bash
root@0516db2653a0:/opt/airflow# su - airflow
airflow@0516db2653a0:~$ pip list
Package Version
---------------------------------------- ------------
adal 1.2.7
alembic 1.7.7
amqp 5.1.1
anyio 3.5.0
apache-airflow 2.3.0
apache-airflow-providers-amazon 3.3.0
apache-airflow-providers-celery 2.1.4
apache-airflow-providers-cncf-kubernetes 4.0.1
apache-airflow-providers-docker 2.6.0
apache-airflow-providers-elasticsearch 3.0.3
apache-airflow-providers-ftp 2.1.2
apache-airflow-providers-google 6.8.0
apache-airflow-providers-grpc 2.0.4
apache-airflow-providers-hashicorp 2.2.0
apache-airflow-providers-http 2.1.2
apache-airflow-providers-imap 2.2.3
apache-airflow-providers-microsoft-azure 3.8.0
apache-airflow-providers-mysql 2.2.3
apache-airflow-providers-odbc 2.0.4
apache-airflow-providers-postgres 4.1.0
apache-airflow-providers-redis 2.0.4
apache-airflow-providers-sendgrid 2.0.4
apache-airflow-providers-sftp 2.6.0
apache-airflow-providers-slack 4.2.3
apache-airflow-providers-snowflake 2.6.0
apache-airflow-providers-sqlite 2.1.3
apache-airflow-providers-ssh 2.4.3
apispec 3.3.2
argcomplete 1.12.2
asn1crypto 1.4.0
attrs 20.3.0
Authlib 1.0.1
azure-batch 12.0.0
azure-common 1.1.28
azure-core 1.23.1
azure-cosmos 4.2.0
azure-datalake-store 0.0.52
azure-identity 1.9.0
azure-keyvault-secrets 4.4.0
azure-kusto-data 0.0.45
azure-mgmt-containerinstance 1.5.0
azure-mgmt-core 1.3.0
azure-mgmt-datafactory 1.1.0
azure-mgmt-datalake-nspkg 3.0.1
azure-mgmt-datalake-store 0.5.0
azure-mgmt-nspkg 3.0.2
azure-mgmt-resource 21.0.0
azure-nspkg 3.0.2
azure-storage-blob 12.8.1
azure-storage-common 2.1.0
azure-storage-file 2.1.0
Babel 2.9.0
bcrypt 3.2.0
beautifulsoup4 4.11.1
billiard 3.6.4.0
blinker 1.4
boto3 1.16.59
botocore 1.19.59
cached-property 1.5.2
cachelib 0.6.0
cachetools 4.2.2
catboost 0.25.1
cattrs 1.10.0
celery 5.2.6
certifi 2020.12.5
cffi 1.14.4
chardet 3.0.4
charset-normalizer 2.0.12
click 8.1.2
click-didyoumean 0.3.0
click-plugins 1.1.1
click-repl 0.2.0
clickclick 20.10.2
cloudpickle 2.0.0
colorama 0.4.4
colorlog 4.0.2
commonmark 0.9.1
configparser 3.5.3
connexion 2.13.0
cron-descriptor 1.2.24
croniter 0.3.37
cryptography 3.3.1
cycler 0.11.0
dask 2022.2.0
db-dtypes 1.0.0
decorator 5.1.1
defusedxml 0.6.0
Deprecated 1.2.13
dill 0.3.3
distlib 0.3.4
distributed 2022.2.0
dnspython 2.1.0
docker 5.0.3
docutils 0.16
elasticsearch 7.13.4
elasticsearch-dbapi 0.2.9
elasticsearch-dsl 7.4.0
email-validator 1.1.2
eventlet 0.33.0
filelock 3.6.0
Flask 1.1.2
Flask-Admin 1.5.4
Flask-AppBuilder 3.4.5
Flask-Babel 1.0.0
Flask-Caching 1.10.1
Flask-JWT-Extended 3.25.0
Flask-Login 0.4.1
Flask-OpenID 1.2.5
Flask-Session 0.4.0
Flask-SQLAlchemy 2.4.4
flask-swagger 0.2.13
Flask-WTF 0.14.3
flower 1.0.0
fonttools 4.33.3
fsspec 2022.3.0
funcsigs 1.0.2
future 0.16.0
gevent 21.12.0
google-ads 14.0.0
google-api-core 2.7.2
google-api-python-client 1.12.11
google-auth 2.6.6
google-auth-httplib2 0.1.0
google-auth-oauthlib 0.5.1
google-cloud-aiplatform 1.12.1
google-cloud-appengine-logging 1.1.1
google-cloud-audit-log 0.2.0
google-cloud-automl 2.7.2
google-cloud-bigquery 2.34.3
google-cloud-bigquery-datatransfer 3.6.1
google-cloud-bigquery-storage 2.13.1
google-cloud-bigtable 1.7.1
google-cloud-build 3.8.1
google-cloud-container 1.0.2
google-cloud-core 2.3.0
google-cloud-datacatalog 3.7.1
google-cloud-dataplex 0.2.1
google-cloud-dataproc 4.0.2
google-cloud-dataproc-metastore 1.5.0
google-cloud-dlp 1.0.1
google-cloud-kms 2.11.1
google-cloud-language 1.3.1
google-cloud-logging 3.0.0
google-cloud-memcache 1.3.1
google-cloud-monitoring 2.9.1
google-cloud-orchestration-airflow 1.3.1
google-cloud-os-login 2.6.1
google-cloud-pubsub 2.12.0
google-cloud-redis 2.8.0
google-cloud-resource-manager 1.4.1
google-cloud-secret-manager 1.0.1
google-cloud-spanner 1.19.2
google-cloud-speech 1.3.3
google-cloud-storage 1.44.0
google-cloud-tasks 2.8.1
google-cloud-texttospeech 1.0.2
google-cloud-translate 1.7.1
google-cloud-videointelligence 1.16.2
google-cloud-vision 1.0.1
google-cloud-workflows 1.6.1
google-crc32c 1.3.0
google-resumable-media 2.3.2
googleapis-common-protos 1.56.0
graphviz 0.16
greenlet 1.1.2
grpc-google-iam-v1 0.12.4
grpcio 1.44.0
grpcio-gcp 0.2.2
grpcio-status 1.44.0
gunicorn 20.1.0
h11 0.12.0
HeapDict 1.0.1
httpcore 0.14.7
httplib2 0.19.1
httpx 0.22.0
humanize 4.0.0
hvac 0.11.2
idna 2.10
importlib-metadata 3.10.1
importlib-resources 5.7.1
inflection 0.5.1
iso8601 0.1.13
isodate 0.6.0
itsdangerous 1.1.0
Jinja2 2.10.3
jmespath 0.10.0
joblib 1.0.0
json-merge-patch 0.2
jsonpath-ng 1.5.3
jsonschema 3.2.0
kiwisolver 1.4.2
kombu 5.2.4
kubernetes 23.3.0
lazy-object-proxy 1.4.3
ldap3 2.9.1
locket 1.0.0
lockfile 0.12.2
looker-sdk 22.4.0
lxml 4.8.0
Mako 1.1.4
Markdown 3.3.4
MarkupSafe 1.1.1
marshmallow 3.15.0
marshmallow-enum 1.5.1
marshmallow-oneofschema 3.0.1
marshmallow-sqlalchemy 0.26.1
matplotlib 3.5.2
msal 1.17.0
msal-extensions 0.3.1
msgpack 1.0.3
msrest 0.6.21
msrestazure 0.6.4
mypy-boto3-rds 1.22.0.post1
mypy-boto3-redshift-data 1.22.0.post1
mysql-connector-python 8.0.29
mysqlclient 2.1.0
natsort 7.1.0
nox 2020.12.31
numpy 1.19.5
oauthlib 3.1.0
openapi-spec-validator 0.2.9
oscrypto 1.2.1
packaging 21.3
pandas 0.25.3
pandas-gbq 0.17.4
paramiko 2.10.4
partd 1.2.0
pathspec 0.9.0
pendulum 2.1.2
Pillow 9.1.0
pip 22.0.4
platformdirs 2.5.2
plotly 5.7.0
pluggy 1.0.0
ply 3.11
portalocker 2.4.0
prison 0.2.1
prometheus-client 0.14.1
prompt-toolkit 3.0.29
proto-plus 1.18.1
protobuf 3.20.1
psutil 5.8.0
psycopg2-binary 2.9.3
py 1.11.0
pyarrow 6.0.1
pyasn1 0.4.8
pyasn1-modules 0.2.8
pycparser 2.20
pycryptodomex 3.9.9
pydata-google-auth 1.4.0
Pygments 2.7.4
PyJWT 1.7.1
PyNaCl 1.5.0
pyodbc 4.0.32
pyOpenSSL 19.1.0
pyparsing 2.4.7
pyrsistent 0.17.3
pysftp 0.2.9
python-daemon 2.3.0
python-dateutil 2.8.1
python-editor 1.0.4
python-http-client 3.3.7
python-ldap 3.4.0
python-nvd3 0.15.0
python-slugify 6.1.1
python3-openid 3.2.0
pytz 2020.5
pytzdata 2020.1
PyYAML 5.4.1
redis 3.5.3
redshift-connector 2.0.906
requests 2.27.1
requests-oauthlib 1.3.1
rfc3986 1.5.0
rich 12.3.0
rsa 4.8
s3transfer 0.3.4
scikit-learn 0.24.1
scipy 1.7.3
scramp 1.4.1
sendgrid 6.9.7
setproctitle 1.2.1
setuptools 57.5.0
six 1.15.0
sklearn 0.0
slack-sdk 3.15.2
slackclient 1.0.7
sniffio 1.2.0
snowflake-connector-python 2.7.7
snowflake-sqlalchemy 1.3.4
sortedcontainers 2.4.0
soupsieve 2.3.2.post1
SQLAlchemy 1.4.9
sqlalchemy-bigquery 1.4.3
SQLAlchemy-JSONField 1.0.0
sqlalchemy-redshift 0.8.9
SQLAlchemy-Utils 0.38.2
sqlparse 0.4.1
sshtunnel 0.4.0
starkbank-ecdsa 2.0.3
statsd 3.3.0
swagger-ui-bundle 0.0.8
tabulate 0.8.7
tblib 1.7.0
tenacity 8.0.1
termcolor 1.1.0
text-unidecode 1.3
threadpoolctl 3.1.0
thrift 0.13.0
toolz 0.11.2
tornado 6.1
typing_extensions 4.2.0
tzlocal 1.5.1
unicodecsv 0.14.1
uritemplate 3.0.1
urllib3 1.25.11
vine 5.0.0
virtualenv 20.14.1
watchtower 2.0.1
wcwidth 0.2.5
websocket-client 0.59.0
Werkzeug 1.0.1
wheel 0.37.1
wrapt 1.14.0
WTForms 2.3.3
zict 2.1.0
zipp 3.4.0
zope.deprecation 4.4.0
zope.event 4.5.0
zope.interface 5.4.0
Solution 1:[1]
Hi, I think this provider is not installed. You can follow these steps to confirm whether it is:
First, open a terminal and run docker ps.
Then copy the container ID of the webserver.
In the example this answer was written from, the container ID started with 39d11 (yours will be different). After copying the container ID, run docker exec -it <container-id> /bin/bash to open a shell in that container.
Once you are inside the Airflow container, go to the Airflow dags folder and run pip list. If the provider is missing from the list, you have to install it there again.
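As an alternative to reading through the whole pip list output, the installed version of a single distribution can be looked up from Python inside the container. The sketch below uses the standard library's importlib.metadata (available from Python 3.8); the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version of `distribution`, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        # The interpreter running this script has no such distribution installed.
        return None


if __name__ == "__main__":
    name = "apache-airflow-providers-snowflake"
    print(name, installed_version(name))
```

A result of None means the package is not installed for the interpreter that ran the script, which may differ from the one a different container or user uses.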
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Kaolin |
