Azure Databricks - Python SSLCertVerificationError - unable to get local issuer certificate
I have an Azure Databricks notebook that gets a list of CSV files from a public government website and downloads them on a monthly basis or so. It's automating a process that was manual beforehand. The automation was working until recently. This is an SSL error, so it's not some sort of scraping issue.
The failing code is straightforward:
import requests
headers = {'Connection': 'Close'}
GovHTML = requests.get(GovURL, headers=headers)
It blows up with this (I've slightly obscured things):
SSLError: HTTPSConnectionPool(host='somegovtwebsite.gov', port=443): Max retries exceeded with url: /PublicReport/default.aspx (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)')))
However, when I browse to that URL in a browser and click the cert info, everything looks fine (again, the particular site is obscured):
Issued to: *.somegovtwebsite.gov
Issued by: DigiCert TLS RSA SHA256 2020 CA1
Valid from 3/20/2022 to 3/24/2023
Given the dates, it seems likely that the certificate that Databricks is seeing is one that expired roughly at the beginning of the month. However, the website has a fresh cert that I've shown above.
I have temporarily worked around the issue by changing the request to this:
GovHTML = requests.get(GovURL, headers=headers, verify=False)
How do I get Databricks/Python to see the new certificate and prevent the issue in the future? (In a non-crappy way)
Solution 1:[1]
One solution is to import a custom CA certificate on the cluster, so that Python can validate the site's full certificate chain. Doing so should resolve the error you describe.
To import a CA certificate, you have to create an init script that adds the entire CA chain and sets the REQUESTS_CA_BUNDLE environment variable.
After that, attach the init script to the cluster and restart the cluster.
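As a rough illustration of those steps, here is a minimal sketch that generates such an init script from Python. The certificate file name (myca.crt), the system bundle path, the spark-env.sh location, and the helper names build_init_script and write_init_script are all assumptions for illustration, not something confirmed by the question; check them against the Databricks documentation for your runtime.

```python
from pathlib import Path

# Assumed values: adjust for your cluster image.
CERT_NAME = "myca.crt"  # hypothetical file name for the imported chain
SYSTEM_BUNDLE = "/etc/ssl/certs/ca-certificates.crt"  # usual Debian/Ubuntu bundle
SPARK_ENV = "/databricks/spark/conf/spark-env.sh"  # assumed env file on the cluster


def build_init_script(pem_chain: str, cert_name: str = CERT_NAME) -> str:
    """Return bash text that installs pem_chain and exports REQUESTS_CA_BUNDLE."""
    lines = [
        "#!/bin/bash",
        "set -e",
        # Drop the full chain (intermediate + root) where Ubuntu looks for extra CAs.
        f"cat << 'EOF' > /usr/local/share/ca-certificates/{cert_name}",
        pem_chain.strip(),
        "EOF",
        # Rebuild the system bundle so it includes the new chain.
        "update-ca-certificates",
        # Make requests use the system bundle instead of its own certifi copy.
        f'echo "export REQUESTS_CA_BUNDLE={SYSTEM_BUNDLE}" >> {SPARK_ENV}',
    ]
    return "\n".join(lines) + "\n"


def write_init_script(pem_chain: str, target: str) -> Path:
    """Write the generated script to target and return its path."""
    path = Path(target)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(build_init_script(pem_chain))
    return path
```

You would call write_init_script with the PEM text of the issuing chain (for example exported from the browser's certificate viewer) and a location the cluster can read, then reference that file in the cluster's init script settings.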
Refer to the documentation for an in-depth explanation and examples.
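On the notebook side, requests accepts a CA bundle path in its verify argument, so the verify=False workaround can be replaced once the bundle is in place. The sketch below is only an illustration: pick_ca_bundle is a hypothetical helper, and the fallback bundle path is an assumption about the cluster image.

```python
import os

# Assumed location of the system CA bundle on the cluster.
SYSTEM_BUNDLE = "/etc/ssl/certs/ca-certificates.crt"


def pick_ca_bundle(environ=os.environ):
    """Return a bundle path for requests' verify=, or True to keep the default store."""
    candidate = environ.get("REQUESTS_CA_BUNDLE", SYSTEM_BUNDLE)
    # Only hand requests a path that actually exists; otherwise keep default verification.
    return candidate if os.path.isfile(candidate) else True


# In the notebook (GovURL and headers as defined in the question):
# GovHTML = requests.get(GovURL, headers=headers, verify=pick_ca_bundle())
```

Either way verification stays enabled, which is the point of avoiding verify=False.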
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | MohitGanorkar-MT |
