Azure Databricks Pyspark - how to wake paused serverless Azure SQL DB
I've got a PySpark notebook that connects fine to an Azure SQL Database while the database is running. The problem is when the serverless database auto-pauses. I know I can work around it with a Data Factory pipeline that wakes the database with a Lookup activity before running the notebook, but I'd rather handle it in the notebook itself.
Here's what I'm trying. What winds up happening is that I get the "Timed out, retrying" message once, but it dies on the exception anyway and never retries. If it were retrying, it would say "Timed out, retrying" more than once. If the database is online, it succeeds. Maybe I'm doing the retry logic wrong?
import time

df = None
errorCount = 0
maxErrorCount = 4

while errorCount < maxErrorCount:
    print("Connecting...")
    try:
        df = spark.read.format("jdbc") \
            .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
            .option("url", connString) \
            .option("user", userName) \
            .option("password", password) \
            .option("dbtable", query) \
            .load()
        errorCount = maxErrorCount  # success: exit the loop
    except Exception as e:
        df = None
        errorCount += 1
        if errorCount < maxErrorCount:
            print("Timed out, retrying")
            time.sleep(30)
        else:
            raise e

if df is None:
    # raise needs an exception instance, not a bare string
    raise Exception("Could not connect. Unknown error.")
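As an aside, the retry logic in the question can be factored into a small helper that is easier to reason about and test in isolation. This is a minimal sketch under my own naming, not code from the question: `with_retries` and the injectable `sleep` parameter are illustrative, and in the notebook the `action` argument would wrap the JDBC load.

```python
import time


def with_retries(action, max_attempts=4, delay_seconds=30, sleep=time.sleep):
    """Call `action` until it succeeds or attempts run out.

    Re-raises the last exception after the final failed attempt.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: propagate the last error
            print(f"Attempt {attempt} failed, retrying in {delay_seconds}s")
            sleep(delay_seconds)


# In the notebook this would be used roughly like:
# df = with_retries(lambda: spark.read.format("jdbc")
#                                .option("url", connString)
#                                .option("dbtable", query)
#                                .load())
```

Passing `sleep` as a parameter keeps the 30-second waits out of unit tests, and `raise` with no argument preserves the original traceback.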
Solution 1:[1]
You can configure auto-pause for a serverless Azure SQL Database in the Azure Portal; the database pauses automatically after the configured delay if there is no activity.
Once the database is paused, it resumes automatically under any of the following conditions:
- A database connection
- Database export or copy
- Viewing auditing records
- Viewing or applying performance recommendations
- Vulnerability assessment
- Modifying or viewing data masking rules
- Viewing the state of transparent data encryption
- Modifying the serverless configuration, such as max vCores, min vCores, or the auto-pause delay
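Because a plain database connection is enough to trigger a resume, the notebook can "poke" the database with a throwaway connection and wait for it to come online before the real JDBC load. The sketch below is illustrative, not from the answer: `wake_database` is a hypothetical helper, and the injected `connect` callable would wrap something cheap such as a `SELECT 1` executed over JDBC or pyodbc.

```python
import time


def wake_database(connect, max_wait_seconds=120, poll_seconds=10, sleep=time.sleep):
    """Attempt throwaway connections until one succeeds.

    The first attempt signals the serverless database to resume; subsequent
    attempts poll until it is online. `connect` is any callable that opens a
    connection and raises while the database is still paused. Returns True
    once a connection succeeds, False if max_wait_seconds elapses first.
    """
    waited = 0
    while True:
        try:
            connect()
            return True
        except Exception:
            if waited >= max_wait_seconds:
                return False
            sleep(poll_seconds)
            waited += poll_seconds


# Hypothetical usage in the notebook, before the real spark.read:
# wake_database(lambda: spark.read.format("jdbc")
#                            .option("url", connString)
#                            .option("query", "SELECT 1")
#                            .load())
```

Separating the wake-up poll from the actual data load also keeps the main read's retry count small, since by the time it runs the database should already be online.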
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | UtkarshPal-MT |

