How to capture a Databricks notebook failure (cluster timeout) in a Data Factory pipeline, passing the error through a usp to Azure SQL DB

Data Factory connects to Databricks and calls a notebook. When the notebook completes, one of two paths runs:


  • Upon success > writes to the pipeline log
  • Upon failure > writes to the pipeline error log

This works fine and picks up errors if there is a problem with the data source.

However, the notebook can also fail because it loses its connection to Databricks, by which I mean the cluster itself drops the connection.


When that happens, the Data Factory pipeline fails without ever reaching the activity Stored Procedure: SP_Pipeline_Error_Log

  • a stored procedure that updates an error table with parameters passed in from the pipeline
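To make the logging step concrete, here is a minimal sketch of how the pipeline's error details might be written to Azure SQL DB through that stored procedure. The parameter names of SP_Pipeline_Error_Log, the column sizes, and the connection details are all assumptions for illustration, not the asker's actual schema:

```python
# Hypothetical sketch of calling SP_Pipeline_Error_Log from Python.
# The parameter list (@PipelineName, @ActivityName, @ErrorMessage, @LoggedAtUtc)
# is an assumption; substitute your stored procedure's real signature.

from datetime import datetime, timezone

def build_error_log_call(pipeline_name, activity_name, error_message):
    """Build a parameterised EXEC statement and its values for the stored proc."""
    sql = ("EXEC dbo.SP_Pipeline_Error_Log "
           "@PipelineName=?, @ActivityName=?, @ErrorMessage=?, @LoggedAtUtc=?")
    params = (
        pipeline_name,
        activity_name,
        # Truncate long Spark stack traces so they fit a typical NVARCHAR(4000) column.
        error_message[:4000],
        datetime.now(timezone.utc),
    )
    return sql, params

# With pyodbc, executing it would look roughly like:
#   import pyodbc
#   conn = pyodbc.connect(connection_string)
#   sql, params = build_error_log_call("MyPipeline", "Databricks Notebook", err)
#   conn.cursor().execute(sql, params)
#   conn.commit()
```

Inside the pipeline itself, the usual way to feed the `@ErrorMessage` parameter is an ADF expression such as `@activity('Databricks Notebook').Error.Message` on the Stored Procedure activity, reached via an "Upon Failure" dependency.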

Error from the Databricks notebook: “The Spark driver has stopped unexpectedly and is restarting. Your notebook will be automatically reattached.”

From the Databricks cluster log I can see that the Data Factory pipeline failure happens at around the same time the cluster connection drops.


How can I capture the Databricks connection failure in my SP_Pipeline_Error_Log?

How can I prevent this error from happening? I've read online that it can be caused by driver memory pressure from cached data.

I can provide more detail if needed.



Solution 1:[1]

As the documentation notes, "Pipeline result is success if and only if all nodes evaluated succeed." Here are some known errors and their troubleshooting docs.

You can check out this external document covering a similar scenario. For alternative methods of capturing errors, read Get Any Azure Data Factory Pipeline Activity Error details with Azure Functions. You can also share your feedback at the official ADF forum.
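The Azure Functions approach referenced above boils down to querying the ADF activity runs for a failed pipeline run and forwarding the error details to the stored procedure, which works even when the pipeline aborts before its own error-logging activity fires. Below is a minimal sketch of the extraction step; the payload shape mirrors the ADF REST API's activity-run objects, but field names should be verified against your API version:

```python
# Hypothetical sketch: pull failure details out of ADF activity-run records,
# e.g. as returned by the "Query Activity Runs" REST call. The field names
# (status, activityName, error.errorCode, error.message) are assumptions to
# verify against your ADF REST API version.

def extract_failures(activity_runs):
    """Return (activityName, errorCode, message) for each failed activity run."""
    failures = []
    for run in activity_runs:
        if run.get("status") == "Failed":
            error = run.get("error") or {}
            failures.append((
                run.get("activityName", "unknown"),
                error.get("errorCode", ""),
                error.get("message", ""),
            ))
    return failures

# Example payload with one failed and one successful activity run:
sample = [
    {"activityName": "Databricks Notebook", "status": "Failed",
     "error": {"errorCode": "3204",
               "message": "The Spark driver has stopped unexpectedly..."}},
    {"activityName": "Copy Data", "status": "Succeeded", "error": None},
]
```

Each tuple returned by `extract_failures` can then be passed straight into SP_Pipeline_Error_Log, so cluster-level failures reach the same error table as data-source failures.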

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: KarthikBhyresh-MT