How to return data from an Azure Databricks notebook in Azure Data Factory
I have a requirement where I need to transform data in Azure Databricks and then return the transformed data. Below is sample notebook code where I am trying to return some JSON.
from pyspark.sql.functions import *
from pyspark.sql.types import *
import json
import pandas as pd
# Define a dictionary containing ICC rankings
rankings = {'test': ['India', 'South Africa', 'England',
                     'New Zealand', 'Australia'],
            'odi': ['England', 'India', 'New Zealand',
                    'South Africa', 'Pakistan'],
            't20': ['Pakistan', 'India', 'Australia',
                    'England', 'New Zealand']}
# Convert the dictionary into a pandas DataFrame
rankings_pd = pd.DataFrame(rankings)
# Rename the columns
rankings_pd.rename(columns = {'test':'TEST'}, inplace = True)
rankings_pd.rename(columns = {'odi':'ODI'}, inplace = True)
rankings_pd.rename(columns = {'t20':'twenty-20'}, inplace = True)
# Return the DataFrame to the caller as a JSON string
#print(rankings_pd.to_json())
dbutils.notebook.exit(rankings_pd.to_json())
To achieve this, I created a job on a cluster for this notebook and then built a custom connector following this article: https://medium.com/@poojaanilshinde/create-azure-logic-apps-custom-connector-for-azure-databricks-e51f4524ab27. Using the connector with the API endpoints '/2.1/jobs/run-now' and then '/2.1/jobs/runs/get-output' in an Azure Logic App, I am able to get the return value, but sometimes, even after the job has executed successfully, I only get the status as running with no output. I need to get the output once the job has completed the transformation successfully. Please suggest a better way to do this, or let me know if I am missing anything.
Solution 1:[1]
It looks like dbutils.notebook.exit() only accepts a string, so you can return the value as a JSON string and convert it to a JSON object in Data Factory or the Logic App. See https://docs.microsoft.com/en-us/azure/databricks/dev-tools/databricks-utils#--notebook-utility-dbutilsnotebook
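A "running" status with no output may simply mean that '/2.1/jobs/runs/get-output' was called before the run finished. Below is a minimal sketch of the caller side, assuming the response carries the run state under metadata.state (life_cycle_state, result_state) and the exit string under notebook_output.result. The helper name wait_for_notebook_output and the injected fetch_output callable are hypothetical, introduced only for illustration. It polls until the run reaches a terminal state and only then parses the JSON string.

```python
import json
import time

# Life cycle states after which the run will not change any more
# (assumed from the Jobs API 2.1 run state model).
TERMINAL_STATES = {"TERMINATED", "SKIPPED", "INTERNAL_ERROR"}


def wait_for_notebook_output(fetch_output, run_id, poll_seconds=10, timeout_seconds=600):
    """Poll until the run is finished, then return the parsed notebook output.

    fetch_output is a hypothetical callable that takes a run_id and returns
    the decoded JSON body of GET /2.1/jobs/runs/get-output as a dict.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = fetch_output(run_id)
        state = response.get("metadata", {}).get("state", {})

        if state.get("life_cycle_state") in TERMINAL_STATES:
            if state.get("result_state") != "SUCCESS":
                raise RuntimeError(f"Run {run_id} did not succeed: {state}")
            # dbutils.notebook.exit() returns a string, so decode it here.
            return json.loads(response["notebook_output"]["result"])

        if time.monotonic() >= deadline:
            raise TimeoutError(f"Run {run_id} still not finished: {state}")
        time.sleep(poll_seconds)
```

In a Logic App the same idea would be an "Until" loop around the get-output call that exits when life_cycle_state is terminal, followed by a "Parse JSON" step on the result string. In Python, fetch_output could wrap an HTTP GET to the workspace URL with a bearer token; both of those details are left out here because they depend on your setup.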
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | blur-odbu |
