Is there any way to call a specific value or dataframe from another notebook in Databricks?

I am trying to access a specific table from one notebook using another in Databricks.

I use dbutils.notebook.run('notebook_name', 60, parameters) inside a for loop, but I do not know how to get that specific table or value from the first notebook.



Solution 1:[1]

To implement it correctly, you need to understand how things work:

%run is a separate directive that must be placed in its own notebook cell; you can't mix it with Python code, and it can't accept the notebook name as a variable. What %run does is evaluate the code from the specified notebook in the context of the current Spark session, so everything defined in that notebook (variables, functions, etc.) is available in the caller notebook.

dbutils.notebook.run is a function that takes a notebook path plus parameters and executes that notebook as a separate job on the current cluster. Because it runs as a separate job, it doesn't share context with the current notebook, and nothing defined in it is available in the caller notebook (you can return a simple string as the execution result, but it has a relatively small maximum length). Another drawback of dbutils.notebook.run is that scheduling the job takes several seconds, even if the code is very simple.

How can you implement what you need?
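If the caller only needs a small value rather than a whole table, the string result mentioned above may be enough. The following is a minimal sketch, assuming the called notebook ends by passing a JSON string to dbutils.notebook.exit; the helper name run_and_parse and the keys row_count and status are hypothetical, chosen only for illustration.

```python
import json

# Called notebook side (its last statement; runs only inside Databricks).
# The keys below are hypothetical examples:
#   dbutils.notebook.exit(json.dumps({"row_count": 1000, "status": "ok"}))


def run_and_parse(dbutils, notebook_path, timeout_seconds, arguments):
    """Run a notebook and decode the JSON string it returned on exit.

    dbutils is passed in explicitly so the helper can also be exercised
    outside a notebook with a stand-in object.
    """
    raw = dbutils.notebook.run(notebook_path, timeout_seconds, arguments)
    return json.loads(raw)
```

In the caller notebook this could be used as result = run_and_parse(dbutils, ntbk_path, 60, {"n": "1000"}), after which result["row_count"] is an ordinary Python value. Because the returned string has a limited length, this suits small values only, not the data itself.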

If you use dbutils.notebook.run, the called notebook can register a temp view, and the caller notebook can read data from it (examples are adapted from this demo).

Called notebook (Code1; it requires two parameters: name for the view name and n for the number of entries to generate):

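# Parameters arrive as widget values, which are always strings
name = dbutils.widgets.get("name")
n = int(dbutils.widgets.get("n"))
# Generate n rows and expose them under the requested view name
df = spark.range(0, n)
df.createOrReplaceTempView(name)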

Caller notebook (let's call it main):

# path and ntbk_path (the path to Code1) are assumed to be defined earlier
if "dataset" in path:
  view_name = "some_name"
  dbutils.notebook.run(ntbk_path, 300, {'name': view_name, 'n': "1000"})
  df = spark.sql(f"select * from {view_name}")
  # ... work with data
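Since the question runs the notebook in a for loop, the same pattern can be wrapped in a small helper. This is only a sketch under assumptions: load_views is a hypothetical name, each parameter set is assumed to carry its own distinct "name" so that the views do not overwrite each other, and dbutils and spark are passed in explicitly rather than taken from the notebook globals.

```python
def load_views(dbutils, spark, notebook_path, param_sets, timeout_seconds=300):
    """Run the called notebook once per parameter set and collect its views.

    Returns a dict mapping each view name to the object returned by
    spark.table(view_name).
    """
    frames = {}
    for params in param_sets:
        # The called notebook registers a temp view under params["name"]
        dbutils.notebook.run(notebook_path, timeout_seconds, params)
        # Read the view back in the caller
        frames[params["name"]] = spark.table(params["name"])
    return frames
```

A call such as load_views(dbutils, spark, ntbk_path, [{"name": "v1", "n": "10"}, {"name": "v2", "n": "20"}]) would then give one dataframe per view. Keep in mind that every iteration pays the job scheduling delay described earlier.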

It's even possible to do something similar with %run, but it requires a bit of "magic". It relies on the fact that you can pass arguments to the called notebook using $arg_name="value", and you can even refer to values specified in widgets. In any case, the check of the value happens in the called notebook. The called notebook could look as follows:

flag = dbutils.widgets.get("generate_data")
dataframe = None
if flag == "true":
  dataframe = ...  # create the dataframe here
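On the caller side, the flow might look like the sketch below. The notebook path ./Code1 is an assumption (reusing the name from the earlier example), and row_count_or_none is a hypothetical helper that only illustrates checking for None before using the result.

```python
# Cell 1 of the caller notebook. %run must sit alone in its own cell;
# the path ./Code1 is an assumed location for the called notebook.
#
#   %run ./Code1 $generate_data="true"

# Cell 2: ordinary Python. After %run, `dataframe` is defined in this
# notebook because the called code ran in this notebook's context. It
# stays None unless the widget value was "true", so check before use.
def row_count_or_none(dataframe):
    """Return the row count, or None when no data was created."""
    return None if dataframe is None else dataframe.count()
```

In the notebook, row_count_or_none(dataframe) in the second cell returns None when the called notebook skipped data generation.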

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 tomerar