Databricks init script is failing to install packages but reporting as "Succeeded" regardless?

I have been attempting to set up init scripts on Databricks, so I can install all of my Python libraries and keep the environment controlled.

I tried yesterday using the init script shown below:

# Write the init script to DBFS; the final True overwrites any existing file
dbutils.fs.put("/DA/Temp/ClusterTest/python_requirements_test2.sh","""
#!/bin/bash
pip install pyodbc==4.0.32
pip install zeep==4.1.0
""", True)

This was successful!

But after trialling some other techniques and coming back to this method, the init script is no longer installing any of the libraries.

Can anyone shed some light on this?

Below is the 'Event Log' entry for the init script, showing "Status: SUCCEEDED" even though no libraries are actually installed.
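One hedged possibility worth ruling out before looking at the answers: on some Databricks Runtime versions, the bare pip that an init script picks up is not necessarily the pip of the cluster's notebook Python environment, and calling it by the absolute path /databricks/python/bin/pip is a commonly recommended workaround. A minimal sketch of the same script with only that change (path, packages, and versions are unchanged from the question):

dbutils.fs.put("/DA/Temp/ClusterTest/python_requirements_test2.sh","""
#!/bin/bash
# Install into the cluster's Python environment explicitly, rather than
# whichever pip happens to be first on PATH when the init script runs
/databricks/python/bin/pip install pyodbc==4.0.32
/databricks/python/bin/pip install zeep==4.1.0
""", True)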

Solution 1:[1]

When you install a library on a cluster, a notebook already attached to that cluster will not immediately see the new library. You must first detach and then reattach the notebook to the cluster.

To view the libraries installed on a cluster:

  • Click Compute in the sidebar.
  • Click the cluster name.
  • Click the Libraries tab. For each library, the tab displays the name and version, type, install status, and, if uploaded, the source file.
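
Note that packages installed by an init script will not show up on that Libraries tab (see Solution 2 below). A quick way to check whether they actually reached the cluster's Python environment is to query it directly from a notebook attached to the cluster; a minimal sketch, using only the package names from the question:

import importlib.metadata

# Report the installed version of each package the init script should have installed
for pkg in ("pyodbc", "zeep"):
    try:
        print(pkg, importlib.metadata.version(pkg))
    except importlib.metadata.PackageNotFoundError:
        print(pkg, "NOT installed")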

Solution 2:[2]

After a conversation with the Databricks support team, it turns out the Clusters UI intentionally does not show packages installed via an init script. The packages are installed; their absence from the UI does not mean the installation failed.

I have asked the support team to update their documentation on this, as it was not clear that this was intended behaviour.
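
Independently of the UI, the init script's own stdout/stderr is the most direct evidence of whether pip succeeded. If cluster log delivery is enabled, Databricks writes init script output under <destination>/<cluster-id>/init_scripts/. A minimal sketch for browsing those logs from a notebook (the log destination and cluster ID below are hypothetical placeholders; substitute your own):

# Hypothetical log delivery destination and cluster ID -- replace with your own
log_dir = "dbfs:/cluster-logs/0123-456789-abcdefgh/init_scripts"

# Each init script run leaves *.stdout.log and *.stderr.log files here
for entry in dbutils.fs.ls(log_dir):
    print(entry.path)

# Then inspect a stderr log for pip errors, for example:
# print(dbutils.fs.head(log_dir + "/<timestamp>_<node-ip>_python_requirements_test2.sh.stderr.log"))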

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

[1] Solution 1: UtkarshPal-MT
[2] Solution 2: Wmerrick3