How to get reference to AzureML Workspace Class in scoring script?
My scoring function needs to read an Azure ML registered Dataset, which requires a reference to the AzureML Workspace object. When I create the Workspace in the init() function of the scoring script, the deployment fails with the following error:
"code": "ScoreInitRestart",
"message": "Your scoring file's init() function restarts frequently. You can address the error by increasing the value of memory_gb in deployment_config."
On debugging, the actual issue turns out to be that init() is waiting on an interactive login prompt:
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code [REDACTED] to authenticate.
How can I resolve this issue without exposing Service Principal Credentials in the scoring script?
Solution 1:[1]
I found a workaround to reference the workspace in the scoring script. Below is a code snippet showing how to do it.
My deploy script looks like this:
from azureml.core import Environment
from azureml.core.model import InferenceConfig

# Add Python dependencies for the models
scoringenv = Environment.from_conda_specification(
    name="scoringenv",
    file_path="config_files/scoring_env.yml"
)

# Create a dictionary to set up the env variables
# (tenant_id, client_secret, etc. were read from the key vault beforehand)
env_variables = {
    'tenant_id': tenant_id,
    'subscription_id': subscription_id,
    'resource_group': resource_group,
    'client_id': client_id,
    'client_secret': client_secret
}
scoringenv.environment_variables = env_variables

# Configure the scoring environment
inference_config = InferenceConfig(
    entry_script='score.py',
    source_directory='scripts/',
    environment=scoringenv
)
What I am doing here is creating an image with the Python dependencies (from scoring_env.yml) and passing a dictionary of the secrets as environment variables. I have the secrets stored in the key vault. You may define and pass other variables the same way; note that os.environ returns them as strings in the scoring script.
Now, in my score.py, I read these environment variables in init() like this:
import os

# Read the values injected by the deploy script
tenant_id = os.environ.get('tenant_id')
client_id = os.environ.get('client_id')
client_secret = os.environ.get('client_secret')
subscription_id = os.environ.get('subscription_id')
resource_group = os.environ.get('resource_group')
Once you have these variables, you may create a workspace object using Service Principal authentication like @Anders Swanson mentioned in his reply.
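To make that last step concrete, here is a minimal sketch of how init() could turn those environment variables into a Workspace. It is an illustration under assumptions, not the author's actual code: the helper names read_sp_settings and get_workspace are hypothetical, and the workspace_name variable is an assumed addition, since the deploy script above does not pass one but the Workspace constructor needs it.

```python
import os

# Names passed by the deploy script above; 'workspace_name' is an assumed
# extra variable that the original dictionary does not include.
REQUIRED_KEYS = (
    "tenant_id", "client_id", "client_secret",
    "subscription_id", "resource_group", "workspace_name",
)


def read_sp_settings(environ=os.environ):
    """Collect the service principal settings, failing fast if any is missing."""
    missing = [key for key in REQUIRED_KEYS if not environ.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: environ[key] for key in REQUIRED_KEYS}


def get_workspace(settings):
    """Build a Workspace without triggering the interactive device login."""
    # Imported lazily so read_sp_settings can be used without azureml-core.
    from azureml.core import Workspace
    from azureml.core.authentication import ServicePrincipalAuthentication

    auth = ServicePrincipalAuthentication(
        tenant_id=settings["tenant_id"],
        service_principal_id=settings["client_id"],
        service_principal_password=settings["client_secret"],
    )
    return Workspace(
        subscription_id=settings["subscription_id"],
        resource_group=settings["resource_group"],
        workspace_name=settings["workspace_name"],
        auth=auth,
    )


def init():
    """Scoring entry point: resolve the workspace once at start-up."""
    global ws
    ws = get_workspace(read_sp_settings())
```

Failing fast on a missing variable should surface a readable error in the container logs instead of a silent fallback to another authentication method.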
Another way to resolve this may be by using managed identities for AKS. I did not explore that option.
Hope this helps! Please let me know if you found a better way of solving this.
Thanks!
Solution 2:[2]
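Does your score.py call Workspace.get() with interactive authentication, either explicitly via auth=InteractiveLoginAuthentication() or implicitly by passing no auth at all? The device login prompt in your logs suggests so, and nobody can complete that prompt inside a deployed container. You should swap it for ServicePrincipalAuthentication (docs), to which you pass your credentials, ideally through environment variables.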
import os
from azureml.core import Workspace
from azureml.core.authentication import ServicePrincipalAuthentication

# Read the service principal secret from the environment
svc_pr_password = os.environ.get("AZUREML_PASSWORD")

svc_pr = ServicePrincipalAuthentication(
    tenant_id="my-tenant-id",
    service_principal_id="my-application-id",
    service_principal_password=svc_pr_password)

ws = Workspace(
    subscription_id="my-subscription-id",
    resource_group="my-ml-rg",
    workspace_name="my-ml-workspace",
    auth=svc_pr
)

print("Found workspace {} at location {}".format(ws.name, ws.location))
Solution 3:[3]
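You can get the workspace object directly from your run. Note that this only works when the script executes inside a submitted run (for example a pipeline step); in a deployed web service there is no run context to read from.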
from azureml.core.run import Run
ws = Run.get_context().experiment.workspace
Solution 4:[4]
I came across the same challenge. As you mention AML Datasets, I assume an AML Batch Endpoint is suitable for your scenario. The scoring script for a batch endpoint is meant to receive a list of files as input. When invoking the batch endpoint, you can pass (among other things) AML Datasets, since an endpoint is deployed in the context of an AML workspace. Have a look at this.
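For illustration, a batch scoring script generally has the shape sketched below: an init() called once per worker and a run() that receives a list of file paths. This is a hedged sketch rather than a reference implementation; the line counting is a stand-in for real model inference, and the "name,count" output format is an assumption made for the example.

```python
import os
from typing import List


def init():
    """Called once per worker; a real script would load the model here."""
    pass


def run(mini_batch: List[str]) -> List[str]:
    """Receive a list of file paths and return one result row per file."""
    results = []
    for file_path in mini_batch:
        # Placeholder "scoring": count the lines in each input file.
        with open(file_path, "r", encoding="utf-8") as handle:
            line_count = sum(1 for _ in handle)
        results.append("{},{}".format(os.path.basename(file_path), line_count))
    return results
```

Because the files are handed to run() by the endpoint, the script itself does not need a Workspace reference or any credentials.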
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Anders Swanson |
| Solution 3 | Ninja_coder |
| Solution 4 | ouflak |
