Can't create a TPU node/VM since March 4
Since around March 4, I have suddenly been unable to create a Cloud TPU node.
When I try to create a TPU node/VM via the GUI, it crashes as soon as I choose a TPU type, in any region. I get tons of JS errors in the console:
ERROR TypeError: Cannot read properties of undefined (reading 'CP-CLOUD-TPU-V3')
m=b:90 ERROR TypeError: Cannot read properties of undefined (reading 'CP-CLOUD-TPU-V3')
m=b:90 ERROR TypeError: Cannot read properties of undefined (reading 'CP-CLOUD-TPU-V3')
m=b:90 ERROR TypeError: Cannot read properties of undefined (reading 'CP-CLOUD-TPU-V3')
m=b:90 ERROR TypeError: Cannot read properties of undefined (reading 'CP-CLOUD-TPU-V3')
Attempting to create a TPU VM from Cloud Shell fails with error code 13 for any combination of zone and version:
gcloud alpha compute tpus tpu-vm create testnode --zone us-central1-a --accelerator-type='v3-8' --version='v2-alpha' --scopes='cloud-platform'
ERROR: (gcloud.alpha.compute.tpus.tpu-vm.create) {
"code": 13,
"message": "an internal error has occurred"
}
What I tested:
- Attempting the same procedure with a different project - same behavior and error.
- Attempting the same procedure with a new account that never used Cloud TPU before - same behavior and error.
- Using Chrome from an Android phone with mobile network - same behavior and error.
- Quotas are fine.
I noticed that google-cloud-tpu 1.3.2 was released on March 8, but I am not sure whether that is related to the issue I am seeing.
Other parts of GCP, such as VM instances and Cloud Storage, work fine; only TPU has been down for me.
Solution 1:[1]
You can try this:
gcloud alpha compute tpus tpu-vm create testnode \
  --zone us-central1-a --accelerator-type='v3-8' --version='v2-alpha' \
  --scopes=https://www.googleapis.com/auth/cloud-platform
The short form --scopes='cloud-platform' is not supported for TPUs.
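If scripts already pass the short alias around, the substitution above can be done in one place. The following is only a sketch: expand_scope is a hypothetical helper (not part of gcloud), it handles only the cloud-platform alias from this question, and it prints the resulting command instead of running it.

```shell
# Hypothetical helper (not a gcloud feature): turn the short alias into the
# full OAuth scope URL. Only the alias from this question is handled;
# anything else is passed through unchanged.
expand_scope() {
  case "$1" in
    cloud-platform) printf '%s\n' "https://www.googleapis.com/auth/cloud-platform" ;;
    *)              printf '%s\n' "$1" ;;
  esac
}

SCOPE="$(expand_scope cloud-platform)"

# Printed rather than executed, so the command can be reviewed first.
echo gcloud alpha compute tpus tpu-vm create testnode \
  --zone us-central1-a --accelerator-type=v3-8 --version=v2-alpha \
  --scopes="$SCOPE"
```

Drop the leading echo to actually run the command once the printed line looks right.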
Solution 2:[2]
I was able to create a TPU VM via Cloud Console by using --service-account instead of --scopes.
The GUI still crashes, but you can somehow create a node by repeatedly clicking the preemptible checkbox. I think the likely cause is that they removed scopes from TPU VM, and something in their backend is now incompatible with the current GUI code.
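For reference, the --service-account variant might look like the sketch below. The answer does not give the exact command, so this is an assumption built from the question's original command; the service account address is a placeholder, and the command is only printed, not executed.

```shell
# Placeholder address: substitute a service account from your own project.
SERVICE_ACCOUNT="my-sa@my-project.iam.gserviceaccount.com"

# Same command as in the question, with --scopes replaced by --service-account.
CMD="gcloud alpha compute tpus tpu-vm create testnode --zone us-central1-a --accelerator-type=v3-8 --version=v2-alpha --service-account=$SERVICE_ACCOUNT"

# Print for review; copy the line into Cloud Shell to run it.
echo "$CMD"
```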
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | ouflak |
| Solution 2 | Stakk |
