Cannot restart TPU Node in preempted state
I am using a preemptible TPUv3-8 node with a GCE VM and I am having some difficulty restarting the TPU node after it has been preempted.
On the TPUs page, it shows that the TPU-node has been preempted.

But when I try to start it again, I get an error saying that it is not in a stopped or preempted state. Why is this happening, and what should I do to fix it?

I would also love to know if there is a way to auto-restart the TPU node and run a simple startup script. Thank you.
Solution 1:[1]
This behavior is expected.
The Preemptible TPUs documentation explains how to create preemptible TPU nodes and describes best practices.
However, at the bottom of the "Detecting if a TPU has been preempted" section, there is a note:
Note: If your Cloud TPU is preempted, you must delete it and create a new one as described in Managing TPUs.
In short, if the TPU node was preempted, you cannot restart it. You must delete it and create a new one.
As for auto-restarting the TPU node, the only related guidance is in the "Preemptible VMs and TPUs (TPU Nodes only)" section:
Note that the preemptible status of the TPU is independent of the preemptible status of the VM. You can define your TPU as preemptible and the VM as not preemptible, or the other way round. You can also define them both as preemptible.
The most likely combination is a preemptible TPU and a non-preemptible VM.
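Since a preempted TPU node has to be deleted and recreated rather than restarted, one way to approximate auto-restart is to poll the node's state from the non-preemptible VM and recreate it when needed. The following is only a minimal sketch, not an official mechanism. It assumes the gcloud CLI is installed and authenticated on the VM, and that `gcloud compute tpus describe` reports the state as the string `PREEMPTED`. The function names, the polling interval, and the startup command are hypothetical placeholders.

```python
import subprocess
import time


def describe_cmd(name, zone):
    # Asks gcloud to print only the node's state field.
    return ["gcloud", "compute", "tpus", "describe", name,
            f"--zone={zone}", "--format=value(state)"]


def recreate_cmds(name, zone, accelerator_type, version):
    # A preempted node cannot be started again, so delete it and create a new one.
    delete = ["gcloud", "compute", "tpus", "delete", name,
              f"--zone={zone}", "--quiet"]
    create = ["gcloud", "compute", "tpus", "create", name,
              f"--zone={zone}",
              f"--accelerator-type={accelerator_type}",
              f"--version={version}",
              "--preemptible"]
    return [delete, create]


def recreate_if_preempted(name, zone, accelerator_type, version,
                          startup_cmd, run=subprocess.run):
    """Check the node once; recreate it and run startup_cmd if it was preempted.

    Returns True if the node was recreated, False otherwise.
    `run` is injectable so the logic can be exercised without calling gcloud.
    """
    result = run(describe_cmd(name, zone),
                 capture_output=True, text=True, check=True)
    # Assumption: the state is reported as the literal string "PREEMPTED".
    if result.stdout.strip() != "PREEMPTED":
        return False
    for cmd in recreate_cmds(name, zone, accelerator_type, version):
        run(cmd, check=True)
    # Hypothetical startup step, e.g. relaunching training from a checkpoint.
    run(startup_cmd, check=True)
    return True


def watch(name, zone, accelerator_type, version, startup_cmd, poll_seconds=60):
    # Poll forever; intended to run on the non-preemptible VM.
    while True:
        recreate_if_preempted(name, zone, accelerator_type, version, startup_cmd)
        time.sleep(poll_seconds)
```

To use it, call `watch(...)` with your own node name, zone, accelerator type (for example `v3-8`), software version, and startup command. Because the new node may come up with a different IP address, the startup command should look the address up again rather than reuse the old one.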
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | PjoterS |
