CUBLAS_STATUS_EXECUTION_FAILED for the second, similar object

My PyTorch embedding class has the following methods:

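def __init__(..., d_model=256, ...):
   # two separate linear layers, each mapping a (sourceType, value) pair to d_model
   self.typeVal_0_emb    = nn.Linear(2, d_model)
   self.typeVal_1_emb    = nn.Linear(2, d_model)

def __call__(..., y):
   # split the last dimension of y into three tensors of size 1
   val_0, val_1, sourceType = torch.split(y, 1, dim=-1)

   # pair the source type with each value along the last dimension
   typeVal_0 = torch.cat((sourceType, val_0), -1).to(y.device)
   typeVal_1 = torch.cat((sourceType, val_1), -1).to(y.device)

   typeVal_0_emb   = self.typeVal_0_emb(typeVal_0)
   typeVal_1_emb   = self.typeVal_1_emb(typeVal_1)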

Whenever I start training the model, the line that computes typeVal_1_emb fails with the message: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)

typeVal_0 and typeVal_1 both have shape [128, 800, 2] and dtype torch.float32. Swapping the last two lines makes no difference: the error always occurs on whichever of the two comes second.
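
For reference, here is a minimal sketch of the same two layers that can be run on the CPU with random input. The class name TypeValEmbedding, the helper check_on_cpu, and the assumption that the last dimension of y is exactly 3 are hypothetical and only for illustration; the real class takes more arguments than shown. CUDA errors are reported asynchronously, so the line named in the traceback is not necessarily the one that actually failed. Running the same code on the CPU, or setting the environment variable CUDA_LAUNCH_BLOCKING=1, may give a more precise location.

```python
import torch
import torch.nn as nn


class TypeValEmbedding(nn.Module):  # hypothetical name, not from the original code
    def __init__(self, d_model=256):
        super().__init__()
        self.typeVal_0_emb = nn.Linear(2, d_model)
        self.typeVal_1_emb = nn.Linear(2, d_model)

    def forward(self, y):
        # Assumption: y has shape [..., 3], so the split yields three [..., 1] tensors.
        val_0, val_1, sourceType = torch.split(y, 1, dim=-1)
        typeVal_0 = torch.cat((sourceType, val_0), -1)
        typeVal_1 = torch.cat((sourceType, val_1), -1)
        return self.typeVal_0_emb(typeVal_0), self.typeVal_1_emb(typeVal_1)


def check_on_cpu(batch=128, seq_len=800, d_model=256):
    """Run both layers on the CPU with random input and return the output shapes."""
    model = TypeValEmbedding(d_model)
    y = torch.randn(batch, seq_len, 3)  # stays on the CPU
    emb_0, emb_1 = model(y)
    return emb_0.shape, emb_1.shape
```

Calling check_on_cpu() uses the shapes from the question. If this runs cleanly on the CPU, the input values and the rest of the model (anything that runs on the GPU before these two lines) are worth checking next.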



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
