_pickle.PicklingError when training networks with num_workers > 0
I was training on a big dataset using PyTorch's Dataset and DataLoader when this problem appeared. I stripped my code down to a minimal example that still reproduces the error; the code below runs on its own:
```python
import numpy as np
from torch.utils.data import DataLoader, Dataset

class PGDataset(Dataset):
    def __init__(self, X, y):
        self.data_x = X
        self.data_y = y

    def __len__(self):
        return len(self.data_x)

    def __getitem__(self, item):
        return self.data_x[item], self.data_y[item]

def train(dataloader):
    epochs = 5
    for epoch in range(epochs):
        for i, (data_x, data_y) in enumerate(dataloader):
            pass
    print('Train ended successfully.')

if __name__ == '__main__':
    X, y = np.random.rand(200, 4), np.random.randint(0, 5, 200, dtype='int64')
    train_set = PGDataset(X, y)
    dataloader = DataLoader(train_set, batch_size=50, shuffle=True, num_workers=4)
    train(dataloader)
```
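One way to narrow a `PicklingError` like this down (an assumption about where to look, not something stated in the question): when `num_workers > 0` and the worker processes are started with the `spawn` start method, the `DataLoader` sends the dataset object to each worker via pickle, so the dataset must be picklable on its own. You can test that directly with `pickle.dumps`. The sketch below uses a hypothetical torch-free stand-in class with the same attributes as `PGDataset`, so it runs without PyTorch installed:

```python
import pickle

class PGDatasetLike:
    # Hypothetical stand-in with the same attributes as PGDataset above.
    # Spawn-based DataLoader workers must be able to pickle the real
    # dataset in exactly the same way.
    def __init__(self, X, y):
        self.data_x = X
        self.data_y = y

if __name__ == '__main__':
    ds = PGDatasetLike(list(range(10)), list(range(10)))
    # pickle.dumps raises _pickle.PicklingError in the same situations
    # where the DataLoader workers would fail to receive the dataset
    blob = pickle.dumps(ds)
    restored = pickle.loads(blob)
    print(len(restored.data_x))
```

If `pickle.dumps(train_set)` already fails in your environment, the problem is the dataset object (or the context it was defined in), not the `DataLoader` itself.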
This code seems very standard to me, and I do not know the reason for the problem.
I am using python==3.9.7, numpy==1.21.5, and pytorch==1.10.1. By the way, I just updated PyCharm to 2022.1.
I have been trying to solve this all day and searched many websites. What I found:
- If I set num_workers in the DataLoader to 0, it works, but I need it to be nonzero for a big dataset.
- In PyCharm's debug mode there is also no error.
Why does this happen?
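For context on the mechanism (an assumption, not from the question itself): pickle serializes a class instance by reference, recording the class's `__module__` and `__qualname__` so the receiving process can re-import it. If the defining module cannot be resolved, for example when the class was defined inside an interactive console such as the one PyCharm can attach to a run, pickling raises `_pickle.PicklingError`. A minimal sketch simulating that failure by pointing a class at a nonexistent module:

```python
import pickle

class Payload:
    def __init__(self, n):
        self.n = n

# Normally this round-trips fine: pickle records ('__main__', 'Payload')
# and re-imports the class on load.
blob = pickle.dumps(Payload(3))
assert pickle.loads(blob).n == 3

# Simulate a class whose defining module pickle cannot resolve,
# roughly what happens when the class lives only in an interactive console.
Payload.__module__ = 'nonexistent_console_module'
try:
    pickle.dumps(Payload(3))
except pickle.PicklingError as e:
    print('PicklingError:', e)
```

Workers with `num_workers=0` run in the parent process and never pickle anything, which would explain why that setting avoids the error.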
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow