About the memory usage of MobileNet
I'm building MobileNetV1 with PyTorch, and the process runs out of memory every time I train the model (the log prints "Killed!" and the program crashes).
This is my code.
Config file (YAML):

    n_gpu: 0
    arch:
      type: MobileNet
      args:
        in_channels: 3
        num_classes: 26
    data_loader:
      type: BallDataLoader
      args:
        data_dir: data/balls/
        batch_size: 64
        shuffle: true
        validation_split: 0.2
        num_workers: 0
        resize:
          - 224
          - 224
    optimizer:
      type: Adam
      args:
        lr: 1.0e-2
        weight_decay: 0
        amsgrad: true
    loss: nll_loss
    metrics:
      - accuracy
      - top_k_acc
    lr_scheduler:
      type: StepLR
      args:
        step_size: 50
        gamma: 0.1
    trainer:
      epochs: 50
      save_dir: saved/
      save_period: 2
      verbosity: 2
      monitor: min val_loss
      early_stop: 10
      tensorboard: true

modules.py:

    import torch.nn as nn
    import torch.nn.functional as F


    class DepthwiseSeparableConv(nn.Module):
        def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, padding=None):
            super().__init__()
            if padding is None:
                # Default padding keeps the spatial size when stride is 1 (odd kernels)
                padding = kernel_size // 2
            # Depthwise step: one filter per input channel (groups=in_channels)
            self.depth_wise_conv = nn.Conv2d(in_channels, in_channels, kernel_size, stride, padding, groups=in_channels)
            self.bn1 = nn.BatchNorm2d(in_channels)
            # Pointwise step: 1x1 convolution that mixes channels
            self.point_wise_conv = nn.Conv2d(in_channels, out_channels, (1, 1), 1, 0)
            self.bn2 = nn.BatchNorm2d(out_channels)
            self.in_channels = in_channels
            self.out_channels = out_channels

        def forward(self, x):
            x = self.depth_wise_conv(x)
            x = self.bn1(x)
            x = F.relu(x)
            x = self.point_wise_conv(x)
            x = self.bn2(x)
            x = F.relu(x)
            return x

model.py:

    import torch.nn as nn
    import torch.nn.functional as F

    # DepthwiseSeparableConv is defined in modules.py above;
    # ImageNet is the asker's own base class (not shown in the post).


    class MobileNet(ImageNet):
        def __init__(self, in_channels=3, num_classes=1000):
            super().__init__()
            self.convs = nn.Sequential(
                nn.Conv2d(in_channels, 32, kernel_size=3, padding=1, stride=1),
                nn.BatchNorm2d(32),
                nn.ReLU(inplace=True),
                DepthwiseSeparableConv(32, 64),
                DepthwiseSeparableConv(64, 128, stride=2),
                DepthwiseSeparableConv(128, 128),
                DepthwiseSeparableConv(128, 256),
                DepthwiseSeparableConv(256, 256),
                DepthwiseSeparableConv(256, 512, stride=2),
                DepthwiseSeparableConv(512, 512),
                DepthwiseSeparableConv(512, 512),
                DepthwiseSeparableConv(512, 512),
                DepthwiseSeparableConv(512, 512),
                DepthwiseSeparableConv(512, 512),
                DepthwiseSeparableConv(512, 1024, stride=1),
                DepthwiseSeparableConv(1024, 1024, stride=2),
                nn.AdaptiveAvgPool2d(1)
            )
            self.fc = nn.Linear(1024, num_classes)

        def forward(self, x):
            x = self.convs(x)
            x = x.view(-1, 1024)  # flatten the pooled (N, 1024, 1, 1) output
            x = self.fc(x)
            x = F.log_softmax(x, dim=1)
            return x

I then found a model at https://github.com/jmjeon94/MobileNet-Pytorch, and it worked. After hours of debugging I still can't figure out why mine fails: the two models are nearly identical, and since the MobileNet architecture is fairly light, I assumed it shouldn't need much memory to run. Could this be caused by the Python interpreter, or is there actually something wrong with my code?
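The "fairly light" intuition can be checked for the weights alone. The sketch below counts learnable parameters from the layer shapes in the posted code; the helper names are hypothetical and only plain arithmetic is used, so no PyTorch is needed. It comes to roughly 3.2 million parameters (about 12 MiB in float32), which suggests the weights themselves are unlikely to be what exhausts memory. Activations and optimizer state are not counted here.

```python
# Rough parameter count for the posted MobileNet (weights only, no activations).
# Helper names are hypothetical; layer shapes are copied from the question's code.

def depthwise_separable_params(c_in, c_out, k=3):
    """Learnable parameters in one DepthwiseSeparableConv block as posted."""
    depthwise = c_in * k * k + c_in       # one k x k filter per channel, plus bias
    pointwise = c_in * c_out + c_out      # 1x1 convolution, plus bias
    batch_norms = 2 * c_in + 2 * c_out    # scale and shift for bn1 and bn2
    return depthwise + pointwise + batch_norms


def standard_conv_params(c_in, c_out, k=3):
    """Learnable parameters in an ordinary k x k convolution with bias."""
    return c_in * c_out * k * k + c_out


# (in_channels, out_channels) of the 13 blocks in self.convs
BLOCK_CHANNELS = (
    [(32, 64), (64, 128), (128, 128), (128, 256), (256, 256), (256, 512)]
    + [(512, 512)] * 5
    + [(512, 1024), (1024, 1024)]
)


def total_params(in_channels=3, num_classes=26):
    stem = standard_conv_params(in_channels, 32) + 2 * 32  # first conv + BatchNorm
    blocks = sum(depthwise_separable_params(i, o) for i, o in BLOCK_CHANNELS)
    fc = 1024 * num_classes + num_classes                  # final linear layer
    return stem + blocks + fc


n = total_params()
print(f"parameters: {n:,} (~{n * 4 / 2**20:.1f} MiB in float32)")
print("one 512->512 separable block:", depthwise_separable_params(512, 512))
print("one 512->512 standard 3x3 conv:", standard_conv_params(512, 512))
```

The last two lines also show why the architecture is considered light: a depthwise separable block needs far fewer weights than a standard 3 x 3 convolution with the same channel counts.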
Solution 1:[1]
I think it's because of your batch size. Try a smaller batch size, such as 32, 16, 8, 4, or 2.
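To see why batch size matters: during training, intermediate activations are kept for the backward pass, and their size grows linearly with the batch size. As a rough sketch (the function name is hypothetical), the snippet below sizes just one tensor, the float32 output of the first convolution in the posted model, which has 32 channels at 224 x 224 because that layer uses stride = 1 and padding = 1. The real footprint is a multiple of this, since many such tensors are held at once.

```python
def activation_bytes(batch_size, channels, height, width, bytes_per_element=4):
    """Size in bytes of one dense activation tensor (float32 by default)."""
    return batch_size * channels * height * width * bytes_per_element


for batch_size in (64, 32, 16, 8):
    size = activation_bytes(batch_size, channels=32, height=224, width=224)
    print(f"batch_size={batch_size:>2}: {size / 2**20:.0f} MiB for one 32 x 224 x 224 tensor")
```

At batch_size = 64 this single tensor is already 392 MiB, and halving the batch size halves it.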
Solution 2:[2]
I deleted the line nn.Conv2d(in_channels, 32, kernel_size=3, padding=1, stride=1), retyped the same line, and the code ran. I still don't know why, but it seems that the interpreter or the text editor caused the error. Thank you for your attention.
Special thanks to @Anmol Narang for your effort; I really appreciate it.
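The post does not establish what actually differed after the line was retyped. One thing worth checking, offered as an assumption rather than a confirmed diagnosis, is the stride of that first convolution: the posted line uses stride = 1, whereas the first layer in the MobileNetV1 paper uses stride 2. The sketch below (hypothetical helper names, plain arithmetic) applies the standard convolution output-size formula to the posted layer list and compares the two cases. It counts only each block's output elements per sample, so it is a relative comparison, not a memory measurement.

```python
def conv_out_size(n, kernel_size=3, stride=1, padding=1):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel_size) // stride + 1


# (out_channels, stride) of the 13 DepthwiseSeparableConv blocks, as posted
BLOCK_STRIDES = (
    [(64, 1), (128, 2), (128, 1), (256, 1), (256, 1), (512, 2)]
    + [(512, 1)] * 5
    + [(1024, 1), (1024, 2)]
)


def feature_map_elements(stem_stride, input_size=224):
    """Return (total block-output elements per sample, final spatial size)."""
    size = conv_out_size(input_size, stride=stem_stride)  # first 3x3 conv
    total = 32 * size * size
    for out_channels, stride in BLOCK_STRIDES:
        # The stride applies to the 3x3 depthwise conv; the 1x1 conv keeps the size
        size = conv_out_size(size, stride=stride)
        total += out_channels * size * size
    return total, size


for stem_stride in (1, 2):
    total, final_size = feature_map_elements(stem_stride)
    print(f"first-conv stride {stem_stride}: {total:,} elements per sample, "
          f"final map {final_size} x {final_size}")
```

For a 224 x 224 input, every feature map is twice as wide and twice as tall when the first stride is 1, so the per-sample activation count is four times larger than with stride 2.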
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Anmol Narang |
| Solution 2 | Do Nam |
