Why does my CNN model still overfit after adding an L2 regularizer?

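from sklearn.model_selection import train_test_split
# Hold out 20% for testing, then 5% of the remaining training data for validation
x_train1, x_test, y_train1, y_test = train_test_split(images, labels, test_size=0.2, random_state=42)
x_train2, x_val, y_train2, y_val = train_test_split(x_train1, y_train1, test_size=0.05, random_state=42)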

Layers

import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dropout, Dense

model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 1), kernel_regularizer=keras.regularizers.l2(0.005), padding='same', name='Conv_1'))
model.add(MaxPooling2D((2, 2), name='MaxPool_1'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', kernel_regularizer=keras.regularizers.l2(0.005), name='Conv_2'))
model.add(MaxPooling2D((2, 2), name='MaxPool_2'))
model.add(Flatten(name='Flatten'))
model.add(Dropout(0.5, name='Dropout'))
model.add(Dense(64, kernel_initializer='normal', activation='relu', name='Dense_1'))
model.add(Dense(1, kernel_initializer='normal', activation='sigmoid', name='Dense_2'))
model.summary()

Model compile

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(x_train2, y_train2, validation_data=(x_test, y_test), batch_size=32, epochs=100)

Results

Train: accuracy = 0.939577; loss = 0.134506
Test: accuracy = 0.767908; loss = 0.8002433

Solution 1:^[1]

Regularization is not a magic option that will close the gap between train and test at any "weight". One way to think about it: take the regularization strength, a coefficient alpha (in your case 0.005), and express the gap between train and test accuracy as a function of it, say f(alpha) (in your case f(0.005) = 0.94 - 0.77 = 0.17). The only thing we know for certain about this function is that f(inf) = 0. In other words, as you increase the regularization strength, the gap eventually disappears, at the cost of the training score going down. There is no single magical form of regularization, and there is no guarantee that L2 is a good fit for your problem. You can make the gap disappear just by making the weight higher, but that may leave both train and test scores very low.
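To make the f(alpha) picture concrete, here is a minimal sketch that sweeps the L2 strength and reports the train/test gap. It is not the CNN from the question: it assumes a small ridge regression on synthetic data as a stand-in, measures mean squared error instead of accuracy, and the names `ridge_errors` and `sweep` are made up for illustration.

```python
import numpy as np


def ridge_errors(alpha, X_train, y_train, X_test, y_test):
    """Fit ridge regression with L2 strength `alpha`; return (train_mse, test_mse)."""
    d = X_train.shape[1]
    # Closed-form ridge solution: w = (X^T X + alpha * I)^-1 X^T y
    w = np.linalg.solve(X_train.T @ X_train + alpha * np.eye(d), X_train.T @ y_train)
    train_mse = float(np.mean((X_train @ w - y_train) ** 2))
    test_mse = float(np.mean((X_test @ w - y_test) ** 2))
    return train_mse, test_mse


def sweep(alphas, seed=0):
    """Return a list of (alpha, train_mse, test_mse) on one synthetic dataset."""
    rng = np.random.default_rng(seed)
    # Assumption: few samples relative to features, so the model can overfit.
    n_train, n_test, d = 100, 2000, 80
    w_true = np.zeros(d)
    w_true[:3] = 0.5  # only three features carry signal
    X_train = rng.normal(size=(n_train, d))
    X_test = rng.normal(size=(n_test, d))
    y_train = X_train @ w_true + rng.normal(size=n_train)
    y_test = X_test @ w_true + rng.normal(size=n_test)
    return [(a, *ridge_errors(a, X_train, y_train, X_test, y_test)) for a in alphas]


if __name__ == "__main__":
    for alpha, train_mse, test_mse in sweep([1e-3, 1.0, 10.0, 100.0, 1e6]):
        gap = test_mse - train_mse
        print(f"alpha={alpha:g}  train={train_mse:.3f}  test={test_mse:.3f}  gap={gap:.3f}")
```

With very small alpha the train error should be low and the test error much higher; with very large alpha the two should end up close to each other, but both poor. Somewhere in between there may or may not be a value that gives a good test score, which is the point of the answer. For the CNN in the question, the analogous experiment would be to retrain with several values passed to `keras.regularizers.l2` and compare the resulting train and held-out scores.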

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1
