'Model Accuracy is High but Val_Accuracy is low
I'm trying to improve my val accuracy as it is very low. I have tried changing the batch_size, the number of images being used for validation and training. Added in extra dense levels but none of them have worked. The dataset I'm using has not been split up yet into Training and Validation which is what I have done using partitioning. I have given the values for the samples as you can see below and have tried to increase the VALIDATION_SAMPLES but when I do, my cluster keeps crashing.
TRAINING_SAMPLES = 10000
VALIDATION_SAMPLES = 2000
TEST_SAMPLES = 2000
IMG_WIDTH = 178
IMG_HEIGHT = 218
BATCH_SIZE = 32
NUM_EPOCHS = 20
def generate_df(partition, attr, num_samples):
df_ = df_par_attr[(df_par_attr['partition'] == partition)
& (df_par_attr[attr] == 0)].sample(int(num_samples/2))
df_ = pd.concat([df_,
df_par_attr[(df_par_attr['partition'] == partition)
& (df_par_attr[attr] == 1)].sample(int(num_samples/2))])
# for Training and Validation
if partition != 2:
x_ = np.array([load_reshape_img(images_folder + fname) for fname in df_.index])
x_ = x_.reshape(x_.shape[0], 218, 178, 3)
y_ = np_utils.to_categorical(df_[attr],2)
# for Test
else:
x_ = []
y_ = []
for index, target in df_.iterrows():
im = cv2.imread(images_folder + index)
im = cv2.resize(cv2.cvtColor(im, cv2.COLOR_BGR2RGB), (IMG_WIDTH, IMG_HEIGHT)).astype(np.float32) / 255.0
im = np.expand_dims(im, axis =0)
x_.append(im)
y_.append(target[attr])
return x_, y_
My training model is build after the partitioning which you can see below
# Train data
x_train, y_train = generate_df(0, 'Male', TRAINING_SAMPLES)
# Train - Data Preparation - Data Augmentation with generators
train_datagen = ImageDataGenerator(
preprocessing_function=preprocess_input,
rotation_range=30,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True,
)
train_datagen.fit(x_train)
train_generator = train_datagen.flow(
x_train, y_train,
batch_size=BATCH_SIZE,
)
The same also goes for the validation
# Validation Data
x_valid, y_valid = generate_df(1, 'Male', VALIDATION_SAMPLES)
# Validation - Data Preparation - Data Augmentation with generators
valid_datagen = ImageDataGenerator(
preprocessing_function=preprocess_input,
)
valid_datagen.fit(x_valid)
validation_generator = valid_datagen.flow(
x_valid, y_valid,
)
I tried playing around with the layers but got told that it wouldn't really affect your val_accuracy
x = inc_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation="relu")(x)
x = Dropout(0.5)(x)
x = Dense(256, activation="relu")(x)
predictions = Dense(2, activation="softmax")(x)
I tried using the 'adam' optimizer but it made no difference when compared to sgd
model_.compile(optimizer=SGD(lr=0.0001, momentum=0.9)
, loss='categorical_crossentropy'
, metrics=['accuracy'])
hist = model_.fit_generator(train_generator
, validation_data = (x_valid, y_valid)
, steps_per_epoch= TRAINING_SAMPLES/BATCH_SIZE
, epochs= NUM_EPOCHS
, callbacks=[checkpointer]
, verbose=1
)
Solution 1:[1]
Who ever told you modifying the model won't effect validation accuracy in most cases is dead wrong. The problem you have in your model is it is not deep enough to extract the features of the images. Below is the code I have used on hundreds of models and has proved to be very accurate with respect to achieving low training and validation loss and avoid over fitting
from tensorflow import keras
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Dense, Activation,Dropout,Conv2D, MaxPooling2D,BatchNormalization, Flatten
from tensorflow.keras.optimizers import Adam, Adamax
from tensorflow.keras.metrics import categorical_crossentropy
from tensorflow.keras import regularizers
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Model, load_model
def make_model(img_img_size, class_count,lr=.001, trainable=True):
img_shape=(img_size[0], img_size[1], 3)
model_name='EfficientNetB3'
base_model=tf.keras.applications.efficientnet.EfficientNetB3(include_top=False, weights="imagenet",input_shape=img_shape, pooling='max')
base_model.trainable=trainable
x=base_model.output
x=keras.layers.BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001 )(x)
x = Dense(256, kernel_regularizer = regularizers.l2(l = 0.016),activity_regularizer=regularizers.l1(0.006),
bias_regularizer=regularizers.l1(0.006) ,activation='relu')(x)
x=Dropout(rate=.45, seed=123)(x)
output=Dense(class_count, activation='softmax')(x)
model=Model(inputs=base_model.input, outputs=output)
model.compile(Adamax(learning_rate=lr), loss='categorical_crossentropy', metrics=['accuracy'])
return model, base_model # return the base_model so the callback can control its training state
TRAINING_SAMPLES = 10000
VALIDATION_SAMPLES = 2000
TEST_SAMPLES = 2000
IMG_WIDTH = 178
IMG_HEIGHT = 218
BATCH_SIZE = 32
NUM_EPOCHS = 20
img_size=(IMG_HEIGHT,IMG_WIDTH)
class_count=2
model, base_model=make_model(img_size, class_count, lr=.001, trainable=True)
I also recommend that you use two keras callbacks. One is to control the learning rate. Documentation for that is here. The other controls early stopping and saves the model with the lowest validation loss. Documentation for that is here. My recommended code for these callbacks is shown below
rlronp=tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2,verbose=1)
estop=tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=4, verbose=1,restore_best_weights=True)
callbacks=[rlronp, estop]
put the above code prior to using model.fit. In model.fit set the parameter
callbacks=callbacks
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Gerry P |