Model Accuracy is High but Val_Accuracy is Low

I'm trying to improve my validation accuracy, as it is very low. I have tried changing the batch_size and the number of images used for validation and training, and I added extra Dense layers, but none of that has worked. The dataset I'm using had not yet been split into training and validation sets, so I split it myself using partitioning. The sample sizes are given below; I have tried increasing VALIDATION_SAMPLES, but when I do, my cluster keeps crashing.


    TRAINING_SAMPLES = 10000
    VALIDATION_SAMPLES = 2000
    TEST_SAMPLES = 2000
    IMG_WIDTH = 178
    IMG_HEIGHT = 218
    BATCH_SIZE = 32
    NUM_EPOCHS = 20


    def generate_df(partition, attr, num_samples):
        # sample an equal number of negative and positive examples for the attribute
        df_ = df_par_attr[(df_par_attr['partition'] == partition)
                          & (df_par_attr[attr] == 0)].sample(int(num_samples/2))
        df_ = pd.concat([df_,
                         df_par_attr[(df_par_attr['partition'] == partition)
                                     & (df_par_attr[attr] == 1)].sample(int(num_samples/2))])

        # for Training and Validation
        if partition != 2:
            x_ = np.array([load_reshape_img(images_folder + fname) for fname in df_.index])
            x_ = x_.reshape(x_.shape[0], 218, 178, 3)
            y_ = np_utils.to_categorical(df_[attr], 2)
        # for Test
        else:
            x_ = []
            y_ = []

            for index, target in df_.iterrows():
                im = cv2.imread(images_folder + index)
                im = cv2.resize(cv2.cvtColor(im, cv2.COLOR_BGR2RGB), (IMG_WIDTH, IMG_HEIGHT)).astype(np.float32) / 255.0
                im = np.expand_dims(im, axis=0)
                x_.append(im)
                y_.append(target[attr])

        return x_, y_

My training model is built after the partitioning, as you can see below.


    # Train data
    x_train, y_train = generate_df(0, 'Male', TRAINING_SAMPLES)
    
    # Train - Data Preparation - Data Augmentation with generators
    train_datagen =  ImageDataGenerator(
      preprocessing_function=preprocess_input,
      rotation_range=30,
      width_shift_range=0.2,
      height_shift_range=0.2,
      shear_range=0.2,
      zoom_range=0.2,
      horizontal_flip=True,
    )
    
    train_datagen.fit(x_train)
    
    train_generator = train_datagen.flow(
        x_train, y_train,
        batch_size=BATCH_SIZE,
    )

The same also goes for the validation data:


    # Validation Data
    x_valid, y_valid = generate_df(1, 'Male', VALIDATION_SAMPLES)
    
    
    # Validation - Data Preparation - Data Augmentation with generators
    valid_datagen = ImageDataGenerator(
      preprocessing_function=preprocess_input,
    )
    
    valid_datagen.fit(x_valid)
    
    validation_generator = valid_datagen.flow(
        x_valid, y_valid,
    )

I tried playing around with the layers, but I was told that it wouldn't really affect my val_accuracy:


    x = inc_model.output
    x = GlobalAveragePooling2D()(x)
    x = Dense(1024, activation="relu")(x)
    x = Dropout(0.5)(x)
    x = Dense(256, activation="relu")(x)
    predictions = Dense(2, activation="softmax")(x)

I tried using the 'adam' optimizer, but it made no difference compared to SGD.


    model_.compile(optimizer=SGD(lr=0.0001, momentum=0.9)
                        , loss='categorical_crossentropy'
                        , metrics=['accuracy'])


    hist = model_.fit_generator(train_generator,
                                validation_data=(x_valid, y_valid),
                                steps_per_epoch=TRAINING_SAMPLES / BATCH_SIZE,
                                epochs=NUM_EPOCHS,
                                callbacks=[checkpointer],
                                verbose=1)



Solution 1:[1]

Whoever told you that modifying the model won't affect validation accuracy is, in most cases, dead wrong. The problem with your model is that it is not deep enough to extract the features of the images. Below is the code I have used on hundreds of models; it has proved very accurate with respect to achieving low training and validation loss and avoiding overfitting.

    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import backend as K
    from tensorflow.keras.layers import Dense, Activation, Dropout, Conv2D, MaxPooling2D, BatchNormalization, Flatten
    from tensorflow.keras.optimizers import Adam, Adamax
    from tensorflow.keras.metrics import categorical_crossentropy
    from tensorflow.keras import regularizers
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    from tensorflow.keras.models import Model, load_model

    def make_model(img_size, class_count, lr=.001, trainable=True):
        img_shape = (img_size[0], img_size[1], 3)
        # EfficientNetB3 backbone with max pooling, so the output is already a flat feature vector
        base_model = tf.keras.applications.efficientnet.EfficientNetB3(include_top=False, weights="imagenet",
                                                                       input_shape=img_shape, pooling='max')
        base_model.trainable = trainable
        x = base_model.output
        x = keras.layers.BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(x)
        x = Dense(256, kernel_regularizer=regularizers.l2(l=0.016), activity_regularizer=regularizers.l1(0.006),
                  bias_regularizer=regularizers.l1(0.006), activation='relu')(x)
        x = Dropout(rate=.45, seed=123)(x)
        output = Dense(class_count, activation='softmax')(x)
        model = Model(inputs=base_model.input, outputs=output)
        model.compile(Adamax(learning_rate=lr), loss='categorical_crossentropy', metrics=['accuracy'])
        return model, base_model  # return the base_model so the callback can control its training state

    TRAINING_SAMPLES = 10000
    VALIDATION_SAMPLES = 2000
    TEST_SAMPLES = 2000
    IMG_WIDTH = 178
    IMG_HEIGHT = 218
    BATCH_SIZE = 32
    NUM_EPOCHS = 20
    img_size = (IMG_HEIGHT, IMG_WIDTH)
    class_count = 2
    model, base_model = make_model(img_size, class_count, lr=.001, trainable=True)

I also recommend that you use two Keras callbacks. One, ReduceLROnPlateau, lowers the learning rate when the validation loss stops improving. The other, EarlyStopping, stops training early and restores the weights from the epoch with the lowest validation loss. See the Keras documentation for each. My recommended code for these callbacks is shown below.

    rlronp = tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2, verbose=1)
    estop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=4, verbose=1, restore_best_weights=True)
    callbacks = [rlronp, estop]

Put the above code prior to calling model.fit. In model.fit, set the parameter

    callbacks=callbacks
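
Putting it together, here is a minimal sketch of the final training call, assuming the train_generator and validation_generator built in the question and the make_model, rlronp, and estop defined above; it is illustrative only, not the exact code I run.

    # Minimal sketch: wire the generators, model, and callbacks together.
    # Assumes train_generator / validation_generator from the question and
    # make_model / callbacks from this answer are already defined.
    model, base_model = make_model(img_size, class_count, lr=.001, trainable=True)

    hist = model.fit(train_generator,
                     validation_data=validation_generator,  # pass the generator rather than raw arrays
                     steps_per_epoch=TRAINING_SAMPLES // BATCH_SIZE,
                     epochs=NUM_EPOCHS,
                     callbacks=callbacks,                    # [rlronp, estop] from above
                     verbose=1)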

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

[1] Solution 1: Gerry P, Stack Overflow