Add MC Dropout layers to trained model in Keras
I am looking for a solution to a problem that has arisen while building a generic ANN for image classification in R. What I want to do is either:
- Design and compile a network without MC Dropout layers, train it, and then add MC Dropout layers for predictions.
- Design and compile a network with MC Dropout layers, train it, and then deactivate/remove those layers later on.
The reason I want it this way is that I want to use MC Dropout for uncertainty quantification (i.e. keeping the dropout layers active while doing predictions), but for the same model I also want to try a different approach to uncertainty quantification. Training two separate models would make the comparison unfair if their training runs differ. I've come to understand that you can reuse layers from one model in another, but I have not gotten this to work.
Here is some generic code for the network:
library(keras)
library(tidyr)
# Data setup
mnist <- dataset_fashion_mnist()
train_images <- mnist$train$x
train_labels <- mnist$train$y
test_images <- mnist$test$x
test_labels <- mnist$test$y
# Reshape data
train_images <- array_reshape(train_images, c(60000, 28 * 28))
train_images <- train_images / 255
test_images <- array_reshape(test_images, c(10000, 28 * 28))
test_images <- test_images / 255
train_labels <- to_categorical(train_labels)
test_labels <- to_categorical(test_labels)
dropout_1 <- layer_dropout(rate = 0.25)
dropout_2 <- layer_dropout(rate = 0.25)
input <- layer_input(shape = c(784))
output <- input %>%
  layer_dense(units = 16, activation = 'relu', input_shape = c(784)) %>%
  dropout_1(training = TRUE) %>%
  layer_dense(units = 16, activation = 'relu') %>%
  dropout_2(training = TRUE) %>%
  layer_dense(units = 10, activation = 'softmax')
model <- keras_model(input, output)
model %>%
  compile(loss = "categorical_crossentropy",
          optimizer = "adam",
          metrics = c("accuracy"))
model %>% fit(train_images, train_labels, epochs = 5, batch_size = 128, verbose = 1)
I have tried building a new network that reuses the same layers and adds the MC Dropout layers, but performance was really poor, so I guess the trained weights aren't actually being carried over. Any hints or info would be really useful. Thanks.
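For reference, this is roughly what my attempt looked like: pull the trained layers out of the fitted model with get_layer() and interleave fresh, always-on dropout layers in a second model. The layer indices here are my assumption about the architecture above (the input layer counts as index 1 in the R interface), so they would need to be checked against summary(model):

```r
# Reuse the trained dense layers (indices assumed from the model above;
# verify with summary(model) before relying on them)
dense_1   <- get_layer(model, index = 2)
dense_2   <- get_layer(model, index = 4)
dense_out <- get_layer(model, index = 6)

# Fresh dropout layers, forced active at prediction time via training = TRUE
mc_drop_1 <- layer_dropout(rate = 0.25)
mc_drop_2 <- layer_dropout(rate = 0.25)

mc_input  <- layer_input(shape = c(784))
mc_output <- mc_input %>%
  dense_1() %>%
  mc_drop_1(training = TRUE) %>%
  dense_2() %>%
  mc_drop_2(training = TRUE) %>%
  dense_out()

# Second model sharing weights with the trained one
mc_model <- keras_model(mc_input, mc_output)
```

My expectation was that mc_model would share weights with model, so repeated calls to predict() would give the stochastic outputs needed for MC Dropout, but this is the setup where I saw the poor performance.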
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow