Finding the coordinates of an object in an image using dl4j
I have started studying neural networks with the dl4j framework, and I am trying to build my first convolutional network. I have my own dataset in which each file name encodes the object's coordinates together with its width and height (x, y, w, h). These four values are what I want the network to output. I am having trouble composing the output layer. First, I don't understand what the network should output when the object I'm looking for is not in the image. Second, I want to get each coordinate or size as an integer value, without scaling it. My network is shown below:
    MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
            .seed(rngseed)
            .updater(new Adam(0.01))
            .activation(Activation.IDENTITY)
            .weightInit(WeightInit.XAVIER)
            .list()
            .layer(new ConvolutionLayer.Builder(new int[] {5, 5}, new int[] {1, 1}, new int[] {0, 0}).name("cnn1").nIn(1).nOut(64).biasInit(0).build())
            .layer(new SubsamplingLayer.Builder(new int[] {2, 2}, new int[] {2, 2}).name("maxpool1").build())
            .layer(new ConvolutionLayer.Builder(new int[] {5, 5}, new int[] {1, 1}, new int[] {0, 0}).name("cnn2").nIn(64).nOut(16).biasInit(0).build())
            .layer(new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                    .nOut(4)
                    .activation(Activation.TANH)
                    .build())
            .setInputType(InputType.convolutional(60, 200, 1))
            .build();
Building the dataset and training:
    for (int e = 0; e < numEpochs; e++) {
        File[] labels = trainData.listFiles();
        for (int i = 0; i < labels.length; i++) {
            File imageFile = labels[i];
            BufferedImage image = ImageIO.read(imageFile);
            INDArray input = loader.asMatrix(image).reshape(1, channels, height, width);
            scaler.fit(new DataSet(input, null));
            scaler.transform(input);
            String name = imageFile.getName();
            int x = Integer.parseInt(name.substring(1, name.indexOf('y')));
            int y = Integer.parseInt(name.substring(name.indexOf('y') + 1, name.indexOf('w')));
            int w = Integer.parseInt(name.substring(name.indexOf('w') + 1, name.indexOf('h')));
            int h = Integer.parseInt(name.substring(name.indexOf('h') + 1, name.lastIndexOf('.')));
            double[][] array = {{x, y, w, h}};
            INDArray output = Nd4j.create(array);
            network.fit(input, output);
        }
    }
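To make the label questions concrete, here is a minimal plain-Java sketch of one possible encoding. It is not taken from the post and is not a DL4J API. The class `BoxLabels`, its methods, the file-name pattern `x<int>y<int>w<int>h<int>.<ext>` (inferred from the parsing code above), and the extra "present" value are all assumptions made for illustration. The idea is to train on values scaled to [0, 1] plus a presence flag, then convert a prediction back to integer pixels by multiplying and rounding. Using it would mean an output layer with 5 values instead of 4.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of DL4J or the original post. */
final class BoxLabels {
    // Assumed file-name layout: x<int>y<int>w<int>h<int>.<ext>, non-negative integers only.
    private static final Pattern NAME =
            Pattern.compile("x(\\d+)y(\\d+)w(\\d+)h(\\d+)\\.\\w+");

    private BoxLabels() {}

    /** Returns {x, y, w, h} in pixels, or null when the name has no box (assumed to mean "no object"). */
    static int[] parse(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            return null;
        }
        int[] box = new int[4];
        for (int i = 0; i < 4; i++) {
            box[i] = Integer.parseInt(m.group(i + 1));
        }
        return box;
    }

    /** Encodes a box as {present, x/W, y/H, w/W, h/H}; all zeros when box is null. */
    static double[] encode(int[] box, int imageWidth, int imageHeight) {
        if (box == null) {
            return new double[5];
        }
        return new double[] {
            1.0,
            (double) box[0] / imageWidth,
            (double) box[1] / imageHeight,
            (double) box[2] / imageWidth,
            (double) box[3] / imageHeight
        };
    }

    /** Converts a 5-value prediction back to integer pixels, or null when presence is below the threshold. */
    static int[] decode(double[] out, int imageWidth, int imageHeight, double threshold) {
        if (out[0] < threshold) {
            return null;
        }
        return new int[] {
            (int) Math.round(out[1] * imageWidth),
            (int) Math.round(out[2] * imageHeight),
            (int) Math.round(out[3] * imageWidth),
            (int) Math.round(out[4] * imageHeight)
        };
    }
}
```

With the 200x60 input used above, `BoxLabels.encode(BoxLabels.parse(name), 200, 60)` would give the label row for one image, and `BoxLabels.decode(prediction, 200, 60, 0.5)` would turn a network output back into integer pixel values, or `null` when the presence value is low. The 0.5 threshold is an arbitrary choice for the sketch.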
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
