How to get an image to array, Tensorflow 1.9
I have to use Tensorflow 1.9 for system-specific reasons. I want to train a CNN with a custom dataset consisting of images. The folder structure looks very much like this:
./
+ circles
- circle-0.jpg
- circle-1.jpg
- ...
+ hexagons
- hexagon-0.jpg
- hexagon-1.jpg
- ...
+ ...
So the example I have to work with uses MNIST and has these two particular lines of code:
mnist_dataset = tf.keras.datasets.mnist.load_data('mnist_data')
(x_train, y_train), (x_test, y_test) = mnist_dataset
In my work, I also have to use this data format (x_train, y_train), (x_test, y_test), which seems to be quite common. As far as I have been able to find out, each of those datasets is a tuple (image_data, label) with shapes like ((60000, 28, 28), (60000,)), at least for the MNIST training set. The image_data here is supposedly of dtype uint8 (according to this post). I also found out that a tf.data.Dataset() object looks like the (image_data, label) tuples I need here.
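To make the shape description above concrete, here is a minimal sketch using stand-in NumPy arrays laid out like the MNIST tuples. The array names and the small image count of 100 are assumptions for illustration only; the real MNIST training arrays have 60000 entries.

```python
import numpy as np

# Stand-in arrays with the same layout as (x_train, y_train) from MNIST,
# but with only 100 images (an arbitrary choice) to keep the example light.
num_images, height, width = 100, 28, 28
x = np.zeros((num_images, height, width), dtype=np.uint8)  # image data
y = np.zeros((num_images,), dtype=np.uint8)                # one label per image

print(x.shape)     # (number of images, height, width)
print(x[0].shape)  # a single image is a 2-D array of pixel values
print(y.shape)     # a flat vector of labels, aligned with x by index
```

The first axis indexes the images, so x[i] is the i-th 28 x 28 image and y[i] is its label.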
So far so good. But a few questions arise from this information which I wasn't able to figure out yet, and where I would kindly request your help:
1. (60000, 28, 28) means 60k arrays of 28 x 28 image values, right?
2. If 1. is right, how do I get my images (like in the directory structure I described above) into this format? Is there a function which yields an array that I can use like that?
3. I know I need some kind of generator function which collects all the images with their labels, because in Tensorflow 1.9 tf.keras.utils.image_dataset_from_directory() does not seem to exist yet.
4. What do the labels actually look like? For example, with my directory structure, would I have something like this:
(A)
| File | Label |
|---|---|
| circle-0.jpg | circle |
| circle-233.jpg | circle |
| hexagon-1.jpg | hexagon |
| triangle-12.jpg | triangle |
or (B)
| File | Label |
|---|---|
| circle-0.jpg | circle-0 |
| circle-233.jpg | circle-233 |
| hexagon-1.jpg | hexagon-1 |
| triangle-12.jpg | triangle-12 |
where the respective image is already converted to a "(60000, 28, 28)" format?
It seems as if I need to write all the functions myself, since there does not seem to be a ready-made function that turns a directory structure like mine into a dataset usable by Tensorflow 1.9. Or is there? I know of tf.keras.preprocessing.image.ImageDataGenerator and image_dataset_from_directory as well as flow_from_directory(); however, none of them seems to give me the dataset tuple format I want.
I would really appreciate any help!
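One possible way to get from the directory layout to the (x_train, y_train), (x_test, y_test) tuples is sketched below. It assumes labels of type (A), encoded as one integer per class folder, and it assumes NumPy and Pillow are installed. The function names (list_labeled_files, load_image, load_dataset), the 28 x 28 grayscale target size, and the 20 % test split are hypothetical choices for illustration, not part of Tensorflow.

```python
import os
import numpy as np

def list_labeled_files(root):
    """Return (paths, labels, class_names) for a root/<class>/<image> layout."""
    class_names = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    paths, labels = [], []
    for index, name in enumerate(class_names):
        class_dir = os.path.join(root, name)
        for fname in sorted(os.listdir(class_dir)):
            if fname.lower().endswith((".jpg", ".jpeg", ".png")):
                paths.append(os.path.join(class_dir, fname))
                labels.append(index)  # label = index of the class folder
    return paths, np.array(labels, dtype=np.int64), class_names

def load_image(path, size=(28, 28)):
    """Decode one file into a (height, width) uint8 grayscale array."""
    from PIL import Image  # assumption: Pillow is available
    with Image.open(path) as img:
        # Pillow's resize expects (width, height), hence the swap.
        img = img.convert("L").resize((size[1], size[0]))
        return np.asarray(img, dtype=np.uint8)

def load_dataset(root, size=(28, 28), test_fraction=0.2, seed=0,
                 loader=load_image):
    """Build ((x_train, y_train), (x_test, y_test), class_names)."""
    paths, labels, class_names = list_labeled_files(root)
    images = np.stack([loader(p, size) for p in paths])
    # Shuffle once so both splits contain a mix of classes.
    order = np.random.RandomState(seed).permutation(len(paths))
    images, labels = images[order], labels[order]
    n_test = int(len(paths) * test_fraction)
    train = (images[n_test:], labels[n_test:])
    test = (images[:n_test], labels[:n_test])
    return train, test, class_names
```

A call such as (x_train, y_train), (x_test, y_test), class_names = load_dataset(".") would then give arrays shaped like the MNIST ones, with class_names[label] mapping an integer label back to its folder name (for example "circles"). Everything is held in memory at once, so this only suits datasets that fit in RAM.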
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
