Custom Dataset class for a hierarchical dataset using the “Local Classifier per Parent Node” technique
I’m working on an image classification model for a hierarchical dataset using a PyTorch implementation of EfficientNet. I am looking for help on how to adapt my use of torchvision.datasets.ImageFolder, and any other part if necessary, to implement the Local Classifier per Parent Node (LCPN) technique for hierarchical datasets.
So far I have finished the basic model pipeline without any hierarchical technique, by flattening all label classes into a single level:
├── seg_train
│ ├── golden_retriever_dog
│ ├── german_shepherd_dog
│ ├── persian_cat
│ ├── shorthair_cat
│ ├── golden_hamster
│ └── syrian_hamster
This is the current file tree for the training images. Each directory is an animal breed (of dog, cat, or hamster) filled with .png images.
├── seg_train
│ ├── dogs
│ │ ├── golden_retriever
│ │ └── german_shepherd
│ ├── cats
│ │ ├── persian
│ │ └── shorthair
│ └── hamster
│   ├── golden
│   └── syrian
This is the file tree I am working toward, so that I can take advantage of the parent node’s information.
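One way to get parent labels without moving any files, sketched under the assumption that every flat directory name ends in `_<parent>` as in the flat tree above (`_dog`, `_cat`, `_hamster`): split each class name on its last underscore and build a lookup from flat class index to parent index. The helper names `split_flat_name` and `build_parent_lookup` are hypothetical and not part of torchvision.

```python
from typing import Dict, List, Tuple

# Flat class names from the first file tree, sorted alphabetically
# (the order ImageFolder is expected to assign indices in).
FLAT_CLASSES: List[str] = sorted([
    "golden_retriever_dog", "german_shepherd_dog",
    "persian_cat", "shorthair_cat",
    "golden_hamster", "syrian_hamster",
])


def split_flat_name(name: str) -> Tuple[str, str]:
    """Split 'golden_retriever_dog' into ('dog', 'golden_retriever')."""
    child, parent = name.rsplit("_", 1)
    return parent, child


def build_parent_lookup(
    flat_classes: List[str],
) -> Tuple[Dict[str, int], Dict[int, int]]:
    """Return (parent name -> parent index, flat index -> parent index)."""
    parents = sorted({split_flat_name(c)[0] for c in flat_classes})
    parent_to_idx = {p: i for i, p in enumerate(parents)}
    flat_to_parent = {
        i: parent_to_idx[split_flat_name(c)[0]]
        for i, c in enumerate(flat_classes)
    }
    return parent_to_idx, flat_to_parent


parent_to_idx, flat_to_parent = build_parent_lookup(FLAT_CLASSES)
```

Since ImageFolder accepts a `target_transform` argument, passing something like `target_transform=flat_to_parent.__getitem__` on the flat tree might be enough to produce parent labels, though this is untested against the pipeline in this question.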
Here’s my short explanation of what I’m trying to accomplish (LCPN), from this blog:
- Instead of one classifier classifying six categories, as in my original setup, I want to train a single classifier to classify images into parent categories (dogs, cats, hamster) and three classifiers, one per parent category, to classify those images into subcategories.
- For example, let’s say I have an image of a golden_retriever. The parent classifier will classify it into the “dogs” (parent) category, and the child classifier for dogs will then classify it into the golden_retriever label.
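The two-step routing described in the bullets can be sketched roughly as follows. This is only an illustration of the control flow: `predict_lcpn` is a hypothetical helper, and the toy "classifiers" just inspect a file name instead of running EfficientNet on an image.

```python
from typing import Any, Callable, Dict, Tuple

Classifier = Callable[[Any], str]


def predict_lcpn(
    image: Any,
    parent_clf: Classifier,
    child_clfs: Dict[str, Classifier],
) -> Tuple[str, str]:
    """Predict the parent first, then ask that parent's own child classifier."""
    parent = parent_clf(image)
    child = child_clfs[parent](image)
    return parent, child


# Toy stand-ins (assumptions for illustration only): the "image" is a file name.
def toy_parent(name: str) -> str:
    if "retriever" in name or "shepherd" in name:
        return "dogs"
    if "persian" in name or "shorthair" in name:
        return "cats"
    return "hamster"


toy_children: Dict[str, Classifier] = {
    "dogs": lambda n: "golden_retriever" if "retriever" in n else "german_shepherd",
    "cats": lambda n: "persian" if "persian" in n else "shorthair",
    "hamster": lambda n: "golden" if "golden" in n else "syrian",
}

result = predict_lcpn("golden_retriever_01.png", toy_parent, toy_children)
```

In a real setup each entry would be a trained model plus an argmax over its own label set. One consequence of this design is that a mistake by the parent classifier cannot be recovered by the child classifier.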
Currently, my load_dataset code looks like the following. I can create a separate train_data loader for each parent node (dogs, cats, hamster), but I am having difficulty coding the loader for the parent classifier, where the “dogs” class should include all images from its subcategories.
import torchvision


def load_dataset():
    train_data = torchvision.datasets.ImageFolder(
        root="../input/seg_train", transform=train_transforms
    )
    test_data = torchvision.datasets.ImageFolder(
        root="../input/seg_test", transform=test_transforms
    )
    dataloaders = data_loader(
        train_data, test_data, valid_size=0.2, batch_size=BATCH_SIZE
    )
    # class labels
    classes = train_data.classes
    # decoder maps class name -> integer index; encoder maps index -> class name
    decoder = {}
    for i in range(len(classes)):
        decoder[classes[i]] = i
    encoder = {}
    for i in range(len(classes)):
        encoder[i] = classes[i]
    # NOTE: train_loader and inv_normalize are not defined in this snippet
    return train_loader, dataloaders, classes, encoder, inv_normalize
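For the nested tree, one possible approach is to index the directory once and keep both labels for every image, then derive the parent loader and each child loader from that single index. The sketch below uses only the standard library. `index_hierarchy`, `child_subset`, and the extension list are hypothetical names and assumptions, and the tree is assumed to be exactly two levels deep (parent/child/image).

```python
import os
from typing import Dict, List, Tuple

IMG_EXTENSIONS = (".png", ".jpg", ".jpeg")  # assumed; the question only mentions .png

Sample = Tuple[str, int, int]  # (image path, parent index, child index within parent)


def _subdirs(path: str) -> List[str]:
    """Sorted names of the immediate subdirectories of path."""
    return sorted(d for d in os.listdir(path) if os.path.isdir(os.path.join(path, d)))


def index_hierarchy(
    root: str,
) -> Tuple[List[Sample], Dict[str, int], Dict[str, Dict[str, int]]]:
    """Walk root/parent/child/*.png and label every image at both levels."""
    parents = _subdirs(root)
    parent_to_idx = {p: i for i, p in enumerate(parents)}
    child_to_idx: Dict[str, Dict[str, int]] = {}
    samples: List[Sample] = []
    for parent in parents:
        children = _subdirs(os.path.join(root, parent))
        child_to_idx[parent] = {c: i for i, c in enumerate(children)}
        for child in children:
            child_dir = os.path.join(root, parent, child)
            for fname in sorted(os.listdir(child_dir)):
                if fname.lower().endswith(IMG_EXTENSIONS):
                    samples.append((
                        os.path.join(child_dir, fname),
                        parent_to_idx[parent],
                        child_to_idx[parent][child],
                    ))
    return samples, parent_to_idx, child_to_idx


def child_subset(samples: List[Sample], parent_idx: int) -> List[Tuple[str, int]]:
    """(path, child index) pairs for the child classifier of one parent."""
    return [(path, child) for path, parent, child in samples if parent == parent_idx]
```

The parent loader would use `(path, parent_idx)` from every sample, and each child loader would use `child_subset(samples, parent_idx)`. Either list can be wrapped in a small `torch.utils.data.Dataset` that opens the image in `__getitem__`. It may also be worth checking whether ImageFolder pointed at the nested `seg_train` root already yields the parent labels. As far as I can tell it treats the top-level directories as classes and collects images from their subdirectories recursively, but that should be verified on the installed torchvision version.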
I would really appreciate it if anyone could share their insights on my issue! I am not sure whether this is the optimal way to implement the LCPN technique, so I welcome any alternative solutions as well!
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
