Get filename (label) from directory path in Google Drive

I am working on a classification task in Google Colab. The dataset I'm using is stored on Google Drive, and each folder name is the label for the images inside it, e.g. train/cat/img1.jpg, train/dog/img03.jpg.

How can I extract the label from the folder name? I have tried the code below, but it does not extract the folder name.

train_images = []
train_labels = []
for directory_path in glob.glob("/content/drive/My Drive/images/train/*"):
    label = directory_path.split("\\")[-1]
    print(label)
    for img_path in glob.glob(os.path.join(directory_path, "*.*")):
        print(img_path)
        img = cv2.imread(img_path, cv2.IMREAD_COLOR)
        img = cv2.resize(img, (SIZE,SIZE))
        img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
        train_images.append(img)
        train_labels.append(label)

train_images = np.array(train_images)
train_labels = np.array(train_labels)


Solution 1:[1]

Since you know the parent folder ID, you can use Files: list to retrieve the files whose parent is the train folder, filtering with the q parameter:

'q': "'{TRAIN_FOLDER_ID}' in parents"
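As a rough sketch of how that query string might be assembled, the helper below substitutes a real folder ID for the placeholder and can restrict the results to subfolders, since each subfolder of train is one label. The name build_children_query is hypothetical (it is not part of any library) and the folder ID "abc123" is made up; the trashed and mimeType terms are standard Drive search query terms.

```python
# MIME type that Google Drive assigns to folders.
FOLDER_MIME = "application/vnd.google-apps.folder"


def build_children_query(folder_id, folders_only=True):
    """Return a Drive search query matching the direct children of folder_id.

    Hypothetical helper for illustration; folder_id is the ID of the train folder.
    """
    clauses = [f"'{folder_id}' in parents", "trashed = false"]
    if folders_only:
        # Keep only subfolders (one per label), skipping any loose files.
        clauses.append(f"mimeType = '{FOLDER_MIME}'")
    return " and ".join(clauses)


# "abc123" is a made-up folder ID used only for this example.
query = build_children_query("abc123")
print(query)
```

The resulting string would be passed as the value of 'q' in the request.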

You would just have to adapt the example "Listing files in Google Drive" to use this query:

filenames = drive.ListFile({
  'q': "'{TRAIN_FOLDER_ID}' in parents", 
  'fields': "files(name)"
}).GetList()
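If the Drive is mounted in Colab, as the /content/drive/... path in the question suggests, the label can probably also be read directly from the local path without calling the Drive API. Colab runs on Linux, where paths use forward slashes, so splitting on "\\" returns the whole path unchanged; os.path.basename avoids hard-coding a separator. The following is a minimal sketch under that assumption, and list_labeled_images is a hypothetical helper name, not something from the question or the answer.

```python
import glob
import os


def list_labeled_images(train_dir):
    """Return (image_path, label) pairs, taking each label from its subfolder name."""
    pairs = []
    for directory_path in sorted(glob.glob(os.path.join(train_dir, "*"))):
        if not os.path.isdir(directory_path):
            continue  # skip loose files sitting directly under train_dir
        # basename works with the platform's separator; normpath drops any trailing slash.
        label = os.path.basename(os.path.normpath(directory_path))
        for img_path in sorted(glob.glob(os.path.join(directory_path, "*.*"))):
            pairs.append((img_path, label))
    return pairs
```

In the question's loop, the equivalent change would be replacing directory_path.split("\\")[-1] with os.path.basename(directory_path).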

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Iamblichus