TensorFlow 2 - very high SHR on GPU training

I am seeing very high SHR (shared memory) usage, and my process takes far too long to run. I am only doing inference, with the following setup:

gpus = tf.config.list_physical_devices('GPU')
if gpus:
    # Allocate GPU memory on demand instead of reserving it all up front
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
    tf.config.set_visible_devices(gpus[arg.gpu], 'GPU')
    logical_gpus = tf.config.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPU")

# Initialize Dataloader
dataset = UCFDataset('labels.csv', augmentation=arg.augm, factor=arg.factor)
dataloader = DataLoader(dataset=dataset, batch_size=1)

with tf.device(f'/device:GPU:{arg.gpu}'):
    # Initialize model
    model = I3D(labels=dataset.kinetics_labels)

    for step, (video, category, labels) in enumerate(tqdm(dataloader)):
        top_1_bool, top_3_bool, top_5_bool = model.predict(video, labels)
        ...
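
One thing worth checking in the setup above (an observation, not a confirmed cause of the slowdown): once `tf.config.set_visible_devices` restricts TensorFlow to a single GPU, the logical devices are renumbered starting from 0. The string `/device:GPU:{arg.gpu}` therefore only names the visible GPU when `arg.gpu` is 0. A minimal sketch of a setup that asks TensorFlow for the logical name instead of building it by hand; `configure_single_gpu` is a hypothetical helper name, not part of TensorFlow:

```python
def configure_single_gpu(physical_index):
    """Expose one physical GPU to TensorFlow and return its logical device name.

    Must run before any op or model touches the GPU, because visibility and
    memory growth cannot be changed once the runtime is initialized.
    """
    # Imported lazily so this module can be loaded without TensorFlow present.
    import tensorflow as tf

    gpus = tf.config.list_physical_devices('GPU')
    if not gpus:
        # No GPU found: fall back to the CPU device name.
        return '/device:CPU:0'

    gpu = gpus[physical_index]
    # Hide every other GPU from this process.
    tf.config.set_visible_devices(gpu, 'GPU')
    # Allocate memory on demand instead of reserving the whole card.
    tf.config.experimental.set_memory_growth(gpu, True)

    # Logical devices are numbered from 0 among the visible devices,
    # so with one visible GPU this is '/device:GPU:0'.
    logical_gpus = tf.config.list_logical_devices('GPU')
    return logical_gpus[0].name
```

With this, the loop would be wrapped as `with tf.device(configure_single_gpu(arg.gpu)):` rather than with a device string derived from the physical index.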

The SHR usage of the process is enormous.
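
One way to put a number on this from inside the script is to read the kernel's own accounting. This is a rough sketch for Linux only. As far as I understand, the SHR column in `top` roughly corresponds to resident file-backed pages plus resident shared-memory pages, which `/proc/<pid>/status` reports as `RssFile` and `RssShmem` on kernels 4.5 and later. Memory-mapped files such as large shared libraries count toward it. The function names below are hypothetical:

```python
def parse_status_kib(status_text):
    """Extract the resident-memory fields (in KiB) from /proc/<pid>/status text."""
    wanted = ('VmRSS', 'RssAnon', 'RssFile', 'RssShmem')
    fields = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(':')
        if key in wanted:
            # Lines look like "RssFile:     768 kB"; keep the number only.
            fields[key] = int(value.split()[0])
    return fields


def shared_kib(fields):
    """Approximate the SHR figure: file-backed plus shared-memory resident pages."""
    return fields.get('RssFile', 0) + fields.get('RssShmem', 0)


def read_own_status():
    """Read the memory fields of the current process (Linux only)."""
    with open('/proc/self/status') as fh:
        return parse_status_kib(fh.read())
```

Calling `shared_kib(read_own_status())` once before the model is created and once after the first `model.predict` call would show whether the shared portion grows when the GPU libraries are loaded or during the loop itself.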

Also, after running the script, I get the following warning:

I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory.

Can anyone help me?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
