TensorFlow Dataset won't output tensors to GPU memory
I have a list of NumPy arrays, arr_list. Because my arrays all have different shapes, I am trying to use the tf.data.Dataset.from_generator function to create a TensorFlow dataset from this list.
Here's my generator:
def generator_func():
    for (a, b), c in arr_list:
        a = tf.cast(a, dtype=tf.uint16)
        b = tf.cast(b, dtype=tf.float16)
        c = tf.cast(c, dtype=tf.float32)
        yield (a, b), c
For context, a and b are the inputs to my model, and c is the target output.
When I run the generator directly, everything works and the tensors live on the GPU as expected:
gen = generator_func()
(a, b), c = next(gen)
print(a.device)
print(b.device)
print(c.device)
"""
Output:
/job:localhost/replica:0/task:0/device:GPU:0
/job:localhost/replica:0/task:0/device:GPU:0
/job:localhost/replica:0/task:0/device:GPU:0
"""
But this fails when I try to use datasets:
data_signature = (
(tf.TensorSpec(shape=(1, None), dtype=tf.uint16),
tf.TensorSpec(shape=(1, None, 300), dtype=tf.float16)),
tf.TensorSpec(shape=(1, None), dtype=tf.float32),
)
train = tf.data.Dataset.from_generator(generator_func, output_signature=data_signature)
for (a, b), c in train.take(1):
    print(a.device)
    print(b.device)
    print(c.device)
"""
Output:
/job:localhost/replica:0/task:0/device:CPU:0
/job:localhost/replica:0/task:0/device:CPU:0
/job:localhost/replica:0/task:0/device:CPU:0
"""
Sure, maybe the device scope changes when the generator runs inside a dataset. However, using the copy_to_device method does not fix the issue:
train = train.apply(tf.data.experimental.copy_to_device("/gpu:0"))
I have also tried the prefetch_to_device method, without success.
How would I go about debugging this? Why won't the tensors coming out of the TensorFlow Dataset go to GPU memory?
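As a starting point for debugging, a minimal sketch along the following lines may help separate "TensorFlow cannot see the GPU" from "the dataset places its outputs on the CPU". The arr_list here is a hypothetical stand-in with made-up shapes, the generator yields plain NumPy arrays instead of calling tf.cast, and the names target, report and c_copy are introduced only for illustration. It falls back to the CPU when no GPU is visible, so it is a diagnostic aid under those assumptions rather than a fix.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for arr_list: two samples of different lengths.
arr_list = [
    ((np.zeros((1, n), dtype=np.uint16),
      np.zeros((1, n, 300), dtype=np.float16)),
     np.zeros((1, n), dtype=np.float32))
    for n in (3, 5)
]

def generator_func():
    # Yield NumPy arrays; from_generator converts them using output_signature.
    for (a, b), c in arr_list:
        yield (a, b), c

data_signature = (
    (tf.TensorSpec(shape=(1, None), dtype=tf.uint16),
     tf.TensorSpec(shape=(1, None, 300), dtype=tf.float16)),
    tf.TensorSpec(shape=(1, None), dtype=tf.float32),
)

# Step 1: check whether TensorFlow can see a GPU at all.
gpus = tf.config.list_physical_devices("GPU")
print("Visible GPUs:", gpus)
target = "/GPU:0" if gpus else "/CPU:0"  # assumption: fall back to CPU

train = tf.data.Dataset.from_generator(
    generator_func, output_signature=data_signature
)

# Step 2: pull one element and record dtype, shape and device of each tensor.
(a, b), c = next(iter(train))
report = {
    name: (t.dtype.name, tuple(t.shape), t.device)
    for name, t in (("a", a), ("b", b), ("c", c))
}
for name, info in report.items():
    print(name, info)

# Step 3: request an explicit copy on the target device and compare devices.
with tf.device(target):
    c_copy = tf.identity(c)
print("c from dataset:", c.device)
print("c after explicit copy:", c_copy.device)
```

If step 3 reports a GPU device while step 2 reports the CPU, the GPU itself is usable and the placement is coming from the input pipeline. It may also be worth ruling out the order of transformations: the documentation for tf.data.experimental.prefetch_to_device states that it must be the final transformation in the pipeline, so applying take(1) after it could be a factor.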
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
