Run a Python function on a GPU using Ray

I'm using a Python package called Ray to run the example shown below in parallel. The code is run on a machine with 80 CPU cores and 4 GPUs.

import ray
import time

ray.init()

@ray.remote
def squared(x):
    time.sleep(1)
    y = x**2
    return y

tic = time.perf_counter()

lazy_values = [squared.remote(x) for x in range(1000)]
values = ray.get(lazy_values)

toc = time.perf_counter()

print(f'Elapsed time {toc - tic:.2f} s')
print(f'{values[:5]} ... {values[-5:]}')

ray.shutdown()

Output from the above example is:

Elapsed time 13.09 s
[0, 1, 4, 9, 16] ... [990025, 992016, 994009, 996004, 998001]
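As a sanity check on the 13.09 s figure, the run can be modelled as waves of one-second tasks. This is only a rough sketch: it assumes each task reserves one CPU (Ray's default for tasks) and ignores scheduling overhead, and the helper name estimated_wall_time is made up for illustration.

```python
import math

def estimated_wall_time(n_tasks, n_slots, task_seconds=1.0):
    """Rough lower bound: tasks run in waves of at most n_slots at a time."""
    waves = math.ceil(n_tasks / n_slots)
    return waves * task_seconds

# 1000 one-second tasks, 80 CPU cores, one CPU per task (assumed default)
cpu_estimate = estimated_wall_time(1000, 80)
print(f"Estimated CPU run: {cpu_estimate:.0f} s")
```

Under those assumptions the estimate is 13 waves of one second each, which is in line with the measured time.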

Below is the same example, which I would like to run on the GPU using the num_gpus parameter. The GPUs in the machine are NVIDIA Tesla V100s.

import ray
import time

ray.init(num_gpus=1)

@ray.remote(num_gpus=1)
def squared(x):
    time.sleep(1)
    y = x**2
    return y

tic = time.perf_counter()

lazy_values = [squared.remote(x) for x in range(1000)]
values = ray.get(lazy_values)

toc = time.perf_counter()

print(f'Elapsed time {toc - tic:.2f} s')
print(f'{values[:5]} ... {values[-5:]}')

ray.shutdown()

The GPU example never completed, and I terminated it after several minutes. I checked the resources available to Ray using import ray; ray.init(); ray.available_resources(), which reports 80 CPUs and 4 GPUs, so Ray seems to know about the available GPUs.

I modified the GPU example to run fewer executions by changing range(1000) to range(10). See the revised example below.

import ray
import time

ray.init(num_gpus=1)

@ray.remote(num_gpus=1)
def squared(x):
    time.sleep(1)
    y = x**2
    return y

tic = time.perf_counter()

lazy_values = [squared.remote(x) for x in range(10)]
values = ray.get(lazy_values)

toc = time.perf_counter()

print(f'Elapsed time {toc - tic:.2f} s')
print(f'{values[:5]} ... {values[-5:]}')

ray.shutdown()

The output from the revised GPU example is:

Elapsed time 10.06 s
[0, 1, 4, 9, 16] ... [25, 36, 49, 64, 81]

The revised GPU example completed, but it looks like Ray is not using the GPU in parallel. Is there something else I should do to get Ray to run on the GPU in parallel?
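One possible reading of these timings, sketched below, is that num_gpus acts as a resource reservation: ray.init(num_gpus=1) tells Ray that one GPU exists, and @ray.remote(num_gpus=1) makes every task reserve a whole GPU, so only one task fits at a time. The function body itself (a sleep and an integer square) does no GPU work. The helpers concurrent_slots and estimated_seconds are hypothetical and only model the arithmetic; they do not call Ray, and the one-CPU-per-task figure is the assumed default.

```python
import math

def concurrent_slots(cluster, per_task):
    """How many tasks fit at once, given cluster totals and per-task needs."""
    return min(
        math.floor(cluster[name] / need)
        for name, need in per_task.items()
        if need > 0
    )

def estimated_seconds(n_tasks, slots, task_seconds=1.0):
    """Rough lower bound on wall time when tasks run in waves of `slots`."""
    return math.ceil(n_tasks / slots) * task_seconds

# As posted: Ray is told about 1 GPU and each task reserves 1 GPU.
as_posted = concurrent_slots({"CPU": 80, "GPU": 1}, {"CPU": 1, "GPU": 1})
# Hypothetical variants: expose all 4 GPUs, or reserve a quarter GPU per task.
all_gpus = concurrent_slots({"CPU": 80, "GPU": 4}, {"CPU": 1, "GPU": 1})
quarter_gpu = concurrent_slots({"CPU": 80, "GPU": 4}, {"CPU": 1, "GPU": 0.25})

print(as_posted, estimated_seconds(10, as_posted), estimated_seconds(1000, as_posted))
print(all_gpus, estimated_seconds(1000, all_gpus))
print(quarter_gpu, estimated_seconds(1000, quarter_gpu))
```

With one slot, 10 tasks take about 10 s, matching the 10.06 s above, and 1000 tasks would need roughly 1000 s, which would explain why the first GPU run was still going after several minutes. Ray does accept fractional values such as num_gpus=0.25, but whether sharing a GPU that way is appropriate depends on what the real workload does on the device.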



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow