Generating keypoint heatmaps in TensorFlow

I am trying to train a model for facial keypoint detection. It is a Stacked Hourglass model that outputs a 256x256x68 tensor. Each of the 68 output channels should have a hot region around one keypoint. I have defined the model and graph constructs; my problem is generating the dataset.

I need to generate a 256x256x68 labels tensor from the 68x2 landmarks tensor. Although I could do this in NumPy and save the result in a TFRecord, I would like to see if it's possible to do it at training time inside the parse_function of the tf.data.Dataset API.

For each heatmap, I need to draw a Gaussian at the (x, y) location of the corresponding landmark point.

Code

I have the following code inside parse_function:

# heatmaps
joints = tf.stack([points_x, points_y], axis=1)
heatmaps = _generate_heatmaps(joints, 1., IMG_DIM)

This is _generate_heatmaps function:

def _generate_heatmaps(joints, sigma, outres):
    npart = 68
    gtm = tf.placeholder(tf.float32, shape=[None, outres, outres, npart])
    gtmaps = tf.zeros_like(gtm)

    for i in range(npart):
        visibility = 1 
        if visibility > 0:
            gtmaps[:, :, :, i] = _draw_hotspot(gtmaps[:, :, :, i], joints[:, i, :], sigma)
    return gtmaps

The _draw_hotspot function:

def _draw_hotspot(img, pt, sigma, type='Gaussian'):
    # Draw a 2D gaussian
    # Adopted from https://github.com/anewell/pose-hg-train/blob/master/src/pypose/draw.py

    # Check that any part of the gaussian is in-bounds
    ul = [(pt[:,0] - 3 * sigma), (pt[:,1] - 3 * sigma)]
    br = [(pt[:,0] + 3 * sigma + 1), (pt[:,1] + 3 * sigma + 1)]
    # if (ul[0] >= img.shape[1] or ul[1] >= img.shape[0] or
    #         br[0] < 0 or br[1] < 0):
    #     # If not, just return the image as is
    #     return img

    # Generate gaussian
    size = 6 * sigma + 1
    x = np.arange(0, size, 1, float)
    y = x[:, np.newaxis]
    x0 = y0 = size // 2
    # The gaussian is not normalized, we want the center value to equal 1
    # if type == 'Gaussian':
    g = np.exp(- ((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))
    # elif type == 'Cauchy':
    #     g = sigma / (((x - x0) ** 2 + (y - y0) ** 2 + sigma ** 2) ** 1.5)

    # Usable gaussian range
    g_x = [tf.clip_by_value(-1*ul[0], -100, 0)*-1, tf.minimum(br[0], img.shape[2].value) - ul[0]]
    g_y = [tf.clip_by_value(-1*ul[1], -100, 0)*-1, tf.minimum(br[1], img.shape[1].value) - ul[1]]

    g_x = tf.cast(g_x, tf.int64)
    g_y = tf.cast(g_y, tf.int64)

    # Image range
    img_x = [tf.clip_by_value(ul[0], 0, img.shape[1].value), tf.clip_by_value(br[0], 0, img.shape[1].value)]
    img_y = [tf.clip_by_value(ul[1], 0, img.shape[2].value), tf.clip_by_value(br[1], 0, img.shape[2].value)]

    img_x = tf.cast(img_x, tf.int64)
    img_y = tf.cast(img_y, tf.int64)

    # img[img_y[0]:img_y[1], img_x[0]:img_x[1]] = g[g_y[0]:g_y[1], g_x[0]:g_x[1]]
    img_slice = tf.image.extract_glimpse # ... stuck ...

    return img

I need to convert this NumPy line to TensorFlow: img[img_y[0]:img_y[1], img_x[0]:img_x[1]] = g[g_y[0]:g_y[1], g_x[0]:g_x[1]]

Just that last line! Can anyone help?



Solution 1:[1]

Here's a way using SciPy, which you can work into your TF pipeline with tf.py_func (tf.py_function or tf.numpy_function in TensorFlow 2.x):

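import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import multivariate_normal

# (row, col) coordinates of every pixel in a 68x68 grid
pos = np.dstack(np.mgrid[0:68:1, 0:68:1])
# hotspot at pixel (22, 43); cov=4 is the variance, i.e. a standard deviation of 2 pixels
rv = multivariate_normal(mean=[22, 43], cov=4)
plt.imshow(rv.pdf(pos))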

[Image: generated heatmap]
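To make the "work it into your TF pipeline" step concrete, here is a minimal sketch that wraps a NumPy heatmap generator with tf.numpy_function inside a tf.data map function. The names heatmaps_np and add_heatmaps are hypothetical, IMG_DIM = 256 is taken from the question, and I use a plain unit-peak Gaussian in NumPy rather than the SciPy pdf above, so treat it as one possible approach rather than the answer's exact method.

```python
import numpy as np
import tensorflow as tf

IMG_DIM = 256  # heatmap side length, as in the question


def heatmaps_np(landmarks, sigma=1.0, size=IMG_DIM):
    """Return a [size, size, K] float32 array with one Gaussian per landmark."""
    landmarks = np.asarray(landmarks, dtype=np.float32)     # [K, 2] as (x, y)
    ys, xs = np.mgrid[0:size, 0:size].astype(np.float32)    # each [size, size]
    dx = xs[..., None] - landmarks[:, 0]                     # [size, size, K]
    dy = ys[..., None] - landmarks[:, 1]                     # [size, size, K]
    # Unnormalized Gaussian, so the value at each landmark is 1.
    return np.exp(-(dx ** 2 + dy ** 2) / (2.0 * sigma ** 2)).astype(np.float32)


def add_heatmaps(image, landmarks):
    """Hypothetical tf.data map function: attach heatmap labels to an example."""
    maps = tf.numpy_function(heatmaps_np, [landmarks], tf.float32)
    # tf.numpy_function drops static shape information, so restore it.
    maps.set_shape([IMG_DIM, IMG_DIM, landmarks.shape[0]])
    return image, maps
```

Assuming a dataset that yields (image, landmarks) pairs, this would be applied as dataset.map(add_heatmaps). Note that the wrapped function runs in Python, so it is not serialized with the graph and is subject to the usual Python-side performance limits.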

Solution 2:[2]

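TensorFlow tensors are immutable, so they don't support slice assignment the way NumPy arrays do. Therefore, you have to come up with a way to do the same thing using only TensorFlow operations.

One way is to split the image into pieces (top, bottom, and a middle band that is further split into left, center, and right), replace the center piece, and concatenate everything back together.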

So replace your line with the following code:

top = image[:img_y[0], ...]
middle = image[img_y[0]:img_y[1], ...]
bottom = image[img_y[1]:, ...]

left = middle[:, :img_x[0]]
right = middle[:, img_x[1]:]
center = g[g_y[0]:g_y[1], g_x[0]:g_x[1], tf.newaxis]

updated_middle = tf.concat([left, center, right], axis=1)
image = tf.concat([top, updated_middle, bottom], axis=0)

First, I divide along the y axis into 3 parts. The middle part is where the values should be updated. So I divide the middle part into another 3 parts along the x axis. Now I can throw away the center part of the original image and replace it with computed values. Then, I concatenate everything back together.
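As a quick sanity check of the split-and-concatenate idea, here is a toy example on a 5x5 image compared against ordinary NumPy slice assignment. The sizes, the patch of ones, and the variable names patch, updated, and expected are made up for illustration.

```python
import numpy as np
import tensorflow as tf

image = tf.zeros([5, 5, 1])
patch = tf.ones([2, 3])          # values to write, shape [rows, cols]
img_y, img_x = (1, 3), (2, 5)    # target rows 1..2 and columns 2..4

# Split along y, then split the middle band along x.
top = image[:img_y[0], ...]
middle = image[img_y[0]:img_y[1], ...]
bottom = image[img_y[1]:, ...]
left = middle[:, :img_x[0]]
right = middle[:, img_x[1]:]     # empty here, which tf.concat handles fine
center = patch[..., tf.newaxis]

updated_middle = tf.concat([left, center, right], axis=1)
updated = tf.concat([top, updated_middle, bottom], axis=0)

# Reference result using NumPy slice assignment.
expected = np.zeros([5, 5, 1], dtype=np.float32)
expected[1:3, 2:5, 0] = 1.0
```

Here updated should hold the same values as expected, which is the behaviour the original img[...] = g[...] line relies on.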

For completeness, here's the function:

import tensorflow as tf

@tf.function
def draw_gaussian_point(image, point, sigma):
    """
    Draw a 2D gaussian.

    Adapted from https://github.com/princeton-vl/pose-hg-train/blob/master/src/pypose/draw.py.

    Parameters
    ----------
    image   Input image of shape [height, width, 1]
    point   Point in format [x, y]
    sigma   Sigma param in Gaussian

    Returns
    -------
    updated_image  An image of shape [height, width, 1] with a gaussian point drawn in it.
    """
    tf.assert_rank(image, 3)
    tf.assert_rank(point, 1)
    tf.assert_rank(sigma, 0)

    # Check that any part of the gaussian is in-bounds
    # Note: int() needs Python numbers here; it fails on symbolic tensors during tracing
    ul = [int(point[0] - 3 * sigma), int(point[1] - 3 * sigma)]
    br = [int(point[0] + 3 * sigma + 1), int(point[1] + 3 * sigma + 1)]
    if (ul[0] >= image.shape[1] or ul[1] >= image.shape[0] or
            br[0] < 0 or br[1] < 0):
        # If not, just return the image as is
        return image

    # Generate gaussian
    size = 6 * sigma + 1
    x = tf.range(0, size, dtype=tf.float32)
    y = x[:, tf.newaxis]
    x0 = y0 = size // 2
    # The gaussian is not normalized, we want the center value to equal 1
    g = tf.exp(- ((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))

    # Usable gaussian range
    g_x = max(0, -ul[0]), min(br[0], image.shape[1]) - ul[0]
    g_y = max(0, -ul[1]), min(br[1], image.shape[0]) - ul[1]
    # Image range
    img_x = max(0, ul[0]), min(br[0], image.shape[1])
    img_y = max(0, ul[1]), min(br[1], image.shape[0])

    top = image[:img_y[0], ...]
    middle = image[img_y[0]:img_y[1], ...]
    bottom = image[img_y[1]:, ...]

    left = middle[:, :img_x[0]]
    right = middle[:, img_x[1]:]
    center = g[g_y[0]:g_y[1], g_x[0]:g_x[1], tf.newaxis]

    updated_middle = tf.concat([left, center, right], axis=1)
    image = tf.concat([top, updated_middle, bottom], axis=0)
    return image

Solution 3:[3]

I know this question is old, but while searching for a solution I wrote one myself:

import matplotlib.pyplot as plt
import tensorflow as tf

@tf.function
def gaussian(shape=[5, 5], peakValue = 1.0, center=[25.0, 25.0], spread=1.0):
  grid = tf.stack(
      tf.meshgrid(
          tf.range(shape[1], dtype=tf.float32),  # x coordinates (columns)
          tf.range(shape[0], dtype=tf.float32)   # y coordinates (rows)
          ), 
          axis=-1
  )
  unstacked = tf.reshape(grid, [-1, 2])

  numerator = tf.square(center - unstacked)
  denominator = 2 * tf.square(spread)

  exponent = -tf.reduce_sum(numerator / denominator, axis=-1)

  unstacked_gaussian = peakValue * tf.exp(exponent)
  return tf.reshape(unstacked_gaussian, shape)

plt.imshow(gaussian(shape=[64, 64], peakValue=50.0, center=[30, 30], spread=1.0))

Output:

[Image: Gaussian heatmap]
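The same full-grid idea extends to all landmarks at once via broadcasting, which avoids slice assignment entirely and uses only TensorFlow ops, so it should be usable inside a tf.data parse function. This is a minimal sketch: landmarks_to_heatmaps is a hypothetical name, landmarks are assumed to be in (x, y) pixel order, and unlike the patch-based code in the question the Gaussian is not truncated at 3 sigma.

```python
import tensorflow as tf


def landmarks_to_heatmaps(landmarks, size=256, sigma=1.0):
    """Map [K, 2] (x, y) landmarks to a [size, size, K] stack of unit-peak Gaussians."""
    landmarks = tf.cast(landmarks, tf.float32)
    coords = tf.range(size, dtype=tf.float32)
    # Broadcast pixel coordinates against the K landmark coordinates.
    dx = coords[tf.newaxis, :, tf.newaxis] - landmarks[:, 0]   # [1, size, K]
    dy = coords[:, tf.newaxis, tf.newaxis] - landmarks[:, 1]   # [size, 1, K]
    # Unnormalized Gaussian: the value at each landmark location is 1.
    return tf.exp(-(tf.square(dx) + tf.square(dy)) / (2.0 * sigma ** 2))
```

For the question's setup, passing the 68x2 joints tensor should give a 256x256x68 label tensor indexed as [y, x, keypoint]. It computes every pixel for every keypoint, so it does more arithmetic than drawing a small patch, but it has no data-dependent slicing.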

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Ladislav Ondris
Solution 3 Maximilian