Set k-largest elements of a tensor to zero in TensorFlow

I want to find the k largest elements of each row of a tensor h and set those elements to zero.

I am able to select the indices of the top value of each row using the top_k function:

top_k = tf.nn.top_k(h, 1)

But I cannot use the indices returned by top_k to update the tensor.

How can I do that? Thanks in advance...



Solution 1:[1]

I was facing the opposite problem and wanted an operation that supported gradients. top_k did not support gradient propagation, so a good way is to implement the function in C++.

The C++ code for top_k can be found here.

Your operation's kernel will look like this:

template <typename T>
class MakeSparseOp : public OpKernel {
   public:
    explicit MakeSparseOp(OpKernelConstruction *context) : OpKernel(context) {}

    void Compute(OpKernelContext *context) override {
        // Grab the input tensors
        const auto &k_in = context->input(1);

        OP_REQUIRES(context, TensorShapeUtils::IsScalar(k_in.shape()),
                    errors::InvalidArgument("k must be scalar, got shape ",
                                            k_in.shape().DebugString()));

        int k = k_in.scalar<int32>()();
        OP_REQUIRES(context, k >= 0,
                    errors::InvalidArgument("Need k >= 0, got ", k));

        const Tensor &x_in = context->input(0);
        OP_REQUIRES(context, x_in.dims() >= 1,
                    errors::InvalidArgument("input must be >= 1-D, got shape ",
                                            x_in.shape().DebugString()));
        OP_REQUIRES(
            context, x_in.dim_size(x_in.dims() - 1) >= k,
            errors::InvalidArgument("input must have at least k columns"));



        // Flattening the input tensor
        const auto &x = x_in.flat_inner_dims<T>();

        const auto num_rows = x.dimension(0);
        const auto num_cols = x.dimension(1);

        TensorShape output_shape = x_in.shape();

        // Create an output tensor
        Tensor *x_out = nullptr;
        OP_REQUIRES_OK(context,
                       context->allocate_output(0, output_shape, &x_out));

        /*
         * Zero out the k largest values along the last dimension of the input
         */

        auto x_sparse = x_out->flat_inner_dims<T>();

        if (k == 0) return;  // Nothing to do

        // Use TopN to track the k largest elements of the current row
        gtl::TopN<std::pair<T, int32>> filter(k);

        x_sparse = x; // Copy all elements

        for (int r = 0; r < num_rows; r++) {
            // Processing a row at a time
            for (int32 c = 0; c < num_cols; c++) {
                // The second element is the negated index, so that lower-index
                // elements
                // are considered larger than higher-index elements in case of
                // ties.
                filter.push(std::make_pair(x(r, c), -c));

            }

            for (auto top_k_it = filter.unsorted_begin();
                 top_k_it != filter.unsorted_end(); ++top_k_it) {
                x_sparse(r, -top_k_it->second) = 0; // Set max k to zero
            }

            filter.Reset();
        }
    }
};

My implementation for a related problem is here.
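
If you are on a recent TensorFlow version and only need gradients with respect to the input values, a custom op may not be necessary: a one-hot mask built from the top_k indices can be multiplied away instead. Below is a minimal sketch of that mask-based alternative (my addition, not part of the original answer); gradients flow through the multiplication, while the mask itself is treated as a constant.

import tensorflow as tf

def zero_top_k(h, k):
    # idx has shape [num_rows, k]: the column of each of the k largest entries per row
    _, idx = tf.nn.top_k(h, k)
    num_cols = tf.shape(h)[-1]
    # one_hot has shape [num_rows, k, num_cols]; taking the max over the k axis
    # gives a [num_rows, num_cols] mask with 1.0 at the top-k positions
    mask = tf.reduce_max(tf.one_hot(idx, num_cols, dtype=h.dtype), axis=1)
    return h * (1.0 - mask)  # zero out the top-k entries of each row

# zero_top_k(tf.constant([[6., 2., 0.], [0., 4., 5.]]), 2)
# -> [[0., 0., 0.], [0., 0., 0.]] (the remaining entries were already zero)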

Solution 2:[2]

With the recent availability of the scatter_nd_update function in TensorFlow, here is a modified version of Olivier Moindrot's answer.

import tensorflow as tf  # TF 1.x API: tf.Session, tf.scatter_nd_update

k = 2
val_to_replace_with = -333
x = tf.Variable([[6., 2., 0.], [0., 4., 5.]])  # of type tf.float32


values, indices = tf.nn.top_k(x, k, sorted=False)  # indices will be [[0, 1], [1, 2]], values will be [[6., 2.], [4., 5.]]
# We need to create full indices like [[0, 0], [0, 1], [1, 1], [1, 2]]
my_range = tf.expand_dims(tf.range(0, tf.shape(indices)[0]), 1)  # will be [[0], [1]]
my_range_repeated = tf.tile(my_range, [1, k])  # will be [[0, 0], [1, 1]]
# change shapes to [N, k, 1] and [N, k, 1], to concatenate into [N, k, 2]
full_indices = tf.concat([tf.expand_dims(my_range_repeated, -1), tf.expand_dims(indices, -1)], axis=2)
full_indices = tf.reshape(full_indices, [-1, 2])


# only significant modification -----------------------------------------------------------------
updates = val_to_replace_with + tf.zeros([tf.size(indices)], dtype=tf.float32)
c = tf.scatter_nd_update(x, full_indices, updates)
# only significant modification -----------------------------------------------------------------


with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(c))
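
If x is a plain tensor rather than a tf.Variable, TF 2.x provides tf.tensor_scatter_nd_update, which returns a new tensor instead of updating in place. Here is a sketch of the same index construction in that style (my addition, not part of the original answer):

import tensorflow as tf

x = tf.constant([[6., 2., 0.], [0., 4., 5.]])
k = 2
values, indices = tf.nn.top_k(x, k, sorted=False)

# Same full-index construction as above: pair each top-k column index with its row index
rows = tf.tile(tf.expand_dims(tf.range(tf.shape(indices)[0]), 1), [1, k])
full_indices = tf.reshape(tf.stack([rows, indices], axis=-1), [-1, 2])

# Zeros answer the original question; use tf.fill for another replacement value
updates = tf.zeros([tf.size(indices)], dtype=x.dtype)
result = tf.tensor_scatter_nd_update(x, full_indices, updates)
# result: [[0., 0., 0.], [0., 0., 0.]]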

Solution 3:[3]

This follows Olivier Moindrot's idea, but implements it with TensorFlow's built-in ops.

import tensorflow as tf  # TF 2.x

x = tf.constant([[6., 2., 0.], [0., 4., 5.]])  # of type tf.float32
k = 2
values, indices = tf.nn.top_k(x, k, sorted=False)  # indices will be [[0, 1], [1, 2]], values will be [[6., 2.], [4., 5.]]


# We need to create full indices like [[0, 0], [0, 1], [1, 2], [1, 1]]
ii, _ = tf.meshgrid(tf.range(2), tf.range(k), indexing='ij')  # ii will be [[0, 0], [1, 1]]: the row index of each top-k entry
full_indices = tf.reshape(tf.stack([ii, indices], axis=-1), [-1, len(x.shape)])


tf.tensor_scatter_nd_sub(x, full_indices, tf.reshape(values, [-1]))
"""
In [249]: tf.tensor_scatter_nd_sub(x, full_indices, tf.reshape(values, [-1]))
Out[249]: 
<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
array([[0., 0., 0.],
       [0., 0., 0.]], dtype=float32)>
"""

The result is all zeros here only because, in this example, every element outside the top k was already zero.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution    Source
Solution 1  ARB
Solution 2  Batta
Solution 3