Dataframes and unsupported numpy.ndarrays in TensorFlow

I'm very new to NLP and working on making a chatbot based on a few different tutorials. I think I vaguely understand the concepts of encoding text into vectors that can be labeled with a "type" or "tag".

I've gotten to the point where I'm trying to train the model, using some encoded data that is in a pandas dataframe:

                                              pattern                   tag
0   [9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...              greeting
1   [10, 11, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...              greeting
2   [12, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...              greeting
3   [13, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...              greeting
4   [2, 3, 4, 14, 15, 16, 17, 0, 0, 0, 0, 0, 0, 0,...          roll-request
5   [18, 19, 20, 4, 21, 22, 23, 24, 25, 26, 27, 28...          roll-request
6   [39, 40, 41, 5, 42, 43, 44, 45, 46, 6, 0, 0, 0...          roll-request
7   [47, 48, 49, 1, 7, 50, 51, 52, 53, 54, 55, 56,...          roll-request
8   [61, 6, 62, 63, 1, 7, 0, 0, 0, 0, 0, 0, 0, 0, ...          roll-request
9   [64, 65, 5, 66, 67, 68, 69, 70, 0, 0, 0, 0, 0,...          roll-request
10  [2, 3, 71, 72, 8, 73, 74, 75, 76, 77, 8, 78, 7...  testing-status-query

I know it's a tiny dataset but I'm just going for a proof-of-concept at this point. Here's a closer look at the pattern column:

[[ 9  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [10 11  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [12  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [13  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [ 2  3  4 14 15 16 17  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [18 19 20  4 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38]
 [39 40 41  5 42 43 44 45 46  6  0  0  0  0  0  0  0  0  0  0  0  0]
 [47 48 49  1  7 50 51 52 53 54 55 56 57 58 59  1 60  0  0  0  0  0]
 [61  6 62 63  1  7  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [64 65  5 66 67 68 69 70  0  0  0  0  0  0  0  0  0  0  0  0  0  0]
 [ 2  3 71 72  8 73 74 75 76 77  8 78 79 80 81  0  0  0  0  0  0  0]]

As you can see, everything's padded correctly I think? So I try to feed the columns directly from the dataframe (which one of the tutorials I'm looking at says is fine).

        testing, training = datasplit(encoded_df, 0.2)

        test_x = testing['pattern']
        test_y = testing['tag']

        train_x = training['pattern']
        print(train_x)
        train_y = training['tag']
        print(train_y)

        model = define_model(vocab_size=vocab_size, max_length=max_length)

        history = model.fit(train_x, train_y, epochs=8, verbose=1, validation_data=(
            test_x, test_y), callbacks=callbacks)

But I get this error:

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).

I've tried googling and it seems like it could be a couple different things. Does anyone see anything glaringly wrong with what I'm doing? Any tips for a first-timer?



Solution 1:[1]

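The error indicates that the input could not be converted to a tf.Tensor. Each cell of the pattern column holds its own np.ndarray, so the column as a whole is an object-dtype array of arrays rather than a single numeric array.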

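You can use the function tf.convert_to_tensor to convert either a pd.DataFrame or an np.ndarray to a tf.Tensor, provided the data has a regular numeric dtype; a column of per-row arrays has to be stacked into one 2-D array first (for example with np.stack).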

An example of how to convert a pd.DataFrame to a tf.Tensor is available here:

https://www.tensorflow.org/api_docs/python/tf/convert_to_tensor
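
For the dataframe in the question, a minimal sketch of the conversion might look like the following. The small `encoded_df` here is a hypothetical stand-in for the asker's data (shortened to four tokens per row), and the `pd.factorize` step for the string tags is an assumption added for completeness, not something the original answer covers.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the encoded dataframe in the question:
# each cell of "pattern" holds its own padded NumPy array.
encoded_df = pd.DataFrame({
    "pattern": [
        np.array([9, 0, 0, 0]),
        np.array([10, 11, 0, 0]),
        np.array([2, 3, 4, 14]),
    ],
    "tag": ["greeting", "greeting", "roll-request"],
})

# The column is a 1-D object array of arrays, which TensorFlow cannot convert.
print(encoded_df["pattern"].dtype)  # object

# Stack the per-row arrays into one regular 2-D integer array.
train_x = np.stack(encoded_df["pattern"].tolist())
print(train_x.shape, train_x.dtype)

# Assumption: the string tags also need integer codes before training;
# pd.factorize is one simple way to get them.
train_y, tag_names = pd.factorize(encoded_df["tag"])
print(train_y, list(tag_names))
```

The resulting `train_x` and `train_y` are plain numeric arrays, so they can be passed to `model.fit` directly or wrapped with `tf.convert_to_tensor` first. The same stacking would need to be applied to the test split.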

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ai2ys