What does Keras Tokenizer num_words specify?
Given this piece of code:
from tensorflow.keras.preprocessing.text import Tokenizer
sentences = [
    'i love my dog',
    'I, love my cat',
    'You love my dog!'
]
tokenizer = Tokenizer(num_words = 1)
tokenizer.fit_on_texts(sentences)
word_index = tokenizer.word_index
print(word_index)
Whether I set num_words=1 or num_words=100, I get the same output when I run this cell in my Jupyter notebook, and I can't see what difference it makes to the tokenization:
{'love': 1, 'my': 2, 'i': 3, 'dog': 4, 'cat': 5, 'you': 6}
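As far as I understand, num_words does not filter word_index at all: word_index always records every word seen by fit_on_texts. The limit only kicks in when the tokenizer converts text, e.g. in texts_to_sequences or texts_to_matrix, where only the num_words - 1 most frequent words are kept. Here is a minimal sketch that makes the difference visible, reusing the same sentences (the loop over limits is just for illustration):

```python
from tensorflow.keras.preprocessing.text import Tokenizer

sentences = [
    'i love my dog',
    'I, love my cat',
    'You love my dog!'
]

# word_index is the same regardless of num_words; texts_to_sequences is not,
# because only word indices strictly below num_words survive the conversion.
for limit in (1, 3, 100):
    tokenizer = Tokenizer(num_words=limit)
    tokenizer.fit_on_texts(sentences)
    print(limit, tokenizer.word_index)                    # identical dict every time
    print(limit, tokenizer.texts_to_sequences(sentences)) # changes with the limit
```

With num_words=1 every word is dropped and the sequences come back empty; with num_words=3 only 'love' (index 1) and 'my' (index 2) survive; with num_words=100 the full sentences are encoded. That appears to be the only place the num_words argument makes a difference.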
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
