nltk.pos_tag not picking up "will" as MD

I have the following sentence from my corpus as an example. "This is a test case with multiple sentences. Will you get it right?"

When using nltk.pos_tag, the word "Will" should be tagged as MD (modal), correct? I have the following code for this sentence:

from collections import Counter

import nltk

def get_pos_tags(text) -> Counter:
    """Given a string, return a Counter of its POS tags, using NLTK."""
    text = str(text)

    tokens = nltk.word_tokenize(text)
    tagged = nltk.pos_tag(tokens)
    count = Counter(tag for _, tag in tagged)
    return count

However, if I print the tags, I get the following:

[('This', 'DT'), ('is', 'VBZ'), ('a', 'DT'), ('test', 'NN'), ('case', 'NN'), ('with', 'IN'), ('a', 'DT'), ('two', 'CD'), ('singular', 'JJ'), ('nouns', 'NNS'), ('.', '.')]
[('This', 'DT'), ('is', 'VBZ'), ('a', 'DT'), ('test', 'NN'), ('case', 'NN'), ('with', 'IN'), ('multiple', 'JJ'), ('sentences', 'NNS'), ('.', '.'), ('Will', 'NNP'), ('you', 'PRP'), ('get', 'VBP'), ('it', 'PRP'), ('right', 'RB'), ('?', '.')]

As you can see, "Will" gets tagged as an NNP, when it should be MD according to the documentation. Any reason why this is happening?

Update: "Will" only gets tagged as MD if I lowercase it, which is even stranger. It works as a workaround, but I still do not understand why.
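The lowercasing workaround from the update can be limited to sentence-initial tokens, so that capitalized words elsewhere in the sentence keep their casing. Below is a minimal sketch of that idea, not a confirmed fix. The function name tag_with_lowered_sentence_starts, the SENTENCE_END set, and the tagger parameter are hypothetical names introduced for illustration. One plausible reason the workaround helps, not confirmed here, is that a capitalized "Will" resembles the proper name, so the same trick would presumably mis-tag a real name at the start of a sentence.

```python
from typing import Callable, List, Sequence, Tuple

Tagged = List[Tuple[str, str]]

# Assumption: these tokens mark the end of a sentence in the tokenizer output.
SENTENCE_END = {".", "?", "!"}


def tag_with_lowered_sentence_starts(
    tokens: Sequence[str],
    tagger: Callable[[List[str]], Tagged],
) -> Tagged:
    """Lowercase sentence-initial tokens before tagging, then restore the original casing."""
    normalized: List[str] = []
    at_start = True  # the very first token starts a sentence
    for tok in tokens:
        normalized.append(tok.lower() if at_start else tok)
        # The next token starts a sentence if this one ends a sentence.
        at_start = tok in SENTENCE_END

    tagged = tagger(normalized)
    # Pair each tag with the original (un-lowered) token.
    return [(orig, tag) for orig, (_, tag) in zip(tokens, tagged)]
```

With NLTK installed and its tokenizer and tagger data downloaded, nltk.word_tokenize(text) can supply tokens and nltk.pos_tag can be passed as tagger.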

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
