Python NLTK - Tokenize sentences into words while removing numbers

Hoping someone can assist with this! I have a list of sentences read from a text file. I am trying to tokenize the sentences into words while also removing any sentences which contain only numbers. There is no pattern for when the numbers will appear.

The sentences I have:

[
  ['                    1'], 
  ['This is a text file,'], 
  ['to keep the words,'],
  ['                    2'],
  ['Another line of the text:'],
  ['                    3']
]

Desired output:

[
  ['This', 'is', 'a', 'text', 'file,'], 
  ['to', 'keep', 'the', 'words,'],
  ['Another', 'line', 'of', 'the', 'text:'],
]
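
One possible approach, offered as a minimal sketch rather than a definitive answer: the desired output keeps punctuation attached to words ('file,' and 'text:'), which is what a plain whitespace split produces, so the sketch below uses only built-in string methods. It assumes each inner list holds exactly one string, as in the sample above. The function name `tokenize_without_number_lines` is made up for illustration.

```python
sentences = [
    ['                    1'],
    ['This is a text file,'],
    ['to keep the words,'],
    ['                    2'],
    ['Another line of the text:'],
    ['                    3'],
]


def tokenize_without_number_lines(rows):
    """Split each sentence on whitespace, skipping rows that hold only digits."""
    result = []
    for row in rows:
        # Assumption: every row is a one-element list containing the sentence.
        text = row[0].strip()
        # Skip blank rows and rows made up entirely of digits.
        if not text or text.isdigit():
            continue
        # str.split() with no arguments splits on any run of whitespace.
        result.append(text.split())
    return result


tokens = tokenize_without_number_lines(sentences)
print(tokens)
```

If NLTK is preferred, `nltk.tokenize.WhitespaceTokenizer().tokenize(text)` could replace `text.split()` and should give the same tokens here. Note that `nltk.word_tokenize` separates punctuation into its own tokens, so 'file,' would become 'file' and ',', which differs from the desired output shown above.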

python nlp nltk

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
