spaCy: Creating empty or error Token instance?

I'm collecting some tokens in a dict for later use. The problem is that I need one token to play the role of None/NIL when I don't find what I'm looking for in the doc. It should act as the no-value case but still have all the attributes (its string value could be, say, some special character). In other words, it should behave like a Token without being a token from the doc.

Is there a way to create such a Token? Or maybe copy an existing one and modify .dep_, .pos_, etc.?



Solution 1:[1]

One approach would be to create a doc containing just one character (e.g. a space or * or any other character of your choice), i.e. nlp('<special character>'), and take the token at index 0. For example, if your special character is a #, this would look like:

# Assumes nlp is an already-loaded spaCy pipeline (e.g. the result of spacy.load)
empty_token = nlp('#')[0]

empty_token is a normal Token, so it has all the usual attributes. With a pipeline that includes a tagger and a parser, running the code below produces output like the following (the exact tags depend on the model):

print('text:', empty_token.text)
print('pos:', empty_token.pos_)
print('dep:', empty_token.dep_)

Output:

text: #
pos: SYM
dep: ROOT
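To connect this back to the question (a dict of tokens with a None-like fallback, possibly with its own .pos_ and .dep_), here is a minimal sketch rather than a definitive recipe. It assumes a blank English pipeline from spacy.blank("en"), which needs no model download but leaves .pos_ and .dep_ empty until they are set by hand. The names NIL and find_first and the choice of "X" and "dep" as placeholder labels are hypothetical, picked only for illustration.

```python
import spacy

# Assumption: a blank English pipeline is enough for a placeholder token.
# It only tokenizes, so pos_ and dep_ stay empty unless set explicitly.
nlp = spacy.blank("en")

# Keep one module-level sentinel so it can be recognised with `is`.
NIL = nlp("#")[0]
NIL.pos_ = "X"    # "X" is the Universal POS tag for "other"
NIL.dep_ = "dep"  # hypothetical choice of a generic dependency label


def find_first(doc, text):
    """Return the first token whose text matches, or the NIL sentinel."""
    for token in doc:
        if token.text == text:
            return token
    return NIL


doc = nlp("The cat sat on the mat")
found = {word: find_first(doc, word) for word in ("cat", "dog")}

for word, token in found.items():
    status = "missing" if token is NIL else "found"
    print(word, status, repr(token.text), repr(token.pos_), repr(token.dep_))
```

Checking with `is` against the stored sentinel sidesteps any question of how Token equality is defined across different docs. The sentinel still belongs to its own one-token doc, so attributes that depend on context (such as .head or .sent) refer to that tiny doc, not to the text being analysed.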

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution     Source
Solution 1   Joel Oduro-Afriyie