'How to debug the error message from a yapf formatting task?

I recently inherited a code base and am in the process of using yapf (and indirectly lib2to3) to format the code base. When I run the formatter I get the following output:

Traceback (most recent call last):
  File "/module/venv/lib/python3.9/site-packages/yapf/yapflib/yapf_api.py", line 183, in FormatCode
    tree = pytree_utils.ParseCodeToTree(unformatted_source)
  File "/module/venv/lib/python3.9/site-packages/yapf/yapflib/pytree_utils.py", line 125, in ParseCodeToTree
    tree = parser_driver.parse_string(code, debug=False)
  File "/opt/homebrew/Cellar/[email protected]/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/lib2to3/pgen2/driver.py", line 103, in parse_string
    return self.parse_tokens(tokens, debug)
  File "/opt/homebrew/Cellar/[email protected]/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/lib2to3/pgen2/driver.py", line 71, in parse_tokens
    if p.addtoken(type, value, (prefix, start)):
  File "/opt/homebrew/Cellar/[email protected]/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/lib2to3/pgen2/parse.py", line 119, in addtoken
    ilabel = self.classify(type, value, context)
  File "/opt/homebrew/Cellar/[email protected]/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/lib2to3/pgen2/parse.py", line 175, in classify
    raise ParseError("bad token", type, value, context)
lib2to3.pgen2.parse.ParseError: bad token: type=58, value='ִ', context=('', (325, 19))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/module/venv/lib/python3.9/site-packages/yapf/__init__.py", line 225, in _FormatFile
    reformatted_code, encoding, has_change = yapf_api.FormatFile(
  File "/module/venv/lib/python3.9/site-packages/yapf/yapflib/yapf_api.py", line 96, in FormatFile
    reformatted_source, changed = FormatCode(
  File "/module/venv/lib/python3.9/site-packages/yapf/yapflib/yapf_api.py", line 186, in FormatCode
    raise errors.YapfError(errors.FormatErrorMsg(e))
  File "/module/venv/lib/python3.9/site-packages/yapf/yapflib/errors.py", line 37, in FormatErrorMsg
    return '{}:{}:{}: {}'.format(e.args[1][0], e.args[1][1], e.args[1][2], e.msg)
IndexError: tuple index out of range

It seems that the error arises from a unicode character from the lib2to3.pgen2.parse.ParseError: bad token: type=58, value='ִ', context=('', (325, 19)) error, though I am unsure how to 1) find the source of the error and 2) how to fix it if I do find it.

As of now, I don't believe that unicode character is in the codebase; though, the codebase is fairly large so it is entirely possible that I am just missing it somewhere.

Does anyone have any experience parsing these types of errors or can point me in the right direction?

I found a few github issues describing this type of error, but they remain Open with no direction on finding the source.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source