Wrong sequence of bytes after creating a new code object

I'm trying to use the PyCode_New() function of the Python 3.10 C API.

I have a problem with the last argument of PyCode_New(). According to the documentation (https://docs.python.org/3/c-api/code.html#c.PyCode_New), it should hold the value of the co_lnotab field of the new code object. This field stores the mapping between bytecode offsets and source line offsets in the code object, and it should contain a sequence of bytes (an instance of the Python bytes type). The format of this field is described here: https://github.com/python/cpython/blob/3.10/Objects/lnotab_notes.txt.
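
To make the pair layout concrete, here is a minimal sketch of how I read such a table. It assumes the classic co_lnotab layout of (bytecode offset increment, line increment) byte pairs, with the line increment treated as a signed byte. The helper name decode_lnotab and its first_line default are hypothetical and used for illustration only.

```python
def decode_lnotab(lnotab: bytes, first_line: int = 1):
    """Return (bytecode_offset, line_number) for each pair in the table.

    Assumption: classic co_lnotab layout, i.e. consecutive pairs of
    (offset increment, signed line increment).
    """
    entries = []
    addr, line = 0, first_line
    for addr_incr, line_incr in zip(lnotab[0::2], lnotab[1::2]):
        if line_incr >= 0x80:      # values 0x80..0xFF encode negative steps
            line_incr -= 0x100
        addr += addr_incr
        line += line_incr
        entries.append((addr, line))
    return entries


print(decode_lnotab(b"\x00\x01\x06\x01"))  # [(0, 2), (6, 3)]
```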

According to the CPython source code, PyCode_New() simply calls PyCode_NewWithPosOnlyArgs() with the same arguments, and that function assigns this argument, unchanged, to the low-level co_lnotab field of the new code object:

  1. https://github.com/python/cpython/blob/0b5f99ad1af0892a2d6043abd8aeb10d685d3844/Objects/codeobject.c#L275
  2. https://github.com/python/cpython/blob/0b5f99ad1af0892a2d6043abd8aeb10d685d3844/Objects/codeobject.c#L117
  3. https://github.com/python/cpython/blob/0b5f99ad1af0892a2d6043abd8aeb10d685d3844/Objects/codeobject.c#L262

So, according to my current understanding, the value of the co_lnotab field of a newly created code object must be the same as the value of the last argument of PyCode_New(), but IT IS NOT THE SAME.

    Examples:
    Note: The PyBytes_FromString() function of the Python C API can be used
    to create a new Python bytes instance from a C string:
    https://docs.python.org/3/c-api/bytes.html#c.PyBytes_FromString

    Example 1:
    Value of the argument:                 PyBytes_FromString("\x06\x01")
    Value of co_lnotab field of a newly created code object: b'\x00\x01'

    Example 2:
    Value of the argument:                 PyBytes_FromString("\x06\x01\x2C\x01")
    Value of co_lnotab field of a newly created code object: b'\x00\x01\x06\x01'

    Example 3:
    Value of the argument:                 PyBytes_FromString("\x06\x01\x2C\x01\x0F\x02")
    Value of co_lnotab field of a newly created code object: b'\x00\x01\x06\x01,\x02' (code of "," is equal to 0x2C)

    Example 4:
    Value of the argument:                 PyBytes_FromString("\x00\x01\x2C\x01")
    Value of co_lnotab field of a newly created code object: b''

    Example 5:
    Value of the argument:                 PyBytes_FromString("\x12\x01\x08\x01\x18\x01\x18\x01\x18\x01\x18\x01\x18\x01\x0e\x01\x08\x01\x04\x01\x08\x01\x08\x01\n\x01\x08\x01\n\x01\x0c\x01")
    Value of co_lnotab field of a newly created code object: b'\x00\x01\x12\x01\x08\x01\x18\x01\x18\x01\x18\x01\x18\x01\x18\x01\x0e\x01\x08\x01\x04\x01\x08\x01\x08\x01\n\x01\x08\x01\n\x01'
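
Example 4 may need to be treated separately from the others. PyBytes_FromString() takes a NUL-terminated C string, so an argument whose first byte is 0x00 is seen as an empty string before PyCode_New() is ever called. The sketch below checks this from Python through ctypes. It assumes CPython (ctypes.pythonapi is CPython-specific), and the variable names are only illustrative.

```python
import ctypes

# PyBytes_FromString(const char *): copies up to the first NUL byte.
from_string = ctypes.pythonapi.PyBytes_FromString
from_string.argtypes = [ctypes.c_char_p]
from_string.restype = ctypes.py_object

# PyBytes_FromStringAndSize(const char *, Py_ssize_t): copies exactly
# the requested number of bytes, embedded NULs included.
from_string_and_size = ctypes.pythonapi.PyBytes_FromStringAndSize
from_string_and_size.argtypes = [ctypes.c_char_p, ctypes.c_ssize_t]
from_string_and_size.restype = ctypes.py_object

raw = b"\x00\x01\x2c\x01"          # the argument from Example 4

truncated = from_string(raw)       # stops at the leading 0x00
full = from_string_and_size(raw, len(raw))

print(truncated)
print(full)
```

If this is what happens, any table that contains a 0x00 byte would have to be built with PyBytes_FromStringAndSize() and an explicit length instead.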

The following regularities seem to hold:

  1. A 0x00 byte is prepended automatically when the argument does not already start with one
  2. The two bytes of every pair appear in reverse order
  3. The last byte is dropped when the 0x00 byte is prepended
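
The three regularities can be written down as a small function and checked against Examples 1, 2, 3 and 5. This is only a model of what I observe, not a claim about how CPython computes the field. The name apply_observed_rules is hypothetical, and inputs that start with 0x00 (Example 4) are deliberately left out.

```python
def apply_observed_rules(arg: bytes) -> bytes:
    """Model of the observed regularities; not CPython's actual algorithm."""
    if len(arg) % 2 != 0:
        raise ValueError("expected an even number of bytes (pairs)")
    if arg[:1] == b"\x00":
        raise ValueError("inputs starting with 0x00 are not covered by this model")
    # Regularity 2: swap the two bytes of every pair.
    swapped = bytes(
        b for i in range(0, len(arg), 2) for b in (arg[i + 1], arg[i])
    )
    # Regularities 1 and 3: prepend 0x00, then drop the last byte.
    return (b"\x00" + swapped)[:-1]


observed = {
    b"\x06\x01": b"\x00\x01",                          # Example 1
    b"\x06\x01\x2c\x01": b"\x00\x01\x06\x01",          # Example 2
    b"\x06\x01\x2c\x01\x0f\x02": b"\x00\x01\x06\x01,\x02",  # Example 3
}
for arg, lnotab in observed.items():
    print(apply_observed_rules(arg) == lnotab)
```

Another way to read the same result: the line increments stay in place, while every offset byte moves one pair to the right and the first offset becomes 0x00.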

To conclude, I do not understand the reason for this byte order, and I cannot find any code in the CPython source, or any note in the documentation, that changes the order of the bytes.

Could somebody explain why the byte order changes like this?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
