How does GitHub encode its GraphQL cursors?

GitHub's GraphQL cursors are intentionally opaque, so a client should never need to decode them. However, I'd like to know their approach to pagination, especially when combined with sorting.



Solution 1:[1]

GitHub's pagination cursors are wrapped in several layers of encoding. I will list them in the order a decoder would encounter them:

  1. The cursor string is encoded with URL-safe base64, meaning it uses - and _ instead of + and /. This might be for consistency with their REST API.
  2. Decoding the base64 string yields another string of the form cursor:v2:[something], so the next step is decoding the "something".
  3. The "something" is a binary-encoded piece of data containing the actual cursor properties. The first byte defines the cursor type:
    • 0x91 => No sorting is used. The cursor contains a marker for the size of the id, followed by the id itself: 0xcd seems to indicate a two-byte id, 0xce a four-byte id. The id can be verified by decoding the base64 id field returned by GraphQL.
    • 0x92 => A composite cursor containing the sorted property and the id. The sorted property is either a length-prefixed ordinal number, or two bytes followed by a string or ISO date string. It is followed by the length-prefixed id.
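
The layers above can be sketched in Python for the unsorted (0x91) case. This is only a sketch under stated assumptions, not GitHub's actual implementation: the function names are hypothetical, and the big-endian byte order of the id is an assumption. The marker bytes happen to match MessagePack's format markers (0x91 and 0x92 for arrays of one and two elements, 0xcd and 0xce for 16-bit and 32-bit unsigned integers), which would imply big-endian integers, but nothing above confirms that MessagePack is what GitHub uses.

```python
import base64
import struct

PREFIX = b"cursor:v2:"


def decode_layers(cursor: str) -> bytes:
    """Undo the URL-safe base64 layer and strip the 'cursor:v2:' prefix."""
    # Cursors may arrive without '=' padding, so restore it before decoding.
    padded = cursor + "=" * (-len(cursor) % 4)
    raw = base64.urlsafe_b64decode(padded)
    if not raw.startswith(PREFIX):
        raise ValueError("unexpected cursor prefix")
    return raw[len(PREFIX):]


def read_id(payload: bytes, offset: int) -> int:
    """Read a size-marked id; big-endian byte order is an assumption."""
    marker = payload[offset]
    if marker == 0xCD:  # two-byte id
        return struct.unpack_from(">H", payload, offset + 1)[0]
    if marker == 0xCE:  # four-byte id
        return struct.unpack_from(">I", payload, offset + 1)[0]
    raise ValueError(f"unhandled id marker: {marker:#x}")


def decode_unsorted_cursor(cursor: str) -> int:
    """Return the id stored in a cursor whose first payload byte is 0x91."""
    payload = decode_layers(cursor)
    if not payload or payload[0] != 0x91:
        raise ValueError("not an unsorted (0x91) cursor")
    return read_id(payload, 1)
```

Since the cursors are meant to be opaque, the format may change without notice, so a decoder like this is useful for exploration only and should not be relied on in client code.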

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Mormund