QR Code encoding (ISO 8859-1 vs "JIS8" vs UTF-8; ISO 18004:2000/2015 compatibility; encoding of backslash)

I used several online QR Code generators to encode "\ö/" (3 characters: U+005C, U+00F6, U+002F). I verified the resulting QR codes with the Android app "QR & Barcode Scanner" and with "https://zxing.org/w/decode.jspx", and I inspected the raw bytes reported by "https://zxing.org/w/decode.jspx". Below are the bit streams I found and the questions I have about each of them:

0100 00000100 01011100 11000011 10110110 00101111 ...
8bit length 4 0x5C     0xC3     0xB6     0x2F     zeros and padding
                       \ UTF-8 for "ö" /
  • Why does this work (decode as U+005C, U+00F6, U+002F)?
  • Is 0x5C mapped to the Yen symbol in ISO 18004:2000 (as in "JIS8")?
  • Would mapping 0x5C to the Yen symbol not be incompatible with ISO 18004:2015 (using ISO 8859-1, mapping 0x5C to the backslash)?
  • Why isn't 0xC3 interpreted with ISO 8859-1 (according to ISO 18004:2015) as "Ã" (U+00C3) and 0xB6 as "¶" (U+00B6)?
  • Why aren't they interpreted with "JIS8" (according to ISO 18004:2000) as the halfwidth katakana "ﾃ" (U+FF83) and "ｶ" (U+FF76)?
  • Why does ISO 18004:2015 claim that "Symbols complying with the requirements for QR Code Model 2, as defined in ISO/IEC 18004:2000, are readable with equipment complying with this International Standard" and "QR Code Model 2 symbols are fully compatible with QR Code reading systems"?
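
To make the competing interpretations concrete, here is a minimal Python sketch that decodes the four data bytes from the stream above in three ways. The helper decode_jis8 is hypothetical and written for this illustration only; it assumes "JIS8" means the JIS X 0201 8-bit table (0x5C as Yen sign, 0x7E as overline, 0xA1..0xDF as halfwidth katakana). It does not show what any particular reader actually does.

```python
# The four data bytes taken from the byte-mode segment above.
data = bytes([0x5C, 0xC3, 0xB6, 0x2F])


def decode_jis8(raw: bytes) -> str:
    """Hypothetical helper: decode bytes with the JIS X 0201 8-bit table.

    Assumption: "JIS8" in ISO 18004:2000 refers to this table.
    """
    out = []
    for b in raw:
        if b == 0x5C:
            out.append("\u00a5")          # Yen sign instead of backslash
        elif b == 0x7E:
            out.append("\u203e")          # overline instead of tilde
        elif b < 0x80:
            out.append(chr(b))            # remaining 7-bit range matches ASCII
        elif 0xA1 <= b <= 0xDF:
            out.append(chr(b + 0xFEC0))   # halfwidth katakana block U+FF61..U+FF9F
        else:
            out.append("\ufffd")          # byte is unassigned in JIS X 0201
    return "".join(out)


as_utf8 = data.decode("utf-8")         # U+005C U+00F6 U+002F
as_latin1 = data.decode("iso-8859-1")  # U+005C U+00C3 U+00B6 U+002F
as_jis8 = decode_jis8(data)            # U+00A5 U+FF83 U+FF76 U+002F

for label, text in (("UTF-8", as_utf8), ("ISO 8859-1", as_latin1), ("JIS8", as_jis8)):
    print(label, [f"U+{ord(ch):04X}" for ch in text])
```

Only the UTF-8 reading gives back the three characters that were entered, which matches what both readers reported even though this stream carries no ECI header.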
0111 00011010 0100 00000100 01011100 11000011 10110110 00101111 ...
ECI  26:UTF-8 8bit length 4 0x5C     0xC3     0xB6     0x2F     zeros and padding
  • Why does this work (decode as U+005C, U+00F6, U+002F)?
  • Why is the backslash (U+005C) not doubled?
  • Don't ISO 18004:2015 and ISO 18004:2000 explicitly say: "Where 5C_HEX appears as true data it shall be doubled in the data string before encoding in symbols to which the ECI protocol applies"?
  • What does this mean in ISO 18004:2015: "When a single occurrence of 5C_HEX is encountered in the input to the decoder, an ECI indicator is inserted followed by the ECI Designator. When a doubled 5C_HEX is encountered, it is encoded as two 5C_HEX"?
0111 00011010 0100 00000101 01011100 01011100 11000011 10110110 00101111 ...
ECI  26:UTF-8 8bit length 5 0x5C     0x5C     0xC3     0xB6     0x2F     zeros and padding
  • Why does this not work (decodes as U+005C, U+005C, U+00F6, U+002F)?
  • Shouldn't backslashes be doubled (see above)?
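
The bit streams above can be split into segments mechanically. The following Python sketch does that for the two ECI streams; parse_segments is a hypothetical helper, and it assumes an 8-bit character count (as used for byte mode in versions 1 to 9) and a single-byte ECI designator (leading bit 0). It only reports what is stored in the symbol and does not apply any backslash un-doubling.

```python
def parse_segments(bits: str):
    """Hypothetical helper: return (eci_number, payload_bytes) for a bit string.

    Assumptions: 8-bit character count indicator, single-byte ECI designator,
    and only the ECI (0111) and byte (0100) mode indicators are handled.
    """
    bits = bits.replace(" ", "")
    pos = 0
    eci = None
    payload = bytearray()
    while pos + 4 <= len(bits):
        mode = bits[pos:pos + 4]
        pos += 4
        if mode == "0111":                      # ECI mode indicator
            if bits[pos] != "0":
                raise ValueError("multi-byte ECI designator not handled here")
            eci = int(bits[pos:pos + 8], 2)
            pos += 8
        elif mode == "0100":                    # byte mode indicator
            count = int(bits[pos:pos + 8], 2)
            pos += 8
            for _ in range(count):
                payload.append(int(bits[pos:pos + 8], 2))
                pos += 8
        else:                                   # terminator or unsupported mode
            break
    return eci, bytes(payload)


single = "0111 00011010 0100 00000100 01011100 11000011 10110110 00101111 0000"
doubled = "0111 00011010 0100 00000101 01011100 01011100 11000011 10110110 00101111 0000"

eci_a, bytes_a = parse_segments(single)
eci_b, bytes_b = parse_segments(doubled)

print(eci_a, bytes_a.hex(" "), [f"U+{ord(c):04X}" for c in bytes_a.decode("utf-8")])
print(eci_b, bytes_b.hex(" "), [f"U+{ord(c):04X}" for c in bytes_b.decode("utf-8")])
```

Decoding the payloads as UTF-8 without further processing reproduces what the readers reported: one backslash for the first stream and two for the second.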

To me, the most important of the above questions is this: can a backslash be encoded in a way that conforms to the standard and allows reliable decoding, and if so, how?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
