XSS - Encoding - htmlspecialchars() and urlencode()/rawurlencode()

I plan to encode user-supplied data on output to protect against XSS attacks.

For strings output inside HTML, I plan to use:

htmlspecialchars($string, ENT_QUOTES, 'UTF-8');

For strings output inside JS embedded in HTML, I plan to use:

json_encode($string, JSON_HEX_QUOT|JSON_HEX_TAG|JSON_HEX_AMP|JSON_HEX_APOS);

I also have some fields that store complete URLs and email addresses supplied by users, which need to be displayed correctly in the frontend. (The URLs can point to various sources, for example a publication website or a newspaper article.)

My question is: do I have to treat these URL/email strings differently, and can htmlspecialchars() and json_encode() break these URLs in their respective output contexts (HTML, JS)?
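To make the concern concrete, here is a minimal sketch of the two output contexts applied to a URL. The URL value is a hypothetical example, not real data, and the surrounding markup is only an assumption about how the value would be embedded:

```php
<?php
// Hypothetical user-supplied URL, used only for illustration.
$url = 'https://example.com/article?id=1&lang=en';

// HTML attribute context: "&" is written as "&amp;" in the markup.
$html = '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">link</a>';

// JS context: "/" is written as "\/" (json_encode default) and "&" as "\u0026" (JSON_HEX_AMP).
$js = 'var url = '
    . json_encode($url, JSON_HEX_QUOT | JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS)
    . ';';

echo $html, "\n";
echo $js, "\n";
```

As far as I can tell, the browser decodes `&amp;` back to `&` when it parses the attribute, and the JS engine decodes `\/` and `\u0026` inside the string literal, so the URL value itself should arrive unchanged in both cases.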

I have seen urlencode()/rawurlencode() recommended for URLs, but they should only be applied to individual parts of a URL, such as its query parameters, not to the URL as a whole:

$url = "https://example.com?data=".urlencode($parameter);
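For reference, the two functions differ mainly in how they encode spaces. A small sketch, assuming a hypothetical parameter value:

```php
<?php
// Hypothetical parameter value, for illustration only.
$parameter = 'a b&c=d';

// urlencode() encodes a space as "+" (form-encoding style).
$formEncoded = urlencode($parameter);

// rawurlencode() encodes a space as "%20" (RFC 3986 style).
$rawEncoded = rawurlencode($parameter);

echo 'https://example.com/?data=' . $formEncoded, "\n"; // ...?data=a+b%26c%3Dd
echo 'https://example.com/?data=' . $rawEncoded, "\n";  // ...?data=a%20b%26c%3Dd
```

Applying either function to a complete URL would also encode the `:` and `/` characters, which is presumably why they are only recommended for individual components.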

So in my case, would I have to split each URL into its base and its parameters, and apply urlencode()/rawurlencode() only to the parameters? And what escaping would I need to apply to the base to be safe against XSS?
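One alternative I am considering, instead of splitting the URL, is to check its scheme against an allowlist and then escape the whole string for HTML. This is only a rough sketch: `safeHref()` is a hypothetical helper name, and the assumption that only `http` and `https` links are acceptable is mine:

```php
<?php
// Hypothetical helper, not a built-in: returns an HTML-escaped URL
// suitable for an href attribute, or null if the scheme is not allowed.
function safeHref(string $url): ?string
{
    // parse_url() yields null when there is no scheme and false for malformed URLs.
    $scheme = parse_url($url, PHP_URL_SCHEME);

    // Assumption: only http and https links are acceptable.
    if (!is_string($scheme) || !in_array(strtolower($scheme), ['http', 'https'], true)) {
        return null;
    }

    // Escape for the HTML attribute context; the URL itself is left as stored.
    return htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
}

var_dump(safeHref('https://example.com/?a=1&b=2'));
var_dump(safeHref('javascript:alert(1)'));
```

The idea is that htmlspecialchars() alone would not stop a `javascript:` URL from being placed into an href, so the scheme check would have to come first. Email addresses would need a separate rule, which this sketch does not cover.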



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
