preg_replace: Replace word(s) in the text between HTML tags

Suppose we have this HTML content:

<title>remove it, but not this</title>
<title>remove the title</title>

Calling remover($search, $replace, $subject) should return the string with the replacement applied. Examples:

remover('remove it', 'new str', '<title>remove it, but not this</title>') // <title>new str, but not this</title>

remover('title', 'name', '<title>remove the title</title>') // <title>remove the name</title>

The main point: the HTML content contains tags, possibly nested. We need to replace the word(s) in the text of that content without touching the HTML tags themselves.

I found a regex, but it replaces EVERYTHING between the HTML tags:

preg_replace("/(?<=>)[^><]+?(?=<)/", $newWord, $body);
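To illustrate the problem, here is a minimal reproduction. The values of $body and $newWord are assumed for the example and are not taken from real code:

```php
<?php
// Assumed sample values, chosen to match the first example above.
$body    = '<title>remove it, but not this</title>';
$newWord = 'new str';

// The pattern matches the whole text run between ">" and "<",
// so the entire run is replaced, not just the searched words.
$result = preg_replace("/(?<=>)[^><]+?(?=<)/", $newWord, $body);

echo $result, "\n"; // <title>new str</title>
```

The ", but not this" part is lost because the pattern never looks at the search term at all.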

So how can I modify that code so that it searches for the match only in the text between HTML elements and replaces it with the given value?



Solution 1:[1]

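Here is a proposal that matches from the <title> tag up to the first character outside the accepted set [a-zA-Z ]: the comma in the first case, and the < of the closing tag in the second. The match is replaced with a bare <title>, so the leading words are removed rather than substituted.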

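<?php
// Match "<title>" plus the run of letters and spaces that follows it,
// and replace the whole match with a bare "<title>".
$pattern = "/<title>[a-zA-Z ]*/";

print(preg_replace($pattern, "<title>", "<title>remove it, but not this</title>"));
print("\n");
print(preg_replace($pattern, "<title>", "<title>remove the title</title>"));
?>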

The output is

<title>, but not this</title>
<title></title>
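If the goal is the remover($search, $replace, $subject) behaviour described in the question, one possible sketch keeps the question's lookaround pattern but passes each matched text run to a callback, so only the search term inside that run is replaced. The function body below is an assumption about what remover() should do, not code from the original post. It treats $search as a literal, case-sensitive string.

```php
<?php
// Hypothetical implementation of the remover() helper named in the question.
function remover(string $search, string $replace, string $subject): string
{
    $result = preg_replace_callback(
        // Same pattern as in the question: a text run between ">" and "<".
        '/(?<=>)[^><]+?(?=<)/',
        function (array $match) use ($search, $replace): string {
            // Replace only the search term inside the matched text run.
            return str_replace($search, $replace, $match[0]);
        },
        $subject
    );

    // preg_replace_callback() returns null on a regex error;
    // fall back to the unmodified input in that case.
    return $result ?? $subject;
}

echo remover('remove it', 'new str', '<title>remove it, but not this</title>'), "\n";
// <title>new str, but not this</title>
echo remover('title', 'name', '<title>remove the title</title>'), "\n";
// <title>remove the name</title>
```

Tag names and attributes are left alone because text inside a tag is never directly preceded by ">". The sketch still relies on a regex to find tag boundaries, so it can misfire on input such as a literal ">" inside an attribute value, comments, or script blocks. For arbitrary HTML, walking the text nodes with a parser such as DOMDocument is likely the safer route.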

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1