Running a number of consecutive replacements on the same string

I found this example for substring replacement:

let string = "orange";
// `replace` is a method on `str`; it returns a newly allocated `String`.
let new_string = string.replace("or", "str"); // "strange"

If I want to run a number of consecutive replacements on the same string, for sanitization purposes, how can I do that without allocating a new variable for each replacement?

If you were to write idiomatic Rust, how would you write multiple chained substring replacements?



Solution 1:[1]

The regex engine can be used to do a single pass with multiple replacements of the string, though I would be surprised if this is actually more performant:

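// Requires the `regex` crate as a dependency.
use regex::{Captures, Regex};

fn main() {
    // One alternation covers every substring we want to replace.
    let re = Regex::new("(or|e)").unwrap();
    let string = "orange";
    // The closure picks the replacement based on the text that matched.
    let result = re.replace_all(string, |cap: &Captures| {
        match &cap[0] {
            "or" => "str",
            "e" => "er",
            _ => unreachable!("the regex only matches the patterns above"),
        }.to_string()
    });
    println!("{}", result); // prints "stranger"
}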

Solution 2:[2]

There is no way in the standard library to do this; it’s a tricky thing to get right with a large number of variations on how you would go about doing it, depending on a number of factors. You would need to write such a function yourself.
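As a rough illustration of what "writing such a function yourself" could look like, here is a minimal sketch using only the standard library. The name `replace_many` and its tie-breaking rule (the first pair in the slice that matches at the current position wins) are assumptions made for this example, not something the answer specifies; other policies, such as longest match first, are equally defensible.

```rust
/// Replaces every occurrence of any pattern in `pairs` in a single
/// left-to-right pass. Hypothetical helper, written for illustration only.
/// At each position, the first pair (in slice order) whose pattern matches wins.
/// Replacement text is appended to the output and never rescanned.
fn replace_many(input: &str, pairs: &[(&str, &str)]) -> String {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    'scan: while !rest.is_empty() {
        for &(pattern, replacement) in pairs {
            // Skip empty patterns: they would match everywhere and never advance.
            if !pattern.is_empty() && rest.starts_with(pattern) {
                out.push_str(replacement);
                rest = &rest[pattern.len()..];
                continue 'scan;
            }
        }
        // No pattern matched here: copy one character and move on.
        let ch = rest.chars().next().unwrap();
        out.push(ch);
        rest = &rest[ch.len_utf8()..];
    }
    out
}
```

With this sketch, `replace_many("orange", &[("or", "str"), ("e", "er")])` yields `"stranger"` with one output allocation. It checks every pattern at every position, so it is only reasonable for a handful of short patterns.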

Solution 3:[3]

I would not use regex or .replace().replace().replace() or .maybe_replace().maybe_replace().maybe_replace() for this. They all have big flaws.

  • Regex is probably the most reasonable option, but regexes are a terrible idea if you can at all avoid them. If your patterns come from user input, you will have to deal with escaping them, which is a security nightmare.
  • .replace().replace().replace() is terrible for obvious reasons.
  • .maybe_replace().maybe_replace().maybe_replace() is only very slightly better than that, because it only improves efficiency when a pattern doesn't match. It doesn't avoid the repeated allocations if they all match, and in that case it is actually worse because it searches the strings twice.
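To make the cost of the chained approach concrete, here is a small sketch using only the standard library. `replace_chained` is a hypothetical helper name chosen for this example. It does the same work as writing `.replace().replace()` by hand: each step scans the whole string and allocates a fresh `String`, and later replacements also see the output of earlier ones.

```rust
/// Applies each (from, to) pair in order with `str::replace`.
/// Hypothetical helper for illustration: one full scan and one new
/// `String` allocation per pair.
fn replace_chained(input: &str, pairs: &[(&str, &str)]) -> String {
    pairs
        .iter()
        .fold(input.to_string(), |acc, &(from, to)| acc.replace(from, to))
}
```

For example, `replace_chained("ab", &[("a", "b"), ("b", "c")])` returns `"cc"` rather than `"bc"`, because the `b` produced by the first step is rewritten by the second. A single-pass approach does not have this order dependence.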

There's a much better solution: use the aho-corasick crate. There's even an example in the readme:

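// Requires the `aho-corasick` crate as a dependency.
use aho_corasick::AhoCorasick;

let patterns = &["fox", "brown", "quick"];
let haystack = "The quick brown fox.";
// replace_with[i] is the replacement for patterns[i].
let replace_with = &["sloth", "grey", "slow"];

// In aho-corasick 1.x, `new` returns a `Result`, hence the `unwrap`;
// in 0.7 it returned the automaton directly.
let ac = AhoCorasick::new(patterns).unwrap();
let result = ac.replace_all(haystack, replace_with);
assert_eq!(result, "The slow grey sloth.");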

Regarding "for sanitization purposes":

I should also say that blacklisting "bad" strings is completely the wrong way to do sanitisation.
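The answer does not say what to do instead. One commonly suggested alternative is an allowlist: keep only what is known to be safe rather than trying to strip what is known to be bad. The sketch below is an assumption-laden illustration, not a recommendation for any particular context; the function name `keep_allowed` and the chosen character set (ASCII letters, digits, space, `-`, `_`) are made up for this example, and the right set depends entirely on where the string ends up.

```rust
/// Keeps only characters from a fixed allowlist and drops everything else.
/// Hypothetical helper; the allowed set here is an arbitrary example.
fn keep_allowed(input: &str) -> String {
    input
        .chars()
        .filter(|&c| c.is_ascii_alphanumeric() || matches!(c, ' ' | '-' | '_'))
        .collect()
}
```

For instance, `keep_allowed("user_name-1; DROP")` returns `"user_name-1 DROP"`. For output that is later interpreted (HTML, SQL, shell), context-specific escaping or parameterised APIs are usually a better fit than any string filtering.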

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Shepmaster
Solution 2 Chris Morgan
Solution 3 Timmmm