'Is there a way to decrease space utilization and potentially improve time performance for string manipulation in my C++ code?
I'm working on a string manipulation code in C++ for my assignment, where I have to find a substring within a string and essentially replace it with a newly computed sequence based on other inputs (here, the string vector 'patterns'). I have been trying to find a better way to manipulate the string in order to increase efficiency (reduce space cost and possibly also reduce time complexity further). This is a sample of what I've done so far(Sorry if the strings look a bit weird, this was the best simple example I could come up with at the moment):
// some input variables
std::string& originalString ("__ BB ++ AA __ AA __ CC __ DD");
std::map<std::string, std::vector<std::string>> patternMap = {
{"AA", {"aaa","AAA"}},
{"BB", {"bbb","BBB"}},
};
std::size_t location = 0;
while ((location = originalString.find("__", location)) != std::string::npos) {
const std::string subString = originalString.substr(location+4, location+6);
std::map<std::string, std::vector<std::string>>::iterator patternsIterator = patternMap.find(subString);
if (patternsIterator != patternMap.end()) {
const std::vector<std::string> patterns = patternsIterator -> second;
std::string alternateSubString("*");
for (const std::string& item : patterns) {
alternateSubString.append(item + "__" );
}
alternateSubString.replace(alternateSubString.size() - 5, 5,"*");
originalString.replace(location+4, 2, alternateSubString);
location += alternateSubString.size();
}
++location;
}// Expected value of the original string should change to: "__ *bbb__BBB* ++ AA __ *aaa__AAA* __ CC __ DD"
'''
Any tips regarding a better way, maybe do away with the variable alternateSubString? Trying to achieve some space/time optimization here.
Solution 1:[1]
Your algorithm seems to be:
Scan the string until the next occurrence of
__is found.When found, the next word (e.g.
"BB") is replaced by building a new string using the vector of words inpatternMapThis vector of words involves concatenating each word found in
patternMapwith__and the entire replacement string that is built gets prefixed and postfixed with*. (e.g.{"bbb","BBB"}=>"*bbb__BBB*")
Feedback
Your implementation has a couple of instances of "backtracking" using the replace method to cover up a mistake made during the initial substitution loop.
It also has a few hardcoded length values that assumes spacing will be consistent between tokens. All of this makes it slightly hard to read.
unordered_map will be faster than map. Since you don't need enumeration order, you can use unordered_map. This is a micro-optimization if the size of your map is really small.
I propose reading the initial string into tokens and keeping track if the next token needs to be replaced with the words in the patternMap.
Consider this:
#include <algorithm>
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>
using namespace std;
typedef unordered_map<string, vector<string>> PatternMap;
// getNext returns a pair of strings:
// The leading whitespace preceeding a word and the word itself
pair<string, string> getNext(const string& s, size_t& location);
// makeMagicWord converts a vector of words
// into the associated magic replacement pattern
// e.g. {"AA", "BB", CC"} => "*AA__BB__CC*"
string makeMagicWord(const vector<string>& vec);
string ApplyPatterns(const std::string& original, const PatternMap& patterns)
{
string newString;
size_t location = 0;
bool replacing = false;
while (true) {
auto p = getNext(original, location);
string whitespace = p.first;
string token = p.second;
newString += whitespace;
if (token.empty()) {
break;
}
if (replacing) {
auto itor = patterns.find(token);
if (itor != patterns.end()) {
token = makeMagicWord(itor->second); // itor->second = patterns[token]
}
replacing = false;
}
else if (token == "__") {
replacing = true;
}
newString += token;
}
return newString;
}
The above code reads the whitespace/word tokens with a helper function, applies substitution as needed, and uses another helper function for making the replacement word. The implementation of those two helper functions is below:
Then the implementation of our two helpers is below:
pair<string, string> getNext(const string& s, size_t& location) {
string whitespace, token;
// consume leading whitespace
while (location < s.size() && ::isspace(s[location])) {
whitespace += s[location];
location++;
}
// consume the next token
while (location < s.size() && !::isspace(s[location])) {
token += s[location];
location++;
}
return { whitespace, token };
}
string makeMagicWord(const vector<string>& vec) {
std::string s = "*";
for (size_t i = 0; i < vec.size(); i++) {
s += ((i > 0) ? "__" : "");
s += vec[i];
}
s += "*";
return s;
}
Let's test it all out:
int main() {
string originalString("__ BB ++ AA __ AA __ CC __ DD");
unordered_map<std::string, std::vector<std::string>> patternMap = {
{"AA", {"aaa","AAA"}},
{"BB", {"bbb","BBB"}}};
string replacement = ApplyPatterns(originalString, patternMap);
cout << originalString << endl;
cout << replacement << endl;
return 0;
}
And when run, the above prints out:
__ BB ++ AA __ AA __ CC __ DD
__ *bbb__BBB* ++ AA __ *aaa__AAA* __ CC __ DD
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
