How to translate scanf exact matching into modern C++ stringstream reading

I am currently working on a project and I'd like to use modern C++ instead of relying on old C for reading files. For context, I'm trying to read Wavefront OBJ files.

I have this old code snippet :

const char *line;
float x, y, z;
if(sscanf(line, "vn %f %f %f", &x, &y, &z) != 3)
    break; // quitting loop because couldn't scan line correctly

which I've translated into :

string line;
string word;
stringstream ss(line);
float x, y, z;
if (!(ss >> word >> x >> y >> z)) // "vn x y z"
    break; // quitting loop because couldn't scan line correctly

The difference is that I use a string to skip the first word, but I'd like it to match "vn" exactly, the way sscanf does. Is this possible with stringstream, or should I keep relying on sscanf for exact pattern matching?

Also I'm trying to translate

sscanf(line, " %d/%d/%d", &i1, &i2, &i3);

but I'm having a hard time, which again pushes me towards not using modern C++ for my file reader.



Solution 1:[1]

I'd argue that stringstream is not "modern C++"¹. (I'll also admit we don't have a good replacement for scanf; we do have std::format (C++20), which is very nice: it replaces std::cout << shenanigans with a syntax reminiscent of Python's format strings, and is much faster. Sadly, it hasn't hit standard libraries yet.)

I'll actually say that I think your sscanf code is cleaner and saner than the stringstream code: sscanf leaves things in an easier-to-understand state when parsing the line fails.

There are libraries you can use to build a line parser; Boost.Spirit Qi (boost::spirit::qi) is probably the most well-known. You can do things like

auto success = (ss >> qi::phrase_match("vn " >> qi::double_ >> ' ' >> qi::double_ >> ' ' >> qi::double_, x, y, z));

If this option does not fill you with joy, love for the world and faith in all-reaching beauty, you're exactly like me and find it a poor replacement for the compactness and ease of understanding the original scanf offers.

There's the std::regex library these days (since C++11), and that might very well be what you're looking for! However, you'd need to write expressions for "here comes a floating point number" yourself, and that is not cool at all, either.
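To make the std::regex option concrete, here is a minimal sketch of matching a "vn x y z" line. The helper name parse_vn_regex and the floating-point pattern are my own assumptions for illustration; the pattern is deliberately simplified and does not accept everything %f does (no inf, nan, or hex floats).

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of the original answer): matches
// "vn <float> <float> <float>" and stores the three values.
bool parse_vn_regex(const std::string& line, float& x, float& y, float& z)
{
    // Simplified float pattern: optional sign, digits with an optional
    // fraction (or a bare fraction), optional exponent.
    static const std::string num =
        R"(([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?))";
    static const std::regex pattern(
        R"(\s*vn\s+)" + num + R"(\s+)" + num + R"(\s+)" + num + R"(\s*)");

    std::smatch m;
    if (!std::regex_match(line, m, pattern))
        return false; // the whole line must match, including the "vn" tag

    // std::stof can still throw std::out_of_range for values a float cannot hold.
    x = std::stof(m[1].str());
    y = std::stof(m[2].str());
    z = std::stof(m[3].str());
    return true;
}
```

Calling parse_vn_regex(line, x, y, z) inside the read loop would play the role of the sscanf(...) != 3 check, at the cost of maintaining the number pattern by hand.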


¹ Without wanting to hurt anyone, I think the iostreams library approach was the one single standard library feature that made C++ more awkward to use than most contemporary languages for anything IO-related. And that has a lasting effect on code quality.

Solution 2:[2]

For anyone interested, here is how I parsed my OBJ file with stringstream style instead of sscanf style.

Old code:

for(;;)
{
    if(fgets(line_buffer, sizeof(line_buffer), in) == NULL)
    {
        error= false; // eof
        break;
    }
        
    // force null termination
    line_buffer[sizeof(line_buffer) -1]= 0;
        
    // skip spaces
    char *line= line_buffer;
    while(*line && isspace(*line))
        line++;
        
    if(line[0] == 'v')
    {
        float x, y, z;
        if(line[1] == ' ')          // position x y z
        {
            if(sscanf(line, "v %f %f %f", &x, &y, &z) != 3)
                break;
            positions.push_back( vec3(x, y, z) );
        }
        else if(line[1] == 'n')     // normal x y z
        {
            if(sscanf(line, "vn %f %f %f", &x, &y, &z) != 3)
                break;
            normals.push_back( vec3(x, y, z) );
        }
        else if(line[1] == 't')     // texcoord x y
        {
            if(sscanf(line, "vt %f %f", &x, &y) != 2)
                break;
            texcoords.push_back( vec2(x, y) );
        }
    }
        
    else if(line[0] == 'f')
    {
        idp.clear();
        idt.clear();
        idn.clear();
            
        int next;
        for(line= line +1; ; line= line + next)
        {
            idp.push_back(0); 
            idt.push_back(0); 
            idn.push_back(0);         // 0: invalid index
                
            next= 0;
            if(sscanf(line, " %d/%d/%d %n", &idp.back(), &idt.back(), &idn.back(), &next) == 3) 
                continue;
            else if(sscanf(line, " %d/%d %n", &idp.back(), &idt.back(), &next) == 2)
                continue;
            else if(sscanf(line, " %d//%d %n", &idp.back(), &idn.back(), &next) == 2)
                continue;
            else if(sscanf(line, " %d %n", &idp.back(), &next) == 1)
                continue;
            else if(next == 0)      //endl
                break;
        }
    }
}

New code: (with the @JerryCoffin code for std::istream& operator>>(std::istream& is, char const* s);)

std::string line;
while (getline(file, line))
{
    std::stringstream ss(line);
    std::string tag;

    // istream& operator>> skips whitespace unless std::skipws is disabled
    // ignore empty lines
    if (!(ss >> tag))
        continue;

    // ignore comments
    if (tag[0] == '#')
        continue;

    float x, y, z;
    if (tag == "v")
    {
        if (!(ss >> x >> y >> z))
            break;
        positions_tmp.emplace_back(x, y, z);
    }
    else if (tag == "vt")
    {
        if (!(ss >> x >> y))
            break;
        texcoords_tmp.emplace_back(x, y);
    }
    else if (tag == "vn")
    {
        if (!(ss >> x >> y >> z))
            break;
        normals_tmp.emplace_back(x, y, z);
    }
    else if (tag == "f")
    {
        for (int i = 0; i < 3; ++i)
        {
            unsigned int idp(0), idt(0), idn(0);

            // save position and state so each attempt can restart from the same point after a failed read
            auto pos = ss.tellg();
            auto state = ss.rdstate() & ~std::ios_base::failbit; // bitwise AND: keep all flags except failbit
            auto reset = [state, pos](std::stringstream& ss)
            { ss.clear(state); ss.seekg(pos); return !ss.fail(); };

            // will try to match either i//i or i/i/i or i/i or i (all .obj "f" configurations)
            if (reset(ss) && ss >> idp >> "//" >> idn) {}
            else if (reset(ss) && ss >> idp >> "/" >> idt >> "/" >> idn) {}
            else if (reset(ss) && ss >> idp >> "/" >> idt) {}
            else if (reset(ss) && ss >> idp) {}

            // do stuff with indexes
            // ...
        }
    }
}
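The expressions such as ss >> idp >> "//" >> idn above only compile with an operator>> overload that takes a string literal, which is not shown here. The following is a minimal sketch of what such an overload could look like; it is my own assumption, not the original @JerryCoffin code, and parse_vn is a hypothetical helper added for illustration. This version does not skip leading whitespace, so "1 //3" does not match "//".

```cpp
#include <istream>
#include <sstream>
#include <string>

// Sketch of a literal-matching extractor (an assumption, not the original
// code): consumes exactly the characters of `s` and sets failbit at the
// first mismatch. It does not skip leading whitespace.
std::istream& operator>>(std::istream& is, char const* s)
{
    for (; *s != '\0'; ++s)
    {
        if (is.peek() != std::char_traits<char>::to_int_type(*s))
        {
            is.setstate(std::ios_base::failbit);
            break;
        }
        is.get(); // consume the matched character
    }
    return is;
}

// Hypothetical usage: the stringstream counterpart of
// sscanf(line, "vn %f %f %f", &x, &y, &z) == 3
bool parse_vn(const std::string& line, float& x, float& y, float& z)
{
    std::istringstream ss(line);
    return static_cast<bool>(ss >> std::ws >> "vn" >> x >> y >> z);
}
```

Because the overload lives in the global namespace and both operand types belong to the standard library, it is only found from code that can see it through ordinary unqualified lookup; a small named wrapper type would be a more conservative design.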

I changed a bit of logic here and there, but the trickiest part was the f i/i/i i/i/i i/i/i lines, where you have to check whether each pattern matches. If it doesn't, you have to reset the stringstream to its original state and position (at least clear std::ios_base::failbit and restore the position).

I wanted to keep the aspect of one line = one case, so it might be a little harder to understand with the lambda function, but I could also reverse the logic and handle faces edge by edge (not sure if I'll use it though).
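As a sketch of the reversed logic, each face entry can be read as one whitespace-delimited token (ss >> token) and then split on '/', which avoids rewinding the stream. The FaceVertex struct and parse_face_vertex function are hypothetical names introduced for this example, and 0 is used for a missing index as in the old code.

```cpp
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>

// Hypothetical type for one face corner; 0 means "index not present".
struct FaceVertex
{
    int position = 0;
    int texcoord = 0;
    int normal = 0;
};

// Parses one token of the form i, i/i, i//i or i/i/i.
// Returns false if the token is malformed or has no position index.
bool parse_face_vertex(const std::string& token, FaceVertex& out)
{
    out = FaceVertex{};
    int* slots[3] = { &out.position, &out.texcoord, &out.normal };

    std::istringstream ss(token);
    std::string field;
    int count = 0;
    while (std::getline(ss, field, '/'))
    {
        if (count == 3)
            return false; // more than three fields
        if (!field.empty()) // an empty field is the "//" case
        {
            std::size_t used = 0;
            try { *slots[count] = std::stoi(field, &used); }
            catch (const std::exception&) { return false; }
            if (used != field.size())
                return false; // trailing garbage after the number
        }
        ++count;
    }
    return out.position != 0;
}
```

With this approach the face branch becomes a loop of the form while (ss >> token) that calls parse_face_vertex on each token, so faces with more than three corners are handled without special cases.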

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Marcus Müller
Solution 2 startresse