How to read char/string one by one from a file and compare in C

This is my first time asking a question here. I'm currently learning C and Linux at the same time. I'm working on a simple C program that uses only system calls to read and write files. My problem is: how can I read the file and check whether consecutive strings/words are the same or not? Here is an example:

foo.txt contains:

hi
bye
bye
hi
hi

And bar.txt is empty.

After I do:

./myuniq foo.txt bar.txt

bar.txt should then contain:

hi
bye
hi

The result should be the same as what uniq produces on Linux.

Here is my code:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

#define LINE_MAX 256

int main(int argc, char * argv[]){
    int wfd,rfd;
    size_t n;
    char temp[LINE_MAX];
    char buf[LINE_MAX];
    char buf2[LINE_MAX];
    char *ptr=buf;

    if(argc!=3){
        printf("Invalid usage: ./executableFileName readFromThisFile writeToThisFile\n");
        return -1;
    }

    rfd=open(argv[1], O_RDONLY);
    if(rfd==-1){
        printf("Unable to read the file\n");
        return -1;
    }

    wfd=open(argv[2], O_CREAT | O_WRONLY, S_IRUSR | S_IWUSR);
    if(wfd==-1){
        printf("Unable to write to the file\n");
        return -1;
    }

    while(n = read(rfd,buf,LINE_MAX)){
        write(wfd,buf,n);
    }

    close(rfd);
    close(wfd);
    return 0;
}

The code above reads and writes without any issue. What I can't figure out is how to go through the buffer one char at a time, as a C-style string, and what the condition of the while loop should be.

I know that I probably need a pointer that travels through buf to find the next newline '\n', and something like:

while(condi){
    if(*ptr == '\n'){
    strcpy(temp, buf);
    strcpy(buf, buf2);
    strcpy(buf2, temp);
}
else
    write(wfd,buf,n);

    *ptr++;
}

But I might be wrong since I can't get it to work. Any feedback might help. Thank you.

Again, this program may only use system calls. I know there is an easier way using FILE and fgets or something similar, but that is not an option here.



Solution 1:[1]

You only need one buffer that stores whatever the previous line contained.

The way this works for the current line is that before you add a character you test whether what you're adding is the same as what's already in there. If it's different, then the current line is marked as unique. When you reach the end of the line, you then know whether to output the buffer or not.

Implementing the above idea using standard input for simplicity (but it doesn't really matter how you read your characters):

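char buf[LINE_MAX];  // LINE_MAX as defined in the question; holds the previous line
int len = 0;
int dup = 0;
for (int c; (c = fgetc(stdin)) != EOF; )
{
    // Check for duplicate and store. The (char) cast keeps bytes >= 0x80
    // comparable on platforms where char is signed.
    if (dup && buf[len] != (char)c)
        dup = 0;
    buf[len++] = c;

    // Handle end of line
    if (c == '\n')
    {
        // buf is not NUL-terminated, so write exactly len bytes
        if (!dup) fwrite(buf, 1, len, stdout);
        len = 0;
        dup = 1;
    }
}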

Note that the dup flag records whether the current line duplicates the previous one. The first line clearly does not, and every subsequent line starts off with the assumption that it is a duplicate. From there, a line either remains a duplicate or is detected as unique as soon as one character differs.

The comparison before store is actually avoiding tests against uninitialized buffer values too, by way of short-circuit evaluation. That's all managed by the dup flag -- you only test if you know the buffer is already good up to this point:

if (dup && buf[len] != (char)c)  // cast so bytes >= 0x80 compare equal when char is signed
    dup = 0;

That's basically all you need. Now, you should definitely add a sanity check to prevent buffer overflow. Or you may wish to use a dynamic buffer that grows as necessary.
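One possible form of that overflow check is sketched below. It is not part of the original answer: the function name uniq_bounded, the BUF_SIZE limit and the choice to return -1 on an overlong line are all assumptions made for illustration. It also treats a final line without a trailing newline as a line of its own.

```c
#include <stdio.h>

#define BUF_SIZE 256   /* assumed maximum line length, including '\n' */

/* Hypothetical helper: copies `in` to `out`, dropping every line that is
   identical to the line before it. Returns 0 on success, or -1 if a line
   does not fit in BUF_SIZE bytes. */
int uniq_bounded(FILE *in, FILE *out)
{
    char buf[BUF_SIZE];
    size_t len = 0;
    int dup = 0;

    for (int c; (c = fgetc(in)) != EOF; )
    {
        /* Refuse to store past the end of the buffer */
        if (len == BUF_SIZE)
            return -1;

        /* Compare before storing; only done while dup is still set */
        if (dup && buf[len] != (char)c)
            dup = 0;
        buf[len++] = (char)c;

        if (c == '\n')
        {
            if (!dup)
                fwrite(buf, 1, len, out);
            len = 0;
            dup = 1;
        }
    }

    /* Final line without '\n': it repeats the previous line only if every
       byte matched and the previous line ended exactly at this position. */
    if (len > 0 && !(dup && buf[len] == '\n'))
        fwrite(buf, 1, len, out);

    return 0;
}
```

Calling uniq_bounded(stdin, stdout) would give the same behaviour as the loop above for inputs whose lines fit in the buffer.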

A complete program that operates on the standard I/O streams and handles arbitrary-length lines might look like this:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t capacity = 15, len = 0;
    char *buf = malloc(capacity);
    if (!buf) return 1;   // out of memory

    for (int c, dup = 0; (c = fgetc(stdin)) != EOF || len > 0; )
    {
        // Grow buffer
        if (len == capacity)
        {
            capacity = (capacity * 2) + 1;
            char *newbuf = realloc(buf, capacity);
            if (!newbuf) break;
            buf = newbuf;
            dup = 0;
        }

        // NUL-terminate end of line, update duplicate-flag and store.
        // The (char) cast keeps bytes >= 0x80 comparable where char is signed.
        if (c == '\n' || c == EOF)
            c = '\0';
        if (dup && buf[len] != (char)c)
            dup = 0;
        buf[len++] = c;

        // Output line if not a duplicate, and reset
        if (!c)
        {
            if (!dup)
                printf("%s\n", buf);
            len = 0;
            dup = 1;
        }
    }

    free(buf);
}

Demo here: https://godbolt.org/z/GzGz3nxMK
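Since the question is restricted to system calls, the same idea can also be sketched with read() and write() only. This is an illustration rather than part of the original answer: the names uniq_fd and write_all, the CHUNK_SIZE and LINE_SIZE limits, and the -1 return on error or on an overlong line are assumptions, and retrying after EINTR is left out. Because the duplicate state is kept per byte, it does not matter where read() happens to split the input.

```c
#include <sys/types.h>
#include <unistd.h>

#define CHUNK_SIZE 4096   /* assumed number of bytes requested per read() */
#define LINE_SIZE  4096   /* assumed maximum line length, including '\n' */

/* Hypothetical helper: write() may write fewer bytes than requested,
   so keep calling it until everything is written. */
static int write_all(int fd, const char *p, size_t n)
{
    while (n > 0)
    {
        ssize_t w = write(fd, p, n);
        if (w < 0)
            return -1;
        p += w;
        n -= (size_t)w;
    }
    return 0;
}

/* Hypothetical helper: copies rfd to wfd, dropping lines identical to the
   previous line. Returns 0 on success, -1 on I/O error or overlong line. */
int uniq_fd(int rfd, int wfd)
{
    char chunk[CHUNK_SIZE];   /* raw bytes from read() */
    char line[LINE_SIZE];     /* previous line, overwritten by the current one */
    size_t len = 0;
    int dup = 0;
    ssize_t n;

    while ((n = read(rfd, chunk, sizeof chunk)) > 0)
    {
        for (ssize_t i = 0; i < n; i++)
        {
            char c = chunk[i];

            if (len == LINE_SIZE)
                return -1;

            /* Compare before storing, as in the fgetc version */
            if (dup && line[len] != c)
                dup = 0;
            line[len++] = c;

            if (c == '\n')
            {
                if (!dup && write_all(wfd, line, len) < 0)
                    return -1;
                len = 0;
                dup = 1;
            }
        }
    }
    if (n < 0)
        return -1;

    /* Final line without '\n': skip it only if it repeats the previous line */
    if (len > 0 && !(dup && line[len] == '\n'))
        if (write_all(wfd, line, len) < 0)
            return -1;

    return 0;
}
```

With the descriptors from the question, the copy loop could be replaced by a call such as uniq_fd(rfd, wfd). Adding O_TRUNC to the flags used to open the output file would also avoid leftover bytes when bar.txt already has content.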

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 paddy