'Grep how do I check if a token comes before another in a fil?

I am trying to find among a bunch of files who hold SQL statements whether we ever SELECT from a table before we INSERT into it. It seems it should be a one-liner with Grep.

I've come up with grep -zl "FROM (\S*).*INSERT INTO \0

The -z treats the input as one line, and then the back reference does the rest.

However testing with echo "SELECT a FROM x INSERT INTO x VALUES(1);" | grep -zl "FROM (\S*).*INSERT INTO \0" produces no result.

In fact even echo "aa aa" | grep "(\S*) \0" returns nothing.

What am I missing?

grep

Solution 1:^[1]

First, let's solve it for x:

echo "SELECT a FROM x INSERT INTO x VALUES(1);" | grep -E "FROM (\S)*x.*INSERT INTO (\S)*x"

However, you may have many tables and you are interested about all of them. So, this is how you can list the table names:

select TABLE_NAME
from information_schema.tables;

Now, let's generate the grep for each table:

select CONCAT('sudo bash foo.sh "your script" ', TABLE_NAME)
from information_schema.tables;

and implement foo.sh as follows:

echo "$1" | grep -E "FROM (\S)*$2.*INSERT INTO (\S)*$2"

The query generates the grep for each table. Naturally, you can filter your query to a selection of tables instead and you might also need to handle cases like

select ... from yourschema.yourtable

select ... from `yourtable`

but start with the proof-of-concept I have given and see whether that's enough for you.

Solution 2:^[2]

`grep` solution:

Use -P option to use RegExp with Perl notation

grep -zPl "FROM ([[:alnum:]]+) INSERT INTO \1 VALUES"

Matching `SELECT` statement before `INSERT`, solution:

The reported problem is more complicated than described.

Assuming SELECT statements and corresponding INSERT statement are not in sequence.

For instance:

SELECT a FROM x1 
INSERT INTO x1 VALUES(1);
SELECT a FROM x2 
SELECT a FROM x3 
SELECT a FROM y1
SELECT a FROM x2 
INSERT INTO x3 VALUES(1);
INSERT INTO x2 VALUES(1);
INSERT INTO y1 VALUES(1);
INSERT INTO x3 VALUES(2);
SELECT a FROM y2
INSERT INTO y1 VALUES(1);

Only x3 and y1 are not matched. And there are nesting and duplicates.

We do not know ahead all table-names.

We need a stack data structure. Push-in every table-name in select select statement (no duplicates), pull-out every table-name in insert.

Implemented using gawk (standard awk in most Linux machines) associative array. Screening input SQL file once.

`gawk` script: `script.awk`

/SELECT .* FROM / { # for each line matching RegExp "SELECT .* FROM"
  # read table name from current line
  tableName = gensub(/.*FROM[[:space:]]+([[:alnum:]]+).*/,"\\1",1);
  # push-in tableName into associative array (used as stack)
  tableNamesStack[tableName] = 1; 
}
/INSERT INTO / { # for each line matching RegExp "INSERT INTO "
  # read table name from current line
  tableName = gensub(/.*INTO[[:space:]]+([[:alnum:]]+).*/,"\\1",1);
  # if current tableName is in stack
  if (tableName in tableNamesStack) {
    # pull-out current tableName from stack
    delete tableNamesStack[tableName];
  } else {
    # current tableName is missing from stack, report and continue.
    printf ("Unmatched INSERT statement in line %d, for table %s\n", NR, tableName);
  }
}

running `script.awk`

gawk -f script.awk input.sql 

Unmatched INSERT statement in line 9, for table x3
Unmatched INSERT statement in line 11, for table y1

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	Lajos Arpad
Solution 2	Dudi Boy

'Grep how do I check if a token comes before another in a fil?

Solution 1:[1]

Solution 2:[2]

grep solution:

Matching SELECT statement before INSERT, solution:

gawk script: script.awk

running script.awk