How to apply 2 regex patterns simultaneously in grep

I would like to check for 2 patterns across a piece of text (a total of 4 delimiters).
It is fine if only one of the two patterns matches. However, when both match, I would like the results to be printed side by side.

This is the INPUT TEXT (it contains two separate long lines; the actual input is much longer, and I have chosen relevant snippets as shown below):

Query Response #5: [2659] ID,(0010,0030) DA #8 [19801024] PatientÆs Birth Date,,,(0020,000D) UI #42 Station Title,,,>(0040,0002) DA #8 [20301212] Scheduled Procedure Step Start Date,,,>(0040,0003) TM #4Information

Query Response #6: ID, (0010,0030) DA #8 [19410203] PatientÆs Birth Date, Title,,,>(0040,0002) DA #8 [20210826] Scheduled Procedure Step Start Date, FIND]

This is the DESIRED OUTPUT:

19801024 20301212
19410203 20210826

There are 2 pairs of delimiters. The 1st set of delimiters is this:

(0010,0030) DA #8 [
] Patient

The 2nd set of delimiters is this:

(0040,0002) DA #8 [
] Scheduled Procedure Step Start Date

I am able to apply each pair of delimiters by itself. Specifically, when I do this:

grep -o -P "(?<=0010,0030\) DA #8 \[).*(?=\] Patient)"

I get this output:

19801024
19410203

When I apply the 2nd pair of delimiters like this:

grep -o -P "(?<=0040,0002\) DA #8 \[).*(?=\] Scheduled Procedure Step Start Date)"

I get this output:

20301212
20210826

How do I issue a correct combined grep command so that the output is as shown below?

19801024 20301212
19410203 20210826

I tried the following approach without success:

grep -e "(?<=0040,0002\) DA #8 \[).*(?=\] Scheduled Procedure Step Start Date)" -e "(?<=0010,0030\) DA #8 \[).*(?=\] Patient)"

The error message I get is as follows:

grep: Unmatched ) or \)

Thanks in advance. I hope my question is clear. (Please note I'm using grep under Windows 10. The outer quotation marks have to be double quotation marks.)



Solution 1:[1]

This is actually a job for sed more than grep:

sed -E 's/.*\(0010,0030\) DA #8 \[([^]]+)\] Patient.*\(0040,0002\) DA #8 \[([^]]+)\] Scheduled Procedure Step Start Date.*/\1 \2/' file

19801024 20301212
19410203 20210826
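
One variation that may help when the real input also contains lines that lack one or both tags: a plain s/// leaves such lines unchanged and still prints them. The sketch below assumes a sed that accepts -E (GNU sed does) and uses a hypothetical file name, sample.txt, and a hypothetical helper name, extract_dates, both chosen only for illustration. With -n and the trailing p flag, only lines where the substitution succeeded are printed.

```shell
# Hypothetical sample input; the real file is assumed to follow the same layout.
cat > sample.txt <<'EOF'
Query Response #5: [2659] ID,(0010,0030) DA #8 [19801024] Patient's Birth Date,,,(0020,000D) UI #42 Station Title,,,>(0040,0002) DA #8 [20301212] Scheduled Procedure Step Start Date,,,>(0040,0003) TM #4Information
A line that contains neither tag
Query Response #6: ID, (0010,0030) DA #8 [19410203] Patient's Birth Date, Title,,,>(0040,0002) DA #8 [20210826] Scheduled Procedure Step Start Date, FIND]
EOF

# extract_dates is an illustrative helper name, not part of the original answer.
# -n turns off automatic printing; the p flag prints a line only if s/// matched.
extract_dates() {
  sed -nE 's/.*\(0010,0030\) DA #8 \[([^]]+)\] Patient.*\(0040,0002\) DA #8 \[([^]]+)\] Scheduled Procedure Step Start Date.*/\1 \2/p' "$1"
}

extract_dates sample.txt
```

Because the sed script contains no double quotes itself, it can presumably be wrapped in double quotes instead of single quotes where the shell requires that, although this has not been verified on Windows here.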

Solution 2:[2]

With your shown samples only, please try the following awk program. It was written and tested in GNU awk and should work in any awk.

A simple explanation: PatientÆs Birth Date is set as the field separator for all lines. In the main program, if the 1st field matches the regex ^Query Response.*DA #8 \[[0-9]+\]$, the value between [ and ] (excluding the brackets themselves) is extracted and saved in the variable val. Then, if the 2nd field matches ^,.*DA #8[[:space:]]+, the value between [ and ] is extracted again, and val is printed together with the current value of $2, which gives the required output.

awk -F' PatientÆs Birth Date' '
$1~/^Query Response.*DA #8 \[[0-9]+\]$/{
  val=""
  gsub(/.*\[|\]$/,"",$1)
  val=$1
}
$2~/^,.*DA #8[[:space:]]+/{
  gsub(/.*\[|\].*/,"",$2)
  print val,$2
}
'  Input_file 
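
Since the question says it is fine if only one of the two patterns matches, a possible alternative is to look up each tag independently. This is only a sketch, not the answer author's method: the file name sample2.txt, the helper names extract_either and value_after, and the extra sample dates are all made up for illustration. It uses index() and substr(), so the tags are treated as literal strings and need no regex escaping. As a simplification, it reads up to the next ] and does not verify the text that follows the closing bracket.

```shell
# Hypothetical sample input covering: both tags, only the first, only the second, neither.
cat > sample2.txt <<'EOF'
Query Response #5: [2659] ID,(0010,0030) DA #8 [19801024] Patient's Birth Date,,,>(0040,0002) DA #8 [20301212] Scheduled Procedure Step Start Date,,,>(0040,0003) TM #4Information
Query Response #7: ID,(0010,0030) DA #8 [19750615] Patient's Birth Date, FIND]
Query Response #8: Title,,,>(0040,0002) DA #8 [20220101] Scheduled Procedure Step Start Date, FIND]
no tags on this line
EOF

# extract_either is an illustrative helper name.
extract_either() {
  awk '
  # Return the text between the given tag and the next "]" on the current line,
  # or an empty string when the tag (or a closing bracket) is missing.
  function value_after(tag,    rest, i, j) {
    i = index($0, tag)
    if (i == 0) return ""
    rest = substr($0, i + length(tag))
    j = index(rest, "]")
    if (j == 0) return ""
    return substr(rest, 1, j - 1)
  }
  {
    birth = value_after("(0010,0030) DA #8 [")
    start = value_after("(0040,0002) DA #8 [")
    if (birth != "" && start != "") print birth, start   # both found: side by side
    else if (birth != "") print birth                      # only the first tag
    else if (start != "") print start                      # only the second tag
  }' "$1"
}

extract_either sample2.txt
```

Lines with neither tag produce no output. When only one value is present, it is printed alone, so the column position no longer tells which tag it came from; printing a placeholder for the missing value would be one way to keep the columns aligned.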

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 anubhava
Solution 2 RavinderSingh13