Filter rows of file based on match in a second file
I have two files, each with a list of genes (and some other information), and want to produce a third file that contains the genes from file 1 that are not found in file 2.
Here is the first file:
A01 23915164 23915314 Bra040361 AT3G07100 18.59019807
A02 1595601 1595688 Bra028560 AT5G10870 28.48729677
A02 20515443 20516351 Bra029367 AT5G23340 16.06774844
A03 14688282 14689512 Bra001273 AT3G07870 13.93575203
A06 472776 473620 Bra039661 AT1G53210 52.96447989
A08 2624078 2634861 Bra014189 AT1G49450 22.62572775
A09 30817052 30819592 Bra031704 AT1G10170 14.95032844
A10 532340 532466 Bra033282 AT1G01140 10.5095903
And here are the first 5 lines of the second file:
A01 3710694 3710789 Bra011117 11.85222101
A01 9480352 9483285 Bra026368 11.96344966
A01 23915165 23915314 Bra040361 18.59019807
A02 1595602 1595688 Bra028560 28.48729677
A02 1674981 1675077 Bra023317 17.90385707
For both files, the gene names are in column 4. I thought this would be pretty easy to do with something like awk, but I am having trouble. Any suggestions for how I can achieve my desired filtering?
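For illustration, here is a minimal sketch of one way to do this kind of filtering, written in Python rather than awk. It assumes whitespace-separated columns with the gene name in column 4 of both files, as described above. The function name `filter_unmatched` and the file names in the usage note are hypothetical. The match is made on the gene name only, because the sample rows show that the coordinates in column 2 can differ by one between the two files (23915164 vs. 23915165 for Bra040361).

```python
def filter_unmatched(file1_lines, file2_lines, key_col=3):
    """Return the lines of file 1 whose gene name is absent from file 2.

    key_col is the zero-based index of the gene-name column
    (3 means column 4, as in the sample data).
    """
    # First pass: collect every gene name that appears in file 2.
    seen = set()
    for line in file2_lines:
        fields = line.split()
        if len(fields) > key_col:
            seen.add(fields[key_col])

    # Second pass: keep file 1 lines whose gene name was not collected.
    # Lines too short to have a gene-name column are skipped.
    kept = []
    for line in file1_lines:
        fields = line.split()
        if len(fields) > key_col and fields[key_col] not in seen:
            kept.append(line)
    return kept


# Small demo using a few of the sample rows from the question.
file1_sample = [
    "A01 23915164 23915314 Bra040361 AT3G07100 18.59019807",
    "A02 1595601 1595688 Bra028560 AT5G10870 28.48729677",
    "A02 20515443 20516351 Bra029367 AT5G23340 16.06774844",
]
file2_sample = [
    "A01 3710694 3710789 Bra011117 11.85222101",
    "A01 23915165 23915314 Bra040361 18.59019807",
    "A02 1595602 1595688 Bra028560 28.48729677",
]

for row in filter_unmatched(file1_sample, file2_sample):
    print(row)
```

To run this on real files, the function can be given open file objects, for example `filter_unmatched(open("file1.txt"), open("file2.txt"))`, and the returned lines written to a third file. Since only the first 5 lines of the second file are shown, which rows of the full first file would actually survive depends on the rest of that file.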
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
