How to get elements of a collection that are not in another collection?
I am a Java programmer, but I have a task that is better solved in Python (it is more efficient and better suited to the server), and I am not familiar with Python.
The task: I have a file containing about 5 million sorted ids in the following format:
00000011-1f0e-4d89-b658-af53b36c882e
0000008a-5816-4324-82f6-9242a8867094
000000be-d08c-41b9-97f3-594d2660dfb5
000000f2-ea63-48c0-98f6-1dbb25f0249e
0000014d-f6b0-4b3e-b767-14cd2495fd81
00000155-ec3b-4d1a-a3ae-28e95cfc79c7
00000231-65f9-424a-bf03-1d3cbefc6c40
00000281-cb21-4d3c-ba13-874161962567
000002be-6e9d-455d-aa16-49e2ac242868
00000375-4d9a-4dd6-8e0c-38e5c2134a3c
00000383-fc20-4154-921c-c187bb3f6628
000003fc-7a06-4525-a12a-df64732324e5
00000420-af64-4015-9bc4-6b9e18b86183
00000476-1bf9-4608-8979-d60ecd5b368b
...
I also have another file containing about 60 million sorted ids in the same format.
I need to read all ids from the first file into a variable, say l1, and all ids from the second file into another variable, say l2. Then I want to find all elements of l1 that are absent from l2 and write them to a third file. There are many files of the first kind, so I have to repeat these steps from time to time.
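The steps above could be sketched roughly as follows. This is only a minimal sketch under some assumptions: each file holds one id per line, the function name `missing_ids` and the file paths are made up for illustration, and, unlike the description above, only the smaller file is loaded into memory (as a set) while the larger one is streamed line by line.

```python
def missing_ids(first_path, second_path, out_path):
    """Write ids from first_path that never occur in second_path to out_path.

    Assumes one id per line; all names here are illustrative only.
    """
    # Load the smaller file into a set for fast membership tests.
    with open(first_path) as f:
        candidates = {line.strip() for line in f if line.strip()}

    # Stream the larger file and drop every id that also occurs there.
    with open(second_path) as f:
        for line in f:
            candidates.discard(line.strip())

    # Whatever is left exists only in the first file; write it back sorted.
    with open(out_path, "w") as out:
        for id_ in sorted(candidates):
            out.write(id_ + "\n")
    return len(candidates)
```

A call such as `missing_ids("first.txt", "second.txt", "result.txt")` would return the number of ids written. A set is used here because membership tests on a list would scan the whole list for every id.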
Please tell me the best way to solve this problem: which object types should I use for l1 and l2 (both lists of ids are sorted), and what would the complete Python script look like?
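Since both files are sorted, one possible direction is a streaming merge that never holds either file in memory. The following is a sketch, not a definitive answer. It assumes one id per line and that both files are sorted in the same order Python uses when comparing strings; the names `ids` and `write_missing_sorted` are hypothetical.

```python
def ids(f):
    """Yield non-empty, stripped lines from an open text file."""
    for line in f:
        line = line.strip()
        if line:
            yield line


def write_missing_sorted(first_path, second_path, out_path):
    """Stream two sorted id files and write ids found only in the first."""
    written = 0
    with open(first_path) as first, open(second_path) as second, \
            open(out_path, "w") as out:
        others = ids(second)
        other = next(others, None)  # None means the second file is exhausted
        for current in ids(first):
            # Advance the second file until it catches up with the current id.
            while other is not None and other < current:
                other = next(others, None)
            # No match at this position means the id is absent from the second file.
            if other != current:
                out.write(current + "\n")
                written += 1
    return written
```

Each file is read exactly once, so the work grows linearly with the total number of ids, and memory use stays small regardless of file size.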
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
