'Most fastest way to find rows in one table which are not present in other table

I have 2 tables , one called "Booking" and another called "Assignments" there is no relation between these two tables

"Booking" tables contains columns "Resource" ,"start date", "end date", "hours"

"Assignments" table contains columns "Resource" , "start date" , "end date", "hours"

"Booking" table rows are daily records , means start and end date will always be same and hours will be less than 8(working hours in a day)

"Assignment" table rows can be multi day records , means start and end date can be same or different , example start date 01-02-2022 and end date 03-02-2022 and hours can be 24 hours(max 8 hours per day)

Now I have to find all booking rows for a "Resource" , which do not have equivalent assignment row for a particular day and hours

There can be complex scenario in which booking for "resource a" for "1-2-2022" is for "8 hours" , but assignment row for same date and resource is just 6 hours , so I need to find those missing hours too.

What is the fastest and most efficient way to do it? Records can be in millions and preferred programming language is C# , if C# cannot give me a way to process all this in Max 30 minutes ,other approaches are also welcome , like storing these 2 tables in a SQL database and run a query on it , to get results in a 3rd table

Thanks for your help



Solution 1:[1]

I would use the backend to do such processes. An index on a Checksum on each table would be efficient. Start by creating the Booking and Assignment tables.

Add the checksums:

ALTER TABLE Booking
ADD Booking_Checksum AS CHECKSUM( [Resource] ,[start date], [end date], [hours]) PERSISTED;

ALTER TABLE Assignment
ADD Assignment_Checksum AS CHECKSUM( [Resource] ,[start date], [end date], [hours]) PERSISTED;

Add indexes:

CREATE NONCLUSTERED INDEX [IX_Booking_Checksum] ON [dbo].[Booking]
(   Booking_Checksum ASC )

CREATE NONCLUSTERED INDEX [IX_Assignment_Checksum] ON [dbo].[Assignment]
(   Assignment_Checksum ASC )

All in Booking but not Assignment:

Select b.*
From Booking b
left join Assignment a on b.Booking_Checksum = a.Assignment_Checksum 
where a.Assignment_Checksum is null

All in Assignment but not in Booking:

Select a.*
From Assignment a
left join Booking b on b.Booking_Checksum = a.Assignment_Checksum 
where b.Booking_Checksum is null

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 cloudsafe