'Anti-Join two tables and replace null with date on the other table
I have two tables
- UserTable
| Username |
|---|
| User1 |
| User2 |
| User3 |
- UsageTable
| Username | Date |
|---|---|
| User1 | 1-2-22 |
| User2 | 2-2-22 |
| User1 | 3-2-22 |
| User2 | 3-2-22 |
I need who did not used the tool date wise.
Expected output:
| Username | Date |
|---|---|
| User2 | 1-2-22 |
| User3 | 1-2-22 |
| User1 | 2-2-22 |
| User3 | 2-2-22 |
| User3 | 3-2-22 |
I tried joining (Right join) the table but i am getting the user name properly but not the date (getting NULL).
select a.username,b.username,b.date from
(select distinct date, b.username username
from UsageTable
) b
right join
toolusers a
on
b.username = a.username
Solution 1:[1]
You can use an anti-join:
select u.usernamte, d.date
from usertable u
cross join (select distinct date as dt from usagetable) d
left join usagetable ut on ut.username = u.username and ut.date = d.dt
where ut.username is null
order by d.date, u.username
Solution 2:[2]
The issue here is that you don't have a table of just dates. So you need to generate that yourself.
There's two solutions here...either you want to find all Users who had no usage within a specified range. Or you want to find users who had no usage on days where there were other users using the system.
This might be confusing...but basically...if no one had usage on 2022-02-01, and you tried to use a DISTINCT to grab that list of dates...then you'll return no rows for that day, when you'd actually want is a list of all users.
I'm going to provide an answer based on what I would consider to be the most likely scenario, which is to find all users with no usage within a specified date range.
The first thing I do is generate a table that has a row for each day I want to check for.
DECLARE @DateRangeStart date = '2022-02-01',
@DateRangeEnd date = '2022-02-03';
-- FYI, this tally table generator code only produces 101 records total
IF OBJECT_ID('tempdb..#daterange','U') IS NOT NULL DROP TABLE #daterange; --SELECT * FROM #daterange
WITH c1 AS (SELECT x.x FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) x(x)) -- 10
, c2(x) AS (SELECT 1 FROM c1 x CROSS JOIN c1 y) -- 10 * 10
, c3(rn) AS (SELECT 0 UNION ALL SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) FROM c2) -- Add zero record, and row numbers
SELECT DateValue = DATEADD(DAY, x.rn, @DateRangeStart)
INTO #daterange
FROM c3 x
WHERE x.rn <= DATEDIFF(DAY, @DateRangeStart, @DateRangeEnd)
I know this looks complicated, but it's just a common way to generate a list of numbers, sometimes called a tally table. Which I'm then using to generate all dates that fall within a range. Some people like to use system tables. There's many ways to do it.
The main idea is that you just want a table with date values you can use.
Then the query is simple...
SELECT u.Username, d.DateValue
FROM #User u
CROSS JOIN #daterange d
WHERE NOT EXISTS (SELECT * FROM #Usage ug WHERE ug.Username = u.Username AND ug.DateValue = d.DateValue)
I'm cross joining our list of dates, to the list of users. This gives us every possible combination of Username + Date.
Then I add the NOT EXISTS() check which says to exclude any users who have a record for that date in the Usage table.
For reference, here is my sample data setup queries:
IF OBJECT_ID('tempdb..#User','U') IS NOT NULL DROP TABLE #User; --SELECT * FROM #User
CREATE TABLE #User (
Username varchar(20) NOT NULL,
);
INSERT INTO #User (Username)
VALUES ('User1'), ('User2'), ('User3')
IF OBJECT_ID('tempdb..#Usage','U') IS NOT NULL DROP TABLE #Usage; --SELECT * FROM #Usage
CREATE TABLE #Usage (
Username varchar(20) NOT NULL,
DateValue date NOT NULL,
);
INSERT INTO #Usage (Username, DateValue)
VALUES ('User1', '2022-02-01'), ('User2', '2022-02-02'), ('User1', '2022-02-03'), ('User2', '2022-02-03');
Solution 3:[3]
The date range should be derived first between minimum and maximum dates or if a separate date table can be created. Then do cartesian products between date and user table and do left join with Usage table and look for null values in where clause. This how i did it:
create table UserTable(Username varchar(10));
create table UsageTable(Username varchar(10), UsageDate Date);
insert into UserTable values ('User1');
insert into UserTable values ('User2');
insert into UserTable values ('User3');
insert into UsageTable values ('User1','1-FEB-2022');
insert into UsageTable values ('User2','2-FEB-2022');
insert into UsageTable values ('User1','3-FEB-2022');
insert into UsageTable values ('User2','3-FEB-2022');
commit;
with rnge as (select min(UsageDate) min_date, max(UsageDate) max_date from UsageTable),
dt as (select generate_series(min_date,max_date,'1 day') as dt from rnge),
Usr as (select Username, dt from dt, UserTable)
select Usr.* from Usr left join UsageTable usg on usr.username = usg.username
and usr.dt = usg.UsageDate
where usg.username is null;
Note: above sql works in postgres to generate the date range. however, you would use below to generate the date range in oracle. replace the dt table by below with some changes: in oracle this is how date range is generated:
select
to_date('04-01-2016','dd-mm-yyyy') + lvl
from
(select level - 1 lvl
from
dual
connect by
level <= (to_date('10-01-2015','dd-mm-yyyy') - to_date('04-01-2016','dd-mm-yyyy'))+ 1);
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | The Impaler |
| Solution 2 | Chad Baldwin |
| Solution 3 | PraveenP |
