'How do I assess and create a conditional output where a rows' DateTime value sits between an earlier row's Start/End DateTime?
I'm a little stumped.
I need to come up with a query to exclude an entry where :
For the same 'ClientN' value, if the Start Time is after a previous start time AND before a previous End Time, then it's excluded (mentioned below as exclude)
But, if a second time extends beyond the first entry, then the extension of the time is counted (mentioned in the table as 'partial include').
Below is some data that has had this rule manually applied.
I was planning on using PowerQuery for this, but I am open to other ideas.
The premise is to remove instances of 'double counts' against the same project/client. (DT is in d/mm/yyyy HH:mm)
| Start Time | End Time | Duration | ClientN | ProjectN | Decision |
|---|---|---|---|---|---|
| 6/01/2022 12:30 | 6/01/2022 13:12 | 0.00:42:00 | Client1 | Project6 | include |
| 6/01/2022 13:14 | 6/01/2022 13:44 | 0.00:30:18 | Client1 | Project6 | include |
| 6/01/2022 13:44 | 6/01/2022 14:18 | 0.00:33:36 | Client2 | Project3 | include |
| 6/01/2022 14:18 | 6/01/2022 15:05 | 0.00:47:24 | Client1 | Project6 | include |
| 6/01/2022 15:12 | 6/01/2022 15:39 | 0.00:26:06 | Client2 | Project3 | include |
| 6/01/2022 21:00 | 6/01/2022 22:00 | 0.01:00:00 | Client2 | Project5 | include |
| 7/01/2022 8:30 | 7/01/2022 9:00 | 0.00:30:00 | Client1 | Project1 | include |
| 7/01/2022 12:00 | 7/01/2022 13:00 | 0.01:00:00 | Client1 | Project7 | include |
| 7/01/2022 13:30 | 7/01/2022 14:30 | 0.01:00:00 | Client1 | Project6 | include |
| 7/01/2022 17:10 | 7/01/2022 17:42 | 0.00:32:12 | Client2 | Project8 | include |
| 10/01/2022 13:45 | 10/01/2022 14:45 | 0.01:00:00 | Client1 | Project6 | include |
| 10/01/2022 13:46 | 10/01/2022 13:59 | 0.00:12:53 | Client1 | Project1 | exclude |
| 10/01/2022 13:59 | 10/01/2022 14:00 | 0.00:01:20 | Client2 | Project2 | include |
| 10/01/2022 14:00 | 10/01/2022 14:09 | 0.00:08:18 | Client1 | Project1 | exclude |
| 10/01/2022 14:09 | 10/01/2022 14:10 | 0.00:01:20 | Client2 | Project2 | include |
| 10/01/2022 14:09 | 10/01/2022 14:11 | 0.00:02:02 | Client1 | Project1 | exclude |
| 10/01/2022 14:11 | 10/01/2022 14:12 | 0.00:00:38 | Client2 | Project2 | include |
| 10/01/2022 14:12 | 10/01/2022 14:31 | 0.00:19:36 | Client1 | Project1 | exclude |
| 10/01/2022 14:31 | 10/01/2022 14:32 | 0.00:00:59 | Client2 | Project2 | include |
| 10/01/2022 14:32 | 10/01/2022 14:32 | 0.00:00:02 | Client1 | Project1 | exclude |
| 10/01/2022 14:32 | 10/01/2022 14:33 | 0.00:01:12 | Client2 | Project2 | include |
| 10/01/2022 14:33 | 10/01/2022 14:34 | 0.00:00:59 | Client1 | Project1 | exclude |
| 10/01/2022 14:34 | 10/01/2022 14:39 | 0.00:04:20 | Client2 | Project2 | include |
| 10/01/2022 14:39 | 10/01/2022 14:42 | 0.00:03:14 | Client1 | Project1 | exclude |
| 10/01/2022 14:43 | 10/01/2022 14:55 | 0.00:12:14 | Client1 | Project1 | partial include |
I was playing around with formulas:
Column =
VAR __Current = [end time]
VAR __PreviousDate = MAXX(FILTER('detailedreportextracts__small',[start Time] < EARLIER([start Time] ) && [ClientN] = earlier([ClientN]) ),[start Time])
var __Previous = MAXX(filter(detailedreportextracts__small,[start Time]= __PreviousDate),[end Time])
return
__Current - __Previous
The issue appears to be that it only looks back one data row.
So, where I see a negative value on the first exclusion (this is correct), I would also think I would see negative values below this line on the lines that I have marked as 'exclude'.
I might have the columns in the wrong spot in the formula. I did swap them around, but then the row calculated value didn't work for me.
After fluffing around for another few hours with tireless googling, I've been trying to play with: https://community.powerbi.com/t5/Community-Blog/Mean-Time-Between-Failure-MTBF-and-Power-BI/ba-p/339586
Pulling apart the DAX queries as they stand, so I can understand them better. I'm not sure if it's the best way to do it, but I'm trying to extract the 2:45pm (circled) into another column, to then check if the [end time] is < the earlier larger end time. But, for whatever reason, the 'earlier' function doesn't seem to look back very many rows earlier.
In my latest attempt: How to Compare the Current Row to the Previous Row by Category Using DAX Again, my results seem to concur that I'm not able to look back further rows to maintain that 'max' value found earlier.
Any help would be greatly appreciated.
Solution 1:[1]
See if this works for you. Assumes Table1 includes everything EXCEPT the decision column. It will basically generate a table of non-overlapping time spent by project/client
Based on aggregate continuous "transitive" overlapping time intervals
function process
(xtable)=>
// for each group, compare each list against all lists in column Custom, and merge those that overlap
let Source= Table.Buffer(xtable),
#"Added Custom"= Table.AddColumn(
Source,
"Custom2",
each let
begin = [Custom],
myclient=[ClientN],
myproject=[ProjectN]
in
List.Accumulate (
Table.SelectRows(Source,each [ClientN]=myclient and [ProjectN]=myproject)[Custom],
begin,
(state,current)=> if List.ContainsAny(state,current) then List.Distinct(List.Combine({current,state})) else state
)
),
// count the number of changes made from original version. If this is not zero, we will recurse the changes
x= List.Sum(List.Transform(List.Positions(#"Added Custom"[Custom]), each if #"Added Custom"[Custom]{_} = #"Added Custom"[Custom2]{_} then 0 else 1)),
RemovePrioCustom= Table.RemoveColumns(#"Added Custom",{"Custom"}),
AddNewCustom= Table.RenameColumns(RemovePrioCustom,{{"Custom2", "Custom"}}),
recursive = if x=0 then AddNewCustom else @process( AddNewCustom)
in recursive
code
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Start Time", type datetime}, {"End Time", type datetime}, {"Duration", type duration}, {"ClientN", type text}, {"ProjectN", type text}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(#"Added Index", "Custom",
(i)=>Table.SelectRows(#"Added Index", each [ClientN]=i[ClientN] and [ProjectN]=i[ProjectN] and
([Start Time]>=i[Start Time] and [End Time]<=i[End Time] or
[Start Time]<=i[Start Time] and [End Time]>=i[End Time] or
[Start Time]<=i[Start Time] and [End Time]<=i[End Time] and [End Time]>=i[Start Time]or
[Start Time]>=i[Start Time] and [End Time]>=i[End Time] and [Start Time] <=i[End Time])
)[Index]
),
MergeOverlap= process(#"Added Custom"),
#"Added Custom1" = Table.AddColumn(MergeOverlap, "StartMin", each List.Min(List.Transform([Custom], each MergeOverlap[Start Time]{_})),type datetime),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "EndMax", each List.Max(List.Transform([Custom], each #"Added Custom1"[End Time]{_})), type datetime),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom2",{"Start Time", "End Time" ,"Duration","Index", "Custom"}),
#"Removed Duplicates" = Table.Distinct(#"Removed Columns", {"ClientN", "ProjectN","StartMin", "EndMax"}),
#"Sorted Rows" = Table.Sort(#"Removed Duplicates",{{"ClientN", Order.Ascending}, {"ProjectN", Order.Ascending}, {"StartMin", Order.Ascending}})
in #"Sorted Rows"
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |


