'Longest Consecutive Subsequence in a stream of integers

Given a stream of array, for each element we have to find the longest consecutive subsequence till now

For example:

10 -> 1

12 -> 1

11 -> 3 (10,11,12)

20 -> 3

What I think: Create a set and for each incoming number, check what if its the start/end of the LCS. Take the max possible length.

Since this is of O(N^2), is there any possible way to reduce the TC? I don't need the code but just a way to optimize the algo



Solution 1:[1]

You can do this in O(n log n) time with a balanced BST using the ideas from this post.

An incoming number x will be treated as an interval [x, x+1]. The BST stores the boundary points of the disjoint intervals; each leaf node in the BST should also store whether it's a start or end, and the length of the interval it's part of.

When a point x comes in, search the BST for the largest element u <= x. If it's present and it's a 'start point', continue. Also search for the smallest element x + 1 <= v. If it's present and it's an 'end point', continue.

Otherwise, there's several cases. If x is an end point and x+1 is a start point, you should remove both from the BST, forming a new interval whose size is the sum of the two combined intervals + 1. If one of (x is an end point, x+1 is a start point) is true, remove that point, and extend the appropriate interval by 1 to the right or left, respectively. If [x, x+1] is already contained in an interval, continue. Otherwise, add x and x+1 to the BST as a start and end, respectively, of size 1.

On each addition, you'll check whether the new interval is larger than the previous largest: if so, and u, v are the boundary points, your longest consecutive subsequence is u, u+1, ... v-1.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 kcsquared