'Solving the "firstDuplicate" question in Python
I'm trying to solve the following challenge from codesignal.com:
Given an array a that contains only numbers in the range from 1 to a.length, find the first duplicate number for which the second occurrence has the minimal index. In other words, if there are more than 1 duplicated numbers, return the number for which the second occurrence has a smaller index than the second occurrence of the other number does. If there are no such elements, return -1.
Example
For a = [2, 1, 3, 5, 3, 2], the output should be
firstDuplicate(a) = 3.
There are 2 duplicates: numbers 2 and 3. The second occurrence of 3 has a smaller index than the second occurrence of 2 does, so the answer is 3.
For a = [2, 4, 3, 5, 1], the output should be
firstDuplicate(a) = -1.
The execution time limit is 4 seconds.
The guaranteed constraints were:
1 ≤ a.length ≤ 10^5, and
1 ≤ a[i] ≤ a.length
So my code was:
def firstDuplicate(a):
b = a
if len(list(set(a))) == len(a):
return -1
n = 0
answer = -1
starting_distance = float("inf")
while n!=len(a):
value = a[n]
if a.count(value) > 1:
place_of_first_number = a.index(value)
a[place_of_first_number] = 'string'
place_of_second_number = a.index(value)
if place_of_second_number < starting_distance:
starting_distance = place_of_second_number
answer = value
a=b
n+=1
if n == len(a)-1:
return answer
return answer
Out of the 22 tests the site had, I passed all of them up to #21, because the test list was large and the execution time exceeded 4 seconds. What are some tips for reducing the execution time, while keeping the the code more or less the same?
Solution 1:[1]
As @erip has pointed out in the comments, you can iterate through the list, add items to a set, and if the item is already in a set, it is a duplicate that has the lowest index, so you can simply return the item; or return -1 if you get to the end of the loop without finding a duplicate:
def firstDuplicate(a):
seen = set()
for i in a:
if i in seen:
return i
seen.add(i)
return -1
Solution 2:[2]
Create a new set and find its already in the new list, if its there return the element:
def firstDuplicate(a):
dup = set()
for i in range(len(a)):
if a[i] in dup:
return a[i]
else:
dup.add(a[i])
return -1
Solution 3:[3]
This is just an idea, I didn't verify it but it should work. It seems there's no memory limit but just a time limit. Therefore using space to trade time is probably a practical way to do this. The computation complexity is O(n). This algorithm also depends on the condition that the number range is between 1 to len(a).
def first_duplicate(a):
len_a = len(a)
b = [len_a + 1] * len_a
for i, n in enumerate(a):
n0 = n - 1
if b[n0] == len_a + 1:
b[n0] = len_a
elif b[n0] == len_a:
b[n0] = i
min_i = len_a
min_n = -1
for n0, i in enumerate(b):
if i < min_i:
min_i = i
min_n = n0 + 1
return min_n
Update:
This solution is not as fast as the set() solution by @blhsing. However, it may not be the same if it was implemented in C - it's kinda unfair since set() is a built-in function which was implemented in C as other core functions of CPython.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | blhsing |
| Solution 2 | Velkumaran A P |
| Solution 3 |
