Finding peak in a large data set with Python
I want to know if there is a way to eliminate points that are not close to the peak. For example, if I have a data set with 10 million points and the peak is around 5 million, how could I get rid of the points that are nowhere near the peak, so I can narrow down where my index point resides?
Solution 1:[1]
First you need to define what range of values counts as close to the peak. Assume you pick a threshold: you can then keep only the elements that lie within that threshold of the peak by indexing the NumPy array with a boolean condition. For example:
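import numpy as np
data_size = 10000000
max_possible_peak = 5000000
# Random sample data with values in [0, max_possible_peak)
data = np.random.rand(data_size) * max_possible_peak
threshold = 100
# Use NumPy's vectorized max; the built-in max() is much slower on large arrays
peak = data.max()
# Boolean mask: keep only the values within `threshold` of the peak
data_near = data[data > peak - threshold]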
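Since the question is about narrowing down the index of the peak, it may help to keep the original positions of the surviving points along with their values. The following is a minimal sketch under assumptions: the helper name `near_peak` and the small sample array are hypothetical and not part of the original answer. It uses `np.argmax` for the position of the peak and `np.flatnonzero` for the indices that pass the same threshold test.

```python
import numpy as np

def near_peak(data, threshold):
    """Return (peak_index, indices, values) for points within `threshold` of the maximum.

    Hypothetical helper for illustration; not part of the original answer.
    """
    data = np.asarray(data)
    peak_index = int(np.argmax(data))   # position of the first maximum
    peak = data[peak_index]
    mask = data > peak - threshold      # same condition as in the answer above
    indices = np.flatnonzero(mask)      # original positions of the kept points
    return peak_index, indices, data[mask]

# Small made-up example: the peak value 9.0 sits at index 4
sample = np.array([1.0, 2.0, 8.5, 3.0, 9.0, 8.9, 0.5])
peak_index, indices, values = near_peak(sample, threshold=1.0)
print(peak_index)        # 4
print(indices.tolist())  # [2, 4, 5]
print(values.tolist())   # [8.5, 9.0, 8.9]
```

Keeping `indices` means the filtered points can still be mapped back to their positions in the full data set, which a plain boolean selection of values would lose.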
Solution 2:[2]
A native loop solution:
# This assumes the goal is to delete the small values and that the peak is already known.
import array

# Sentinel that marks elements for deletion. It must match the array's
# element type and must not already occur in the array (check this).
arbitrary = -777
ar = array.array("i")
for ii in range(100000):
    ar.append(ii)
print(len(ar))

a = 0  # number of elements marked for deletion
i = 0  # current index
while a < 100000 and i < 100000:
    if ar[i] < 98887:  # defines what counts as close to the peak
        ar[i] = arbitrary
        a += 1
    i += 1

# Remove every marked element
while arbitrary in ar:
    del ar[ar.index(arbitrary)]
print(str(len(ar)) + "\n (length) ")
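The mark-then-delete loop above rescans and shifts the array for every removed element, which becomes slow as the array grows. As an alternative sketch (not the original author's method), the kept values can be collected in a single pass. The function name `keep_near_peak` and its `cutoff` parameter are hypothetical names chosen for illustration.

```python
import array

def keep_near_peak(values, cutoff):
    """Return a new array holding only the elements >= cutoff.

    Hypothetical helper for illustration; `cutoff` plays the role of the
    hard-coded 98887 in the loop above.
    """
    # A generator expression filters in one pass; no sentinel value is needed.
    return array.array(values.typecode, (v for v in values if v >= cutoff))

ar = array.array("i", range(100000))
near = keep_near_peak(ar, 98887)
print(len(near))  # 1113
```

Building a new array avoids the need to pick a sentinel that is guaranteed not to appear in the data.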
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Tom Hirshberg |
| Solution 2 | |
