Finding peak in a large data set with Python

I want to know if there is a way to eliminate points that are not close to the peak. For example, if I have a data set with 10 million points and the peak is around 5 million, how could I get rid of the points that are nowhere near the peak, so I can narrow down where my index point resides?



Solution 1:[1]

First, you need to define what range of values counts as close to the peak. Suppose you pick a threshold: you can then keep only the elements that lie within that distance of the peak by indexing the NumPy array with a boolean condition. For example:

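import numpy as np

data_size = 10000000
max_possible_peak = 5000000
# Random sample data with values in [0, max_possible_peak)
data = np.random.rand(data_size) * max_possible_peak
threshold = 100
peak = data.max()  # the array's own max() is much faster than the built-in max() on large arrays
# Keep only the values that lie within `threshold` of the peak
data_near = data[data > peak - threshold]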
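
Since the question also asks where the point sits in the data, here is a minimal sketch that keeps the original indices alongside the values. It assumes the same value-based threshold as above; the names `peak_index`, `near_indices` and `near_values` are illustrative, and the array is smaller and seeded only to keep the example quick and reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)           # seeded so the example is reproducible
data = rng.random(1_000_000) * 5_000_000
threshold = 100

peak_index = int(np.argmax(data))        # position of the largest value
peak = data[peak_index]

# Positions (in the original array) of all values within `threshold` of the peak.
near_indices = np.flatnonzero(data > peak - threshold)
near_values = data[near_indices]

print(peak_index, len(near_indices))
```

`np.flatnonzero` returns positions into the original array, so the candidates can still be located after the filtering step.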

Solution 2:[2]

A native loop solution:

import array

# Assumes the peak is already known and that the small numbers are the ones to delete.
arbitrary = -777  # sentinel used for deletion; must match the array member type and not initially be in the array (check it)
cutoff = 98887    # defines what is close to the peak

ar = array.array("i", range(100000))
print(len(ar))

# Mark every element that is not close to the peak.
for i in range(len(ar)):
    if ar[i] < cutoff:
        ar[i] = arbitrary

# Drop the marked elements in one pass. Deleting them one at a time with
# del ar[ar.index(arbitrary)] rescans the array on every deletion.
ar = array.array("i", (x for x in ar if x != arbitrary))
print(str(len(ar)) + "\n (length)  ")
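
Both answers filter by value. If "around 5 million" instead refers to the position of the peak in ordered data, another option is to cut a window around that approximate index and search only inside it. The following is a rough sketch under that assumption; `window_around`, `half_width` and the toy `signal` are hypothetical names and data, not part of the question.

```python
def window_around(values, center, half_width):
    """Return (start, slice) covering center +/- half_width, clipped to the bounds."""
    start = max(0, center - half_width)
    stop = min(len(values), center + half_width + 1)
    return start, values[start:stop]


# Toy data: a single peak at index 5003.
signal = [-(i - 5_003) ** 2 for i in range(10_000)]

# Search only near the approximate peak position (here: index 5000).
start, window = window_around(signal, center=5_000, half_width=50)
local_peak = max(range(len(window)), key=window.__getitem__)
peak_index = start + local_peak  # convert back to an index into `signal`

print(peak_index)  # 5003
```

Returning `start` together with the slice makes it possible to map a position inside the window back to the full data set.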
    

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Tom Hirshberg
Solution 2