Smooth signal and find peaks
Given I have an X and Y array such that:
X = np.array([1,2,3,4,5,6,7,8,9,10,11,12])
and
Y = np.array([-19.9, -19.6, -17.6, -15.9, -19.9, -18.4, -17.7, -16.6, -19.5, -20.4, -17.6, -15.9])
I get a plot like:
Here there are 3 very clear peaks that I can see. I can fit this data using:
# fit polynomial
z = np.polyfit(X, Y, 8)
f = np.poly1d(z)
# calculate new x's and y's
x_new = np.linspace(X[0], X[-1], 100)
y_new = f(x_new)
This gives the following curve, which shows the change in signal over the course of a year. In this case the signal comes from rice agriculture, and the 3 peaks correspond to the agricultural cycles:
Here I use scipy.signal.argrelextrema to find the peaks and troughs of the curve. However, getting a curve with a good fit is a very manual process: I have to inspect the data by eye first in order to choose the polynomial order. I will be repeating this on many datasets (hundreds of thousands), so I won't be able to do this manually each time.
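For reference, the workflow described above can be put together as one runnable sketch. It assumes the degree-8 polynomial chosen by eye and the default comparison window of argrelextrema. The number of extrema it reports depends on that polynomial order, which is exactly the manual step in question.

```python
import numpy as np
from scipy.signal import argrelextrema

X = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
Y = np.array([-19.9, -19.6, -17.6, -15.9, -19.9, -18.4,
              -17.7, -16.6, -19.5, -20.4, -17.6, -15.9])

# Fit a degree-8 polynomial (order picked by eye for this dataset)
f = np.poly1d(np.polyfit(X, Y, 8))

# Evaluate the fitted curve on a finer grid
x_new = np.linspace(X[0], X[-1], 100)
y_new = f(x_new)

# Indices of strict local maxima and minima of the smoothed curve
peak_idx = argrelextrema(y_new, np.greater)[0]
trough_idx = argrelextrema(y_new, np.less)[0]

print("peaks at x =", x_new[peak_idx])
print("troughs at x =", x_new[trough_idx])
```

Note that argrelextrema never reports the first or last sample as an extremum, so a maximum sitting at the very end of the series is only found if the fitted curve turns downward before the boundary.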
Furthermore, the number of peaks I have is likely to change. In fact my ultimate goal here is to categorize the datasets I have into the number of peaks I can detect. There are also cases where the signal has more noise.
I have looked into scipy.signal.find_peaks (and related algorithms), but it finds every peak and not just the major ones, particularly in noisier data. I have also looked into Savitzky-Golay (savgol) and Gaussian filters and am able to get a result, but I often have to specify the polynomial order etc., and the right value is likely to change with the number of peaks.
Is there a way to smooth a signal to get an approximation of the number of peaks without having to manually specify polynomial orders etc? Is there an algorithm/method available that can detect general trends without too much user input?
I'm also open to alternative methods if there is a better method than curve fitting. I fear that the result I get out will only be as good as what I put in, and so any general curve fitting approaches will deliver poorer results.
Solution 1:
Try the findpeaks library. It contains various methods for finding peaks and valleys in 1D vectors and 2D-arrays (or images).
pip install findpeaks
from findpeaks import findpeaks
X = [-19.9, -19.6, -17.6, -15.9, -19.9, -18.4, -17.7, -16.6, -19.5, -20.4, -17.6, -15.9]
# Initialize
fp = findpeaks(lookahead=1)
# Make the fit
results1 = fp.fit(X)
results1['df']
# x y labx valley peak labx_topology valley_topology peak_topology persistence
# 0 0 -19.9 1.0 True False 1.0 True False
# 1 1 -19.6 1.0 False False 1.0 False False
# 2 2 -17.6 1.0 False False 1.0 False False
# 3 3 -15.9 1.0 False True 1.0 False True
# 4 4 -19.9 1.0 False False 2.0 True False
# 5 5 -18.4 2.0 True False 2.0 False False
# 6 6 -17.7 2.0 False False 2.0 False False
# 7 7 -16.6 2.0 False True 2.0 False True
# 8 8 -19.5 2.0 False False 2.0 False False
# 9 9 -20.4 3.0 True False 2.0 False False
# 10 10 -17.6 3.0 False False 2.0 False False
# 11 11 -15.9 3.0 True False 2.0 True False
# Make plot
fp.plot()
# Initialize
fp = findpeaks(lookahead=1, interpolate=10)
# Make the fit
results2 = fp.fit(X)
# Results of the interpolated fit
results2['df']
# Make plot
fp.plot()
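As a possible complement to the findpeaks approach, the scipy.signal.find_peaks function mentioned in the question accepts a prominence argument that can discard minor peaks without any curve fitting. The sketch below is only an illustration: the helper name major_peaks and the threshold of 1.0 are assumptions made for this example, and the threshold would need tuning to the noise level of the real data. Padding both ends with the global minimum is one way to let a maximum on the first or last sample count as a peak.

```python
import numpy as np
from scipy.signal import find_peaks

Y = np.array([-19.9, -19.6, -17.6, -15.9, -19.9, -18.4,
              -17.7, -16.6, -19.5, -20.4, -17.6, -15.9])

def major_peaks(y, min_prominence=1.0):
    """Return indices of peaks with prominence >= min_prominence.

    Hypothetical helper; min_prominence=1.0 is an assumed threshold.
    """
    y = np.asarray(y, dtype=float)
    # find_peaks ignores the first and last sample, so pad both ends
    # with the global minimum to allow peaks at the boundaries.
    padded = np.concatenate(([y.min()], y, [y.min()]))
    idx, _ = find_peaks(padded, prominence=min_prominence)
    return idx - 1  # shift back to indices of the unpadded signal

peaks = major_peaks(Y)
print(peaks)       # [ 3  7 11]
print(len(peaks))  # 3
```

Because the only input is a prominence threshold in the units of the signal, the same call can be applied across many datasets, and len(peaks) can serve as the category label.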
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow




