How to fit a sine curve to a small dataset

I have been struggling to fit a sine function to a small dataset that resembles a sinusoid. I've looked at many other questions and tried different libraries, and I can't find any glaring mistake in my code. Also, in many answers people fit a function to data where y = f(x), but I obtain my two lists independently from stellar spectra.

These are the lists for reference:

time = np.array([2454294.5084288 , 2454298.37039515, 2454298.6022165 ,
   2454299.34790096, 2454299.60750029, 2454300.35176022,
   2454300.61361622, 2454301.36130122, 2454301.57111912,
   2454301.57540159, 2454301.57978822, 2454301.5842906 ,
   2454301.58873511, 2454302.38635047, 2454302.59553152,
   2454303.41548415, 2454303.56765036, 2454303.61479213,
   2454304.38528718, 2454305.54043812, 2454306.36761011,
   2454306.58025083, 2454306.60772791, 2454307.36686591,
   2454307.49460991, 2454307.58258509, 2454308.3698358 ,
   2454308.59468672, 2454309.40004997, 2454309.51208756,
   2454310.43078368, 2454310.6091061 , 2454311.40121502,
   2454311.5702085 , 2454312.39758274, 2454312.54580053,
   2454313.52984047, 2454313.61734047, 2454314.37609003,
   2454315.56721061, 2454316.39218499, 2454316.5672538 ,
   2454317.49410168, 2454317.6280825 , 2454318.32944441,
   2454318.56913047])
velocities = np.array([-2.08468951, -2.26117398, -2.44703149, -2.10149768, -2.09835213,
   -2.20540079, -2.4221183 , -2.1394637 , -2.0841663 , -2.2458154 ,
   -2.06177386, -2.47993416, -2.13462117, -2.26602791, -2.47359571,
   -2.19834895, -2.17976339, -2.37745005, -2.48849617, -2.15875901,
   -2.27674409, -2.39054554, -2.34029665, -2.09267843, -2.20338104,
   -2.49483926, -2.08860222, -2.26816951, -2.08516229, -2.34925637,
   -2.09381667, -2.21849357, -2.43438148, -2.28439031, -2.43506056,
   -2.16953358, -2.24405359, -2.10093237, -2.33155007, -2.37739938,
   -2.42468714, -2.19635302, -2.368558  , -2.45959665, -2.13392004,
   -2.25268181])

These are radial velocities of a star observed at different times. When plotted they look like this:

[Figure: plotted data]

This is then the code I'm using to fit a test sine on the data:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

x = time
y = velocities

def sin_fit(x, A, w):
    return A * np.sin(w * x)

popt, pcov = curve_fit(sin_fit,x,y) #try to calculate exoplanet parameters with these data

xfit = np.arange(min(x),max(x),0.1)

fit = sin_fit(xfit,*popt)


mod = plt.figure()
plt.xlabel("Time (G. Days)")
plt.ylabel("Radial Velocity")
plt.scatter(x,y,color="b",label="Data")
plt.plot(x,y,color="b",alpha=0.2)
plt.plot(xfit,fit,color="r",label="Model Fit")
plt.legend()
mod.savefig("Data with sin fit.png")
plt.show()

I thought this was right, and it seems right by looking at other answers, but then this is what I get:

[Figure: data with model sine]

What am I doing wrong?

Thank you in advance.



Solution 1:[1]

I guess the problem is that the sin_fit function cannot fit the data at all. By default the sine function oscillates around y=0, while your data oscillates around roughly y=-2.3.

I tried your code and extended sin_fit with an offset, which yields much better results (although still far from perfect):

def sin_fit(x, A, w, offset):
    return A * np.sin(w * x)  + offset

With this, the function at least has a chance to fit.
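If the offset alone is not enough, here is a rough sketch of how the idea could be taken further. It is not part of the original answer and rests on several assumptions: a phase term phi is added, the time axis is shifted to start at 0 (the raw values are around 2.45 million, which makes w * x very sensitive to w), and the starting guess for curve_fit comes from a simple scan over trial frequencies. The helper fit_sine, the grid w_grid and the demo data are hypothetical names made up for illustration; the demo uses synthetic data, not the radial velocities from the question.

```python
import numpy as np
from scipy.optimize import curve_fit


def sin_fit(t, A, w, phi, offset):
    # Sine with amplitude, angular frequency, phase and vertical offset.
    return A * np.sin(w * t + phi) + offset


def fit_sine(time, values, w_grid):
    # Hypothetical helper: shift the time axis so it starts at 0.
    t = time - time.min()

    # For a fixed trial frequency the model a*sin + b*cos + c is linear,
    # so solve it by least squares and keep the trial with the lowest error.
    best = None
    for w0 in w_grid:
        M = np.column_stack([np.sin(w0 * t), np.cos(w0 * t), np.ones_like(t)])
        coef, *_ = np.linalg.lstsq(M, values, rcond=None)
        sse = np.sum((M @ coef - values) ** 2)
        if best is None or sse < best[0]:
            best = (sse, w0, coef)

    # Convert a*sin + b*cos into A*sin(w*t + phi) for the starting guess.
    _, w0, (a, b, c) = best
    p0 = [np.hypot(a, b), w0, np.arctan2(b, a), c]
    return curve_fit(sin_fit, t, values, p0=p0)


# Synthetic stand-in data (NOT the radial velocities from the question).
rng = np.random.default_rng(0)
demo_time = 2454294.5 + np.sort(rng.uniform(0.0, 24.0, 46))
demo_t = demo_time - demo_time.min()
demo_y = sin_fit(demo_t, 0.2, 2.0, 0.5, -2.3) + rng.normal(0.0, 0.01, 46)

popt, pcov = fit_sine(demo_time, demo_y, np.linspace(0.1, 10.0, 2000))
print(popt)  # fitted A, w, phi, offset
```

For the data in the question, the call would be something like fit_sine(time, velocities, w_grid), with w_grid covering whatever periods are plausible for the star. Which range is plausible is an assumption you have to make yourself, and the fitted phase is relative to the first observation because of the time shift.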

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Ric Hard