'Fourier Transform - strange results

I'm trying to make some example of FFTs. The idea here is to have 3 wavelengths for 3 different musical notes (A, C, E), add them together (to form the aminor chord) and then do an FFT to retrieve the original frequencies.

import numpy as np
import matplotlib.pyplot as plt
import scipy.fft


def generate_sine_wave(freq, sample_rate, duration):
    x = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
    frequencies = x * freq
    # 2pi because np.sin takes radians
    y = np.sin(2 * np.pi * frequencies)
    return x, y


def main():
    # Frequency of note in Aminor chord (A, C, E)
    # note_names = ('A', 'C', 'E')
    # fs = (440, 261.63, 329.63)
    fs = (27.50, 16.35, 20.60)
    # duration, in seconds.
    duration = .5
    # sample rate. determines how many data points the signal uses to represent
    # the sine wave per second. So if the signal had a sample rate of 10 Hz and
    # was a five-second sine wave, then it would have 10 * 5 = 50 data points.
    sample_rate = 1000

    fig, ax = plt.subplots(5)
    all_wavelengths = []
    # Create a linspace, with N samples from 0 to duration
    # x = np.linspace(0.0, T, N)
    for i, f in enumerate(fs):
        x, y = generate_sine_wave(f, sample_rate, duration)
        # y = np.sin(2 * np.pi * F * x)
        all_wavelengths.append(y)
        ax[i].plot(x, y)

    # sum of all notes
    aminor = np.sum(all_wavelengths, axis=0)
    ax[i].plot(x, aminor)

    yf = np.abs(scipy.fft.rfft(aminor))
    xf = scipy.fft.rfftfreq(int(sample_rate * duration), 1 / sample_rate)
    ax[i + 1].plot(xf, yf)
    ax[i + 1].vlines(fs, ymin=0, ymax=(max(yf)), color='purple')

    plt.show()


if __name__ == '__main__':
    main()

However, the FFT plot (last subplot) does not have the proper peak frequencies (highlighted through vertical purple lines). Why is that?

Plot



Solution 1:[1]

The FFT will only recover the contained frequencies exactly if the sampling window covers a multiple of the signal's period. Otherwise, if there is a "remainder", the frequency peaks will deviate from the exact values.

Since your A-minor signal contains three distinct frequencies, 27.50, 16.35, 20.60 Hz, you need a sampling duration which covers a multiple of the period for each of those components. In order to find that duration, you can compute the least common multiple of each of the fractional parts of the frequencies:

>>> import math
>>> math.lcm(50, 35, 60, 100)
2100

Note that we're including 100 here because the multiple also needs to satisfy the condition to sample a whole period. The above result implies that for a duration of 21 seconds, the frequencies will be recovered perfectly. Of course, any other multiple of 21 seconds will work as well. The following plot is obtained for a duration of 21 seconds:

Example FFT spectrum

Solution 2:[2]

I think that - within margin of error - the results do in-fact match your frequencies:

You can see in your frequency plot that the closest frequency in the plot to your actual frequencies do indeed have the highest amplitude.

However, because this is a DFT algorithm, and so the frequencies being returned are discrete, they don't exactly match the frequencies you used to construct your sample.

What you can try is making your sample size (ie the number of time points in your input data) either longer and/or a multiple of your input wavelengths. That should increase the frequency resolution and/or move the sampled output frequencies closer to input frequencies.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 a_guest
Solution 2 Richard