Conversion between librosa.load() and pydub.AudioSegment.raw_data
I'm slightly stuck: I can't work out how to convert the data in pydub.AudioSegment.raw_data into the same form as the np.ndarray returned by librosa.load().
I'm running an FFT on the data, and the result is not accurate when the data comes from pydub.AudioSegment. I need to convert it to the same type and scale that librosa.load() provides, but I can't work out what the difference is.
Here's my code:
import librosa
import numpy as np
import pydub

if __name__ == '__main__':
    filename = "data/E.wav"
    data, sr = librosa.load(filename)
    print("Librosa data: {}".format(data))
    aseg = pydub.AudioSegment.from_file(filename)
    aseg_data = np.frombuffer(aseg.raw_data, dtype=np.float64)
    print("AudioSegment data: {}".format(aseg_data))
And my output is:
Librosa data: [-0.00208831 -0.00306132 -0.00268072 ... 0.00057438 0.00082464 0.00097628]
AudioSegment data: [-7.37285249e+306 -7.37286320e+306 -7.02174591e+306 ... 3.19859460e-308 4.45027928e-308 4.03300579e-308]
Link to audio I'm using - link
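A likely explanation for the huge values above is that raw_data holds signed integer PCM bytes, so reading them as float64 reinterprets the bit patterns instead of converting the samples. The sketch below decodes such bytes into mono float32 values in [-1.0, 1.0), which is the kind of array librosa.load() returns. It assumes little-endian 16-bit or 32-bit signed PCM; the function name pcm_bytes_to_float32 is hypothetical and not part of pydub or librosa.

```python
import numpy as np


def pcm_bytes_to_float32(raw, sample_width, channels):
    """Decode interleaved signed PCM bytes to mono float32 in [-1.0, 1.0).

    Hypothetical helper: assumes little-endian signed 16-bit or 32-bit
    samples, which is an assumption about the input file.
    """
    dtypes = {2: np.dtype("<i2"), 4: np.dtype("<i4")}
    if sample_width not in dtypes:
        raise ValueError("unsupported sample width: {}".format(sample_width))

    # Read the bytes as integers of the correct width (not as float64).
    samples = np.frombuffer(raw, dtype=dtypes[sample_width])

    # De-interleave into one column per channel and convert to float32.
    samples = samples.reshape(-1, channels).astype(np.float32)

    # Scale by the integer full-scale value, e.g. 32768 for 16-bit audio.
    samples /= float(1 << (8 * sample_width - 1))

    # Average the channels to get a mono signal.
    return samples.mean(axis=1)
```

With pydub this would be called as pcm_bytes_to_float32(aseg.raw_data, aseg.sample_width, aseg.channels). Note that librosa.load() resamples to 22050 Hz by default, so for a sample-by-sample comparison either pass sr=None to librosa.load() or call aseg.set_frame_rate(sr) first; small differences may remain if resampling is involved.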
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
