Real-time audio acquisition using PyAudio (problem with CHUNK size)

I am having an issue when I run the code below. The goal is to develop an app that achieves real-time sound acquisition. I have set the CHUNK (frame) size to 320 at a 16 kHz sampling rate, which gives a frame duration of 0.02 s. The issue is that when I record, the result (the content of the variable "many") contains glitches or noise. When I double CHUNK, the problem disappears. The 0.02 s frame duration is dictated by the problem I am trying to solve, so it has to stay at 0.02 s. Do you have any suggestions?

import pyaudio
import struct
import numpy as np
import matplotlib.pyplot as plt
import time
import IPython.display as ipd

CHUNK = int(1*320) 
FORMAT = pyaudio.paFloat32
CHANNELS = 1
RATE = 16000

p = pyaudio.PyAudio()

chosen_device_index = 1
for x in range(0, p.get_device_count()):
    info = p.get_device_info_by_index(x)
    # print(p.get_device_info_by_index(x))
    if info["name"] == "pulse":
        chosen_device_index = info["index"]
        print("Chosen index: ", chosen_device_index)

stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input_device_index=chosen_device_index,
                input=True,
                output=False,
                frames_per_buffer=CHUNK) 

plt.ion()
%matplotlib qt
fig, ax = plt.subplots()

x = np.arange(0, CHUNK)
data = stream.read(CHUNK)

print(len(data))

data_ = struct.unpack(str(CHUNK) + 'f', data)
line, = ax.plot(x, data_)
ax.set_ylim([-1,1])

many = []
while True:
    data = struct.unpack(str(CHUNK) + 'f', stream.read(CHUNK))
    line.set_ydata(data)
    fig.canvas.draw()
    fig.canvas.flush_events()
    many= np.concatenate((many, data),axis=None)

ipd.Audio(many,rate = 16000)


Solution 1:[1]

From the conversation between you and fdcpp, it seems that this piece of code

line.set_ydata(data)
fig.canvas.draw()
fig.canvas.flush_events()
many= np.concatenate((many, data),axis=None)

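takes more than 0.02 s to run. As a result, when the next CHUNK of data arrives, your code is not yet ready to receive it, which causes an input overflow. Doubling CHUNK doubles the time available per read to 0.04 s, which is presumably why the glitches disappear in that case.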
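To check this on your own machine, you can compare the time budget per chunk (CHUNK / RATE) with the measured cost of one loop iteration. The following is only a sketch: `chunk_budget_seconds`, `mean_duration` and `simulated_loop_body` are names made up for illustration, and the 30 ms sleep is an assumed stand-in for the drawing and concatenation step, not a measurement of your code.

```python
import time

CHUNK = 320    # frames per read, from the question
RATE = 16000   # sampling rate in Hz, from the question


def chunk_budget_seconds(chunk=CHUNK, rate=RATE):
    """Time available to handle one chunk before the next one is due."""
    return chunk / rate


def mean_duration(fn, repeats=20):
    """Average wall-clock time of fn() over several calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


def simulated_loop_body():
    """Hypothetical stand-in for the draw + concatenate step (assumed 30 ms)."""
    time.sleep(0.03)


budget = chunk_budget_seconds()
cost = mean_duration(simulated_loop_body, repeats=5)
print(f"budget per chunk: {budget * 1000:.1f} ms, loop body: {cost * 1000:.1f} ms")
print("keeps up" if cost <= budget else "falls behind: expect input overflow")
```

To time the real thing, pass a function that runs your `line.set_ydata(...)`, `fig.canvas.draw()`, `fig.canvas.flush_events()` and `np.concatenate(...)` calls instead of `simulated_loop_body`.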

There are different ways to work around it. But I agree with fdcpp that the best way to solve this problem is to think about your end goal. For example, you can separate receiving the audio data from processing it (i.e., your line and fig code). One process only receives and stores the audio data, while the other process takes the stored data and draws it.
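One possible shape for that separation is sketched below. Note that it uses a background thread and a `queue.Queue` rather than two operating-system processes, which is a simpler variant of the same idea. The names `capture`, `drain`, `run_demo` and `fake_read_chunk` are hypothetical, and `fake_read_chunk` only simulates a microphone so the sketch runs without audio hardware.

```python
import queue
import threading
import time

import numpy as np

CHUNK = 320    # frames per read, from the question
RATE = 16000   # sampling rate in Hz, from the question


def capture(read_chunk, out_queue, stop_event):
    """Producer: only read and enqueue, so a slow plot never delays a read."""
    while not stop_event.is_set():
        raw = read_chunk()
        out_queue.put(np.frombuffer(raw, dtype=np.float32))


def drain(in_queue):
    """Return every chunk currently waiting in the queue, without blocking."""
    chunks = []
    while True:
        try:
            chunks.append(in_queue.get_nowait())
        except queue.Empty:
            return chunks


def fake_read_chunk():
    """Assumed stand-in for stream.read(CHUNK): blocks 20 ms, returns silence."""
    time.sleep(CHUNK / RATE)
    return np.zeros(CHUNK, dtype=np.float32).tobytes()


def run_demo(read_chunk, duration=0.5, draw=None, slow_step=0.05):
    """Consumer: store all chunks, but draw only the newest one per pass."""
    q = queue.Queue()
    stop = threading.Event()
    worker = threading.Thread(target=capture, args=(read_chunk, q, stop), daemon=True)
    worker.start()

    recorded = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        new_chunks = drain(q)
        if new_chunks:
            recorded.extend(new_chunks)   # nothing is dropped from the recording
            if draw is not None:
                draw(new_chunks[-1])      # the plot may skip intermediate chunks
        time.sleep(slow_step)             # simulates drawing slower than 20 ms

    stop.set()
    worker.join()
    recorded.extend(drain(q))             # collect chunks read while shutting down
    if not recorded:
        return np.zeros(0, dtype=np.float32)
    return np.concatenate(recorded)


many = run_demo(fake_read_chunk, duration=0.5)
print(f"captured {len(many) / RATE:.2f} s of audio in {len(many) // CHUNK} chunks")
```

With a real stream, you would pass something like `lambda: stream.read(CHUNK)` as `read_chunk`, and a function that calls `line.set_ydata(...)` and redraws the figure as `draw`. Chunks are collected in a list and concatenated once at the end, which avoids the growing cost of calling `np.concatenate` on every iteration.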

But please keep in mind that as long as the drawing part takes more than 0.02 s, you cannot achieve "real-time" as you wanted.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Chang