Can I do recognition from a numpy array in Python SpeechRecognition?
I record audio into a numpy array dt and then write it to a .wav file with code like this:
dt = np.int16(dt/np.max(np.abs(dt)) * 32767)
scipy.io.wavfile.write("tmp.wav", samplerate, dt)
After that I read the file back and run recognition with this code:
import speech_recognition as sr
r = sr.Recognizer()
with sr.AudioFile("tmp.wav") as source:
    audio_text = r.listen(source)
return r.recognize_google(audio_text, language=lang)
Can I run recognition directly on the numpy array, without going through a .wav file? Writing the file and reading it back takes extra time.
Solution 1:[1]
Assuming you are using the SpeechRecognition package, and according to its documentation, you can pass a file-like object to AudioFile() instead of a path. File-like objects are objects that support read and write operations.
You should be able to put the byte representation of the WAV file into an io.BytesIO object, which supports these operations, and pass that to the speech recognition module. scipy.io.wavfile.write() supports writing to such file-like objects.
I don't have the package or any WAV files to test it, but let me know if something like this works:
import io
wav_bytes = io.BytesIO()
scipy.io.wavfile.write(wav_bytes, samplerate, dt)
wav_bytes.seek(0)  # make sure AudioFile reads from the start of the buffer
with sr.AudioFile(wav_bytes) as source:
    ...
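For reference, here is a minimal sketch of the same in-memory idea that builds the WAV with the standard-library wave module instead of scipy (a substitution made for this sketch, not part of the answer above). The helper name array_to_wav_buffer is hypothetical, and it assumes a 1-D mono array that already holds int16 values, like dt in the question.

```python
import io
import wave

import numpy as np


def array_to_wav_buffer(samples, sample_rate):
    """Return a rewound BytesIO holding a 16-bit mono WAV (hypothetical helper)."""
    # Assumption: samples already holds int16 values, as dt does in the question.
    pcm = np.asarray(samples, dtype="<i2")
    if pcm.ndim != 1:
        raise ValueError("expected a 1-D (mono) array")

    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)          # mono
        wav_file.setsampwidth(2)          # 2 bytes per int16 sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())

    buffer.seek(0)  # rewind so the next reader starts at the WAV header
    return buffer
```

The returned buffer can then be passed to sr.AudioFile(...) in place of "tmp.wav".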
Solution 2:[2]
You can build an AudioData object yourself. This is the input that the recognizer's recognize_* methods expect, so no AudioFile is needed:
import io
from scipy.io.wavfile import write
import speech_recognition
byte_io = io.BytesIO()
write(byte_io, sr, audio_array)  # write the WAV into memory instead of to disk
result_bytes = byte_io.read()
audio_data = speech_recognition.AudioData(result_bytes, sr, 2)  # 2 = bytes per int16 sample
r = speech_recognition.Recognizer()
text = r.recognize_google(audio_data)
audio_array is a 1-D numpy.ndarray with int16 values and sr is the sampling rate.
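Since AudioData takes raw PCM frames rather than a WAV container, it may be possible to skip the WAV step entirely and hand over the array's bytes. A minimal sketch, assuming a 1-D mono array and using the same normalization as in the question; the function names to_pcm16_bytes and recognize_array are hypothetical, and the recognition call itself is untested here:

```python
import numpy as np


def to_pcm16_bytes(samples):
    """Scale a 1-D array to the int16 range and return little-endian PCM bytes."""
    samples = np.asarray(samples, dtype=np.float64)
    peak = np.max(np.abs(samples)) if samples.size else 0.0
    if peak > 0:
        samples = samples / peak  # normalize to [-1, 1], as in the question
    return (samples * 32767).astype("<i2").tobytes()


def recognize_array(samples, sample_rate, language="en-US"):
    """Run Google recognition on an array without creating a WAV (sketch)."""
    # Imported here so to_pcm16_bytes can be used without the package installed.
    import speech_recognition as sr

    # Arguments: raw frame data, sample rate, sample width in bytes (2 for int16).
    audio = sr.AudioData(to_pcm16_bytes(samples), sample_rate, 2)
    return sr.Recognizer().recognize_google(audio, language=language)
```

Usage would look like text = recognize_array(dt, samplerate, language=lang), where dt, samplerate and lang are the variables from the question.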
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | anroesti |
| Solution 2 | H_Barrio |
