Code that splits an audio file by channel and then splits it on silence

I have an audio file of a call in which each person is recorded on a separate channel, e.g. the first person is talking on the left channel and the second person is talking on the right channel. What I am trying to do is convert the audio file to conversation-style text, e.g. person1: hey, person2: hey, person1: how are you doing, person2: I'm good, thank you. What I am struggling with is how to split the audio file by channel and on silence at the same time. Here is my code for splitting on silence; I don't know how to add the channel splitting to it. Code:

import os

import speech_recognition as sr
from pydub import AudioSegment
from pydub.silence import split_on_silence

r = sr.Recognizer()


def get_large_audio_transcription(path):
    """
    Split the large audio file into chunks
    and apply speech recognition to each of these chunks.
    """
    # open the audio file using pydub
    sound = AudioSegment.from_wav(path)
    # split the audio where silence lasts 400 milliseconds or more and get chunks
    chunks = split_on_silence(
        sound,
        # experiment with this value for your target audio file
        min_silence_len=400,
        # adjust this per requirement
        silence_thresh=sound.dBFS - 14,
        # keep 400 ms of silence around each chunk, adjustable as well
        keep_silence=400,
    )
    folder_name = "audio-chunks"
    # create a directory to store the audio chunks
    if not os.path.isdir(folder_name):
        os.mkdir(folder_name)
    whole_text = ""
    # process each chunk
    for i, audio_chunk in enumerate(chunks, start=1):
        # export the audio chunk and save it in
        # the `folder_name` directory
        chunk_filename = os.path.join(folder_name, f"chunk{i}.wav")
        audio_chunk.export(chunk_filename, format="wav")
        # recognize the chunk
        with sr.AudioFile(chunk_filename) as source:
            audio_listened = r.record(source)
            # try converting it to text
            try:
                text = r.recognize_google(audio_listened, language='tr-TR')
            except sr.UnknownValueError as e:
                print("Error:", str(e))
            else:
                text = f"{text.capitalize()}. "
                print(chunk_filename, ":", text)
                whole_text += text
    # return the text for all chunks detected
    return whole_text


get_large_audio_transcription('audio.wav')
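
One possible way to combine the two steps, as a minimal sketch rather than a tested solution: split the stereo file into mono channels with pydub's `split_to_mono()`, find the non-silent spans in each channel with `detect_nonsilent()` (which returns start/end times in milliseconds instead of audio chunks), and then sort the spans from both channels by start time to get the conversation order. The helper names `channel_segments` and `conversation_order` and the labels `person1`/`person2` are made up for illustration, and the silence parameters are simply copied from the code above, so they may need tuning.

```python
from typing import List, Tuple

# One spoken span: (start_ms, end_ms, speaker_label)
Segment = Tuple[int, int, str]


def channel_segments(path: str, labels=("person1", "person2")) -> List[Segment]:
    """Split a stereo WAV by channel, then find the non-silent spans in each."""
    # Imported here so conversation_order() below can be tried without pydub.
    from pydub import AudioSegment
    from pydub.silence import detect_nonsilent

    sound = AudioSegment.from_wav(path)
    segments: List[Segment] = []
    # split_to_mono() returns one mono AudioSegment per channel, left first.
    for label, channel in zip(labels, sound.split_to_mono()):
        spans = detect_nonsilent(
            channel,
            min_silence_len=400,               # assumption: same values as above
            silence_thresh=channel.dBFS - 14,  # threshold relative to this channel
        )
        segments.extend((start, end, label) for start, end in spans)
    return segments


def conversation_order(segments: List[Segment]) -> List[Segment]:
    """Interleave the spans of all speakers by sorting on start time."""
    return sorted(segments, key=lambda seg: seg[0])


# Hypothetical timestamps, only to show the ordering step.
example = [(0, 900, "person1"), (2500, 4000, "person1"), (1000, 1800, "person2")]
for start, end, who in conversation_order(example):
    print(f"{who}: {start}-{end} ms")
```

Each `(start, end)` span could then be cut from its mono channel with slicing (`channel[start:end]`), exported, and passed to `r.recognize_google` in the same way as the chunks above, with the speaker label prepended to the recognized text.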

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
