Google Speech-to-Text API - incorrect end timestamp

I was trying to analyze how Google's Speech-to-Text performs for a web-oriented service. I kept two parameters in mind when testing the speed.

  1. How long it takes for the response to be sent (time taken).
  2. How long Google takes to process the audio (callback time).

For the time taken, I use Python's time library to time the response.

    import time

    start_time = time.time()
    # Step 3. Transcribe the RecognitionAudio object
    response_standard = speech_client.recognize(
        config=config_mp3,
        audio=audio_mp3,
    )
    # Store the response, the elapsed wall-clock time, and the file name
    results_list.append([response_standard, time.time() - start_time, file])
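
The timing pattern above can be wrapped in a small helper. This is only a sketch: `timed_call` and `fake_recognize` are hypothetical names, and `fake_recognize` is a stand-in for the real `speech_client.recognize` call so that the example runs without credentials. It uses `time.perf_counter()`, which is generally a better fit than `time.time()` for measuring short intervals.

```python
import time


def timed_call(func, *args, **kwargs):
    """Call func and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def fake_recognize(config, audio):
    """Hypothetical stand-in for speech_client.recognize (not the real API)."""
    time.sleep(0.05)  # simulate the network round trip and processing
    return {"transcript": "hello"}


response, seconds = timed_call(fake_recognize, config=None, audio=b"")
print(f"request took {seconds:.3f} s")
```

With the real client, the call would presumably look like `timed_call(speech_client.recognize, config=config_mp3, audio=audio_mp3)`.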
        

The time taken to transcribe comes back from the server in the API response.

The issue I'm facing is that the time taken is less than the time it takes to transcribe the audio to text. I'm not sure why this is happening, as the response is only appended after it has been successfully retrieved from the server.

EDIT: I suspect the end time returned by the speech_client.recognize() method is the length of time for which audio was present in the file. I'm not sure why it returns this, as the docs don't specify it.
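
If that suspicion is right, the end timestamp is an offset into the audio rather than a processing time, so it can exceed the measured round-trip time whenever the service works faster than real time. The sketch below compares the two quantities. `offset_seconds` and `real_time_factor` are hypothetical helpers, the numbers are made up, and the offset is assumed to arrive either as a `datetime.timedelta` or as an object with `seconds` and `nanos` attributes, depending on the client library version.

```python
from datetime import timedelta
from types import SimpleNamespace


def offset_seconds(offset):
    """Convert an audio time offset to float seconds.

    Handles a datetime.timedelta, or any object exposing `seconds` and
    `nanos` attributes (the shape of a protobuf Duration).
    """
    if isinstance(offset, timedelta):
        return offset.total_seconds()
    return offset.seconds + offset.nanos / 1e9


def real_time_factor(elapsed_seconds, audio_seconds):
    """Wall-clock time divided by audio length; below 1.0 is faster than real time."""
    return elapsed_seconds / audio_seconds


# Made-up values for illustration only: a 10.5 s clip answered in 2.1 s.
audio_end = offset_seconds(timedelta(seconds=10, milliseconds=500))
elapsed = 2.1
factor = real_time_factor(elapsed, audio_end)
print(f"audio offset: {audio_end:.1f} s, elapsed: {elapsed:.1f} s, factor: {factor:.2f}")

# Stand-in for a Duration-style offset (assumed shape, not a real API object).
legacy_end = offset_seconds(SimpleNamespace(seconds=3, nanos=500_000_000))
```

A factor below 1.0 here would mean the elapsed time is shorter than the audio itself, which matches the observation in the question.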



Solution 1:[1]

This is expected behavior if you are using synchronous audio recognition, because the result is received only after all the audio has been sent and processed. If you were using asynchronous audio recognition, this would not happen, because the result is received while the audio is still being sent.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Jose Gutierrez Paliza