com.microsoft.cognitiveservices.speech.SpeechSynthesizer fails with "USP error: timeout waiting for the first audio chunk"

I'm trying to implement Microsoft TTS on Android, following the official docs.

My code looks like this:

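import android.content.Context
import android.util.Log
import com.microsoft.cognitiveservices.speech.SpeechConfig
import com.microsoft.cognitiveservices.speech.SpeechSynthesisCancellationDetails
import com.microsoft.cognitiveservices.speech.SpeechSynthesizer
import com.microsoft.cognitiveservices.speech.audio.AudioConfig

// TAG and SPEECH_SUBSCRIPTION_KEY are constants defined elsewhere in the project.
class TextToSpeech(val context: Context) {
  private val speechConfig: SpeechConfig = SpeechConfig.fromSubscription(SPEECH_SUBSCRIPTION_KEY, "southeastasia")
  private val speechSynthesizer: SpeechSynthesizer

  init {
    speechConfig.speechSynthesisLanguage = "fa-IR"
    speechConfig.speechSynthesisVoiceName = "fa-IR-DilaraNeural"
    speechConfig.enableAudioLogging()
    val audioConfig = AudioConfig.fromDefaultSpeakerOutput()
    speechSynthesizer = SpeechSynthesizer(speechConfig, audioConfig)

    // Register the listeners once here; adding them inside speak() would
    // attach another copy on every call.
    speechSynthesizer.SynthesisStarted.addEventListener { _, _ ->
      Log.d(TAG, "speak: SynthesisStarted")
    }
    speechSynthesizer.Synthesizing.addEventListener { _, _ ->
      Log.d(TAG, "speak: Synthesizing")
    }
    speechSynthesizer.SynthesisCompleted.addEventListener { _, _ ->
      Log.d(TAG, "speak: SynthesisCompleted")
    }
    speechSynthesizer.SynthesisCanceled.addEventListener { _, e ->
      // Log the reason, error code and details so the failure is visible in Logcat.
      val details = SpeechSynthesisCancellationDetails.fromResult(e.result)
      Log.d(TAG, "speak: SynthesisCanceled reason=${details.reason} " +
        "errorCode=${details.errorCode} errorDetails=${details.errorDetails}")
    }
  }

  fun speak(text: String) {
    // Blocks until synthesis finishes or is canceled.
    speechSynthesizer.SpeakText(text)
  }

}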

The problem is that when I call the speak method, "SynthesisStarted" is triggered, and then after a few seconds "SynthesisCanceled" is triggered with the following result:

CancellationReason:Error ErrorCode: ServiceTimeout ErrorDetails:USP error: timeout waiting for the first audio chunk



Solution 1:[1]

This is a WebSocket connection problem. First, check how long you want to keep the connection alive even when no input is being recognized.

If you authenticate with an access token, send it to the service in the Authorization: Bearer header. Each access token is valid for 10 minutes. You can get a new token at any time; however, to minimize network traffic and latency, we recommend using the same token for nine minutes.
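
The nine-minute reuse rule could be sketched in Kotlin roughly like this. CachedToken, fetchToken and nowMillis are hypothetical names made up for illustration, not part of the Speech SDK; fetchToken stands in for whatever network call you use to request a token.

```kotlin
// Hypothetical helper (not part of the Speech SDK): keeps one access token
// and asks for a new one only after it has been in use for nine minutes.
class CachedToken(
    // Assumption: the caller supplies the actual call that requests a token.
    private val fetchToken: () -> String,
    // Injectable clock so the expiry logic can be exercised without waiting.
    private val nowMillis: () -> Long = { System.currentTimeMillis() },
    private val maxAgeMillis: Long = 9 * 60 * 1000L
) {
    private var token: String? = null
    private var fetchedAt: Long = 0L

    @Synchronized
    fun get(): String {
        val now = nowMillis()
        val current = token
        // Reuse the cached token while it is younger than maxAgeMillis.
        if (current != null && now - fetchedAt < maxAgeMillis) return current
        val fresh = fetchToken()
        token = fresh
        fetchedAt = now
        return fresh
    }
}
```

Call get() right before each request; most calls then return the cached value without touching the network.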

The following is an example request for the short audio scenario.

 POST /cognitiveservices/v1 HTTP/1.1
 Authorization: Bearer YOUR_ACCESS_TOKEN
 Host: westus.tts.speech.microsoft.com
 Content-type: application/ssml+xml
 Content-Length: 199
 Connection: Keep-Alive
    
 // Message body here...
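
For reference, a request like the one above could be sent from Kotlin roughly as follows. This is only a sketch under some assumptions: the host is the regional text-to-speech endpoint (region.tts.speech.microsoft.com), the X-Microsoft-OutputFormat value and the User-Agent string are picked for illustration, and escapeXml, buildSsml and synthesize are helper names invented here.

```kotlin
import java.net.HttpURLConnection
import java.net.URL

// Escape XML special characters so the text cannot break the SSML document.
fun escapeXml(s: String): String =
    s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
        .replace("\"", "&quot;").replace("'", "&apos;")

// Build a minimal SSML body for one voice.
fun buildSsml(text: String, lang: String, voice: String): String =
    "<speak version='1.0' xml:lang='$lang'>" +
        "<voice xml:lang='$lang' name='$voice'>${escapeXml(text)}</voice></speak>"

// POST the SSML and return the audio bytes.
// Assumption: region is the region of your Speech resource, e.g. "southeastasia".
fun synthesize(accessToken: String, region: String, ssml: String): ByteArray {
    val conn = URL("https://$region.tts.speech.microsoft.com/cognitiveservices/v1")
        .openConnection() as HttpURLConnection
    try {
        conn.requestMethod = "POST"
        conn.connectTimeout = 10_000
        conn.readTimeout = 30_000
        conn.doOutput = true
        conn.setRequestProperty("Authorization", "Bearer $accessToken")
        conn.setRequestProperty("Content-Type", "application/ssml+xml")
        // Assumption: one of the service's RIFF/PCM output formats.
        conn.setRequestProperty("X-Microsoft-OutputFormat", "riff-24khz-16bit-mono-pcm")
        conn.setRequestProperty("User-Agent", "tts-sketch")
        conn.outputStream.use { it.write(ssml.toByteArray(Charsets.UTF_8)) }
        check(conn.responseCode == 200) { "TTS request failed: HTTP ${conn.responseCode}" }
        return conn.inputStream.use { it.readBytes() }
    } finally {
        conn.disconnect()
    }
}
```

On Android, call synthesize off the main thread (a coroutine on Dispatchers.IO, for example), since network I/O on the main thread throws NetworkOnMainThreadException.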

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 SairamTadepalli-MT