How to get a proper PCM stream from Microsoft.CognitiveServices.Speech in C#
I am trying to use Azure TTS with Discord, but I can't get the audio stream from Azure TTS into Discord. I use Discord.Net (https://discordnet.dev/guides/voice/sending-voice.html).
public static async Task<MemoryStream> GetTTSStream(string text)
{
    var config = SpeechConfig.FromSubscription("", "");
    using SpeechSynthesizer synthesizer = new(config, null);
    SpeechSynthesisResult result = await synthesizer.SpeakTextAsync(text).ConfigureAwait(false);
    if (result.Reason == ResultReason.SynthesizingAudioCompleted)
    {
        var audioStream = AudioDataStream.FromResult(result);
        var buffer = result.AudioData;
        return new MemoryStream(buffer);
    }
    else if (result.Reason == ResultReason.Canceled)
    {
        var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
        StringBuilder sb = new StringBuilder();
        sb.AppendLine($"CANCELED: Reason={cancellation.Reason}");
        sb.AppendLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
        sb.AppendLine($"CANCELED: ErrorDetails=[{cancellation.ErrorDetails}]");
        Logger.Warning(sb.ToString());
    }
    return null;
}
Solution 1:[1]
As suggested by ramr-msft (Microsoft Docs), you can try the speech synthesis sample that pulls audio from an output stream:
// Speech synthesis to pull audio output stream.
public static async Task SynthesisToPullAudioOutputStreamAsync()
{
    // Creates an instance of a speech config with specified subscription key and service region.
    // Replace with your own subscription key and service region (e.g., "westus").
    var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

    // Creates an audio out stream.
    using (var stream = AudioOutputStream.CreatePullStream())
    {
        // Creates a speech synthesizer using audio stream output.
        using (var streamConfig = AudioConfig.FromStreamOutput(stream))
        using (var synthesizer = new SpeechSynthesizer(config, streamConfig))
        {
            while (true)
            {
                // Receives a text from console input and synthesizes it to the pull audio output stream.
                Console.WriteLine("Enter some text that you want to synthesize, or enter empty text to exit.");
                Console.Write("> ");
                string text = Console.ReadLine();
                if (string.IsNullOrEmpty(text))
                {
                    break;
                }

                using (var result = await synthesizer.SpeakTextAsync(text))
                {
                    if (result.Reason == ResultReason.SynthesizingAudioCompleted)
                    {
                        Console.WriteLine($"Speech synthesized for text [{text}], and the audio was written to output stream.");
                    }
                    else if (result.Reason == ResultReason.Canceled)
                    {
                        var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
                        Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");

                        if (cancellation.Reason == CancellationReason.Error)
                        {
                            Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
                            Console.WriteLine($"CANCELED: ErrorDetails=[{cancellation.ErrorDetails}]");
                            Console.WriteLine($"CANCELED: Did you update the subscription info?");
                        }
                    }
                }
            }
        }

        // Reads (pulls) data from the stream once the synthesizer has been disposed.
        byte[] buffer = new byte[32000];
        uint filledSize = 0;
        uint totalSize = 0;
        while ((filledSize = stream.Read(buffer)) > 0)
        {
            Console.WriteLine($"{filledSize} bytes received.");
            totalSize += filledSize;
        }
        Console.WriteLine($"Totally {totalSize} bytes received.");
    }
}
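The sample above only counts the bytes it pulls, so it may help to sketch how the audio could be shaped for Discord. The Discord.Net guide linked in the question feeds its PCM stream with 48 kHz, 16-bit, stereo audio, whereas the Speech SDK's default output is, as far as I know, 16 kHz mono wrapped in a WAV (RIFF) header. The following is a minimal sketch of one way to bridge that gap, assuming your SDK version provides SpeechSynthesisOutputFormat.Raw48Khz16BitMonoPcm. The names DiscordTts, GetDiscordPcmAsync and MonoToStereo, as well as the key and region parameters, are hypothetical and not part of either library.

```csharp
using System.IO;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

public static class DiscordTts
{
    // Hypothetical helper: synthesizes text and returns 48 kHz, 16-bit, stereo PCM,
    // or null if synthesis did not complete.
    public static async Task<MemoryStream> GetDiscordPcmAsync(string text, string key, string region)
    {
        var config = SpeechConfig.FromSubscription(key, region);

        // Request headerless 48 kHz mono PCM instead of the default WAV (RIFF) output.
        config.SetSpeechSynthesisOutputFormat(SpeechSynthesisOutputFormat.Raw48Khz16BitMonoPcm);

        // A null AudioConfig keeps the audio in memory rather than playing it on a speaker.
        // The cast avoids any ambiguity between constructor overloads.
        using var synthesizer = new SpeechSynthesizer(config, (AudioConfig)null);
        using var result = await synthesizer.SpeakTextAsync(text).ConfigureAwait(false);

        if (result.Reason != ResultReason.SynthesizingAudioCompleted)
        {
            return null;
        }

        return new MemoryStream(MonoToStereo(result.AudioData));
    }

    // Duplicates every 16-bit mono sample into a left and a right sample.
    public static byte[] MonoToStereo(byte[] mono)
    {
        int sampleCount = mono.Length / 2; // a trailing odd byte, if any, is ignored
        var stereo = new byte[sampleCount * 4];
        for (int i = 0; i < sampleCount; i++)
        {
            byte lo = mono[2 * i];
            byte hi = mono[2 * i + 1];
            stereo[4 * i] = lo;     // left channel
            stereo[4 * i + 1] = hi;
            stereo[4 * i + 2] = lo; // right channel
            stereo[4 * i + 3] = hi;
        }
        return stereo;
    }
}
```

The returned stream could then be copied into the stream obtained from CreatePCMStream(AudioApplication.Mixed) and flushed, as the Discord.Net guide does with its ffmpeg output. If the audio still plays at the wrong speed or pitch, the output format and channel count are the first things to check.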
References: "How to get PCM stream from Azure's SpeechSynthesizer" (Microsoft Q&A) and speech_synthesis_samples.cs in the Azure-Samples/cognitive-services-speech-sdk repository, master branch (GitHub).
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | MadhurajVadde-MT |
