Over here at Microsoft in the Netherlands we have a limited number of spots for employees to park their car right below the building. In collaboration with Schiphol and SkiData, we received access to the actual counter for our garage, which resulted in a Windows Phone app employees can use to check how many spots are left and decide whether or not to go straight to the alternate parking lot. With the release of Windows Phone 8.1, I decided to update the app with a Geofence trigger that automatically shows a toast notification when nearing the office (which resulted in my previous post: Trigger a Background Task using Geofence on Windows Phone 8.1).
One of the existing features in the app is calling out the actual count using speech synthesis on a refresh. Previously, we had the Microsoft.Xna.Framework.Audio.SoundEffect.Play method to play an audio fragment without interrupting the audio/music that was already playing, and the Windows.Phone.Speech.Synthesis.SpeechSynthesizer.SpeakTextAsync method to synthesize speech directly from a string. Now that we've moved to the Windows Runtime, things are a bit different, and playing audio in a Universal App without interrupting the stream requires some additional thought.
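For comparison, here's a minimal sketch of that earlier Windows Phone 8 (Silverlight) approach; the WAV path is just a placeholder and the snippet assumes it runs inside an async method:

// Play a WAV through XNA; sound effects are mixed with whatever music is already playing.
Microsoft.Xna.Framework.FrameworkDispatcher.Update(); // required before using XNA audio from a Silverlight app
using (var wavStream = Microsoft.Xna.Framework.TitleContainer.OpenStream("HelloWorld.wav"))
{
    Microsoft.Xna.Framework.Audio.SoundEffect.FromStream(wavStream).Play();
}

// Speak a string directly; no MediaElement involved, so the music keeps playing.
var oldSynth = new Windows.Phone.Speech.Synthesis.SpeechSynthesizer();
await oldSynth.SpeakTextAsync("Hello World");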
New Windows.Media.SpeechSynthesis API
Now that Windows Phone also has access to the Windows Runtime APIs, first found on Windows 8, we can use the new Windows.Media.SpeechSynthesis namespace to achieve text-to-speech (TTS). As shown in the example on MSDN, we can easily generate a SpeechSynthesisStream from a string and then play the stream using a MediaElement:
// The media object for controlling and playing audio.
MediaElement mediaElement = this.media;

// The object for controlling the speech synthesis engine (voice).
var synth = new Windows.Media.SpeechSynthesis.SpeechSynthesizer();

// Generate the audio stream from plain text.
SpeechSynthesisStream stream = await synth.SynthesizeTextToStreamAsync("Hello World");

// Send the stream to the media object.
mediaElement.SetSource(stream, stream.ContentType);
mediaElement.Play();
The challenge is that using a MediaElement stops whatever audio/music is currently playing on the device in order to play our own audio. In this case we don't want that: it's just a simple number being called out, so we have to find an alternative way of playing the SpeechSynthesisStream that the SpeechSynthesizer produces.
Enter DirectX and XAudio2
After some research, it was evident that the DirectX XAudio2 APIs would provide the functionality I needed. Now, if you know me at all (professionally), you probably know that I'm a "managed code" kind of guy and my C++ skills are weak, to say the least. Luckily, I have some great colleagues in Sweden (Anders Thun and Simon Jäger) who helped out tremendously in creating a Windows Runtime Component that can be used in Universal Apps.
An alternative could be SharpDX, which provides managed APIs for DirectX, but at the time I tested it, it didn't provide the functionality/APIs I required in my Windows Runtime scenario.
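To give an idea of the shape of the XAudio2 approach, here's a rough managed sketch using SharpDX's XAudio2 wrapper as it works on desktop (the WAV path is a placeholder and this is for illustration only, not the code of the component below): a source voice submits a buffer to the XAudio2 mixer instead of taking over media playback the way a MediaElement does.

using SharpDX.Multimedia;
using SharpDX.XAudio2;

// Engine and mastering voice (the output of the mixer).
var xaudio2 = new XAudio2();
var masteringVoice = new MasteringVoice(xaudio2);

// Load a WAV file into an XAudio2 buffer.
var soundStream = new SoundStream(System.IO.File.OpenRead("HelloWorld.wav"));
var audioBuffer = new AudioBuffer
{
    Stream = soundStream.ToDataStream(),
    AudioBytes = (int)soundStream.Length,
    Flags = BufferFlags.EndOfStream
};
soundStream.Close();

// Submit the buffer to a source voice and start playback; other audio keeps playing.
var sourceVoice = new SourceVoice(xaudio2, soundStream.Format, true);
sourceVoice.SubmitSourceBuffer(audioBuffer, soundStream.DecodedPacketsInfo);
sourceVoice.Start();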
AudioHelper Universal Windows Runtime Component
As mentioned, Anders created a Universal Windows Runtime Component that can be used in Universal Apps to either play a WAV file or play synthesized speech using the DirectX XAudio2 APIs. For convenience, we've put the source up on GitHub for easy consumption. There are two ways you can use the component:
Play a WAV file
- Add a WAV file to your project and set its Build Action to Content
- Use the AudioPlayer’s PlayAudio method, passing in the path to the WAV file, to play the audio
AudioHelper.AudioPlayer.Instance.PlayAudio(@"HelloWorld.wav");
Play a fragment using the speech synthesis engine
- Use the SpeechSynthesizer to create the stream, read the stream into a byte buffer and play that buffer using the AudioPlayer
NOTE: Make sure you enable the Microphone capability in the manifest to use speech synthesis
// The object for controlling the speech synthesis engine (voice).
var synth = new Windows.Media.SpeechSynthesis.SpeechSynthesizer();

// Generate the audio stream from plain text.
var synthesizedText = "Hello World";
SpeechSynthesisStream stream = await synth.SynthesizeTextToStreamAsync(synthesizedText);

// Initialize the byte buffer.
byte[] bytes = new byte[stream.Size];
IBuffer buffer = bytes.AsBuffer();

// Read the stream into the buffer.
await stream.ReadAsync(buffer, (uint)stream.Size, InputStreamOptions.None);

AudioPlayer.Instance.PlayAudio(synthesizedText, buffer);
Hope this helps anyone requiring similar functionality. Head over to GitHub to download, clone or contribute to the Windows Runtime Component. As always, share your thoughts/comments/feedback in the comments below or find us on Twitter.
Good to see this as a Universal app. Nice one.
Maybe a stupid question, but is there any way to use this in a C# project, or do I need to rewrite the whole class in C#?
OK, this was a stupid question. I was looking at code for too long.
Are there any other requirements for this though? I can’t get it to compile in my universal project in VS2013.
Hey Alex,
That is indeed exactly the purpose: use it from a C# app as a Windows Runtime Component. It shouldn't require anything special; could you share the error messages you're getting?
Hi Rajen,
Does the player work on the Windows Phone 8.1 emulator?
Hi, it definitely should. Are you having issues with it?
Ok, it works perfectly! There was a problem with my WAV file.
One more question. When I call it this way:
void PlaySound()
{
AudioHelper.AudioPlayer.Instance.PlayAudio("Sounds/Blop.wav");
}
the compiler outputs the warning:
“Because this call is not awaited, execution of the current method continues before the call is completed. Consider applying the ‘await’ operator to the result of the call.”
My purpose is to fire the sound asynchronously and not wait for it.
How should I do it without warning?
Thank you.
Glad you sorted it out. That warning is pretty common for awaitable calls that are not awaited; you can either ignore it or suppress it: http://msdn.microsoft.com/en-us/library/jj715718.aspx
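If you want the fire-and-forget intent to be explicit without the warning, one option is to assign the returned operation to a variable you never await (this assumes PlayAudio returns a Task/IAsyncAction, which the warning suggests it does):

void PlaySound()
{
    // Assigning the returned awaitable silences the "call is not awaited" warning
    // while the sound still plays asynchronously.
    var playOperation = AudioHelper.AudioPlayer.Instance.PlayAudio("Sounds/Blop.wav");
}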
Rajen,
Thanks for the example – I was wondering though, are you able to use this from a background task? I tried it with a simple time zone change background task – and no audio is played.
Thoughts?
Thanks,
– Jason
Hi Jason, this approach does not work in the background. How to play background audio is outlined here: http://msdn.microsoft.com/en-us/library/windows/apps/xaml/jj841209.aspx and here: http://msdn.microsoft.com/en-us/library/windows/apps/xaml/dn642090.aspx
Rajen – thanks. That’s what I figured. I was hoping to be able to play a TTS prompt during a geofence transition. Looks like that might not be possible. Like you – I didn’t want to stop the current audio playback, I just wanted the audible prompts to play.
Anyway – this sample is great in any case. Thank you for sharing.
I guess the best you could achieve is to pop up a toast notification (http://blog.rajenki.com/2014/04/trigger-background-task-using-geofence-windows-phone-8-1/) and then deep-link that to a page that would do the TTS. Obviously it would require the user to tap on the notification for it to work.
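For illustration, a minimal sketch of that toast-with-deep-link idea could look like the following; the launch string and its value are made up, the app would inspect it in OnLaunched/OnActivated and navigate to a page that performs the TTS, and the app manifest has to be marked toast capable:

using Windows.Data.Xml.Dom;
using Windows.UI.Notifications;

void ShowSpeakToast(string message)
{
    // Start from one of the built-in toast templates (title + body text).
    XmlDocument toastXml = ToastNotificationManager.GetTemplateContent(ToastTemplateType.ToastText02);

    XmlNodeList textNodes = toastXml.GetElementsByTagName("text");
    textNodes[0].AppendChild(toastXml.CreateTextNode("Parking"));
    textNodes[1].AppendChild(toastXml.CreateTextNode(message));

    // Deep-link argument handed to the app when the user taps the toast.
    var toastElement = (XmlElement)toastXml.SelectSingleNode("/toast");
    toastElement.SetAttribute("launch", "action=speak");

    ToastNotificationManager.CreateToastNotifier().Show(new ToastNotification(toastXml));
}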
I can’t get it to compile in my universal project in VS2013. I get the following errors:
error LNK2019: unresolved external symbol __imp__XAudio2Create@12 referenced in function "private: void __cdecl AudioXna::AudioPlayer::Initialize(unsigned int)" (?Initialize@AudioPlayer@AudioXna@@A$AAAXI@Z)
Any ideas?
Hi Dave,
I checked with Anders and he mentions you're missing a linker input. Add xaudio2.lib to the list of import libs under Properties->Configuration Properties->Linker->Input->Additional Dependencies in the AudioHelper project and it should get resolved.
I tried selecting the Solution node in the Solution Explorer. I'm afraid that I don't get anything after "Properties->Configuration Properties" except a dialog box that shows project contexts. I used to be a C++ programmer about 15 years ago, so I do know what a linker is, I just don't know how to invoke it using Visual Studio. 🙂
Hi Dave, you should follow the instructions on the AudioHelper Project, not on the Solution itself. Hope that helps!
It worked on the nodes "AudioHelper.Windows (Windows 8.1)" and "AudioHelper.WindowsPhone (Windows Phone 8.1)" in the Solution Explorer.
Thanks so much!
Thank you for this code! I was pulling my hair out trying all variations of the MediaElement properties…
It works great!
Glad it was helpful!