Revision as of 13:28, 8 November 2012 by chintandave_er

Voice Commands in Windows Phone 8

From Nokia Developer Wiki

This article explains how to use voice commands in a Windows Phone 8 application to launch the app and perform specific actions or tasks.

Note: This is a community entry in the Windows Phone 8 Wiki Competition 2012Q4.

Article Metadata
Tested with
SDK: Windows Phone 8 SDK
Device(s): Windows Phone 8 emulator
Compatibility
Platform(s): Windows Phone 8
Article
Created: mehul_raje (06 Nov 2012)
Last edited: chintandave_er (08 Nov 2012)

Introduction

Windows Phone 8 comes with rich speech functionality, which includes

  1. voice commands
  2. speech recognition
  3. text-to-speech

This means developers can now add these features to their own apps.

Using voice commands, a developer can make an application respond to phrases spoken by the user.

Each phrase is linked to a specific application page, so a command can launch the app directly into the page that performs the requested task. You can try the built-in voice command functionality on Windows Phone 7.5 and above by holding the Start key.

Refer to the following screenshots.

Figure: Basic voice command listening UI (Listening.png)
Figure: Basic “What can I say” UI (Whatcanisay.png)

How to achieve

To add voice command functionality to an application, follow these steps:

  1. Create a voice command file
  2. Install the voice command file in the OS
  3. Write code to handle navigation and command execution

Capabilities required

ID_CAP_NETWORKING
ID_CAP_SPEECH_RECOGNITION
ID_CAP_MICROPHONE

Create Voice command definition file

How to create a Voice Command Definition file

  1. Right-click the Windows Phone project in Visual Studio
  2. Go to Add New Item
  3. Select Voice Command Definition from the list
  4. Give the VCD file a suitable name

Refer to the following screenshot.

Figure: Add new VCD file in project (Addnewvcd.png)

By default, Visual Studio provides a basic VCD template. You need to update this file with your own voice commands and their intended actions. A typical voice command file contains

  1. An example phrase that shows how the user can invoke the command
  2. The words or phrases that can be recognized to invoke the command
  3. The action taken after the command is invoked, expressed as a page navigation
  4. The text to display and speak to the user after the command is recognized (the text is spoken using TTS functionality)

Voice Command Definition file in detail

Consider the following example of a VCD file:

<?xml version="1.0" encoding="utf-8"?>

<VoiceCommands xmlns="http://schemas.microsoft.com/voicecommands/1.0">
  <CommandSet xml:lang="en-US" Name="MyVoiceCommands">
    <CommandPrefix>hello</CommandPrefix>
    <Example>hello what is current date</Example>

    <Command Name="Util">
      <Example> What is current time </Example>
      <ListenFor> what [is] current {options}</ListenFor>
      <Feedback>showing current {options}</Feedback>
      <Navigate Target="Target.xaml"/>
    </Command>

    <Command Name="Calculator">
      <Example> 2 plus 3 </Example>
      <ListenFor>{operand1} {operation} {operand2} </ListenFor>
      <Feedback> Showing {operand1} {operation} {operand2}</Feedback>
      <Navigate Target="Target.xaml"/>
    </Command>

    <PhraseList Label="options">
      <Item>time</Item>
      <Item>date</Item>
      <Item>battery percentage</Item>
    </PhraseList>

    <PhraseList Label="operation">
      <Item>plus</Item>
      <Item>minus</Item>
    </PhraseList>

    <PhraseList Label="operand1">
      <Item>1</Item>
      <Item>2</Item>
    </PhraseList>

    <PhraseList Label="operand2">
      <Item>1</Item>
      <Item>2</Item>
    </PhraseList>

  </CommandSet>
</VoiceCommands>

VoiceCommands: The root element of the voice command definition file. It contains one or more CommandSet elements, each of which holds the voice commands for one language.

CommandSet: A container for all voice commands in a single language, which is given by the xml:lang attribute. The Name attribute is optional, but it is needed if you want to modify the CommandSet’s PhraseList elements at run time.

CommandPrefix: An optional child element of CommandSet; if present, it must be the first child. It defines the name the user speaks to address the application, which is useful for applications whose names are long or difficult to pronounce. In the example above the command prefix is “hello”, so the user must say “hello” at the start of every command for this application.

Command: A required child element of CommandSet. It defines the action to take when a voice command is recognized. Each command navigates to a specific page in your application, and a CommandSet can contain multiple commands.

Example: A required child element of Command. It gives a sample phrase that shows the user what can be said, as on the “What can I say” screen. The example also appears in the application's list of supported commands. The following screenshot shows the “Did you know?” screen after the VCD file above has been registered successfully; it displays the example given in the VCD file.

Figure: “Did you know?” screen (Diduknow.png)

ListenFor: A required child element of Command. It defines the text that the application recognizes for the command. As the VCD file above shows, the text can contain curly and square brackets. Curly brackets reference a PhraseList, and square brackets mark optional text that the user may leave out. A single ListenFor element may reference one or more PhraseList elements: in the example, the Calculator command references three (operand1, operation and operand2), whereas the Util command references only one (options).

Feedback: A required child element of Command. It contains the text that is displayed and spoken using text-to-speech when the command is recognized. A Feedback element can reference a PhraseList, but in that case every ListenFor element in the same Command must reference that PhraseList too.
The following screenshot shows the feedback text after a command has been recognized.

Figure: Feedback UI (Feedback.png)

Navigate: A required child element of Command. Its Target attribute names the page that loads when the command is recognized. If you omit the Target attribute, the app launches its default main page. You can also append a query string to the page, e.g. <Navigate Target="Target.xaml?key=value"/>

PhraseList: An optional child element of CommandSet. It requires a Label attribute, which ListenFor and Feedback elements use to reference the PhraseList by enclosing the label in curly brackets. It contains Item elements, each of which defines a word or phrase that can be spoken in place of the reference.

Item: A child element of PhraseList. It contains one or more words that can be recognized as part of a voice command.
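
To make the ListenFor bracket syntax concrete, the following C# sketch expands a pattern into every phrase it would accept. It only illustrates the syntax and is not how the phone's speech recognizer is implemented. The ListenForDemo class and its Expand method are hypothetical names introduced for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only (not part of the Windows Phone SDK).
public static class ListenForDemo
{
    // Expands a ListenFor pattern into every phrase it can match.
    public static List<string> Expand(string pattern, IDictionary<string, string[]> phraseLists)
    {
        var results = new List<string> { "" };

        // A token is either [optional text], a {PhraseList label}, or a plain word.
        foreach (Match m in Regex.Matches(pattern, @"\[[^\]]+\]|\{[^}]+\}|[^\s\[\]{}]+"))
        {
            string token = m.Value;
            string[] choices;
            if (token.StartsWith("["))
                choices = new[] { token.Trim('[', ']'), "" };   // spoken or skipped
            else if (token.StartsWith("{"))
                choices = phraseLists[token.Trim('{', '}')];    // any item of the list
            else
                choices = new[] { token };                      // literal word

            // Combine every phrase built so far with every choice for this token.
            results = results
                .SelectMany(prefix => choices.Select(c => (prefix + " " + c).Trim()))
                .ToList();
        }
        return results;
    }
}
```

For the Util command above, calling Expand with the pattern "what [is] current {options}" and the three options items yields six phrases, ranging from “what is current time” to “what current battery percentage”.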

Installing/Registering voice command definition file

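// Requires: using Windows.Phone.Speech.VoiceCommands;
private async void RegisterVoiceCommands()
{
    // vcd.xml is the VCD file that was added to the project.
    await VoiceCommandService.InstallCommandSetsFromFileAsync(
        new Uri("ms-appx:///vcd.xml", UriKind.RelativeOrAbsolute));
}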

When a voice command for the app is recognized, the target page is launched and its URI carries the recognized command as query string parameters. You can override the OnNavigatedTo method and read these parameters through NavigationContext.QueryString. If the page was launched by a voice command, the query string contains a voiceCommandName parameter. This parameter is useful when more than one command targets the same page, because it tells you which command the user spoke.


Modify Voice Command Definition file

In many cases you want the Item elements of a PhraseList to be populated at run time from a web service or some other source. For this purpose the SDK lets you modify an installed voice command definition. The code below shows how. Suppose we want to add more items to operand1 and operand2: we call the ModifyPhraseList function twice from a button click event.

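private async void Button_Click_1(object sender, RoutedEventArgs e)
{
    // Await each update so that the second starts only after the first completes.
    await ModifyPhraseList("MyVoiceCommands", "operand1");
    await ModifyPhraseList("MyVoiceCommands", "operand2");
}

// Updates the items of the given PhraseList in an installed CommandSet.
// Requires: using System.Threading.Tasks; using Windows.Phone.Speech.VoiceCommands;
private async Task ModifyPhraseList(string commandSetName, string phraseListName)
{
    string[] numbers = { "1", "2", "3", "4", "5", "6", "7", "8", "9", "10" };
    VoiceCommandSet commandSet = VoiceCommandService.InstalledCommandSets[commandSetName];
    await commandSet.UpdatePhraseListAsync(phraseListName, numbers);
}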
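
When the items come from a web service or another run-time source, it can help to prepare them before passing them to UpdatePhraseListAsync. The following is a minimal sketch under that assumption. The PhraseListBuilder class and both of its methods are hypothetical helpers, not SDK APIs, and the clean-up rules (trim, drop blanks, drop case-insensitive duplicates) are one reasonable choice rather than a requirement stated by the SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helpers for preparing PhraseList items (not part of the SDK).
public static class PhraseListBuilder
{
    // Returns the numbers 1..max as strings, e.g. for operand1 and operand2.
    public static string[] BuildNumberPhrases(int max)
    {
        return Enumerable.Range(1, max)
            .Select(n => n.ToString(CultureInfo.InvariantCulture))
            .ToArray();
    }

    // Trims raw items, drops blank entries and removes duplicates (ignoring case).
    public static string[] Clean(IEnumerable<string> rawItems)
    {
        return rawItems
            .Where(item => !string.IsNullOrWhiteSpace(item))
            .Select(item => item.Trim())
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .ToArray();
    }
}
```

The resulting array can be passed as the second argument of UpdatePhraseListAsync in place of the hard-coded numbers array.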

The following are some of the commands that the user can invoke with the VCD file above:

  1. hello what is current date
  2. hello what is current battery percentage
  3. hello two plus two
  4. hello two minus two

Command                      Keyword   Specification
hello what is current date   hello     CommandPrefix
                             date      PhraseList “options”
hello two plus two           hello     CommandPrefix
                             two       PhraseList “operand1”
                             plus      PhraseList “operation”
                             two       PhraseList “operand2”

Handling navigation

When the application recognizes a voice command, the target page is launched and the recognized parameters are included in the query string.
Use the OnNavigatedTo method of the target page to extract the parameters. If the application was launched by a voice command, the query string contains a parameter named voiceCommandName, so checking for that key is a reliable way to tell whether the page was launched by voice.
Suppose the user says “hello what is current date”. The query string then contains

[0]={[voiceCommandName,Util]}
[1]={[reco,hello what is current date]}
[2]={[options,date]}

If the user says “hello two plus two”, the query string contains

[0]={[voiceCommandName,Calculator]}
[1]={[reco,hello 2 plus 2]}
[2]={[operand2,2]}
[3]={[operation,plus]}
[4]={[operand1,2]}

The voiceCommandName parameter contains the name of the command.
The reco parameter contains the whole text that was recognized.
The options, operand2, operation and operand1 parameters contain the words recognized from the corresponding phrase lists.

The following code snippet shows how to retrieve the parameters from the query string for the VCD file above.

protected override void OnNavigatedTo(NavigationEventArgs e)
{
    base.OnNavigatedTo(e);

    // The voiceCommandName key is present only when the page was launched by a voice command.
    if (NavigationContext.QueryString.ContainsKey("voiceCommandName"))
    {
        string voiceCommandName = NavigationContext.QueryString["voiceCommandName"];
        switch (voiceCommandName)
        {
            case "Util":
                // Recognized item from the "options" PhraseList.
                string options = NavigationContext.QueryString["options"];
                // do something
                break;

            case "Calculator":
                string number1 = NavigationContext.QueryString["operand1"];
                string number2 = NavigationContext.QueryString["operand2"];
                string operation = NavigationContext.QueryString["operation"];
                try
                {
                    // do something
                }
                catch (Exception ex)
                {
                    // Write the error to the debugger output.
                    System.Diagnostics.Debug.WriteLine(ex.Message + "\n" + ex.StackTrace);
                }
                break;

            default:
                break;
        }
    }
}
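
The “do something” step of the Calculator case could be filled in along the following lines. This is a sketch, not the author's implementation. VoiceCalculator and Evaluate are hypothetical names, the result text format is an assumption, and the method takes a plain IDictionary so that it can be tried outside a phone page.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper that evaluates the Calculator voice command.
public static class VoiceCalculator
{
    // Returns the result as text, or null when a parameter is missing,
    // is not a number, or names an unsupported operation.
    public static string Evaluate(IDictionary<string, string> query)
    {
        string left, right, operation;
        int a, b;
        if (!query.TryGetValue("operand1", out left) ||
            !query.TryGetValue("operand2", out right) ||
            !query.TryGetValue("operation", out operation) ||
            !int.TryParse(left, out a) ||
            !int.TryParse(right, out b))
        {
            return null;
        }

        // The operation names match the items of the "operation" PhraseList.
        switch (operation)
        {
            case "plus":
                return string.Format("{0} plus {1} is {2}", a, b, a + b);
            case "minus":
                return string.Format("{0} minus {1} is {2}", a, b, a - b);
            default:
                return null;
        }
    }
}
```

Inside OnNavigatedTo, the call would be VoiceCalculator.Evaluate(NavigationContext.QueryString). Using TryGetValue and int.TryParse avoids exceptions when a parameter is missing or malformed, so the try/catch block is no longer needed for this step.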

Note that a switch statement is used because an application may have several commands that target the same page, so the page needs to know which command the user gave.

Localization of voice commands

You can define commands for several languages in one voice command definition file.
Create one CommandSet per language and set its xml:lang attribute to the language you want. To test the application in a particular language, make sure that the language of the CommandSet matches the language selected on the phone.

Example

The complete working example is available here: File:WinPhoneVoiceCommands.zip
