Semantic Interpretation
Semantic interpretation tags provide the mechanism for returning grammar match data to the VoiceXML application.
Grammars for the Tellme Platform must conform to the W3C’s Semantic Interpretation for Speech Recognition (SISR) 1.0 standard.
The SISR specification determines how the <tag> elements are used to convert the result generated by an SRGS speech grammar processor into an ECMAScript (JavaScript) object that can be processed by the VoiceXML application.
This chapter includes the following topics:
- <tag> syntax
- Using the <tag> element
- How rules can return values
- Rule variables
- Retrieving values from referenced rules
- The rules object
- Using concatenation with repeats
<tag> syntax
According to the SISR specification, the semantic interpretation <tag> elements can have one of two syntaxes:
- The Script tag syntax, enabled by setting the <grammar> element’s tag-format attribute to "semantics/1.0", defines the content in the <tag> elements to be ECMAScript (Compact Profile).
- The String Literal tag syntax, enabled by setting the <grammar> element’s tag-format attribute to "semantics/1.0-literals", defines the content in the <tag> elements to be string literals.
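As a rough illustration of the String Literal tag syntax, the sketch below shows a small standalone grammar in which each <tag> holds plain text rather than a script. The rule name "answer" and the phrases are hypothetical, and the xmlns value is the standard SRGS namespace, which a standalone grammar file is assumed to need.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical example: String Literal tag syntax -->
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" mode="voice"
         root="answer" tag-format="semantics/1.0-literals">
  <rule id="answer" scope="public">
    <one-of>
      <!-- The tag content is the literal string itself, not ECMAScript -->
      <item>yes <tag>yes</tag></item>
      <item>yeah <tag>yes</tag></item>
      <item>no <tag>no</tag></item>
      <item>nope <tag>no</tag></item>
    </one-of>
  </rule>
</grammar>
```

With the Script tag syntax, the first item would instead be written as <item>yes <tag>out = "yes";</tag></item>.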
Note
The Script tag syntax is recommended for all but very simple grammars.
To use the Script tag syntax, your <grammar> element header must include the tag-format attribute, like this:
<grammar tag-format="semantics/1.0">.......</grammar>
Note
When using the Script tag syntax, an entire ECMAScript program can be placed between the <tag> tags (in which case each statement must end with a semi-colon). However, the best practice is to keep the embedded ECMAScript very simple.
Using the <tag> element
As noted above, <tag> elements are used to convert the result generated by an SRGS speech grammar processor into an ECMAScript object that can be processed by the VoiceXML application. For example:
<item> yeah<tag>out="yes";</tag></item>
Here, out="yes"; is an ECMAScript statement that assigns the string "yes" to the output variable when the speaker says "yeah."
Note
The semi-colon ending the ECMAScript statement is not necessary, but it is good practice. If additional ECMAScript statements are included in the tag, however, then the semi-colon is necessary to delimit the individual statements.
Scripts in <tag> elements are executed only if the <rule> or <item> containing them provides a match.
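To make the "executed only on a match" behavior concrete, here is a minimal sketch of a rule with two alternatives. The rule name "confirm" and the phrases are hypothetical. Only the tag inside the alternative that was actually spoken runs.

```xml
<!-- Hypothetical rule, assuming tag-format="semantics/1.0" on the enclosing grammar -->
<rule id="confirm">
  <one-of>
    <!-- Runs only when the caller says "yeah" -->
    <item>yeah <tag>out = "yes";</tag></item>
    <!-- Runs only when the caller says "nope" -->
    <item>nope <tag>out = "no";</tag></item>
  </one-of>
</rule>
```

If the caller says "nope", the first tag is never executed and the rule returns "no".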
How rules can return values
Rule variables
Every <rule> element has a Rule Variable. When tag-format="semantics/1.0", the Rule Variable is named out. The variable out is implicitly declared as an empty object before the first tag in the rule is executed. The script in a <tag> element (which is compact ECMAScript) can either:
- assign a primitive value such as a number or string (for example, out="george";), which replaces the empty object so that out holds that primitive value
- add properties to the object (for example, out.firstName="george";)
Warning
If no <tag> element is used, the rule simply returns the text of the words that were recognized.
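The three cases described above (a primitive value, an object with properties, and no tag at all) might look like the following sketch. The rule names and phrases are hypothetical, and the rules are assumed to sit inside a grammar that sets tag-format="semantics/1.0".

```xml
<!-- Case 1: out is replaced by a primitive string -->
<rule id="firstNameOnly">
  <item>george <tag>out = "george";</tag></item>
</rule>

<!-- Case 2: out stays an object and gains two properties -->
<rule id="fullName">
  <item>
    george washington
    <tag>out.firstName = "george"; out.lastName = "washington";</tag>
  </item>
</rule>

<!-- Case 3: no tag, so the rule returns the recognized text -->
<rule id="rawText">
  george washington
</rule>
```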
Retrieving values from referenced rules
The rules object
When using tag-format="semantics/1.0", a global rules object holds the Rule Variable of every visible rule as a property. The Rule Variable for any visible rule is contained in rules.rulename, where rulename is the name of the rule. Therefore, in complex grammars, the Rule Variable (out) for every rule is available. Since the Rule Variable is an object, you can define properties for it (for example, rules.foodChoice.hot_dog, where foodChoice is the name of a rule).
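As a sketch of how one rule can read another rule's Rule Variable through the rules object, consider the fragment below. The rule "order", the phrases, and the property names item and hot are all hypothetical. Only foodChoice is taken from the text above.

```xml
<!-- Assumes tag-format="semantics/1.0" on the enclosing grammar -->
<rule id="order" scope="public">
  i would like a
  <ruleref uri="#foodChoice"/>
  <!-- rules.foodChoice is the out object produced by the foodChoice rule -->
  <tag>out.food = rules.foodChoice.item; out.isHot = rules.foodChoice.hot;</tag>
</rule>

<rule id="foodChoice">
  <one-of>
    <item>hot dog <tag>out.item = "hot_dog"; out.hot = true;</tag></item>
    <item>salad <tag>out.item = "salad"; out.hot = false;</tag></item>
  </one-of>
</rule>
```

The tag in "order" is placed after the <ruleref> so that rules.foodChoice has already been populated when the script runs.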
rules.latest()
The rules object has a method, rules.latest(), that returns the Rule Variable of the most recently matched rule reference at any given point. This can be used to collect more than one match from the same rule. The example below demonstrates this. The example also shows how to return the grammar match as a string in the form "src=sfo^dst=lax". Tellme grammars typically return matches in strings like this, using the caret (^) as a delimiter.
<grammar mode="voice"
         root="top"
         tag-format="semantics/1.0"
         version="1.0"
         xml:lang="en-US">
  <rule id="top" scope="public">
    <item>
      <item repeat="0-1">
        i want to go
      </item>
      from
      <item>
        <ruleref uri="#CityName"/>
        <tag> out = "src=" + rules.latest(); </tag>
      </item>
      to
      <item>
        <ruleref uri="#CityName"/>
        <tag> out += "^dst=" + rules.latest(); </tag>
      </item>
    </item>
  </rule>
  <rule id="CityName" scope="private">
    <one-of>
      <item>
        san francisco
        <tag>out = "sfo";</tag>
      </item>
      <item>
        los angeles
        <tag>out = "lax";</tag>
      </item>
      <item>
        new orleans
        <tag>out = "msy";</tag>
      </item>
    </one-of>
  </rule>
</grammar>
The speech recognition engine finds a match between the speaker's utterance and this grammar only if the speaker says "from," followed by one of the three cities, followed by "to," and finally followed by one of the three cities. The leading phrase "I want to go" is optional. When a match occurs, the out variable contains a string of the form "src=city1^dst=city2", for example "src=sfo^dst=lax".
Using concatenation with repeats
When a grammar is invoked and a match is found, the out variable and the value returned by rules.latest() are populated with the match. If the same grammar is invoked again, these values are replaced.
When the same grammar is repeatedly invoked, for example to obtain a string of digits, you must concatenate each new digit with the cumulative sequence of digits, as follows:
<field name="phoneNumber">
  <prompt>
    what is your phone number with area code first
  </prompt>
  <grammar mode="voice" xml:lang="en-US"
           tag-format="semantics/1.0"
           version="1.0" root="phoneNum">
    <rule id="phoneNum">
      <tag>out="";</tag>
      <item repeat="10">
        <ruleref uri="http://www.ourgrammars.com/digits.grxml"/>
        <tag>out += rules.latest();</tag>
      </item>
    </rule>
  </grammar>
  <filled>........</filled>
</field>
Note
The line <tag>out += rules.latest()</tag> could have been written <tag>out += rules.digits;</tag>, where digits is the name of the rule in digits.grxml.
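The contents of digits.grxml are not shown in this chapter. Purely as an assumption, a file of that kind could look like the sketch below, with a single rule named digits that returns one digit as a string. The spoken forms (including "oh" for zero) are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sketch of an external single-digit grammar -->
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" mode="voice"
         root="digits" tag-format="semantics/1.0">
  <rule id="digits" scope="public">
    <one-of>
      <!-- Each digit is returned as a string, not a number -->
      <item>zero <tag>out = "0";</tag></item>
      <item>oh <tag>out = "0";</tag></item>
      <item>one <tag>out = "1";</tag></item>
      <item>two <tag>out = "2";</tag></item>
      <item>three <tag>out = "3";</tag></item>
      <item>four <tag>out = "4";</tag></item>
      <item>five <tag>out = "5";</tag></item>
      <item>six <tag>out = "6";</tag></item>
      <item>seven <tag>out = "7";</tag></item>
      <item>eight <tag>out = "8";</tag></item>
      <item>nine <tag>out = "9";</tag></item>
    </one-of>
  </rule>
</grammar>
```

Returning strings rather than numbers matters here: with out initialized to "", each out += appends a character to the phone number instead of adding to a numeric total.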