Grammar rules are the elements that SAPI 5-compliant speech recognition (SR) engines use to restrict the possible word or sentence choices during the SR process. SR engines use grammar rules to control sentence construction from a predetermined list of recognized words or phrases. This list, contained in the grammar rules, forms the basis of the SR engine vocabulary.
The phrase or sentence uses each grammar rule element to determine the recognition path. For example, examine the phrase describing travel plans, "I would like to drive from Seattle to New York," and note that there are elements that determine the resulting information. In this example, a person is planning to drive to New York from Seattle. This is a very simple illustration of what could be a very complex problem. Determining the same travel plans without limiting the method, direction, and travel destination would result in an infinite number of travel options.
The resulting information can be determined by restricting the available choices for a given sentence. Using this method, the resulting information can be composed only from certain choices, thus eliminating the possibility of an infinite number of travel plan combinations.
I would like to drive from Seattle to New York.
                  |     |     |     |     |
              [Method]  |     |     |     |
                /   \   |     |     |     |
             Fly  Drive |     |     |     |
                        |     |     |     |
                  [Direction] |     |     |
                      /   \   |     |     |
                  From    To  |     |     |
                              |     |     |
                           [City]   |     |
                        Seattle     |     |
                        New York    |     |
                        Los Angeles |     |
                        Albuquerque |     |
                                    |     |
                              [Direction] |
                                 /   \    |
                               To    From |
                                          |
                                       [City]
                                       Seattle
                                       New York
                                       Los Angeles
                                       Albuquerque
The elements of interest in the example phrase are as follows:
- Method of travel (fly or drive), specifically "drive"
- Travel direction (from or to), specifically "from"
- The city of origin for the travel plan (from), specifically "Seattle"
- Travel direction complement (from or to), specifically "to"
- The city of destination for the travel plan (to), specifically "New York"
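The effect of restricting each element to a fixed list can be sketched in a few lines of Python. This is an illustration only, not SAPI code: the `SLOTS` table and the `enumerate_phrases` helper are hypothetical names, and the word lists are copied from the example above. It shows that the restricted sentence has a small, countable set of valid forms.

```python
from itertools import product

# Hypothetical slot table mirroring the example: each element of interest
# is limited to a fixed list of choices.
PREFIX = "I would like to"
CITIES = ["Seattle", "New York", "Los Angeles", "Albuquerque"]
SLOTS = [
    ("Method", ["fly", "drive"]),
    ("Direction", ["from", "to"]),
    ("City", CITIES),
    ("Direction", ["to", "from"]),
    ("City", CITIES),
]

def enumerate_phrases():
    """Yield every phrase that the restricted choice lists allow."""
    for combo in product(*(choices for _, choices in SLOTS)):
        yield f"{PREFIX} {' '.join(combo)}."

phrases = list(enumerate_phrases())
print(len(phrases))  # 2 * 2 * 4 * 2 * 4 = 128
print("I would like to drive from Seattle to New York." in phrases)
```

Because every slot is finite, the whole set of phrases is finite as well; removing any one of the lists is what makes the number of combinations unbounded.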
The information can also be displayed as a graph of states and arcs, where each arc can have text (or semantic tags/properties) attached. The valid phrases are the unique paths through the graph, starting at the root and ending at a terminal state. Each state is shown in parentheses: (root node), (interim node), and (NULL) for the terminal node. The spoken text is denoted by words surrounded by quotation marks. The semantic property names are denoted by uppercase words in square brackets.
                    (root node)
                         |
                         | "I would like to"
                         |
                   (interim node)
                       /    \
             "drive"  /      \  "fly"                        [METHOD]
                      \      /
                       \    /
                   (interim node)
                       /    \
              "from"  /      \  "to"                         [DIRECTION]
                      \      /
                       \    /
                   (interim node)
           ________/   |    |   \________
          /            |    |            \
   "Seattle"   "New York"  "Los Angeles"  "Albuquerque"      [CITY_1]
          \            |    |            /
           \________   |    |   ________/
                   (interim node)
                       /    \
              "from"  /      \  "to"                         [DIRECTION]
                      \      /
                       \    /
                   (interim node)
           ________/   |    |   \________
          /            |    |            \
   "Seattle"   "New York"  "Los Angeles"  "Albuquerque"      [CITY_2]
          \            |    |            /
           \________   |    |   ________/
                       (NULL)
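The graph can also be written down as data and walked programmatically. The following is a minimal sketch, assuming a hypothetical encoding in which each state maps to a list of (text, property name, next state) arcs; the state names `s1` to `s5`, the `GRAPH` table, and the `recognize` function are illustrative and are not part of SAPI.

```python
# Hypothetical encoding of the graph: state -> list of (text, property, next_state).
CITIES = ["Seattle", "New York", "Los Angeles", "Albuquerque"]
GRAPH = {
    "root": [("I would like to", None, "s1")],
    "s1": [("drive", "METHOD", "s2"), ("fly", "METHOD", "s2")],
    "s2": [("from", "DIRECTION", "s3"), ("to", "DIRECTION", "s3")],
    "s3": [(city, "CITY_1", "s4") for city in CITIES],
    "s4": [("from", "DIRECTION", "s5"), ("to", "DIRECTION", "s5")],
    # A next state of None stands for the terminal (NULL) state.
    "s5": [(city, "CITY_2", None) for city in CITIES],
}

def recognize(phrase, start="root"):
    """Follow arcs whose text matches the phrase.

    Returns the list of (property, value) pairs collected along the path,
    or None if the phrase is not a complete path through the graph.
    """
    rest = phrase.rstrip(".").strip()
    state, pairs = start, []
    while state is not None:
        for text, prop, next_state in GRAPH[state]:
            if rest == text or rest.startswith(text + " "):
                rest = rest[len(text):].strip()
                if prop is not None:
                    pairs.append((prop, text))
                state = next_state
                break
        else:
            return None  # no arc matches, so the phrase leaves the graph
    return pairs if rest == "" else None

print(recognize("I would like to drive from Seattle to New York."))
print(recognize("I would like to travel from Seattle to New York."))
```

In this sketch a phrase is valid only if it consumes arcs all the way from the root to the terminal state, which is why a word outside the arc lists (such as "travel") yields no result.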
Suppose the user speaks the following phrase:
I would like to drive from Seattle to New York.
Grammar rules become concatenated phrase elements, and these phrase elements are limited to the defined set of grammars. Restricting the input choices to a limited set of possibilities significantly improves control over the resulting information. Otherwise, obtaining the travel plan information from the same sample phrase would be considerably more ambiguous.
Without a defined set of choices, the complexity of parsing the same sentence increases exponentially. Imagine the number of possible combinations in a sentence that is not restricted to a finite list of choices. For example, consider the same sentence with every element left unconstrained:
"I want to (unknown travel method) (unknown travel direction) (unknown city) (unknown travel direction) (unknown city)."
Without the ability to constrain the available choices within a sentence, the amount of predictable information is significantly reduced.
For the example phrase, "I would like to drive from Seattle to New York," the semantic structure (using name/value pairs) is:
[METHOD="drive"], [DIRECTION="from"], [CITY_1="Seattle"], [DIRECTION="to"], [CITY_2="New York"]
By parsing the semantic structure, the application can easily and accurately analyze the content of the original phrase, without parsing or analyzing individual words. The application developer can then write application logic to perform specific actions based on the previously mentioned semantic names, and specialize the action based on the values of each semantic property. The grammar author can add to or delete from the lists of words, without breaking the application logic.
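As a rough illustration of such application logic, the sketch below works only with property names and values. The `plan_trip` function and the returned dictionary keys are hypothetical, and the rule that the first DIRECTION value decides whether CITY_1 is the origin is an assumption made for this example rather than documented SAPI behavior.

```python
# Semantic result as name/value pairs, in recognition order (from the example).
result = [
    ("METHOD", "drive"),
    ("DIRECTION", "from"),
    ("CITY_1", "Seattle"),
    ("DIRECTION", "to"),
    ("CITY_2", "New York"),
]

def plan_trip(pairs):
    """Build a trip description from property names only, never from raw words."""
    method = next(value for name, value in pairs if name == "METHOD")
    first_direction = next(value for name, value in pairs if name == "DIRECTION")
    cities = {name: value for name, value in pairs if name.startswith("CITY_")}
    # Assumption: the first DIRECTION value tells whether CITY_1 is the
    # origin ("from") or the destination ("to").
    if first_direction == "from":
        origin, destination = cities["CITY_1"], cities["CITY_2"]
    else:
        origin, destination = cities["CITY_2"], cities["CITY_1"]
    return {"method": method, "origin": origin, "destination": destination}

print(plan_trip(result))
```

Nothing in `plan_trip` refers to a specific city or travel method, so a grammar author could extend the word lists without this logic changing.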
Grammar rules apply to the following:
                                          ----------+
               Animal                               +--- Non-terminal node
                  |                       ----------+
                  |
               /--+--\                    ----------+
   Cat--------/       \------Dog                    +--- Non-terminal node
    |                         |           ----------+
    |                         |
    |                         |           ----------+
    +-- Burmese               +-- Airedale          |
    +-- Himalayan             +-- Poodle            +--- Terminal nodes
    +-- Persian               +-- Schnauzer         |
    +-- Siamese               +-- Whippet           |
                                          ----------+
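The distinction between non-terminal and terminal nodes in the diagram can be sketched as a small rule table. The `RULES` dictionary and the `terminals` function are hypothetical names chosen for this example: a name that has an entry in the table is treated as a non-terminal node, and any other name as a terminal node.

```python
# Hypothetical rule table for the Animal example.
RULES = {
    "Animal": ["Cat", "Dog"],
    "Cat": ["Burmese", "Himalayan", "Persian", "Siamese"],
    "Dog": ["Airedale", "Poodle", "Schnauzer", "Whippet"],
}

def terminals(node):
    """Return the terminal nodes reachable from a node, left to right."""
    if node not in RULES:
        return [node]  # no expansion defined, so this is a terminal node
    found = []
    for child in RULES[node]:
        found.extend(terminals(child))
    return found

print(terminals("Animal"))
print(terminals("Dog"))
```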
XML tags in the text grammar format follow block-scoping conventions similar to those of HTML tags. That is, each tag has an opening tag and a corresponding closing tag. There is more information about XML syntax in the Grammar XML Schema section.
XML tag syntax | Contents |
---|---|
<sometag NAME="some_name" VAL="some_value"> | Start of "sometag" tag scope which includes the name and value information. |
</sometag> | End of the "sometag" scope. |
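The open/close scoping in the table can be checked with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` module on the placeholder `sometag` element from the table; the inner text and the `is_well_formed` helper are made up for illustration, and nothing here is specific to SAPI or to its grammar compiler.

```python
import xml.etree.ElementTree as ET

# Placeholder markup based on the table above; "inner text" is invented.
well_formed = '<sometag NAME="some_name" VAL="some_value">inner text</sometag>'
unclosed = '<sometag NAME="some_name" VAL="some_value">inner text'

element = ET.fromstring(well_formed)
print(element.tag, element.attrib)

def is_well_formed(text):
    """Return True if the text parses as XML with every scope closed."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed(well_formed))
print(is_well_formed(unclosed))
```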