Chapter 1
The SAPI 5 Reference API is an excellent guide for programming speech applications. With the large number of methods, interfaces, structures, and enumerations SAPI 5 offers, the reference API is a required document. However, those who are new to speech applications may be a little lost at first, as the reference API makes little attempt to weave all the parts together.
The goal of this book is to help you write properly structured SAPI 5 applications using a series of examples called Coffee. The application uses a coffee shop motif designed so that you may place orders, talk to management, or buy items at the store. Since each application builds upon the previous one, you need to understand each step before moving on to the next one.
There are two prerequisites for using these examples. First, you should understand graphical interface programming in general and native Windows programming in particular. Although you will concentrate on SAPI 5 topics, the Coffee examples are Windows applications. The majority of the code in the samples is a framework used only for running the application. It handles keyboard input, mouse input, screen updates, and other message processing. Unless it is relevant to SAPI 5, much of this code will not be discussed.
Second, you should have some experience with C/C++ programming. The intent is to keep programming simple and consistent. There are other models and languages for programming Windows, including Visual Basic, Java, and the Microsoft Foundation Classes (MFC). MFC adds a layer of complexity that you do not need at the moment. After you gain proficiency with SAPI 5 code, theory, and implementation, feel free to change approaches.
Additionally, SAPI 5 is based on the Component Object Model (COM). Although COM proficiency is not required, some understanding of it is. To further simplify COM, SAPI 5 uses Active Template Library (ATL) functions to complement COM. You need a basic understanding of the COM smart pointer class CComPtr, which ATL provides. Additional information about these topics may be found in MSDN or in the many books available through popular bookstores.
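As a rough illustration of what a COM smart pointer such as CComPtr does for you, the following portable C++ sketch mimics the reference counting involved. It is not ATL or SAPI code: FakeVoice, MiniComPtr, and RunDemo are hypothetical names invented for this example, and the real CComPtr offers considerably more than is shown here. The point is only that the smart pointer calls AddRef and Release on your behalf, so the object is destroyed when the last pointer goes out of scope.

```cpp
#include <cassert>

// Hypothetical stand-in for a COM object (NOT a real SAPI or COM interface).
// Like an IUnknown implementation, it keeps its own reference count.
class FakeVoice {
public:
    FakeVoice() : m_refs(1) { ++s_live; }      // a new object starts with one reference
    unsigned long AddRef() { return ++m_refs; }
    unsigned long Release() {
        unsigned long remaining = --m_refs;
        if (remaining == 0) delete this;       // the last Release destroys the object
        return remaining;
    }
    unsigned long RefCount() const { return m_refs; }
    static int LiveObjects() { return s_live; }
private:
    ~FakeVoice() { --s_live; }                 // private: destroy only through Release
    unsigned long m_refs;
    static int s_live;                         // number of objects currently alive
};
int FakeVoice::s_live = 0;

// Hypothetical, heavily simplified smart pointer in the spirit of CComPtr.
template <class T>
class MiniComPtr {
public:
    MiniComPtr() : p(0) {}
    MiniComPtr(const MiniComPtr& other) : p(other.p) { if (p) p->AddRef(); }
    ~MiniComPtr() { if (p) p->Release(); }     // automatic Release at scope exit
    MiniComPtr& operator=(const MiniComPtr& other) {
        if (p != other.p) {
            if (other.p) other.p->AddRef();    // take the new reference first
            if (p) p->Release();               // then drop the old one
            p = other.p;
        }
        return *this;
    }
    // Take ownership of an existing reference without calling AddRef.
    void Attach(T* raw) { if (p) p->Release(); p = raw; }
    T* operator->() const { return p; }
    T* p;
};

// Returns the number of FakeVoice objects still alive after all pointers
// have gone out of scope.
int RunDemo() {
    {
        MiniComPtr<FakeVoice> first;
        first.Attach(new FakeVoice);           // reference count is 1
        {
            MiniComPtr<FakeVoice> second(first);   // copy calls AddRef
            assert(first->RefCount() == 2);
        }                                      // second releases its reference
        assert(first->RefCount() == 1);
    }                                          // first releases; object is deleted
    return FakeVoice::LiveObjects();
}
```

Calling RunDemo() from a small test program returns 0, meaning no object outlived its pointers. With the real ATL class, the same pattern lets a SAPI interface pointer release itself when it goes out of scope, so no explicit Release call is needed.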
Each chapter describes the important concepts introduced. The discussion follows the code and provides examples from the Coffee code itself. The first Coffee example is the foundation of the entire process, so its chapter is slightly longer than the others. The narrative is a brief overview and presents enough material to complete the example, though not enough to exhaust the subject.
Each Coffee chapter builds on the previous examples. Since no code is ever removed, changes from chapter to chapter may be found by comparing files or entire folders with a file comparison (diff) tool. In general, any such tool may be used.
You will need a copy of Microsoft Visual C++ 6.0 with Service Pack 3 (SP3) or later. In general, any 32-bit C/C++ compiler will work. However, the samples assume Microsoft Visual Studio 6.0 SP3 or later.
You will also need a SAPI 5 installation source. This may be a SAPI developer’s CD or an installer from another source (if available), such as the Microsoft Web site. Regardless, you are encouraged to install SAPI 5 only through an installation package. The SAPI CD installs all the required components, including the dynamic link libraries (DLLs), headers, registry entries, and other resources.
In addition to having up-to-date source files and examples, you need the reference API to look up interfaces and methods. Even though the examples are well commented and the tutorials contain narratives about the calls and approaches, the reference API explains each call in detail and to a greater extent than is possible in other sources.
To configure your system for speech recognition, go to Speech properties in Control Panel and click the Speech Recognition tab. Speak into your microphone and observe the volume meter in the microphone window; if the meter registers the volume level, the microphone works. Then click the Text-to-Speech tab. To test the audio output, click Preview Voice. The preview text is spoken aloud, and each word is highlighted as it is spoken. If this is the case, then audio output also works. If neither works, see the Troubleshooting section.
It is also recommended that you use the microphone training wizard. From the Speech Recognition tab, click Train Profile. The training wizard instructs you in microphone placement and input level adjustment so that SAPI is able to recognize your commands. For the Coffee examples, the list of commands is quite limited and this training is not required. Speak clearly and deliberately into the microphone and Coffee should be able to recognize your words.