
Nuance - Dragon Mobile SDK - Speech Kit

 

http://dragonmobile.nuancemobiledeveloper.com/public/Help/DragonMobileSDKReference_iOS/SpeechKit_Guide/Basics.html

 

Speech Kit Basics

The Speech Kit framework allows you to add voice recognition and text-to-speech services to your applications easily and quickly. This framework provides access to speech processing components hosted on a server through a clean asynchronous network service API, minimizing overhead and resource consumption. The Speech Kit framework lets you provide fast voice search, dictation, and high-quality, multilingual text-to-speech functionality in your application.

Speech Kit Architecture

The Speech Kit framework is a full-featured, high-level framework that automatically manages all the required low-level services.

Figure: Speech Kit Architecture

At the application level, there are two main components available to the developer: the recognizer and the text-to-speech synthesizer.

In the framework there are several coordinated processes:

  • The framework fully manages the audio system for recording and playback.
  • The networking component manages the connection to the server and, at the start of a new request, automatically re-establishes connections that have timed out.
  • The end-of-speech detector determines when the user has stopped speaking and automatically stops recording.
  • The encoding component compresses and decompresses the streaming audio to reduce bandwidth requirements and decrease latency.

The server system is responsible for the majority of the work in the speech processing cycle. The complete recognition or synthesis procedure is performed on the server, consuming or producing the streaming audio. In addition, the server manages authentication as configured through the developer portal.

Using Speech Kit

You can use the Speech Kit framework in the same way that you use any of the standard iPhone frameworks such as Foundation or UIKit. The only difference is that the Speech Kit framework is a static framework and is entirely contained in your compiled application. This does not affect you as a developer except that you must be certain that you and any other developers working on your application all use the same release of Speech Kit. You can easily ensure this by including the entire framework in your application and your source control.

The Speech Kit framework depends on some core iPhone OS frameworks that you must include as dependencies in your application so that they are available at run time. In addition to Foundation, you must add the System Configuration and Audio Toolbox frameworks to your Xcode project, as follows:

  1. Start by selecting the Frameworks group within your project.
  2. Then right-click or command-click Frameworks and, from the menu, select Add ‣ Existing frameworks....
  3. Finally, select the required frameworks and click Add. The frameworks appear in the Frameworks folder (see figure below).

To start using the Speech Kit framework, add it to your new or existing project, as follows:

  1. Open your project and select the group where you want the Speech Kit framework to be stored, for example, Frameworks.
  2. From the menu select Project ‣ Add to Project....
  3. Then find the framework SpeechKit.framework where you extracted the Dragon Mobile SDK and select Add.
  4. To ensure that the Speech Kit framework is stored in your project and is not referencing the location where you found it, select Copy items... and then select Add.
  5. You should now see the Speech Kit framework in your project, which you can expand to view the public headers.

Figure: Frameworks Required for Speech Kit

The Speech Kit framework provides one top-level header, which provides access to the complete API including classes and constants. You should import the Speech Kit header in all source files where you intend to use Speech Kit services:

#import <SpeechKit/SpeechKit.h>


You are now ready to start using recognition and text-to-speech services.

Speech Kit Errors

While using the Speech Kit framework, you will occasionally encounter errors. The framework defines a custom NSError domain, SKSpeechErrorDomain, which includes special error codes and messages to support your development and use. In all cases, errors have a valid localized description set, which may prove useful in development and, in some cases, may be presented to the user.

There are effectively two types of errors that can be expected in this framework.

  • The first type are service connection errors and include the SKServerConnectionError and SKServerRetryError codes. These errors indicate that there is some kind of failure in the connection with the speech server. The failure may be temporary, in which case retrying the query can resolve it. It may also be the result of an authorization failure or some other network problem.
  • The second type are speech processing errors and include the SKRecognizerError and SKVocalizerError codes. These errors indicate a problem with the speech request, ranging from a text format issue to an audio detection failure.

It is essential to always monitor for errors, as poor network signal conditions may generate errors even in a correctly implemented application. The application’s user interface needs to respond appropriately and elegantly to ensure a robust user experience.
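
The two error groups described above could be told apart with a small helper. The following is only a sketch: MyAppSpeechErrorIsRetryable is a hypothetical name and not part of Speech Kit, and it assumes that the SpeechKit header exposes SKSpeechErrorDomain as an NSString constant and the four codes named above as integer constants.

```objc
#import <Foundation/Foundation.h>
#import <SpeechKit/SpeechKit.h>

// Hypothetical helper, not part of Speech Kit.
// Returns YES when the error belongs to the service connection group,
// where the failure may be temporary and a retry may succeed.
static BOOL MyAppSpeechErrorIsRetryable(NSError *error) {
    // Only errors in the Speech Kit domain carry the SK... codes.
    if (error == nil || ![[error domain] isEqualToString:SKSpeechErrorDomain]) {
        return NO;
    }
    switch ([error code]) {
        case SKServerConnectionError:
        case SKServerRetryError:
            // Connection failure: possibly temporary, but it may also be an
            // authorization problem, so callers should cap the retry count.
            return YES;
        case SKRecognizerError:
        case SKVocalizerError:
        default:
            // Speech processing failure (text format, audio detection, ...).
            return NO;
    }
}
```

Wherever your code receives an NSError from the framework, it could call this helper to decide whether to offer a retry, and use [error localizedDescription] as the basis for the message shown to the user.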
