[Android]Nuance SREC native engine vs Google Voice Search engine

laiyangdeli

Blog categories: Android, TTS&ASR

http://dmfs.org/handsfree/?engines

http://dmfs.org/handsfree/?assistants

Many (if not most or even all) recent Android phones offer at least two speech recognition approaches. The first is the Nuance SREC package, which is included in the Android sources and hopefully on all Android devices. The second is Google Voice Search. If you haven't installed any voice recognition software, chances are good that your device uses Google Voice Search by default, but some users may have to install it themselves (look for Google Voice Search in the Market).

Note: Handsfree-Lite does not feature the native engine.

The following comparison does not claim to be complete or even true under all circumstances. It is just what I noticed during development and may help the user to decide which engine to use.

In this comparison I refer to the Nuance engine as the native engine and to the Google engine as the default engine, regardless of what is actually installed on your device. In each section below, the first paragraph refers to the native engine and the second to the default engine.

languages

As of Android 2.2 the native engine ships with the en-us locale only, so only English words are reliably recognized. You can, however, try to transcribe a word from your language using English pronunciation (e.g. 'unnamen' works amazingly well for the German word 'annehmen' to answer calls ;-) ).

Handsfree sets the default engine to your default language (the one you chose in your phone's settings). This should work with most languages, but please understand that I can't guarantee that any particular language is supported. You may try installing 3rd-party software to add support for your language.

offline recognition

The native engine is implemented on your device and does not need a working internet connection.

The default engine uses Google's computing power to recognize your spoken words, so it doesn't work without an internet connection. This can be a problem in areas with weak infrastructure. Also watch your data costs if you don't have a flat rate!

accuracy

The native engine is grammar based and will always return one of the commands you entered, even if you said nothing or something completely different. Handsfree accepts the result only if the engine is very confident that the words match. Still, under some circumstances (e.g. music, traffic noise, people talking, reverb or echoes) this engine may understand the wrong command, and Handsfree will take the wrong action. This can happen even when it is quiet if your pronunciation is not clear enough or too fast. If this happens to you too often, please contact me.
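The confidence check described above can be sketched in plain Java. This is only an illustration, not Handsfree's actual code: the class names GrammarResult and ConfidenceGate, the 0.0 to 1.0 confidence scale, and the threshold value are all assumptions made for the example.

```java
import java.util.Optional;

/** Hypothetical result of a grammar-based recognizer: it always names some command. */
final class GrammarResult {
    final String command;
    final double confidence; // assumed scale 0.0..1.0; a real engine may use another scale

    GrammarResult(String command, double confidence) {
        this.command = command;
        this.confidence = confidence;
    }
}

/** Accepts a recognized command only when the engine's confidence reaches a threshold. */
final class ConfidenceGate {
    private final double threshold;

    ConfidenceGate(double threshold) {
        this.threshold = threshold;
    }

    /** Returns the command if it is trusted, or an empty Optional if it should be ignored. */
    Optional<String> accept(GrammarResult result) {
        if (result == null || result.confidence < threshold) {
            return Optional.empty(); // too uncertain: take no action
        }
        return Optional.of(result.command);
    }

    public static void main(String[] args) {
        ConfidenceGate gate = new ConfidenceGate(0.8); // assumed threshold
        System.out.println(gate.accept(new GrammarResult("answer", 0.93)).isPresent());
        System.out.println(gate.accept(new GrammarResult("reject", 0.41)).isPresent());
    }
}
```

A higher threshold lowers the chance of a wrong action in noisy surroundings, at the price of ignoring more correctly spoken commands.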

The default engine is a free text engine, meaning it tries to recognize literally what you said. Under bad conditions (see the previous paragraph), and even under good conditions, it can understand a different word than the one you said. That's why Handsfree features assistants to learn all those alternatives (see assistants for details). The bright side is that the probability of false actions is very low (depending on the commands you've chosen).

speed

The native engine is pretty fast.

The default engine has to transmit some data over the internet and wait for the results. This takes some time.

user interface

The native engine has no user interface by default and does not show any annoying popups.

The default engine always opens a popup dialog (you know, the one with the microphone). While an incoming call is ringing, this dialog blocks the accept and reject buttons on the screen, so you first have to cancel the recognition dialog in order to manage your call manually. Even worse, if it didn't understand you (because you didn't say anything), it sometimes shows a dialog you have to acknowledge, which prevents Handsfree from starting another recognition.

recommendation

If you're comfortable with using English words, try the native engine. It does its job pretty well. If you get too many wrong results, use the default engine.
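The comparison and recommendation above can be condensed into a small decision function. This is a sketch of the reasoning only, not something Handsfree does automatically: the names Engine and EngineChooser and the three boolean inputs are hypothetical.

```java
/** The two engines compared above, using this article's naming. */
enum Engine { NATIVE, DEFAULT }

final class EngineChooser {
    /**
     * Picks an engine following the recommendation in the text.
     *
     * @param online              whether an internet connection is available
     * @param englishCommands     whether the user is comfortable with English command words
     * @param tooManyWrongResults whether the native engine has produced too many wrong actions
     */
    static Engine choose(boolean online, boolean englishCommands, boolean tooManyWrongResults) {
        if (!online) {
            return Engine.NATIVE; // the default engine cannot work without internet
        }
        if (englishCommands && !tooManyWrongResults) {
            return Engine.NATIVE; // fast, offline, no popup dialog
        }
        return Engine.DEFAULT; // other languages, or the native engine misfires too often
    }

    public static void main(String[] args) {
        System.out.println(choose(true, true, false));
        System.out.println(choose(true, false, false));
    }
}
```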

The default engine on most Android devices (see engines) is a free text engine. It tries to recognize every word you said exactly. Often this doesn't work. Reasons might be fast or unclear pronunciation, or music, traffic or any other noise in the background. The result is a variety of recognized words even though you always say the same command.

Handsfree features three assistants to learn all those words the recognition engine returns for your commands (see the screenshot below).

screenshot 5

To teach Handsfree your commands as well as possible, start one of the assistants and press the 'Learn command' button. Then say your command as if you were receiving a call. The assistant shows the recognized word and adds it to the list.

Repeat the procedure multiple times and under different conditions: inside, outside, in your car, on your bike, with soft music in the background, with and without a wired headset, in the morning, during a marathon and so on. You will see that you get a long list of words. The more you train Handsfree, the better the result during an actual call will be.

If you accidentally recorded something that wasn't meant as a command, you can always remove it from the list by pressing and holding it until a context menu pops up. This menu has an option to delete the word or even the whole list (in case you want to change your command).
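The learning workflow described above (collect every word the engine returns for a command, match later results against that list, delete single words or the whole list) can be sketched as a small data structure. This is a minimal illustration under assumptions: the class name CommandTrainer and its methods are hypothetical, and the case-insensitive exact matching is a choice made for the example, not a description of how Handsfree actually compares words.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical store of the words a free text engine returned while training each command. */
final class CommandTrainer {
    // command name -> every recognized variant learned for it, in insertion order
    private final Map<String, Set<String>> learned = new LinkedHashMap<>();

    private static String normalize(String text) {
        return text.trim().toLowerCase(Locale.ROOT);
    }

    /** Adds one recognized word to the list of the given command ("Learn command"). */
    void learn(String command, String recognized) {
        learned.computeIfAbsent(command, k -> new LinkedHashSet<>()).add(normalize(recognized));
    }

    /** Deletes a single word that was recorded by accident; returns true if it was present. */
    boolean forget(String command, String recognized) {
        Set<String> words = learned.get(command);
        return words != null && words.remove(normalize(recognized));
    }

    /** Deletes the whole list, e.g. before switching to a different spoken command. */
    void clear(String command) {
        learned.remove(command);
    }

    /** Returns the first command whose list contains the recognized word, if any. */
    Optional<String> match(String recognized) {
        String key = normalize(recognized);
        for (Map.Entry<String, Set<String>> entry : learned.entrySet()) {
            if (entry.getValue().contains(key)) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty(); // unknown word: take no action
    }

    public static void main(String[] args) {
        CommandTrainer trainer = new CommandTrainer();
        trainer.learn("accept", "answer");
        trainer.learn("accept", "Anser"); // a misrecognition heard during training
        System.out.println(trainer.match("anser").isPresent());
        System.out.println(trainer.match("hello").isPresent());
    }
}
```

Because an unknown word matches nothing, a misrecognition leads to no action rather than a wrong one, which is why longer training lists improve the hit rate without raising the risk of false actions much.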

If you get warnings about connection problems, wait a few seconds or minutes and try again.

2012-03-02 20:30