http://sensoryinc.com/blog/?p=494
An interesting blog post (from PC World) came out following Apple’s iPhone 4S intro with Siri. I think everyone knows what Siri is…it’s the Apple acquisition that has turned into a big part of the Apple user experience. Siri technology allows a user to not only search but also control various aspects of a smartphone by voice in a “natural language” manner.
The blog post depicts a looming showdown between Sensory and Apple’s Siri. It is quite kind to Sensory, pointing out our near-flawless performance in noise and how TrulyHandsfree™ does not require button presses. While those points are true, Sensory is certainly NOT a competitor to Siri. We do partner with companies like Vlingo that might be considered a Siri competitor, but Sensory’s TrulyHandsfree is just the first part of a multi-stage process for creating a true Voice User Interface.
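Here is the basic process:
- Step 1: Voice trigger (wake-up phrase detection)
- Step 2: Speech recognition (more powerful, often cloud-based)
- Step 3: Understanding meaning
- Step 4: Search
- Step 5: Text-to-speech response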
It’s just that first step that Sensory does better than anyone else. However, it’s an important step that requires a few critical characteristics:
- Extremely fast response time. Since it basically competes with a button press, it has to have a similar or faster response time. Because TrulyHandsfree uses a probabilistic approach, it can respond without having to wait for the recognizer to determine if the word is even finished! Slow response times lead users to speak before the Step 2 recognizer is ready to listen, which is a major cause of failure.
- Low power consumption. If it’s always on and always listening, it can’t be a power hog. Sensory can perform wake-up triggers with as little as 15 MIPS, and has the ability to operate in the 1-10 mA range on today’s smartphones.
- Highly accurate with poor S/N ratios. This means several things:
  - Works in high noise. TrulyHandsfree Voice Control performs flawlessly in extremely loud environments, including music playing in the background or even outdoors in downtown Portland!
  - Works without a microphone in close proximity. TrulyHandsfree is responsive even at distances of 20 feet (in a relatively quiet environment) and at arm’s length in noise. This is critical because many VUI-based applications of the future will become commonplace in a wide variety of consumer electronics devices, and users won’t want to get up and walk over to their devices to control them.
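To make the fast-response point concrete, here is a minimal Python sketch of a trigger that fires as soon as a running confidence crosses a threshold, rather than waiting for the end of the word. This is only an illustration and not Sensory's actual algorithm: the function name `first_trigger_frame`, the per-frame scores, the threshold and the smoothing factor are all assumptions made for the example.

```python
from typing import Iterable, Optional


def first_trigger_frame(frame_scores: Iterable[float],
                        threshold: float = 0.8,
                        smoothing: float = 0.5) -> Optional[int]:
    """Return the index of the first frame where the smoothed score
    reaches the threshold, or None if it never does.

    frame_scores are hypothetical per-frame probabilities (0.0 to 1.0)
    that the wake-up phrase is being spoken.
    """
    smoothed = 0.0
    for index, score in enumerate(frame_scores):
        # Exponential smoothing so a single noisy frame cannot fire the trigger.
        smoothed = smoothing * smoothed + (1.0 - smoothing) * score
        if smoothed >= threshold:
            # Respond immediately; later frames are never inspected.
            return index
    return None


# Hypothetical scores: the phrase is still in progress when the trigger fires.
scores = [0.1, 0.2, 0.9, 0.95, 0.97, 0.99, 0.3]
print(first_trigger_frame(scores))  # fires at frame 4 of 7
```

The smoothing step is one simple way to trade a little latency for fewer false triggers in noise; a real detector would tune both values against recorded audio.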
Companies like Nuance, Vlingo, Google and Microsoft are pretty good at the second step, which is a more powerful (often cloud-based) recognition system.
The third step “Understanding Meaning” is what the original Siri was all about. This was an AI component developed under DARPA funding at SRI and later spun off and acquired by Apple. Apple is rumored to be using Nuance as the “Step 2” in Siri.
Vlingo does a really nice job of implementing Steps 1-3 (using Sensory as its partner for Step 1). I’m sure Google, Microsoft, Apple and Nuance all have efforts underway in the area of AI and natural language understanding. It’s really not that different from what they have needed for text-based “meaning” recognition during traditional searches.
The SEARCH in Step 4 is done via typical search engines (Google, Microsoft, Apple) and I’d guess Vlingo and other independent players (are there any still around???) have developed partnerships in these areas.
Step 5 is basically a good-quality TTS engine. Providers like Nuance, Ivona, AT&T, NeoSpeech, and Acapela all have nice TTS engines, and I believe Apple, Microsoft and Google all have in-house solutions as well!
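Steps 1-5 can be pictured as a simple chain in which each stage hands its output to the next, and nothing runs until the trigger fires. The Python sketch below is only a rough illustration of that flow: `VoicePipeline` and its five callables are hypothetical names, and each stage stands in for whichever vendor's engine would fill that slot.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class VoicePipeline:
    """Hypothetical five-stage voice user interface; each stage is pluggable."""
    detect_trigger: Callable[[bytes], bool]  # Step 1: wake-up trigger
    recognize: Callable[[bytes], str]        # Step 2: speech recognition
    understand: Callable[[str], dict]        # Step 3: understanding meaning
    search: Callable[[dict], str]            # Step 4: search
    speak: Callable[[str], bytes]            # Step 5: text-to-speech

    def handle(self, trigger_audio: bytes,
               request_audio: bytes) -> Optional[bytes]:
        # Without a trigger, none of the heavier stages are invoked.
        if not self.detect_trigger(trigger_audio):
            return None
        text = self.recognize(request_audio)
        intent = self.understand(text)
        answer = self.search(intent)
        return self.speak(answer)
```

Keeping the stages as separate callables mirrors the point that different companies can supply different steps, with the trigger acting as the gate in front of everything else.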
The important point in comparing Sensory’s technology is that we provide the logical entryway to a successful Voice User Interface experience, with a lightning-fast voice trigger that replaces tactile button presses. It is a given that noise immunity and extremely high accuracy are also required, and TrulyHandsfree accomplishes this without requiring a prohibitive amount of power to function reliably and consistently.
AND…while we appreciate the comparison to the most profitable company on the planet, we’d like to focus on what we do better…making Truly Hands-Free really mean TrulyHandsfree™.