- 浏览: 1501906 次
- 性别:
- 来自: 南京
文章分类
- 全部博客 (419)
- XMPP (19)
- Android (180)
- Java (59)
- Network (4)
- HTML5 (13)
- Eclipse (9)
- SCM (23)
- C/C++ (4)
- UML (4)
- Libjingle (15)
- Tools&Softwares (29)
- Linphone (5)
- Linux&UNIX (6)
- Windows (18)
- Google (10)
- MISC (3)
- SIP (6)
- SQLite (5)
- Security (4)
- Opensource (29)
- Online (2)
- 文章 (3)
- MemoryLeak (10)
- Decompile (5)
- Ruby (1)
- Image (1)
- Bat (4)
- TTS&ASR (28)
- Multimedia (1)
- iOS (20)
- Asciiflow - ASCII Flow Diagram Tool.htm (1)
- Networking (1)
- DLNA&UPnP (2)
- Chrome (2)
- CI (1)
- SmartHome (0)
- CloudComputing (1)
- NodeJS (3)
- MachineLearning (2)
最新评论
-
bzhao:
点赞123!
Windows的adb shell中使用vi不乱码方法及AdbPutty -
wahahachuang8:
我觉得这种东西自己开发太麻烦了,就别自己捣鼓了,找个第三方,方 ...
HTML5 WebSocket 技术介绍 -
obehavior:
view.setOnTouchListenerview是什么
[转]android 一直在最前面的浮动窗口效果 -
wutenghua:
[转]android 一直在最前面的浮动窗口效果 -
zee3.lin:
Sorry~~
When I build "call ...
Step by Step about How to Build libjingle 0.4
http://sensoryinc.com/blog/?p=494
An interesting blog post (from PC World) came out following Apple’s iPhone 4s intro with Siri. I think everyone knows what Siri is…it’s the Apple acquisition that has turned into a big part of the Apple user experience. Siri technology allows a user to not only search but control various aspects of a smartphone by voice in a “natural language” manner.
The blog post depicts a looming showdown between Sensory and Apple’s Siri. It is quite kind to Sensory, pointing out our near-flawless performance in noise and how TrulyHandsfree™ does not require button presses. While those points are true, Sensory is certainly NOT a competitor to Siri. We do partner with companies like Vlingo that might be considered a Siri competitor, but Sensory’s TrulyHandsfree is just the first part of a multi-stage process for creating a true Voice User Interface.
Here is the basic process:
It’s just that first step that Sensory does better than anyone else. However, it’s an important step that requires a few critical characteristics:
- Extremely fast response time. Since it basically competes with a button press, it has to have a similar or faster response time. Because TrulyHandsfree uses a probabilistic approach, it can respond without having to wait for the recognizer to determine if the word is even finished! Slow response times lead users to speak before the Step 2 recognizer is ready to listen, which is a major cause of failure.
- Low power consumption. If it’s always on and always listening, it can’t be a power hog. Sensory can perform wake-up triggers with as little as 15 MIPS, and has the ability to operate in the 1-10mA range on today’s smartphones.
-
Highly accurate with poor S/N ratios.
This means several things:
- Works in high noise. TrulyHandsfree Voice Control performs flawlessly in extremely loud environments, including music playing in the background or even outdoors in downtown Portland !
- Works without a microphone in close proximity. TrulyHandsfree is responsive even at distances of 20 feet (in a relatively quiet environment) and at arms length in noise. This is critical because many VUI based applications of the future will become commonplace in a wide variety of consumer electronics devices, and users won’t want to get up and walk over to their devices to control them.
Companies like Nuance, Vlingo, Google and Microsoft are pretty good at the second step, which is a more powerful (often cloud-based) recognition system.
The third step “Understanding Meaning” is what the original Siri was all about. This was an AI component developed under DARPA funding at SRI and later spun off and acquired by Apple. Apple is rumored to be using Nuance as the “Step 2” in Siri.
Vlingo does a really nice job of implementing Steps 1-3 (using Sensory as its partner for Step 1.) I’m sure Google, Microsoft, Apple and Nuance all have efforts underway in the area of AI and natural language understanding. It’s really not that different than what they have needed for text-based “meaning” recognition during traditional searches.
The SEARCH in Step 4 is done via typical search engines (Google, Microsoft, Apple) and I’d guess Vlingo and other independent players (are there any still around???) have developed partnerships in these areas.
Step 5 is basically a good quality TTS engine. Providers like Nuance, Ivona, ATT, NeoSpeech, and Acapella all have nice TTS engines, and I believe Apple, Microsoft and Google all have in-house solutions as well!
The important point in comparing Sensory’s technology is that we provide the logical entryway to a successful Voice User Interface experience–with a lightning-fast voice trigger that replaces tactile button presses. It is a given that noise immunity and extremely high accuracy are also required, and Trulyhandsfree accomplishes this without requiring a prohibitive amount of power to function reliably and consistently.
AND…while we appreciate the comparison to the most profitable company on the planet, we’d like to focus on what we do better…making Truly Hands-Free really mean Trulyhandsfree™.
发表评论
-
Voice detection for Android
2012-07-23 11:39 2341Here it is, my fist JAVA applic ... -
Google hired one of Nuance soft engineers to help work around all Nuance patents
2012-07-10 14:33 1096很有趣的消息: http://forums.macrumor ... -
The Voice Browser Working Group
2012-07-04 14:38 1976http://www.w3.org/Voice/ ... -
Nuance网站
2012-07-04 14:19 1300http://www.nuance.com/ http: ... -
Nuance HTTP Services
2012-07-03 13:57 977http://dragonmobile.nuancemobil ... -
Nuance - Dragon Mobile SDK - Speech Kit Library Guide (for Android)
2012-07-03 13:09 6507Speech Kit Library Gu ... -
Nuance - Dragon Mobile SDK - Speech Kit
2012-07-02 15:57 1414http://dragonmobile.nuancemobil ... -
Nuance’s Dragon ID Lets You Unlock Your Smartphone Or Tablet By Talking To It
2012-07-02 11:22 1144http://techcrunch.com/2012/06/0 ... -
Android 4.1 Jelly Bean adds Offline Voice Typing
2012-06-28 14:38 1407Google has added offline vo ... -
The http request header of Vlingo request
2012-05-22 21:48 1172Cache-Control no-cache,no-store ... -
三星已经禁止运行在其他手机上的S Voice应用访问服务器了
2012-05-22 09:45 1271S Voice刚被破解不久,三星就采取行动,禁止运行在其他手机 ... -
三星的S Voice应用
2012-05-21 14:58 1086三星的S Voice应用原来不是自己的技术,应该一点自己的技术 ... -
Samsung S Voice
2012-05-21 12:52 991三星Galaxy S III的S Voice应用已经被提取出来 ... -
The response from Vlingo
2012-05-14 16:53 1027<?xml version="1.0" ... -
eyes-free - Speech Enabled Eyes-Free Android Applications
2012-04-06 14:01 1121http://code.google.com/p/eyes-f ... -
Biometric Identification (生物特征识别)
2012-03-27 14:58 1253What is Biometric Identificat ... -
详解wave头格式(尽可能详细并附代码)
2012-03-25 21:43 14671参考网址一:http://blog.csdn.net/sshc ... -
关于数字音频处理的一些常识
2012-03-23 10:25 1308数字音频处理技术http://apps.hi.baidu.co ... -
[AndroidTips]调用TextToSpeech朗读的时候如何中间停顿
2012-03-21 23:27 2845TTS在句子中间会停顿,你也可以通过在任何字符串中加点&quo ... -
The speech energy endpointer implementation from Chrome
2012-03-14 19:26 1160http://src.chromium.org/svn/tru ...
相关推荐
这一特性得益于Sensory的TrulyHandsFree语音控制和识别软件,该软件提供了语音搜索、自定义声控命令、说话人验证和身份识别,且支持多种语言。 DA7322和DA7323的波束成形技术允许麦克风在不同位置灵活布置,适应端...
ta_lib-0.5.1-cp312-cp312-win32.whl
课程设计 在线实时的斗兽棋游戏,时间赶,粗暴的使用jQuery + websoket 实现实时H5对战游戏 + java.zip课程设计
ta_lib-0.5.1-cp310-cp310-win_amd64.whl
基于springboot+vue物流系统源码数据库文档.zip
GEE训练教程——Landsat5、8和Sentinel-2、DEM和各2哦想指数下载
知识图谱
333498005787635解决keil下载失败的文件.zip
【微信机器人原理与实现】 微信机器人是通过模拟微信客户端的行为,自动处理消息、发送消息的程序。在Python中实现微信机器人的主要库是WeChatBot,它提供了丰富的接口,允许开发者方便地进行微信消息的接收与发送。这个项目标题中的"基于python实现的微信机器人源码"指的是使用Python编程语言编写的微信机器人程序。 1. **Python基础**:Python是一种高级编程语言,以其简洁的语法和强大的功能深受开发者喜爱。在实现微信机器人时,你需要熟悉Python的基本语法、数据类型、函数、类以及异常处理等概念。 2. **微信API与WeChatBot库**:微信为开发者提供了微信公共平台和微信开放平台,可以获取到必要的API来实现机器人功能。WeChatBot库是Python中一个用于微信开发的第三方库,它封装了微信的API,简化了消息处理的流程。使用WeChatBot,开发者可以快速搭建起一个微信机器人。 3. **微信OAuth2.0授权**:为了能够接入微信,首先需要通过OAuth2.0协议获取用户的授权。用户授权后,机器人可以获取到微信用户的身份信息,从而进行
基于springboot实验室研究生信息管理系统源码数据库文档.zip
张力控制,色标跟踪,多轴同步,电子凸轮,横切等工艺控制案例。
在Python编程环境中,处理Microsoft Word文档是一项常见的任务。Python提供了几个库来实现这一目标,如`python-docx`,它可以让我们创建、修改和操作.docx文件。本教程将重点介绍如何利用Python进行Word文档的合并、格式转换以及转换为PDF。 1. **合并Word文档(merge4docx)** 合并多个Word文档是一项实用的功能,特别是在处理大量报告或文档集合时。在Python中,可以使用`python-docx`库实现。我们需要导入`docx`模块,然后读取每个文档并将其内容插入到主文档中。以下是一个基本示例: ```python from docx import Document def merge4docx(file_list, output_file): main_doc = Document() for file in file_list: doc = Document(file) for paragraph in doc.paragraphs: main_doc.add_paragraph(paragraph.text) m
基于springboot+Javaweb的二手图书交易系统源码数据库文档.zip
基于springboot餐品美食论坛源码数据库文档.zip
基于springboot亚运会志愿者管理系统源码数据库文档.zip
使用WPF的数据样式绑定,切换对象数据值来完成控件动态切换背景渐变动画效果。 使用动画样式渲染比线程修改性能消耗更低更稳定
基于SpringBoot的企业客源关系管理系统源码数据库文档.zip
基于springboot+vue的桂林旅游网站系统源码数据库文档.zip
基于springboot嗨玩旅游网站源码数据库文档.zip
基于springboot的流浪动物管理系统源码数据库文档.zip