http://sensoryinc.com/blog/?p=494
An interesting blog post (from PC World) came out following Apple’s introduction of the iPhone 4S with Siri. I think everyone knows what Siri is: it’s the Apple acquisition that has turned into a big part of the Apple user experience. Siri technology allows a user not only to search but also to control various aspects of a smartphone by voice in a “natural language” manner.
The blog post depicts a looming showdown between Sensory and Apple’s Siri. It is quite kind to Sensory, pointing out our near-flawless performance in noise and how TrulyHandsfree™ does not require button presses. While those points are true, Sensory is certainly NOT a competitor to Siri. We do partner with companies like Vlingo that might be considered a Siri competitor, but Sensory’s TrulyHandsfree is just the first part of a multi-stage process for creating a true Voice User Interface.
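Here is the basic process:
- Step 1: Voice trigger (wake-up)
- Step 2: Speech recognition (often cloud-based)
- Step 3: Understanding meaning
- Step 4: Search
- Step 5: Text-to-speech (TTS) response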
It’s just that first step that Sensory does better than anyone else. However, it’s an important step that requires a few critical characteristics:
- Extremely fast response time. Since it basically competes with a button press, it has to have a similar or faster response time. Because TrulyHandsfree uses a probabilistic approach, it can respond without having to wait for the recognizer to determine if the word is even finished! Slow response times lead users to speak before the Step 2 recognizer is ready to listen, which is a major cause of failure.
- Low power consumption. If it’s always on and always listening, it can’t be a power hog. Sensory can perform wake-up triggers with as little as 15 MIPS, and has the ability to operate in the 1-10mA range on today’s smartphones.
- Highly accurate with poor S/N ratios. This means several things:
  - Works in high noise. TrulyHandsfree Voice Control performs flawlessly in extremely loud environments, including music playing in the background or even outdoors in downtown Portland!
  - Works without a microphone in close proximity. TrulyHandsfree is responsive even at distances of 20 feet (in a relatively quiet environment) and at arm’s length in noise. This is critical because many VUI-based applications of the future will become commonplace in a wide variety of consumer electronics devices, and users won’t want to get up and walk over to their devices to control them.
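To make the "respond before the word is finished" idea concrete, here is a minimal sketch of an early-firing trigger. It is not Sensory's algorithm, which is not described here. It assumes a hypothetical upstream model that emits one wake-word probability per audio frame, and the function name, threshold, and smoothing factor are all made up for illustration.

```python
from typing import Iterable, Optional


def first_trigger_frame(frame_scores: Iterable[float],
                        threshold: float = 0.8,
                        smoothing: float = 0.5) -> Optional[int]:
    """Return the index of the first frame at which the trigger fires.

    frame_scores: hypothetical per-frame wake-word probabilities (0..1).
    Returns None if the smoothed score never reaches the threshold.
    """
    smoothed = 0.0
    for index, score in enumerate(frame_scores):
        # Exponential moving average damps single-frame spikes caused by noise.
        smoothed = smoothing * smoothed + (1.0 - smoothing) * score
        if smoothed >= threshold:
            # Fire right away; the remaining frames are never examined.
            return index
    return None


# Illustrative scores: the trigger fires before the last frames arrive.
scores = [0.1, 0.2, 0.9, 0.95, 0.97, 0.99, 0.2]
trigger_index = first_trigger_frame(scores)
```

The point of the sketch is only that the decision is made on accumulated evidence, so the detector can hand over to the next stage while the user is still finishing the phrase.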
Companies like Nuance, Vlingo, Google and Microsoft are pretty good at the second step, which is a more powerful (often cloud-based) recognition system.
The third step “Understanding Meaning” is what the original Siri was all about. This was an AI component developed under DARPA funding at SRI and later spun off and acquired by Apple. Apple is rumored to be using Nuance as the “Step 2” in Siri.
Vlingo does a really nice job of implementing Steps 1-3 (using Sensory as its partner for Step 1). I’m sure Google, Microsoft, Apple and Nuance all have efforts underway in the area of AI and natural language understanding. It’s really not that different from what they have needed for text-based “meaning” recognition during traditional searches.
The SEARCH in Step 4 is done via typical search engines (Google, Microsoft, Apple) and I’d guess Vlingo and other independent players (are there any still around???) have developed partnerships in these areas.
Step 5 is basically a good-quality TTS engine. Providers like Nuance, Ivona, AT&T, NeoSpeech, and Acapela all have nice TTS engines, and I believe Apple, Microsoft and Google all have in-house solutions as well!
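The five steps described above can be read as a simple chain in which each stage feeds the next. The sketch below is only an illustration of that structure: the class name, field names, and the stand-in lambdas are hypothetical and do not correspond to any vendor's API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class VoicePipeline:
    """Hypothetical five-stage voice user interface pipeline."""
    detect_trigger: Callable[[bytes], bool]   # Step 1: wake-up trigger
    recognize: Callable[[bytes], str]         # Step 2: speech recognition
    understand: Callable[[str], dict]         # Step 3: understanding meaning
    search: Callable[[dict], str]             # Step 4: search
    speak: Callable[[str], bytes]             # Step 5: text-to-speech

    def handle(self, audio: bytes) -> Optional[bytes]:
        # Nothing downstream runs unless the trigger fires first.
        if not self.detect_trigger(audio):
            return None
        text = self.recognize(audio)
        intent = self.understand(text)
        answer = self.search(intent)
        return self.speak(answer)


# Stand-in stages so the sketch runs end to end; real stages would call
# a trigger engine, a recognizer, an NLU component, a search backend, and TTS.
demo = VoicePipeline(
    detect_trigger=lambda audio: audio.startswith(b"WAKE"),
    recognize=lambda audio: "weather in portland",
    understand=lambda text: {"action": "search", "query": text},
    search=lambda intent: "results for " + intent["query"],
    speak=lambda answer: answer.encode("utf-8"),
)

reply = demo.handle(b"WAKE what is the weather in portland")
ignored = demo.handle(b"background noise")
```

Because every later stage is gated on Step 1, a slow or inaccurate trigger degrades the whole chain, which is why the first step matters so much.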
The important point in comparing Sensory’s technology is that we provide the logical entryway to a successful Voice User Interface experience, with a lightning-fast voice trigger that replaces tactile button presses. It is a given that noise immunity and extremely high accuracy are also required, and TrulyHandsfree accomplishes this without requiring a prohibitive amount of power to function reliably and consistently.
AND…while we appreciate the comparison to the most profitable company on the planet, we’d like to focus on what we do better…making Truly Hands-Free really mean TrulyHandsfree™.