
(Untested) Speech recognition script for Asterisk

 

==============================================
    Speech recognition script for Asterisk
==============================================

This script uses Google's Speech API to convert speech to text and
return it to the dialplan as an Asterisk channel variable.

------------
Requirements
------------
Perl          The Perl Programming Language
perl-libwww   The World-Wide Web library for Perl
perl-libjson  Module for manipulating JSON-formatted data
IO-Socket-SSL Perl module that implements an interface to SSL sockets.
flac          Free Lossless Audio Codec

A Speech API key from Google.
Internet access in order to contact Google and get the speech data.

** Optional/Highly experimental **
speex         patent-free audio compression format designed for speech.
              works only with patched speex encoder that supports
              MIME "x-speex-with-header-byte"
              https://github.com/zaf/Speex-with-header-bytes

------------
Installation
------------
To install, copy speech-recog.agi to your agi-bin directory.
Usually this is /var/lib/asterisk/agi-bin/
To make sure, check the astagidir setting in your /etc/asterisk/asterisk.conf file.

-----
Usage
-----
agi(speech-recog.agi,[lang],[timeout],[intkey],[NOBEEP])
Records from the current channel until 2 seconds of silence are detected
(this can be changed with the 'timeout' argument, -1 for no timeout) or the
interrupt key (# by default) is pressed. If NOBEEP is set, no beep sound is played
back to the user to indicate the start of the recording.
The recorded sound is sent to Google's speech recognition service and the
returned text string is assigned as the value of the channel variable 'utterance'.
The script sets the following channel variables:

utterance  : The generated text string.
confidence : A value between 0 and 1 indicating the probability of a correct recognition.
             Values above 0.95 usually mean that the resulting text is correct.

In case of an unexpected error both these variables are set to '-1'.
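
As a sketch of how the optional arguments and the error value fit together (not taken
from the script's own documentation): the extension number 1240, the 5 second timeout,
the '*' interrupt key and the 'error' label are arbitrary choices for illustration, and
the timeout is assumed to be given in seconds, as the 2 second default suggests.

;Record without a beep, stop after 5 seconds of silence or when '*' is pressed
exten => 1240,1,Answer()
exten => 1240,n,agi(speech-recog.agi,en-US,5,*,NOBEEP)
;Both variables are set to '-1' when the script hit an unexpected error
exten => 1240,n,GotoIf($["${utterance}" = "-1"]?error)
exten => 1240,n,Verbose(1,Recognized ${utterance} with confidence ${confidence})
exten => 1240,n,Hangup()
exten => 1240,n(error),Verbose(1,Speech recognition failed)
exten => 1240,n,Hangup()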

--------
Examples
--------
Sample dialplan code for your extensions.conf:

;Simple speech recognition
exten => 1234,1,Answer()
exten => 1234,n,agi(speech-recog.agi,en-US)
exten => 1234,n,Verbose(1,The text you just said is: ${utterance})
exten => 1234,n,Verbose(1,The probability to be right is: ${confidence})
exten => 1234,n,Hangup()

;Speech recognition demo also using googletts.agi for text to speech synthesis:
exten => 1235,1,Answer()
exten => 1235,n,agi(googletts.agi,"Say something in English, when done press the pound key.",en)
exten => 1235,n(record),agi(speech-recog.agi,en-US)
exten => 1235,n,Verbose(1,Script returned: ${confidence} , ${utterance})

;Check the probability of a successful recognition:
exten => 1235,n,GotoIf($["${confidence}" > "0.8"]?playback:retry)

;Playback the text
exten => 1235,n(playback),agi(googletts.agi,"The text you just said was...",en)
exten => 1235,n,agi(googletts.agi,"${utterance}",en)
exten => 1235,n,goto(end)

;Retry in case speech recognition wasn't successful:
exten => 1235,n(retry),agi(googletts.agi,"Can you please repeat more clearly?",en)
exten => 1235,n,goto(record)

exten => 1235,n(fail),agi(googletts.agi,"Failed to get speech data.",en)
exten => 1235,n(end),Hangup()

;Voice dialing example
exten => 1236,1,Answer()
exten => 1236,n,agi(googletts.agi,"Please say the number you want to dial.",en)
exten => 1236,n(record),agi(speech-recog.agi,en-US)
exten => 1236,n,GotoIf($["${confidence}" > "0.8"]?success:retry)

exten => 1236,n(success),goto(${utterance},1)

exten => 1236,n(retry),agi(googletts.agi,"Can you please repeat?",en)
exten => 1236,n,goto(record)
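
Because the recognized text is passed straight to goto, it may be worth restricting it
to digits and checking that the target extension exists first. A minimal sketch, assuming
the standard Asterisk dialplan functions FILTER, LEN and DIALPLAN_EXISTS are available in
your version; extension 1237, the variable name 'number' and the labels are hypothetical.

;Voice dialing with a sanitized number
exten => 1237,1,Answer()
exten => 1237,n(record),agi(speech-recog.agi,en-US)
;Give up if the script reported an error
exten => 1237,n,GotoIf($["${utterance}" = "-1"]?fail)
;Keep only the digits of the recognized text
exten => 1237,n,Set(number=${FILTER(0-9,${utterance})})
;Ask again if no digits were recognized
exten => 1237,n,GotoIf($[${LEN(${number})} = 0]?record)
;Jump only if the extension exists in the current context
exten => 1237,n,GotoIf(${DIALPLAN_EXISTS(${CONTEXT},${number},1)}?dial:record)
exten => 1237,n(dial),goto(${number},1)
exten => 1237,n(fail),Hangup()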

In the wolfram folder you can find a sample AGI script that, in combination with speech-recog.agi,
sends queries to WolframAlpha and returns the answers as a dialplan variable. See wolfram/README for
details and dialplan examples.

-------------------
Supported Languages
-------------------
[['Afrikaans',       ['af-ZA']],
['Bahasa Indonesia',['id-ID']],
['Bahasa Melayu',   ['ms-MY']],
['Català',          ['ca-ES']],
['Čeština',         ['cs-CZ']],
['Deutsch',         ['de-DE']],
['English',         ['en-AU', 'Australia'],
                     ['en-CA', 'Canada'],
                     ['en-IN', 'India'],
                     ['en-NZ', 'New Zealand'],
                     ['en-ZA', 'South Africa'],
                     ['en-GB', 'United Kingdom'],
                     ['en-US', 'United States']],
['Español',         ['es-AR', 'Argentina'],
                     ['es-BO', 'Bolivia'],
                     ['es-CL', 'Chile'],
                     ['es-CO', 'Colombia'],
                     ['es-CR', 'Costa Rica'],
                     ['es-EC', 'Ecuador'],
                     ['es-SV', 'El Salvador'],
                     ['es-ES', 'España'],
                     ['es-US', 'Estados Unidos'],
                     ['es-GT', 'Guatemala'],
                     ['es-HN', 'Honduras'],
                     ['es-MX', 'México'],
                     ['es-NI', 'Nicaragua'],
                     ['es-PA', 'Panamá'],
                     ['es-PY', 'Paraguay'],
                     ['es-PE', 'Perú'],
                     ['es-PR', 'Puerto Rico'],
                     ['es-DO', 'República Dominicana'],
                     ['es-UY', 'Uruguay'],
                     ['es-VE', 'Venezuela']],
['Euskara',         ['eu-ES']],
['Français',        ['fr-FR']],
['Galego',          ['gl-ES']],
['Hrvatski',        ['hr-HR']],
['IsiZulu',         ['zu-ZA']],
['Íslenska',        ['is-IS']],
['Italiano',        ['it-IT', 'Italia'],
                     ['it-CH', 'Svizzera']],
['Magyar',          ['hu-HU']],
['Nederlands',      ['nl-NL']],
['Norsk bokmål',    ['nb-NO']],
['Polski',          ['pl-PL']],
['Português',       ['pt-BR', 'Brasil'],
                     ['pt-PT', 'Portugal']],
['Română',          ['ro-RO']],
['Slovenčina',      ['sk-SK']],
['Suomi',           ['fi-FI']],
['Svenska',         ['sv-SE']],
['Türkçe',          ['tr-TR']],
['български',       ['bg-BG']],
['Русский',         ['ru-RU']],
['Српски',          ['sr-RS']],
['한국어',            ['ko-KR']],
['中文',             ['cmn-Hans-CN', '普通话 (中国大陆)'],
                     ['cmn-Hans-HK', '普通话 (香港)'],
                     ['cmn-Hant-TW', '中文 (台灣)'],
                     ['yue-Hant-HK', '粵語 (香港)']],
['日本語',           ['ja-JP']],
['Lingua latīna',   ['la']]];

-----------------------
Security Considerations
-----------------------
This script contacts Google's servers in order to send the recorded voice data and get back
the resulting text. The script uses SSL by default to encrypt all the traffic between
your PBX and Google's servers so no third party can eavesdrop on your communication, but your
voice data will be available to Google under a not yet defined policy.

-------
License
-------
The speech-recog script for asterisk is distributed under the GNU General Public
License v2. See COPYING for details.

--------
Homepage
--------
http://zaf.github.com/asterisk-speech-recog/

 

 

Note: perl-libjson must be installed on the system. It can be installed from the attached libjson-perl.tar.gz archive.

1. Extract the archive:

    tar -zxvf libjson-perl.tar.gz

2. Build and install:

    perl Makefile.PL

    make

    make test

    make install
