haoningabc
kaldi-wasm study notes (3): some build pitfalls


1. Required packages:
Copy kaldi and clapack-wasm into kaldi-wasm

cp kaldi_git.tar.gz kaldi-wasm
cp clapack-wasm.tar.gz kaldi-wasm

tar xvf kaldi_git.tar.gz -C kaldi-wasm
tar xvf clapack-wasm.tar.gz -C kaldi-wasm

cp openfst-1.6.7.tar.gz  kaldi-wasm/kaldi/tools


cd kaldi-wasm/kaldi
git log
commit 031fcb2baa1e4e050935d4d913d8b5070f975c7b (HEAD -> master, origin/master, origin/HEAD)
Author: Xiang Li <heibaidaolx123@gmail.com>
Date:   Wed Dec 2 14:07:16 2020 +0800

    [src] cudadecoder: fix bug of frame range checking in online spetral kernels (#4360)


cd kaldi-wasm
git log
commit ccdf531509098ae3eeaf19b708b7db64d01ec09c (HEAD -> master, origin/master, origin/HEAD)
Merge: e38c239 4a0950f
Author: HU Mathieu <mathieu.hu@inria.fr>
Date:   Wed Dec 9 09:13:21 2020 +0100

    Merge branch 'dev/update_kaldi' into 'master'

    Update kaldi with latest version

    See merge request kaldi.web/kaldi-wasm!19



Environment:
emcc --version
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 2.0.14 (8dd277d191daee9adfad03e5f0663df2db4b8bb1)
Copyright (C) 2014 the Emscripten authors (see AUTHORS.txt)
This is free and open source software under the MIT license.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

uname -a
Linux ali0227 5.4.0-65-generic #73-Ubuntu SMP Mon Jan 18 17:25:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux


Known issue:
The undefined popen symbol cannot be resolved. Give up on it and suppress the error with -s ERROR_ON_UNDEFINED_SYMBOLS=0



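Installation and required changes:
Key point: from ./install_kaldi.sh onward, everything has to be compiled with optimization level -O0

Steps:

1. Building CLAPACK needs no changes; the existing block in install.sh works as is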
echo "------------ Building CLAPACK ------------"
cd ./clapack-wasm
bash install_repo.sh emcc
cd $script_dir

2. Build kaldi/tools
On Ubuntu the shared libraries appear to cause problems, so shared has to be disabled

vim kaldi-wasm/kaldi/tools/Makefile
OPENFST_CONFIGURE ?= --enable-static --enable-shared --enable-far \
                     --enable-ngram-fsts --enable-lookahead-fsts --with-pic

change to
OPENFST_CONFIGURE ?= --enable-static --disable-shared --enable-far \
                     --enable-ngram-fsts --enable-lookahead-fsts --with-pic


This gets rid of the following -rpath warnings:
em++: warning: ignoring dynamic library libfstfar.so because not compiling to JS or HTML, remember to link it when compiling to JS or HTML at the end [-Wemcc]
em++: warning: ignoring dynamic library libfstscript.so because not compiling to JS or HTML, remember to link it when compiling to JS or HTML at the end [-Wemcc]
em++: warning: ignoring dynamic library libfst.so because not compiling to JS or HTML, remember to link it when compiling to JS or HTML at the end [-Wemcc]
em++: warning: linking a library with `-shared` will emit a static object file.  This is a form of emulation to support existing build systems.  If you want to build a runtime shared library use the SIDE_MODULE setting. [-Wemcc]
em++: warning: ignoring unsupported linker flag: `-rpath` [-Wlinkflags]
em++: warning: ignoring unsupported linker flag: `-rpath` [-Wlinkflags]
em++: warning: ignoring unsupported linker flag: `-rpath` [-Wlinkflags]
em++: warning: ignoring unsupported linker flag: `-rpath` [-Wlinkflags]
em++: warning: ignoring unsupported linker flag: `-soname` [-Wlinkflags]

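Thought: were those undefined-symbol problems also caused by the shared libraries? It looks that way: on Ubuntu 20.04 the undefined errors and the odd warnings are gone.

The build now passes on both Ubuntu and Mac. Ugh.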







In install.sh, the block
echo "----------- Building Openfst -----------"
cd ./kaldi/tools
emmake make CFLAGS="-O3" CXXFLAGS="-O3 -s USE_ZLIB=1" LDFLAGS=-O3 openfst
cd $script_dir

stays unchanged



3. Build Kaldi. This step needs changes

echo "------------ Building Kaldi ------------"
#./install_kaldi.sh $LAPACK_DIR

# Two flag sets are listed; the later assignments override the earlier ones.
# They differ only in whether _popen is exported.
CXXFLAGS="-O0 -U HAVE_EXECINFO_H -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -mavx  -msimd128"
LDFLAGS="-O0 -s ERROR_ON_UNDEFINED_SYMBOLS=0 -s EXPORTED_FUNCTIONS=['_popen','_main'] --bind"

CXXFLAGS="-O0 -U HAVE_EXECINFO_H -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -mavx  -msimd128"
LDFLAGS="-O0 -s ERROR_ON_UNDEFINED_SYMBOLS=0 -s EXPORTED_FUNCTIONS=['_main'] --bind"

# -O0 is required here


cd kaldi/src
CXXFLAGS="$CXXFLAGS" LDFLAGS="$LDFLAGS" emconfigure ./configure --use-cuda=no \
    --static --clapack-root=../../"$LAPACK_DIR" --host=WASM

sed -i -e 's:-pthread::g; s:-lpthread::g' kaldi.mk
#sed -i -e 's:-O1:-O0:g; s:DEBUG_LEVEL = 1:DEBUG_LEVEL = 2:g' kaldi.mk
sed -i -e 's:-O1:-O0:g; ' kaldi.mk

emmake make -j clean depend
#emmake make -j $(nproc) online2bin
emmake make  online2bin


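Five things to note:
(1) -O0 is required. After configure runs it generates kaldi-wasm/kaldi/src/kaldi.mk,
which holds most of the build flags, so use sed to change every -O1 in it to -O0.
(2) The flags -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -mavx  -msimd128 are needed; without them the browser reports errors.
(3) ERROR_ON_UNDEFINED_SYMBOLS=0: the popen and main problems apparently cannot be avoided, so this setting is there mainly to suppress the undefined errors for those two symbols.
(4) --bind is required. Without it, embind symbols such as _embind_register_class show up as undefined; adding the flag fixes that.
(5) If a large number of undefined symbols appear, do not rush to silence them with ERROR_ON_UNDEFINED_SYMBOLS. Shared libraries may be the cause, so try static libraries first. Do not rush to -s EXPORT_ALL=1 either, because the generated bundle becomes too large.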
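As a rough illustration of what the two sed commands on kaldi.mk do, the sketch below applies the same three substitutions in JavaScript to a made-up fragment. The function name patchKaldiMk and the sample lines are assumptions for illustration only; they are not taken from a real kaldi.mk or from kaldi-wasm.

```javascript
// Mirrors: sed -e 's:-pthread::g; s:-lpthread::g' and sed -e 's:-O1:-O0:g'
// patchKaldiMk is a hypothetical helper name.
function patchKaldiMk(text) {
  return text
    .replace(/-pthread/g, '')   // drop the compiler-level pthread switch
    .replace(/-lpthread/g, '')  // drop the pthread link library
    .replace(/-O1/g, '-O0');    // force optimization level 0
}

// Made-up sample lines, not copied from a real kaldi.mk.
const sample = 'CXXFLAGS = -O1 -pthread\nLDLIBS = -lm -lpthread -ldl\n';
const patched = patchKaldiMk(sample);
console.log(patched);

// A quick sanity check one could run after patching.
console.log('still contains -O1:', patched.includes('-O1'));
```

The substitutions leave stray spaces behind, just as sed does; make does not care about them.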



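4. Link the compiled Kaldi base components together to build the decoder, mainly online2-tcp-nnet3-decode-faster-reorganized.cc, producing kaldiJS.js and kaldiJS.wasm under
kaldi-wasm/src/computations

prepare_kaldi_wasm.sh
needs a change.
Oddly, the upstream script does
cp $PROGRAM $PROGRAM.bc
and then uses the copy directly. How is that supposed to work? Maybe there is some implicit logic I have not understood.

After looking at the build steps in the upstream CI, I changed it as follows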

#cp $PROGRAM $PROGRAM.bc
#em++ $EM_OPTS -o $WASM_NAME.js $PROGRAM.bc
Replace these two lines with

em++ -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -mavx -msimd128 \
    -s "EXPORTED_FUNCTIONS=['_popen','_main']" $EM_OPTS \
    online2-tcp-nnet3-decode-faster-reorganized.o \
    ../online2/kaldi-online2.a ../ivector/kaldi-ivector.a ../nnet3/kaldi-nnet3.a \
    ../chain/kaldi-chain.a ../nnet2/kaldi-nnet2.a ../cudamatrix/kaldi-cudamatrix.a \
    ../decoder/kaldi-decoder.a ../lat/kaldi-lat.a ../fstext/kaldi-fstext.a \
    ../hmm/kaldi-hmm.a ../feat/kaldi-feat.a ../transform/kaldi-transform.a \
    ../gmm/kaldi-gmm.a ../tree/kaldi-tree.a ../util/kaldi-util.a \
    ../matrix/kaldi-matrix.a ../base/kaldi-base.a \
    /opt/emscripten/kaldi-wasm/kaldi/tools/openfst-1.6.7/lib/libfst.a \
    /opt/emscripten/kaldi-wasm/clapack-wasm/CLAPACK-3.2.1/lapack.a \
    /opt/emscripten/kaldi-wasm/clapack-wasm/CLAPACK-3.2.1/libcblaswr.a \
    /opt/emscripten/kaldi-wasm/clapack-wasm/CBLAS/lib/cblas.a \
    /opt/emscripten/kaldi-wasm/clapack-wasm/f2c_BLAS-3.8.0/blas.a \
    /opt/emscripten/kaldi-wasm/clapack-wasm/libf2c/libf2c.a \
    -lm -ldl -o $WASM_NAME.js

or, with relative paths

em++ -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -mavx -msimd128 \
    -s "EXPORTED_FUNCTIONS=['_popen','_main']" $EM_OPTS \
    online2-tcp-nnet3-decode-faster-reorganized.o \
    ../online2/kaldi-online2.a ../ivector/kaldi-ivector.a ../nnet3/kaldi-nnet3.a \
    ../chain/kaldi-chain.a ../nnet2/kaldi-nnet2.a ../cudamatrix/kaldi-cudamatrix.a \
    ../decoder/kaldi-decoder.a ../lat/kaldi-lat.a ../fstext/kaldi-fstext.a \
    ../hmm/kaldi-hmm.a ../feat/kaldi-feat.a ../transform/kaldi-transform.a \
    ../gmm/kaldi-gmm.a ../tree/kaldi-tree.a ../util/kaldi-util.a \
    ../matrix/kaldi-matrix.a ../base/kaldi-base.a \
    ../../../kaldi/tools/openfst-1.6.7/lib/libfst.a \
    ../../../clapack-wasm/CLAPACK-3.2.1/lapack.a \
    ../../../clapack-wasm/CLAPACK-3.2.1/libcblaswr.a \
    ../../../clapack-wasm/CBLAS/lib/cblas.a \
    ../../../clapack-wasm/f2c_BLAS-3.8.0/blas.a \
    ../../../clapack-wasm/libf2c/libf2c.a \
    -lm -ldl -o $WASM_NAME.js




------------ Creating WASM module ------------
warning: undefined symbol: MAIN__ (referenced by top-level compiled C/C++ code)
warning: undefined symbol: popen (referenced by top-level compiled C/C++ code)

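Ignore these two. They are only warnings because -s ERROR_ON_UNDEFINED_SYMBOLS=0 is set; without it they would be errors. After a long search, popen simply is not supported, and things work fine once in the browser.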


5. Build the resampler JS, changing -O3 to -O0

build_other_wasm.sh

#emcc -O3 -s WASM=1 -s MODULARIZE=1 -s ENVIRONMENT='worker' -s BUILD_AS_WORKER=1 \

emcc -O0 -s WASM=1 -s MODULARIZE=1 -s ENVIRONMENT='worker' -s BUILD_AS_WORKER=1 \
     -s EXTRA_EXPORTED_RUNTIME_METHODS="['ccall']" \
     -s EXPORT_NAME='resampleTo16bint' \
     --post-js audio-resampler/em_src/resampleTo16bint_post.js \
     -I audio-resampler/src -o src/computations/resampleTo16bint.js \
     audio-resampler/em_src/resampleTo16bint.c audio-resampler/src/*.c


5. Put the model files in the right place, then run npm start

kaldi-wasm/dummy_serv/public/english_small.zip

The model archive is laid out like this
.
├── AUTHORS
├── conf
│   ├── ivector.conf
│   └── mfcc_hires.conf
├── english_small.zip
├── extractor
│   ├── final.dubm
│   ├── final.ie
│   ├── final.mat
│   ├── global_cmvn.stats
│   ├── online_cmvn.conf
│   └── splice_opts
├── final.mdl
├── graph
│   ├── HCLG.fst
│   └── words.txt
├── kaldi_config.json
├── LICENSE
└── README.md




6. Adjust the Node.js configuration so the site can also be reached from outside

vim webpack.config.js

module.exports = {
  devServer: {
    host: 'localhost',
    https: true,
    proxy: {
      '/models': {
        target: 'http://localhost:3000',
      },
    },
  },

Change localhost to the server's IP

vim package.json
"scripts": {
    "start": "(cd dummy_serv && node server.js) & webpack-dev-server --open",

change to
"scripts": {
"start": "(cd dummy_serv && node server.js) & webpack-dev-server --host 0.0.0.0 --open",


npm install
npm start

Open port 8080 over https in the browser

7. There is a sync/async problem, so the front-end code needs some changes


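kaldi-wasm/src/workers/resamplerWorker.js
Remove the onRuntimeInitialized code.
Make every method in helper async, following the changes in asrWorker.js: the resampleJS object has to finish loading first, so it must be accessed through then.
Add a new variable thisresampleMod to hold the object that resampleJS.then resolves to.

Change the last few lines to this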
onmessage = (msg) => {
  const { command } = msg.data;
  const response = { command, ok: true };
//-----add by hao for async function begin
if (command in helper) {
    helper[command](msg)
      .then((value) => { response.value = value; })
      .catch((e) => {
        response.ok = false;
        response.value = e;
      })
      .finally(() => { postMessage(response); });
  } else {
    response.ok = false;
    response.value = new Error(`Unknown command '${command}'`);
    postMessage(response);
  }
//-----add by hao for async function end

//  if (command in helper) response.value = helper[command](msg);
//  else {
//    response.ok = false;
//    response.value = new Error(`Unknown command '${command}'`);
//  }
//  postMessage(response);
};

//resampleMod.onRuntimeInitialized = () => {
//  resampleMod.init();
//  resample = resampleMod.resampleTo16bint;
//};
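The dispatch logic above can be tried outside a worker. The sketch below is a stand-alone variant that returns the response instead of calling postMessage, so it runs in plain Node.js. The names dispatch and demoHelper are made up for illustration and do not exist in kaldi-wasm.

```javascript
// Hypothetical stand-alone version of the onmessage logic.
async function dispatch(helper, msg) {
  const { command } = msg.data;
  const response = { command, ok: true };
  if (!(command in helper)) {
    response.ok = false;
    response.value = new Error(`Unknown command '${command}'`);
    return response;
  }
  try {
    // await accepts both promises and plain values
    response.value = await helper[command](msg);
  } catch (e) {
    response.ok = false;
    response.value = e;
  }
  return response;
}

// demoHelper is a stand-in; the real helper wraps the WASM module.
const demoHelper = {
  async echo(msg) { return msg.data.payload; },
  async fail() { throw new Error('boom'); },
};

dispatch(demoHelper, { data: { command: 'echo', payload: 42 } })
  .then((res) => console.log(res));
```

Using try/await instead of chaining .then on the helper's return value also tolerates a helper that is not async, which the .then version would not.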


Changes inside helper; this is where then is called
//-----add by hao: global variable holding the object resampleJS.then resolves to  begin
var thisresampleMod;
//-----add by hao: global variable holding the object resampleJS.then resolves to  end
const helper = {
  //setConversionRatio(msg) {
  async setConversionRatio(msg) {
//-----add by hao for translate resampleJS to resampleJS.then  begin
    await  resampleMod.then(
        function(result){
           thisresampleMod=result;
           thisresampleMod.init();
           resample = thisresampleMod.resampleTo16bint;
        }
    );
//-----add by hao for translate resampleJS to resampleJS.then  end
    outputInputSampleRateRatio = msg.data.conversionRatio;
    return outputInputSampleRateRatio;
  },
  //resample(msg) {
  async resample(msg) {
    return resample(msg.data.buffer, outputInputSampleRateRatio);
  },
  async reset() {
    //resampleMod.reset();
    thisresampleMod.reset();
    return '';
  },
  async terminate() {
    //resampleMod.terminate();
    thisresampleMod.terminate();
    close();
    return '';
  },
};
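One possible variation, sketched under assumptions: resolve the module once, run init once, and hand the same instance to every helper method, so that resample and reset do not depend on setConversionRatio having been called first. makeLazyModule, fakeFactory and getResampler are hypothetical names, and fakeFactory only imitates a promise-returning Emscripten factory.

```javascript
// Hypothetical helper: resolve a promise-returning factory once, run a
// one-time setup on the instance, and reuse that instance afterwards.
function makeLazyModule(factory, setup) {
  let pending = null;
  return function getModule() {
    if (pending === null) {
      pending = Promise.resolve(factory()).then((instance) => {
        setup(instance);
        return instance;
      });
    }
    return pending;
  };
}

// Stand-in for the real factory, only to make the sketch runnable.
let initCalls = 0;
const fakeFactory = () => Promise.resolve({
  init() { initCalls += 1; },
  resampleTo16bint(buffer, ratio) { return buffer.length * ratio; },
});

const getResampler = makeLazyModule(fakeFactory, (mod) => mod.init());

// Every helper method can start with: const mod = await getResampler();
getResampler().then((mod) => console.log(typeof mod.resampleTo16bint));
```

Caching the promise rather than the instance means concurrent first calls share one initialization.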




Next, modify asrWorker.js. Without the change it fails with something like:
Error: command "init" failed: TypeError: Cannot read property 'mkdir' of undefined
    at eval (workerWrapper.js:19)
    at Worker.handleMessage (workerWrapper.js:35)

This happens because kaldiJS is used without then, so the FS member has not been created on the object yet
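The failure can be reproduced in a few lines, assuming the factory returns a Promise (which the need for then suggests). fakeKaldiFactory, brokenInit and workingInit are made-up names; the fake FS only imitates the shape of the Emscripten file system object.

```javascript
// fakeKaldiFactory is a stand-in for the real factory; it is not kaldiJS.
const fakeKaldiFactory = () => Promise.resolve({
  FS: { mkdir(path) { return path; } },
});

const kaldiModule = fakeKaldiFactory();

// Wrong: kaldiModule is still a Promise, and a Promise has no FS member,
// so kaldiModule.FS is undefined and .mkdir throws a TypeError.
function brokenInit() {
  return kaldiModule.FS.mkdir('/models');
}

// Right: wait for the resolved instance, then use its FS.
async function workingInit() {
  const instance = await kaldiModule;
  return instance.FS.mkdir('/models');
}

try {
  brokenInit();
} catch (e) {
  console.log(e instanceof TypeError);
}
workingInit().then((path) => console.log(path));
```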

Attempted fix in asrWorker.js

Add var thisModule; to hold the object that kaldiJS.then resolves to, and change every use of kaldiModule to thisModule

//-----add by hao: global that stores the object kaldiJS.then resolves to  ---begin
var thisModule;
//-----add by hao: global that stores the object kaldiJS.then resolves to  ---end
async function loadToFS(modelName, zip) {
//-----add by hao for kaldiJS.then   ---begin
  await  kaldiModule.then(
      function(result){
           thisModule=result;
           initEMFS(thisModule.FS, modelName);
      }
  );
//-----add by hao for globle Promise.then return  ---end
//  initEMFS(kaldiModule.FS, modelName);
  const unzipped = await unzip(zip);

  // hack to wait for model saving on Emscripten fileSystem
  // unzipped.forEach does not allow to wait for end of async calls
  const files = Object.keys(unzipped.files);
  await Promise.all(files.map(async (file) => {
    const content = unzipped.file(file);
    if (content !== null) {
//     await writeToFileSystem(kaldiModule.FS, content.name, content);
        await writeToFileSystem(thisModule.FS, content.name, content);
    }
  }));
  return true;
}

The other place:


/*
* Assumes that we are in the directory with the requested model
*/
//function startASR() {
//  parser = new KaldiConfigParser(kaldiModule.FS, kaldiModule.FS.cwd());
//  const args = parser.createArgs();
//  const cppArgs = args.reduce((wasmArgs, arg) => {
//    wasmArgs.push_back(arg);
//    return wasmArgs;
//  }, new kaldiModule.StringList());
//  return new kaldiModule.OnlineASR(cppArgs);
//}
//-----modified by hao: startASR now uses the global thisModule (set in kaldiModule.then) instead of kaldiModule
function startASR() {
  parser = new KaldiConfigParser(thisModule.FS, thisModule.FS.cwd());
  const args = parser.createArgs();
  const cppArgs = args.reduce((wasmArgs, arg) => {
    wasmArgs.push_back(arg);
    return wasmArgs;
  }, new thisModule.StringList());
  return new thisModule.OnlineASR(cppArgs);
}

######### Build verified


Verified on Ubuntu 20.04