Creating a Language Model for use with pocketsphinx-sonic-server

Follow the steps below to create your own language model for use with pocketsphinx-sonic-server.

  • First, you need the "jasr" tool. It includes Java code that does some preprocessing, wrapper code that makes it easy to trigger the creation of Language-Models from Java, and two external libraries that actually produce the .arpa (Language-Model) files. jasr depends on djutil, which is under the DNLG folder.
  • To build the Language-Models themselves, use either 'cmuslm' or 'srilm', both of which are included in the jasr folder. Note that 'cmuslm' currently works only on Linux, while 'srilm' works on Linux and Windows XP.

Run the following commands:

  1. cd jasr/bin
  2. ant
  3. cd srilm
  4. ./create-lm.bat <corpus.txt> <output.arpa>

Here, "corpus.txt" is a text file with one line of text per entry; together these lines contain the words that make up the vocabulary for your domain. "output.arpa" is the name of the output file, i.e. the Language-Model.
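As a rough illustration of the two files involved, the sketch below writes a corpus with one cleaned-up sentence per line and reads the n-gram counts back from the header of a generated .arpa file as a sanity check. It is not part of jasr: the helper names write_corpus and arpa_ngram_counts are hypothetical, the whitespace clean-up is an assumption about what a tidy corpus looks like, and the header parsing assumes the usual ARPA layout (a \data\ line followed by "ngram N=count" lines).

```python
import re
import tempfile
from pathlib import Path


def write_corpus(sentences, path):
    """Write one cleaned sentence per line; return the number of lines written."""
    lines = []
    for sentence in sentences:
        # Collapse runs of whitespace and skip entries that end up empty.
        cleaned = " ".join(sentence.split())
        if cleaned:
            lines.append(cleaned)
    Path(path).write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return len(lines)


def arpa_ngram_counts(path):
    """Return {order: count} read from the \\data\\ header of an ARPA file."""
    counts = {}
    in_header = False
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if line == "\\data\\":
                in_header = True
                continue
            if not in_header:
                continue
            match = re.match(r"ngram\s+(\d+)\s*=\s*(\d+)$", line)
            if match:
                counts[int(match.group(1))] = int(match.group(2))
            elif line.startswith("\\"):
                # The first "\N-grams:" section marks the end of the header.
                break
    return counts


if __name__ == "__main__":
    # Hypothetical domain sentences, written to a temporary corpus.txt.
    workdir = Path(tempfile.mkdtemp())
    corpus = workdir / "corpus.txt"
    written = write_corpus(["turn on the light", "turn off the light"], corpus)
    print(f"wrote {written} lines to {corpus}")
```

After running create-lm on the corpus, calling arpa_ngram_counts("output.arpa") should return a non-empty dictionary if the file has a well-formed header; an empty result suggests the Language-Model was not generated correctly.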

...