Creating a Language Model for Use with pocketsphinx-sonic-server
Follow the steps below to create your own language model for use with pocketsphinx-sonic-server.
When building the language model, "corpus.txt" is a text file whose individual lines of text contain the words that make up the vocabulary for your domain, and "output.arpa" is the name of the output file, i.e. the language model.
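As a rough illustration of what preparing such a corpus file might involve, here is a minimal Python sketch that writes one utterance per line. The function names, the punctuation stripping, and the sample sentences are assumptions made for illustration; they are not requirements of pocketsphinx-sonic-server or of any particular language model tool. Word casing is left untouched, since it should match whatever convention your pronunciation dictionary uses.

```python
import re


def normalize_line(text: str) -> str:
    """Replace anything other than letters, digits, apostrophes and spaces
    with a space, then collapse runs of whitespace (illustrative rule only)."""
    cleaned = re.sub(r"[^A-Za-z0-9' ]+", " ", text)
    return " ".join(cleaned.split())


def write_corpus(sentences, path="corpus.txt"):
    """Write one normalized utterance per line and return the lines written.

    Lines that are empty after normalization are skipped.
    """
    lines = [normalize_line(s) for s in sentences]
    lines = [line for line in lines if line]
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(line + "\n" for line in lines)
    return lines


# Hypothetical usage with made-up domain sentences:
# write_corpus(["Hello, how are you?", "What's your name?"])
```

The resulting corpus.txt then serves as the input from which output.arpa is built.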
- Once this is done, copy the newly created .arpa file into your pocketsphinx-sonic-server folder under core/.
- Edit the "cfg" file in that folder so that it points to the newly created language model. To do so, change the file path that follows '-lm' in the "cfg" file.
- The same "cfg" file also specifies the dictionary being used (-dict) and the acoustic model being used (-hmm).
- By default, the Virtual Human Toolkit uses the Wall Street Journal acoustic model that comes with pocketsphinx and the CMU pronunciation dictionary. You can change either setting to use your own.
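
The "cfg" edit described in the steps above can also be scripted. The sketch below assumes the "cfg" file is plain text with one option per line in the form "-flag value" (for example "-lm some/path"); that layout, the function names, and the sample paths are assumptions for illustration, so check your actual file before relying on it.

```python
def set_cfg_option(cfg_text: str, flag: str, value: str) -> str:
    """Return cfg_text with the value of `flag` replaced.

    If the flag is not present, it is appended as a new line.
    Assumes one "-flag value" pair per line (an assumption about the format).
    """
    out = []
    found = False
    for line in cfg_text.splitlines():
        parts = line.split(None, 1)
        if parts and parts[0] == flag:
            out.append(f"{flag} {value}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{flag} {value}")
    return "\n".join(out) + "\n"


def read_cfg_options(cfg_text: str) -> dict:
    """Collect "-flag value" lines into a dict, e.g. to inspect -lm, -dict, -hmm."""
    options = {}
    for line in cfg_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].startswith("-"):
            options[parts[0]] = parts[1].strip()
    return options


# Hypothetical usage; "cfg" is read and written in the current directory:
# with open("cfg", encoding="utf-8") as f:
#     text = f.read()
# with open("cfg", "w", encoding="utf-8") as f:
#     f.write(set_cfg_option(text, "-lm", "output.arpa"))
```

Keeping the rewrite as a pure function on the file's text makes it easy to check the result with read_cfg_options before overwriting the real file. The same helper could be used for '-dict' or '-hmm' if you switch to your own dictionary or acoustic model.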