Linux Voice Recognition - How to implement

ilovebash

Implementing Linux Voice Recognition

Implementing voice recognition on Linux may sound like a daunting task, and there are real difficulties: achieving 100% transcription accuracy is almost impossible. However, even partial success in converting an audio file to text can be a very useful tool for any business.

Business Uses for linux voice recognition

I came across this whole idea a week or so ago. My goal was to transcribe the audio from recordings so that I could then search a database of the transcribed audio for specific keywords.

Business uses could include :

– Searching for mentions of required text
– Analysis of call centre staff ensuring compliance with their call scripts
– Raising alarm bells about customers who may be abusive to staff

The list is actually quite long; those are just a few that come to mind right now.
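As a rough illustration of the keyword search idea, here is a minimal sketch. It assumes each recording's transcript has been saved as a plain-text file somewhere under one directory; the function name and the directory layout are my own placeholders, not part of any tool.

```shell
# List every transcript file under a directory that mentions a keyword.
# $1 = keyword (matched case-insensitively), $2 = directory of transcripts
search_transcripts() {
    # -r: search the directory recursively
    # -i: ignore case
    # -l: print only the names of matching files
    grep -ril -- "$1" "$2"
}

# Example (hypothetical path):
#   search_transcripts "refund" /var/transcripts
```

A real setup would probably load the text into a proper database, but plain files and grep are enough to experiment with.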

As I said already, the obstacles to 100% success include the clarity of the audio, different people's accents, and words that the recognition tool cannot find in its dictionary.

Anyway, accepting all of the above potential problems, it's still fun to try! So here is my method for installing pocketsphinx on CentOS 6 for you to experiment with.

Install voice recognition on Linux using pocketsphinx on CentOS 6

Just so you know, I installed this on an almost ‘clean’ fresh minimal install of CentOS 6 with just the base CentOS repositories configured, as you can see below:

[root@localhost tmp]# yum repolist
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: centos.mirror.srv.co.ge
 * extras: centos.mirror.srv.co.ge
 * updates: centos.mirror.srv.co.ge
base                 | 3.7 kB     00:00
extras               | 3.4 kB     00:00
extras/primary_db    |  33 kB     00:00
updates              | 3.4 kB     00:00
repo id       repo name               status
base          CentOS-6 - Base         6,575
extras        CentOS-6 - Extras          45
updates       CentOS-6 - Updates        652
repolist: 7,272
[root@localhost tmp]#

The program I am installing is called pocketsphinx. They have a website which I will list later in this post, but for now let's just get it installed and working!

We need the base package (sphinxbase) as well as pocketsphinx itself. I generally put these into /usr/local/src as an area where they can be safely compiled. So download and extract the packages as follows:


cd /usr/local/src
wget -O pocketsphinx-5prealpha.tar.gz "http://downloads.sourceforge.net/project/cmusphinx/pocketsphinx/5prealpha/pocketsphinx-5prealpha.tar.gz?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Fcmusphinx%2Ffiles%2Fpocketsphinx%2F5prealpha%2F&ts=1447353369&use_mirror=netix"
tar -zxvf pocketsphinx-5prealpha.tar.gz
wget -O sphinxbase-5prealpha.tar.gz "http://downloads.sourceforge.net/project/cmusphinx/sphinxbase/5prealpha/sphinxbase-5prealpha.tar.gz?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Fcmusphinx%2Ffiles%2Fsphinxbase%2F5prealpha%2F&ts=1447353479&use_mirror=netassist"
tar -zxvf sphinxbase-5prealpha.tar.gz

mv sphinxbase-5prealpha sphinxbase

If the above download links don't work, simply get the latest release from the CMUSphinx download page HERE
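One way to tell whether a download link worked is to check that the file really is a gzip-compressed tar archive before extracting it; a stale mirror link can sometimes leave you with an HTML page saved under the tarball's name. This is a small sketch with a helper name I made up:

```shell
# Succeed only if the file can be read as a gzip-compressed tar archive.
# $1 = path to the downloaded file
is_valid_tarball() {
    # -t lists the archive contents without extracting anything;
    # the listing itself is discarded, only the exit status matters.
    tar -tzf "$1" > /dev/null 2>&1
}

# Example:
#   is_valid_tarball sphinxbase-5prealpha.tar.gz || echo "download looks broken"
```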

I then found out that sphinxbase has a couple of build dependencies, including Python, which must be installed first. To install them, do this:

yum install bison python-devel.x86_64 pcre-devel.x86_64

The other issue I had was that it also requires a version of SWIG newer than the one in the standard CentOS repositories. To get that, I downloaded and compiled the latest version of SWIG (3.0.7) from their SourceForge page:

wget "http://prdownloads.sourceforge.net/swig/swig-3.0.7.tar.gz"


tar -zxvf swig-3.0.7.tar.gz

cd swig-3.0.7
./configure
make
make install
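To confirm that the newly built SWIG is the one that will actually be picked up, you can compare version numbers. The sketch below is only an illustration: the helper names are mine, it relies on the -V option of GNU sort, and it assumes that `swig -version` prints a line of the form "SWIG Version X.Y.Z". I don't know the exact minimum version sphinxbase needs, so 3.0.7 in the example is simply the version I installed.

```shell
# Succeed if version $1 is greater than or equal to version $2.
# Uses GNU sort's -V (version sort) to order the two strings.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print the version of whichever swig comes first in PATH.
# Assumption: the output contains a line like "SWIG Version 3.0.7".
installed_swig_version() {
    swig -version | awk '/SWIG Version/ { print $3 }'
}

# Example:
#   version_ge "$(installed_swig_version)" 3.0.7 && echo "swig is new enough"
```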

Now that all the dependencies for sphinxbase were installed I compiled it like this :

cd /usr/local/src/sphinxbase
./configure --enable-fixed --without-lapack
make
make install

The README for pocketsphinx says that, in order to compile it, the sphinxbase code must be present within the pocketsphinx source directory. So I copied it like this:

cd /usr/local/src/pocketsphinx-5prealpha
cp -r ../sphinxbase .

And then compiled it like this :

./configure
make clean all
make check
make install

During ‘make check’ it runs a number of tests, one of which failed for me, as you can see below:

PASS: test_ps_init
PASS: test_ps_reinit
PASS: test_ps_fwdtree
PASS: test_ps_fwdtree_fwdflat
PASS: test_ps_fwdflat
PASS: test_ps_fwdflat_bestpath
PASS: test_ps_fwdtree_bestpath
FAIL: test_ps_simple
PASS: test_ps_nbest
PASS: test_ps_lattice
PASS: test_ps_set_search
PASS: test_acmod
PASS: test_acmod_grow
PASS: test_fwdtree
PASS: test_fwdflat
PASS: test_fwdtree_fwdflat
PASS: test_fwdtree_bestpath
PASS: test_fwdtree_nbest
PASS: test_pl_fwdtree
PASS: test_ptm_mgau
PASS: test_posterior
PASS: test_fsg
PASS: test_fsg2
PASS: test_fsg3
PASS: test_jsgf
PASS: test_lm_read
PASS: test_dict
PASS: test_dict2pid
PASS: test_senfh
PASS: test_alignment
PASS: test_state_align
PASS: test_mllr
make[5]: Entering directory `/usr/local/src/pocketsphinx-5prealpha/test/unit'
make[5]: Nothing to be done for `all'.
make[5]: Leaving directory `/usr/local/src/pocketsphinx-5prealpha/test/unit'
============================================================================
Testsuite summary for pocketsphinx 5prealpha
============================================================================
# TOTAL: 32
# PASS: 31
# SKIP: 0
# XFAIL: 0
# FAIL: 1
# XPASS: 0
# ERROR: 0
============================================================================
See test/unit/test-suite.log
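With that much output it is easy to miss which test failed. If you save the ‘make check’ output to a file, a small filter can pull out just the failing test names. This is a sketch; the function name and the log file name are my own choices, and it assumes each result is on its own line starting with "PASS:" or "FAIL:".

```shell
# Print only the names of tests reported as failed.
# Reads the named files, or standard input when no file is given.
failed_tests() {
    # Keep lines beginning with "FAIL: " and strip that prefix.
    # The summary line "# FAIL: 1" starts with "#", so it is not matched.
    sed -n 's/^FAIL: //p' "$@"
}

# Example:
#   make check 2>&1 | tee check.log
#   failed_tests check.log
```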

I spoke to one of the developers at CMUSphinx on their IRC channel and he said it was not a problem and to continue with the ‘make install’, which all worked fine.

You should now be good to do a first test!

Testing Linux Voice Recognition using pocketsphinx

Firstly, you need an audio file. The speech in it should be as clear as possible (pocketsphinx does not like too much background noise or music), so a recording from a TV news channel is a good place to start.

You need to convert the audio into a format pocketsphinx can read (WAV 16kHz 16-bit mono) which you can do using the media manipulator program FFMPEG (to install ffmpeg read my other tutorial HERE). Here is the command to convert your file :

ffmpeg -i "BBC One BBC News at Six (16 Avril 2015)-9CvYHM_V8Xg.mp4" -acodec pcm_s16le -ac 1 -ar 16000 out.wav
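If you want to double-check that a converted file really is 16kHz 16-bit mono before feeding it to pocketsphinx, you can read the values straight out of the WAV header. The sketch below is only that, a sketch: the helper names are mine, and it assumes a little-endian machine (such as x86) and a plain PCM WAV file whose "fmt " chunk comes right after the RIFF header, which puts the channel count at byte 22, the sample rate at byte 24 and the bits per sample at byte 34.

```shell
# Read one unsigned integer from a file, in the machine's native byte order.
# $1 = file, $2 = byte offset, $3 = width in bytes (2 or 4)
wav_field() {
    od -An -t "u$3" -j "$2" -N "$3" "$1" | tr -d ' \n'
}

# Succeed only if the header says mono, 16000 Hz, 16 bits per sample.
# $1 = path to the WAV file
is_sphinx_ready() {
    [ "$(wav_field "$1" 22 2)" = "1" ] &&
    [ "$(wav_field "$1" 24 4)" = "16000" ] &&
    [ "$(wav_field "$1" 34 2)" = "16" ]
}

# Example:
#   is_sphinx_ready out.wav || echo "out.wav is not 16kHz 16-bit mono"
```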

pocketsphinx_continuous is one of the programs installed into /usr/local/bin. It does have a man page, but it is incomplete and doesn't mention that you can make it read from an input file. You can, like this (but it produces a HUGE amount of output on the screen):

pocketsphinx_continuous -infile "/tmp/out.wav"

What you can see from that output is that it reads the input file in chunks, which it then transcribes. That may be of use to you, but I just wanted the text. So, to minimise the screen output, I ran it like this:

pocketsphinx_continuous -infile "/tmp/out.wav" -logfn /dev/null
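If you have a whole directory of converted recordings, the same command can be run in a loop, saving each result to its own text file ready for keyword searching. This is a minimal sketch using only the two options shown above; the function name and the directory arguments are placeholders of mine.

```shell
# Transcribe every .wav file in one directory into .txt files in another.
# $1 = directory containing 16kHz 16-bit mono WAV files
# $2 = directory to write the transcripts to (created if missing)
transcribe_dir() {
    in_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    for wav in "$in_dir"/*.wav; do
        # Skip the unexpanded pattern when the directory has no .wav files.
        [ -e "$wav" ] || continue
        name=$(basename "$wav" .wav)
        # Log messages go to /dev/null; the recognised text goes to the file.
        pocketsphinx_continuous -infile "$wav" -logfn /dev/null > "$out_dir/$name.txt"
    done
}

# Example (hypothetical paths):
#   transcribe_dir /tmp/recordings /tmp/transcripts
```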

That's it, you have a transcribed blob of text from an audio file! As I said earlier, here is the link to the CMUSphinx SourceForge page: HERE

The site has a lot of documentation and examples of other use cases so check it out!

As always if you liked this tutorial then please share on Facebook and Twitter, and check out my other Linux tutorials HERE
