Run pocketsphinx_continuous with a keyphrase - cmusphinx

I am trying to use a keyphrase with pocketsphinx, but it keeps throwing the error,
ERROR: "kws_search.c", line 171: The word 'hey' is missing in the dictionary
This happens even though the word is 100% in the dictionary. It is a big part of the dictionary, and pocketsphinx recognizes that word fine when I leave the keyphrase out. Am I using it wrong? I couldn't find a tutorial; everything uses Python or Android.
pocketsphinx_continuous -hmm /usr/local/share/pocketsphinx/model/en-us/en-us -dict 9063.dic -lm 9063.lm -vad_threshold 3.0 -kws keyphrase.file -infile /dev/stdin
and the keyphrase.file is
hey /1.0/

The correct command line is:
pocketsphinx_continuous -vad_threshold 3.0 -kws keyphrase.file -infile /dev/stdin
You do not need -lm and -dict, which configure language model search mode; you need keyword search mode. When you pass -dict, you replace the default dictionary with your own dictionary, whose words are upper-case. Words are case sensitive, so 'hey' in the keyphrase file does not match 'HEY' in the dictionary.
Tutorial is here.
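To illustrate the case-sensitivity point, here is a minimal sketch in Python that compares the words of a keyphrase line against the entries of a dictionary and reports any that are present only in a different case. The function names are made up for this example, and it assumes the usual CMU dictionary layout (word first on each line, alternate pronunciations written as WORD(2)).

```python
from pathlib import Path


def load_dictionary_words(dict_path):
    """Return the set of words in a CMU-style .dic file (first token per line)."""
    words = set()
    for line in Path(dict_path).read_text(encoding="utf-8").splitlines():
        parts = line.split()
        if parts:
            # Alternate pronunciations look like WORD(2); strip the suffix.
            words.add(parts[0].split("(")[0])
    return words


def keyphrase_words(kws_line):
    """Split a keyphrase line such as 'hey /1.0/' into its words."""
    phrase = kws_line.split("/")[0]  # drop the /threshold/ part
    return phrase.split()


def find_case_mismatches(kws_line, dictionary):
    """Map each keyphrase word missing from the dictionary to case-variant entries."""
    by_lower = {}
    for word in dictionary:
        by_lower.setdefault(word.lower(), []).append(word)
    return {
        word: sorted(by_lower.get(word.lower(), []))
        for word in keyphrase_words(kws_line)
        if word not in dictionary
    }
```

For the files in the question, something like find_case_mismatches("hey /1.0/", load_dictionary_words("9063.dic")) would be expected to report 'hey' as missing with 'HEY' as the near match, if the dictionary is indeed upper-case.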

Related

Ghostscript not printing accented characters correctly

I have a Bash script that writes a text watermark to PDF files. It does this by generating an overlay PDF with Ghostscript and then using PDFtk to stamp the overlay onto the original.
All this works perfectly, except that Ghostscript is not writing accented characters correctly. If my input text is, for example, "Français", the output on the PDF will be "Franˆ§ais".
My Ghostscript command line is:
/usr/local/bin/gs -q -o "${TEMPFILE}" \
-sDEVICE=pdfwrite -sPAPERSIZE=letter \
-c "60 23 moveto 0.32 0.23 0.22 setrgbcolor /Helvetica-Oblique findfont 9 scalefont setfont (${WATERMARK}) show"
The $WATERMARK variable contains a single line of text to be written. The problem occurs both when running the Bash script that contains this line and also when I run just this command directly.
I'm seeing this problem using Ghostscript 9.06 on Mac OS X (installed via Homebrew) and 9.05 on Ubuntu 12.04 (installed from the Ubuntu package repository). The Bash script and gs command line were both written by someone else; I have no experience using Ghostscript myself.
Changing the font has no effect on the problem and I've been unable to google anything useful related to this. What are we doing wrong here?
Thanks.
You haven't encoded the font correctly (or indeed at all).
You are assuming that the character code which represents the glyph named ccedilla is the same in the font as it is on your computer system. For Latin fonts and the ASCII characters up to 127 this is usually true; for characters beyond that it usually isn't, and for non-Latin languages (e.g. Russian, Arabic, CJKV languages) it isn't true at all.
Encoding fonts isn't hard, but it is rather lengthy to go into here, so instead let me recommend the excellent series of articles written by John Deubert of Acumen Training, you can find them here:
http://www.acumentraining.com/acumenjournal.html
For your purposes I suggest the November and December 2001 articles.
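As a rough sketch of what encoding the font can look like for this watermark case: the Python helper below re-encodes Helvetica-Oblique with ISOLatin1Encoding (using the common PostScript idiom of copying the font dictionary without its FID entry) and writes the text with octal escapes for its Latin-1 character codes. The names postscript_latin1_string, watermark_program and the font name Helvetica-Oblique-Latin1 are invented for this example, and it assumes the watermark only contains characters that exist in Latin-1.

```python
def postscript_latin1_string(text):
    """Return `text` as a PostScript string literal, using Latin-1 character codes."""
    out = []
    for byte in text.encode("latin-1"):  # raises UnicodeEncodeError outside Latin-1
        char = chr(byte)
        if char in "()\\":
            out.append("\\" + char)      # escape PostScript string delimiters
        elif 32 <= byte < 127:
            out.append(char)             # printable ASCII passes through
        else:
            out.append(f"\\{byte:03o}")  # everything else as a 3-digit octal escape
    return "(" + "".join(out) + ")"


# Common PostScript idiom: copy the font dictionary (minus FID) and swap in
# ISOLatin1Encoding. The new font name is an arbitrary choice for this sketch.
REENCODE_FONT = (
    "/Helvetica-Oblique findfont dup length dict begin "
    "{ 1 index /FID ne { def } { pop pop } ifelse } forall "
    "/Encoding ISOLatin1Encoding def currentdict end "
    "/Helvetica-Oblique-Latin1 exch definefont pop"
)


def watermark_program(text):
    """Build the PostScript passed to gs -c: re-encode the font, then show the text."""
    return (
        f"{REENCODE_FONT} 60 23 moveto 0.32 0.23 0.22 setrgbcolor "
        f"/Helvetica-Oblique-Latin1 findfont 9 scalefont setfont "
        f"{postscript_latin1_string(text)} show"
    )
```

With this, postscript_latin1_string("Français") gives (Fran\347ais), where 347 is the octal Latin-1 code for ccedilla. The "Franˆ§ais" output in the question is consistent with UTF-8 bytes being looked up in the font's default encoding, although that is an inference from the symptom rather than something the question confirms.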

How to give an input wav file to pocket sphinx

Is there some command line utility of pocket sphinx or cmu sphinx to convert a .wav file to text?
pocketsphinx_continuous -hmm -lm -dict will do. But I don't want to keep speaking the same sentence again and again.
Starting from version 0.8, pocketsphinx_continuous has an -infile option, which you can use to decode a file. The file must be in a specific format: a 16 kHz, 16-bit, mono WAV file.
pocketsphinx_continuous -infile file.wav
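Since a file in the wrong format is an easy mistake to make, a quick check before decoding can help. The sketch below uses Python's standard wave module to compare a file against the 16 kHz, 16-bit, mono requirement; the function names are hypothetical, and silent_wav exists only so the check can be tried without a real recording.

```python
import io
import wave


def wav_format_problems(source):
    """Return a list of reasons the WAV is not 16 kHz, 16-bit, mono (empty if OK).

    `source` may be a filename string or a binary file object.
    """
    with wave.open(source, "rb") as wav:
        problems = []
        if wav.getframerate() != 16000:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected 16000")
        if wav.getsampwidth() != 2:
            problems.append(f"sample width is {8 * wav.getsampwidth()} bits, expected 16")
        if wav.getnchannels() != 1:
            problems.append(f"{wav.getnchannels()} channels, expected 1 (mono)")
        return problems


def silent_wav(rate, sample_width, channels, frames=160):
    """Build a small silent WAV in memory, for trying the check without a real file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * frames * sample_width * channels)
    buffer.seek(0)
    return buffer
```

Calling wav_format_problems("file.wav") returns an empty list when the file matches the required format, and a list of human-readable mismatches otherwise.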

How can I get perf to find symbols in my program

When using perf report, I don't see any symbols for my program, instead I get output like this:
$ perf record /path/to/racket ints.rkt 10000
$ perf report --stdio
# Overhead Command Shared Object Symbol
# ........ ........ ................. ......
#
70.06% ints.rkt [unknown] [.] 0x5f99b8
26.28% ints.rkt [kernel.kallsyms] [k] 0xffffffff8103d0ca
3.66% ints.rkt perf-32046.map [.] 0x7f1d9be46650
Which is fairly uninformative.
The relevant program is built with debugging symbols, and the sysprof tool shows the appropriate symbols, as does Zoom, which I think is using perf under the hood.
Note that this is on x86-64, so the binary is compiled with -fomit-frame-pointer, but that's the case when running under the other tools as well.
This post is already over a year old, but since it came out at the top of my Google search results when I had the same problem, I thought I'd answer it here. After some more searching around, I found the answer given in this related StackOverflow question very helpful. On my Ubuntu Raring system, I then ended up doing the following:
Compile my C++ sources with -g (fairly obvious, you need debug symbols)
Run perf as
perf record -g dwarf -F 97 /path/to/my/program
This way perf is able to handle the DWARF 2 debug format, which is the standard format gcc uses on Linux. The -F 97 parameter reduces the sampling rate to 97 Hz. The default sampling rate was apparently too high for my system and resulted in messages like this:
Warning:
Processed 172390 events and lost 126 chunks!
Check IO/CPU overload!
and the perf report call afterwards would fail with a segmentation fault. With the reduced sampling rate everything worked out fine.
Once the perf.data file has been generated without any errors in the previous step, you can run perf report etc. I personally like the FlameGraph tools to generate SVG visualizations.
Other people reported that running
echo 0 > /proc/sys/kernel/kptr_restrict
as root can help as well, if kernel symbols are required.
In my case the solution was to delete the elf files which contained cached symbols from previous builds and were messing things up.
They are in ~/.debug/ folder
You can always use the nm command.
Here is some sample output:
Ethans-MacBook-Pro:~ phyrrus9$ nm a.out
0000000100000000 T __mh_execute_header
0000000100000f30 T _main
U _printf
0000000100000f00 T _sigint
U _signal
U dyld_stub_binder
I had this problem too: I couldn't see any userspace symbols, but I saw some kernel symbols. I thought this was a symbol loading issue. After trying all the possible solutions I could find, I still couldn't get it to work.
Then I faintly remembered that
ulimit -u unlimited
is needed. I tried and it magically worked.
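I found from this wiki that raising a limit like this is needed when perf uses too many file descriptors. (Strictly speaking, ulimit -u sets the maximum number of user processes; the limit on open file descriptors is ulimit -n.)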
https://perf.wiki.kernel.org/index.php/Tutorial#Troubleshooting_and_Tips
My final command was
perf record -F 999 -g ./my_program
I didn't need --call-graph.
Make sure that you compile the program using -g option along with gcc(cc) so that debugging information is produced in the operating system's native format.
Try to do the following and check if there are debug symbols present in the symbol table.
$ objdump -t your-elf
$ readelf -a your-elf
$ nm -a your-elf
What about your development host machine? Is it also running an x86_64 OS?
If not, please make sure perf is cross-compiled, because perf depends on objdump and other tools in the toolchain.
I got the same problem with perf after overriding the name of my program via prctl(PR_SET_NAME)
As I can see your case is pretty similar:
70.06% ints.rkt [unknown]
The command you executed (racket) is different from the one perf has seen.
You can check the effect of kptr_restrict with cat /proc/kallsyms. If the addresses of the symbols in the output are all 0x000000, you can fix it with the command echo 0 > /proc/sys/kernel/kptr_restrict. After this, perf report should give the result you want.
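The check described above can be sketched in a few lines of Python. The helper names are invented for this example; the code assumes a Linux system where /proc/kallsyms and /proc/sys/kernel/kptr_restrict are readable, and it only inspects the values without changing them.

```python
def addresses_are_hidden(kallsyms_lines):
    """True if every symbol address in /proc/kallsyms-style lines is zero."""
    addresses = [line.split()[0] for line in kallsyms_lines if line.strip()]
    return bool(addresses) and all(int(addr, 16) == 0 for addr in addresses)


def read_kptr_restrict(path="/proc/sys/kernel/kptr_restrict"):
    """Return the current kptr_restrict value as an int (Linux only)."""
    with open(path) as handle:
        return int(handle.read().strip())


def kernel_symbols_hidden(kallsyms_path="/proc/kallsyms", sample_size=100):
    """Inspect the first few lines of /proc/kallsyms on the running system."""
    with open(kallsyms_path) as handle:
        sample = [line for _, line in zip(range(sample_size), handle)]
    return addresses_are_hidden(sample)
```

If kernel_symbols_hidden() returns True for the user running perf, kernel addresses are being masked, and lowering kptr_restrict as root (as described above) is the thing to try.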

Generating HTML output from criterion

There is a nice example of HTML output from criterion at http://bos.github.com/criterion/.
Which command line option is used to generate this output?
An answer to a related question asserts that this output exists, but it does not seem to show up in the command line options when using --help.
Sorry I didn't get around to your comment-question.
The answer Jedai gives is right - just use -o. For example, here is a line from one of my Makefiles for running benchmarks using defaultMain from Criterion:
./Bench -g -u Bench.csv -o Bench.html -s $(SAMPLES)
Breaking that down, it says:
-g run GC between each sample
-u output CSV data to the given file
-o output HTML data to the given file
-s collect this many samples
Well if you just want html output, then yourBench -o yourReport.html will generate some perfectly reasonable output. If you want to use your own template, look at the templates/report.tpl example in the distribution and use the -t option.
It seems to me that you just pass the template as a command line option, and then it populates it. If the template happens to be an html template, then you've generated html.
See the source here: https://github.com/bos/criterion

Windres syntax error

I am working in MinGW environment (downloaded with their installer on 12/12/2011). I am attempting to compile a resource (.rc) file using Windres. The specific command I use is
Windres -O coff About1.rc -o About1.res
Windres generates at least 100 lines of warning messages reading: "warning: null characters ignored". Following this, Windres emits: "About1.rc:1:syntax error".
As a matter of fact, there are no null characters in the About1.rc file. In addition, the first line of the file is an include statement: #include "dlgresource.h". I played around and eliminated this statement, and it turns out that it doesn't matter what I put there: I get the same flurry of messages and the syntax error notification.
To make things more confusing, this same .rc file compiles without any problem using MSFT's rc.exe. The resulting .res file links smoothly with the program .obj file and runs perfectly.
I have no idea what is going on. Any ideas?
Thanks,
Mark Allyn
Your .rc file is probably encoded as UTF-16.
That's what's required in general by Microsoft's rc.exe, in order to be able to deal with international characters, but GNU windres.exe can only deal with ANSI encoding.
One workaround is to convert the file to ANSI on the spot (possibly losing e.g. Russian or Greek characters):
> chcp 1252
Active code page: 1252
> type my.rc | windres --output-format=COFF -o my.res
> _
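The same conversion can be done ahead of time instead of on the fly. The following Python sketch checks for a UTF-16 byte order mark and, if one is present, re-encodes the content to an ANSI code page. The function name is made up for this example, cp1252 is assumed as the target code page to match the chcp 1252 above, and files saved as UTF-16 without a BOM are not detected.

```python
import codecs


def rc_bytes_to_ansi(raw, codepage="cp1252"):
    """Convert .rc file bytes to an ANSI code page if they carry a UTF-16 BOM.

    Returns (converted_bytes, was_utf16). Characters that do not exist in the
    code page become '?', so the conversion can lose information.
    """
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        text = raw.decode("utf-16")  # the BOM selects the byte order and is dropped
        return text.encode(codepage, errors="replace"), True
    return raw, False
```

A typical use would be to read the file with pathlib.Path("About1.rc").read_bytes(), pass the bytes through rc_bytes_to_ansi, and write the result to a separate file for windres, keeping the UTF-16 original for rc.exe.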
You probably used VS or a similar tool to generate the file. Some parts of the character encoding are invisible in an editor, and these result in the null characters and similar problems.
Generate a new .res file with the same content, don't copy/paste the content, type it in yourself.
Try:
windres About1.rc -o About1.o
and then just use the resulting .o file instead of the originally intended .res file.
I had the same trouble as you today. I know a lot of time has passed since your question, but I'm writing this in the hope that it can be useful for someone.
First, I obtained an object file (.o), compiled under Cygwin, by running:
windres -o resource.o resource.rc
This way you don't need the .res file, only the .o one, and you can then link this object with all the others when you compile your program with the GNU toolchain:
g++ Header_files CPP_files flags ... -o program.exe resource.o -lm
For instance.

Resources