Difference in output when executing system(...) in program and actual command - linux

I have a program written in C that creates an output file with lines of characters. My intention is to count the number of unique lines of characters in this output file (excluding "ABC").
I can do it manually via the Linux command line, using
cat output/output.txt | grep -v "ABC" | sort | uniq -c > uniq_stats/stats.txt
I also put this command into my program so I don't have to do it manually.
memset(command, 0, 500);
sprintf(command, "cat %s | grep -v \"ABC\" | sort | uniq -c > uniq_stats/%s", out_filename, filename);
system(command);
out_filename is output/output.txt and filename is stats.txt
I expect a particular line to be seen 1351 times. The method of using the command line gave this correct value. However, the system(command) method gave only 1349 times. Also, there was another line that was incomplete using the system(command) method, i.e. only a portion of the string was printed out.
Why is it that I got different output from the 2 methods? I have only seen this problem once, as I have tried 4 or 5 other files and both methods gave me the correct results.
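For reference, the counting pipeline from the question can be run on a small sample to check what it should produce. This is only a sketch: the sample lines below are made up, and a temporary file stands in for output/output.txt. grep reads the file directly, so the cat is optional.

```shell
# Throwaway sample standing in for output/output.txt (the contents are invented).
sample=$(mktemp)
printf '%s\n' XYZ ABC XYZ QRS ABC XYZ > "$sample"

# Drop lines containing "ABC", then count how often each remaining line occurs.
stats=$(grep -v "ABC" "$sample" | sort | uniq -c)
printf '%s\n' "$stats"
rm -f "$sample"
```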

Related

How to create a dynamic command in bash?

I want to store a command in a variable that runs a program and sets its output filename based on the number of files that already exist (so that it works on a new file each time).
Here is what I have:
export MY_COMMAND="myprogram -o ./dir/outfile-0.txt"
However, I would like the outfile number to increase each time MY_COMMAND is executed. You may assume myprogram creates the file soon enough before the next call, so the number can be derived from the number of files that exist in the directory ./dir/. I cannot change myprogram itself or the way MY_COMMAND is used.
Thanks in advance.
You can't change myprogram (its -o option will always write to the file given on the command line), and I'm assuming that something else out of your control runs MY_COMMAND, so you can't change how MY_COMMAND gets called. You still have control of MY_COMMAND itself.
For the rest of this answer I'm going to change the name MY_COMMAND to callprog mostly because it's easier to type.
You can define callprog as a variable as in your example export callprog="myprogram -o ./dir/outfile-0.txt", but you could instead write a shell script and name that callprog, and a shell script can do pretty much anything you want.
So, you have a directory full of outfile-<num>.txt files and you want to output to the next non-colliding outfile-<num+1>.txt.
Your shell script can get the numbers by listing the files, cutting out only the numbers, sorting them, and taking the highest number.
If we have these files in dir:
outfile-0.txt
outfile-1.txt
outfile-5.txt
outfile-10.txt
ls -1 ./dir/outfile*.txt produces the list
./dir/outfile-0.txt
./dir/outfile-1.txt
./dir/outfile-10.txt
./dir/outfile-5.txt
(matching on outfile and .txt means this will work even if there are other files not named outfile)
Scrape out the number by piping it through the stream editor sed … capture the number and keep only that part:
ls -1 ./dir/outfile*.txt | sed -e 's:^.*dir/outfile-\([0-9][0-9]*\)\.txt$:\1:'
(I'm using colon : instead of the standard slash / so I don't have to escape the directory separator in dir/outfile)
Now you just need to pick the highest number. Sort the numbers and take the top
| sort -rn | head -1
Sorting with -n is numeric rather than lexicographic, and -r reverses the order so the highest number comes first, not last.
Putting it all together, this will list the files, edit the names keeping only the numeric part, sort, and get just the first entry. You want to assign that to a variable to work with it, so it is:
high=$(ls -1 ./dir/outfile*.txt | sed -e 's:^.*dir/outfile-\([0-9][0-9]*\)\.txt$:\1:' | sort -rn | head -1)
In the shell you can do arithmetic on that with $((high + 1)), so if high is 10, the expression produces 11. (The older $[high + 1] form is a deprecated bash syntax and is not available in a plain /bin/sh such as dash.)
You would use that as the numeric part of your filename.
The whole shell script then just needs to use that number in the filename. Here it is, with lines broken for better readability:
#!/bin/sh
high=$(ls -1 ./dir/outfile*.txt \
| sed -e 's:^.*dir/outfile-\([0-9][0-9]*\)\.txt$:\1:' \
| sort -rn | head -1)
echo "myprogram -o ./dir/outfile-$((high + 1)).txt"
Of course you wouldn't echo myprogram, you'd just run it.
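The steps above can be sketched as one small function. This is a minimal sketch, not a definitive implementation: next_outfile_number is a hypothetical helper name, the demo runs in a throwaway directory created with mktemp, and touch stands in for myprogram. It also starts at 0 when no outfile exists yet, a case the answer does not cover.

```shell
#!/bin/sh
# Print the next free number for outfile-<num>.txt names in the given directory.
# next_outfile_number is a hypothetical helper name, not part of the original answer.
next_outfile_number() {
    dir=$1
    # List matching files, keep only the numeric part, and take the highest.
    high=$(ls -1 "$dir"/outfile-*.txt 2>/dev/null \
        | sed -e 's:^.*/outfile-\([0-9][0-9]*\)\.txt$:\1:' \
        | sort -rn | head -n 1)
    # With no matching files, high is empty, so fall back to -1 and start at 0.
    echo $(( ${high:--1} + 1 ))
}

# Demonstration in a throwaway directory, with touch standing in for myprogram.
demo_dir=$(mktemp -d)
touch "$demo_dir/outfile-0.txt" "$demo_dir/outfile-1.txt" \
      "$demo_dir/outfile-5.txt" "$demo_dir/outfile-10.txt"
next=$(next_outfile_number "$demo_dir")
echo "myprogram -o $demo_dir/outfile-$next.txt"
rm -rf "$demo_dir"
```

In a real callprog script the final echo would be replaced by the actual myprogram call, using ./dir as the directory.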
You could do this in a bash function in your .bashrc by using wc to count the files in the directory and then adding 1 to the result:
yourfunction () {
dir=/path/to/dir
# wc -l counts one line per file; wc -w would miscount names that contain spaces
filenum=$(( $(ls "$dir" | wc -l) + 1 ))
myprogram -o "$dir/outfile-${filenum}.txt"
}
This gets the number of files in $dir and adds 1 to it to get the number for the filename. Note that it counts files rather than reading the highest existing number, so it assumes the numbering has no gaps. If you place it in your .bashrc or .bash_aliases and then source .bashrc, it should work like any other shell command.
You can try exporting a function for MY_COMMAND to run.
next_outfile () {
myprogram -o ./dir/outfile-${_next_number}.txt
((_next_number ++ ))
}
export -f next_outfile
export MY_COMMAND="next_outfile" _next_number=0
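This relies on a "private" global variable _next_number being initialized to 0 and not otherwise modified. It also assumes MY_COMMAND is run in the same shell process each time: if each call starts a new bash process, that process increments only its own copy of _next_number and the change is lost when it exits.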

Assign output of command to environment variable different from original output (bash)

I have encountered a problem with a script I have originally designed.
I am trying to get the number of lines a command displays and if the number is bigger than a value, something should happen.
My problem is that originally this worked fine, now it doesn't.
In my script I am using the following command
NO_LINES=$(ps -ef | grep "sh monitor.sh" | wc -l)
echo $NO_LINES
echo $NO_LINES prints 0 even though it should print 1, the line for the grep command.
If I execute the command separately (not assigning the result to an environment variable) like this
ps -ef | grep "sh monitor.sh" | wc -l
This will print out 1 which is the correct result.
Why does assigning the result to the variable make the value 1 lower than the original result?
The bash version of the machine is 4.3.46(1)-release.
Thanks

grep an empty value in a binary file in linux

I have a binary file in Linux machine with values: AB=^] (^] is an empty value), AB=N and AB=Y. I want to get the count of occurrences of AB=^] in the file.
I am using the following command :
zcat Logfile|grep 'AB=^]' |wc -l
but it gives the count 0. The above command works fine for AB=N and Y so I guess I am searching for wrong pattern, what should I search for if not AB=^] ?
Output for the above command:
gzip: Logfile: unexpected end of file
0
here 0 indicates the number of occurrences of tag AB=^]
Basically, the deleted answers should work. Besides escaping the ^ and ] in your regex, you can also use their hexadecimal notation:
grep -o 'AB='$'\x5E'$'\x5D' file | wc -l
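The command above searches for the two literal characters ^ and ]. If ^] is instead how the terminal or editor displays a single control character, the pattern has to contain that byte: in caret notation ^] stands for Ctrl-], which is byte 0x1D (octal 035). Whether that is the case for this log file is an assumption; the sample data below is made up. A minimal sketch:

```shell
# Sample data standing in for the decompressed log: two records have byte 0x1D after "AB=".
sample=$(mktemp)
printf 'id=1 AB=\035 x\nid=2 AB=N\nid=3 AB=\035 y\nid=4 AB=Y\n' > "$sample"

# Build the pattern with printf so the control byte does not have to be typed literally.
pattern=$(printf 'AB=\035')

# -c counts the lines that contain the pattern.
count=$(grep -c "$pattern" "$sample")
echo "$count"
rm -f "$sample"
```

Note that grep -c counts matching lines; to count every occurrence when a line can contain several, grep -o piped into wc -l, as in the answer above, is the usual approach.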

Output grep results to text file, need cleaner output

When using the Grep command to find a search string in a set of files, how do I dump the results to a text file?
Also is there a switch for the Grep command that provides cleaner results for better readability, such as a line feed between each entry or a way to justify file names and search results?
For instance, a way to change...
./file/path: first result
./another/file/path: second result
./a/third/file/path/here: third result
to
./file/path:               first result
./another/file/path:       second result
./a/third/file/path/here:  third result
grep -n "YOUR SEARCH STRING" * > output-file
The -n will print the line number and the > will redirect grep-results to the output-file.
If you want to "clean" the results you can filter them using pipe | for example:
grep -n "test" * | grep -v "mytest" > output-file
will match all the lines that have the string "test" except the lines that match the string "mytest" (that's the switch -v) - and will redirect the result to an output file.
A few good grep-tips can be found in this post
Redirection of program output is performed by the shell.
grep ... > output.txt
grep has no mechanism for adding blank lines between each match, but does provide options such as context around the matched line and colorization of the match itself. See the grep(1) man page for details, specifically the -C and --color options.
To add a blank line between lines of text in grep output to make it easier to read, pipe (|) it through sed:
grep text-to-search-for file-to-grep | sed G
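For the other half of the question, lining up file names and results, grep has no switch of its own, but the output can be padded with awk. This is a minimal sketch: the sample lines are the ones from the question, and the column width of 26 is an arbitrary choice that happens to fit them.

```shell
# Sample grep-style output; in practice it would come from something like: grep -H "pattern" files...
results='./file/path: first result
./another/file/path: second result
./a/third/file/path/here: third result'

# Split each line at the first ": " and pad the file name part to a fixed width.
aligned=$(printf '%s\n' "$results" | awk '{
    i = index($0, ": ")
    if (i == 0) { print; next }          # leave lines without a separator untouched
    printf "%-26s %s\n", substr($0, 1, i), substr($0, i + 2)
}')
printf '%s\n' "$aligned"
```

To save the aligned output, redirect the final command to a file with > as in the answers above.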

Get the first n lines matching a certain pattern (with Linux commands)

I have a giant file where I want to find a term model. I want to pipe the first 5 lines containing the word model to another file. How do I do that using Linux commands?
man grep mentions that
-m NUM, --max-count=NUM
Stop reading a file after NUM matching lines. If the input is
standard input from a regular file, and NUM matching lines are
output, grep ensures that the standard input is positioned to
just after the last matching line before exiting, regardless of
the presence of trailing context lines. This enables a calling
process to resume a search.
so one can use
grep -m 5 model old_file_name.txt > new_file_name.txt
No need for a pipe. grep supports almost everything you need on its own.
grep model [file] | head -n 5 > [newfile]
grep "model" filename | head -n 5 > newfile
cat file | grep model | head -n 5 > outfile.txt
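A quick way to try the -m approach on known input is sketched below. The temporary files and the generated lines are stand-ins for the giant file and the destination file, not anything from the question.

```shell
# Throwaway input standing in for the giant file: 20 lines, every second one contains "model".
big=$(mktemp)
out=$(mktemp)
i=1
while [ "$i" -le 20 ]; do
    if [ $((i % 2)) -eq 0 ]; then echo "line $i: model"; else echo "line $i: other"; fi
    i=$((i + 1))
done > "$big"

# -m 5 makes grep stop reading after the fifth matching line.
grep -m 5 "model" "$big" > "$out"

matches=$(wc -l < "$out" | tr -d ' ')
last=$(tail -n 1 "$out")
echo "$matches matching lines written; last one: $last"
rm -f "$big" "$out"
```

Because grep stops reading at the fifth match, this avoids scanning the rest of a large file, which the head-based variants only achieve indirectly when the pipe closes.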

Resources