How to capture the output of "sort -c" in Linux

I am trying to capture the output of "sort -c" in Linux. I tried redirecting it to a file and using the tee command, but neither helped. Any suggestions?
For example:
root>cat db | sort -c
sort: -:319: disorder: 1842251880: aa bb bc dd ee
The following failed to give me any output:
root>cat db | sort -c > fileName
root>cat db | sort -c | tee fileName
Sample file:
>cat file
111 aa as sdasd
222 sadf dzfasf af
333 sada gvsdgf h hgfhfghfg
444 asdfafasfa gsdgsdg sgsg
222 asdasd fasdfaf asdasdasd
root>cat file |sort -c
sort: -:5: disorder: 222 asdasd fasdfaf asdasdasd
8>sort -c db 2> fileName
sort: extra operand `2' not allowed with -c
0>sort -c < file 2> result1.txt
sort: open failed: 2: No such file or directory
Any alternative to sort -c would also work for me!

If sort -c is producing an error, it sends that error to "standard error" (stderr), not to "standard output" (stdout).
In the shell, you need to use special redirections to capture standard error.
sort -c inputfile > /path/to/stdout.txt 2> /path/to/stderr.txt
These two output streams are identified by "file descriptors" (1 for stdout, 2 for stderr), and you can also redirect one of them into the other:
sort -c inputfile > /path/to/combined.txt 2>&1
You can read more about how these work at tldp.org, in the Bash reference manual, the bash-hackers wiki and the Bash FAQ. Happy reading! :-D
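If you want the message in a shell variable rather than a file, a minimal sketch (assuming bash; sort -c prints nothing on stdout, so only the stderr message is captured):
msg=$(sort -c inputfile 2>&1)
echo "$msg"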

Another good alternative to '2>' is the stderr pipe:
|&
cat db | sort -c -h |& tee fileName
Sometimes this is very convenient, for example when a command such as time writes to stderr:
TIMEFORMAT=%R; for i in $(seq 1 20); do time kubectl get pods -l app=pod >/dev/null; done |& sort -u -h
or
TIMEFORMAT=%R; for i in $(seq 1 20); do time kubectl get pods >>log1; done |& sort -u -h
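For context, |& is bash shorthand (bash 4 and later) for 2>&1 |, so the first example above is equivalent to:
cat db | sort -c -h 2>&1 | tee fileName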

sort -c does not produce the output you might expect:
[root@home:~] cat /etc/services | sort -c
sort: -:2: disorder: #
As described by the manpage, the -c argument simply checks whether a given file or input is sorted or not.
If you're trying to catch the message from the command, try redirecting the error stream (2), not the standard output (1):
cat file | sort -c 2>result.txt

sort -c only checks if the input is sorted. It does not perform any sorting.
See: man sort
Remove -c to sort the lines.
PS: It gives the "disorder" error because your file isn't already sorted: on line 5, "222" appears after "444" on the previous line.
EDIT: I think I misunderstood.
To redirect the error to a file you must use 2>.
So the command would become: root>cat db | sort -c 2> fileName
EDIT2: You can simply use: sort -c db 2> fileName

Related

How to store output for every xargs instance separately

cat domains.txt | xargs -P10 -I % ffuf -u %/FUZZ -w wordlist.txt -o output.json
Ffuf is used for directory and file bruteforcing, while domains.txt contains valid HTTP and HTTPS URLs like http://example.com, http://example2.com. I used xargs to speed up the process by running 10 parallel instances. But the problem here is that I am unable to store the output of each instance separately: output.json gets overwritten by every running instance. Is there anything we can do to make output.json unique for every instance so that all data gets saved separately? I tried ffuf/$(date '+%s').json instead but it didn't work either.
Sure. Just name your output file using the domain. E.g.:
xargs -P10 -I % ffuf -u %/FUZZ -w wordlist.txt -o output-%.json < domains.txt
(I dropped cat because it was unnecessary.)
I missed the fact that your domains.txt file is actually a list of URLs rather than a list of domain names. I think the easiest fix is just to simplify domains.txt to be just domain names, but you could also try something like:
xargs -P10 -I % sh -c 'domain="%"; ffuf -u %/FUZZ -w wordlist.txt -o output-${domain##*/}.json' < domains.txt
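For reference, ${domain##*/} is plain bash parameter expansion: it removes the longest prefix matching */, i.e. everything up to the last slash. A quick check:
$ domain="http://example.com"; echo "${domain##*/}"
example.com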
cat domains.txt | xargs -P10 -I % sh -c "ping % > output.json.%"
Like this, your "%" can be part of the file name. (I changed your command to ping for my testing.)
So maybe something more like this:
cat domains.txt | xargs -P10 -I % sh -c "ffuf -u %/FUZZ -w wordlist.txt -o output.json.%"
I would replace your ffuf command with the following script, and call the script from the xargs command. It replaces the "://" in the URL with a dot so the result is a valid file name, then runs the command:
#!/usr/bin/bash
# Turn the URL into a usable file name by replacing "://" with a dot
URL=$1
FILE=$(echo "$URL" | sed 's|://|.|g')
ffuf -u "${URL}/FUZZ" -w wordlist.txt -o "output-${FILE}.json"
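Assuming you save the script as, say, run-ffuf.sh (a hypothetical name) and make it executable, the xargs call becomes:
xargs -P10 -I % ./run-ffuf.sh % < domains.txt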

extract syscall names from strace output

I use the following command to extract syscall names from strace output:
strace ls 3>&1 1>&2 2>&3 3>&- | grep -P -o '^[a-z]*(?=\()'
but this command also includes the output of ls itself. How can I prevent that?
There are two options to strace that will help you get what you want:
-c will output a table of all system calls run by the command, together with the number of times they were called and CPU usage.
$ strace -c ls
Desktop Documents Downloads Music Pictures Public Templates Videos
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 31.07    0.000653          20        32           mmap
  9.94    0.000209          20        10           mprotect
  9.80    0.000206          12        16           read
  8.28    0.000174          15        11           close
  7.61    0.000160          16        10           fstat
  6.90    0.000145          16         9           openat
  2.47    0.000052          17         3         3 ioctl
...
The -o option will send strace's output to a file, so it won't get mixed in with your process's output.
The following will run the ls command, diverting its output to /dev/null, and will send strace's output to an awk script to extract the last column:
$ strace -o >(awk '$1 ~ /^-----/ { toprint = !toprint; next } { if (toprint) print $NF }') \
-c ls >/dev/null 2>/dev/null
mmap
mprotect
read
close
fstat
openat
ioctl
...
Finally I found a solution with the help of this link: http://mywiki.wooledge.org/BashFAQ/047
strace ls 2>&1 >/dev/null | grep -P -o '^[a-z]*(?=\()'
(The order of the redirections matters: 2>&1 first points stderr at the current stdout, i.e. the pipe, and only then is stdout itself sent to /dev/null.)
and a useful variant to count the syscalls:
strace ls 2>&1 >/dev/null | grep -P -o '^[a-z]*(?=\()' | sort | uniq -c | sort -nr
And a better solution using Mark Plotnick's answer:
strace -o >(grep -P -o '^[a-z]*(?=\()' | sort | uniq -c | sort -nr) ls &>/dev/null

linux strace: How to filter system calls that take more than a second

I'm using "strace -f -T -tt -o foo.txt -p 1234" to print the time spent in system calls. This makes the output file huge, is there a way to just print the system calls that took greater than 1second. I can grep it out from the file later, but is there a better way?
If we simply omit the -o foo.txt argument, the trace goes to standard error instead (strace writes its trace to stderr, not stdout). We can redirect stderr into a pipe through grep and on to the file:
strace -f -T -tt -p 1234 2>&1 | grep pattern > foo.txt
To watch the output at the same time:
strace -f -T -tt -p 1234 2>&1 | grep pattern | tee foo.txt
If a command prints only to a file that is passed as an argument, and we want to filter/redirect its output, the first step is to check whether it implements the dash convention: can you specify standard input or output using - as a filename argument:
some_command - | our_pipe > file.txt
If not, then the recourse is to use Bash's process substitution syntax: >(output command) and <(input command):
some_command >(our_pipe > file.txt)
The process substitution syntax expands into a token that is suitable as a filename argument for a command or function. When the program opens that token, it gets a file descriptor to the command's input or output, depending on direction.
With process substitution, we can redirect the input or output of stubborn programs which work only with files passed by name as arguments, and which do not support any convention for requesting that standard input or output be used in place of a file.
The token used by process substitution is platform-dependent; we can see what it is using echo. For instance on GNU/Linux, Bash takes advantage of the /dev/fd operating system feature:
$ echo <(ls -l)
/dev/fd/63
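A classic use of the input direction, <(input command), is comparing the sorted contents of two files without any temporary files (assuming bash):
$ diff <(sort file1) <(sort file2)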
You can use the following command:
strace -T command 2>&1 >/dev/null | awk '{gsub(/[<>]/,"",$NF)}$NF+.0>1.0'
Explanation:
strace -T adds the time spent in the syscall at the end of the line, enclosed in <...>
2>&1 >/dev/null | awk pipes stderr to awk (strace writes its output to stderr!)
The awk command removes the <> from the last field $NF and prints lines where the time spent is higher than a second.
You'll probably also want to pass the threshold as a variable to the awk command:
strace -T command 2>&1 >/dev/null \
| awk -v thres=0.001 '{gsub(/[<>]/,"",$NF)}$NF+.0>thres+.0'

Bash grep command finding the same file 5 times

I'm building a little bash script to run another bash script that's found in multiple directories. Here's the code:
cd /home/mainuser/CaseStudies/
grep -R -o --include="Auto.sh" [\w] | wc -l
When I execute just that part, it finds the same file 5 times in each folder. So instead of getting 49 results, I get 245. I've written a recursive bash script before and I used it as a template for this problem:
grep -R -o --include=*.class [\w] | wc -l
This code has always worked perfectly, without any duplication. I've tried running the first command with and without the quotes, and I've tried -r as well. I've read through the bash documentation and I can't seem to find a way to prevent this duplication, or even why I'm getting it. Any thoughts on how to get around this?
As a separate but related question: is there a way to launch Auto.sh inside each directory, so that the output of Auto.sh is dumped into that directory, without having to place Auto.sh in each folder? That would probably be much more efficient than what I'm currently doing, and it would probably also fix my current duplication problem.
This is the code for Auto.sh:
#!/bin/bash
index=1
cd /home/mainuser/CaseStudies/
grep -R -o --include=*.class [\w] | wc -l
grep -R -o --include=*.class [\w] |awk '{print $3}' > out.txt
while read LINE; do
echo 'Path '$LINE > 'Outputs/ClassOut'$index'.txt'
javap -c $LINE >> 'Outputs/ClassOut'$index'.txt'
index=$((index+1))
done <out.txt
Preferably I would like to make it dump only the javap outputs for the application it's currently looking at. Since those .class files could be in any number of sub-directories, I'm not sure how to make them all dump into the top folder without executing a modified Auto.sh in the top directory of each application.
OK, so to fix the multiple matches:
grep -R -o --include="Auto.sh" [\w] | wc -l
Should be:
grep -R -l --include=Auto.sh '\w' | wc -l
The reason this was happening is that -o prints every individual match rather than every matching file, and the pattern was matching the letter w, which occurs 5 times in Auto.sh; -l prints each matching file name exactly once.
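You can see the difference directly (a sketch, assuming GNU grep and the 49-directory tree described in the question):
$ grep -R -o --include=Auto.sh '[\w]' | wc -l   # one line per match: 5 per file = 245
$ grep -R -l --include=Auto.sh '\w' | wc -l     # one line per matching file: 49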
However, the overall fix that doesn't require having to place Auto.sh in every directory, is something like this:
MAIN_DIR=/home/mainuser/CaseStudies/
cd "$MAIN_DIR"
ls -d */ > DirectoryList.txt
while read LINE; do
cd "$LINE"
mkdir -p ProjectOutputs
bash /home/mainuser/Auto.sh
cd "$MAIN_DIR"
done <DirectoryList.txt
That calls this Auto.sh code:
index=1
grep -R -o --include=*.class '\w' | wc -l
grep -R -o --include=*.class '\w' | awk '{print $3}' > ProjectOutputs.txt
while read LINE; do
echo "Path $LINE" > "ProjectOutputs/ClassOut$index.txt"
javap -c "$LINE" >> "ProjectOutputs/ClassOut$index.txt"
index=$((index+1))
done <ProjectOutputs.txt
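If you prefer a one-liner driver instead of DirectoryList.txt, a sketch using a subshell (so the cd never leaks between iterations):
for d in /home/mainuser/CaseStudies/*/; do (cd "$d" && mkdir -p ProjectOutputs && bash /home/mainuser/Auto.sh); done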
Thanks again for everyone's help!

sort -c output being redirected to a file: Linux command

Why is sort -c output not being redirected to file temp.txt?
If I remove -c, it does redirect, as seen below:
$ cat numericSort
33:Thirty Three:6
11:Eleven:2
45:Forty Five:9
01:Zero One:1
99:Ninety Nine:9
18:Eighteen:01
56:Fifty Six:4
78:Seventy Eight:2
$ sort numericSort > temp.txt
$ cat temp.txt
01:Zero One:1
11:Eleven:2
18:Eighteen:01
33:Thirty Three:6
45:Forty Five:9
56:Fifty Six:4
78:Seventy Eight:2
99:Ninety Nine:9
$ rm temp.txt
$ sort -c numericSort > temp.txt
sort: numericSort:2: disorder: 11:Eleven:2
$ cat temp.txt
# No Output Here
The output of sort -c goes to stderr, not stdout.
If you want to redirect that instead:
$ sort -c numericSort 2> temp.txt
Based on the documentation of the sort command:
-c, --check, --check=diagnose-first
check for sorted input; do not sort
Check whether the given files are already sorted: if they are not
all sorted, print an error message and exit with a status of 1.
So your command sort -c numericSort > temp.txt essentially just checks whether the file numericSort is sorted. If it is not, the error message is printed on stderr, and hence you don't see any output in temp.txt. You probably want to redirect stderr instead, like:
sort -c numericSort 2> temp.txt
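Since sort -c also reports its verdict through the exit status (0 if sorted, 1 if not, as the man page excerpt above says), you can branch on it directly; a minimal sketch:
if sort -c numericSort 2>/dev/null; then
    echo "numericSort is already sorted"
else
    echo "numericSort is NOT sorted"
fi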
