Linux bash command incorrectly handles zeros in output. - linux

I'm using Debian 6 (64-bit).
When I run the command
echo -n `cat /proc/$(ps -o pid --no-header -C x-session-manager | tr -d ' ')/environ 2>/dev/null | tr '\0000' '\n'|grep XA|cut -d '=' -f 2`
to get XAUTHORITY for the currently logged-in user, I expect it to return my current Xauthority path, which is:
/var/run/gdm3/auth-for-alex-g5t0xM
but what it actually returns is
/var/run/gdm3/auth-for-alex-g5t
The trailing part, 0xM, is missing.
Apparently the 0 is somehow treated as '\0', which truncates the output.
What can I do to get the correct output?

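From the tr manual:
\NNN character with octal value NNN (1 to 3 octal digits)
You've given four digits, so '\0000' is parsed as '\000' followed by a literal '0'. Both characters end up in the set being translated, so every '0' in the input is replaced with a newline as well. The path is split at its '0', and grep and cut then return only the part before it.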
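To see the difference concretely, here is a minimal sketch that runs both spellings against simulated environ data. The sample helper and the HOME value are made up for illustration; the XAUTHORITY path is the one from the question. It assumes a tr that pads the shorter second set with its last character, as GNU tr does.

```shell
# Simulated /proc/PID/environ content: NUL-separated VAR=value records.
sample() {
    printf 'HOME=/home/alex\000XAUTHORITY=/var/run/gdm3/auth-for-alex-g5t0xM\000'
}

# Four digits: tr reads '\000' followed by a literal '0', so every 0 is translated too.
broken=$(sample | tr '\0000' '\n' | grep '^XAUTHORITY=' | cut -d= -f2)

# Three digits: only NUL bytes are translated.
fixed=$(sample | tr '\000' '\n' | grep '^XAUTHORITY=' | cut -d= -f2)

printf 'four digits:  %s\nthree digits: %s\n' "$broken" "$fixed"
```

Anchoring the grep pattern with ^XAUTHORITY= also avoids matching other variables that merely contain "XA".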

This should work as well and is a little more concise:
tr '\000' '\n' </proc/$(pgrep x-session-manager)/environ | awk -F= '/XAUTHORITY/{print $2}'

Related

How to strip stdout before logging into file? [duplicate]

How to find the last field using 'cut'

Without using sed or awk, only cut, how do I get the last field when the number of fields is unknown or changes with every line?
You could try something like this:
echo 'maps.google.com' | rev | cut -d'.' -f 1 | rev
Explanation
rev reverses "maps.google.com" to moc.elgoog.spam
cut uses the dot ('.') as the delimiter and selects the first field, which is moc
Lastly, we reverse it again to get com
Use a parameter expansion. This is much more efficient than any kind of external command, cut (or grep) included.
data=foo,bar,baz,qux
last=${data##*,}
See BashFAQ #100 for an introduction to native string manipulation in bash.
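As a quick sketch of the related expansions (the variable names first and init are only for illustration): ## and %% strip the longest match, while # and % strip the shortest.

```shell
data=foo,bar,baz,qux

last=${data##*,}     # remove the longest prefix matching '*,'  -> last field
first=${data%%,*}    # remove the longest suffix matching ',*'  -> first field
init=${data%,*}      # remove the shortest suffix matching ',*' -> all but the last field

printf '%s\n' "$last" "$first" "$init"
```

No external process is started, which is why this is cheaper than cut or grep inside a loop.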
It is not possible using just cut. Here is a way using grep:
grep -o '[^,]*$'
Replace the comma for other delimiters.
Explanation:
-o (--only-matching) only outputs the part of the input that matches the pattern (the default is to print the entire line if it contains a match).
[^,] is a character class that matches any character other than a comma.
* matches the preceding pattern zero or more times, so [^,]* matches zero or more non‑comma characters.
$ matches the end of the string.
Putting this together, the pattern matches zero or more non-comma characters at the end of the string.
When there are multiple possible matches, grep prefers the one that starts earliest. So the entire last field will be matched.
Full example:
If we have a file called data.csv containing
one,two,three
foo,bar
then grep -o '[^,]*$' < data.csv will output
three
bar
Without awk?
It's simple with awk:
echo 'maps.google.com' | awk -F. '{print $NF}'
awk is a much more powerful tool to have in your pocket.
-F sets the field separator.
NF is the number of fields, so $NF refers to the last field.
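Because NF is recomputed for every line, the same one-liner copes with a field count that changes from line to line. A minimal sketch, wrapped in a hypothetical last_field helper; it assumes a single-character delimiter, since awk treats a longer -F value as a regular expression.

```shell
# last_field DELIM: print the last DELIM-separated field of every input line.
last_field() {
    awk -F"$1" '{ print $NF }'
}

printf 'one,two,three\nfoo,bar\n' | last_field ,
echo 'maps.google.com' | last_field .
```

Replacing $NF with $(NF-1) would select the second-to-last field instead.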
There are multiple ways. You may use this too.
echo "Your string here"| tr ' ' '\n' | tail -n1
> here
Obviously, the blank space input for tr command should be replaced with the delimiter you need.
This is the only possible solution that uses nothing but cut:
echo "s.t.r.i.n.g." | cut -d'.' -f2-
[repeat_following_part_forever_or_until_out_of_memory:] | cut -d'.' -f2-
With this solution the number of fields can indeed be unknown and vary from line to line. However, because a line must not exceed LINE_MAX characters (including the newline character), the number of fields is always bounded, so a truly arbitrary number of fields can never actually occur.
Yes, it is a very silly solution, but I think it is the only one that meets the criteria.
If your input string doesn't contain forward slashes then you can use basename and a subshell:
$ basename "$(echo 'maps.google.com' | tr '.' '/')"
This doesn't use sed or awk, but it doesn't use cut either, so I'm not quite sure it qualifies as an answer to the question as it's worded.
This doesn't work well for input strings that can contain forward slashes. A workaround is to replace the forward slash with some other character that you know isn't part of a valid input string. For example, if the pipe (|) character never occurs in your input, this would work:
$ basename "$(echo 'maps.google.com/some/url/things' | tr '/' '|' | tr '.' '/')" | tr '|' '/'
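The following implements a friend's suggestion:
#!/bin/bash
# Recursively drop the first field until no delimiter is left.
rcut(){
    # Quote "$1" so spaces and glob characters in the input survive.
    nu="$( printf '%s\n' "$1" | cut -d"$DELIM" -f 2- )"
    if [ "$nu" != "$1" ]
    then
        rcut "$nu"
    else
        printf '%s\n' "$nu"
    fi
}
$ export DELIM=.
$ rcut a.b.c.d
d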
An alternative using perl would be:
perl -pe 's/(.*)\t(.*)$/$2/' file
where you may replace \t with whatever delimiter the file uses.
awk is usually the better tool for tabular data. You don't have to restrict yourself to one command: if awk can do the job, why not use it? I suggest you save your precious time and use a handful of commands to get the job done.
Example:
# $NF refers to the last column in awk
ls -l | awk '{print $NF}'
If you have a file named filelist.txt that contains a list of paths such as the following:
c:/dir1/dir2/file1.h
c:/dir1/dir2/dir3/file2.h
then you can do this:
rev filelist.txt | cut -d"/" -f1 | rev
Adding an approach to this old question just for the fun of it:
$ cat input.file # file containing input that needs to be processed
a;b;c;d;e
1;2;3;4;5
no delimiter here
124;adsf;15454
foo;bar;is;null;info
$ cat tmp.sh # showing off the script to do the job
#!/bin/bash
delim=';'
while read -r line; do
while [[ "$line" =~ "$delim" ]]; do
line=$(cut -d"$delim" -f 2- <<<"$line")
done
echo "$line"
done < input.file
$ ./tmp.sh # output of above script/processed input file
e
5
no delimiter here
15454
info
Besides bash, only cut is used.
Well, and echo, I guess.
Using choose (a separate field-selection utility, not part of coreutils):
choose -1
choose supports negative indexing (the syntax is similar to Python's slices).
I realized that if we ensure a trailing delimiter exists, it works. In my case the delimiters are commas and whitespace, so I add a space at the end:
$ ans="a, b"
$ ans+=" "; echo ${ans} | tr ',' ' ' | tr -s ' ' | cut -d' ' -f2
b

How to check if a file contains only zeros in a Linux shell?

How can I check whether a large file contains only zero bytes ('\0') in Linux using a shell command? I could write a small program for this, but that seems like overkill.
If you're using bash, you can use read -n 1 to exit early if a non-NUL character has been found:
<your_file tr -d '\0' | read -n 1 || echo "All zeroes."
where you substitute the actual filename for your_file.
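The "file" /dev/zero returns an endless sequence of zero bytes on read, so cmp file /dev/zero should give essentially what you want: if file contains only zeros, cmp reports end-of-file on file (just beyond its length) instead of a differing byte; otherwise it reports the first byte that differs.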
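To get a clean exit status instead of an EOF message, the comparison can be limited to the length of the file. A minimal sketch, assuming a cmp that supports the -n (byte limit) option, as GNU and BSD cmp do; the is_all_zero name is made up for illustration.

```shell
# is_all_zero FILE: succeed if FILE contains nothing but NUL bytes.
is_all_zero() {
    # Arithmetic expansion strips the padding some wc implementations add.
    size=$(( $(wc -c < "$1") ))
    # Compare at most $size bytes, silently; exit status 0 means no difference.
    cmp -s -n "$size" "$1" /dev/zero
}

# Demo on a temporary file filled from /dev/zero.
tmp=$(mktemp)
head -c 4096 /dev/zero > "$tmp"
is_all_zero "$tmp" && echo "All zeros"
rm -f "$tmp"
```

cmp stops at the first differing byte, so a large file with early non-zero data is rejected quickly.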
If you have Bash,
cmp file <(tr -dc '\000' <file)
If you don't have Bash, the following should be POSIX (but I guess there may be legacy versions of cmp which are not comfortable with reading standard input):
tr -dc '\000' <file | cmp - file
Perhaps more economically, assuming your grep can read arbitrary binary data,
tr -d '\000' <file | grep -q -m 1 ^ || echo All zeros
I suppose you could tweak the last example even further with a dd pipe to truncate any output from tr after one block of data (in case there are very long sequences without newlines), or even down to one byte. Or maybe just force there to be newlines.
tr -d '\000' <file | tr -c '\000' '\n' | grep -q -m 1 ^ || echo All zeros
It won't win a prize for elegance, but:
xxd -p file | grep -qEv '^(00)*$'
xxd -p prints a file in the following way:
23696e636c756465203c6572726e6f2e683e0a23696e636c756465203c73
7464696f2e683e0a23696e636c756465203c7374646c69622e683e0a2369
6e636c756465203c737472696e672e683e0a0a766f696420757361676528
63686172202a70726f676e616d65290a7b0a09667072696e746628737464
6572722c202255736167653a202573203c
So we grep to see whether there is a line that is not made up entirely of 00 pairs, which means the file contains a character other than '\0'. If there is no such line, the file consists entirely of zero bytes.
(The exit status signals which case occurred: 0 if a non-zero byte was found, 1 if the file is all zeros. I assumed you wanted it for a script. If not, tell me and I'll write something else.)
EDIT: added -E for grouping and -q to discard output.
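Straightforward:
if [ "$(tr -d '\000' < file | head -c 1 | wc -c)" -ne 0 ]; then
echo a nonzero byte
fi
The tr -d removes all null bytes. If any byte is left, head passes one through and wc -c counts it, so the test sees a nonzero count. Note that the escape must be '\000': '\0000' would also delete the character 0. Counting the bytes, rather than testing the unquoted output with -n, avoids a test that is always true and also handles a file whose only other byte is a newline.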
Completely changed my answer based on the reply here
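Try
perl -0777 -ne 'print /\A\x00+\z/ ? "yes" : "no"' file
(\A and \z anchor the match to the very start and end of the file; $ would also accept a trailing newline.)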

Shell script to get count of a variable from a single line output

How can I get the count of the # character in the following output? I used the tr command to extract it, but I am curious what the best way is, and what other ways there are of doing the same thing.
{running_device,[test#01,test#02]},
My solution was:
echo '{running_device,[test#01,test#02]},' | tr ',' '\n' | grep '#' | wc -l
I think it is simpler to use:
echo '{running_device,[test#01,test#02]},' | tr -cd '#' | wc -c
This yields 2 for me (tested on Mac OS X 10.7.5). The -c option to tr means 'complement' (of the set of specified characters) and -d means 'delete', so that deletes every non-# character, and wc counts what's provided (no newline, so the line count is 0, but the character count is 2).
Nothing wrong with your approach. Here are a couple of other approaches:
echo '{running_device,[test#01,test#02]},' | awk -F'#' '{print NF - 1}'
or
echo $(( $(echo '{running_device,[test#01,test#02]}' | sed 's/[^#]//g' | wc -c) - 1 ))
The only concern I would have is if you are running this command in a loop (e.g. once for every line in a large file). If that is the case, then execution time could be an issue as stringing together shell utilities incurs the overhead of launching processes which can be sloooow. If this is the case, then I would suggest writing a pure awk version to process the entire file.
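A minimal sketch of such a pure awk version: gsub returns the number of replacements it made, so replacing every # with nothing yields the count for each line in a single process. The count_hashes name and the second sample line are made up for illustration.

```shell
# count_hashes: print the number of '#' characters on each input line.
count_hashes() {
    awk '{ print gsub(/#/, "") }'
}

printf '%s\n' '{running_device,[test#01,test#02]},' 'no hashes here' | count_hashes
```

For a whole file, count_hashes < file prints one count per line without starting a new process for each one.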
Use GNU Grep to Avoid Character Translation
Here's another way to do this that I personally find more intuitive: extract just the matching characters with grep, then count grep's output lines. For example:
echo '{running_device,[test#01,test#02]},' |
grep --fixed-strings --only-matching '#' |
wc -l
yields 2 as the result.

Split output of command by columns using Bash?

I want to do this:
run a command
capture the output
select a line
select a column of that line
Just as an example, let's say I want to get the command name from a $PID (please note this is just an example, I'm not suggesting this is the easiest way to get a command name from a process id - my real problem is with another command whose output format I can't control).
If I run ps I get:
PID TTY TIME CMD
11383 pts/1 00:00:00 bash
11771 pts/1 00:00:00 ps
Now I do ps | egrep 11383 and get
11383 pts/1 00:00:00 bash
Next step: ps | egrep 11383 | cut -d" " -f 4. Output is:
<absolutely nothing/>
The problem is that cut splits the output on single spaces, and because ps pads the space between the 2nd and 3rd columns to keep some resemblance of a table, cut picks an empty string. Of course, I could use cut to select the 7th field instead of the 4th, but how can I know which one, especially when the output is variable and not known beforehand?
One easy way is to add a pass of tr to squeeze any repeated field separators out:
$ ps | egrep 11383 | tr -s ' ' | cut -d ' ' -f 4
I think the simplest way is to use awk. Example:
$ echo "11383 pts/1 00:00:00 bash" | awk '{ print $4; }'
bash
Please note that the tr -s ' ' option will not remove any single leading spaces. If your column is right-aligned (as with ps pid)...
$ ps h -o pid,user -C ssh,sshd | tr -s " "
1543 root
19645 root
19731 root
Cutting the first field then prints an empty line for every row whose pid is still preceded by a space (here, the row for 1543):
$ <previous command> | cut -d ' ' -f1
19645
19731
Unless you prepend a space to every line, obviously, so that the wanted column is consistently the second field:
$ <command> | sed -e "s/.*/ &/" | tr -s " "
Now, for this particular case of pid numbers (not names), there is a command called pgrep:
$ pgrep ssh
Shell functions
However, in general it is actually still possible to use shell functions in a concise manner, because there is a neat thing about the read command:
$ <command> | while read a b; do echo $a; done
The first parameter to read, a, selects the first column, and if there is more, everything else will be put in b. As a result, you never need more variables than the number of your column +1.
So,
while read a b c d; do echo $c; done
will then output the 3rd column. Note, however, that a piped read is executed in an environment (a subshell) that does not pass variables back to the calling script, so capture the output instead:
out=$(ps whatever | { read a b c d; echo $c; })
arr=($(ps whatever | { read a b c d; echo $c $b; }))
echo ${arr[1]} # will output the value of b
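Another way to keep the variables in the current shell is to avoid the pipe and feed read from a here-document (or, in bash, a here-string or process substitution). A minimal sketch, using the ps line from the question as fixed sample input; the variable names are arbitrary.

```shell
# read splits the line on IFS; the last variable receives whatever is left over.
read -r pid tty cputime cmd <<EOF
11383 pts/1 00:00:00 bash
EOF

echo "$cmd"    # the 4th column, still set because no subshell was involved
```

With live command output in bash, the equivalent would be read -r pid tty cputime cmd < <(some_command), where some_command stands for whatever produces the line.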
The Array Solution
So we end up with the answer by @frayser, which uses the shell variable IFS (by default space, tab, and newline) to split the string into an array. It only works in Bash, though; Dash and Ash do not support arrays. I have had a really hard time splitting a string into components on a Busybox system. It is easy enough to get a single component (e.g. using awk) and then repeat that for every parameter you need, but then you end up calling awk repeatedly on the same line, or repeatedly using a read block with echo on the same line, which is neither efficient nor pretty. So you end up splitting using ${name%% *} and so on. It makes you yearn for some Python skills, because shell scripting is not much fun once half or more of the features you are accustomed to are gone. But you can assume that even Python would not be installed on such a system, and it wasn't ;-).
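Try this (ksh syntax: |& starts a coprocess and read -p reads from it; in bash, pipe directly with ps | while read -r first second third fourth etc):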
ps |&
while read -p first second third fourth etc ; do
if [[ $first == '11383' ]]
then
echo got: $fourth
fi
done
Your command
ps | egrep 11383 | cut -d" " -f 4
misses a tr -s to squeeze spaces, as unwind explains in his answer.
However, you may want to use awk, since it handles all of these actions in a single command:
ps | awk '/11383/ {print $4}'
This prints the 4th column of the lines containing 11383. If you want to match 11383 only when it appears at the beginning of the line, use ps | awk '/^11383/ {print $4}'.
Using array variables
set $(ps | egrep "^11383 "); echo $4
or
A=( $(ps | egrep "^11383 ") ) ; echo ${A[3]}
Similar to brianegge's awk solution, here is the Perl equivalent:
ps | egrep 11383 | perl -lane 'print $F[3]'
-a enables autosplit mode, which populates the @F array with the column data.
Use -F, (the -F option followed by a comma) if your data is comma-delimited rather than space-delimited.
Field 3 is printed since Perl starts counting from 0 rather than 1.
Getting the correct line (example for line no. 6) is done with head and tail and the correct word (word no. 4) can be captured with awk:
command|head -n 6|tail -n 1|awk '{print $4}'
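The same selection can be done by awk alone, since NR holds the current line number. A minimal sketch; the pick function name and its argument order are made up for illustration.

```shell
# pick LINE COLUMN: print one whitespace-separated column of one input line.
pick() {
    awk -v line="$1" -v col="$2" 'NR == line { print $col; exit }'
}

# Line 2, column 4 of some ps-like sample output.
printf '%s\n' 'PID TTY TIME CMD' '11383 pts/1 00:00:00 bash' '11771 pts/1 00:00:00 ps' | pick 2 4
```

The exit stops awk as soon as the wanted line has been printed, so the rest of the input is not scanned.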
Instead of doing all these greps and stuff, I'd advise you to use ps's ability to change its output format.
ps -o args= -p 12345
You get the command line of the process with the specified pid and nothing else (use comm= for just the command name).
The args and comm format names are specified by POSIX and may thus be considered portable; cmd is a Linux procps alias for args.
Bash's set will parse all output into positional parameters.
For instance, after set -- $(free -h), echo $7 will show "Mem:".
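The reason the seventh parameter is "Mem:" is that word splitting ignores line boundaries: the six header words are counted first. A minimal sketch using simulated free -h output (the numbers are made up), with globbing switched off so that no token is expanded as a filename pattern.

```shell
# Simulated `free -h` output; the figures are invented for illustration.
output='               total        used        free      shared  buff/cache   available
Mem:            15Gi       4.0Gi       8.0Gi       1.0Gi       3.0Gi        10Gi'

set -f                 # disable globbing while the output is word-split
set -- $output         # every whitespace-separated token becomes $1, $2, ...
set +f

seventh=$7
echo "$seventh"        # six header words come first, so this is "Mem:"
```

This overwrites the script's own positional parameters, so it is best done inside a function or after the original arguments have been saved.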
