why does echo -n "100" | wc -c output 3? - linux

I was playing around with a few Linux commands and found that echo -n "100" | wc -c outputs 3. I knew that 100 can be stored in a single byte as 1100100, so I could not understand why this happened. I guess it is because of some terminal encoding, is it? I also found that if I touch test.txt, run echo -n "100" > test.txt, and then execute wc ./test.txt -c, I get the same output. Here my guess is to blame the file encoding. Am I right?

100 is three characters long, hence wc giving you 3. If you left out the -n to echo it'd show 4, because echo would be printing out a newline too in that case.

When you echo -n 100, you are showing a string with 3 characters.
When you want to show a character with ascii value 100, use
echo -n "d"
# Check
echo -n "d" | xxd -b
I found value "d" with man ascii. When you don't want to use the man page, use
printf "\\$(printf "%o" 100)"
# Check
printf "\\$(printf "%o" 100)" | xxd -b
# wc returns 1 here
printf "\\$(printf "%o" 100)" | wc -c
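The difference between the text "100" and a byte whose value is 100 can be checked directly. This is a minimal sketch, assuming a POSIX shell with wc, od and tr available; the variable names are only illustrative.

```shell
# Count the bytes in the text "100" (printf adds no trailing newline).
text_bytes=$(printf '100' | wc -c | tr -d ' ')
# Show those bytes in hex: the ASCII codes of '1', '0', '0'.
text_hex=$(printf '100' | od -An -tx1 | tr -d ' \n')
# Emit a single byte whose value is 100 (octal 144, the letter "d").
value_bytes=$(printf '\144' | wc -c | tr -d ' ')
echo "text: $text_bytes bytes ($text_hex), value: $value_bytes byte"
```

The text form is three bytes (0x31 0x30 0x30), while the numeric value 100 fits in one byte.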

This is expected:
$ wc --help
...
-c, --bytes print the byte counts
-m, --chars print the character counts
...
$ man echo
...
-n do not output the trailing newline
...
$ echo -n 'abc' | wc -c
3
$ echo -n 'абс' | wc -c # russian symbols
6
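The gap between -c and -m shown in the help text can be seen with the same Cyrillic string. This is a sketch that assumes the script is saved as UTF-8; the result of wc -m depends on the current locale, so only the byte count is stated as certain here.

```shell
s='абс'
# -c counts bytes: each of these Cyrillic letters takes 2 bytes in UTF-8.
byte_count=$(printf '%s' "$s" | wc -c | tr -d ' ')
# -m counts characters according to the current locale; in a UTF-8
# locale this should be 3, in the C locale it falls back to bytes.
char_count=$(printf '%s' "$s" | wc -m | tr -d ' ')
echo "bytes: $byte_count, chars: $char_count"
```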

Related

I want to check if some given files contain more than 3 words from an input file in a shell script

My first parameter is the file that contains the given words and the rest are the other directories in which I'm searching for files, that contain at least 3 of the words from the 1st parameter
I can successfully print out the number of matching words, but when testing if it's greater than 3 it gives me the error: test: too many arguments
Here's my code:
#!/bin/bash
file=$1
shift 1
for i in $*
do
for j in `find $i`
do
if test -f "$j"
then
if test grep -o -w "`cat $file`" $j | wc -w -ge 3
then
echo $j
fi
fi
done
done
You first need to execute grep | wc and then compare its output with 3, so you need to change your if statement. Since you are already using backquotes, which cannot be nested without escaping, you can use the other syntax, $(command), which is equivalent to `command`:
if [ $(grep -o -w "`cat $file`" $j | wc -w) -ge 3 ]
then
echo $j
fi
I believe your problem is that you are trying to get the result of grep -o -w "`cat $file`" $j | wc -w to see if it's greater than or equal to three, but your syntax is incorrect. Try this instead:
if test $(grep -o -w "`cat $file`" $j | wc -w) -ge 3
By putting the grep & wc commands inside the $(), the shell executes those commands and uses the output rather than the text of the commands themselves. Consider this:
> cat words
western
found
better
remember
> echo "cat words | wc -w"
cat words | wc -w
> echo $(cat words | wc -w)
4
> echo "cat words | wc -w gives you $(cat words | wc -w)"
cat words | wc -w gives you 4
>
Note that the $() syntax is equivalent to the backtick notation you're already using for the cat $file command.
Hope this helps!
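The pattern from these answers, capturing a count with $() and then comparing it, can be tried on its own. This is a small sketch using a temporary file with the same four words as above; the variable names are illustrative.

```shell
tmp=$(mktemp)
printf 'western\nfound\nbetter\nremember\n' > "$tmp"
# Run the pipeline first and capture its output (tr strips the padding
# that some wc implementations add).
count=$(wc -w < "$tmp" | tr -d ' ')
# Then compare the captured number with test / [ ].
if [ "$count" -ge 3 ]; then
  result="at least 3 words"
else
  result="fewer than 3 words"
fi
echo "$count: $result"
rm -f "$tmp"
```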
Your code can be refactored and corrected in a few places.
Try it this way:
#!/bin/bash
input="$1"
shift
for dir; do
while IFS= read -r -d '' file; do
if [[ $(grep -woFf "$input" "$file" | sort -u | wc -l) -ge 3 ]]; then
echo "$file"
fi
done < <(find "$dir" -type f -print0)
done
for dir loops through all the arguments
Use of sort -u is to remove duplicate words from output of grep.
Use wc -l instead of wc -w since grep -o prints matching words on separate lines.
find ... -print0 is to take care of file that may have whitespaces.
find ... -type f is to retrieve only files and avoid checking for -f later.
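To see why sort -u matters in that pipeline, here is a small sketch with made-up sample data (the file names words and sample are hypothetical). A word that appears twice is counted twice without sort -u.

```shell
dir=$(mktemp -d)
printf 'western\nfound\nbetter\nremember\n' > "$dir/words"
printf 'I found a better way and found it again.\n' > "$dir/sample"
# Every match, one per line: found, better, found.
total=$(grep -woFf "$dir/words" "$dir/sample" | wc -l | tr -d ' ')
# Distinct matches only: better, found.
distinct=$(grep -woFf "$dir/words" "$dir/sample" | sort -u | wc -l | tr -d ' ')
echo "total matches: $total, distinct words: $distinct"
rm -rf "$dir"
```

Here the sample file would pass a "3 matches" test but contains only 2 distinct words from the list.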

How to get pipe string length?

This is a code that shows my all user names.
-q user | grep -A 0 -B 2 -e uid:\ 5'[0-9][0-9]' | grep ^name | cut -d " " -f2-
For example, the output is like...
usernameone
hello
whoami
Then I want to check the length of each user name.
Like this output...
11 //usernameone
5 //hello
6 //whoami
How can I get the length of each line of a pipeline's output?
Given some command cmd that produces the list of users, you can do this pretty easily with xargs:
$ cat x
usernameone
hello
whoami
$ cat x | xargs -L 1 sh -c 'printf "%s //%s\n" "$(echo -n "$1" | wc -c)" "$1"' '{}'
11 //usernameone
5 //hello
6 //whoami
Doing this purely with piped commands might not be possible, so here's a one-liner that splits the output and uses a while loop to accomplish this:
-q user | grep -A 0 -B 2 -e uid:\ 5'[0-9][0-9]' | grep ^name | cut -d " " -f2-|tr " " "\n"|while read -r user; do echo $(printf %s "$user"|wc -c) "//$user";done
This should give you output in the desired format. I used a file named user as the input, hence the cat:
i=0;for token in $(cat user); do echo -n "${#token} //$token";echo;i=$((i+1));done;echo;
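The same idea can be written as a filter that reads names from standard input, so it can sit at the end of any pipeline. This is a sketch; user_lengths is a hypothetical helper name, and ${#name} counts characters, which equals the byte count for plain ASCII names.

```shell
# Print "<length> //<name>" for every line read from stdin.
user_lengths() {
  while IFS= read -r name; do
    printf '%s //%s\n' "${#name}" "$name"
  done
}

out=$(printf 'usernameone\nhello\nwhoami\n' | user_lengths)
echo "$out"
```

In practice the printf on the left of the pipe would be replaced by whatever command produces the list of names.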

Why does the wc command count one more character than expected?

The following is the content stored in my file
This is my Input
So, using wc -c command we can get the number of characters stored in the file.
My expected output for the above file, edited with Vim on Ubuntu, is 16, but the wc -c command returns 17.
Why is the output like this? There isn't even a carriage return at end of line. So, what is the 17th character?
There is a newline at the end of the line; you just can't see it. Consider these two examples:
echo -n "This is my Input" | wc -c
16
Because -n suppresses the trailing newline, but
echo "This is my Input" | wc -c
17
Look at this example to see the newline:
echo "This is my Input" | od -c
od dumps files in octal and other formats. -c selects ASCII characters or backslash escapes.
And here is an example for file and usage of od:
On Linux, when Vim saves a buffer, it terminates every line, including the last one, with a newline.
You can open your file and run :%!xxd to view a hex dump, or use the hexdump yourfile command directly.
0000000: 5468 6973 2069 7320 6d79 2049 6e70 7574 This is my Input
0000010: 0a .
~
~
~
There you can see that 0a has been appended at the end of the file.
So when you use wc -c to count the bytes of this file, it returns 17, which includes the newline.
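A quick way to confirm that the extra byte is a newline, without opening an editor, is to look at the last byte of the file. This is a sketch that recreates the file with printf instead of Vim; the temporary file is only for illustration.

```shell
f=$(mktemp)
# Write the line with a terminating newline, as Vim would on save.
printf 'This is my Input\n' > "$f"
size=$(wc -c < "$f" | tr -d ' ')
# Dump only the final byte in hex; 0a is the newline character.
last=$(tail -c 1 "$f" | od -An -tx1 | tr -d ' \n')
echo "size: $size, last byte: 0x$last"
rm -f "$f"
```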
The string you pass in has no newline, but echo appends one, and wc -c counts that newline along with the rest of echo's output.
For example
echo k | wc -c
returns 2 because 1 for k and 1 for new line appended by echo.
While
echo -n k | wc -c
returns 1 because -n suppresses the newline.
wc -c itself always counts a newline when one is present.
You can try
printf k | wc -c
It returns 1.
See what it does in file:
bash-4.1$ echo 1234 > newfile
bash-4.1$ cat newfile
1234
bash-4.1$ cat -e newfile
1234$
bash-4.1$ printf 1234 > newfile
bash-4.1$ cat newfile
1234bash-4.1$ cat -e newfile
1234bash-4.1$
You get 17 because of the trailing newline character (\n) at the end of the file.

Using -0 with xargs

I am trying to give an input to xargs that is NUL separated. To this effect I have this:
$ echo -n abc$'\000'def$'\000' | xargs -0 -L 1
I get
abcdef
I wonder why it doesn't print the output as
abc
def
Your main problem is that you forgot -e:
$ echo -n abc$'\000'def$'\000' |cat -v
abcdef
No zero bytes are seen. But this:
$ echo -en abc'\000'def'\000' |cat -v
abc^@def^@
is more like it; ^@ is how cat -v shows a zero byte. And now for xargs:
$ echo -en abc'\000'def'\000' | xargs -0 -L 1
abc
def
Try help echo from your bash prompt.
Try passing the input as one quoted string and let echo -e interpret the escapes:
echo -ne "abc\0def\0" | xargs -0 -L 1
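Because echo's handling of options and escapes differs between shells, printf is a more portable way to emit NUL bytes. This is a sketch, assuming an xargs that supports -0 (GNU and BSD versions do); -n 1 is used here to run echo once per item.

```shell
# Each \0 in the printf format becomes one NUL byte: 3 + 1 + 3 + 1 = 8.
nul_bytes=$(printf 'abc\0def\0' | wc -c | tr -d ' ')
# xargs -0 splits its input on NUL and passes one item per echo call.
out=$(printf 'abc\0def\0' | xargs -0 -n 1 echo)
echo "$nul_bytes bytes"
echo "$out"
```

The original attempt fails because a shell string cannot hold a NUL byte, so abc$'\000'def reaches echo as plain abcdef.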

bash line concatenation during variable interpolation

$ cat isbndb.sample | wc -l
13
$ var=$(cat isbndb.sample); echo $var | wc -l
1
Why is the newline character missing when I assign the string to the variable? How can I keep the newline character from being converted into a space?
I am using bash.
You have to quote the variable to preserve the newlines.
$ var=$(cat isbndb.sample); echo "$var" | wc -l
And cat is unnecessary in both cases:
$ wc -l < isbndb.sample
$ var=$(< isbndb.sample); echo "$var" | wc -l
Edit:
Bash normally strips extra trailing newlines from a file when it assigns its contents to a variable. You have to resort to some tricks to preserve them. Try this:
IFS='' read -d '' var < isbndb.sample; echo "$var" | wc -l
Setting IFS to null prevents the file from being split on the newlines and setting the delimiter for read to null makes it accept the file until the end of file.
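Both effects described in this answer, word splitting of an unquoted variable and the stripping of trailing newlines by $(), can be reproduced without the sample file. This is a sketch for bash or a POSIX sh; the variable names are illustrative.

```shell
var=$(printf 'one\ntwo\nthree\n')
# Unquoted: the shell splits on whitespace and echo joins with spaces.
unquoted=$(echo $var | wc -l | tr -d ' ')
# Quoted: the embedded newlines are passed through unchanged.
quoted=$(echo "$var" | wc -l | tr -d ' ')
# Command substitution drops all trailing newlines, so only "x" remains.
stripped=$(printf 'x\n\n\n')
stripped_len=${#stripped}
echo "unquoted: $unquoted, quoted: $quoted, stripped length: $stripped_len"
```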
var=($(< file))
echo ${#var[@]}
