bash line concatenation during variable interpolation - string

$ cat isbndb.sample | wc -l
13
$ var=$(cat isbndb.sample); echo $var | wc -l
1
Why is the newline character missing when I assign the string to the variable? How can I keep the newline character from being converted into a space?
I am using bash.

You have to quote the variable to preserve the newlines.
$ var=$(cat isbndb.sample); echo "$var" | wc -l
And cat is unnecessary in both cases:
$ wc -l < isbndb.sample
$ var=$(< isbndb.sample); echo "$var" | wc -l
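A minimal sketch of why the quotes matter, using a made-up three-line string in place of isbndb.sample (the variable names are illustrative only). Unquoted, the shell word-splits the value and echo joins the pieces with spaces; quoted, the newlines survive.

```bash
#!/bin/bash
# Hypothetical sample data standing in for the file contents.
var=$'line one\nline two\nline three'

# Unquoted: word splitting turns every newline into an argument boundary,
# and echo joins its arguments with single spaces.
unquoted_lines=$(echo $var | wc -l)

# Quoted: the value reaches echo as one argument, newlines intact.
quoted_lines=$(echo "$var" | wc -l)

echo "unquoted: $unquoted_lines, quoted: $quoted_lines"
```

The unquoted form should report 1 line and the quoted form 3.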
Edit:
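Command substitution strips all trailing newlines from the output before the value is assigned to the variable. You have to resort to some tricks to preserve them. Try this:
IFS= read -r -d '' var < isbndb.sample; printf '%s' "$var" | wc -l
Setting IFS to the empty string keeps read from trimming leading and trailing whitespace (including those newlines), -r stops it from interpreting backslashes, and setting the delimiter to the empty string makes it read until the end of the file. printf '%s' is used instead of echo so that no extra newline is added to the count. Note that read returns a non-zero status here because it reaches end of file without finding the delimiter.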
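To see the difference between the two ways of loading a file, here is a small sketch that uses a throwaway temp file (hypothetical content, not isbndb.sample) and compares the lengths of the resulting variables.

```bash
#!/bin/bash
# Hypothetical scratch file: two lines of text followed by two blank lines.
tmpfile=$(mktemp)
printf 'first\nsecond\n\n\n' > "$tmpfile"

# Command substitution drops every trailing newline.
subst=$(< "$tmpfile")

# read with an empty delimiter reads to end of file; it returns non-zero
# because it never sees a NUL delimiter, so its status is ignored here.
IFS= read -r -d '' slurped < "$tmpfile" || true

echo "command substitution: ${#subst} characters"
echo "read -d '':           ${#slurped} characters"
rm -f "$tmpfile"
```

The file holds 15 characters; the command substitution keeps only 12 of them because the three trailing newlines are removed.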

var=($(< file))
echo ${#var[@]}

Related

How to count the number of numbers/letters in file?

I try to count the number of numbers and letters in my file in Bash.
I know that I can use wc -c file to count the number of characters but how can I fix it to only letters and secondly numbers?
Here's a way completely avoiding pipes, just using tr and the shell's way to give the length of a variable with ${#variable}:
$ cat file
123 sdf
231 (3)
huh? 564
242 wr =!
$ NUMBERS=$(tr -dc '[:digit:]' < file)
$ LETTERS=$(tr -dc '[:alpha:]' < file)
$ ALNUM=$(tr -dc '[:alnum:]' < file)
$ echo ${#NUMBERS} ${#LETTERS} ${#ALNUM}
13 8 21
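To count the number of letters and numbers you can combine grep with wc. grep -o prints each match on its own line, so count the lines with wc -l (wc -c would also count the newline after every match). Note that [a-z] matches lowercase letters only; use [[:alpha:]] to include uppercase.
grep -o '[a-z]' myfile | wc -l
grep -o '[0-9]' myfile | wc -l
With a little tweaking you can count whole runs instead, that is numbers, alphabetic words, or alphanumeric words (-E enables the + quantifier):
grep -oE '[a-z]+' myfile | wc -l
grep -oE '[0-9]+' myfile | wc -l
grep -oE '[[:alnum:]]+' myfile | wc -l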
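You can use sed to delete every character that is not of the kind you are looking for and then count the characters that remain. Note that sed's p command prints a trailing newline, which wc -c includes, so the result is one higher than the number of matching characters.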
# 1h;1!H will place all lines into the buffer that way you can replace
# newline characters
sed -n '1h;1!H;${;g;s/[^a-zA-Z]//g;p;}' myfile | wc -c
It's easy enough to just do numbers as well.
sed -n '1h;1!H;${;g;s/[^0-9]//g;p;}' myfile | wc -c
Or why not both.
sed -n '1h;1!H;${;g;s/[^0-9a-zA-Z]//g;p;}' myfile | wc -c
There are a number of ways to analyze the line, word, and character counts of a text file in bash. Using the POSIX character classes (e.g. [:upper:], [:lower:], and so on), you can drill down to how often each type of character occurs. Below is a simple script that reads from stdin, prints the usual wc counts as its first line of output, and then prints the number of uppercase, lowercase, digit, punctuation, and whitespace characters.
#!/bin/bash
declare -i lines=0
declare -i words=0
declare -i chars=0
declare -i upper=0
declare -i lower=0
declare -i digit=0
declare -i punct=0
oifs="$IFS"
# Read line with new IFS, preserve whitespace
while IFS=$'\n' read -r line; do
    # parse line into words with original IFS
    IFS=$oifs
    set -- $line
    IFS=$'\n'
    # Add up lines, words, chars, upper, lower, digit
    lines=$((lines + 1))
    words=$((words + $#))
    chars=$((chars + ${#line} + 1))
    for ((i = 0; i < ${#line}; i++)); do
        [[ ${line:$((i)):1} =~ [[:upper:]] ]] && ((upper++))
        [[ ${line:$((i)):1} =~ [[:lower:]] ]] && ((lower++))
        [[ ${line:$((i)):1} =~ [[:digit:]] ]] && ((digit++))
        [[ ${line:$((i)):1} =~ [[:punct:]] ]] && ((punct++))
    done
done
# Input comes from stdin, so there is no filename to print after the counts
echo " $lines $words $chars"
echo " upper: $upper, lower: $lower, digit: $digit, punct: $punct, \
whitespace: $((chars-upper-lower-digit-punct))"
Test Input
$ cat dat/captnjackn.txt
This is a tale
Of Captain Jack Sparrow
A Pirate So Brave
On the Seven Seas.
(along with 2357 other pirates)
Example Use/Output
$ bash wcount3.sh <dat/captnjackn.txt
5 21 108
upper: 12, lower: 68, digit: 4, punct: 3, whitespace: 21
You can customize the script to give you as little or as much detail as you like. Let me know if you have any questions.
You can use tr to keep only alphanumeric characters by combining the -c (complement) and -d (delete) flags. From there on, it's just a question of some piping:
$ cat myfile.txt | tr -cd '[:alnum:]' | wc -c

Error Shell Script

When I try to run this script, this error appears: operating extra /home/ubuntu/Desktop/Destino/. I do not know why; can someone help me, please?
#!/bin/bash
input="/home/ubuntu/Desktop/Output/SAIDA.txt"
dt=`date +"%Y%m%d%H%M%S"`
layout='C'
if [ -e "$input" ] ; then
header=$(head -n 1 $input)
export header
tail -n +2 $input | split -l 99 -d --additional-suffix=.txt \ --filter='{ printf %s\\n "$header"; cat; }' >/home/ubuntu/Desktop/Destino/$FILE - NOMENCLATURA_${dt}_
for arquivo in ´Is/home/ubuntu/Desktop/*.txt´
do
NOME= ´cat $arquivo | cut -d "." -f1´
touch/home/ubuntu/Desktop/Destino/$NOME.cfg
echo $dt > $NOME.cfg
echo $layout > $NOME.cfg
done
else
echo "The input file does not exist."
fi
You have some strange quote characters in your script. To substitute the output of a command, wrap it with $() or backticks, not ´ characters.
for arquivo in ´Is/home/ubuntu/Desktop/*.txt´
I guess Is was meant to be ls, but you left out the space after it. But there's no need to parse the output of ls, just use the wildcard directly.
for arquivo in /home/ubuntu/Desktop/*.txt
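A short sketch of why the wildcard is safer than parsing ls. The scratch directory and file names below are invented for the demonstration and are not the paths from the question.

```bash
#!/bin/bash
# Hypothetical scratch directory with one awkward filename.
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/two words.txt"

# Parsing ls: the unquoted command substitution is word-split, so the
# name containing a space is broken into two items.
ls_count=0
for f in $(ls "$dir"/*.txt); do
    ls_count=$((ls_count + 1))
done

# Globbing directly: each matching file is exactly one item.
glob_count=0
for f in "$dir"/*.txt; do
    glob_count=$((glob_count + 1))
done

echo "ls loop saw $ls_count items, glob loop saw $glob_count"
rm -rf "$dir"
```

With two files present, the ls loop iterates three times while the glob loop iterates twice.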
On this line:
tail -n +2 $input | split -l 99 -d --additional-suffix=.txt \ --filter='{ printf %s\\n "$header"; cat; }' >/home/ubuntu/Desktop/Destino/$FILE - NOMENCLATURA_${dt}_
you need to put the output filename in quotes because of the spaces.
tail -n +2 $input | split -l 99 -d --additional-suffix=.txt \ --filter='{ printf %s\\n "$header"; cat; }' >"/home/ubuntu/Desktop/Destino/$FILE - NOMENCLATURA_${dt}_"
Also, the FILE variable is not set, you need to assign that earlier.
On this line:
NOME= ´cat $arquivo | cut -d "." -f1´
you're again using the wrong type of quotes to capture the output of the command. Also, you must not have a space between = and the value you want to assign. It should be:
NOME=$(cat $arquivo | cut -d "." -f1)
There's no need to do export header. The variable is only being used in this script, not in any child processes.

concatenate the result of echo and a command output

I have the following code:
names=$(ls *$1*.txt)
head -q -n 1 $names | cut -d "_" -f 2
where the first line finds and stores all names matching the command line input into a variable called names, and the second grabs the first line in each file (element of the variable names) and outputs the second part of the line based on the "_" delim.
This is all good, however I would like to prepend the filename (stored as lines in the variable names) to the output of cut. I have tried:
names=$(ls *$1*.txt)
head -q -n 1 $names | echo -n "$names" cut -d "_" -f 2
however this only prints out the filenames
I have tried
names=$(ls *$1*.txt)
head -q -n 1 $names | echo -n "$names"; cut -d "_" -f 2
and again I only print out the filenames.
The desired output is:
$
filename1.txt <second character>
where there is a single whitespace between the filename and the result of cut.
Thank you.
Best approach, using awk
You can do this all in one invocation of awk:
awk -F_ 'NR==1{print FILENAME, $2; exit}' *"$1"*.txt
On the first line of the first file, this prints the filename and the value of the second column, then exits.
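The question's use of several file names suggests that every matching file may be wanted, not only the first one. Under that assumption, a variant can key on FNR (the per-file line number) instead of NR and drop the exit. The sample files and the pattern variable below are invented for the demonstration.

```bash
#!/bin/bash
# Hypothetical sample files; only their first lines matter.
dir=$(mktemp -d)
printf 'id_alpha_x\nignored_line_here\n' > "$dir/a1.txt"
printf 'id_beta_y\n' > "$dir/b1.txt"

pattern=1   # stands in for "$1" in the original command

# FNR restarts at 1 for each input file, so this prints one line per file.
result=$(cd "$dir" && awk -F_ 'FNR==1{print FILENAME, $2}' *"$pattern"*.txt)
echo "$result"
rm -rf "$dir"
```

This reads each file to the end, which is fine for small files; the single-file version above stops early because of its exit.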
Pure bash solution
I would always recommend against parsing ls; use a loop over the glob instead. You can also avoid awk for reading the first line of the file by using bash built-in functionality:
for i in *"$1"*.txt; do
    IFS=_ read -ra arr <"$i"
    echo "$i ${arr[1]}"
    break
done
Here we read the first line of the file into an array, splitting it into pieces on the _.
Maybe something like that will satisfy your need BUT THIS IS BAD CODING (see comments):
#!/bin/bash
names=$(ls *$1*.txt)
for f in $names
do
pattern=`head -q -n 1 $f | cut -d "_" -f 2`
echo "$f $pattern"
done
If I didn't misunderstand your goal, this also works.
I've always done it this way, but I just found out that it is a deprecated way to do it.
#!/bin/bash
names=$(ls *"$1"*.txt)
for e in $names;
do echo $e `echo "$e" | cut -c2-2`;
done

Command to count the characters present in the variable

I am trying to count the number of characters in a variable. I used the shell script below, but I am getting the error "command not found" in line 4.
#!/bin/bash
for i in one; do
n = $i | wc -c
echo $n
done
Can someone help me with this?
In bash you can just write ${#string}, which will return the length of the variable string, i.e. the number of characters in it.
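As a minimal sketch, the loop from the question can be rewritten with this parameter expansion alone, with no external command involved:

```bash
#!/bin/bash
# The loop from the question, rewritten with parameter expansion only.
for i in one; do
    n=${#i}      # length of the value of i, in characters
    echo "$n"
done
```

Unlike echo "$i" | wc -c, this does not count the newline that echo appends, so it yields 3 rather than 4 for the string one.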
Something like this:
#!/bin/bash
for i in one; do
n=$(echo $i | wc -c)
echo $n
done
Assignments in bash cannot have spaces around the equals sign: n = $i is parsed as a command named n with the arguments = and the value of $i, which is why you get "command not found". In addition, you want to capture the output of the command you run and assign that to n, which requires command substitution.
Use the following instead:
#!/bin/bash
for i in one; do
n=`echo "$i" | wc -c`
echo $n
done
It can be as simple as that:
str="abcdef"; wc -c <<< "$str"
7
But mind you that end of line counts as a character:
str="abcdef"; cat -A <<< "$str"
abcdef$
If you need to remove it:
str="abcdef"; tr -d '\n' <<< "$str" | wc -c
6

How to keep blank lines at the end of a file when I use the cat command in a shell script

the file a.txt has two blank lines at the end
[yaxin@oishi tmp]$ cat -n a.txt
1 jhasdfj
2
3 sdfjalskdf
4
5
and my script is:
[yaxin@oishi tmp]$ cat t.sh
#!/bin/sh
a=`cat a.txt`
a_length=`echo "$a" | awk 'END {print NR}'`
echo "$a"
echo $a_length
[yaxin@oishi tmp]$ sh t.sh
jhasdfj
sdfjalskdf
3
open debug
[yaxin@oishi tmp]$ sh -x t.sh
++ cat a.txt
+ a='jhasdfj
sdfjalskdf'
++ echo 'jhasdfj
sdfjalskdf'
++ awk 'END {print NR}'
+ a_length=3
+ echo 'jhasdfj
sdfjalskdf'
jhasdfj
sdfjalskdf
+ echo 3
3
The cat command steals the blank lines at the end of the file. How can I solve this problem?
The cat command does not steal anything. It is the command substitution that does. man bash says:
Bash performs the expansion by executing command and replacing the command substitution with the standard output of the command, with any trailing newlines deleted. Embedded newlines are not deleted
If you want to store the output of a command in a variable with its trailing newlines intact, you can add && echo . after the command, store the output, and then remove the final . from the variable.
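A sketch of that sentinel approach, using a temp file whose content mirrors a.txt from the question (the file and variable names here are illustrative):

```bash
#!/bin/bash
# Hypothetical stand-in for a.txt: three lines of content plus two blank lines.
tmpfile=$(mktemp)
printf 'jhasdfj\n\nsdfjalskdf\n\n\n' > "$tmpfile"

# Append a sentinel "." so the substitution has no trailing newlines to strip.
a=$(cat "$tmpfile" && echo .)
# Remove the sentinel again; the newlines before it stay in the variable.
a=${a%.}

# printf '%s' writes the value back without adding a newline of its own.
line_count=$(printf '%s' "$a" | wc -l)
echo "lines: $line_count"
rm -f "$tmpfile"
```

The count comes out as 5, matching the five lines that cat -n shows for the file.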
Also, to count the number of lines in a file, the canonical way is to run wc -l:
wc -l < a.txt
You don't need cat command here, directly use awk like this:
awk 'END {print NR}' a.txt
Your problem lies in storing cat's output in a shell variable. Even this will give the right output (though it is a useless use of cat):
cat a.txt | awk 'END {print NR}'
Update: When you try to do this:
a=`cat a.txt`
OR else:
a=$(cat a.txt)
The pitfall is that command substitution, i.e. a command inside backticks like you have or inside $(), strips trailing newlines.
You can do this trick to get trailing newlines stored in a shell variable:
a=`cat a.txt; echo ';'`
a="${a%;}"
Test the variable value:
echo "$a"
printf "%q" "$a"
Then output will show newlines as well:
jhasdfj
sdfjalskdf
$'jhasdfj\n\nsdfjalskdf\n\n\n'

Resources