Extract a substring in a BASH script with characters treated as numbers

I need to extract a substring from a string in a bash script.
This is the code with the echo statements:
echo "number:"
echo "$number"
echo "bb"
registers3=$(echo $number | grep -o -E '[0-9]+')
registers2="$(grep -oE '[0-9]+' <<< "$number")"
registers="${number//[^0-9]/}"
valor=$(grep -o "[0-9]" <<<"$number")
echo "valor:"
echo $valor
echo "reg:"
echo "$registers"
echo "reg2:"
echo "$registers2"
echo "reg3:"
echo "$registers3"
And this is the output:
number:
/ > -------
420
/ >
bb
valor:
1 0 3 4 4 2 0
reg:
1034420
reg2:
1034
420
reg3:
1034
420
The problem is the special characters in $number.
Can you help me extract only the number? In this case it is "420".
Thanks!!!
EDIT:
If I write $number to a file (echo "$number" > file.txt) and open it with vi using :set list, I get:
^[[?1034h/ > -------$
420$
/ > $

Instead of this:
registers=$(echo $number | grep -o -E '[0-9]+')
try this:
registers="$(grep -oE '[0-9]+' <<< "$number")"
or even better, this one:
registers="${number//[^0-9]/}"

Related

How to select a group of lines by their content in a file in GNU/Linux?

Good day. I have a problem: I need to get the lines that contain some specified content. The grep command lets me search a file for specified content, but only line by line. I would like to select content that spans several lines.
How can I do this?
Something like:
cat -n /etc/profile | grep "
if [ "$DISPLAY" != "" ]
then
xhost +si:localuser:root
fi
"
Thank you very much.
Suggesting to research grep with option -z.
But a better option is awk.
With awk it is possible to select a range: awk '/RegExp1/,/RegExp2/' input.txt
In your case:
awk '/if/,/fi/{print}' input.txt
will print all lines in the if .. fi range.
Also, with awk it is possible to define the record separator with the RS variable, for example using an empty line as the record separator: awk '1' RS="\n\n"
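A minimal sketch of that record-separator idea, assuming the block you want is separated from its neighbours by blank lines (an empty RS puts awk into paragraph mode) and that input.txt is your file:
awk -v RS= '/DISPLAY/' input.txt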
In the end I solved it like this...
sed -i "$(cat -n /etc/profile | gawk '/if \[ "\$DISPLAY" != "" \]/,/fi/{print $1}' | tr '\n' ',' | sed 's/,$//g' | sed -E 's/,(.+),/,/g')d" /etc/profile
cat -n /etc/profile shows the file content and line numbers
27 if [ -d /etc/profile.d ]; then
28 for i in /etc/profile.d/*.sh; do
29 if [ -r $i ]; then
30 . $i
31 fi
32 done
33 unset i
34 fi
35 #start
36 if [ "$DISPLAY" != "" ]
37 then
38 xhost +si:localuser:root
39 fi
40 #end
Thanks to @Dudi Boy; this command gawk '/if \[ "\$DISPLAY" != "" \]/,/fi/{print $1}' shows the line numbers of the specified section:
36
37
38
39
Now I replace each newline (\n) with a comma (,): tr '\n' ','
36,37,38,39,
I delete the last comma (,): sed 's/,$//g'
36,37,38,39
And keep only the first and last numbers, separated by a comma (,): sed -E 's/,(.+),/,/g'
36,39
Finally I use the result as the address of a sed delete command: sed -i "$(cat -n /etc/profile | gawk '/if \[ "\$DISPLAY" != "" \]/,/fi/{print $1}' | tr '\n' ',' | sed 's/,$//g' | sed -E 's/,(.+),/,/g')d" /etc/profile
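In this example the command substitution expands to 36,39, so the command that actually runs is equivalent to:
sed -i "36,39d" /etc/profile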
The result is
if [ -d /etc/profile.d ]; then
for i in /etc/profile.d/*.sh; do
if [ -r $i ]; then
. $i
fi
done
unset i
fi
#start
#end
Thank you very much
It is possible to accomplish the task of removing the ranged section with one gawk command:
gawk -i inplace '/if \[ "\$DISPLAY" != "" \]/,/fi/{next}1' /etc/profile
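For comparison, the same block can be removed with a plain sed address range, without computing line numbers first (a sketch using the same start/end patterns as the gawk command above):
sed -i '/if \[ "\$DISPLAY" != "" \]/,/fi/d' /etc/profile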

Delete every number after 14 digit count

I am reading numbers from a file, counting the total digits in each number, and trying to delete all digits after the 14th. With my current logic I am only able to remove one digit when the number exceeds the 14-digit count. I want to eliminate every digit past the 14-digit mark.
file:
numbers
123456789123454
3454323456765455
34543234567651666
34543234567652
34543234567653
logic.sh:
while read -r numbers || [[ -n "$numbers" ]]; do
digit_number="${imei//[^[:digit:]]/}"
echo "digit of number: ${#digit_number}"
if ((${#digit_number} > 14)); then
modified=${numbers%?}
echo $modified > res1.csv
else
echo $numbers >res1.csv
fi
done <$file
expected output
12345678912345
34543234567654
34543234567651
34543234567652
34543234567653
Using sed
$ sed 's/[0-9]//15g' file
12345678912345
34543234567654
34543234567651
34543234567652
34543234567653
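Here 15g tells sed to start at the 15th match of [0-9] on each line and, because of the trailing g, delete every match after that one as well, so only the first 14 digits survive. A quick check on one of the longer lines:
$ echo '34543234567651666' | sed 's/[0-9]//15g'
34543234567651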
You can use cut for that task:
╰─$ echo "12345678901234567890" | cut -c 1-14
12345678901234
There is also no need to read the file line by line:
╰─$ echo "numbers
123456789123454
3454323456765455
34543234567651666
34543234567652
34543234567653" > file
╰─$ cat file | cut -c 1-14 > output
╰─$ cat output
numbers
12345678912345
34543234567654
34543234567651
34543234567652
34543234567653
If you want to extract only numbers, how about
grep -Eo '^[0-9]{14}' file >res1.csv
I updated your script:
while read -r numbers || [[ -n "$numbers" ]]; do
    DIGITS=$(echo "$numbers" | cut -c 1-14)
    echo "$DIGITS" >> res1.csv
done < "$file"
Now output:
12345678912345
34543234567654
34543234567651
34543234567652
34543234567653
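A pure-bash alternative, a minimal sketch assuming each line holds just the number (substring expansion keeps at most the first 14 characters):
while read -r numbers || [[ -n "$numbers" ]]; do
    printf '%s\n' "${numbers:0:14}"
done < "$file" > res1.csv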

why does echo -n "100" | wc -c output 3?

I just happened to be playing around with a few Linux commands and I found that echo -n "100" | wc -c outputs 3. I knew that 100 could be stored in a single byte as 1100100, so I could not understand why this happened. I guess it is because of some terminal encoding, is it? I also found that if I touch test.txt, run echo -n "100" > test.txt and then execute wc ./test.txt -c, I get the same output; here too my guess is to blame file encoding. Am I right?
100 is three characters long, hence wc giving you 3. If you left out the -n to echo it'd show 4, because echo would be printing out a newline too in that case.
When you echo -n 100, you are showing a string with 3 characters.
When you want to show a character with ascii value 100, use
echo -n "d"
# Check
echo -n "d" | xdd -b
I found value "d" with man ascii. When you don't want to use the man page, use
printf "\\$(printf "%o" 100)"
# Check
printf "\\$(printf "%o" 100)" | xxd -b
# wc returns 1 here
printf "\\$(printf "%o" 100)" | wc -c
It's fine; the documentation explains it:
$ wc --help
...
-c, --bytes print the byte counts
-m, --chars print the character counts
...
$ man echo
...
-n do not output the trailing newline
...
$ echo -n 'abc' | wc -c
3
$ echo -n 'абс' | wc -c # russian symbols
6
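With -m instead of -c you get the character count rather than the byte count (assuming a UTF-8 locale):
$ echo -n 'абс' | wc -m
3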

I want to check if some given files contain more than 3 words from an input file in a shell script

My first parameter is the file that contains the given words; the rest are directories in which I'm searching for files that contain at least 3 of the words from the 1st parameter.
I can successfully print out the number of matching words, but when testing whether it's greater than 3 it gives me the error: test: too many arguments
Here's my code:
#!/bin/bash
file=$1
shift 1
for i in $*
do
for j in `find $i`
do
if test -f "$j"
then
if test grep -o -w "`cat $file`" $j | wc -w -ge 3
then
echo $j
fi
fi
done
done
You first need to execute the grep | wc, and then compare that output with 3. You need to change your if statement for that. Since you are already using the backquotes, you cannot nest them, so you can use the other syntax $(command), which is equivalent to `command`:
if [ $(grep -o -w "`cat $file`" $j | wc -w) -ge 3 ]
then
echo $j
fi
I believe your problem is that you are trying to get the result of grep -o -w "`cat $file`" $j | wc -w to see if it's greater than or equal to three, but your syntax is incorrect. Try this instead:
if test $(grep -o -w "`cat $file`" $j | wc -w) -ge 3
By putting the grep & wc commands inside the $(), the shell executes those commands and uses the output rather than the text of the commands themselves. Consider this:
> cat words
western
found
better
remember
> echo "cat words | wc -w"
cat words | wc -w
> echo $(cat words | wc -w)
4
> echo "cat words | wc -w gives you $(cat words | wc -w)"
cat words | wc -w gives you 4
>
Note that the $() syntax is equivalent to the double backtick notation you're already using for the cat $file command.
Hope this helps!
Your code can be refactored and corrected in a few places.
Have it this way:
#!/bin/bash
input="$1"
shift
for dir; do
while IFS= read -r -d '' file; do
if [[ $(grep -woFf "$input" "$file" | sort -u | wc -l) -ge 3 ]]; then
echo "$file"
fi
done < <(find "$dir" -type f -print0)
done
for dir; do loops through all the positional arguments.
sort -u removes duplicate words from the output of grep.
wc -l is used instead of wc -w since grep -o prints each matching word on its own line.
find ... -print0 takes care of file names that may contain whitespace.
find ... -type f retrieves only regular files, so there is no need to check with -f later.
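A hypothetical invocation, assuming the script is saved as checkwords.sh (the word list and directory names are placeholders):
./checkwords.sh words.txt dir1 dir2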

Why do I get an extra 0 on my script

I don't know why I get an extra 0 when I run my script.
This is my script: I run an SQL query and save the output to a file, valor.txt.
This is my array: array=(50 60 70)
Valor.txt:
count | trn_hst_id | trn_msg_host
-------+------------+--------------
11 | 50 | Aprobada
2 | 70 | Aprobada
(2 rows)
Code:
function service_status {
cd
cat valor.txt | grep $1 | gawk '{print $1}' FS="|" | sed "s/ //g"
if [ $? -eq 0 ]; then
echo -n 0
else
echo -n $1
fi
}
echo "<prtg>"
# <-- Start
for i in "${array[#]}"
do
echo -n " <result>
<channel>$i</channel>
<value>"
service_status $i
echo "</value>
</result>"
done
# End -->
echo "</prtg>"
exit
And this is my output.
<prtg>
<result>
<channel>50</channel>
<value>11
0</value>
</result>
<result>
<channel>60</channel>
<value>0</value>
</result>
<result>
<channel>70</channel>
<value>2
0</value>
</result>
</prtg>
Why do I get the 0 here? —
<value>2
0</value>
If I understand your comment correctly, you want to print the count. That is the value of the count column, if present in valor.txt, or 0 if the trn_hst_id in array is not in valor.txt. This should work (though not tested):
function service_status {
val=$(cat ~/valor.txt | grep $1 | gawk '{print $1}' FS="|" | sed "s/ //g")
# ^^ so you don't need to "cd" each time
# Save the value into "$val"
echo -n "${val:-0}" # If there is nothing in $val, print a 0
}
The "${val:-0}" sequence expands as "$val", if $val has text in it, or as a literal 0 otherwise. If the $1 wasn't in valor.txt, $val will be empty, so you will get a zero. See the wiki for more about how :- and friends work.
The "0" is the result of the echo -n 0 which is executed inside the function in case the awk command works properly (which is usually the case).
From the code it is not clear why is it written like it is. It's clear that is supposed to extract certain values from a file, but the 'if' condition seems to be checking the wrong thing, the return code of 'sed' which I bet is not what is intended. (better candidate would be the return code of 'grep'.
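A quick way to see that $? reports only the status of the last command in the pipeline (sed here), not of grep:
grep nomatch /dev/null | sed 's/x/y/'
echo $?    # prints 0: sed succeeded even though grep found nothing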
So I would write the function like this:
function service_status {
cd
var=$(cat valor.txt | grep $1 | gawk '{print $1}' FS="|" | sed "s/ //g")
if [ -z "$var" ]; then
echo -n 0
else
echo -n "$var"
fi
}
The variable 'var' will contain the result of the "search command". If the search does not return any value, 'var' will be empty and '0' will be the output of the function; otherwise the content of 'var' will be the output.
