Delete every digit after the 14th - Linux

I am reading numbers from a file, counting the digits in each number, and trying to delete every digit after the 14th. My current logic only removes a single digit when a number has more than 14 digits. I want to drop all digits beyond the 14th.
file:
numbers
123456789123454
3454323456765455
34543234567651666
34543234567652
34543234567653
logic.sh:
while read -r numbers || [[ -n "$numbers" ]]; do
digit_number="${numbers//[^[:digit:]]/}"
echo "digit of number: ${#digit_number}"
if ((${#digit_number} > 14)); then
modified=${numbers%?}
echo $modified > res1.csv
else
echo $numbers >res1.csv
fi
done <$file
expected output
12345678912345
34543234567654
34543234567651
34543234567652
34543234567653

Using sed
$ sed 's/[0-9]//15g' file
12345678912345
34543234567654
34543234567651
34543234567652
34543234567653

You can use cut for that task:
$ echo "12345678901234567890" | cut -c 1-14
12345678901234
There is also no need to read the file line by line:
$ echo "numbers
123456789123454
3454323456765455
34543234567651666
34543234567652
34543234567653" > file
$ cat file | cut -c 1-14 > output
$ cat output
numbers
12345678912345
34543234567654
34543234567651
34543234567652
34543234567653

If you want to extract only numbers, how about
grep -Eo '^[0-9]{14}' file >res1.csv

Here is your script, updated:
while read -r numbers || [[ -n "$numbers" ]]; do
    DIGITS=$(echo "$numbers" | cut -c 1-14)
    echo "$DIGITS" >> res1.csv
done < "$file"
Now output:
12345678912345
34543234567654
34543234567651
34543234567652
34543234567653
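A variant that also drops the header line, which the expected output in the question omits. This is a minimal sketch with awk, assuming every number sits alone on its line and contains only digits; the function name first14 is made up for this example.

```shell
# first14 (hypothetical helper): keep only all-digit lines and cut them to 14 characters.
first14() {
    awk '/^[0-9]+$/ { print substr($0, 1, 14) }' "$@"
}

# Reads standard input when no file name is given.
printf '%s\n' numbers 123456789123454 3454323456765455 34543234567652 | first14
```

Lines that contain anything other than digits (the header, blank lines, lines with a trailing carriage return) are skipped rather than truncated, so check the input format before relying on it.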


Linux script Pipe delimiter count check

I need to check the number of '|' delimiters on each line of a text file. I used an awk command for this and stored its output in a temp file.
The script produces a delimiter count for each row and finishes with exit code 0,
but for one of the lines it reports an arithmetic syntax error.
Below are the sample file data, the script, and the script output. Could someone tell me what causes the arithmetic syntax error and how to resolve it?
Text file sample data: each line of the sample below contains 5 '|' delimiters.
Name|Address|phone|pincode|location|
xyz|usa|123|111|NY|
abc|uk|123|222|LON|
pqr|asia|123|333|IND|
Script:
Standard_col_cnt="5"
cd /$SRC_FILE_PATH
touch temp.txt
col_cnt=`awk -F"|" '{print NF}' $SRC_FILE_PATH/temp.txt` >>$Logfile 2>&1
while read line
do
i=1
echo $line >/temp.txt
if [ "$col_cnt" -ne "$Standard_col_cnt" ]
then
echo "No of columns are not equal to the standard value in Line no - $i:" >>$Logfile
exit 1
fi
i=`expr $i + 1`
done < $File_name
Awk command will generate below output to temp file:
5
5
5
5
--------- Script output -----------
script.sh[59]: [: |xyz|usa|123|111|NY|
: arithmetic syntax error
+ expr 1 + 1
+ i=2
+ read line
+ i=1
+ echo 'xyz|usa|123|111|NY|\r'
+ script.sh[48]: /temp.txt: cannot create [Permission denied]
+ 'abc|uk|123|222|LON|\r' -ne 91 ]
script.sh[59]: [: pqr|asia|123|333|IND|: arithmetic syntax error
Your current script resets i to 1 on every iteration, because the assignment i=1 sits inside the loop.
It is also unclear how your awk code is meant to fill the temp file: the file seems to have just been created, and awk reads it to set col_cnt while it is still empty.
If you want to check that each line contains exactly 5 | delimiters, you can do so with awk alone.
Sample Data
$ cat test
Name|Address|phone|pincode|location|
xyz|usa|123|111|NY|
abc|uk|123222|LON|
pqr|asia|123|333|IND|
$ export logfile
$ cat script.awk
BEGIN {
    FS = "|"
    Standard_col_cnt = 5
    logfile = ENVIRON["logfile"]
}
{
    # NF - 1 is the number of | delimiters on the line
    if (NF - 1 != Standard_col_cnt)
        print "No of columns are not equal to the standard value in Line no - " NR > logfile
}
$ awk -f script.awk test
$ cat "$logfile"
No of columns are not equal to the standard value in Line no - 3
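The trace in the question shows a trailing \r on each line, which suggests the input file has DOS line endings. The sketch below counts the delimiters per line and strips that carriage return first. It is an illustration under that assumption, and the function name check_pipes and the message wording are made up here.

```shell
# check_pipes (hypothetical helper): list the lines whose '|' count differs from the expected one.
check_pipes() {
    expected=$1
    shift
    awk -v want="$expected" '
        { sub(/\r$/, "") }            # drop a trailing carriage return, if present
        { n = gsub(/[|]/, "|") }      # gsub returns how many delimiters it replaced
        n != want { print "Line " NR ": found " n " delimiters, expected " want }
    ' "$@"
}

# Two sample lines with DOS line endings; the second has only 2 delimiters.
printf 'a|b|c|\r\nd|e|\r\n' | check_pipes 3
```

Counting with gsub avoids the NF - 1 arithmetic and also reports lines that contain no delimiter at all.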
col_cnt=5
grep -o -n '[|]' input_file |awk '{print $1}' | uniq -c| \
awk -v d="$col_cnt" '$1!=d {print "No of columns are not equal to the standard value in Line no - "NR}'
No of columns are not equal to the standard value in Line no - 3
Another option, using only grep, cut, uniq and sed:
count=5
string="No of columns are not equal to the standard value in Line no -"
grep -o -n '[|]' input_file|cut -d: -f 1| uniq -c|sed "s/^ *//;"| sed "/^[${count} ]/d"|sed "s/^[^${count} ]/${string}/"
No of columns are not equal to the standard value in Line no - 3

Shell script to read file line by line and find value greater than 1000 and print

I have a file which contains data as below:
SYSTEM: Running, Fri Jan 6 00:00:01 GMT 2017
29 DEADLETTER
123 SU
1234 SR
100089 SM
1278969 DR
From this file, I want to read each line and check whether its value is greater than 1000. If it is greater than 1000, I want to execute one set of commands; if it is less than 1000, another set of commands.
Is it possible?
Let me share one way of doing this (I believe someone can share a better one)
You need to remove/skip the first line of file to make this work.
while read -r line
do
    num=$(echo "$line" | cut -d " " -f1)
    if (( num > 1000 ))
    then
        echo "your if logic"
    else
        echo "your else logic"
    fi
done < filename
I would use awk like this:
awk 'NR<3{next} $1<1000 {system("echo 1")} $1>1000{system("echo 2")}' file
Output
1
1
2
2
2
That says... "If the line number is less than 3, forget it and move to the next line. If the first field is less than 1000, run the command echo 1. If the first field is more than 1000, run the command echo 2".
Note that your question doesn't deal with the number being exactly 1000, so nor does my answer - but you could use $1<=1000 if you wanted to make it deal with 1000.
If you really wanted to only use bash shell:
#!/bin/bash
{ read header
read header
while read f1 f2; do
[ $f1 -lt 1000 ] && echo 1
[ $f1 -gt 1000 ] && echo 2
done } < file
The {...} is a compound statement that reads from file. Inside there, I read and discard two header lines then enter a loop reading subsequent lines into 2 separate fields, f1 and f2. I then check the value of f1 and execute a different echo depending on whether the field is less than, or more than 1000.
Same deal as above with first field being exactly 1000 - you can use -le for "less than or equal" and -ge for "greater than or equal".
Or you could use sed to get rid of the 2 header lines like this:
#!/bin/bash
sed '1,2d' file | while read f1 f2; do
[ $f1 -lt 1000 ] && echo 1
[ $f1 -gt 1000 ] && echo 2
done
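Instead of counting header lines, the loop can skip every line whose first field is not a plain number. This is a sketch in portable sh, assuming the values are non-negative integers. The function name classify and the echoed labels are placeholders for the two sets of commands.

```shell
# classify (hypothetical helper): label each numeric first field as above or not above a limit.
classify() {
    limit=$1
    while read -r f1 f2; do
        case $f1 in
            ''|*[!0-9]*) continue ;;   # skip headers, blank lines, anything non-numeric
        esac
        if [ "$f1" -gt "$limit" ]; then
            echo "$f1 $f2: above"      # replace with the first set of commands
        else
            echo "$f1 $f2: not above"  # replace with the second set of commands
        fi
    done
}

printf '%s\n' 'SYSTEM: Running, Fri Jan 6 00:00:01 GMT 2017' '29 DEADLETTER' '1234 SR' | classify 1000
```

Here a value of exactly 1000 falls into the "not above" branch; use -ge in the test if it should go the other way.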
This is my solution.
cat yourfile.txt | awk -F " " '{if ($1<THRESHOLD) print $1;else print $2}' > result.txt
Explanation: the -F option sets the field separator, here a space " ". Replace THRESHOLD with the value to compare against.
Example:
cat yourfile.txt | awk -F " " '{if ($1<1000) print $1;else print $2 }' > result.txt
Input
yourfile.txt contains:
29 DEADLETTER
123 SU
1234 SR
100089 SM
1278969 DR
Output
result.txt looks like this:
29
123
SR
SM
DR

How can I get a character of a string at a particular index?

I'm trying to write a script where the user enters a number as a parameter and the script calculates the sum of all the digits e.g.,
./myScript 963
18
So the script takes the string "963" and adds all the characters in the string 9+6+3=18. I'm thinking I could get the length of the string and use a loop to add all the indexes of the string together but I cannot figure out how to get an index of the string without already knowing the character you're looking for.
I was able to break the string up using the following command,
echo "963" | fold -w1
9
6
3
But I'm not sure if/how I could pipe | or redirect > the results into a variable and add it to a total each time.
How can I get a character of a string at a particular index?
Update:
Example 1:
$1=59 then the operation is
5+9=14
Example 2:
$1=2222 then the operation is
2+2+2+2=8
All the characters in the string are added to a total sum.
The following script loops through all of the digits in the input string and adds them together:
#!/bin/bash
s="$1"
for ((i=0; i<${#s}; ++i)); do
((t+=${s:i:1}))
done
echo "sum of digits: $t"
The syntax ${s:i:1} extracts a substring of length 1 from position i in the string $s.
Output:
$ ./add.sh 963
sum of digits: 18
If you wanted to continue adding together the digits until there was only one remaining, you could do this instead:
#!/bin/bash
s="$1"
while (( ${#s} > 1 )); do
t=0
for ((i=0; i<${#s}; ++i)); do
((t+=${s:i:1}))
done
echo "iteration $((++n)): $t"
s=$t
done
echo "final result: $s"
The outer while loop continues as long as the length of the string is greater than 1. The inner for loop adds together each digit in the string.
Output:
$ ./add.sh 963
iteration 1: 18
iteration 2: 9
final result: 9
Not that you asked for it but there are many ways to sum all of the digits in a string. Here's another one using Perl:
$ perl -MList::Util=sum -F -anE 'say sum @F' <<<639
18
List::Util is a core module in Perl. The sum subroutine does a reduction sum on a list to produce a single value. -a enables auto-split mode so the input is split into the array @F. -F is used to set the field delimiter (in this case it is blank, so every character counts as a separate field). -n processes every line of input one at a time and -E is used to enter a Perl one-liner but with newer features (such as say) enabled. say is like print but a newline is added to the output.
If you're not familiar with the <<< syntax, it is equivalent to echo 639 | perl ....
Not using string subscription but computing the desired sum:
number=963
sum=0
for d in `echo "$number" | sed 's,\(.\), \1,g'`
do
sum=$(($sum + $d))
done
echo $sum
Output: 18
I would do this:
num="963"
echo "$num" | grep -o . | paste -sd+ - | bc
#or using your fold
echo "$num" | fold -w1 | paste -sd+ - | bc
Both print
18
Explanation
grep -o . prints each digit of the number on its own line, just as fold -w1 does
paste -sd+ - merges those lines into one, using + as the delimiter, which builds a calculation string such as 9+6+3
bc performs the calculation
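To see the intermediate string that the pipeline builds, the stages can be run separately. In this sketch shell arithmetic stands in for bc, so it needs nothing beyond fold and paste; the variable name expression is made up for the example, and the input is assumed to contain only digits.

```shell
num=963
# Split into one digit per line, then join the lines with "+".
expression=$(printf '%s\n' "$num" | fold -w1 | paste -sd+ -)
echo "$expression"            # the string handed to the calculator: 9+6+3
# Shell arithmetic evaluates the same expression that bc would receive.
echo "$(( $expression ))"     # 18
```

bc remains the better choice for very long inputs, since shell arithmetic is limited to machine-sized integers.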
if you want script, e.g. digadd.sh use
grep -o . <<<"$1" | paste -sd+ - | bc
using it
$ bash digadd.sh #nothing
$ #will return nothing
$ bash digadd.sh 1273617617273450359345873647586378242349239471289638982
268
$
For fun, doing this in loop until the result is only 1 digit
num=12938932923849028940802934092840924
while (( ${#num} > 1 ))
do
echo -n "sum of digits for $num is:"
num=$(echo "$num" | grep -o . | paste -sd+ - | bc)
echo $num
done
echo "final result: $num"
prints
sum of digits for 12938932923849028940802934092840924 is:159
sum of digits for 159 is:15
sum of digits for 15 is:6
final result: 6
Another fun variant, which extracts all digits from any string:
grep -oP '\d' <<<"$1" | paste -sd+ - | bc
so using it in the script digadd.sh like
bash digadd.sh 9q6w3
produces
18
To answer the question in the title: to get the Nth character of any string you can use
echo "${string:POSITION:LENGTH}" # POSITION counts from 0
e.g. to get the 1st digit
echo "${num:0:1}"
You can use cut with -c parameter to get character at any position. for example:
echo "963" | cut -c1
Outputs: 9
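Note that cut -c counts positions from 1, while the ${s:i:1} expansion counts from 0. A sketch that wraps cut in a small function and uses it to sum the digits in portable sh; the name char_at is made up for this example.

```shell
# char_at (hypothetical helper): print the character at a 1-based position of a string.
char_at() {
    printf '%s\n' "$1" | cut -c "$2"
}

s=963
i=1
sum=0
# Walk positions 1..length and add each digit to the total.
while [ "$i" -le "${#s}" ]; do
    sum=$(( sum + $(char_at "$s" "$i") ))
    i=$(( i + 1 ))
done
echo "$sum"   # 18
```

Each call starts a cut process, so this is slower than the substring expansion for long strings.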
Using awk:
awk 'split($0,a,""){for(i in a) sum+=a[i]}END{print sum}' <<<"$1"
This can be done using substring manipulation (supported by busybox ash, but not posix sh compliant)
#!/bin/ash
i=0
sum=0
while [ $i -lt ${#1} ]; do
sum=$((sum+${1:i:1}));
i=$((i+1))
done
echo $sum
If you really must have a posix shell compliant version, you can use:
#!/bin/sh
sum=0
A=$1
while [ ${#B} -lt ${#A} ];do
B=$B?
done
while [ "$A" ]; do
B=${B#?*}
sum=$((sum+${A%$B}))
A=${A#?*}
done
echo $sum
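An arithmetic alternative that needs no string indexing at all: take the last digit with % 10 and drop it with / 10. This sketch is also POSIX sh, but it assumes a non-negative integer without leading zeros that fits the shell's integer arithmetic, so it will not handle the 55-digit example above. The function name digit_sum is made up here.

```shell
# digit_sum (hypothetical helper): sum the decimal digits of a non-negative integer.
digit_sum() {
    n=$1
    total=0
    while [ "$n" -gt 0 ]; do
        total=$(( total + n % 10 ))   # add the last digit
        n=$(( n / 10 ))               # drop the last digit
    done
    echo "$total"
}

digit_sum 963   # 18
```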

grep -o: Keep input line format

$ echo "abca\ndeaf" | grep -o a
a
a
a
I am looking for the output:
aa
a
Or perhaps
a a
a
or even
a<TAB>a
a
(this is a very very simplified example)
I just want it not to throw away the line grouping.
You can do it with sed by removing any character that isn't a:
echo "abca\ndeaf" | sed 's/[^a]//g'
aa
a
It can't be done with grep alone.
@sudo_O's answer shows how to do this with single-character strings. The difficulty level is raised if you want to match longer strings.
One way to do it is by parsing the output of grep -n -o, like so:
$ cat mgrep
#!/bin/bash
# Print each match along with its line number.
grep -no "$@" | {
    matches=() # An array of matches to be printed when the line number changes.
    lastLine= # Keep track of the current and previous line numbers.
    # Read the matches, with `:' as the separator.
    while IFS=: read line match; do
        # If this is the same line number as the previous match, add this one to
        # the list.
        if [[ $line = $lastLine ]]; then
            matches+=("$match")
        # Otherwise, print out the list of matches we've accumulated and start
        # over.
        else
            (( ${#matches[@]} )) && echo "${matches[@]}"
            matches=("$match")
        fi
        lastLine=$line
    done
    # Print any remaining matches.
    (( ${#matches[@]} )) && echo "${matches[@]}"
}
Example usage:
$ echo $'abca\ndeaf' | ./mgrep a
a a
a
$ echo $'foo bar foo\nbaz\ni like food' | ./mgrep foo
foo foo
foo
Based off John Kugelman's solution, this one works with one input file and gawk
grep -on abc file.txt | awk -v RS='[[:digit:]]+:' 'NF{$1=$1; print}'
If you're willing to use perl:
$ echo $'abca\ndeaf' | perl -ne '@m = /a/g; print "@m\n"'
a a
a
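The same grouping can be sketched in awk alone by calling match() repeatedly on the remainder of each line. This is an illustration, not a drop-in grep replacement: the pattern is treated as an awk extended regular expression, matches are joined with a space, and the function name matches_per_line is made up here.

```shell
# matches_per_line (hypothetical helper): print all matches of a pattern,
# one output line per input line that contains at least one match.
matches_per_line() {
    pattern=$1
    shift
    awk -v re="$pattern" '
    {
        rest = $0
        out = ""
        # match() sets RSTART and RLENGTH; stop on no match or an empty match
        while (match(rest, re) && RLENGTH > 0) {
            found = substr(rest, RSTART, RLENGTH)
            out = (out == "") ? found : out " " found
            rest = substr(rest, RSTART + RLENGTH)
        }
        if (out != "") print out
    }' "$@"
}

printf 'foo bar foo\nbaz\ni like food\n' | matches_per_line foo
```

Because each search restarts on the remainder of the line, anchored patterns such as ^foo can match more than once, so this sketch suits unanchored patterns only.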

Bash: extract (percent) number of variable length from a string

I want to write a little progress bar using a bash script.
To generate the progress bar I have to extract the progress from a log file.
The content of such a file (here run.log) looks like this:
Time to finish 2d 15h, 42.5% completed, time steps left 231856
I'm now interested in isolating the 42.5%. The problem is that the length of this number varies, as does its position (e.g. 'Time to finish' might contain only one number, like 23h or 59min).
I tried it over the position via
echo "$(tail -1 run.log | awk '{print $6}'| sed -e 's/[%]//g')"
which fails for short 'Time to finish' as well as via the %-sign
echo "$(tail -1 run.log | egrep -o '[0-9][0-9].[0-9]%')"
Here it works only for values >= 10%.
Is there a more flexible way to extract the number?
======================================================
Update: Here is now the full script for the progress bar:
#!/bin/bash
# extract % complete from run.log
perc="$(tail -1 run.log | grep -o '[^ ]*%')"
# convert perc to int
pint="${perc/.*}"
# number of # to plot
nums="$(echo "$pint /2" | bc)"
# output
echo -e ""
echo -e " completed: $perc"
echo -ne " "
for i in $(seq $nums); do echo -n '#'; done
echo -e ""
echo -e " |----.----|----.----|----.----|----.----|----.----|"
echo -e " 0% 20% 40% 60% 80% 100%"
echo -e ""
tail -1 run.log
echo -e ""
Thanks for your help, guys!
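The bar itself can also be drawn without bc or seq. This is a sketch in portable sh, assuming the percentage has already been extracted as a number such as 42.5 or 7; the function name bar is made up for the example.

```shell
# bar (hypothetical helper): print one '#' for every 2 percent.
bar() {
    pct=${1%%.*}              # integer part: 42.5 -> 42, 7 -> 7
    n=$(( pct / 2 ))          # number of characters to draw
    i=0
    out=""
    while [ "$i" -lt "$n" ]; do
        out="$out#"
        i=$(( i + 1 ))
    done
    printf '%s\n' "$out"
}

bar 42.5   # prints 21 '#' characters
```

An empty argument makes the arithmetic fail, so check that the extraction found a value before calling it.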
based on your example
grep -o '[^ ]*%'
should give what you want.
You can extract the percentage with the command below:
tail -n 1 run.log | grep -o -P '[0-9]*(\.[0-9]*)?(?=%)'
Explanation:
grep options:
-o : Print only matching string.
-P : Use perl style regex
regex parts:
[0-9]* : Match any digit, repeated any number of times.
(\.[0-9]*)? : Match a decimal point followed by any number of digits.
The ? at the end makes the group optional, which takes care of numbers without a fractional part.
(?=%) :The regex before this must be followed by a % sign. (search for "positive look-ahead" for more details.)
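Where grep -P is not available, a similar extraction can be sketched with awk's match(), which needs no look-ahead: match the number together with the % sign, then print everything except the last character. The function name percent_of is made up here, and the pattern requires at least one digit before an optional fraction.

```shell
# percent_of (hypothetical helper): print the first "<number>%" on each line, without the % sign.
percent_of() {
    awk 'match($0, /[0-9]+(\.[0-9]+)?%/) { print substr($0, RSTART, RLENGTH - 1) }' "$@"
}

echo 'Time to finish 2d 15h, 42.5% completed, time steps left 231856' | percent_of   # 42.5
echo 'Time to finish 59min, 7% completed, time steps left 12' | percent_of            # 7
```

For the log file in the question it would be used as tail -n 1 run.log | percent_of.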
You should be able to isolate the progress after the first comma (,) in your file, i.e. you want the characters between , and %.
There are many ways to achieve your goal. I would prefer using cut several times as it is easy to read.
tail -n 1 run.log | cut -f1 -d'%' | cut -f2 -d',' | cut -f2 -d' '
After first cut:
Time to finish 2d 15h, 42.5
After second (note space):
42.5
And the last one just to get rid of space, the final result:
42.5
