What does wc -w do in an echo and tr command? - linux

I'm currently learning Linux and stumbled upon something I don't really understand.
I was given the following command:
echo "12345"|wc -w|tr "123" "321"
The output of this command is 3, so I thought it might count how many of these digits had changed. After some testing I concluded that it actually shows the first character of the second tr argument, since that held in many cases.
For a while I thought I had worked it out, but then I found a specific case:
echo "46817"|wc -w|tr "46817" "64194", which outputs 9, and I have no idea why.
What does the whole command output in the general case?

The last command, tr, translates characters in the output of the second command. Since wc -w counts the words in its input (here the count is 1), tr then changes the digit 1 to 9.

echo "12345"|wc -w|tr "123" "321" (outputs 3)
echo "46817"|wc -w|tr "46817" "64194" (outputs 9)
The above commands are pipelines, in which the output of each command is fed to the next one. Commands are separated by "|" (a symbol named, surprise!, "pipe"). In both pipelines the commands do the following:
echo: outputs something (to wc).
wc: counts characters, words, or lines. "wc -w" counts words, so it outputs "1", because "12345" and "46817" are each a single word containing no word separator.
tr: "translates", i.e. replaces the characters it receives with other ones. With "123" "321", every 1 (first char of 123) is translated into 3 (first char of 321), every 2 (second char of 123) into 2 (second char of 321), and so on. With "46817" "64194", 1 is the fourth char of the first set, so it becomes 9, the fourth char of the second set.
In both commands tr receives "1" as input and turns that "1" into some other character.
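To make the stages visible, the second pipeline can be run one step at a time. This is only a small sketch; the variable names count and result are illustrative, and the tr -d ' ' is there because some wc implementations pad the count with spaces.

```shell
# Stage 1 and 2: echo prints a single word, wc -w counts the words.
count=$(echo "46817" | wc -w | tr -d ' ')
echo "word count: $count"    # word count: 1

# Stage 3: tr maps each character of the first set to the character
# at the same position in the second set (4->6, 6->4, 8->1, 1->9, 7->4).
result=$(echo "$count" | tr "46817" "64194")
echo "after tr: $result"     # after tr: 9
```

The digits passed to echo never reach tr; only the word count does.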

Related

How to return only integers from a variable in Shell Script and discard letters and leading zeros?

In my shell script there is a parameter that comes from certain systems and it gives an answer similar to this one: PAR0000008.
I need to send only the trailing number of this parameter to another variable, i.e. VAR=8.
I used the command VAR=$( echo ${PAR} | cut -c 10 ) and it worked perfectly.
The problem is when the PAR parameter comes back with a two-digit number, like PAR0000012. I need to discard the leading zeros and send only the number 12 to the variable, but I don't know how to write the logic in the shell to discard all the characters to the left of the number.
Edit Using grep To Handle 0 As Part Of Final Number
Since you are using a POSIX shell, using a utility like sed or grep (or cut) makes sense. grep is quite a bit more flexible in parsing the string, since a regex match can handle the job. Say your variable is v=PAR0312012 and you want the result r=312012. You can use a command substitution (e.g. $(...)) to parse the value and assign the result to r, e.g.
v=PAR0312012
r=$(echo "$v" | grep -Eo '[1-9].*$')
echo "$r"
The grep expression is:
-Eo - use extended regex and return only the matching portion of the string (-o is supported by GNU and BSD grep but is not required by POSIX),
[1-9].*$ - from the first character in [1-9], return the remainder of the string.
This will work for PAR0000012 or PAR0312012 (with result 312012).
Result
For PAR0312012
312012
Another Solution Using expr
If your variable can have zeros as part of the final number portion, then you must find the index where the first [1-9] character occurs, and then assign the substring beginning at that index to your result variable.
GNU expr (from coreutils) provides a set of string parsing operations that can do this; note that index and substr are GNU extensions, not part of POSIX expr. The needed commands are:
expr index string charlist
and
expr substr string start length
Where start is the 1-based position at which to begin and length is the number of characters to extract. length just has to be large enough to cover the rest of the string, so you can simply use the total length of your string, e.g.
v=PAR0312012
ndx=$(expr index "$v" "123456789")
r=$(expr substr "$v" "$ndx" 10)
echo $r
Result
312012
This will handle 0 anywhere after the first [1-9].
(note: the old expr ... isn't the fastest way of handling this, but if you are only concerned with a few tens of thousands of values, it will work fine. A billion numbers and another method will likely be needed)
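This can be done easily using parameter expansion. Note that both patterns below remove everything up to the last zero (or last character outside 1-9), so they only work when the number itself contains no zeros; PAR0312012 would give 12.
var='PAR0000008'
echo "${var##*0}"
# prints 8
echo "${var##*[^1-9]}"
# prints 8
var="${var##*0}"
echo "$var"
# prints 8
var='PAR0000012'
echo "${var##*0}"
# prints 12
echo "${var##*[^1-9]}"
# prints 12
var="${var##*[^1-9]}"
echo "$var"
# prints 12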
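A parameter-expansion variant that also copes with zeros inside the number (the PAR0312012 case from the other answers) could look like the sketch below. It assumes the value is a non-digit prefix followed by digits; the function name strip_to_number and its local variable names are hypothetical.

```shell
strip_to_number() {
  # Everything before the first digit, e.g. "PAR".
  prefix=${1%%[0-9]*}
  digits=${1#"$prefix"}
  # The run of zeros before the first non-zero digit.
  zeros=${digits%%[1-9]*}
  r=${digits#"$zeros"}
  printf '%s\n' "$r"
}

strip_to_number PAR0000008   # prints 8
strip_to_number PAR0000012   # prints 12
strip_to_number PAR0312012   # prints 312012
```

If the numeric part consists only of zeros, the result is an empty string, so that case would need separate handling.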

Show rows of a file which have a regular expression more than 'n' number of times

I have a file, abc.txt, in the format below:
a:,b:,c:,d:,e:,f:,g:
a:0;b:,c:3,d:,e:,f:,g:1
a:9,b:8,c:6,d:5,e:2,f:,g:
a:0;b:,c:2,d:1,e:,f:,g:
Now in unix, I want to get only those rows where this regular expression :[0-9] (colon followed by any number) exists more than 2 times.
Or in other words show rows where at least 3 attributes have numerical values present.
Output should be only 2nd and 3rd row
a:0;b:,c:3,d:,e:,f:,g:1
a:9,b:8,c:6,d:5,e:2,f:,g:
With basic grep:
grep '\(:[[:digit:]].*\)\{3,\}' file
:[[:digit:]].* matches a colon followed by a digit and zero or more arbitrary characters. This expression is put into a sub pattern: \(...\). The expression \{3,\} means that the previous expression has to occur 3 or more times.
With extended posix regular expressions this can be written a little simpler, without the need to escape ( and {:
grep -E '(:[[:digit:]].*){3,}' file
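As a cross-check on the grep patterns above, the matches per line can also be counted explicitly. This is a minimal sketch, assuming the sample data from the question is written to a temporary file (the mktemp file and the variable names are only for illustration). gsub returns the number of replacements made, and replacing each match with itself ("&") leaves the line unchanged.

```shell
sample=$(mktemp)
cat > "$sample" <<'EOF'
a:,b:,c:,d:,e:,f:,g:
a:0;b:,c:3,d:,e:,f:,g:1
a:9,b:8,c:6,d:5,e:2,f:,g:
a:0;b:,c:2,d:1,e:,f:,g:
EOF

# Print rows in which ":<digit>" occurs at least 3 times.
rows=$(awk '{ n = gsub(/:[0-9]/, "&") } n >= 3' "$sample")
printf '%s\n' "$rows"
rm -f "$sample"
```

On this sample the fourth row is printed as well, since it contains three numeric values (a:0, c:2, d:1).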
$ awk -F':[0-9]' 'NF>3' file
a:0;b:,c:3,d:,e:,f:,g:1
a:9,b:8,c:6,d:5,e:2,f:,g:
a:0;b:,c:2,d:1,e:,f:,g:
perl -nE '/:[0-9](?{$count++})(?!)/; print if $count > 2; $count=0' input
perl -ne 'print if /(.*?\:\d.*?){2,}/' yourfile
This matches rows having character:number two or more times; since the question asks for more than 2 occurrences, change {2,} to {3,} for that.
https://regex101.com/r/tRWtbY/1

How to search for string including digits by grep command

I have strings in a file in below format:
fixedstring_1
fixedstring_23
fixedstring_456
...
fixedstring_[1 to n digits]
I tried grep -E "fixedstring_[.....n times]" filepath in the terminal, but it failed.
I want commands to get the count (-c) and list the lines.
If I understand correctly, given the following file...
fixedstring_1
bar
fixedstring_456
foo
fixedstring_45622
fixedstring_
fixedstring
You want to match (and get the count of) only these lines:
fixedstring_1
fixedstring_456
fixedstring_45622
This should work:
grep -Ec 'fixedstring_[[:digit:]]+' filename
The [[:digit:]]+ part matches 1 or more digits. More on grep regexes here: http://www.gnu.org/savannah-checkouts/gnu/grep/manual/grep.html#Regular-Expressions
EDIT:
If you want to match strings with only a certain number of digits, you'll have to get a little more clever:
grep -E 'fixedstring_[[:digit:]]{MIN,MAX}([^[:digit:]]|$)' filename
Replace the MIN with the minimum number of digits you want to match, and MAX with the max.
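For instance, to keep only the lines with two or three digits, MIN and MAX could be filled in as below. This is a small sketch that feeds inline sample lines to grep instead of reading a file; the sample values are taken from the examples above.

```shell
# Lines whose digit run is 2 or 3 long; the trailing
# ([^[:digit:]]|$) stops a longer run from matching.
matches=$(printf '%s\n' fixedstring_1 fixedstring_23 fixedstring_456 fixedstring_45622 |
  grep -E 'fixedstring_[[:digit:]]{2,3}([^[:digit:]]|$)')
printf '%s\n' "$matches"

# Same filter, but only the count (-c).
count=$(printf '%s\n' fixedstring_1 fixedstring_23 fixedstring_456 fixedstring_45622 |
  grep -Ec 'fixedstring_[[:digit:]]{2,3}([^[:digit:]]|$)')
echo "$count"    # 2
```

Only fixedstring_23 and fixedstring_456 are listed; fixedstring_1 has too few digits and fixedstring_45622 too many.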

Return value of sed for no match

I'm using sed to update my JSON configuration file at runtime.
Sometimes, when the pattern doesn't match in the JSON file, sed still exits with return code 0.
Returning 0 means successful completion, but why does sed return 0 if it doesn't find the proper pattern and update the file? Is there a workaround for that?
As @cnicutar commented, the return code of a command indicates whether the command itself executed successfully. It has nothing to do with the logic you implemented in your script.
So if you have:
echo "foo"|sed '/bar/ s/a/b/'
sed will return 0. But if the script contains a syntax or expression error, or the input file doesn't exist, sed cannot carry out your request and will return a non-zero status.
Workaround
This is not really a workaround. GNU sed's q command accepts an exit code (from the man page):
q [exit-code]
Here you can define exit-code as you want. For example '/foo/!{q100}; {s/f/b/}' will exit with code 100 if foo isn't present, and otherwise perform the substitution f->b and exit with code 0.
Matched case:
kent$ echo "foo" | sed '/foo/!{q100}; {s/f/b/}'
boo
kent$ echo $?
0
Unmatched case:
kent$ echo "trash" | sed '/foo/!{q100}; {s/f/b/}'
trash
kent$ echo $?
100
I hope this answers your question.
edit
I must add that the above example handles only single-line input. I don't know your exact requirement: whether you want a non-zero exit when a single line fails to match, or only when nothing in the whole file matches. For the whole-file case, you may consider awk, or even run grep before your text processing...
This might work for you (GNU sed):
sed '/search-string/{s//replacement-string/;h};${x;/./{x;q0};x;q1}' file
If the search-string is found it will be replaced with replacement-string and at end-of-file sed will exit with 0 return code. If no substitution takes place the return code will be 1.
A more detailed explanation:
In sed the user has two registers at their disposal: the pattern space (PS), into which the current line is loaded (minus the linefeed), and a spare register called the hold space (HS), which is initially empty.
The general idea is to use the HS as a flag to indicate if a substitution has taken place. If the HS is still empty at the end of the file, then no changes have been made, otherwise changes have occurred.
The command /search-string/ matches search-string with whatever is in the PS and if it is found to contain the search-string the commands between the following curly braces are executed.
First comes the substitution s//replacement-string/ (sed reuses the last regexp, i.e. the search-string, if the left-hand side is empty, so s//replacement-string/ is the same as s/search-string/replacement-string/); following this, the h command copies the PS into the HS.
The sed command $ is used to recognise the last line of a file and the following then occurs.
First the x command swaps the two registers, so the HS becomes the PS and the PS becomes the HS.
Then the PS is searched for any character /./ (. means match any character) remember the HS (now the PS) was initially empty until a substitution took place. If the condition is true the x is again executed followed by q0 command which ends all sed processing and sets the return code to 0. Otherwise the x command is executed and the return code is set to 1.
N.B. although the q quits sed processing it does not prevent the PS from being reassembled by sed and printed as per normal.
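To see the exit status in action, the command can be fed a couple of lines on standard input. This is a sketch assuming GNU sed (q with an exit code is a GNU extension); the input lines and the patterns beta and gamma are made up for the demonstration.

```shell
# A line matches: the substitution happens and sed exits with 0.
out=$(printf 'alpha\nbeta\n' | sed '/beta/{s//BETA/;h};${x;/./{x;q0};x;q1}')
status_match=$?
echo "$out"                      # alpha, then BETA
echo "status: $status_match"     # status: 0

# No line matches: the input passes through unchanged and sed exits with 1.
status_nomatch=0
printf 'alpha\nbeta\n' |
  sed '/gamma/{s//GAMMA/;h};${x;/./{x;q0};x;q1}' >/dev/null || status_nomatch=$?
echo "status: $status_nomatch"   # status: 1
```

Both runs still print every input line, which matches the N.B. above about q not suppressing the final print.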
Another alternative:
sed '/search-string/!ba;s//replacement-string/;h;:a;$!b;p;x;/./Q;Q1' file
or:
sed '/search-string/,${s//replacement-string/;b};$q1' file
These answers are all too complicated. What is wrong with writing a bit of shell script that uses grep to figure out whether the thing you want to replace is there, and then using sed to replace it?
# Quote the variables so patterns and file names with spaces survive.
if grep -q -- "$TARGET_STRING" "$file"
then
  echo "$file contains the old site"
  sed -e "s|${TARGET_STRING}|${NEW_STRING}|g" ....
fi
For 1 line of input. To avoid repeating the /pattern/:
When s succeeds to substitute, use t to jump conditionally to a label, e.g. x. Otherwise use q to quit with an exit code, e.g. 100:
's/pattern/replacement/;tx;q100;:x'
Example:
$ echo 1 > one
$ < one sed 's/1/replaced-it/;tx;q1;:x'
replaced-it
$ echo $?
0
$ < one sed 's/999/replaced-it/;tx;q100;:x'
1
$ echo $?
100
https://www.gnu.org/software/sed/manual/html_node/Branching-and-flow-control.html
We have the answer above, but it took me some time to work out what is happening. I am trying to provide a simple explanation for basic sed users like me.
Let's consider the example:
echo "foo" | sed '/foo/!{q100}; {s/f/b/}'
Here we have two sed commands. The first one is '/foo/!{q100}'. This command checks the pattern and exits with code 100 if there is no match. Consider the following examples; -n is used to silence the output so we only see the exit code.
In this example foo matches, so the exit code is 0:
echo "foo" | sed -n '/foo/!{q100}'; echo $?
0
In this example the input is foo and we try to match boo, so there is no match and exit code 100 is returned:
echo "foo" | sed -n '/boo/!{q100}'; echo $?
100
So if my requirement is only to check a pattern match or not I can use
echo "<input string>" | sed -n '/<pattern to match>/!{q<exit-code>}'
More examples:
echo "20200206" | sed -n '/[0-9]*/!{q100}' && echo "Matched" || echo "No Match"
Matched
echo "20200206" | sed -n '/[0-9]{2}/!{q100}' && echo "Matched" || echo "No Match"
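No Match
(The second case reports No Match because, without -E, sed treats {2} as literal characters; write \{2\} or use sed -E to get a repetition count.)
The second command, '{s/f/b/}', replaces the f in foo with b; this is the ordinary substitution I had used many times.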
Below is the pattern we use with sed -rn or sed -r.
The entire search and replace command ("s/.../.../...") is optional. If the search and replace is used, for speed and having already matched $matchRe, we use as fast a $searchRe value as possible, using . where the character does not need to be re-verified and .{$len} for fixed length sections of the pattern.
The return value for none found is $notFoundExit.
/$matchRe/{s/$searchRe/$replacement/$options; Q}; q$notFoundExit
For the following reasons:
No time wasted testing for both matched and unmatched case
No time wasted copying to or from buffers
No superfluous branches
Reasonable flexibility
Varying the case of the Q commands will vary the behavior depending on when the exit should occur. Behaviors involving the application of Boolean logic to multi-line input require more complexity in the solution.
For any number of input lines:
sed --quiet 's/hello/HELLO/;t1;b2;:1;h;:2;p;${g;s/..*//;tok;q1;:ok}'
Fills hold space on match, and checks it after the last line.
Returns status 1 if no match in file.
s/hello/HELLO - substitution to check for
t1 - jump to label 1 if substitution succeeded
b2 - jump to label 2 unconditionally
:1 - label 1
h - copy pattern to hold space (when substitution succeeded)
:2 - label 2
p - print pattern space, unconditionally
${ ... } - match last line, evaluate block inside
g - copy hold space into pattern space (non-empty if the first substitution succeeded before)
s/..*// - dummy substitution, to set branch-flag
tok - jump to label ok (if dummy substitution succeeded on non-empty hold space)
q1 - exit with error status 1
:ok - label ok
As we already know, when sed fails to match then it simply returns its input string - no error has occurred. It is true that a difference between the input and output strings implies a match, but a match does not imply a difference in the strings; after all sed could have simply matched all of the input characters.
The flaw shows up in the following example:
h=$(echo "$g" | sed 's/.*\(abc[[:digit:]]\).*/\1/g')
if [ ! "$h" = "$g" ]; then
echo "1"
else
echo "2"
fi
where g=Xabc1 gives 1, while setting g=abc1 gives 2; yet both of these input strings are matched by sed! So, it can be hard to determine whether sed has matched or not. A solution:
h=$(echo "fix${g}ed" | sed 's/.*\(abc[[:digit:]]\).*/\1/g')
if [ ! "$h" = "fix${g}ed" ]; then
echo "1"
else
echo "2"
fi
in which case the 1 is printed if-and-only-if sed has matched.
I had wanted to truncate a file by quitting when the match was found (and exclude the matching line). This is handy when a process that adds lines at the end of the file may be re-run. "Q;Q1" didn't work but simply "Q1" did, as follows:
if sed -i '/text I wanted to find/Q1' file.txt
then
insert blank line at end of file + new lines
fi
insert just the new lines without the blank line

elif conditional statement not working

I have this file as:
The number is %d0The number is %d1The number is %d2The number is %d3The number is %d4The number is %d5The number is %d6The...
The number is %d67The number is %d68The number is %d69The number is %d70The number is %d71The number is %d72The....
The number is %d117The number is %d118The number is %d119The number is %d120The number is %d121The number is %d122
I want to pad it like:
The number is %d0 The number is %d1 The number is %d2 The number is %d3 The number is %d4 The number is %d5 The number is %d6
The number is %d63 The number is %d64 The number is %d65 The number is %d66 The number is %d67 The number is %d68 The number is %d69
d118The number is %d119The number is %d120The number is %d121The number is %d122The number is %d123The number is %d124The
Please tell me how to do it through shell script
I am working on Linux
Edit:
This single command pipeline should do what you want:
sed 's/\(d[0-9]\+\)/\1   /g;s/\(d[0-9 ]\{3\}\) */\1/g' test2.txt >test3.txt
#                     ^^^ three spaces here
Explanation:
For each sequence of digits following a "d", add three spaces after it. (I'll use "X" to represent spaces.)
d1 becomes d1XXX
d10 becomes d10XXX
d100 becomes d100XXX
Now (the part after the semicolon), capture every "d" and the next three characters, which must be digits or spaces, and output them but not any spaces beyond.
d1XXX becomes d1XX
d10XXX becomes d10X
d100XXX becomes d100
If you want to wrap the lines as you seem to show in your sample data, then do this instead:
sed 's/\(d[0-9]\+\)/\1   /g;s/\(d[0-9 ]\{3\}\) */\1/g' test2.txt | fold -w 133 >test3.txt
You may need to adjust the argument of the fold command to make it come out right.
There's no need for if, grep, loops, etc.
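A quick way to check the padding is to run the command on one short line before using it on the real file. This sketch assumes GNU sed (for \+), writes the first substitution with three spaces after \1 as the explanation describes, and uses a made-up one-line input in the same shape as the question's data.

```shell
padded=$(printf '%s\n' 'The number is %d0The number is %d10The number is %d100' |
  sed 's/\(d[0-9]\+\)/\1   /g;s/\(d[0-9 ]\{3\}\) */\1/g')
# Brackets make any trailing spaces visible.
printf '[%s]\n' "$padded"
# [The number is %d0  The number is %d10 The number is %d100]
```

Each "d" plus its number now occupies exactly four characters, so the following fields line up.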
Original answer:
First of all, you really need to say which shell you're using, but since you have elif and fi, I'm assuming it's Bourne-derived.
Based on that assumption, your script makes no sense.
The parentheses for the if and elif are unnecessary. In this context, they create a subshell which serves no purpose.
The sed commands in the if and elif say "if the pattern is found, copy hold space (it's empty, by the way) to pattern space and output it, and output all other lines."
The first sed command will always be true so the elif will never be executed. sed always returns true unless there's an error.
This may be what you intended:
if grep -Eqs 'd[0-9]([^0-9]|$)' test2.txt; then
sed 's/\(d[0-9]\)\([^0-9]\|$\)/\1 \2/g' test2.txt >test3.txt
elif grep -Eqs 'd[0-9][0-9]([^0-9]|$)' test2.txt; then
sed 's/\(d[0-9][0-9]\)\([^0-9]\|$\)/\1 \2/g' test2.txt >test3.txt
else
cat test2.txt >test3.txt
fi
But I wonder if all that could be replaced by something like this one-liner:
sed 's/\(d[0-9][0-9]\?\)\([^0-9]\|$\)/\1 \2/g' test2.txt >test3.txt
Since I don't know what test2.txt looks like, part of this is only guessing.
