What's wrong with my usage of grep? - linux

I'm executing the following command:
echo "ze2s hihi" | tr ' ' '\n' | grep 'h*'
but instead of getting hihi in the output I'm getting this:
ze2s
hihi
What's wrong?

What you want is:
echo "ze2s hihi" | tr ' ' '\n' | grep 'h.*'
With "h*" you are asking to match any number of h's in a sequence, including 0 h's, which ze2s matches.
Or maybe you just want to match anything which contains an h:
echo "ze2s hihi" | tr ' ' '\n' | grep 'h'

If you only want the lines that start with an h, anchor the pattern with ^ (shown here with egrep, though grep -E or plain grep works just as well):
echo "ze2s hihi" | tr ' ' '\n' | egrep '^h'

The asterisk matches the preceding item zero or more times. Thus h* matches h zero or more times, i.e. anything.
If you want to match h and any characters after it, use h.* expression, where the period matches any single character.
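For example, with the corrected pattern only the word that actually contains an h survives:
$ echo "ze2s hihi" | tr ' ' '\n' | grep 'h.*'
hihi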

You got the answer to your question but FYI you don't need multiple commands and pipes to do what you want:
$ echo "ze2s hihi" | awk -v RS='\\s+' '/h/'
hihi
The above uses GNU awk for multi-char RS and \s for space chars.

Related

how to replace a specific char occurrences in string after a given substring

I have a string containing key=value pairs separated by #.
I am trying to replace the '=' char occurrences with ':' in the value of TITLE using a BASH script.
"ID=21566#OS=Linux#TARGET_END=Synchronica#DEPENDENCY=Independent#AUTOMATION_OS=Linux#AUTOMATION_TOOL=JSystem#TITLE=Session tracking. "DL Started" Status Reported.Level=none"
Later on I am parsing this string to execute the eval operation:
eval $(echo $test_line | sed 's/"//g' | tr '#' '\n' | tr ' ' '_' | sed 's/=/="/g' | sed 's/$/"/g')
When the sed 's/=/="/g' section will also change ..Level=none to
Level="none
This leads to
eval: line 52: unexpected EOF while looking for matching `"'
What would be the right bash replace command for my string?
As an alternative, consider a pure-bash solution that brings the variables into bash, avoiding the (risky) eval.
IFS=# read -a kv <<<"ID=21566#OS=Linux#TARGET_END=Synchronica#..."
for kvp in "${kv[#]}" ; do
declare "$kvp"
done
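As a quick sketch (using just the first few fields of your string), the values then become ordinary shell variables:
$ IFS=# read -a kv <<<"ID=21566#OS=Linux#TARGET_END=Synchronica"
$ for kvp in "${kv[@]}"; do declare "$kvp"; done
$ echo "$ID $OS"
21566 Linux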
I found a way to solve it.
I will add sed 's/=/:/8g' to my eval command.
It replaces the 8th and all subsequent occurrences of '='.
This only affects the value of TITLE, as expected.
eval $(echo $test_line | sed 's/=/:/8g' | sed 's/"//g' | tr '#' '\n' | tr ' ' '_' | sed 's/=/="/g' | sed 's/$/"/g')
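For reference, the number+g flag combination is a GNU sed feature: s/=/:/8g replaces the 8th and every later occurrence on the line. A minimal made-up example:
$ echo 'a=1=2=3' | sed 's/=/:/2g'
a=1:2:3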
I did it like this :
echo '"ID=21566#OS:Linux#TARGET_END:Synchronica#DEPENDENCY:Independent#AUTOMATION_OS:Linux#AUTOMATION_TOOL:JSystem#TITLE:Session tracking. "DL Started" Status Reported.Level=none"' \
|
sed -E 's/(#)?([A-Z_]+)(=)/\1\2:/g'
Let me know if it works for you.

Fetch latest matching string value

I have a file which contains two values for initial... keyword. I want to grab the latest date for matching initial... string. After getting the date I also need to format the date by replacing / with -
---other data
INFO | abc 1 | 2018/01/04 20:04:35 | initial...
INFO | abc 1 | 2018/02/05 17:01:42 | INFO | new| InitialLauncher | c.t.s.s.setup.launch | initial...
---other data
In the above example, my output should be 2018-02-05. Here, I am fetching the line which contains initial... value and only getting the line with latest date value. Then, I need to strip out the remaining string and fetch only the date value.
I am using the following grep but it is not yet as per the requirement.
grep -q -iF "initial..." /tmp/file.log
Using the knowledge that later dates appear later in the file, it's only necessary to print the date from the last line containing initial....
First step (drop the -q from grep — you don't want it to be quiet):
grep -iF 'initial...' /tmp/file.log |
tail -n 1 |
sed -e 's/^[^|]*|[^|]*| *\([^ ]*\) .*/\1/' -e 's%/%-%g'
The (first) s/// command matches a series of non-pipes followed by a pipe, another series of non-pipes followed by a pipe, a blank, then captures a series of non-blanks, and finally matches a blank and anything; it replaces all that with just the captured string, which is the date field after the second pipe on the input line. The (second) s%%% command replaces slashes with dashes, using % to avoid the confusion that the equivalent s/\//-/g might engender, thereby reformatting the date in ISO 8601-style format.
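As a self-contained check, feeding the two sample lines in directly instead of reading /tmp/file.log, the pipeline should print:
$ printf '%s\n' \
    'INFO | abc 1 | 2018/01/04 20:04:35 | initial...' \
    'INFO | abc 1 | 2018/02/05 17:01:42 | INFO | new| InitialLauncher | c.t.s.s.setup.launch | initial...' |
  grep -iF 'initial...' | tail -n 1 |
  sed -e 's/^[^|]*|[^|]*| *\([^ ]*\) .*/\1/' -e 's%/%-%g'
2018-02-05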
But we can lose the tail with:
grep -iF 'initial...' /tmp/file.log |
sed -n -e '$ { s/^[^|]*|[^|]*| *\([^ ]*\) .*/\1/; s%/%-%gp; }'
The -n suppresses normal output; the $ matches only the last line; the p after the second s/// operation prints the result.
The case-insensitive fixed-pattern search is more conveniently written in grep than in sed. Although it could be done in a single sed command, you have to work fairly hard, saving matching rows in the hold space, then swapping the hold and pattern space at the end, and doing the substitution and printing:
sed -n \
-e '/[Ii][Nn][Ii][Tt][Ii][Aa][Ll]\.\.\./h' \
-e '$ { x; s/^[^|]*|[^|]*| *\([^ ]*\) .*/\1/; s%/%-%gp; }' /tmp/file.log
Each of these produces the output 2018-02-05 on the sample data. If fed an input with no initial... in it, they output nothing.
Grep for only (-o) the string you want, sort it, cut the first (date) word, and take the last line:
grep -o '2[0-9]\{3\}/[0-9][0-9]/[0-9][0-9] [0-2][0-9]:[0-5][0-9]:[0-9][0-9] .* | initial' file.txt | sort | cut -d' ' -f1 | tail -1
something like this...
$ awk -F'|' '$NF~/initial\.\.\./ {if(max<$3) max=$3}
END {gsub("/","-",max);
split(max,dt," "); print dt[1]}' file

Count number of patterns with a single command

I'd like to count the number of occurrences in a string. For example, in this string :
'apache2|ntpd'
there are 2 different strings separated by | character.
Another example :
'apache2|ntpd|authd|freeradius'
In this case there are 4 different strings separated by | character.
Would you know a shell or perl command that could simply count this for me?
You can use the awk command as below:
echo "apache2|ntpd" | awk -F'|' '{print NF}'
-F'|' sets the field separator to |;
NF means Number of Fields.
Example;
user@host:/tmp$ echo 'apache2|ntpd|authd|freeradius' | awk -F'|' '{print NF}'
4
You can also use this:
user#host:/tmp$ echo "apache2|ntpd" | tr '|' ' ' | wc -w
2
user@host:/tmp$ echo 'apache2|ntpd|authd|freeradius' | tr '|' ' ' | wc -w
4
tr '|' ' ' : translate | to space
wc -w : print the word counts
If there are spaces in the string, wc -w will not give the correct result, so use:
echo 'apac he2|ntpd' | tr '|' '\n' | wc -l
user@host:/tmp$ echo 'apac he2|ntpd' | tr '|' ' ' | wc -w
3 --> not correct
user@host:/tmp$ echo 'apac he2|ntpd' | tr '|' '\n' | wc -l
2
tr '|' '\n' : translate | to newline
wc -l : number of lines
You can do this just within bash, without calling external programs like awk, grep, or tr.
data='apache2|ntpd|authd|freeradius'
res=${data//[!|]/}
num_strings=$(( ${#res} + 1 ))
echo $num_strings
Let me explain.
res=${data//[!|]/} removes all characters that are not (that's the !) pipes (|).
${#res} gives the length of the resulting string.
num_strings=$(( ${#res} + 1 )) adds one to the number of pipes to get the number of fields.
It's that simple.
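For instance, with the string above, the intermediate and final values look like this:
$ data='apache2|ntpd|authd|freeradius'
$ res=${data//[!|]/}
$ echo "$res"
|||
$ echo $(( ${#res} + 1 ))
4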
Another pure bash technique using positional-parameters
$ userString="apache2|ntpd|authd|freeradius"
$ printf "%s\n" $(IFS=\|; set -- $userString; printf "%s\n" "$#")
4
Thanks to cdarke's suggestion in the comments, the count can be stored directly in a variable:
$ printf -v count "%d" $(IFS=\|; set -- $userString; printf "%s\n" "$#")
$ printf "%d\n" "$count"
4
With wc and parameter expansion:
$ data='apache2|ntpd|authd|freeradius'
$ wc -w <<< ${data//|/ }
4
Using parameter expansion, all pipes are replaced with spaces. The result string is passed to wc -w for word count.
As @gniourf_gniourf mentioned, this works with what at first look like process names, but will fail if the strings contain spaces.
You can do this with grep as well-
echo "apache2|ntpd|authd|freeradius" | grep -o "|" | wc -l
Output-
3
That output is the number of pipes.
To get the number of commands-
var=$(echo "apache2|ntpd|authd|freeradius" | grep -o "|" | wc -l)
echo $((var + 1))
Output -
4
You could use awk to count the occurrences of the delimiter, plus 1:
$ awk '{print gsub(/\|/,"")+1}' <(echo "apache2|ntpd|authd|freeradius")
4
Maybe this will help you:
IN="apache2|ntpd"
mails=$(echo $IN | tr "|" "\n")
for addr in $mails
do
echo "> [$addr]"
done
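With IN="apache2|ntpd" as above, this loop should print each entry on its own line:
> [apache2]
> [ntpd]
You could then count the entries by piping the loop's output through wc -l.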

return all lines that match String1 in a file after the last matching String2 in the same file

I figured out how to get the line number of the last matching word in the file :
cat -n textfile.txt | grep " b " | tail -1 | cut -f 1
It gave me the value 1787. So, I passed it manually to the sed command to search for the lines that contain the sentence "blades are down" after that line number, and it returned all the lines successfully:
sed -n '1787,$s/blades are down/&/p' myfile.txt
Is there a way that I can pass the line number from the first command to the second one through a variable or a file, so I can put them in the script to be executed automatically?
Thank you.
You can do this by just connecting your two commands with xargs. xargs -I % allows you to take the stdin from a previous command and place it wherever you want in the next command. The % is where your '1787' will be written:
cat -n textfile.txt | grep " b " | tail -1 | cut -f 1 | xargs -I % sed -n %',$s/blades are down/&/p' myfile.txt
You can use:
command substitution to capture the result of the first command in a variable.
simple string concatenation to use the variable in your sed command
startLine=$(grep -n ' b ' textfile.txt | tail -1 | cut -d ':' -f1)
sed -n ${startLine}',$s/blades are down/&/p' myfile.txt
You don't strictly need the intermediate variable - you could simply use:
sed $(grep -n ' b ' textfile.txt | tail -1 | cut -d ':' -f1)',$s/blades are down/&/p' myfile.txt
but it may make sense to do error checking on the result of the command substitution first.
Note that I've streamlined the first command by using grep's -n option, which puts the line number separated with : before each match.
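For example (on made-up input), -n prefixes each matching line with its line number and a colon, which is why cut -d ':' -f1 recovers the number:
$ printf 'x\na b c\n' | grep -n ' b '
2:a b c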
First we can get the "half" of the file after the last match of string2, then use grep to match all occurrences of string1:
tac your_file | awk '{ if (match($0, "string2")) { exit ; } else {print;} }' | \
grep "string1"
Note that the order of the output is reversed, which is fine if you don't care about the order; if you do care, just add another tac at the end with a pipe |.
This might work for you (GNU sed):
sed -n '/\n/ba;/ b /h;//!H;$!d;x;//!d;s/$/\n/;:a;/\`.*blades are down.*$/MP;D' file
This reads through the file storing all lines following the last match of the first string (" b ") in the hold space.
At the end of file, it swaps to the hold space, checks that it does indeed have at least one match, then prints out those lines that match the second string ("blades are down").
N.B. it makes the end case (/\n/) possible by adding a new line to the end of the hold space, which will eventually be thrown away. This also caters for the last line edge condition.

How to extract numbers from a string?

I have a string that contains a path:
string="toto.titi.12.tata.2.abc.def"
I want to extract only the numbers from this string.
To extract the first number:
tmp="${string#toto.titi.*.}"
num1="${tmp%.tata*}"
To extract the second number:
tmp="${string#toto.titi.*.tata.*.}"
num2="${tmp%.abc.def}"
So to extract a parameter I have to do it in 2 steps. How can I extract a number in one step?
You can use tr to delete all of the non-digit characters, like so:
echo toto.titi.12.tata.2.abc.def | tr -d -c 0-9
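For the sample string this should print 122, i.e. all the digits run together (the trailing newline is deleted too, since it is not in 0-9):
$ echo toto.titi.12.tata.2.abc.def | tr -d -c 0-9
122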
To extract all the individual numbers and print one number per line, pipe through:
tr '\n' ' ' | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | sed 's/ /\n/g'
Breakdown:
Replaces all line breaks with spaces: tr '\n' ' '
Replaces all non numbers with spaces: sed -e 's/[^0-9]/ /g'
Remove leading white space: -e 's/^ *//g'
Remove trailing white space: -e 's/ *$//g'
Squeeze spaces in sequence to 1 space: tr -s ' '
Replace remaining space separators with line break: sed 's/ /\n/g'
Example:
echo -e " this 20 is 2sen\nten324ce 2 sort of" | tr '\n' ' ' | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | sed 's/ /\n/g'
Will print out
20
2
324
2
Here is a short one:
string="toto.titi.12.tata.2.abc.def"
id=$(echo "$string" | grep -o -E '[0-9]+')
echo $id   # => output: 12 2
with a space between the numbers.
Hope it helps...
Parameter expansion would seem to be the order of the day.
$ string="toto.titi.12.tata.2.abc.def"
$ read num1 num2 <<<${string//[^0-9]/ }
$ echo "$num1 / $num2"
12 / 2
This of course depends on the format of $string. But at least for the example you've provided, it seems to work.
This may be superior to anubhava's awk solution which requires a subshell. I also like chepner's solution, but regular expressions are "heavier" than parameter expansion (though obviously way more precise). (Note that in the expression above, [^0-9] may look like a regex atom, but it is not.)
You can read about this form of Parameter Expansion in the bash man page. Note that ${string//this/that} (as well as the <<<) is a bashism, and is not compatible with traditional Bourne or POSIX shells.
This would be easier to answer if you provided exactly the output you're looking to get. If you mean you want to get just the digits out of the string, and remove everything else, you can do this:
d@AirBox:~$ string="toto.titi.12.tata.2.abc.def"
d@AirBox:~$ echo "${string//[a-z,.]/}"
122
If you clarify a bit I may be able to help more.
You can also use sed:
echo "toto.titi.12.tata.2.abc.def" | sed 's/[0-9]*//g'
Here, sed replaces
any digits (class [0-9])
repeated any number of times (*)
with nothing (nothing between the second and third /),
and g stands for globally.
Output will be:
toto.titi..tata..abc.def
Convert your string to an array like this:
$ str="toto.titi.12.tata.2.abc.def"
$ arr=( ${str//[!0-9]/ } )
$ echo "${arr[#]}"
12 2
Use regular expression matching:
string="toto.titi.12.tata.2.abc.def"
[[ $string =~ toto\.titi\.([0-9]+)\.tata\.([0-9]+)\. ]]
# BASH_REMATCH[0] would be "toto.titi.12.tata.2.", the entire match
# Successive elements of the array correspond to the parenthesized
# subexpressions, in left-to-right order. (If there are nested parentheses,
# they are numbered in depth-first order.)
first_number=${BASH_REMATCH[1]}
second_number=${BASH_REMATCH[2]}
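If the match succeeds, the two captures hold the numbers:
$ echo "$first_number / $second_number"
12 / 2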
Using awk:
arr=( $(echo $string | awk -F "." '{print $3, $5}') )
num1=${arr[0]}
num2=${arr[1]}
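For the example string, $3 is 12 and $5 is 2, so:
$ echo "$num1 $num2"
12 2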
Adding yet another way to do this, using cut:
echo $string | cut -d'.' -f3,5 | tr '.' ' '
This gives you the following output:
12 2
Fixing the newline issue (for the macOS terminal):
cat temp.txt | tr '\n' ' ' | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | sed $'s/ /\\\n/g'
Assumptions:
there is no embedded white space
the string of text always has 7 period-delimited strings
the string always contains numbers in the 3rd and 5th period-delimited positions
One bash idea that does not require spawning any subprocesses:
$ string="toto.titi.12.tata.2.abc.def"
$ IFS=. read -r x1 x2 num1 x3 num2 rest <<< "${string}"
$ typeset -p num1 num2
declare -- num1="12"
declare -- num2="2"
In a comment OP has stated they wish to extract only one number at a time; the same approach can still be used, eg:
$ string="toto.titi.12.tata.2.abc.def"
$ IFS=. read -r x1 x2 num1 rest <<< "${string}"
$ typeset -p num1
declare -- num1="12"
$ IFS=. read -r x1 x2 x3 x4 num2 rest <<< "${string}"
$ typeset -p num2
declare -- num2="2"
A variation on anubhava's answer that uses parameter expansion instead of a subprocess call to awk, and still working with the same set of initial assumptions:
$ arr=( ${string//./ } )
$ num1=${arr[2]}
$ num2=${arr[4]}
$ typeset -p num1 num2
declare -- num1="12"
declare -- num2="2"
