Extract Digits From String After Capturing It From File

I'm trying to retrieve a memory value from a file and compare it to a reference value. But one thing at a time...
I've tried set, source, grep, and substring expansion to get the value into a variable, but none of them worked. Then I found a way to do it using a for loop (see code).
The issue: I'm receiving the entire string from the file, but I can't manage to get rid of the last character in it.
#!/bin/bash
#source start_params.properties
#mem_val= "$default.default.minmaxmemory.main"
#mem_val= grep "default.default.minmaxmemory.main" start_params.properties
for mLine in $(grep 'default.default.minmaxmemory.main' start_params.properties)
do
echo "$mLine"
done
echo "${mLine:4:5}" # didn't get rid of the last `m` in `-max4095m`
v1="max"
v2="m"
echo "$mLine" | sed -e "s/.*${v1}//;s/${v2}.*//" #this echo the right value.
The loop iterates twice:
First output: default.default.minmaxmemory.main=-min512m
Second output: -max4096m
The sed command then outputs 4096, but how can I change the last line of the code so that it stores the value in a variable?
Thank you for your suggestions,

You could use grep to filter the max part and then a second grep -o to extract the digits:
echo "$mLine" | grep "$v1" | grep -o '[[:digit:]]*'

$ sed '/max[0-9]/!d; s/.*max//; s/m//' start_params.properties
4096
remove lines not matching max[0-9]
remove first part of line until max
remove final m
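To store the result in a variable, as the question asks, command substitution should do. This is only a sketch: the temporary file stands in for start_params.properties, and the reference value of 2048 is made up for illustration.

```shell
#!/bin/bash
# Hypothetical stand-in for start_params.properties
props=$(mktemp)
printf '%s\n' 'default.default.minmaxmemory.main=-min512m -max4096m' > "$props"

# $(...) captures the output of sed in a variable
mem_val=$(sed '/max[0-9]/!d; s/.*max//; s/m//' "$props")
echo "$mem_val"

# Compare against a reference value (2048 is an assumed example)
reference=2048
if [ "$mem_val" -ge "$reference" ]; then
    verdict="ok"
else
    verdict="too low"
fi
echo "$verdict"

rm -f "$props"
```

Quoting "$mem_val" in the test guards against an empty result when no line matches.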

Related

Remove path prefix of space separated paths

Given a list of paths separated by a single space:
/home/me/src/test /home/me/src/vendor/a /home/me/src/vendor/b
I want to remove the prefix /home/me/src/ so that the result is:
test vendor/a vendor/b
For a single path I would do: ${PATH#/home/me/src/} but how do I apply it to this series?

You can use // to replace all occurrences of substring. Replace it with null string to remove them.
$ path="/home/me/src/test /home/me/src/vendor/a /home/me/src/vendor/b"
$ echo ${path//\/home\/me\/src\/}
test vendor/a vendor/b
Reference: ${parameter/pattern/string} in Bash reference manual
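Note that // removes the substring wherever it occurs, not only at the start of each path. If that matters, one possible variant is to split the string into an array and apply the # expansion to every element. This is a sketch; the variable names are illustrative and it assumes the paths contain no spaces or glob characters.

```shell
#!/bin/bash
paths="/home/me/src/test /home/me/src/vendor/a /home/me/src/vendor/b"
prefix="/home/me/src/"

# Split on whitespace into an array
read -r -a items <<< "$paths"

# The # expansion is applied to each element; [*] joins them with a space
result="${items[*]#"$prefix"}"
echo "$result"
```

This prints test vendor/a vendor/b, the same result as above, but a /home/me/src/ in the middle of a path would be left alone.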

Shell parameter expansion is useful here, as nu11p01n73R's answer shows.
For clarity, I would use sed with the syntax sed 's#pattern#replacement#g':
$ str="/home/me/src/test /home/me/src/vendor/a /home/me/src/vendor/b"
$ sed 's#/home/me/src/##g' <<< "$str"
test vendor/a vendor/b

As always, a grep solution from my side:
echo 'your string' | grep -Po '^/([^ /]*/)+\K.+'
Note that the above regex works for any string like /x/y/z/test ... If you only want to strip /home/me/src/, try the following:
echo 'your string' | grep -Po '^/home/me/src/\K.+' --color
Both patterns are anchored with ^, so they strip the prefix only from the first path on the line.

bash string manipulation - Display the value, not the variable name

I'm writing a script to process inbound data files. The inbound file names all follow the same pattern:
word1_word2_word3_YYYYMMDD.txt
My script takes the name of the inbound file, strips the file extension, strips out the date, replaces all underscores with spaces and appends the resulting string to each line in the original file. I can successfully create the desired string and have assigned it to a variable "STR".
The last step is to append the value of $STR to each line in the file so that the data lines within the file end up looking like this:
casenumber1"|"word1 word2 word3
casenumber2"|"word1 word2 word3
casenumber3"|"word1 word2 word3
My problem is that for the life of me I cannot get bash to display the variable value, it always displays the variable name.
This is the line I use to create the string needed from the file name:
STR=`echo $DATAFILENAME | cut -d '.' -f 1 | sed 's/[0-9]*//g'|sed 's/_/ /g' | sed 's/[[:blank:]]*$//'`
I'm trying to use a typical sed replace command:
sed 's/$/`echo "$STR"`/g' inputfile > outputfile
But keep getting the variable name instead of the variable value:
example output:
1000056|$"STR"
1000057|$"STR"
...
desired output:
1000056|Closed With Notification
1000057|Closed With Notification
What am I doing wrong? Thanks, Vic

The gist of your question is that you need to append a string to every line of a file using sed, and the value of that string is held in a variable, STR.
For that you need double quotes around the sed command:
sed "s/$/| $STR/g" "$DATAFILE" > datfile99
The problem is that the single quotes around your command prevent the shell from expanding the variable $STR.
If you wrap the command in double quotes, the shell replaces $STR with its current value before passing the whole string to sed.

Try replacing your ' with ". This tells the shell to substitute any shell variables:
sed -i "s/$/$STR/" inputfile
Note that the -i option changes the file in place, so it is wise to back it up first.
EDIT: instead of using this
STR=`echo $DATAFILENAME | cut -d '.' -f 1 | sed 's/[0-9]*//g'|sed 's/_/ /g' | sed 's/[[:blank:]]*$//'`
you can try this (no -i here, since the input comes from a here-string rather than a file)
STR=$(sed -r "s/(.*)[.][a-zA-Z]+$/\\1/;s/_[0-9]+$//;s/[._]/ /g" <<< "$DATAFILENAME")
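Putting the pieces together, here is a rough end-to-end sketch. The file name and the case numbers are made-up sample data, and it builds STR with parameter expansion instead of the cut/sed pipeline, which is a swapped-in technique rather than what the question used.

```shell
#!/bin/bash
# Hypothetical inbound file name following word1_word2_word3_YYYYMMDD.txt
DATAFILENAME="Closed_With_Notification_20200206.txt"

base="${DATAFILENAME%.*}"   # strip the extension
base="${base%_*}"           # strip the trailing _YYYYMMDD
STR="${base//_/ }"          # underscores to spaces

# Sample data file (assumed content: one case number per line)
input=$(mktemp)
printf '%s\n' 1000056 1000057 > "$input"

# Double quotes let the shell expand $STR; \$ keeps the end-of-line anchor literal
output=$(sed "s/\$/|$STR/" "$input")
echo "$output"

rm -f "$input"
```

If STR could ever contain / or &, it would need escaping before being spliced into the sed expression.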

Return value of sed for no match

I'm using sed for updating my JSON configuration file in the runtime.
Sometimes, when the pattern doesn't match in the JSON file, sed still exits with return code 0.
Returning 0 means successful completion, but why does sed return 0 if it doesn't find the proper pattern and update the file? Is there a workaround for that?

As @cnicutar commented, the return code of a command indicates whether the command executed successfully. It has nothing to do with the logic you implemented in your scripts.
So if you have:
echo "foo"|sed '/bar/ s/a/b/'
sed will return 0. But if the script has a syntax or expression error, or the input file doesn't exist, sed cannot execute your request and returns a non-zero status (GNU sed uses 1 for an invalid command or syntax and 2 for a missing input file).
workaround
This is not really a workaround. GNU sed's q command accepts an exit code (from the man page):
q [exit-code]
here you can define exit-code as you want. For example '/foo/!{q100}; {s/f/b/}' will exit with code 100 if foo isn't present, and otherwise perform the substitution f->b and exit with code 0.
Matched case:
kent$ echo "foo" | sed '/foo/!{q100}; {s/f/b/}'
boo
kent$ echo $?
0
Unmatched case:
kent$ echo "trash" | sed '/foo/!{q100}; {s/f/b/}'
trash
kent$ echo $?
100
I hope this answers your question.
edit
I must add that the above example only handles one-line input. I don't know your exact requirement: should a non-zero exit code mean that a single line did not match, or that nothing in the whole file matched? For the whole-file case, you may consider awk, or even run grep before your text processing...

This might work for you (GNU sed):
sed '/search-string/{s//replacement-string/;h};${x;/./{x;q0};x;q1}' file
If the search-string is found it will be replaced with replacement-string and at end-of-file sed will exit with 0 return code. If no substitution takes place the return code will be 1.
A more detailed explanation:
In sed the user has two registers at his disposal: the pattern space (PS) in which the current line is loaded into (minus the linefeed) and a spare register called the hold space (HS) which is initially empty.
The general idea is to use the HS as a flag to indicate if a substitution has taken place. If the HS is still empty at the end of the file, then no changes have been made, otherwise changes have occurred.
The command /search-string/ matches search-string with whatever is in the PS and if it is found to contain the search-string the commands between the following curly braces are executed.
Firstly the substitution s//replacement-string/ (sed uses the last regexp i.e. the search-string, if the lefthand-side is empty, so s//replacement-string is the same as s/search-string/replacement-string/) and following this the h command makes a copy of the PS and puts it in the HS.
The sed command $ is used to recognise the last line of a file and the following then occurs.
First the x command swaps the two registers, so the HS becomes the PS and the PS becomes the HS.
Then the PS is searched for any character /./ (. means match any character) remember the HS (now the PS) was initially empty until a substitution took place. If the condition is true the x is again executed followed by q0 command which ends all sed processing and sets the return code to 0. Otherwise the x command is executed and the return code is set to 1.
N.B. although the q quits sed processing it does not prevent the PS from being reassembled by sed and printed as per normal.
Another alternative:
sed '/search-string/!ba;s//replacement-string/;h;:a;$!b;p;x;/./Q;Q1' file
or:
sed '/search-string/,${s//replacement-string/;b};$q1' file
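A quick way to see the hold-space flag from the first command in action is to wrap it in a function and capture the exit status. This is a sketch assuming GNU sed; the function name, the sample file and the search strings are illustrative, and the arguments are spliced into the sed script unescaped.

```shell
#!/bin/bash
# Requires GNU sed (q with an exit code is a GNU extension).
# $1 = search regex, $2 = replacement, $3 = file
replace_or_fail() {
    sed "/$1/{s//$2/;h};\${x;/./{x;q0};x;q1}" "$3"
}

file=$(mktemp)
printf '%s\n' alpha beta gamma > "$file"

# A substitution happens, so the status stays 0
status=0
out=$(replace_or_fail beta BETA "$file") || status=$?
echo "status=$status"

# Nothing matches, so sed exits with 1 after printing the input unchanged
miss_status=0
miss_out=$(replace_or_fail delta DELTA "$file") || miss_status=$?
echo "miss_status=$miss_status"

rm -f "$file"
```

The || status=$? form keeps the script usable under set -e.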

These answers are all too complicated. What is wrong with writing a bit of shell script that uses grep to figure out whether the thing you want to replace is there, and then using sed to replace it?
if grep -q -- "$TARGET_STRING" "$file"
then
echo "$file contains the old site"
sed -e "s|${TARGET_STRING}|${NEW_STRING}|g" ....
fi

For one line of input, to avoid repeating the /pattern/:
When s succeeds in substituting, use t to jump conditionally to a label, e.g. x. Otherwise use q to quit with an exit code, e.g. 100:
's/pattern/replacement/;tx;q100;:x'
Example:
$ echo 1 > one
$ < one sed 's/1/replaced-it/;tx;q1;:x'
replaced-it
$ echo $?
0
$ < one sed 's/999/replaced-it/;tx;q100;:x'
1
$ echo $?
100
https://www.gnu.org/software/sed/manual/html_node/Branching-and-flow-control.html
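In a script, the exit code from the t/q pattern above can be captured next to the output. A minimal sketch, assuming GNU sed and single-line input; the variable names are illustrative.

```shell
#!/bin/bash
# Requires GNU sed (q with an exit code is a GNU extension).

# The substitution succeeds: t jumps over q100 to label x
hit_status=0
hit_out=$(printf '%s\n' "foo" | sed 's/f/b/;tx;q100;:x') || hit_status=$?
echo "$hit_out ($hit_status)"

# The substitution fails: t does not branch, so q100 runs
miss_status=0
miss_out=$(printf '%s\n' "trash" | sed 's/f/b/;tx;q100;:x') || miss_status=$?
echo "$miss_out ($miss_status)"
```

On multi-line input this quits at the first line where the substitution fails, so it only suits one-line checks.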

We have the answer above, but it took me some time to work out what is happening. Here is a simple explanation for basic sed users like me.
Let's consider the example:
echo "foo" | sed '/foo/!{q100}; {s/f/b/}'
Here we have two sed commands. The first one is '/foo/!{q100}'. This command checks for a pattern match and returns exit code 100 if there is no match. Consider the following examples, where -n silences the output so we only get the exit code.
In this example foo matches, so the exit code returned is 0:
echo "foo" | sed -n '/foo/!{q100}'; echo $?
0
In this example the input is foo and we try to match boo, so there is no match and exit code 100 is returned:
echo "foo" | sed -n '/boo/!{q100}'; echo $?
100
So if my requirement is only to check whether a pattern matches, I can use
echo "<input string>" | sed -n '/<pattern to match>/!{q<exit-code>}'
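More examples:
echo "20200206" | sed -n '/[0-9]*/!{q100}' && echo "Matched" || echo "No Match"
Matched
echo "20200206" | sed -n '/[0-9]{2}/!{q100}' && echo "Matched" || echo "No Match"
No Match
The second example reports No Match because, without -E (or -r), {2} is taken literally in a basic regular expression; the interval would have to be written \{2\}. Note also that [0-9]* matches any line, since it accepts zero digits.
The second command, '{s/f/b/}', replaces the f in foo with b, which is the ordinary substitution I have used many times.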

Below is the pattern we use with sed -rn or sed -r.
The entire search and replace command ("s/.../.../...") is optional. If the search and replace is used, for speed and having already matched $matchRe, we use as fast a $searchRe value as possible, using . where the character does not need to be re-verified and .{$len} for fixed length sections of the pattern.
The return value for none found is $notFoundExit.
/$matchRe/{s/$searchRe/$replacement/$options; Q}; q$notFoundExit
For the following reasons:
No time wasted testing for both matched and unmatched case
No time wasted copying to or from buffers
No superfluous branches
Reasonable flexibility
Varying the case of the Q commands changes the behavior depending on when the exit should occur. Behaviors that apply Boolean logic to multi-line input require more complexity in the solution.

For any number of input lines:
sed --quiet 's/hello/HELLO/;t1;b2;:1;h;:2;p;${g;s/..*//;tok;q1;:ok}'
Fills hold space on match, and checks it after the last line.
Returns status 1 if no match in file.
s/hello/HELLO/ - substitution to check for
t1 - jump to label 1 if substitution succeeded
b2 - jump to label 2 unconditionally
:1 - label 1
h - copy pattern to hold space (when substitution succeeded)
:2 - label 2
p - print pattern space, unconditionally
${ ... } - match last line, evaluate block inside
g - copy hold space into pattern space (non-empty if the first substitution succeeded before)
s/..*// - dummy substitution, to set branch-flag
tok - jump to label ok (if dummy substitution succeeded on non-empty hold space)
q1 - exit with error status 1
:ok - label ok

As we already know, when sed fails to match then it simply returns its input string - no error has occurred. It is true that a difference between the input and output strings implies a match, but a match does not imply a difference in the strings; after all sed could have simply matched all of the input characters.
This ambiguity shows up in the following example:
h=$(echo "$g" | sed 's/.*\(abc[[:digit:]]\).*/\1/g')
if [ ! "$h" = "$g" ]; then
echo "1"
else
echo "2"
fi
where g=Xabc1 gives 1, while setting g=abc1 gives 2; yet both of these input strings are matched by sed! So, it can be hard to determine whether sed has matched or not. A solution:
h=$(echo "fix${g}ed" | sed 's/.*\(abc[[:digit:]]\).*/\1/g')
if [ ! "$h" = "fix${g}ed" ]; then
echo "1"
else
echo "2"
fi
in which case the 1 is printed if-and-only-if sed has matched.

I had wanted to truncate a file by quitting when the match was found (and exclude the matching line). This is handy when a process that adds lines at the end of the file may be re-run. "Q;Q1" didn't work but simply "Q1" did, as follows:
if sed -i '/text I wanted to find/Q1' file.txt
then
insert blank line at end of file + new lines
fi
insert just the new lines without the blank line

Extracting part of a string to a variable in bash

noob here, sorry if a repost. I am extracting a string from a file, and end up with a line, something like:
abcdefg:12345:67890:abcde:12345:abcde
Let's say it's in a variable named testString
The length of the values between the colons is not constant, but I want to save the number between the 2nd and 3rd colons to a variable (as a string is fine). So in this case I'd end up with my new variable, let's call it extractedNum, being 67890. I assume I have to use sed, but I have never used it and am trying to get my head around it...
Can anyone help? Cheers
On a side-note, I am using find to extract the entire line from a string, by searching for the 1st string of characters, in this case the abcdefg part.

Pure Bash using an array:
testString="abcdefg:12345:67890:abcde:12345:abcde"
IFS=':' read -r -a array <<< "$testString" # IFS is changed for this read only
echo "value = ${array[2]}"
The output:
value = 67890

Here's another pure bash way. Works fine when your input is reasonably consistent and you don't need much flexibility in which section you pick out.
extractedNum="${testString#*:}" # Remove through first :
extractedNum="${extractedNum#*:}" # Remove through second :
extractedNum="${extractedNum%%:*}" # Remove from next : to end of string
You could also filter the file while reading it, in a while loop for example:
while IFS=' ' read -r col line ; do
# col has the column you wanted, line has the whole line
# # #
done < <(sed -e 's/\([^:]*:\)\{2\}\([^:]*\).*/\2 &/' "yourfile")
The sed command is picking out the 2nd column and delimiting that value from the entire line with a space. If you don't need the entire line, just remove the space+& from the replacement and drop the line variable from the read. You can pick any column by changing the number in the \{2\} bit. (Put the command in double quotes if you want to use a variable there.)

You can use cut for this kind of stuff. Here you go:
VAR=$(echo abcdefg:12345:67890:abcde:12345:abcde |cut -d":" -f3); echo $VAR
For the fun of it, this is how I would (not) do this with sed, but I'm sure there are easier ways. I guess that'd be a question of my own to future readers ;)
echo abcdefg:12345:67890:abcde:12345:abcde |sed -e "s/[^:]*:[^:]*:\([^:]*\):.*/\1/"

this should work for you: the key part is awk -F: '$0=$3'
NewVar=$(getTheLineSomehow...|awk -F: '$0=$3')
example:
kent$ newVar=$(echo "abcdefg:12345:67890:abcde:12345:abcde"|awk -F: '$0=$3')
kent$ echo $newVar
67890
if your text was stored in var testString, you could:
kent$ echo $testString
abcdefg:12345:67890:abcde:12345:abcde
kent$ newVar=$(awk -F: '$0=$3' <<<"$testString")
kent$ echo $newVar
67890
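One caveat with the '$0=$3' idiom: awk prints the line only when the assigned value is true, so a third field that is empty or 0 produces no output. A sketch comparing it with an explicit print; the second sample string is made up to show the edge case.

```shell
#!/bin/bash
testString="abcdefg:12345:67890:abcde:12345:abcde"

# An explicit print does not depend on the truthiness of the field
extractedNum=$(awk -F: '{print $3}' <<< "$testString")
echo "$extractedNum"

# Edge case: the third field is 0 (hypothetical input)
zeroField="a:b:0:c"
viaAssign=$(awk -F: '$0=$3' <<< "$zeroField")      # prints nothing
viaPrint=$(awk -F: '{print $3}' <<< "$zeroField")  # prints 0
echo "assign='$viaAssign' print='$viaPrint'"
```

For purely numeric fields that can never be 0, both forms behave the same.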

How do I count the number of occurrences of a string in an entire file?

Is there an inbuilt command to do this or has anyone had any luck with a script that does it?
I am looking to count the number of times a certain string (not word) appears in a file. This can include multiple occurrences per line so the count should count every occurrence not just count 1 for lines that have the string 2 or more times.
For example, with this sample file:
blah(*)wasp( *)jkdjs(*)kdfks(l*)ffks(dl
flksj(*)gjkd(*
)jfhk(*)fj (*) ks)(*gfjk(*)
If I am looking to count the occurrences of the string (*) I would expect the count to be 6, i.e. 2 from the first line, 1 from the second line and 3 from the third line. Note how the one across lines 2-3 does not count because there is a LF character separating them.
Update: great responses so far! Can I ask that the script handle the conversion of (*) to \(*\), etc? That way I could just pass any desired string as an input parameter without worrying about what conversion needs to be done to it so it appears in the correct format.

You can use basic tools such as grep and wc:
grep -o '(\*)' input.txt | wc -l

Using perl's "Eskimo kiss" operator with the -n switch to print a total at the end. Use \Q...\E to ignore any meta characters.
perl -lnwe '$a+=()=/\Q(*)/g; }{ print $a;' file.txt
Script:
use strict;
use warnings;
my $count;
my $text = shift;
while (<>) {
$count += () = /\Q$text/g;
}
print "$count\n";
Usage:
perl script.pl "(*)" file.txt

This loops over the lines of the file, and on each line finds all occurrences of the string "(*)". Each time that string is found, $c is incremented. When there are no more lines to loop over, the value of $c is printed.
perl -ne'$c++ while /\(\*\)/g;END{print"$c\n"}' filename.txt
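Update: Regarding your comment asking that this be converted into a solution that accepts the search string as an argument, you might do it like this (\Q quotes any metacharacters, so the argument is matched literally rather than as a regex):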
perl -ne'BEGIN{$re=shift;}$c++ while /\Q$re/g;END{print"$c\n"}' 'regex' filename.txt
That ought to do the trick. If I felt inclined to skim through perlrun again I might see a more elegant solution, but this should work.
You could also eliminate the explicit inner while loop in favor of an implicit one by providing list context to the regexp:
perl -ne'BEGIN{$re=shift}$c+=()=/\Q$re/g;END{print"$c\n"}' 'regex' filename.txt

You can use the basic grep command, with the caveat that -c counts matching lines, not individual occurrences, so it undercounts when a line contains the string more than once.
Example: to count the lines containing the word "hello" in a file
grep -c "hello" filename
To count the lines matching a pattern
grep -c -P "Your Pattern" filename
Pattern example : hell.w, \d+ etc

I have used the command below to count a particular string in a file (like grep -c, it counts matching lines, so several occurrences on one line count once):
grep search_String fileName|wc -l

text="(\*)"
grep -o "$text" file | wc -l
You can make it into a script which accepts arguments like this:
script count:
#!/bin/bash
text="$1"
file="$2"
grep -o "$text" "$file" | wc -l
Usage:
./count "(\*)" file_path
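For the update in the question (passing any string without escaping it first), grep's -F option treats the pattern as a fixed string. A sketch along those lines; the function name is illustrative and the temporary file reproduces the sample from the question.

```shell
#!/bin/bash
# $1 = literal string to count, $2 = file
count_occurrences() {
    # -o prints each match on its own line, -F disables regex interpretation,
    # -- protects against strings that start with a dash
    grep -oF -- "$1" "$2" | wc -l | tr -d '[:space:]'
}

sample=$(mktemp)
printf '%s\n' \
    'blah(*)wasp( *)jkdjs(*)kdfks(l*)ffks(dl' \
    'flksj(*)gjkd(*' \
    ')jfhk(*)fj (*) ks)(*gfjk(*)' > "$sample"

count=$(count_occurrences '(*)' "$sample")
echo "$count"

rm -f "$sample"
```

The tr call only strips the padding that some wc implementations add around the number.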
