How to find a substring from some text in a file and store it in a bash variable?

I have a file named config.txt which has the following data:
ABC_PATH=xxx/xxx
IMAGE=docker.name.net:3000/apache:1.8.109.1
NAMESPACE=xxx
Now I am running a shell script in which I want to store 1.8.109.1 (this value may differ, the rest stays the same) in a variable, maybe using sed, awk or any other Linux tool.
How can I achieve that?

The following will work:
ver="$(grep 'apache:' config.txt | cut -d: -f3)"
grep 'apache:' finds the line that contains the text 'apache:'. grep reads the file directly, so piping it through cat is not needed.
-d sets the delimiter that cut splits on. In this case : is the delimiter.
-f selects a specific field of the split line (fields are numbered starting at 1).
Split on :, the IMAGE line becomes IMAGE=docker.name.net, 3000/apache and 1.8.109.1, so -f3 selects the version.
The version is now captured in the variable $ver.
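As an alternative sketch, the same value can be taken with shell parameter expansion instead of cut. This assumes the version is always the text after the last colon on the IMAGE line; the temporary directory and the image_line variable are illustrative names, not part of the question.

```shell
# Hypothetical sample file mirroring the config.txt from the question
tmpdir=$(mktemp -d)
cat > "$tmpdir/config.txt" <<'EOF'
ABC_PATH=xxx/xxx
IMAGE=docker.name.net:3000/apache:1.8.109.1
NAMESPACE=xxx
EOF

# Pick the IMAGE line, then strip everything up to and including the last colon
image_line=$(grep '^IMAGE=' "$tmpdir/config.txt")
ver=${image_line##*:}
echo "$ver"

rm -r "$tmpdir"
```

Anchoring the grep on ^IMAGE= avoids matching another line that happens to contain apache:.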

I think this should work:
cat config.txt | grep apache: | cut -d: -f3

Related

What do back brackets do in this bash script code?

I'm working on a bash scripting problem: ./namefreq.sh ANA should return a list of two names (on separate lines), ANA and RENEE, both of which have frequency 0.120.
The script below reads a file, table.csv, that has names with a frequency number next to them, e.g. Anna, 0.120
I'm still unsure what the `` does in this code, and I'm also struggling to understand how the code is able to print two names with identical frequencies. The way I read the code is:
grep matches the word (-w) typed by the user (./bashscript.sh Anna) against table.csv, and cut takes the 2nd field of the matching line, separated by the delimiter ",", which is the frequency. That frequency is stored in a. The second grep then finds every line with that frequency, and | cut -f1 -d"," prints the first field of those lines, which are the names with the same frequency.
Would this be correct?
Thanks :)
#!/bin/bash
a=`grep -w $1 table.csv | cut -f2 -d','`
grep -w $a table.csv | cut -f1 -d',' | sort -d

When a command is in backticks or $(), the output of the command is substituted back into the command line in its place. So if the file has Anna,0.120
a=`grep -w Anna table.csv | cut -f2 -d','`
will execute the grep and cut commands, which output 0.120, so it is equivalent to
a=0.120
The second command then looks for all the lines that match 0.120, extracts the first field of each with cut, and sorts them.
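A minimal sketch of the two steps, using $() in place of backticks and quoting the variables. The table contents, the tmpdir path and the names freq and names are made up for illustration; -F is added to the second grep so the dot in the frequency is matched literally.

```shell
# Hypothetical table.csv with two names sharing one frequency
tmpdir=$(mktemp -d)
printf '%s\n' 'RENEE,0.120' 'ANA,0.120' 'MARIA,0.300' > "$tmpdir/table.csv"

name=ANA   # stands in for "$1" in the script

# Step 1: command substitution stores the frequency of the requested name
freq=$(grep -w "$name" "$tmpdir/table.csv" | cut -d, -f2)

# Step 2: find every line with that frequency and keep only the names
names=$(grep -wF "$freq" "$tmpdir/table.csv" | cut -d, -f1 | sort)
printf '%s\n' "$names"

rm -r "$tmpdir"
```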

Recursively grep unique pattern in different files

Sorry title is not very clear.
So let's say I'm grepping recursively for urls like this:
grep -ERo '(http|https)://[^/"]+' /folder
and in the folder there are several files containing the same URL. My goal is to output this URL only once. I tried piping the grep output to uniq or sort -u, but that doesn't help because each line also contains a different file name.
example result:
/www/tmpl/button.tpl.php:http://www.w3.org
/www/tmpl/header.tpl.php:http://www.w3.org
/www/tmpl/main.tpl.php:http://www.w3.org
/www/tmpl/master.tpl.php:http://www.w3.org
/www/tmpl/progress.tpl.php:http://www.w3.org

If you only want the address and never the file where it was found in, there is a grep option -h to suppress file output; the list can then be piped to sort -u to make sure every address appears only once:
$ grep -hERo 'https?://[^/"]+' folder/ | sort -u
http://www.w3.org
If you don't want the https?:// part, you can use Perl regular expressions (-P instead of -E, available in GNU grep) with \K, which discards everything matched so far and so behaves like a variable-length look-behind:
$ grep -hPRo 'https?://\K[^/"]+' folder/ | sort -u
www.w3.org
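Here is a small self-contained sketch of the -h plus sort -u approach. The two files and their contents are invented for the demonstration, and a grep that supports -R and -o (GNU or BSD) is assumed.

```shell
# Hypothetical folder with the same URL appearing in two files
tmpdir=$(mktemp -d)
printf '<a href="http://www.w3.org/TR">spec</a>\n' > "$tmpdir/a.php"
printf '<a href="http://www.w3.org/">home</a>\n<a href="https://example.org/x">x</a>\n' > "$tmpdir/b.php"

# -h drops the file name prefix, so identical URLs become identical lines
urls=$(grep -hERo 'https?://[^/"]+' "$tmpdir" | sort -u)
printf '%s\n' "$urls"

rm -r "$tmpdir"
```

The result contains http://www.w3.org once and https://example.org once; the order of the two lines depends on the locale used by sort.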

If the structure of the output is always:
/some/path/to/file.php:http://www.someurl.org
you can use the command cut :
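cut -d ':' -f 2- should work. Basically, it cuts each line into fields separated by a delimiter (here ":") and you select the 2nd and following fields (-f 2-), which keeps the colon inside the URL intact.
After that, you can use sort -u to filter. uniq alone only removes adjacent duplicates, so it needs sorted input.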

Pipe to Awk:
grep -ERo 'https?://[^/"]+' /folder |
awk -F: '!a[substr($0,length($1)+2)]++'
The basic Awk idiom !a[key]++ is true the first time we see key, and forever false after that. Extracting the URL into the key requires a bit of additional trickery: substr($0,length($1)+2) is everything after the file name and the colon that follows it.
This prints the whole input line if the key is one we have not seen before, i.e. it will print the file name and the URL for the first occurrence of each URL from the grep output.
Doing the whole thing in Awk should not be too hard, either.
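To see the Awk idiom in isolation, the sketch below feeds it three grep-style lines from a variable instead of running grep. The sample lines and the names matches, seen and first_hits are illustrative; the key is taken as everything after the first colon.

```shell
# Sample "file:url" lines in the shape grep -Ro produces
matches='/www/tmpl/button.tpl.php:http://www.w3.org
/www/tmpl/header.tpl.php:http://www.w3.org
/www/tmpl/main.tpl.php:https://example.org'

# Keep a line only the first time its URL (text after "file:") is seen
first_hits=$(printf '%s\n' "$matches" |
  awk -F: '!seen[substr($0, length($1) + 2)]++')
printf '%s\n' "$first_hits"
```

This prints the button.tpl.php and main.tpl.php lines and drops the header.tpl.php line, whose URL was already seen.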

How to grep only the content that contains x and y?

I have 2 million lines of content and all lines look like this:
--username:orderID:email:country
I already added a -- prefix to all usernames.
What I need now is to get ONLY the usernames from the file. I think it's possible with grep, matching text that starts with "--" and ends with ":", but I have absolutely no idea how.
So output should be:
username
Thank you all for the help.
THIS WORKED:
cut -d: -f1

Even without adding the prefix, you should be able to get the usernames with cut:
cut -d: -f1
-d says what the delimiter is, -f says which field(s) to return.
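For a quick check on made-up data (the two sample records and the variable name users are assumptions, with no -- prefix added):

```shell
# Two hypothetical records in username:orderID:email:country form
users=$(printf '%s\n' \
  'alice:1001:alice@example.com:US' \
  'bob:1002:bob@example.com:DE' | cut -d: -f1)
printf '%s\n' "$users"
```

This prints alice and bob on separate lines.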

Try this (it relies on GNU sed turning \n into a newline, and it keeps the -- prefix in the output):
sed 's/:/\n/g' YOUR_FILE | grep '^--'

How to determine the exact character of a whitespace in linux?

I'm trying to get the PID field of ps -aux. I know I can achieve this using ps -aux | awk '{print $2}', but as practice I wanted to see if I can do the same using the cut command. My idea is to specify a delimiter and choose the second field like this:
ps -aux | cut -d[delimiter] -f2
Using space as a delimiter (' ') did not work, neither did tab (\t).
In general, how do I find out the exact character of a white-space in linux?

To identify otherwise unprintable or similar-looking characters (like whitespace), pipe output to a tool like xxd or od -c. For example, this outputs both the hex values of each character as well as the text for easy lookup:
ps -aux | xxd -g 1 # -g 1 outputs each character individually
However, I think your issue is that ps -aux pads the fields with multiple spaces. cut treats every single delimiter as a field boundary, so it prints whatever is between the first and second space, i.e. nothing.
If you really want to use cut, you have to remove leading spaces and collapse each run of spaces into one (note the two spaces before the * in the second expression, so that it only matches one or more spaces):
ps -aux | sed 's/^ *//;s/  */ /g' | cut -d' ' -f2
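The same squeeze-then-cut idea can be sketched with tr -s, which collapses repeated characters. To keep the example reproducible it runs on two invented ps-style lines held in a variable named sample rather than on live ps output.

```shell
# Two made-up lines in the column layout of ps aux
sample='root         1  0.0  0.1 169000 11000 ?        Ss   10:00   0:01 /sbin/init
www-data  1234  0.0  0.2  55000  9000 ?        S    10:01   0:00 nginx'

# tr -s ' ' squeezes each run of spaces to one, so field 2 is the PID
pids=$(printf '%s\n' "$sample" | tr -s ' ' | cut -d' ' -f2)
printf '%s\n' "$pids"
```

This prints 1 and 1234. With real ps output the header line would yield PID as the first result.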

cut doesn't support a multi-character delimiter.
There are multiple spaces between the fields, so if you really want to use cut, squeeze them first (the pattern has two spaces before the *):
ps aux | sed 's/  */ /g' | cut -d ' ' -f 2

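To get the PID column of ps by character position you can do this (it assumes the PID falls in columns 10 to 15, which can shift with long user names or wide PIDs):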
ps -aux | cut -c10-15

For information: the u that you use in ps aux means, according to man ps:
u Display user-oriented format
So you're explicitly asking for a human readable output and then you parse it with some tool? That's not very appropriate (to say the least). If you need to format the output of ps, please use the -o (or --format) option, if your version of ps accepts it. Hence:
ps ax -o pid
will be much better.

Unix cut except last two tokens

I'm trying to parse file names in specific directory. Filenames are of format:
token1_token2_token3_token(N-1)_token(N).sh
I need to cut the tokens using the delimiter '_' and keep the string except for the last two tokens. In the above example the output should be token1_token2_token3.
The number of tokens is not fixed. I've tried to do it with -f#- option of cut command, but did not find any solution. Any ideas?

With cut:
$ echo t1_t2_t3_tn1_tn2.sh | rev | cut -d_ -f3- | rev
t1_t2_t3
rev reverses each line.
The 3- in -f3- means from the 3rd field to the end of the line (which is the beginning of the line through the third-to-last field in the unreversed text).

You may use POSIX-defined parameter expansion. ${name%_*_*} removes the shortest suffix matching _*_*, i.e. the last two tokens:
$ name="t1_t2_t3_tn1_tn2.sh"
$ name=${name%_*_*}
$ echo "$name"
t1_t2_t3
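Because % removes the shortest matching suffix, the expansion works for any number of tokens. A short sketch with three invented file names (the names f and out are illustrative):

```shell
# Strip the last two "_"-separated tokens from names of varying length
out=$(for f in t1_t2_t3_tn1_tn2.sh a_b_c.sh one_two_three_four_five_six.sh; do
  printf '%s\n' "${f%_*_*}"
done)
printf '%s\n' "$out"
```

This prints t1_t2_t3, a and one_two_three_four. A name with fewer than two underscores does not match the pattern and is left unchanged.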

cut on its own cannot count fields from the end of the line. However, you can use sed (-r enables extended regular expressions in GNU sed):
sed -r 's/(_[^_]+){2}$//'

Just a different way to write the rev and cut approach, using the --complement option of GNU cut:
echo "t1_t2_t3_tn1_tn2.sh" | rev | cut -d"_" -f1,2 --complement | rev
