Check whether a path matches a line in a file that contains wildcards (* asterisk) in a Linux shell

Input:
Text file: backup_list.txt
/home/common/xyz_V*.txt
/home/common/hello.txt
/home/mutaq/xya_*_logs.txt
/home/mutaq/xygs.txt
Text: /home/mutaq/xya_Juvi_V1.01_logs.txt
Now I want to match this text against the file.
The file has the line /home/mutaq/xya_*_logs.txt, which corresponds to the text /home/mutaq/xya_Juvi_V1.01_logs.txt if the asterisk (*) is treated as a wildcard standing for any run of characters.
I want to know whether the text exists in the file or not.
With grep, I cannot get the asterisk to be treated as a wildcard.
One way I found is to iterate through backup_list.txt, invoke ls for each line, store the output somewhere, and then match the text directly against the stored values.
Is there a better way of doing this, so that I can search for the text directly within that file?

A first answer could be grep's -f option, which makes grep read the patterns to search for from the file passed as its argument. You could try:
ls /path/to/files/ | grep -f backup_list.txt
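But before backup_list.txt can be given to grep as a pattern file, each '.' has to be replaced by '\.' so that it matches a literal '.' rather than any character, and each '*' has to be replaced by '.*' (any character, zero or more times). Escape the dots first; otherwise the dot in the newly inserted '.*' gets escaped as well.
You can make the replacements with sed.
Hope this helps.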
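The replacements described above can be sketched as follows, matching the text against the converted patterns directly instead of against ls output. The function names to_regex and in_list are made up for this example, and the conversion only handles '.' and '*'; other glob or regex metacharacters in a path (?, [, ^, and so on) are not treated and would need extra rules.

```shell
# to_regex FILE: print each glob in FILE as an anchored basic regular expression.
to_regex() {
    # 1. escape literal dots, 2. turn * into .*, 3. anchor start and end
    sed -e 's/\./\\./g' -e 's/\*/.*/g' -e 's/^/^/' -e 's/$/$/' "$1"
}

# in_list TEXT FILE: succeed if TEXT matches at least one pattern in FILE.
in_list() {
    # grep accepts several newline-separated patterns in a single -e argument.
    printf '%s\n' "$1" | grep -q -e "$(to_regex "$2")"
}

# Sample data from the question, written to a temporary file.
list=$(mktemp)
printf '%s\n' \
    '/home/common/xyz_V*.txt' \
    '/home/common/hello.txt' \
    '/home/mutaq/xya_*_logs.txt' \
    '/home/mutaq/xygs.txt' > "$list"

if in_list '/home/mutaq/xya_Juvi_V1.01_logs.txt' "$list"; then hit=yes; else hit=no; fi
if in_list '/home/mutaq/other.txt' "$list"; then miss=yes; else miss=no; fi
echo "$hit $miss"

rm -f "$list"
```

Because every pattern is anchored with ^ and $, a line only counts as a match when it covers the whole text, not just a part of it.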

Related

Output the names of all files from file.txt, having the .conf extension

I need to output from a file file.txt the names of all files with the .conf extension.
grep .conf file.txt
But in the end, I get a file called dconf and a file with the config extension. How can I output everything else, but without these two?
The '.' has a special meaning in a regular expression: it matches any single character. If you want to match only the dot itself, you have to escape it with a backslash:
grep "\.conf" file.txt
The backslash in turn has to be protected from the shell, which is what the double quotes are for.
To learn more about regular expression syntax, you can try patterns out in an online regex tester.
Add on:
From the comments: how do I keep a file named xyz.config out of the list?
Answer: You have to tell grep that the match must end at the end of a word, using \>:
grep "\.conf\>" file.txt
TL;DR: you should instead do:
grep "\.conf\>" file.txt
grep uses regular expressions. The . character in a regex is a metacharacter which means "match any one character." So your command means "match any line which contains one character followed by c o n f, in that order."
So your regular expression will match what you are looking for, but it will also match strings that have more text after the match (your .config example) as well as any character followed by "conf" (your dconf example).
So instead, tell grep that you are looking for a literal "." by escaping that character with a preceding backslash (\), and describe what the end of your string looks like, which may be the end of the line or simply a space.
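A small sketch of the difference, using made-up file names and assuming file.txt holds one name per line. Under that assumption the end of the string is the end of the line, so the example anchors with $ rather than \>.

```shell
# Hypothetical contents of file.txt, one name per line.
names='app.conf
dconf
xyz.config
nginx.conf'

# Unescaped dot: any character may precede "conf", so every line matches.
loose=$(printf '%s\n' "$names" | grep -c '.conf')

# Escaped dot plus end-of-line anchor: only names that end in ".conf".
strict=$(printf '%s\n' "$names" | grep '\.conf$')

echo "loose pattern matched $loose lines"
printf '%s\n' "$strict"
```

The single quotes keep the backslash and the $ away from the shell, so grep receives the pattern unchanged.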

How to count exact match of certain patterns in a text file using linux shell command?

I want to count the occurrences of a certain pattern in a text file that also contains many other, similar patterns, using a Linux shell command.
I have a text file which contains the patterns below:
[--------------]
[+--------------+]
[+----------+------------+--------------------+]
[+---------------------+---------------------+]
How can I find the exact count of only the first pattern, [--------------]?
Note: The square brackets are not part of the pattern. Only the special characters inside the square brackets are the pattern.
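sed -e 's/\]/]\n/g' ./file | grep -c "\[--------------\]"
sed reads the file directly, so cat is not needed
sed replaces every ] with ] followed by a newline (\n in the replacement is a GNU sed feature), so each bracketed group ends up on its own line
grep searches every line for your expression, and -c prints the number of matching lines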
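As an alternative sketch, grep -o prints every match on its own line, so counting those lines with wc -l gives the number of occurrences even when several sit on one input line. The -o option is not part of POSIX but is available in GNU and BSD grep. The sample data below is made up from the patterns in the question.

```shell
# The run of dashes from the question, kept in one place so that the sample
# data and the search pattern cannot drift apart.
dashes='--------------'

# Sample input: one exact match, one near miss with + signs, and a line
# that holds the exact pattern twice.
sample=$(printf '[%s]\n[+%s+]\n[%s] [%s]\n' "$dashes" "$dashes" "$dashes" "$dashes")

# -o emits one output line per match; wc -l counts those lines.
count=$(printf '%s\n' "$sample" | grep -o "\[$dashes]" | wc -l)
echo "$count"
```

With plain grep -c the last sample line would be counted once, because -c counts matching lines rather than matches.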

How to use sed to replace a string that contains the slash?

I have a text file that contains a lot of messy text.
I used grep to get all the lines that contain the string prod, like this:
cat textfile | grep "<host>prod*"
The result
<host>prod-reverse-proxy01</host>
<host>prod-reverse-proxy01</host>
<host>prod-reverse-proxy01</host>
Next, I used sed with the intention of removing all the "host" parts:
cat textfile | grep "<host>prod*" | sed "s/<host>//g"; "s/</host>//g"
But only the first "host" was removed.
prod-reverse-proxy01</host>
prod-reverse-proxy01</host>
prod-reverse-proxy01</host>
How can I remove the other "/host" part?
sed -n -e "s/^<host>\(.*\)<\/host>/\1/p" textfile
sed can process your file directly. No need for grep or cat.
-n suppresses sed's automatic printing of every line. The final 'p' in the script prints only the lines on which the substitution succeeded.
Script dissection:
s/.../.../...
is the search/replace form. The bit between the first and the second '/' is what you search for. The bit between the second and third is what you replace it with. The last part holds any flags you want to apply to the substitution.
Search:
^<host>\(.*\)<\/host>
matches all lines beginning with <host>, followed by any text (.*), followed by </host>. The text between <host> and </host> is captured into group 1 using '\(' and '\)'. Note that (, ) and / (in </host>) have to be escaped.
Replace:
\1
Replace the matched text with the contents of group 1 (the 1 has to be escaped; otherwise, everything is replaced by the literal character '1').
Commands:
p
Print resulting line (after replacement).
Note: Your search involves removing two similar but not identical strings (<host> and </host>).
I think this sed command is enough:
sed 's/<[/]*host>//g' infile
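For comparison, here is a sketch of two ways to repair the two-step command from the question. Each substitution has to be part of the sed script itself, either as its own -e expression or separated by a semicolon inside the quotes. The slash in </host> can then be handled by choosing another delimiter such as |, or by escaping it.

```shell
input='<host>prod-reverse-proxy01</host>'

# Variant 1: use | as the delimiter, so the / in </host> needs no escaping.
with_pipe=$(printf '%s\n' "$input" | sed -e 's|<host>||g' -e 's|</host>||g')

# Variant 2: keep / as the delimiter and escape the slash in the pattern.
with_escape=$(printf '%s\n' "$input" | sed 's/<host>//g; s/<\/host>//g')

echo "$with_pipe"
echo "$with_escape"
```

Both variants print prod-reverse-proxy01 for this input.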

sed to replace same patterns that have slightly different ending to the string

I am using grep on an entire directory and sed to replace the string. There are some conflicts in the replacement, as there are two strings that are very similar and follow the same pattern. The only big difference is the file extension at the end.
String1
xargs sed -i 's,//website.net/resources/special.js,//newsite.net/location/newspecial.js,g'
String2
xargs sed -i 's,//website.net/resources/file.swf,//newsite.net/location/player.swf,g'
How do I specify that .js receives the correct replacement and .swf receives the correct replacement?
For the first, you can restrict the match easily. For the second, you need a mapping from the old file name to the new file name; otherwise the script has no way of knowing that "file.swf" is to be replaced with "player.swf".
$ echo '//website.net/resources/special.js' |
sed -E 's,(.*/)(.*\.js)$,\1new\2,'
//website.net/resources/newspecial.js
The first group matches every character up to and including the last /; the second matches the rest of the line, which must end in .js. You may need a different anchor if there are multiple elements on the same line. Note that when there is only one element per line, the g flag is unnecessary.
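The explicit mapping can be sketched as one sed call with a separate expression per old name. The dots in the search part are escaped so that .js and .swf match literally, and each expression names its own extension, so neither replacement can fire on the other string. The two input lines are made up for illustration.

```shell
# Hypothetical input lines containing the two old URLs.
input='src="//website.net/resources/special.js"
data="//website.net/resources/file.swf"'

# One -e expression per mapping; , is the delimiter so the slashes need no escaping.
result=$(printf '%s\n' "$input" |
    sed -e 's,//website\.net/resources/special\.js,//newsite.net/location/newspecial.js,g' \
        -e 's,//website\.net/resources/file\.swf,//newsite.net/location/player.swf,g')

printf '%s\n' "$result"
```

The same pair of -e expressions can be passed to xargs sed -i once the output looks right.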

Display entire line containing certain characters in Linux

I am wondering if there is a way to display the entire line of a data file that contains specific characters in Linux. For example, searching for "577789999" in file.txt should display a line such as the one below:
577789999 adef YTM 777888
That's what grep is for:
grep 577789999 file.txt
You might want to restrict the pattern to match only at the beginning of the line:
grep '^577789999' file.txt
Generally, you can use grep like this:
grep "the researched string" "filename"
It prints every line that contains your string (add -n to also see the line number), and if you search in "*" (all files in the current directory), it also tells you which file each match is in ;)
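A short sketch of these variants on made-up sample data. -F treats the search string as fixed text rather than a regular expression, and -n prefixes each match with its line number.

```shell
# Hypothetical contents of file.txt.
data='123456789 abcd XYZ 111222
577789999 adef YTM 777888
999 577789999 tail'

# Match the string anywhere in the line.
anywhere=$(printf '%s\n' "$data" | grep -F '577789999')

# Match only at the beginning of the line.
at_start=$(printf '%s\n' "$data" | grep '^577789999')

# Same, with the line number in front of the match.
numbered=$(printf '%s\n' "$data" | grep -n '^577789999')

printf '%s\n' "$anywhere"
printf '%s\n' "$at_start"
printf '%s\n' "$numbered"
```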
