how to grep a string from a particular line? [closed] - linux

I want to search for a string on a particular line. How can I do this in shell scripting?
For example,
I have
this is the first line
this is the Second line
This is the Third line
Now here I want to look for the string "Third" by going to the 3rd line.
Any help is appreciated, thank you.

Try stringing together sed and grep.
sed '3!d' filename | grep Third
The unnamed or anonymous pipe (|) and redirection (<, >) are powerful features of many shells. They allow one to combine a set of commands to perform a more complex function.
In the case of this question there were two clear steps,
1) Operate on a specific line of a file (e.g. filter a file)
2) Search the output of the filter for a specific string
Recognizing that there were two steps is a strong indicator that two commands will need to be combined. Therefore, the problem can be solved by finding a solution to each step and then combining them into one command with pipes and redirection.
If you know about the Stream Editor (sed), it may come to mind when thinking about how to accomplish the first step of filtering the file. If not, searching for "linux get a specific line of a file" brings up this SO question high in the search results.
$ cat tmp.txt
this is the first line
this is the Second line
This is the Third. line
$ sed '3!d' tmp.txt
This is the Third. line
Knowing that grep can search for lines containing the string of interest, the next challenge is to figure out how to get the output of sed as the input to grep. The pipe (|) solves this problem.
sed '3!d' filename | grep Third
Example output:
$ sed '3!d' tmp.txt | grep Third
This is the Third. line
$
Another powerful concept in shell scripting is the exit status. The grep command will set the exit status to 0 when a match is found and 1 when a match is not found. The shell stores the exit status in a special variable named $? (for bash). Therefore, one could use the exit status to conditionally determine the next step in the shell script. The example below does not implement conditions (like if, else). The example below shows the exit status value using the echo command.
$ sed '3!d' tmp.txt | grep Third
This is the Third. line
$ echo $?
0
$ sed '3!d' tmp.txt | grep third
$ echo $?
1
$
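To act on the exit status rather than just display it, the pipeline can be wrapped in an if statement, since if branches on the exit status of the last command in the pipeline. A minimal sketch along those lines (using the same tmp.txt and grep's -q option, which suppresses output and only sets the exit status):
#!/bin/bash
# Branch on grep's exit status: 0 means line 3 contains "Third", 1 means it does not.
if sed '3!d' tmp.txt | grep -q Third; then
    echo "Found Third on line 3"
else
    echo "Third not found on line 3"
fi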

Related

How can I search for hexadecimal content in a file in a linux/unix/bash script? [closed]

I have a hexadecimal string s and a file f. I need to search for the first occurrence of that string in the file and save it in a variable along with its offset. I thought the right way to do that is to convert the file to hex and search it with grep. The main problem is that I saw a lot of commands (hexdump, xxd, etc.) to convert, but none of them actually worked. Any suggestions?
My attempt was like this:
xxd -plain $f > $f
grep "$s" .
output should be like:
> offset:filename
A first approach without any error handling could look like
#!/bin/bash
BINFILE=$1
SEARCHSTRING=$2
# Dump the whole file as one continuous hex string.
HEXSTRING=$(xxd -p "${BINFILE}" | tr -d "\n")
echo "${HEXSTRING}"
echo "Searching ${SEARCHSTRING}"
# grep -b prints the character offset of each match; cut keeps only the offset.
OFFSET=$(grep -aob "${SEARCHSTRING}" <<< "${HEXSTRING}" | cut -d ":" -f 1)
echo "${OFFSET}:${BINFILE}"
I've used xxd here because of Does hexdump respect the endianness of its system?. Please also note that, according to How to find a position of a character using grep?, grep will return all matches, not only the first one. Also note that the offset grep reports is the 0-based character offset within the hex dump, not the byte offset in the binary; since each byte is two hex characters, you can convert it with $((${OFFSET}/2)).
E.g., searching for the "string" ELF (hex 454c46) in a system binary looks like:
./searchHEX.sh /bin/yes 454c46
7f454c460201010000000000000000000...01000000000000000000000000000000
Searching 454c46
2:/bin/yes
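If only the first occurrence is wanted, and as a byte offset into the original file rather than a character offset into the hex dump, a small variation of the script above could look like this (a sketch without error handling, assuming GNU grep; head -n 1 keeps the first match and the division by two converts hex characters to bytes):
#!/bin/bash
BINFILE=$1
SEARCHSTRING=$2
# One continuous hex string for the whole file.
HEXSTRING=$(xxd -p "${BINFILE}" | tr -d "\n")
# grep -b gives the 0-based character offset of each match; keep only the first.
CHAROFFSET=$(grep -aob "${SEARCHSTRING}" <<< "${HEXSTRING}" | head -n 1 | cut -d ":" -f 1)
# Two hex characters per byte; a match at an odd offset would not be byte-aligned.
echo "$((CHAROFFSET / 2)):${BINFILE}"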
I would use regex for this as well:
The text file:
$ cat tst.txt
1234567890x1fgg0x1cfffrr
A script you can easily change/extend yourself.
#! /bin/bash
# Rewrite the file as "prefix:0xNUMBER", where prefix is everything before the first hex number.
part="$(perl -0pe 's/^((?:(?!0(x|X)[0-9a-fA-F]+).)*)(0(x|X)[0-9a-fA-F]+)(.|\n)*/\1:\3\n/g;' tst.txt)"
# Strip everything from ":0x" on; what remains is the text before the hex number.
tmp=${part/:0x*/}
# Its length is the offset of the hex number.
tmp=${#tmp}
echo ${part/*:0x/$tmp:0x} # Echoes 9:0x1f (offset:match)
Regex:
^((?:(?!0(x|X)[0-9a-fA-F]+).)*) = everything up to (but not including) the first hexadecimal number, captured as group \1.
(0(x|X)[0-9a-fA-F]+) = the hexadecimal number itself, captured as group \3.
(.|\n)* = whatever follows.
Please note that tmp=${part/:0x*/} could cause problems if text like :0x appears before the hexadecimal number that is caught.

SED Command Replacement [closed]

Suppose I have a file with warnings. Each warning is on its own line and carries an id made of 3 capital letters followed by 3 digits; each such line should be replaced by its id.
Example:
SIM_WARNING[ANA397]: Node q<159> for vector output signal does not exist
The output should be ANA397, with the rest of the line deleted.
How to do so using sed?
I don't think you need sed for that. A simple grep with --only-matching could do it. Note that -o prints the whole match, not just a parenthesised group, so the pattern has to be arranged so that only the id is part of the reported match. With GNU grep that can be done with -P and \K, as in:
grep -oP 'SIM_WARNING\[\K[A-Z]{3}[0-9]{3}(?=\])'
should work for you.
Where:
-P enables Perl-compatible regular expressions, which provide \K and lookahead
\K discards everything matched so far (the fixed SIM_WARNING[ prefix) from the reported match
(?=\]) requires the closing bracket to follow without including it in the match
--only-matching (-o) simply makes grep print only the matching content
In other words: the reported match is reduced to just the id between the square brackets.
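For illustration, a sample run on the example line (assuming GNU grep built with PCRE support; the file name warnings.log is just a placeholder):
$ cat warnings.log
SIM_WARNING[ANA397]: Node q<159> for vector output signal does not exist
$ grep -oP 'SIM_WARNING\[\K[A-Z]{3}[0-9]{3}(?=\])' warnings.log
ANA397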
for id in $(grep -o "^SIM_WARNING\[[A-Z][A-Z][A-Z][0-9][0-9][0-9]\]" test1.bla | grep -o "[A-Z][A-Z][A-Z][0-9][0-9][0-9]"); do echo $id; done
This finds ANA397 in the line below.
SIM_WARNING[ANA397]: Node q<159> for vector output signal does not exist
First of all, you have to choose how you want to use the IDs, for example whether you need to loop over the file line by line or over the extracted IDs afterwards...
E.G. (Cycle file first)
exec 3<file
while read -r line <&3; do
    id="$(printf "%s" "${line}" | sed -e "s/.*\[\([[:alnum:]]\+\)\].*/\1/")"
    ### Do something with id
done
exec 3>&-
Otherwise you can decide to cycle the output of sed...
E.G.
for id in $(sed -e "s/.*\[\([[:alnum:]]\+\)\].*/\1/" file); do
    ### Do something with id
done
Both of the examples should work with a POSIX shell (if I am not missing something...). Note that the [[:alnum:]] class is interpreted by sed, not the shell; if your sed does not support such classes you can substitute the equivalent [a-zA-Z0-9], as every guide will teach you.
Note that the check is not for exactly 3 letters and 3 digits, but for any run of letters and digits between brackets ([ and ]).
EDIT:
If your lines start with SIM_WARNING you can discriminate those lines with -e "/^SIM_WARNING/! d"
For a strict check on 3 letters and 3 digits you can use -e "s/.*\[\([a-zA-Z][a-zA-Z][a-zA-Z][0-9][0-9][0-9]\)\].*/\1/"
So taking the example above you can do something like:
for id in $(sed -e "/^SIM_WARNING/! d" -e "s/.*\[\([a-zA-Z][a-zA-Z][a-zA-Z][0-9][0-9][0-9]\)\].*/\1/" file); do
    ### Do something with id
done

Check if an array element is in a file [closed]

I am writing a bash script to check if an array element is in a file.
For example:
I have an array of errors errors=("1234" "5678" "9999")
I have a file that contains patterns of strings
123400 452612 9999A0 1010EB
I am looking to loop over the file that contains the errors and check to see if any of the array elements matches any string pattern in the file. If it does then give me back the exact array pattern it matched in the file for further processing.
Any ideas on how I can do this?
Here's a way where you only need to invoke grep once:
$ grep -oFf <(printf "%s\n" "${errors[@]}") file
1234
9999
The -f option is to specify a file that contains the patterns. I use a process substitution to "contain" the patterns, one per line.
The -F option specifies plain-text matching: I assume your "errors" array won't contain regular expressions.
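If you also need the matched patterns in a variable for the further processing the question mentions, the same grep output can be captured into an array, for example (a sketch assuming bash 4+ for mapfile):
# Collect the unique matched error codes into a bash array.
mapfile -t matched < <(grep -oFf <(printf "%s\n" "${errors[@]}") file | sort -u)
for m in "${matched[@]}"; do
    echo "matched: $m"
done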
Sounds like you just want a loop:
for error in "${errors[@]}"; do
    if grep -qE "(^| )$error( |\$)" file; then
        echo "$error was found in the file"   # or whatever processing you need
    fi
done
This matches the error preceded by the start of the line or a space, and followed by a space or the end of the line.
I made an effort to not match appearances of the errors within substrings but if you don't care, then you could change the grep command to this:
grep -qF "$error" file
This will return success if the error string occurs anywhere on the line.
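To see the difference, a quick illustration (assuming the file contains the line "123400 452612 9999A0 1010EB" from the question):
$ grep -qE "(^| )1234( |\$)" file; echo $?
1
$ grep -qF "1234" file; echo $?
0
The anchored pattern rejects 1234 because it only occurs inside 123400, while the plain -F search accepts it.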
The script goes like this,
#!/bin/bash
errors=("1234" "5678" "9999")
for error in "${errors[@]}"
do
    grep -o "$error" file
done
For a sample file,
$ cat file
123400 452612 9999A0 1010EB
The script produces an output
$ ./script.sh
1234
9999
meaning the above two keys from the array matched in the file. The -o flag in grep prints only the matching parts of each line. An excerpt from the grep man page:
-o, --only-matching
Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.

Remove lines in text file which contain fewer than 4 pipes [closed]

I have a text file whose data fields are separated by 4 pipe characters (|).
There are some problem lines in the file. These lines contain fewer than 4 pipes.
The data in the problem rows is not needed, and I want to run a command on the file which deletes any line containing fewer than four pipes. I would also like to know afterwards how many lines were deleted, so if this count could be printed on the screen once the command is applied, that would be ideal.
Sample data:
865|Blue Moon Club|Havana Project|34d|879
899|Soya Plates|Dimsby|78a|699
657|Sherlock
900|Forestry Commission|Eden Project|68d|864
Desired output:
865|Blue Moon Club|Havana Project|34d|879
899|Soya Plates|Dimsby|78a|699
900|Forestry Commission|Eden Project|68d|864
I have tried awk '|>=3' file.txt, which didn't work. There is a lot of info out there regarding awk, some of which I found, but its sheer volume makes it difficult to find exactly what I want to do.
To eliminate the lines:
grep '|.*|.*|.*|' file > newfile
To count the number of bad lines:
grep -cv '|.*|.*|.*|' file
That doesn't do the edit in place; you could do that with sed but it is often safer to do edits like this to a newfile, in order to avoid losing data if you make a mistake.
The first grep pattern matches any line containing at least four pipe symbols. (By default, grep uses "Basic" regular expressions, in which the alternation operator has to be written \|, so | can be used as an ordinary character.)
The second invocation counts (-c) the number of non-matching (-v) lines.
Here's a simple sed solution:
sed -n -i.bak '/|.*|.*|.*|/p' file
The -n option turns off automatic printing, so the command only prints the lines which match the pattern. (Again, by default, sed uses basic regexes.). The -i.bak option does the edit in place, creating a backup of the original with the name file.bak.
If you wanted to select lines with exactly four pipes, you could use awk:
awk -F'|' 'NF==5' file > newfile
which will set the field separator to a pipe symbol and then select the lines with exactly five fields, which are the lines with exactly four pipes.
A useful tool to count lines is wc:
wc -l file
will tell you how many lines are in file; if you count lines in both file and newfile, the difference will obviously be the number of deletions. You could do that computation in awk, too, but it's a bit wordier:
awk -F'|' 'NF==5{print;next}{del+=1}END{print del >>"/dev/stderr"}' file > newfile
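Equivalently, without awk, the filter and the count can be combined in a short sketch (using the same file and newfile names as above):
# Keep only the lines with at least four pipes, then report how many were dropped.
grep '|.*|.*|.*|' file > newfile
before=$(wc -l < file)
after=$(wc -l < newfile)
echo "Deleted $((before - after)) line(s)"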
This will do:
sed -i.bak '/\([^|]*|\)\{4\}/!d' file
Or (as Cyrus's comment)
sed -i.bak -E '/(\|[^\|]*){4}/!d' file
Or
sed -n '/^[^|]*|[^|]*|[^|]*|[^|]*|$/p' file > newfile
Or
sed -e '/^[^|]*|[^|]*|[^|]*|$/d' \
-e '/^[^|]*|[^|]*|$/d' \
-e '/^[^|]*|$/d' \
-e '/^[^|]*$/d' \
-i.bak file
This won't give you a line count though. To get the line count, run grep -cv '^[^|]*|[^|]*|[^|]*|[^|]*|$' file on the original file as rici mentioned, or compare the line counts before and after with the wc -l command.
Explanation:
The first two sed commands match 4 pipes loosely (not fewer, but possibly more) and the third one matches exactly 4 pipes (no more, no fewer).
The fourth sed deletes lines with exactly 3, 2, 1, or 0 pipes (|), edits the file in place, and creates a backup of the original (file.bak).

What is cat for and what is it doing here? [closed]

I have this script I'm studying and I would like to know what cat is doing in this section.
if cat downloaded.txt | grep "$count" >/dev/null
then
    echo "File already downloaded!"
else
    echo $count >> downloaded.txt
    cat $count | egrep -o "http://server.*(png|jpg|gif)" | nice -n -20 wget --no-dns-cache -4 --tries=2 --keep-session-cookies --load-cookies=cookies.txt --referer=http://server.com/wallpaper/$number -i -
    rm $count
fi
Like most cats, this is a useless cat.
Instead of:
if cat downloaded.txt | grep "$count" >/dev/null
It could have been written:
if grep "$count" download.txt > /dev/null
In fact, because you've eliminated the pipe, you've eliminated any ambiguity about which command's exit status the if statement is testing.
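If you also want to get rid of the /dev/null redirection, grep's -q option suppresses output and only sets the exit status; a sketch of the same check:
# Quietly test whether $count already appears in downloaded.txt.
if grep -q "$count" downloaded.txt
then
    echo "File already downloaded!"
fi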
Most Unix cats you'll see are of the useless variety. However, people like cats almost as much as they like using a grep/awk pipe, or using multiple grep or sed commands instead of combining everything into a single command.
The cat command stands for concatenate; it lets you concatenate files. It pairs naturally with the split command, which splits a file into multiple parts. This was useful if you had a really big file but had to put it on floppy disks that couldn't hold the entire file:
split -b140K -a4 my_really_big_file.txt my_smaller_files.txt.
Now, I'll have my_smaller_files.txt.aaaa and my_smaller_files.txt.aaab and so forth. I can put them on the floppies, and then on the other computer. (Heck, I might go all high tech and use UUCP on you!).
Once I get my files on the other computer, I can do this:
cat my_smaller_files.txt.* > my_really_big_file.txt
And, that's one cat that isn't useless.
cat prints out the contents of the file with the given name (to standard output, or to wherever it is redirected). The result can be piped to some other command (in this case, (e)grep, to find something in the file contents). Concretely, here the script tries to download the images referenced in that file, and adds the name of the file to downloaded.txt so that it is not processed again (this is what the check in the if was about).
http://www.linfo.org/cat.html
"cat" is a unix command that reads the contents of one or more files sequentially and by default prints out the information the user console ("stdout" or standard output).
In this case cat is being used to read the contents of the file "downloaded.txt", the pipe "|" is redirecting/feeding its output to the grep program, which is searching for whatever is in the variable "$count" to be matched with.
