I have a requirement of searching a pattern from a file and displaying the pattern only on the screen, not the whole line. How can I do it in Linux? [duplicate] - linux

This question already has answers here:
Can grep show only words that match search pattern?
(15 answers)
Closed 5 years ago.
I have a requirement of searching a pattern like x=<followed by any values> from a file and displaying the pattern i.e x=<followed by any values>, only in the screen, not the whole line. How can I do it in Linux?

I have 3 answers, from simple (but with caveats) to complex (but foolproof):
1) If your pattern never appears more than once per line, you could do this (assuming your shell is
PATTERN="x="
sed "s/.*\($PATTERN\).*/\1/g" your_file | grep "$PATTERN"
2) If your pattern can appear more than once per line, it's a bit harder. One easy but hacky way to do this is to use a special character that will not appear on any line that has your pattern, e.g. "#":
PATTERN="x="
SPECIAL="#"
grep "$PATTERN" your_file | sed "s/$PATTERN/$SPECIAL/g" \
| sed "s/[^$SPECIAL]//g" | sed "s/$SPECIAL/$PATTERN/g"
(This won't separate the output patterns within a line, e.g. you'll see x=x=x= if a source line had "x=" three times. This is easy to fix by adding a space in the last sed.)
3) Something that always works no matter what:
PATTERN="x="
awk "NF>1{for(i=1;i<NF;i++) printf FS; print \"\"}" \
FS="$PATTERN" your_file
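The three answers above print only the literal pattern. For the case in the question, where the text following x= should be shown as well, grep's -o option (available in GNU and BSD grep) prints only the matched part of each line, one match per output line. A minimal sketch, assuming a value ends at the next whitespace character; the sample lines and the temporary file are invented for illustration:

```shell
# Sample input, invented for this example.
sample=$(mktemp)
printf '%s\n' 'foo x=1 bar x=22' 'nothing here' 'x=abc end' > "$sample"

# -o prints each match on its own line instead of the whole line.
# [^[:space:]]* assumes a value runs up to the next whitespace character.
matches=$(grep -o 'x=[^[:space:]]*' "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```

With this input the sketch prints x=1, x=22 and x=abc on separate lines. Adjust the bracket expression if values may contain spaces or end at a different delimiter.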

Related

How to use sed or awk or something similar to replace every odd occurrence of character? [duplicate]

This question already has answers here:
Replace every n'th occurrence in huge line in a loop
(4 answers)
Closed 4 years ago.
I have the following string:
"1,0,2,0,3,0,4,0,5,0,6,0,13,05,24233,55".
How to use awk, or sed to get
"1.0,2.0,3.0,4.0,5.0,6.0,13.05,24233.55"?
I tried to use
sed 's/,/./g' <<< "1,0,2,0,3,0,4,0,5,0,6,0,13,05,24233,55"
1.0.2.0.3.0.4.0.5.0.6.0.13.05.24233.55
and also
sed 's/,/./2' <<< "1,0,2,0,3,0,4,0,5,0,6,0,13,05,24233,55"
1,0.2,0,3,0,4,0,5,0,6,0,13,05,24233,55
Which replaced the second item only. I need every odd occurrence changed.
For the future, what would be the code to replace every odd occurrence of , by . ?
Thanks for your help
With any sed that supports EREs via -E, e.g. GNU sed and OSX/BSD sed:
$ echo "1,0,2,0,3,0,4,0,5,0,6,0,13,05,24233,55" | sed -E 's/,([^,]+(,|$))/.\1/g'
1.0,2.0,3.0,4.0,5.0,6.0,13.05,24233.55
The above was inspired by @PesaThe's comment to my original answer.
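If the regex is hard to follow, the same result can be sketched in awk by walking over the comma-separated fields and choosing the separator by its position. The function name odd_commas_to_dots is hypothetical, and the sketch assumes the input is a single comma-separated line:

```shell
# Hypothetical helper: replace the 1st, 3rd, 5th, ... comma with a dot.
odd_commas_to_dots() {
  awk -F',' '{
    out = $1
    for (i = 2; i <= NF; i++) {
      # Separator number i-1 sits just before field i; odd ones become ".".
      sep = ((i - 1) % 2 == 1) ? "." : ","
      out = out sep $i
    }
    print out
  }'
}

echo "1,0,2,0,3,0,4,0,5,0,6,0,13,05,24233,55" | odd_commas_to_dots
```

Changing the modulus test (for example to `% 3 == 0`) would target every third separator instead.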
try this:
for the end:
sed 's/[,]$/?/' YourFile
Putting the , between [] removes most of its regex behaviour, so it is taken as a literal value (this does not work for some characters, such as ^, which need to be handled another way).
The $ refers to the end of the string.
The g in your test means change every occurrence; you only wanted one, at the end.
for the internal:
sed -e 's/,/./1;p' \
-e ':a' \
-e 's/^\(\([^.]*[.][^,]*,\)*\)\([^,]*\),\([^,]*\)/\1\3.\4/
/[^,]*,[^,.]*,/ ta' YourFile
You need a loop and a special test because the replacements must alternate.

Get text only within parenthesis from a file in linux terminal [duplicate]

This question already has an answer here:
How can I extract the content between two brackets?
(1 answer)
Closed 4 years ago.
I have a large log file I need to sort, I want to extract the text between parentheses. The format is something like this:
<#44541545451865156> (example#6144) has left the server!
How would I go about extracting "example#6144"?
This sed should work here:
sed -E -n 's/.*\((.*)\).*$/\1/p' file_name
There are many ways to skin this cat.
Assuming you always have only one lexeme in parentheses, you can use bash parameter expansion:
while IFS= read -r t; do t=${t#*(}; printf '%s\n' "${t%)*}"; done <logfile
The first substitution: ${t#*(} cuts off everything up to and including the left parenthesis, leaving you with example#6144) has left the server!; the second one: ${t%)*} cuts off the right parenthesis and everything after that.
Alternatively, you can also use awk:
awk -F'[)(]' '{print $2}' logfile
-F'[)(]' tells awk to use either parenthesis as the field delimiter, so it splits the input string into three tokens: <#44541545451865156>, example#6144, and has left the server!; then {print $2} instructs it to print the second token.
cut would also do:
cut -d'(' -f 2 logfile | cut -d')' -f 1
Try this:
sed -e 's/^.*(\([^()]*\)).*$/\1/' <logfile
The /^.*(\([^()]*\)).*$/ is a regular expression or regex. Regexes are hard to read until you get used to them, but are most useful for extracting text by pattern, as you are doing here.
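The commands above assume one parenthesized item per line. If a line may contain several, a sketch along the following lines prints each one separately. The function name extract_parens and the second sample line are made up, and nested parentheses are assumed not to occur:

```shell
# Hypothetical helper: print every parenthesized chunk on its own line,
# without the surrounding parentheses. Assumes no nested parentheses.
extract_parens() {
  # grep -o emits each "(...)" match on its own line;
  # sed then strips the leading "(" and the trailing ")".
  grep -o '([^()]*)' | sed 's/^(//; s/)$//'
}

printf '%s\n' '<#44541545451865156> (example#6144) has left the server!' \
              '(a) joined, (b) left' | extract_parens
```

Lines without any parentheses produce no output here, unlike the sed command above without -n, which prints them unchanged.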

Sed - How to switch two words in a line [duplicate]

This question already has answers here:
exchange two words using sed
(5 answers)
Closed 5 years ago.
I'm trying to write a shell script that switches the first and third words in a line. In this case only strings that contain letters (both upper- and lowercase) count as words, everything else (numbers, punctuation, whitespace) is considered whitespace.
For example:
abc123def. ghi...jkl
would turn into:
ghi123def. abc...jkl
I tried the following, but it doesn't work:
sed 's/\([a-zA-Z][a-zA-Z]*\)[^A-Z^a-z]\([a-zA-Z][a-zA-Z]*\)[^A-Z^a-z]\([a-zA-Z][a-zA-Z]*\)/\3 \2 \1/' input.txt
With sed:
$ echo "abc123def. ghi...jkl" | sed -r 's/([A-Za-z]*)([^A-Za-z]*[A-Za-z]*[^A-Za-z]*)([A-Za-z]*)(.*)/\3\2\1\4/g'
ghi123def. abc...jkl
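Because every group in the expression above uses *, a line that starts with a non-letter makes the first group match the empty string, so the wrong words get swapped. A possible variant, sketched here with + and an explicit leading group, keeps any leading non-letters in place. The function name swap_first_third is hypothetical, and -E is assumed to be available (GNU and BSD sed):

```shell
# Hypothetical helper: swap the first and third runs of letters on each line.
# \1 keeps leading non-letters where they are; lines with fewer than three
# words do not match and pass through unchanged.
swap_first_third() {
  sed -E 's/^([^A-Za-z]*)([A-Za-z]+)([^A-Za-z]+[A-Za-z]+[^A-Za-z]+)([A-Za-z]+)/\1\4\3\2/'
}

echo 'abc123def. ghi...jkl' | swap_first_third
echo '42 abc def ghi' | swap_first_third
```

The first call prints ghi123def. abc...jkl, as in the question; the second prints 42 ghi def abc.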

Remove lines in text file which contain fewer than 4 pipes [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 2 years ago.
I have a text file with data separated by 4 separate |
There are some problem lines in the file. These lines contain fewer than 4 pipes.
The data in the problem rows is not needed and I want to run a command on the file which deletes any line which contains fewer than four pipes. I would also like to know how many lines were deleted afterwards so if this could be printed on the screen once the command is applied that would be ideal.
Sample data:
865|Blue Moon Club|Havana Project|34d|879
899|Soya Plates|Dimsby|78a|699
657|Sherlock
900|Forestry Commission|Eden Project|68d|864
Desired output:
865|Blue Moon Club|Havana Project|34d|879
899|Soya Plates|Dimsby|78a|699
900|Forestry Commission|Eden Project|68d|864
I have tried awk '|>=3' file.txt which didn't work. There is a lot of info out there regarding awk, some of which I found, but there's so much it makes it difficult to find exactly what I want to do due to its sheer volume.
To eliminate the lines:
grep '|.*|.*|.*|' file > newfile
To count the number of bad lines:
grep -cv '|.*|.*|.*|' file
That doesn't do the edit in place; you could do that with sed but it is often safer to do edits like this to a newfile, in order to avoid losing data if you make a mistake.
The first grep pattern matches any line with at least four pipe symbols. (By default, grep uses "Basic" regular expressions, in which the alternation operator has to be written \|, so a bare | is an ordinary character.)
The second invocation counts (-c) the number of non-matching (-v) lines.
Here's a simple sed solution:
sed -n -i.bak '/|.*|.*|.*|/p' file
The -n option turns off automatic printing, so the command only prints the lines which match the pattern. (Again, by default, sed uses basic regexes.) The -i.bak option does the edit in place, creating a backup of the original with the name file.bak.
If you wanted to select lines with exactly four pipes, you could use awk:
awk -F'|' 'NF==5' file > newfile
which will set the field separator to a pipe symbol and then select the lines with exactly five fields, which are the lines with four pipes.
A useful tool to count lines is wc:
wc -l file
will tell you how many lines are in file; if you count lines in both file and newfile, the difference will obviously be the number of deletions. You could do that computation in awk, too, but it's a bit wordier:
awk -F'|' 'NF==5{print;next}{del+=1}END{print del >>"/dev/stderr"}' file > newfile
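The wc -l comparison described above can be sketched end to end as follows. The sample data is taken from the question; the temporary files and the variable names are assumptions made for this example, and NF >= 5 is used so that lines with at least four pipes are kept:

```shell
# Sample data from the question, written to a temporary file.
input=$(mktemp)
output=$(mktemp)
printf '%s\n' \
  '865|Blue Moon Club|Havana Project|34d|879' \
  '899|Soya Plates|Dimsby|78a|699' \
  '657|Sherlock' \
  '900|Forestry Commission|Eden Project|68d|864' > "$input"

# Keep lines with at least four pipes, i.e. five or more fields.
awk -F'|' 'NF >= 5' "$input" > "$output"

# The number of deleted lines is the difference between the two line counts.
deleted=$(( $(wc -l < "$input") - $(wc -l < "$output") ))
echo "Deleted $deleted line(s)"

kept=$(cat "$output")
printf '%s\n' "$kept"
rm -f "$input" "$output"
```

For the sample data this reports one deleted line and prints the three remaining rows. Writing to a separate file first, as recommended above, leaves the original untouched until the result has been checked.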
This will do:
sed -i.bak '/\([^|]*|\)\{4\}/!d' file
Or (as Cyrus's comment)
sed -i.bak -E '/(\|[^\|]*){4}/!d' file
Or
sed -n '/^[^|]*|[^|]*|[^|]*|[^|]*|[^|]*$/p' file > newfile
Or
sed -e '/^[^|]*|[^|]*|[^|]*|[^|]*$/d' \
-e '/^[^|]*|[^|]*|[^|]*$/d' \
-e '/^[^|]*|[^|]*$/d' \
-e '/^[^|]*$/d' \
-i.bak file
This won't give you the line count though. To get the count, run grep -cv '^[^|]*|[^|]*|[^|]*|[^|]*|[^|]*$' file on the original file as rici mentioned, or compare the number of lines before and after with wc -l file.
Explanation:
The first two sed commands match loosely: at least 4 pipes (not fewer, but possibly more). The third one matches exactly 4 pipes (no more, no fewer).
The fourth sed command matches lines with exactly 3, 2, 1 or 0 pipes (|), deletes them in place, and keeps a backup file (file.bak) of the original.

Highlight text similar to grep, but don't filter out text [duplicate]

This question already has answers here:
Colorized grep -- viewing the entire file with highlighted matches
(24 answers)
Closed 7 years ago.
When using grep, it will highlight any text in a line with a match to your regular expression.
What if I want this behaviour, but have grep print out all lines as well? I came up empty after a quick look through the grep man page.
Use ack. Check out its --passthru option. It has the added benefit of allowing full Perl regular expressions.
$ ack --passthru 'pattern1' file_name
$ command_here | ack --passthru 'pattern1'
You can also do it using grep like this:
$ grep --color -E '^|pattern1|pattern2' file_name
$ command_here | grep --color -E '^|pattern1|pattern2'
This will match all lines and highlight the patterns. The ^ matches every start of line, but won't get printed/highlighted since it's not a character.
(Note that most of the setups will use --color by default. You may not need that flag).
You can make every line match while leaving nothing to highlight on the lines that would otherwise be irrelevant:
egrep --color 'apple|' test.txt
Notes:
egrep may be spelled also grep -E
--color is usually default in most distributions
some variants of grep will "optimize" the empty match, so you might want to use "apple|$" instead (see: https://stackoverflow.com/a/13979036/939457)
EDIT:
This works with OS X Mountain Lion's grep:
grep --color -E 'pattern1|pattern2|$'
This is better than '^|pattern1|pattern2' because the ^ part of the alternation matches at the beginning of the line whereas the $ matches at the end of the line. Some regular expression engines won't highlight pattern1 or pattern2 because ^ already matched and the engine is eager [1].
Something similar happens for 'pattern1|pattern2|' because the regex engine notices the empty alternation at the end of the pattern string matches the beginning of the subject string.
[1]: http://www.regular-expressions.info/engine.html
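The 'pattern|$' form can be wrapped in a small function. This is only a sketch: the name highlight_all and the sample lines are made up, and --color=always is used so that the colour codes are emitted even when the output goes to a pipe:

```shell
# Hypothetical helper: print every input line, colouring matches of the
# extended regex given as $1. The trailing "|$" alternative matches the
# (empty) end of every line, so no line is filtered out.
highlight_all() {
  grep --color=always -E "$1|\$"
}

printf '%s\n' 'apple pie' 'banana split' 'cherry tart' | highlight_all 'apple|cherry'
```

All three lines are printed; only apple and cherry are coloured, since the empty match at the end of a line has nothing to highlight.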
FIRST EDIT:
I ended up using perl:
perl -pe 's:pattern:\033[31;1m$&\033[30;0m:g'
This assumes you have an ANSI-compatible terminal.
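Where perl is not available, roughly the same effect can be sketched with sed. The function name highlight_sed is hypothetical; the sketch assumes a basic regular expression that contains no / character (which would end the sed command early) and, like the perl version, an ANSI-compatible terminal:

```shell
# Hypothetical helper: wrap every match of the basic regex $1 in ANSI
# bold red, leaving all other text and all non-matching lines untouched.
highlight_sed() {
  esc=$(printf '\033')   # the literal ESC character
  # & in the replacement stands for the matched text.
  sed "s/$1/${esc}[31;1m&${esc}[0m/g"
}

printf '%s\n' 'apple pie' 'banana split' | highlight_sed 'an'
```

Every line is passed through, and each occurrence of the pattern is wrapped in the colour codes.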
ORIGINAL ANSWER:
If you're stuck with a strange grep, this might work:
grep -E --color=always -A500 -B500 'pattern1|pattern2' | grep -v '^--'
Adjust the numbers to get all the lines you want.
The second grep just removes extraneous -- lines inserted by the BSD-style grep on Mac OS X Mountain Lion, even when the context of consecutive matches overlap.
I thought GNU grep omitted the -- lines when context overlaps, but it's been a while so maybe I remember wrong.
You can use my highlight script from https://github.com/kepkin/dev-shell-essentials
It's better than grep because you can highlight each match with its own color.
$ command_here | highlight green "input" | highlight red "output"
Since you want matches highlighted, this is probably for human consumption (as opposed to piping to another program for instance), so a nice solution would be to use:
less -p <your-pattern> <your-file>
And if you don't care about case sensitivity:
less -i -p <your-pattern> <your-file>
This also has the advantage of having pages, which is nice when having to go through a long output
You can do it using only grep by:
reading the file line by line
matching a pattern in each line and highlighting pattern by grep
if there is no match, echo the line as is
which gives you the following:
while IFS= read -r line; do printf '%s\n' "$line" | grep PATTERN || printf '%s\n' "$line"; done < inputfile
If you want to print "all" lines, there is a simple working solution:
grep "test" -A 9999999 -B 9999999
A => After
B => Before
If you are doing this because you want more context in your search, you can do this:
cat BIG_FILE.txt | less
Doing a search in less should highlight your search terms.
Or pipe the output to your favorite editor. One example:
cat BIG_FILE.txt | vim -
Then search/highlight/replace.
If you are looking for a pattern in a directory recursively, you can either first save the listing to a file:
ls -1R ./ > list-of-files.txt
And then grep that, or pipe it to the grep search
ls -1R | grep --color -E '[A-Z]|'
This will list all files, but colour the uppercase letters in their names. If you remove the last | you will only see the matches.
I use this to find badly named images with upper case in their names, for example. With a normal grep I would lose the context, because the path is shown just once per directory rather than for each file.
Maybe this is an XY problem, and what you are really trying to do is to highlight occurrences of words as they appear in your shell. If so, you may be able to use your terminal emulator for this. For instance, in Konsole, start Find (ctrl+shift+F) and type your word. The word will then be highlighted whenever it occurs in new or existing output until you cancel the function.
