Unix: Using sed remove part of each line - linux

I am trying to remove the Who:, What:, When:, Where: labels and the following space from each line in a text file. Below is an example:
Who: Tester1+Password
What: Authentication Success
When: Tues March 20, 2015 08:15:02 UTD
Where: 198.192.1.2

If you want to remove everything up to the first colon and the space after it, you can do:
sed -re 's/(^[^:]+: )(.*)/\2/' file
Tester1+Password
Authentication Success
Tues March 20, 2015 08:15:02 UTD
198.192.1.2
As Glenn suggested we can avoid capture groups completely by removing the portion we don't need.
sed 's/[^:]\+: //' file
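The \+ quantifier above is a GNU extension in basic regular expressions. A minimal sketch of a more portable variant, assuming every line starts with a label followed by ": " (strip_label is a hypothetical helper name, not part of the original answer):

```shell
# strip_label: remove everything up to and including the first ": " on each line.
# (strip_label is a hypothetical name used only for this sketch.)
strip_label() {
  # ^[^:]* matches the label, ": " the separator; * is plain POSIX BRE.
  sed 's/^[^:]*: //' "$@"
}

printf '%s\n' 'Who: Tester1+Password' 'When: Tues March 20, 2015 08:15:02 UTD' | strip_label
```

Anchoring with ^ keeps the later colons in the timestamp untouched.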

Use cut to exclude the first space-separated field:
cut -d " " -f 2- file

sed 's/^[^ ]* *//' YourFile
This assumes every line starts with one of those four labels, as in your sample.

Put the pattern that matches What, Where, When and Who inside a capturing group followed by a colon, then replace the matched characters with an empty string. Add the i flag at the end if you want a case-insensitive match.
$ sed 's/^Wh\(ere\|en\|at\|o\):[[:blank:]]*//' file
Tester1+Password
Authentication Success
Tues March 20, 2015 08:15:02 UTD
198.192.1.2
For the general case, you could use
sed 's/^[^[:blank:]]\+[[:blank:]]\+//' file
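The alternation can also be written with extended regular expressions, which avoids the backslashes. This is only a sketch, assuming a sed that accepts -E (GNU and BSD sed both do); strip_known_label is a hypothetical name:

```shell
# strip_known_label: remove only the four known labels, leave other lines alone.
# (strip_known_label is a hypothetical name used only for this sketch.)
strip_known_label() {
  sed -E 's/^(Who|What|When|Where):[[:blank:]]*//' "$@"
}

printf '%s\n' 'Where: 198.192.1.2' 'Note: left alone' | strip_known_label
```

Unlike the general-case command, a line starting with any other label passes through unchanged.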

Related

I'm trying to find a string between two matching patterns and then add that string before a pattern using sed

Say I have the below line in file named "logs_test":
Sample input:
"at 10947 usecs after Tue Feb 23 18:29:46 2021 [119] init: Event=populatedonRestart"
I want to find the string between "at" and "usecs" and add it before "2021" in the above line
sample output:
"at 10947 usecs after Tue Feb 23 18:29:46 10947 2021 [119] init: Event=populatedonRestart"
sed command to find a string between two matching patterns:
sed "s/at//;s/usecs.*//" <file_name>
sed command to add a string before a pattern:
sed 's/2021/string &/g' <file_name>
How can I accomplish both tasks using one sed command? Is there a way to use the sed command inside sed to do this?
This will do it for your example (with GNU sed):
sed 's/^\(at \)\(.* \)\(usecs.*\)\(2021.*\)/\1\2\3\2\4/' your_file
The way it works is as follows:
sed remembers whatever is matched between \( and \) (these are called capture groups).
I break the string into 4 capture groups; the 2nd capture group is the
number you care about. I then put the groups back together, using the 2nd
capture group twice: /\1\2\3\2\4/
Once you've confirmed it does what you want, you could add the -i to do the
replacement in-place:
sed -i 's/^\(at \)\(.* \)\(usecs.*\)\(2021.*\)/\1\2\3\2\4/' your_file
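The same capture-group idea can be sketched without hard-coding 2021. This variant assumes the year is a standalone four-digit number surrounded by spaces and that the sed in use accepts -E; copy_usecs_before_year is a hypothetical helper name:

```shell
# copy_usecs_before_year: repeat the number after "at" in front of the year.
# (copy_usecs_before_year is a hypothetical name used only for this sketch.)
copy_usecs_before_year() {
  # \1 = "at ", \2 = the number, \3 = text up to the year, \4 = year and the rest.
  sed -E 's/^(at )([0-9]+)( usecs.* )([0-9]{4} .*)$/\1\2\3\2 \4/' "$@"
}

echo 'at 10947 usecs after Tue Feb 23 18:29:46 2021 [119] init: Event=populatedonRestart' |
  copy_usecs_before_year
```

Because .* is greedy, the last space-delimited four-digit number on the line is treated as the year, which matters if the log text can contain other such numbers.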

How to use "grep -v" or something similar in the entire text except for the first column?

I am trying to manipulate a file, let's say:
76ers23 Philadelphia 76ers announced today that
76ers24 Lakers announced today
76ers25 blazers plays today
76ers26 celics announced today that
76ers27 Bonston has Day off
76ers28 Philadelphia 76ers announced today that
76ers29 the blazzers announced today that
76ers30 76ers Training day
76ers31 Philadelphia 76ers has a day off today
76ers32 Philadelphia 76ers humiliate Lakers
76ers33 celics announced today that
I want to remove all the entries containing the term 76ers from the second column so as to obtain:
76ers24 Lakers announced today
76ers25 blazers plays today
76ers26 celics announced today that
76ers27 Bonston has Day off
76ers29 the blazzers announced today that
76ers33 celics announced today that
My issue is that if I use grep -v "76ers", it returns nothing, because every line contains 76ers in the first column.
I am looking to apply grep (or another command) to the second column onwards only.
I found this complicated way, which does pretty much what I want, but I get an _ at the beginning of the second column.
cat file|awk '{print $1}' >file1
cat file|awk '{$1="";print $0}'|tr -s ' ' | tr ' ' '_' >file2
paste file1 file2 |grep -v "_76ers"
I'm not a bash expert so I guess there will be an easier way for that.
Thank you in advance!
Use a regular expression that skips over the first column.
grep -v '^[^ ]* .*76ers' file
[^ ]* matches everything up to the first space.
using awk:
awk '{ found=0;for(i=2;i<=NF;i++) { if (match($i,"76ers")) { found=1 } } if (found==0) { print $0 } }' file
Loop from the second space-separated field to the last field and use match to check whether that field contains 76ers. If it does, set a found flag. After looping through all the fields of a line, print the line only if found is still 0.
You can create an Extended Regular Expression to ignore the first column. Not knowing exactly what your "flavor" of OS is, I'll give you two different formats.
grep -E is the same as egrep
[[:digit:]] is the same as [0-9]
[[:space:]] matches any whitespace character (space, tab, and so on)
First option: Look for 76ers with white space after it:
grep -Ev '76ers[[:space:]]' <file>
Second option: Look for 76ers, followed by one or more digits, then any characters, then a second 76ers:
grep -Ev '76ers[[:digit:]][[:digit:]]*.*76ers' <filename>
With GNU grep, requiring that the match is "whole word" with the -w/--word-regexp option:
grep -vw '76ers' infile
From the manual:
-w
--word-regexp
Select only those lines containing matches that form whole words. The
test is that the matching substring must either be at the beginning of
the line, or preceded by a non-word constituent character. Similarly,
it must be either at the end of the line or followed by a non-word
constituent character. Word constituent characters are letters,
digits, and the underscore. This option has no effect if -x is also
specified.
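To see why -w leaves the first column alone, here is a small sketch on three of the sample lines. In 76ers23 the match is followed by a digit, which is a word constituent, so it does not count as a whole word:

```shell
# Three lines taken from the sample input in the question.
result=$(printf '%s\n' \
  '76ers23 Philadelphia 76ers announced today that' \
  '76ers24 Lakers announced today' \
  '76ers30 76ers Training day' |
  grep -vw '76ers')

# Only the line without a standalone "76ers" word survives.
printf '%s\n' "$result"
```

This relies on the first column always having extra word characters attached to 76ers, as it does in the sample.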
Here is an alternative approach using awk. Similar to the idea of Balmer, strip the first column and test only the rest of the line against the ERE.
$ awk -v ere='76ers' '{rest=$0; sub(/^[^ ]+ +/, "", rest)} rest !~ ere' file
This prints every record/line whose text after the first column does not match the regular expression ere (rest !~ ere). Testing $1 directly would not help here, because the first column of every line contains 76ers.
$ grep -v ' .*76ers' file
76ers24 Lakers announced today
76ers25 blazers plays today
76ers26 celics announced today that
76ers27 Bonston has Day off
76ers29 the blazzers announced today that
76ers33 celics announced today that

Replacing characters in each line on a file in linux

I have a file with a different word on each line.
My goal is to change the first character to a capital letter and replace the 3rd character with "#".
For example: football should become Foo#ball.
I thought about using awk and sed. That didn't help me, since (to my knowledge) sed needs an exact character as input, and awk can print the desired character but not change it.
With GNU sed and two s commands:
echo 'football' | sed -E 's/(.)/\U\1/; s/(...)./\1#/'
Output:
Foo#ball
See: 3.3 The s Command, 5.7 Back-references and Subexpressions and 5.9.2 Upper/Lower case conversion
This might work for you (GNU sed):
sed 's/\(...\)./\u\1#/' file
With bash you can use parameter expansions alone to accomplish the task. For example, if you read each line into the variable line, you can do:
line="${line^}" # change football to Football (capitalize 1st char)
line="${line:0:3}#${line:4}" # make 4th character '#'
Example Input File
$ cat file
football
soccer
baseball
Example Use/Output
$ while read -r line; do line="${line^}"; echo "${line:0:3}#${line:4}"; done < file
Foo#ball
Soc#er
Bas#ball
While shell is typically slower, when use is limited to builtins, it doesn't fall too far behind.
(note: your question says 3rd character, but your example replaces the 4th character with '#')
With GNU awk for the 3rd arg to match():
$ echo 'football' | awk 'match($0,/(.)(..).(.*)/,a){$0=toupper(a[1]) a[2] "#" a[3]} 1'
Foo#ball
Cyrus' or Potong's answers are the preferred ones (for Linux or other systems with GNU sed, because of \U or \u).
This is just an additional solution with awk, because you mentioned it and also used the awk tag:
$ echo 'football'|awk '{a=substr($0,1,1);b=substr($0,2,2);c=substr($0,5);print toupper(a)b"#"c}'
Foo#ball
This is a very simple solution without regular expressions. It also works with non-GNU awk.
This should work with any version of awk:
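awk '{
  for (i = 1; i <= NF; i++) {
    # Note that string indexes start at 1 in awk!
    # Upper-case character 1, keep characters 2-3, and put "#" in place of
    # the 4th character, as in the Foo#ball example.
    $i = toupper(substr($i, 1, 1)) substr($i, 2, 2) "#" substr($i, 5)
  }
  print
}' file
Note: If a word is shorter than 4 characters, like it, the # is simply appended and it is printed as It#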
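If short words should not get a trailing #, the length can be checked first. A minimal sketch, assuming one word per line and following the question's Foo#ball example (the 4th character is replaced); capitalize_and_mask is a hypothetical name:

```shell
# capitalize_and_mask: upper-case the first character and replace the 4th
# character with "#", but only when the word has at least 4 characters.
# (capitalize_and_mask is a hypothetical name used only for this sketch.)
capitalize_and_mask() {
  awk '{
    word = toupper(substr($0, 1, 1)) substr($0, 2)
    if (length(word) >= 4)
      word = substr(word, 1, 3) "#" substr(word, 5)
    print word
  }' "$@"
}

printf '%s\n' football it | capitalize_and_mask
```

This uses only substr, toupper and length, so it should not depend on GNU awk.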
If your data is in a file named 'd' (tried with GNU sed):
sed -E 's/^(\w)(\w\w)\w/\U\1\E\2#/' d

Separate a text file with sed

I have the following sample file:
evtlog.161202.002609.debugevtlog.161201.162408.debugevtlog.161202.011046.debugevtlog.161202.002809.debugevtlog.161201.160035.debugevtlog.161201.155140.debugevtlog.161201.232156.debugevtlog.161201.145017.debugevtlog.161201.154816.debug
I want to separate the string and add a newline after matching "debug" like this:
evtlog.161202.002609.debug
evtlog.161201.162408.debug
So far I tried almost everything with sed, but it doesn't seem to do what I want.
sed 's/debug/{G}' latest_evtlogs.out
sed '/debug/i "SAD"' latest_evtlogs.out
etc...
sed 's/debug/\n/g' latest_evtlogs.out doesn't work when I add it as a pipe in the script, but it does when I run it manually.
Here's how I generate the file:
printf $(ls -l $EVTLOG_PATH/evtlog|tail -n 10|awk '{printf $8 , "%s\n\n"}'|sed 's/debug/\n/g') >> latest_evtlogs.out
Initially I wanted to just add newline with awk, but it doesn't work either.
Any ideas why I can't separate the string with a newline ?
I'm using :
Distributor ID: Debian
Description: Debian GNU/Linux 5.0.10 (lenny)
Release: 5.0.10
Codename: lenny
Just add a new line after debug:
sed 's/debug/&\n/g' file
Note & prints back the matched text, so it is a way to print "debug" back.
This returns:
evtlog.161202.002609.debug
evtlog.161201.162408.debug
evtlog.161202.011046.debug
evtlog.161202.002809.debug
evtlog.161201.160035.debug
evtlog.161201.155140.debug
evtlog.161201.232156.debug
evtlog.161201.145017.debug
evtlog.161201.154816.debug
The problem is that you are using the output of sed in an unquoted command substitution. In this context the shell splits the output into words at whitespace, including the newlines, so printf sees each line as a separate argument. It interprets the first one as the format argument and ignores the rest, because there are no printf placeholders in the format.
It should work if you drop the outer printf $() from your command and just redirect the output from your pipeline to your file:
ls -l $EVTLOG_PATH/evtlog|tail -n 10|awk '{printf $8 , "%s\n\n"}'|sed 's/debug/\n/g' >> latest_evtlogs.out
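The word-splitting effect can be reproduced in isolation. In this sketch the two file names are hypothetical stand-ins for the pipeline output:

```shell
# Hypothetical stand-in for the output of the ls | awk | sed pipeline.
lines=$(printf 'evtlog.1.debug\nevtlog.2.debug\n')

# Unquoted: the shell splits $lines into two words, so printf receives
# "evtlog.1.debug" as its format and "evtlog.2.debug" as an unused argument.
unquoted=$(printf $lines)

# Quoted, with an explicit format: the embedded newline survives.
quoted=$(printf '%s\n' "$lines")

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:\n%s\n' "$quoted"
```

Quoting the substitution and passing it as an argument to an explicit '%s\n' format is the usual way to keep line breaks intact.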
Maybe Perl is "happier" than sed on your system:
perl -pe 's/debug/&\n/g' < YourLogFile
G (get) appends the contents of the hold space to the pattern space (usually just the current line read from the input file), so it cannot be used here.
i (insert) prints the specified text to standard output before the current line, so it cannot be used either.
What you want is to replace every debug with debug^J, where ^J is a newline. Depending on the sed version, you can do:
sed 's/debug/&\n/g' input_file
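But \n in the replacement is, as far as I know, not strictly specified in POSIX sed. One can, however, use the shell's ANSI-C quoting ($'\n') to insert a literal newline, escaped with a backslash:
sed 's/debug/&\'$'\n''/g' input_file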
Or a multi line string:
sed 's/debug/&\
/g' input_file
Thank you all for the answers. I finally did it like this:
echo $(ls -l $EVTLOG_PATH/evtlog|tail -n 10|awk '{printf $8 , "%s\n\n"}'|sed 's/debug/&\n/g') > temp.out
sed 's/ /\n/g' /share/sqa/dumps/5314577631/checks/temp.out > latest_evtlogs.out
It's not at all elegant, but it finally works.

Search specific word and delete it from file in linux

I have a file (file1.txt) which contains below text:
mon
tue
tue_day
tuesday
wed
I want to search for the word "tue" and delete it from this file.
I used
sed -i "/tue/d" file1.txt
but it deletes all the lines containing tue, i.e. lines 2, 3 and 4. I want to delete only line 2, which contains exactly the text that I want to remove from the file.
Could you please suggest a way?
Just tell sed that you want lines that are exactly "tue". How? Prepending and appending ^ and $ to indicate beginning and end of line:
$ sed '/^tue$/d' file
mon
tue_day
tuesday
wed
To delete the line matching a value held in a variable, use double quotes like this:
var="tue"
sed -i "/^$var$/d" file
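Note that sed treats the variable's value as a regular expression, so a value such as a.c would also match abc. When the value should be taken literally, one option is grep with -x (whole line) and -F (fixed string). This is only a sketch; delete_exact_line is a hypothetical name, and grep writes to standard output rather than editing the file in place:

```shell
# delete_exact_line: print the file without lines that are exactly equal to $1.
# (delete_exact_line is a hypothetical name used only for this sketch.)
delete_exact_line() {
  # $1: literal text of the line to drop, $2: input file
  grep -vxF -- "$1" "$2"
}

tmp=$(mktemp)
printf '%s\n' mon tue tue_day tuesday wed > "$tmp"
delete_exact_line "tue" "$tmp"
rm -f "$tmp"
```

To update the original file, redirect the output to a temporary file and move it over the original afterwards.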
