Replacing characters in each line of a file in Linux

I have a file with a different word on each line.
My goal is to replace the first character with a capital letter and replace the 3rd character with "#".
For example: football should become Foo#ball.
I tried using awk and sed. They didn't help since (to my knowledge) sed needs an exact character as input, and awk can print the desired character but not change it.

With GNU sed and two s commands:
echo 'football' | sed -E 's/(.)/\U\1/; s/(...)./\1#/'
Output:
Foo#ball
See: 3.3 The s Command, 5.7 Back-references and Subexpressions and 5.9.2 Upper/Lower case conversion

This might work for you (GNU sed):
sed 's/\(...\)./\u\1#/' file

With bash you can use parameter expansions alone to accomplish the task. For example, if you read each line into the variable line, you can do:
line="${line^}" # change football to Football (capitalize 1st char)
line="${line:0:3}#${line:4}" # make 4th character '#'
Example Input File
$ cat file
football
soccer
baseball
Example Use/Output
$ while read -r line; do line="${line^}"; echo "${line:0:3}#${line:4}"; done < file
Foo#ball
Soc#er
Bas#ball
While shell is typically slower, when use is limited to builtins, it doesn't fall too far behind.
(note: your question says 3rd character, but your example replaces the 4th character with '#')

With GNU awk for the 3rd arg to match():
$ echo 'football' | awk 'match($0,/(.)(..).(.*)/,a){$0=toupper(a[1]) a[2] "#" a[3]} 1'
Foo#ball

Cyrus' or Potong's answers are the preferred ones. (For Linux or systems with GNU sed because of \U or \u.)
This is just an additional solution with awk, since you mentioned it and also used the awk tag:
$ echo 'football'|awk '{a=substr($0,1,1);b=substr($0,2,2);c=substr($0,5);print toupper(a)b"#"c}'
Foo#ball
This is a very simple solution without regex. It also works with non-GNU awk.

This should work with any version of awk:
awk '{
for(i=1;i<=NF;i++){
# Note that string indexes start at 1 in awk !
$i=toupper(substr($i,1,1)) substr($i,2,2) "#" substr($i,5)
}
print
}' file
Note: If a word is shorter than 4 characters, like it, the # is simply appended, so it will be printed as It#
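The short-word caveat can be avoided with a length check. Here is a minimal sketch, assuming the goal is the question's example output (Foo#ball, i.e. the 4th character replaced) and that words shorter than four characters should only be capitalized. The function name mask_words is made up for illustration:

```shell
# mask_words: hypothetical helper; reads lines on stdin.
# Capitalizes the first character of every whitespace-separated word and
# replaces the 4th character with "#" (assumption: this matches Foo#ball).
# Words shorter than 4 characters are only capitalized.
mask_words() {
  awk '{
    for (i = 1; i <= NF; i++) {
      w = toupper(substr($i, 1, 1)) substr($i, 2)
      if (length(w) >= 4)
        w = substr(w, 1, 3) "#" substr(w, 5)
      $i = w
    }
    print
  }'
}

printf 'football\nit\n' | mask_words
```

Note that assigning to $i makes awk rebuild the line with single spaces between words, which does not matter for one word per line.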

If your data is in a file named d, this was tested with GNU sed:
sed -E 's/^(\w)(\w\w)\w/\U\1\E\2#/' d

Related

Insert line number in a file

I would like to insert the line number at a specific location in each line of a file,
e.g.
apple
ball
should be
(1) apple
(2) ball
Using the command
sed '/./=' <FileName>| sed '/./N; s/\n/ /'
It generates
1 apple
2 ball
1st solution: This should be an easy task for awk.
awk '{print "("FNR") "$0}' Input_file
2nd solution: With pure sed as per OP's attempt try:
sed '=' Input_file | sed 'N; s/^/(/;s/\n/) /'
Easy to do with perl instead:
perl -ne 'print "($.) $_"' foo.txt
If you want to modify the file in-place instead of just printing out the numbered lines on standard output:
perl -ni -e 'print "($.) $_"' foo.txt
There are many ways to insert line numbers in a file. Some of them are:
1. Using the cat command
cat -n file.txt > newfile.txt
2. Using the nl command
nl -b a file.txt
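Neither command produces the (1) format from the question directly. Here is a minimal sketch of one way to get there with nl, assuming the -w (width) and -s (separator) options behave as in GNU coreutils. The function name paren_nl is made up for illustration:

```shell
# paren_nl: hypothetical helper; numbers every line read from stdin as "(N) text".
paren_nl() {
  # -b a : number all lines, including blank ones
  # -w 1 : minimum number width of 1, so no left padding
  # -s   : string placed between the number and the line
  # sed then adds the opening parenthesis at the start of each line.
  nl -b a -w 1 -s ') ' | sed 's/^/(/'
}

printf 'apple\nball\n' | paren_nl
```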
Awk and perl are both very useful and powerful. But if, like me, you are reluctant to learn yet another programming language, you can complete this task with the bash commands you probably already know.
With bash you can
increment a sequence number n: $((++n))
read all lines from a file foo into a variable l: while read -r l;do ...;done <foo, where the option -r serves to treat backslashes as just characters.
print formatted output to a line: printf "plain text %i %s\n" number string
Now suppose you want to enclose your sequence number in parentheses, and format them to 8 digits with leading zeroes, then you combine all this to get:
n=0;while read -r l;do printf "(%08i) %s\n" $((++n)) "$l";done <foo >numberedfoo
Note that you do not need to initialize the variable n to use it as a sequence number further on. But if you experiment with this command a few times without reinitializing n, your lines will be numbered from where your previous try stopped incrementing.
Finally, if you don't like the C-like formatting syntax of printf, just use plain echo, and leave the formatting to bash variable expansion. Here is how to format a number like in the command above (do type a space before the -, and a ; before the echo) :
nformat="0000000$n"; echo "(${nformat: -8}) ...";
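The loop above can be wrapped in a small function. This is a sketch under a few assumptions: leading whitespace should be preserved (hence IFS=), a final line without a trailing newline should still be numbered, and the plain (1) format from the question is wanted. The name number_lines is illustrative, and n=$((n + 1)) is used instead of $((++n)) only because it also works in shells other than bash:

```shell
# number_lines: hypothetical helper; reads stdin, writes "(N) line" to stdout.
# Note: n and l are ordinary (global) shell variables, as in the loop above.
number_lines() {
  n=0   # reset on every call so numbering always starts at 1
  # "|| [ -n "$l" ]" keeps a last line that lacks a trailing newline
  while IFS= read -r l || [ -n "$l" ]; do
    n=$((n + 1))
    printf '(%d) %s\n' "$n" "$l"
  done
}

printf 'apple\nball\n' | number_lines
```

Changing the format string to '(%08d) %s\n' gives the zero-padded variant.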

How To Delete All Words Before X Characters

I'm using code from this question, How To Delete All Words After X Characters, and I'm having trouble keeping (not deleting) all the words after 30 characters.
Original code:
awk 'BEGIN{FS=OFS="" } length>30{i=30; while($i~/\w/) i++; NF=i-1; }1'
My attempt:
awk 'BEGIN{FS=OFS="" } length>30{i=30; while($i~/\w/) i++; NF=i+1; }1'
Basically, I understand I need to change the NF which was NF=i-1 so I tried changing it to NF=i+1 but obviously I'm only getting one field. How can I specify NF to print the rest of the line?
Sample data:
StackOverflow Users Are Brilliant And Hard Working
#character 30 ---------------^
Desired output:
And Hard Working
If you could please help me keep the rest of the line by using NF, I would really appreciate your positive input and support.
It is much easier using gnu grep:
grep -oP '^.{30}\w*\W*\K.*' file
And Hard Working
Here \K is used to reset the matched information.
Regex breakdown:
^: Start
.{30}: Match first 30 characters
\w*: followed by 0 or more word characters
\W*: followed by 0 or more non-word characters
\K: reset matched information so far
.*: Match anything after this position
Using awk you can use this solution:
awk '{sub(/^.{30}[_[:alnum:]]*[[:blank:]]*/, "")} 1' file
And Hard Working
Finally a sed solution:
sed -E 's/^.{30}[_[:alnum:]]*[[:blank:]]*//' file
And Hard Working
Another awk solution:
awk '{print substr($0, index(substr($0,30),FS)+30)}' file
Find the index of the delimiter after the 30th character, then take a substring from that index on.
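The index arithmetic is easier to follow when split into named steps. This is a sketch, assuming a blank is the delimiter and that nothing should be printed when no blank follows character 30 (the one-liner would print from character 30 instead). The name tail_after_30 is illustrative:

```shell
# tail_after_30: hypothetical helper; reads lines on stdin.
tail_after_30() {
  awk '{
    rest = substr($0, 30)      # text from character 30 onward
    k = index(rest, " ")       # 1-based offset of the first blank in rest, 0 if none
    if (k == 0) next           # assumption: skip lines with no following word
    # The blank sits at position 29 + k of the line, so the kept text starts at 30 + k.
    print substr($0, 30 + k)
  }'
}

printf '%s\n' 'StackOverflow Users Are Brilliant And Hard Working' | tail_after_30
```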
I can't imagine why you're considering anything related to NF for this since you're not doing anything with fields, you're just splitting each line at a blank char. It sounds like this is all you need for both questions, using GNU awk for gensub():
$ awk '{print gensub(/(.{30}\S*)\s+(.*)/,"\\1",1)}' file
StackOverflow Users Are Brilliant
$ awk '{print gensub(/(.{30}\S*)\s+(.*)/,"\\2",1)}' file
And Hard Working
or it's briefer using GNU sed:
$ sed -E 's/(.{30}\S*)\s+(.*)/\1/' file
StackOverflow Users Are Brilliant
$ sed -E 's/(.{30}\S*)\s+(.*)/\2/' file
And Hard Working
With the use of NF, you can try:
awk '{a=0; b=""; for(i=1;i<=NF;i++){a+=length($i)+1; if(a>30){for(j=i+1;j<=NF;j++) b=b $j" "; print b; break}}}' file
cut -c30- file | cut -d' ' -f2-
This keeps only the words that start after the 30th character (index >= 31).

How to use grep and sed in order to replace the substring after searching some specific string?

I want to know how to use the two utilities grep and sed, or something else, to replace a substring. I will explain what I want to do below.
We have the file 'test.txt' with the following string:
A1='AA1', A2='AA2', A3='AA3', A4='AA4', A5{ATTR}='AA5', A6='keyword_A'
After searching for 'keyword_A' using grep, I want to replace the value of A5 with another string, for example, "NEW".
A1='AA1', A2='AA2', A3='AA3', A4='AA4', A5{ATTR}='NEW', A6='keyword_A'
I tried to use two commands like
grep keyword_A test.txt | sed -e 's/blabla/blabla/'
After trying everything I know, I gave up.
Please let me know the right solution.
First, you never need both grep and sed. Sed has a full regular-expression search engine, so it is a superset of grep. This command will read test.txt, change the lines that you've indicated, and print the entire result on standard output:
sed "/keyword_A/s/A5{ATTR}='[A-Z0-9]*'/A5{ATTR}='NEW'/g" < test.txt
If you want to store the results back into the file test.txt, use the -i (in-place editing) switch to sed:
sed -i.bak "/keyword_A/s/A5{ATTR}='[A-Z0-9]*'/A5{ATTR}='NEW'/g" test.txt
If you want to select only the indicated lines, modify those, and print only those lines to standard out, use a combination of the p (print) command and the -n (no output) switch.
sed -n "/keyword_A/s/A5{ATTR}='[A-Z0-9]*'/A5{ATTR}='NEW'/gp" test.txt
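To see that the /keyword_A/ address really limits the substitution to matching lines, here is a small sketch. The two-line sample input and the function name replace_a5 are made up for illustration:

```shell
# replace_a5: hypothetical wrapper around the substitution, reading stdin.
replace_a5() {
  # The address /keyword_A/ selects lines; s/// only runs on those lines.
  sed "/keyword_A/s/A5{ATTR}='[A-Z0-9]*'/A5{ATTR}='NEW'/"
}

# Only the first line contains keyword_A, so only its A5 value changes.
replace_a5 <<'EOF'
A5{ATTR}='AA5', A6='keyword_A'
A5{ATTR}='BB5', A6='keyword_B'
EOF
```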
Using grep+sed is always the wrong approach. Here's one way to do it with GNU awk:
$ awk '/keyword_A/{ $0=gensub(/(A5({[^}]+})?=\047)[^\047]+/,"\\1NEW",1) } 1' file
A1='AA1', A2='AA2', A3='AA3', A4='AA4', A5{ATTR}='NEW', A6='keyword_A'
Using a couple variables you could define the keyword and replacement ( if they change at all ):
q="keyword_A"
r="NEW"
Then with sed:
sed -r "s/^(.+\{.+\}=')(.+)('.+${q}.+)$/\1${r}\3/" file
Result:
A1='AA1', A2='AA2', A3='AA3', A4='AA4', A5{ATTR}='NEW', A6='keyword_A'
A5="NEW"
A6="keyword_A"
# with sed
sed "s/='[^']*\(',[[:blank:]]*A6='${A6}'\)/='${A5}\1/" YourFile
# with awk
awk -F "'" -v A5="${A5}" -v A6="${A6}" '
BEGIN { OFS="\047" }
$12 == A6 { $10 = A5; $0 = $0 }
7
' YourFile
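The sed version anchors the change on the end of the string. The awk version uses ' as the field separator instead of the traditional whitespace, and the bare 7 is just a true pattern that prints every line, like the more common 1.
The awk version assumes there is no ' inside a value (otherwise the escaping would need to be handled).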
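To see where $10 and $12 come from, it can help to print every field that the ' separator produces. A small sketch using the sample line from the question; show_fields is an illustrative name:

```shell
# show_fields: hypothetical helper; prints each '-separated field with its number.
show_fields() {
  awk -F "'" '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

printf '%s\n' "A1='AA1', A2='AA2', A3='AA3', A4='AA4', A5{ATTR}='AA5', A6='keyword_A'" | show_fields
```

The values land in the even-numbered fields, so the A5 value is field 10 and the A6 value is field 12.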
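We can directly replace the fifth comma-separated column when the string keyword_A is found, as shown below (\047 is the octal escape for a single quote, and the leading space keeps the original spacing after the comma):
awk -F, 'BEGIN{OFS=","} /keyword_A/{$5=" A5{ATTR}=\047NEW\047"} 1' filename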
Couple of slight alternatives:
sed -r "/keyword_A/s/(A5[^']*')[^']*/\1NEW/"
awk -F"'" '/keyword_A/{$10 = "NEW"}1' OFS="'"
Of course, the downside of awk is that it writes to standard output, so you would have to redirect to a new file and rename it afterwards.
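A sketch of that write-back step, assuming it is acceptable to go through a temporary file. edit_in_place is a hypothetical helper name, and copying the result back with cat (rather than mv) is a design choice here so the original file keeps its permissions:

```shell
# edit_in_place: hypothetical helper; $1 is the file to modify.
edit_in_place() {
  tmp=$(mktemp) || return 1
  # Run the awk edit into the temporary file first, so a failure
  # leaves the original file untouched.
  if awk -F "'" -v OFS="'" '/keyword_A/ { $10 = "NEW" } 1' "$1" > "$tmp"; then
    cat "$tmp" > "$1"
    rm -f "$tmp"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

It would be called as edit_in_place test.txt.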

Separate a text file with sed

I have the following sample file:
evtlog.161202.002609.debugevtlog.161201.162408.debugevtlog.161202.011046.debugevtlog.161202.002809.debugevtlog.161201.160035.debugevtlog.161201.155140.debugevtlog.161201.232156.debugevtlog.161201.145017.debugevtlog.161201.154816.debug
I want to separate the string and add a newline after matching "debug" like this:
evtlog.161202.002609.debug
evtlog.161201.162408.debug
So far I tried almost everything with sed, but it doesn't seem to do what I want.
sed 's/debug/{G}' latest_evtlogs.out
sed '/debug/i "SAD"' latest_evtlogs.out
etc...
sed 's/debug/\n/g' latest_evtlogs.out doesn't work when I add it as a pipe in the script , but it does when I run it manually.
Here's how I generate the file:
printf $(ls -l $EVTLOG_PATH/evtlog|tail -n 10|awk '{printf $8 , "%s\n\n"}'|sed 's/debug/\n/g') >> latest_evtlogs.out
Initially I wanted to just add newline with awk, but it doesn't work either.
Any ideas why I can't separate the string with a newline?
I'm using:
Distributor ID: Debian
Description: Debian GNU/Linux 5.0.10 (lenny)
Release: 5.0.10
Codename: lenny
Just add a new line after debug:
sed 's/debug/&\n/g' file
Note & prints back the matched text, so it is a way to print "debug" back.
This returns:
evtlog.161202.002609.debug
evtlog.161201.162408.debug
evtlog.161202.011046.debug
evtlog.161202.002809.debug
evtlog.161201.160035.debug
evtlog.161201.155140.debug
evtlog.161201.232156.debug
evtlog.161201.145017.debug
evtlog.161201.154816.debug
The problem is that you are using the output of sed in an unquoted command substitution. In this context your shell performs word splitting on the output, with newlines acting as separators just like spaces, so printf sees each line as a separate argument, interpreting the first line as the format argument and ignoring the rest, as there are no printf placeholders in the format.
It should work if you drop the outer printf $() from your command and just redirect the output from your pipeline to your file:
ls -l $EVTLOG_PATH/evtlog|tail -n 10|awk '{printf $8 , "%s\n\n"}'|sed 's/debug/\n/g' >> latest_evtlogs.out
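The word splitting can be observed directly by counting the arguments a command receives. This is a minimal sketch with made-up sample data; count_args is an illustrative helper:

```shell
# count_args: hypothetical helper; prints how many arguments it received.
count_args() { echo "$#"; }

# Two lines of sample output captured by command substitution.
out=$(printf 'one.debug\ntwo.debug\n')

count_args $out      # unquoted: split at the newline into separate words
count_args "$out"    # quoted: passed as a single argument, newline intact
```

The first call reports 2 arguments and the second reports 1, which is why quoting the substitution (or dropping it entirely) keeps the newlines.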
Maybe Perl is "happier" than sed on your system:
perl -pe 's/debug/&\n/g' < YourLogFile
G (get) appends the contents of the hold space to the pattern space (usually just the current line read from the input file), so it cannot be used here.
i (insert) prints the specified text to standard output before the current line, so it cannot be used either.
What you want is to replace every debug with debug^J, where ^J is a newline. Depending on the sed version, you can either do:
sed 's/debug/&\n/g' input_file
But \n in the replacement is, as far as I know, not strictly specified in POSIX sed. One can however use ANSI-C quoting in the shell (note the backslash before the newline, which sed requires):
sed 's/debug/&\'$'\n''/g' input_file
Or a multi-line string:
sed 's/debug/&\
/g' input_file
Thank you all for the answers. I finally did it like this:
echo $(ls -l $EVTLOG_PATH/evtlog|tail -n 10|awk '{printf $8 , "%s\n\n"}'|sed 's/debug/&\n/g') > temp.out
sed 's/ /\n/g' /share/sqa/dumps/5314577631/checks/temp.out > latest_evtlogs.out
It's not at all elegant, but it finally works.

Pick a specific value in a program output (Bash)

I'm running LIBSVM in a Linux terminal, called from a C program. I need to pick a value out of the output, but the format is the following:
Accuracy = 80% (24/30) (classification)
I need to pick only the "80" value as an integer. I tried with sed and came to this command:
sed 's/[^0-9^'%']//g' 'f' >> f
This is filtering all integers in the output and, thus, isn't working yet, so I need help. Thanks in advance
Try grep in PCRE mode (-P), printing only the matched parts (-o), with a lookahead assertion:
$ echo "Accuracy = 80% (24/30) (classification)" | grep -Po '[0-9]+(?=%)'
80
The regexp:
[0-9] # match a digit
+ # one or more times
(?=%) # assert that the digits are followed by a %
It is trivial with awk. Identify the column you need and strip the '%' sign from it. The /^Accuracy/ regex ensures that the action is only performed on the lines starting with Accuracy. You don't need it if your file only contains one line.
awk '/^Accuracy/{sub(/%/,"");print $3}' inputFile
Alternatively, you can set space and % as field separators and do
awk -F'[ %]' '/^Accuracy/{print $3}' inputFile
If you want to do it with sed then you can try something like:
sed '/^Accuracy/s/.* \(.*\)%.*/\1/' inputFile
This might work for you (GNU sed):
sed -nr '/^Accuracy = ([^%]*)%.*/s//\1/p' file
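To use the extracted value as an integer in a script, it can be captured into a variable. This is a sketch using only POSIX sed syntax, assuming the percentage is a whole number as in the sample line. The function name get_accuracy and the threshold 75 are made up for illustration:

```shell
# get_accuracy: hypothetical helper; prints the integer before "%" on an
# "Accuracy = ..." line read from stdin, and nothing for other lines.
get_accuracy() {
  sed -n 's/^Accuracy = \([0-9][0-9]*\)%.*/\1/p'
}

acc=$(echo 'Accuracy = 80% (24/30) (classification)' | get_accuracy)

# Guard against an empty result before the numeric comparison.
if [ -n "$acc" ] && [ "$acc" -ge 75 ]; then
  echo "accuracy $acc is at least 75"
fi
```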

Resources