How to multiply a number by 2 (double) present at a particular line number of a file, in Linux? - linux

File_A
Name: John Smith
Grade: 8
Institute: Baldwin
Number of entries: 125
State: Texas
File_B
Name: David Buck
Grade: 9
Institute: High Prime
Number of entries: 123
State: California
There are many similar files in which the Number of entries (present at line number 4 in all files) has to be doubled.
For File_A it should be 250 and for File_B 246.
How can I do this for all files in Linux (using sed, awk or any other commands)?
Tried commands:
sed -i '4s/$2/$2*2/g' *.txt (nothing happening from this)
awk "FNR==4 {sub($2,$2*2)}" *.txt (getting syntax error)

With your shown samples, please try the following awk code. A simple explanation: search for the string Number of entries:, multiply the value of the last field by 2 and store it back in that field, then print every line by mentioning 1.
awk '/Number of entries:/{$NF = ($NF * 2)} 1' File_A File_B
Running the above command prints the output on screen. Once you are happy with the output and want to save it into the input file itself, you can try GNU awk's -i inplace option (available in GNU awk 4.1 and later).
Also, if your files have the .txt extension, pass them all (for example *.txt) to the above awk program; awk can read multiple files itself.

This might work for you (GNU sed and shell):
sed -Ei '4s/(.* )(.*)/echo "\1$((\2*2))"/e' file1 file2 filen
For line four of each file input, split the values into two back references and echo back those values using shell arithmetic to double the second value.
N.B. The -i option allows for address of line four to be found in all input files and those files to be amended in situ.

Using sed
$ sed '/^Number of entries/s/[[:digit:]]\+/$((&*2))/;s/^/echo /e' input_file

I want to explain why what you have tried failed, firstly
sed -i '4s/$2/$2*2/g' *.txt
In this position $ has no special meaning for GNU sed; it is a literal dollar sign. GNU sed also does not support arithmetic, so the above command means: at the 4th line, replace a dollar sign followed by 2 with a dollar sign followed by 2, followed by an asterisk, followed by 2, and do so globally. There is no literal $2 on the 4th line of the file, so nothing happens.
awk "FNR==4 {sub($2,$2*2)}" *.txt
You should not use " for enclosing awk command unless you want to summon mind-boggling bugs. You should use ' in which case syntax error will be gone, however behavior will be not as desired. In order to do that your code might be reworked to
awk 'BEGIN{FS=OFS=": "}FNR==4{$2*=2}{print}' *.txt
Observe that I set FS and OFS to tell GNU AWK that fields are, and should remain, separated by ": " rather than by one or more whitespace characters (the default). I do not use the sub function (which works with regular expressions) but simply multiply the field by 2 with the *= operator, and I also print every line, since without that the output would be empty. If you want to know more about FS or OFS, read 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR
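To see why the double-quoted attempt raised a syntax error, it can help to print the program text that awk would actually receive. This is only a small sketch, assuming a shell with no positional parameters set (as in a typical interactive session); the variable names dq and sq are made up for the illustration.

```shell
# Clear the positional parameters so that $2 is empty, as it usually is
# at an interactive prompt.
set --

# Inside double quotes the shell expands $2 before awk ever runs.
dq="FNR==4 {sub($2,$2*2)}"
# Inside single quotes the text is passed through untouched.
sq='FNR==4 {sub($2,$2*2)}'

printf 'double-quoted: %s\n' "$dq"   # double-quoted: FNR==4 {sub(,*2)}
printf 'single-quoted: %s\n' "$sq"   # single-quoted: FNR==4 {sub($2,$2*2)}
```

The first line shows sub(,*2), which is not valid awk, hence the syntax error.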

mawk 'BEGIN{ _+=_^=FS=OFS="Number of entries: " } NF<_ || $NF *=_'
Name: John Smith
Grade: 8
Institute: Baldwin
Number of entries: 250
State: Texas
Name: David Buck
Grade: 9
Institute: High Prime
Number of entries: 246
State: California

Thank you for all the answers.
With your help I was able to figure out a simple solution by understanding and combining your answers.
Here it is (which worked in my environment):
To display on terminal:
awk 'FNR==4 {sub($4,$4*2)} 1' File_A
To move to some file:
awk 'FNR==4 {sub($4,$4*2)} 1' File_A > temp_A
To perform changes inside file using inplace:
awk -i inplace 'FNR==4 {sub($4,$4*2)} 1' *.txt
$4 being 4th parameter in the line;
FNR==4 being the line number 4;
1 at the end helps in printing everything
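For systems where GNU awk's inplace extension is not available, a temporary file can do the same job. The following is a minimal sketch, not a drop-in replacement: double_line4 is a made-up helper name, and it assigns to $NF (as in the first answer) instead of calling sub(), so the number is never interpreted as a regular expression.

```shell
# double_line4 FILE...: double the last field on line 4 of each file.
# Hypothetical helper; it writes through a temp file, so plain awk is enough.
double_line4() {
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        # FNR == 4 selects line 4; assigning $NF rebuilds the line
        # with single spaces between fields. The trailing 1 prints every line.
        if awk 'FNR == 4 { $NF = $NF * 2 } 1' "$f" > "$tmp"; then
            cat "$tmp" > "$f"   # overwrite the contents, keep the file itself
        fi
        rm -f "$tmp"
    done
}

# Demo on a throwaway copy of File_A from the question.
demo=$(mktemp)
printf '%s\n' 'Name: John Smith' 'Grade: 8' 'Institute: Baldwin' \
    'Number of entries: 125' 'State: Texas' > "$demo"
double_line4 "$demo"
sed -n '4p' "$demo"   # Number of entries: 250
rm -f "$demo"
```

Calling double_line4 *.txt would then cover all files at once; trying it on copies first seems prudent.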

Related

Replacing characters in each line on a file in linux

I have a file with different word in each line.
My goal is to replace the first character to a capital letter and replace the 3rd character to "#".
For example: football will be exchanged to Foo#ball.
I thought about using awk and sed. It didn't help me since (to my knowledge) sed needs an exact character input and awk can print the desired character but not change it.
With GNU sed and two s commands:
echo 'football' | sed -E 's/(.)/\U\1/; s/(...)./\1#/'
Output:
Foo#ball
See: 3.3 The s Command, 5.7 Back-references and Subexpressions and 5.9.2 Upper/Lower case conversion
This might work for you (GNU sed):
sed 's/\(...\)./\u\1#/' file
With bash you can use parameter expansions alone to accomplish the task. For example, if you read each line into the variable line, you can do:
line="${line^}" # change football to Football (capitalize 1st char)
line="${line:0:3}#${line:4}" # make 4th character '#'
Example Input File
$ cat file
football
soccer
baseball
Example Use/Output
$ while read -r line; do line="${line^}"; echo "${line:0:3}#${line:4}"; done < file
Foo#ball
Soc#er
Bas#ball
While shell is typically slower, when use is limited to builtins, it doesn't fall too far behind.
(note: your question says 3rd character, but your example replaces the 4th character with '#')
With GNU awk for the 3rd arg to match():
$ echo 'football' | awk 'match($0,/(.)(..).(.*)/,a){$0=toupper(a[1]) a[2] "#" a[3]} 1'
Foo#ball
Cyrus' or Potong's answers are the preferred ones. (For Linux or systems with GNU sed because of \U or \u.)
This is just an additional solution with awk because you mentioned it and used also awk tag:
$ echo 'football'|awk '{a=substr($0,1,1);b=substr($0,2,2);c=substr($0,5);print toupper(a)b"#"c}'
Foo#ball
This is a very simple solution without regular expressions. It will also work on non-GNU awk.
This should work with any version of awk:
awk '{
for(i=1;i<=NF;i++){
# Note that string indexes start at 1 in awk !
$i=toupper(substr($i,1,1)) substr($i,2,1) "#" substr($i,4)
}
print
}' file
Note: If a word is less than 3 characters long, like it, it will be printed as It#
If your data is in a file named d, this was tried with GNU sed:
sed -E 's/^(\w)(\w\w)\w/\U\1\E\2#/' d
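Since the question says "3rd character" while its example replaces the 4th, one option is to make the position a parameter. This is only a sketch under that assumption; cap_and_mask is a made-up name, and it relies solely on substr, length and toupper, which are available in any POSIX awk.

```shell
# cap_and_mask POS: capitalize the first character of every input line and
# replace the character at 1-based position POS with "#".
cap_and_mask() {
    awk -v p="$1" '{
        # Only mask when the line is long enough to have a character at p.
        if (length($0) >= p)
            $0 = substr($0, 1, p - 1) "#" substr($0, p + 1)
        print toupper(substr($0, 1, 1)) substr($0, 2)
    }'
}

printf '%s\n' football | cap_and_mask 4   # Foo#ball
printf '%s\n' football | cap_and_mask 3   # Fo#tball
```

Lines shorter than POS are only capitalized, so a short word such as it comes out as It rather than gaining a stray #.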

Change some field separators in awk

I have a input file
1.txt
joshwin_xc8#yahoo.com:1802752:2222:
ihearttofurkey#yahoo.com:1802756:111113
www.rothmany#mail.com:xxmyaduh:13#;:3A
and I want an output file:
out.txt
joshwin_xc8#yahoo.com||o||1802752||o||2222:
ihearttofurkey#yahoo.com||o||1802756||o||111113
www.rothmany#mail.com||o||xxmyaduh||o||13#;:3A
I want to replace the first two ':' in 1.txt with '||o||'. This is the script I am using:
awk -F: '{print $1,$2,$3}' OFS="||o||" 3.txt
But it is not giving the expected output.
Any help would be highly appreciated.
Perl solution:
perl -pe 's/:/||o||/ for $_, $_' 1.txt
-p reads the input line by line and prints each line after processing it
s/// is similar to substitution you might know from sed
for in postposition runs the previous command for every element in the following list
$_ keeps the line being processed
For higher numbers, you can use for ($_) x N where N is the number. For example, to substitute the first 7 occurrences:
perl -pe 's/:/||o||/ for ($_) x 7' 1.txt
Following sed may also help you in same.
sed 's/:/||o||/;s/:/||o||/' Input_file
Explanation: the first s command substitutes the first colon with ||o||. The colon that was originally second is now the first one remaining on the line, so the second s command substitutes it with ||o|| as well, as per the OP's requirement.
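The same "first remaining occurrence" idea can be generalized to N occurrences with a loop around awk's sub(). A minimal sketch, assuming the count is passed as the first argument; replace_first_n is a made-up helper name.

```shell
# replace_first_n N [FILE...]: replace the first N colons on each line with ||o||.
# Reads standard input when no file is given.
replace_first_n() {
    n=$1; shift
    awk -v n="$n" '{
        # Each sub() call replaces the first colon still present on the line;
        # stop early when no colon is left.
        for (i = 0; i < n; i++)
            if (!sub(/:/, "||o||")) break
        print
    }' "$@"
}

printf '%s\n' 'www.rothmany#mail.com:xxmyaduh:13#;:3A' | replace_first_n 2
# www.rothmany#mail.com||o||xxmyaduh||o||13#;:3A
```

Unlike the sed -i loop approach, this reads the file once and leaves the original untouched unless the output is redirected.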
Perl solution also, but I think the idea can apply to other languages: using the limit parameter of split:
perl -nE 'print join q(||o||), split q(:), $_, 3' file
(q quotes because I'm on Windows)
To replace the first 2 occurrences of :, use the code below.
You can adapt it to your requirement: for example, to change the first 7 occurrences, change {1..2} to {1..7}.
The output is saved in the original file; it is not displayed.
for i in {1..2}
do
sed -i "s/:/||o||/1" p.txt
done

How To Delete All Words Before X Characters

I'm using code from this question, How To Delete All Words After X Characters, and I'm having trouble keeping (not deleting) all the words after 30 characters.
Original code:
awk 'BEGIN{FS=OFS="" } length>30{i=30; while($i~/\w/) i++; NF=i-1; }1'
My attempt:
awk 'BEGIN{FS=OFS="" } length>30{i=30; while($i~/\w/) i++; NF=i+1; }1'
Basically, I understand I need to change the NF which was NF=i-1 so I tried changing it to NF=i+1 but obviously I'm only getting one field. How can I specify NF to print the rest of the line?
Sample data:
StackOverflow Users Are Brilliant And Hard Working
#character 30 ---------------^
Desired output:
And Hard Working
If you could please help me keep the rest of the line by using NF, I would really appreciate your positive input and support.
It is much easier using gnu grep:
grep -oP '^.{30}\w*\W*\K.*' file
And Hard Working
Here \K is used to reset the matched information.
RegEx Breakup:
^: Start
.{30}: Match first 30 characters
\w*: followed by 0 or more word characters
\W*: followed by 0 or more non-word characters
\K: reset matched information so far
.*: Match anything after this position
Using awk you can use this solution:
awk '{sub(/^.{30}[_[:alnum:]]*[[:blank:]]*/, "")} 1' file
And Hard Working
Finally a sed solution:
sed -E 's/^.{30}[_[:alnum:]]*[[:blank:]]*//' file
And Hard Working
another awk
awk '{print substr($0, index(substr($0,30),FS)+30)}'
find the delimiter index after the 30th char, take a substring from that index on.
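The substr/index approach can be spelled out step by step on the sample line. This sketch uses a literal blank instead of FS and wraps the one-liner in a made-up function name, after_col30; the intermediate values in the comments are for the sample line from the question.

```shell
# after_col30 [FILE...]: print each line from the first word that starts
# after character 30.
after_col30() {
    awk '{
        tail = substr($0, 30)        # sample line: "iant And Hard Working"
        pos  = index(tail, " ")      # first blank inside the tail: 5
        print substr($0, pos + 30)   # start just past that blank: column 35
    }' "$@"
}

printf '%s\n' 'StackOverflow Users Are Brilliant And Hard Working' | after_col30
# And Hard Working
```

If there is no blank after character 30, index() returns 0 and the remainder of the current word is printed, which may or may not be what is wanted.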
I can't imagine why you're considering anything related to NF for this since you're not doing anything with fields, you're just splitting each line at a blank char. It sounds like this is all you need for both questions, using GNU awk for gensub():
$ awk '{print gensub(/(.{30}\S*)\s+(.*)/,"\\1",1)}' file
StackOverflow Users Are Brilliant
$ awk '{print gensub(/(.{30}\S*)\s+(.*)/,"\\2",1)}' file
And Hard Working
or it's briefer using GNU sed:
$ sed -E 's/(.{30}\S*)\s+(.*)/\1/' file
StackOverflow Users Are Brilliant
$ sed -E 's/(.{30}\S*)\s+(.*)/\2/' file
And Hard Working
With the use of NF, you can try
awk '{for(i=1;i<=NF;i++){a+=length($i)+1;if(a>30){for(j=i+1;j<=NF;j++)b=b $j" ";print b;exit}}}'
cut -c30- file | cut -d' ' -f2-
this will keep only the words that start after 30th character (index >= 31)

Using Sed or Awk to divide a file into two based on whether a line contains a numeric value

I have used sed and awk for little while now, but I am having a challenge with the below problem. I am asking for an experienced sed/awk guru to help.
I have a file where some lines have numbers and some lines do not, like:
afjjdjfj.uihuihi
trfg.rtyhd
0rtgfd.tjbghhh
hbvfd4.rtgbvdgf
00fhfg.fdrgf
rtygfd.ijhniuh
etc.
I would like to have exactly two files out of this one, where every line is represented in one of the two files (none are deleted).
One containing all lines with any numbers 0-9 on them so given above file result would be:
0rtgfd.tjbghhh
hbvfd4.rtgbvdgf
00fhfg.fdrgf
and another file containing the rest of the lines that do not have any numbers 0-9 on them, so given the above, file it would be:
afjjdjfj.uihuihi
trfg.rtyhd
rtygfd.ijhniuh
I've tried different strategies in both sed and awk and nothing is giving me exactly what I need.
What would be the best sed or awk one liner to solve this problem?
Thank you for your time,
Tom
Easily with Awk:
awk '/[0-9]/{print > "file1"; next} {print > "file2"}' inputfile
With single GNU sed command:
sed -ne '/[0-9]/w with_digits.txt' -e '//!w no_digits.txt' input
Results:
> cat no_digits.txt
afjjdjfj.uihuihi
trfg.rtyhd
rtygfd.ijhniuh
> cat with_digits.txt
0rtgfd.tjbghhh
hbvfd4.rtgbvdgf
00fhfg.fdrgf
w filename Write the pattern space to filename.
If you don't mind running twice over the input, you can use just grep:
grep '[0-9]' input > with_digits
grep -v '[0-9]' input > without_digits
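If reading the input twice is a concern, awk can pick the output file per line in a single pass. A minimal sketch, assuming the two output names are passed as arguments; split_by_digits and the awk variables hit and miss are made-up names.

```shell
# split_by_digits INPUT WITH_DIGITS_OUT NO_DIGITS_OUT
# Every input line lands in exactly one of the two output files.
split_by_digits() {
    awk -v hit="$2" -v miss="$3" '{
        # The parentheses around the file expression keep this portable
        # across awk implementations.
        print > (/[0-9]/ ? hit : miss)
    }' "$1"
}

# Demo in a throwaway directory with a few lines from the question.
work=$(mktemp -d)
printf '%s\n' afjjdjfj.uihuihi 0rtgfd.tjbghhh hbvfd4.rtgbvdgf rtygfd.ijhniuh > "$work/input"
split_by_digits "$work/input" "$work/with_digits.txt" "$work/no_digits.txt"
cat "$work/with_digits.txt"
rm -rf "$work"
```

Note that an output file is only created when at least one line is written to it.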
perl -MFile::Slurp -lpe '/\d/ ? append_file("digits.txt",$_) : append_file("no_digits.txt",$_)' input.txt

Use tr to replace single new lines but not multiple new lines

Hi I have a file with data in the following format:
262353824192
Motley Crue Too Fast For Love Vinyl LP Leathur Records LR123 rare 3rd pressing
http://www.ebay.co.uk/itm/Motley-Crue-Too-Fast-Love-Vinyl-LP-Leathur-Records-LR123-rare-3rd-pressing-/262353824192
301870324112
TRAFFIC Same UK 1st press vinyl LP in gatefold / booklet sleeve Island pink eye
http://www.ebay.co.uk/itm/TRAFFIC-Same-UK-1st-press-vinyl-LP-gatefold-booklet-sleeve-Island-pink-eye-/301870324112
141948187203
NOW That's What I Call Music LP'S Joblot 2-14 MINT CONDITION Vinyl
http://www.ebay.co.uk/itm/NOW-Thats-Call-Music-LPS-Joblot-2-14-MINT-CONDITION-Vinyl-/141948187203
I would like to replace the single new lines with a pipe, but leave the double new lines as they are. I have tried:
tr '\n' '|' < text.txt
But this replaces all new lines with | so the separate products are no longer on different lines. I basically want a | delimiter between the product number, title and url, but each separate product on a different line. How can I achieve this?
Use tr and a little bit of sed:
tr "\n" "|" < text.txt | sed 's/||\+/\n/g'
You could use awk to do this:
awk '/^$/ { print } /./ { printf("%s|", $0) } END { print "" }' text.txt
This finds any blank line and prints it as-is. If it finds any value on the line, it uses printf and sticks a pipe after it. At the end of processing it prints a newline character to finish up.
This has already been partially answered HERE, but not completely.
I would add an additional transform to change double newlines to some character (hash in this case), then replace the hashes with a newline (or two if you want to go back to the original formatting of those) after changing the single newlines to be pipes.
sed -e ':a' -e 'N' -e '$!ba' -e 's/\n\n/#/g' -e 's/\n/|/g' -e 's/#/\n/g'
This gives the output:
262353824192|Motley Crue Too Fast For Love Vinyl LP Leathur Records LR123 rare 3rd pressing|http://www.ebay.co.uk/itm/Motley-Crue-Too-Fast-Love-Vinyl-LP-Leathur-Records-LR123-rare-3rd-pressing-/262353824192
301870324112|TRAFFIC Same UK 1st press vinyl LP in gatefold / booklet sleeve Island pink eye|http://www.ebay.co.uk/itm/TRAFFIC-Same-UK-1st-press-vinyl-LP-gatefold-booklet-sleeve-Island-pink-eye-/301870324112
141948187203|NOW That's What I Call Music LP'S Joblot 2-14 MINT CONDITION Vinyl|http://www.ebay.co.uk/itm/NOW-Thats-Call-Music-LPS-Joblot-2-14-MINT-CONDITION-Vinyl-/141948187203
awk to the rescue!
awk -F'\n' -v RS= -v OFS='|' '{$1=$1;printf "%s", $0 RT}' file
this preserves spacing between paragraphs, 3 lines as in the original file.
I made a very specific solution to your problem with awk (specific because it assumes you always have the same number of new lines between the groups of records).
awk 'BEGIN {RS="\n\n\n"; FS="\n"; OFS="|"} {print $1,$2,$3}' < text.txt
It sets the record separator to 3 newlines, the field separator to one newline, and the output field separator to a pipe. Then for each record (every block separated by 3 newlines), it prints the first 3 fields (which are separated by one newline), and on output it separates them with a pipe.
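When the number of blank lines between products is not fixed, awk's paragraph mode (an empty RS) may be a more forgiving variant of the same idea, and it is part of POSIX awk. This is a sketch with placeholder data, not the original listings; join_products is a made-up name.

```shell
# join_products [FILE...]: one output line per blank-line-separated block,
# with the lines of each block joined by "|".
join_products() {
    awk 'BEGIN { RS = ""; FS = "\n"; OFS = "|" }
         { $1 = $1; print }   # $1 = $1 forces the record to be rebuilt with OFS
    ' "$@"
}

# Placeholder records separated by two blank lines.
printf '%s\n' 111 'First title' 'http://example.invalid/1' '' '' \
    222 'Second title' 'http://example.invalid/2' | join_products
# 111|First title|http://example.invalid/1
# 222|Second title|http://example.invalid/2
```

Any run of blank lines counts as one separator here, so the output has one product per line with no empty lines in between.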
Just use sed:
sergey#x50n:~> cat in.txt | tr '\n' '|' | sed -e 's/||\+/\n\n/g; s/|$/\n/'
262353824192|Motley Crue Too Fast For Love Vinyl LP Leathur Records LR123 rare 3rd pressing|http://www.ebay.co.uk/itm/Motley-Crue-Too-Fast-Love-Vinyl-LP-Leathur-Records-LR123-rare-3rd-pressing-/262353824192
301870324112|TRAFFIC Same UK 1st press vinyl LP in gatefold / booklet sleeve Island pink eye|http://www.ebay.co.uk/itm/TRAFFIC-Same-UK-1st-press-vinyl-LP-gatefold-booklet-sleeve-Island-pink-eye-/301870324112
141948187203|NOW That's What I Call Music LP'S Joblot 2-14 MINT CONDITION Vinyl|http://www.ebay.co.uk/itm/NOW-Thats-Call-Music-LPS-Joblot-2-14-MINT-CONDITION-Vinyl-/141948187203
First we replace all newlines with a pipe using tr as in your example.
Then the first expression in the sed command (i.e. s/||\+/\n\n/g;) replaces every run of two or more pipes with two newlines. You may also replace them with a single newline if you do not want blank lines between the lines of output. The second expression replaces the trailing pipe with a newline to produce more readable output (or a more "conventional" line ending at the end of the file).
Also note that \+ in sed regex is a GNU extension. Thus if you are using non-GNU implementation of sed (FreeBSD, AIX or so), use standard syntax: |||* instead of ||\+.
