Change some field separators in awk - linux

I have an input file
1.txt
joshwin_xc8#yahoo.com:1802752:2222:
ihearttofurkey#yahoo.com:1802756:111113
www.rothmany#mail.com:xxmyaduh:13#;:3A
and I want an output file:
out.txt
joshwin_xc8#yahoo.com||o||1802752||o||2222:
ihearttofurkey#yahoo.com||o||1802756||o||111113
www.rothmany#mail.com||o||xxmyaduh||o||13#;:3A
I want to replace the first two ':' in 1.txt with '||o||'. This is the script I am using:
awk -F: '{print $1,$2,$3}' OFS="||o||" 3.txt
But it does not give the expected output: every ':' is treated as a separator, and anything after the third field is dropped.
Any help would be highly appreciated.

Perl solution:
perl -pe 's/:/||o||/ for $_, $_' 1.txt
-p reads the input line by line and prints each line after processing it
s/// is similar to substitution you might know from sed
for in postposition runs the previous command for every element in the following list
$_ keeps the line being processed
For higher numbers, you can use for ($_) x N where N is the number. For example, to substitute the first 7 occurrences:
perl -pe 's/:/||o||/ for ($_) x 7' 1.txt

The following sed command may also help.
sed 's/:/||o||/;s/:/||o||/' Input_file
Explanation: The first s command replaces the 1st colon with ||o||. What was the 2nd colon is now the first remaining colon on the line, so the second s command replaces it with ||o|| as well, as the OP requires.
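Since the question asks about awk, the same idea can be sketched there too. This is a minimal sketch, not one of the original answers: it relies on awk's sub() replacing only the first match on the line, so two calls rewrite the first two colons. The temporary file is only there to make the example self-contained.

```shell
# Sample input from the question, written to a temporary file.
input=$(mktemp)
printf '%s\n' \
  'joshwin_xc8#yahoo.com:1802752:2222:' \
  'ihearttofurkey#yahoo.com:1802756:111113' \
  'www.rothmany#mail.com:xxmyaduh:13#;:3A' > "$input"

# sub() replaces only the first match, so calling it twice rewrites
# the first two colons and leaves any later colons untouched.
# The trailing 1 is an always-true pattern that prints the line.
result=$(awk '{ sub(/:/, "||o||"); sub(/:/, "||o||") } 1' "$input")
printf '%s\n' "$result"

rm -f "$input"
```

Unlike the -F:/OFS attempt, this never splits the line into fields, so text after the second colon is kept as-is.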

Another Perl solution, though I think the idea applies to other languages too: use the limit parameter of split:
perl -nE 'print join q(||o||), split q(:), $_, 3' file
(q quotes because I'm on Windows)

To replace the first 2 occurrences of :, use the loop below.
You can adapt it to your requirement: for example, to change the first 7 occurrences, change {1..2} to {1..7}.
The output is saved in the original file (sed -i edits it in place); nothing is displayed.
for i in {1..2}
do
    sed -i "s/:/||o||/1" p.txt
done

Related

Fetching the value of variable stored in a file

I am trying to read, from a shell script, the value of a variable that is stored in another file.
Example:
cat abc.log
var1=2
var2=2
var3=25
I am writing a script to fetch the value of var3.
Thank you in advance.
awk -F= '$1 ~ /^[[:space:]]*var3/ { print $2 }' abc.log
Set the field delimiter to = and then, where the first field starts with var3 (optionally preceded by whitespace), print the second field.
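Because the pattern in that awk command is a prefix match, a key such as var30 would also be printed. A minimal sketch of an exact-match variant follows; the var30=99 line is a hypothetical addition to the sample file, included only to show the difference.

```shell
# Sample file from the question plus a hypothetical var30 entry.
log=$(mktemp)
printf '%s\n' 'var1=2' 'var2=2' 'var3=25' 'var30=99' > "$log"

# Copy the key, strip leading whitespace from it, then compare it
# exactly, so that var30 is not mistaken for var3.
value=$(awk -F= '{ key = $1; sub(/^[[:space:]]+/, "", key) }
                 key == "var3" { print $2 }' "$log")
printf '%s\n' "$value"

rm -f "$log"
```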
Alternatively, you could:
source abc.log
and then:
echo $var3
Using sed, you can isolate the 25 precisely with:
sed -n '/^[[:space:]]*var3=/s/^[^=]*=//p' file
Explanation
This is the general substitution form s/find/replace/ with a matching expression preceding it. The total form is /match/s/find/replace/. The option -n suppresses the normal printing of pattern-space and the p at the end tells sed to print the line where the match and substitution took place. Specifically,
/match/ locates a line with any amount of leading whitespace followed by var3=. The POSIX [:space:] character class matches any whitespace character,
the /find/ is anchored at the beginning of the line with '^', matches every character that is not '=' with [^=]*, and then matches the literal '=' character, and finally
the /replace/ is the empty string, leaving only the 25, which is printed.
Example Use/Output
$ sed -n '/^[[:space:]]*var3=/s/^[^=]*=//p' file
25
A grep one-liner, if your grep has support for Perl-compatible regular expressions (the -P option; not all greps support that)
grep -Po '^\s*var3=\K.*' abc.log
or,
grep -Po '^\s*var3=\K.*' abc.log | tail -n1
in order to get the last value of var3, if var3 may be assigned more than once.

Insert line number in a file

Would like to insert line number at specific location in file
e.g.
apple
ball
should be
(1) apple
(2) ball
Using command
sed '/./=' <FileName>| sed '/./N; s/\n/ /'
It generates
1 Apple
2 Ball
1st solution: This should be an easy task for awk.
awk '{print "("FNR") "$0}' Input_file
2nd solution: With pure sed as per OP's attempt try:
sed '=' Input_file | sed 'N; s/^/(/;s/\n/) /'
Easy to do with perl instead:
perl -ne 'print "($.) $_"' foo.txt
If you want to modify the file in-place instead of just printing out the numbered lines on standard output:
perl -ni -e 'print "($.) $_"' foo.txt
There are many ways to insert line numbers in a file. Some of them are:
1. Using the cat command
cat -n file.txt > newfile.txt
2. Using the nl command
nl -b a file.txt
Awk and perl are both very useful and powerful. But if, like me, you are reluctant to learn yet another programming language, you can complete this task with the bash commands you probably know already.
With bash you can
increment a sequence number n: $((++n))
read each line of a file foo into a variable l, one line per iteration: while read -r l;do ...;done <foo, where the option -r makes read treat backslashes as ordinary characters.
print formatted output to a line: printf "plain text %i %s\n" number string
Now suppose you want to enclose your sequence number in parentheses, and format them to 8 digits with leading zeroes, then you combine all this to get:
n=0;while read -r l;do printf "(%08i) %s\n" $((++n)) "$l";done <foo >numberedfoo
Note that you do not need to initialize the variable n to use it as a sequence number further on. But if you experiment with this command a few times without reinitializing n, your lines will be numbered from where your previous try stopped incrementing.
Finally, if you don't like the C-like formatting syntax of printf, just use plain echo, and leave the formatting to bash variable expansion. Here is how to format a number like in the command above (do type a space before the -, and a ; before the echo) :
nformat="0000000$n"; echo "(${nformat: -8}) ...";
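For the plain (1) apple format asked for in the question, the same loop only needs a simpler printf format. A minimal sketch, using a temporary file in place of foo; the IFS= prefix is an addition not present in the command above, and it keeps read from trimming leading and trailing whitespace.

```shell
# Two-line sample file, as in the question.
foo=$(mktemp)
printf '%s\n' 'apple' 'ball' > "$foo"

# Number each line as "(N) text". IFS= preserves leading whitespace,
# -r keeps backslashes literal.
n=0
numbered=$(while IFS= read -r l; do
  n=$((n + 1))
  printf '(%d) %s\n' "$n" "$l"
done < "$foo")
printf '%s\n' "$numbered"

rm -f "$foo"
```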

Replacing characters in each line on a file in linux

I have a file with different word in each line.
My goal is to change the first character to a capital letter and replace the 3rd character with "#".
For example: football will be exchanged to Foo#ball.
I thought about using awk and sed. It didn't help me since (to my knowledge) sed needs an exact character as input, and awk can print the desired character but not change it.
With GNU sed and two s commands:
echo 'football' | sed -E 's/(.)/\U\1/; s/(...)./\1#/'
Output:
Foo#ball
See: 3.3 The s Command, 5.7 Back-references and Subexpressions and 5.9.2 Upper/Lower case conversion
This might work for you (GNU sed):
sed 's/\(...\)./\u\1#/' file
With bash you can use parameter expansions alone to accomplish the task. For example, if you read each line into the variable line, you can do:
line="${line^}" # change football to Football (capitalize 1st char)
line="${line:0:3}#${line:4}" # make 4th character '#'
Example Input File
$ cat file
football
soccer
baseball
Example Use/Output
$ while read -r line; do line="${line^}"; echo "${line:0:3}#${line:4}"; done < file
Foo#ball
Soc#er
Bas#ball
While shell is typically slower, when use is limited to builtins, it doesn't fall too far behind.
(note: your question says 3rd character, but your example replaces the 4th character with '#')
With GNU awk for the 3rd arg to match():
$ echo 'football' | awk 'match($0,/(.)(..).(.*)/,a){$0=toupper(a[1]) a[2] "#" a[3]} 1'
Foo#ball
Cyrus' or Potong's answers are the preferred ones. (For Linux or systems with GNU sed because of \U or \u.)
This is just an additional solution with awk because you mentioned it and used also awk tag:
$ echo 'football'|awk '{a=substr($0,1,1);b=substr($0,2,2);c=substr($0,5);print toupper(a)b"#"c}'
Foo#ball
This is a very simple solution without regex. It also works with non-GNU awk.
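If the literal wording of the question is what is wanted (replace the 3rd character, giving Fo#tball rather than the Foo#ball of the example), only the substr() offsets change. A minimal sketch under that assumption:

```shell
# Literal reading of the question: upper-case the 1st character,
# keep the 2nd, replace the 3rd with '#', keep everything from the 4th on.
word=$(echo 'football' |
  awk '{ print toupper(substr($0, 1, 1)) substr($0, 2, 1) "#" substr($0, 4) }')
printf '%s\n' "$word"
```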
This should work with any version of awk:
awk '{
for(i=1;i<=NF;i++){
# Note that string indexes start at 1 in awk !
# Upper-case char 1, keep chars 2-3, replace char 4 with "#", keep the rest
$i=toupper(substr($i,1,1)) substr($i,2,2) "#" substr($i,5)
}
print
}' file
Note: If a word is less than 4 characters long, like it, the # is simply appended and it will be printed as It#
If your data is in a file named d (tested with GNU sed):
sed -E 's/^(\w)(\w\w)\w/\U\1\E\2#/' d

Using Sed or Awk to divide a file into two based on whether a line contains a numeric value

I have used sed and awk for a little while now, but I am having a challenge with the problem below. I am asking for an experienced sed/awk guru to help.
I have a file where some lines have numbers and some lines do not, like:
afjjdjfj.uihuihi
trfg.rtyhd
0rtgfd.tjbghhh
hbvfd4.rtgbvdgf
00fhfg.fdrgf
rtygfd.ijhniuh
etc.
I would like to have exactly two files out of this one, where every line is represented in one of the two files (none are deleted).
One containing all lines with any numbers 0-9 on them so given above file result would be:
0rtgfd.tjbghhh
hbvfd4.rtgbvdgf
00fhfg.fdrgf
and another file containing the rest of the lines that do not have any numbers 0-9 on them, so given the above file, it would be:
afjjdjfj.uihuihi
trfg.rtyhd
rtygfd.ijhniuh
I've tried different strategies in both sed and awk and nothing is giving me exactly what I need.
What would be the best sed or awk one liner to solve this problem?
Thank you for your time,
Tom
Easily with Awk:
awk '/[0-9]/{print > "file1"; next} {print > "file2"}' inputfile
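A runnable sketch of this awk approach on the sample data follows. The output file names and the temporary directory are assumptions made for the example; they are passed in with -v because awk needs the redirection target to be a string value.

```shell
# Sample data from the question, in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'afjjdjfj.uihuihi' 'trfg.rtyhd' '0rtgfd.tjbghhh' \
  'hbvfd4.rtgbvdgf' '00fhfg.fdrgf' 'rtygfd.ijhniuh' > "$dir/input"

# Lines containing a digit go to one file; "next" skips the second rule,
# so every other line falls through to the other file.
awk -v yes="$dir/with_digits.txt" -v no="$dir/no_digits.txt" \
  '/[0-9]/ { print > yes; next } { print > no }' "$dir/input"

with_digits=$(cat "$dir/with_digits.txt")
no_digits=$(cat "$dir/no_digits.txt")
printf 'with digits:\n%s\nwithout digits:\n%s\n' "$with_digits" "$no_digits"

rm -rf "$dir"
```

Each input line is written exactly once, so no line is lost or duplicated.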
With a single GNU sed command:
sed -ne '/[0-9]/w with_digits.txt' -e '//!w no_digits.txt' input
Results:
> cat no_digits.txt
afjjdjfj.uihuihi
trfg.rtyhd
rtygfd.ijhniuh
> cat with_digits.txt
0rtgfd.tjbghhh
hbvfd4.rtgbvdgf
00fhfg.fdrgf
w filename: Write the pattern space to filename.
If you don't mind running twice over the input, you can use just grep:
grep '[0-9]' input > with_digits
grep -v '[0-9]' input > without_digits
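With Perl and the File::Slurp module (-n reads each line with its newline intact, so append_file writes complete lines):
perl -MFile::Slurp -ne '/\d/ ? append_file("digits.txt",$_) : append_file("no_digits.txt",$_)' input.txt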

Linux Bash: extracting text from file into variable

I haven't found anything that clearly answers my question. Although very close, I think...
I have a file with a line:
# Skipsdata for serienummer 1158
I want to extract the 4 digit number at the end and put it into a variable, this number changes from file to file so I can't just search for "1158". But the "# Skipsdata for serienummer" always remains the same.
I believe that either grep, sed or awk may be the answer but I'm not 100 % clear on their usage.
Using Awk:
numberRequired=$(awk '/# Skipsdata for serienummer/{print $NF}' file)
printf "%s\n" "$numberRequired"
1158
You can use grep with the -o switch, which prints only the matched part instead of the whole line.
Print all numbers at the end of lines from file yourFile:
grep -Po '\d+$' yourFile
Print all four digit numbers at the end of lines like described in your question:
grep -Po '^# Skipsdata for serienummer \K\d{4}$' yourFile
-P enables perl style regexes which support \d and especially \K.
\d matches any digit (0-9).
\d{4} matches exactly four digits.
\K lets grep forget the previously matched part, such that only the part afterwards is printed.
There are multiple ways to find your number. Assuming the input data is in a file called inputfile:
mynumber=$(sed -n 's/# Skipsdata for serienummer //p' <inputfile) will print only the number and ignore all the other lines;
mynumber=$(grep '^# Skipsdata for serienummer' inputfile | cut -d ' ' -f 5) will filter the relevant lines first, then only output the 5th field (the number)
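Where grep -P is not available, a capture group in sed can do the same anchored match. This is a minimal sketch that assumes the number is always exactly four digits at the end of the line, as described in the question; the extra first line in the sample file is hypothetical and only shows that unrelated lines are ignored.

```shell
# Sample file: one unrelated line plus the line from the question.
inputfile=$(mktemp)
printf '%s\n' 'some other line 1234' \
  '# Skipsdata for serienummer 1158' > "$inputfile"

# Match the whole line, capture the four digits, and print only
# the captured group (-n suppresses every other line).
mynumber=$(sed -n 's/^# Skipsdata for serienummer \([0-9]\{4\}\)$/\1/p' "$inputfile")
printf '%s\n' "$mynumber"

rm -f "$inputfile"
```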

Resources