using sed to extract certain strings from config file

using sed to extract certain strings from config file - string

I am trying to export certain strings from below output, however i have no experience with sed/awk and i need some advise how can i proceed with that.
Input:
name Cleartext-Password := "password", Service-Type := Framed-User
Framed-IP-Address := 127.0.0.1,
MS-Primary-DNS-Server := 8.8.8.8,
Fall-Through = Yes,
Mikrotik-Rate-Limit = 20M/30M
The output should be:
name;password;127.0.0.1;20M;30M;
I am not sure if this is correct way to do that, but i have tried to remove everything between my required string, for example:
sed 's/ Cleartext-Password := "/;/'
However i think this is dirty way and not the clever one.
Could you please let me know what i need to look for in order to create working sed/awk solution for this?

Could you please try following based on your shown samples. Written and tested it in site
https://ideone.com/eWXv3w
Since OP's Input_file has control M characters so added gsub(/\r/,"") in code here.
awk '
BEGIN{ OFS=";" }
{ gsub(/\r/,"") }
match($0,/Cleartext-Password[^,]*/){
val=substr($0,RSTART,RLENGTH)
gsub(/Cleartext-Password[^"]*|"/,"",val)
val=$1 OFS val
next
}
/Framed-IP-Address/{
sub(/,$/,"")
val=val OFS $NF
next
}
/Mikrotik-Rate-Limit/{
print val, $NF
val=""
}' Input_file
Explanation: In BEGIN section of program setting OFS to semi colon as per question. Then using match function of awk to match regex from string Cleartext...Cleartext-Password[^,]* till first comma comes. If regex matches perfectly then capturing that sub-string in variable val here. Now using gsub to globally substitute everything from Cleartext-Password and all un-necessary stuff there as per required output.
Then checking if line contains Framed-IP-Address if it's found then send substituting , from last of line and adding that line last field to variable val here.
Now checking condition if a line contains Mikrotik-Rate-Limit then simply printing value of val and last field here, nullifying val here too.

There are a number of ways to approach this with awk, the key is to match part of the record with the regular expression to identify the record you are operating on and then isolate the wanted test and output in the desired format.
One approach would be:
awk '
/Cleartext-Password/ { printf "%s;%s;", $1, substr($4,2,length($4)-3) }
/Framed-IP-Address/ { printf "%s;", substr($NF,1,length($NF)-1) }
/Mikrotik-Rate-Limit/{ sub(/\//,";",$NF); printf "%s;\n", $NF }
' config
Example Use/Output
With your sample input in the file named config, you would receive:
name;password;127.0.0.1;20M;30M;
Look things over and let me know if I misunderstood anywhere.

This might work for you (GNU sed):
sed -nE -e '/Cleartext-Password/{s/ .*:=\s"(.*)",.*/;\1/;h}' \
-e '/Framed-IP-Address/{s/.*:= (.*),/\1/;H}' \
-e '/Mikrotik-Rate-Limit/{s#.*= (.*)/(.*)#\1;\2#;H;g;y/\n/;/;p}' file
Turn off implicit printing by invoking the -n option.
Reduce back slashes by invoking the -E option.
Stash the fields of the record in the hold space and when all fields have been collected, copy the hold space to the pattern space, replace newlines by the field separators and print the result.
You may prefer:
sed -nE '/Cleartext-Password/{s/ .*:=\s"(.*)",.*/;\1/;h};
/Framed-IP-Address/{s/.*:= (.*),/\1/;H};
/Mikrotik-Rate-Limit/{s#.*= (.*)/(.*)#\1;\2#;H;g;y/\n/;/;p}' file

Related

AWK how to process multiple files and comparing them IN CONTROL FILE! (not command line one-liner)

I read all of answers for similar problems but they are not working for me because my files are not uniformal, they contain several control headers and in such case is safer to create script than one-liner and all the answers focused on one-liners. In theory one-liners commands should be convertible to script but I am struggling to achieve:
printing the control headers
print only the records started with 16 in <file 1> where value of column 2 NOT EXISTS in column 2 of the <file 2>
I end up with this:
BEGIN {
FS="\x01";
OFS="\x01";
RS="\x02\n";
ORS="\x02\n";
file1=ARGV[1];
file2=ARGV[2];
count=0;
}
/^#/ {
print;
count++;
}
# reset counters after control headers
NR=1;
FNR=1;
# Below gives syntax error
/^16/ AND NR==FNR {
a[$2];next; 'FNR==1 || !$2 in a' file1 file2
}
END {
}
Googling only gives me results for command line processing and documentation is also silent in that regard. Does it mean it cannot be done?

Perhaps try:
script.awk:
BEGIN {
OFS = FS = "\x01"
ORS = RS = "\x02\n"
}
NR==FNR {
if (/^16/) a[$2]
next
}
/^16/ && !($2 in a) || /^#/
Note the parentheses: !$2 in a would be parsed as (!$2) in a
Invoke with:
awk -f script.awk FILE2 FILE1
Note order of FILE1 / FILE2 is reversed; FILE2 must be read first to pre-populate the lookup table.

First of all, short answer to my question should be "NOT POSSIBLE", if anyone read question carefully and knew AWK in full that is obvious answer, I wish I knew it sooner instead of wasting few days trying to write script.
Also, there is no such thing as minimal reproducible example (this was always constant pain on TeX groups) - I need full example working, if it works on 1 row there is no guarantee if it works on 2 rows and my number of rows is ~ 127 mln.
If you read code carefully than you would know what is not working - I put in comment section what is giving syntax error. Anyway, as #Daweo suggested there is no way to use logic operator in pattern section. So because we don't need printing in first file the whole trick is to do conditional in second brackets:
awk -F, 'BEGIN{} NR==FNR{a[$1];next} !($1 in a) { if (/^16/) print $0} ' set1.txt set2.txt
assuming in above example that separator is comma. I don't know where assumption about multiple RS support only in gnu awk came from. On MacOS BSD awk it works exactly the same, but in fact RS="\x02\n" is single separator not two separators.

Replace multiline string with sed

I have a file that's basically an INI/CFG file the looks like this:
[thing-a]
attribute1=foo
attribute2=bar
attribute3=foobar
attribute4=barfoo
[thing-b]
attribute1=dog
attribute3=foofoo
attribute4=castles
[thing-c]
attribute1=foo
attribute4=barfoo
[thing-d]
attribute1=123455
attribute2=dogs
attribute3=biscuits
attribute4=1234
Each 'thing' has a set of attributes that could include all the same ones or a subset there of.
I am trying to write a small bash script that will replace the attributes for 'thing-c' with a predefined block $a1, $a2 & $a3 are generated elsewhere in the wider script:
NEW_BLOCK="[thing-c]
attribute1=${a1}
attribute2=${a2}
attribute3=${a3}"
I can find the right block with sed like this:
THING_BLOCK=$(sed -nr "/^\[thing-c\]/ { :l /^\s*[^#].*/ p; n; /^\[/ q; b l; }" ./myThingFile)
I'm not sure if i've gone down a rabbit hole or what with this and I'm pretty sure there is a better way of doing it.
I'm wanting to do what is:
sed "s/${THING_BLOCK}/${NEW_BLOCK}/"
But I can't quite figure out the multiline aspect to this and I'm not sure what the best route to take is.
Is there a way to do this sort of multiline find and replace with sed (or a better way with bash)

Is there a way to do this sort of multiline find and replace ...
Yes there is indeed a better way, albeit using awk:
awk -v blk="$NEW_BLOCK" -v RS= '{ORS = RT} $1 == "[thing-c]" {$0 = blk} 1' file
Using -v RS= we use an empty record separator that splits records in input file on each new line.

Another awk. Store the replacement to file2 and:
$ awk -v RS="" '
NR==FNR {
b=$0
next
}
$1~/thing-c/ {
$0=b
}
{
print (++c==1?"":ORS) $0
}' file2 file1
Output:
[thing-a]
attribute1=foo
attribute2=bar
attribute3=foobar
attribute4=barfoo
[thing-b]
attribute1=dog
attribute3=foofoo
attribute4=castles
[thing-c]
attribute1=${a1}
attribute2=${a2}
attribute3=${a3}
[thing-d]
attribute1=123455
attribute2=dogs
attribute3=biscuits
attribute4=1234

When you want to use sed(IMHO awk is better here), you must have "nice" data (no special characters that sed will try to handle and [ inside block thing-3).
I tested with
read -d '' -r NEW_BLOCK <<END
[thing-c]
attribute1=${a1}
attribute2=${a2}
attribute3=${a3}
END
For my solution I first need to replace newlines in $NEW_BLOCK with the two characters \n.
echo "This is the replacement string: ${NEW_BLOCK//$'\n'/\\n}"
With the "multi-line" option "-z" you can do
sed -rz "s/\[thing-c\][^[]*/${NEW_BLOCK//$'\n'/\\n}\n\n/" myThingFile

BASH - Extract Data from String

I have a log that returns thousands of lines of data, I want to extract a few values from that.
In the log there is only one line containing the unquie unit reference so I can grep for that using:
grep "unit=Central-C152" logfile.txt
That produces a line of output similar to the following:
a3cd23e,85d58f5,53f534abef7e7,unit=Central-C152,locale=32325687-8595-9856-1236-12546975,11="School",1="Mr Green",2="Qual",3="SWE",8="report",5="channel",7="reset",6="velum"
The format of the line may change in that the order of the values won't always be in the same position.
I'm trying to work out how to get the value of 2 and 7 in to separate variables.
I had thought about cut on , or = but as the values aren't in a set order I couldn't work out that best way to do it.
I' trying to get:
var state=value of 2 without quotes
var mode=value of 7 without quotes
Can anyone advise on the best way to do this ?
Thanks

Could you please try following to create variable's values.
state=$(awk '/unit=Central-C152/ && match($0,/2=\"[^"]*/){print substr($0,RSTART+3,RLENGTH-3)}' Input_file)
mode=$(awk '/unit=Central-C152/ && match($0,/7=\"[^"]*/){print substr($0,RSTART+3,RLENGTH-3)}' Input_file)
You could print them too by doing following.
echo "$state"
echo "$mode"
Explanation: Adding explanation of command too now.
awk ' ##Starting awk program here.
/unit=Central-C152/ && match($0,/2=\"[^"]*/){ ##Checking condition if a line has string (unit=Central-C152) and using match using REGEX to check from 2 to till "
print substr($0,RSTART+3,RLENGTH-3) ##Printing substring starting from RSTART+3 till RLENGTH-3 characters.
}
' Input_file ##Mentioning Input_file name here.

You are probably better off doing all of the processing in Awk.
awk -F, '/unit=Central-C152/ {
for(i=1;i<=NF;++i)
if($i ~ /^[27]="/) {
b[++k] = $i
sub(/^[27]="/, "", b[k])
sub(/"$/, "", b[k])
gsub(/\\/, "", b[k])
}
print "state " b[1] ", mode " b[2]
}' logfile.txt
This presupposes that the fields always occur in the same order (2 before 7). Maybe you need to change or disable the gsub to remove backslashes in the values.
If you want to do more than print the values, refactoring whatever Bash code you have into Awk is often a better approach than doing this processing in Bash.

Assuming you already have the line in a variable such as with:
line="$(grep 'unit=Central-C152' logfile.txt | head -1)"
You can then simply use the built-in parameter substitution features of bash:
f2=${line#*2=\"} ; f2=${f2%%\"*} ; echo ${f2}
f7=${line#*7=\"} ; f7=${f7%%\"*} ; echo ${f7}
The first command on each line strips off the first part of the line up to and including the <field-number>=". The second command then strips everything off that beyond (and including) the first quote. The third, of course, simply echos the value.
When I run those commands against your input line, I see:
Qual
reset
which is, from what I can see, what you were after.

Use sed to find and replace a number following by its successor in bash

I have a string that contains multiple occurrences of number ranges, which are separated by a comma, e.g.,
2-12,59-89,90-102,103-492,593-3990,3991-4930
Now I would like to remove all directly neighbouring ranges and remove them from the string, i.e., remove anything that is of the form -(x),(x+1), to get something like this:
2-12,59-492,593-4930
Can anyone think of a method to accomplish this? I can honestly not post anything that I have tried, because all my tries were highly unsuccessful. To me it seems like it is not possible to actually find anything of the form -(x),(x+1) using sed, since that would require doing operations or comparisons of a found number by another number that has to be part of the command that is currently searching for numbers.
If everybody agrees that sed is NOT the correct tool for doing this, I will do it another way, but I am still interested if it's possible.

with awk
awk -F, -v RS="-" -v ORS="-" '$2!=$1+1' file
with appropriate separator setting, print the record when second field is not +1.
RS is the record separator and ORS is the outpout record separator.
test:
> awk -F, -v RS="-" -v ORS="-"
'$2!=$1+1' <<< "2-12,59-89,90-102,103-492,593-3990,3991-4930"
2-12,59-492,593-4930

awk solution:
awk -F'-' '{ r=$1;
for (i=2; i<=NF; i++) {
split($i, a, ",");
r=sprintf("%s%s", r, a[2]-a[1]==1? "" : FS $i)
}
print r
}' file
-F'-' - treat -(hyphen) as field separator
r - resulting string
split($i, a, ",") - split adjacent range boundaries into array a by separator ,
a[2]-a[1]==1 - crucial condition, reflects (x),(x+1)
The output:
2-12,59-492,593-4930

This might work for you (GNU sed):
sed -r ' s/^/\n/;:a;ta;s/\n([^-]*-)([0-9]*)(.*,)/\1\n\2\n\2\n\3/;Td;:b;s/(\n.*\n.*)9(_*\n)/\1_\2/;tb;s/(\n.*\n)(_*\n)/\10\2/;s/$/\n0123456789/;s/(\n.*\n[0-9]*)([0-8])(_*\n.*)\n.*\2(.).*/\1\4\3/;:z;tz;s/(\n.*\n[^_]*)_([^\n]*\n)/\10\2/;tz;:c;tc;s/([0-9]*-)\n(.*)\n(.*)\n,(\3)-/\n\1/;ta;s/\n(.*)\n.*\n,/\1,\n/;ta;:d;s/\n//g' file
This proof-of-concept sed solution, iteratively increments and compares the end of one range with the start of another. If the comparison is true it removes both and repeats, otherwise it moves on to the next range and repeats until all ranges have been compared.

Linux cut, paste

I have to write a script file to cut the following column and paste it the end of the same row in a new .arff file. I guess the file type doesn't matter.
Current file:
63,male,typ_angina,145,233,t,left_vent_hyper,150,no,2.3,down,0,fixed_defect,'<50'
67,male,asympt,160,286,f,left_vent_hyper,108,yes,1.5,flat,3,normal,'>50_1'
The output should be:
male,typ_angina,145,233,t,left_vent_hyper,150,no,2.3,down,0,fixed_defect,'<50',63
male,asympt,160,286,f,left_vent_hyper,108,yes,1.5,flat,3,normal,'>50_1',67
how can I do this? using a Linux script file?

sed -r 's/^([^,]*),(.*)$/\2,\1/' Input_file
Brief explanation,
^([^,]*) would match the first field which separated by commas, and \1 behind refer to the match
(.*)$ would be the remainding part except the first comma, and \2 would refer to the match

Shorter awk solution:
$ awk -F, '{$(NF+1)=$1;sub($1",","")}1' OFS=, input.txt
gives:
male,typ_angina,145,233,t,left_vent_hyper,150,no,2.3,down,0,fixed_defect,'<50',63
male,asympt,160,286,f,left_vent_hyper,108,yes,1.5,flat,3,normal,'>50_1',67
Explanation:
{$(NF+1)=$1 # add extra field with value of field $1
sub($1",","") # search for string "$1," in $0, replace it with ""
}1 # print $0
EDIT: Reading your comments following your question, looks like your swapping more columns than just the first to the end of the line. You might consider using a swap function that you call multiple times:
func swap(i,j){s=$i; $i=$j; $j=s}
However, this won't work whenever you want to move a column to the end of the line. So let's change that function:
func swap(i,j){
s=$i
if (j>NF){
for (k=i;k<NF;k++) $k=$(k+1)
$NF=s
} else {
$i=$j
$j=s
}
}
So now you can do this:
$ cat tst.awk
BEGIN{FS=OFS=","}
{swap(1,NF+1); swap(2,5)}1
func swap(i,j){
s=$i
if (j>NF){
for (k=i;k<NF;k++) $k=$(k+1)
$NF=s
} else {
$i=$j
$j=s
}
}
and:
$ awk -f tst.awk input.txt
male,t,145,233,typ_angina,left_vent_hyper,150,no,2.3,down,0,fixed_defect,'<50',63
male,f,160,286,asympt,left_vent_hyper,108,yes,1.5,flat,3,normal,'>50_1',67

Why using sed or awk, the shell can handle this easily
while read l;do echo ${l#*,},${l%%,*};done <infile
If it's a win file with \r
while read l;do f=${l%[[:cntrl:]]};echo ${f#*,},${l%%,*};done <infile
If you want to keep the file in place.
printf "%s" "$(while read l;do f=${l%[[:cntrl:]]};printf "%s\n" "${f#*,},${l%%,*}";done <infile)">infile

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

using sed to extract certain strings from config file - string

Related

AWK how to process multiple files and comparing them IN CONTROL FILE! (not command line one-liner)

Replace multiline string with sed

BASH - Extract Data from String

Use sed to find and replace a number following by its successor in bash

Linux cut, paste

Categories

Resources