How to delete a block of text in a file with a shell script? - linux

I have the following scenario: I have a block of text, for example:
basketball:
ball: round
I don't know exactly what's inside basketball:, but I'd like to delete everything inside it. For example:
men:
height: 170
athlete: basketball
women:
height:180
athlete: basketball
I want to delete only the men block, ignoring whatever is above or below this key.

The AWK script filter.awk below removes all men sections that contain basketball. Is that what you mean? Run it with awk -f filter.awk input.txt.
/^[A-Za-z0-9]/ {
    if (sectionWanted) {
        printf "%s", section
    }
    sectionWanted = 1
    section = ""
    sectionName = $1
}
/basketball/ && sectionName == "men:" {
    sectionWanted = 0
}
{
    section = section $0 "\n"
}
END {
    if (sectionWanted) {
        printf "%s", section
    }
}
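As a rough sketch of how the same filter could be wrapped for reuse: this assumes (the flattened sample above does not show it) that the lines inside each section are indented, while section headers such as men: start in column one. The function name drop_section and its two parameters are made up for illustration and are not part of the answer above.

```shell
# Hypothetical wrapper: drop every section named $1 that contains the text $2.
# Assumption: section headers are unindented, their contents are indented.
drop_section() {
    awk -v name="$1" -v needle="$2" '
    BEGIN { keep = 1 }                  # keep any lines before the first header
    /^[A-Za-z0-9]/ {                    # an unindented line starts a new section
        if (keep) printf "%s", buf      # flush the previous section if wanted
        keep = 1; buf = ""; current = $1
    }
    current == name && index($0, needle) > 0 { keep = 0 }
    { buf = buf $0 "\n" }               # buffer the current section
    END { if (keep) printf "%s", buf }  # flush the last section
    '
}
```

It reads standard input, so it could be called as drop_section 'men:' basketball < input.txt.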

Related

How to combine two line by matching the pattern in shell script

I am trying to combine two lines.
An example of my data:
Hello Reach World Test
Reach me Test out .
I would like to combine them into this output:
Hello Reach World Test Reach me Test out .
i.e. only if the last word of the first line is Test and the next line begins with Reach.
I tried:
awk '/Test$/ { printf("%s\t", $0); next } 1'
Could anyone please let me know how to match and combine them?
Does this awk script do what you want:
BEGIN { flag = "0"; line = "" }
{
    if ( flag == "1" ) {
        if ( $0 ~ "^Reach" )
            print line " " $0
        else {
            print line
            print $0
        }
        line = ""
        flag = "0"
    } else {
        if ( $0 ~ "Test$" ) {
            line = $0
            flag = "1"
        } else
            print $0
    }
}
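A more compact variant of the same hold-and-join idea, offered only as a sketch. The function name join_test_reach is invented here. Compared with the script above it also prints a held line that is still pending at end of input, and it re-checks a line that did not start with Reach to see whether it ends in Test itself.

```shell
# Hypothetical helper: join a line ending in "Test" with a following line
# that begins with "Reach"; everything else passes through unchanged.
join_test_reach() {
    awk '
    held != "" {
        if ($0 ~ /^Reach/) { print held " " $0; held = ""; next }
        print held; held = ""           # no match: release the held line
    }
    /Test$/ { held = $0; next }         # hold this line and look at the next
    { print }
    END { if (held != "") print held }  # do not lose a held last line
    ' "$@"
}
```

It takes file names as arguments or reads standard input, for example join_test_reach data.txt.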

how to iterate over two sets of data?

I'm trying to create my own program to do a recursive listing: each line corresponds to the full path of a single file. The tricky part I'm working on now is: I don't want bind mounts to trick my program into listing files twice.
So I already have a program that produces the right output except that if /foo is bind mounted to /bar then my program incorrectly lists
/foo/file
/bar/file
I need the program to list just what's below (EDIT: even if it was asked to list the contents of /foo)
/bar/file
One approach I thought of is to mount | grep bind | awk '{print $1 " " $3}' and then iterate over this to sed every line of the output, then sort -u.
My question is how do I iterate over the original output (a bunch of lines) and the output from mount (another bunch of lines)? (or is there a better approach) This needs to be POSIX (EDIT: and work with /bin/sh)
Put the 'mount | grep bind' command inside the awk script, in a BEGIN block, and store its output.
Something like:
PROG | awk 'BEGIN{
    # Define the data you want to store
    # Assign to global arrays
    command = "mount | grep bind";
    while ((command | getline) > 0) {
        count++;
        mount[count] = $1;
        mountPt[count] = $3
    }
}
# Assuming input is line-by-line and that mountPt is the value
# that is undesired
{
    replaceLine=0
    for (i=1; i<=count; i++) {
        idx = index($1, mountPt[i]);
        if (idx == 1) {
            replaceLine = 1;
            break;
        }
    }
    if (replaceLine == 1) {
        sub(mountPt[i], mount[i], $1);
    }
    if (printed[$1] != 1) {
        print $1;
    }
    printed[$1] = 1;
} '
Where I assume your current program, PROG, outputs to stdout.
find YourPath -print > YourFiles.txt
mount > Bind.txt
awk 'FNR == NR && $0 ~ /bind/ {
    Bind[ $1] = $3
    # track the deepest key path; the keys are the $1 paths, so measure $1
    if( ( ThisLevel = split( $1, Unused, "/") - 1 ) > Level) Level = ThisLevel
}
FNR != NR && $0 !~ /^ *$/ {
    RealName = $0
    for( ThisLevel = Level; ThisLevel > 0; ThisLevel--){
        # interval expression {n}: not supported by every awk (e.g. older mawk)
        match( $0, "(/[^/]*){" ThisLevel "}" )
        UnBind = Bind[ substr( $0, 1, RLENGTH) ]
        if( UnBind !~ /^$/) {
            RealName = UnBind substr( $0, RLENGTH + 1)
            ThisLevel = 0
        }
    }
    if( ! File[ RealName]++) print RealName
}
' Bind.txt YourFiles.txt
The search is an exact path comparison against a bind array that is loaded first.
Bind.txt and YourFiles.txt could be replaced by direct redirections, giving a single instruction with no temporary files.
The first part of the awk script has to be adapted if bind paths contain spaces (assumed not to be the case here).
File paths are rewritten as they are read, by comparing them with the known bind relations.
A file is printed only if it has not been seen yet.
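The prefix rewrite can also be done with plain string comparison instead of a regular expression, which avoids interval expressions and makes sure /foo does not match /foobar. The following is only a sketch: the function name rewrite_binds and the two-column table format (source path, then mount point) are assumptions made for the example, and paths are still assumed to contain no spaces.

```shell
# Hypothetical helper: $1 names a file of "source mountpoint" pairs, one per
# line; the paths to normalise arrive on standard input.
rewrite_binds() {
    awk -v table="$1" '
    BEGIN {
        # load the bind table before any path is read
        while ((getline line < table) > 0) {
            split(line, f, " ")
            n++
            src[n] = f[1]
            dst[n] = f[2]
        }
        close(table)
    }
    {
        path = $0
        for (i = 1; i <= n; i++) {
            len = length(src[i])
            # compare whole path components only
            if (path == src[i] || substr(path, 1, len + 1) == src[i] "/") {
                path = dst[i] substr(path, len + 1)
                break
            }
        }
        if (!seen[path]++) print path   # print each resulting path once
    }
    '
}
```

The table could be produced along the lines of the approach above, for example with mount | awk '/bind/ { print $1, $3 }' > binds.txt, followed by find YourPath -print | rewrite_binds binds.txt.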

remove a line with special character with given pattern

I'm trying to get the lines with special characters that are not prefixed with \. Below are the special characters:
^$%.*+?!(){}[]|\
I need to check the 2nd column for any of the above special characters that is not prefixed with \. I'm trying to do this with awk, but no luck. I want the output shown below.
input.txt
1,ap^ple
2,o$range
3,bu+tter
4,gr(ape
5,sm\(ok\e
6,ra\in
7,p+la\\y
8,wor\+k
output.txt
1,ap^ple
2,o$range
3,bu+tter
4,gr(ape
5,sm\(ok\e
6,ra\in
7,p+la\\y
The 7th and 5th rows are in output.txt because they contain 2 special characters (one with a backslash, another without).
"final" final edit: I wanted to allow "\x" whatever x is, but the OP seems to not want that, so I fixed it too.
After trying to find a "clever" regexp (which choked on "\\" or any odd number of "\", but apparently worked for the rest...)
I rewrote it in awk as a state machine.
The idea:
If, in "normal mode", we encounter a special char other than "\": we print the line!
If, in "normal mode", we encounter a "\": we enter "escaped mode", and in that mode we ignore the next char
(but if there is no next char, we need to print that line too!)
the script:
awk -F"," '
{
    IN_ESCAPED_MODE=0 ;
    for (i=1 ; i<=length($2) ; i++)
    {   char=substr($2,i,1)
        if ( IN_ESCAPED_MODE == 0)
        {   if ( index(".^$%*+?!(){}[]|",char) > 0 )
            { print $0 ; break ;
            }
            if ( index("\\" , char ) > 0 )
            { IN_ESCAPED_MODE=1 ; continue ;
            }
        }
        if ( IN_ESCAPED_MODE == 1)
        {   if ( index(".^$%*+?!(){}[]|\\",char) > 0 )
            { IN_ESCAPED_MODE=0 ; continue ;
            }
            else
            { IN_ESCAPED_MODE=0 ; print $0; break;
            }
        }
    }
    if (IN_ESCAPED_MODE == 1)
    {   # trailing lone backslash: print the line (no break here, we are outside the loop)
        print $0
    }
}
' input.txt > output.txt
With this change, you will have the same output as the OP, which prints a line when it contains "\e" for example... Which I find weird: to me "\e" is fine, we can "escape" anything?
With that input:
1,ap^ple
2,o$range
3,bu+tter
4,gr(ape
5,sm\(ok\e
6,ra\in
7,p+la\\y
8,wor\+k
10,\
11,\\
12,\\\
13,.
14,\.
15,..
16,^
17,\^
18,$
19,\$
20,%
21,\%
22,*
23,\*
24,+
25,\+
26,?
27,\?
28,!
29,\!
30,(
31,\(
32,)
33,\)
34,{
35,\{
36,}
37,\}
38,[
39,\[
40,]
41,\]
42,|
43,\|
it outputs:
1,ap^ple
2,o$range
3,bu+tter
4,gr(ape
5,sm\(ok\e
6,ra\in
7,p+la\\y
10,\
12,\\\
13,.
15,..
16,^
18,$
20,%
22,*
24,+
26,?
28,!
30,(
32,)
34,{
36,}
38,[
40,]
42,|
(so it appears to really work this time !)
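For reuse, the same two-mode scan could be packaged as an awk function. This is only a sketch of the idea described above, not the script itself: the names find_unescaped and unescaped are invented here, and it keeps the stricter rule that a backslash must be followed by a special character.

```shell
# Hypothetical helper: print CSV lines whose 2nd field contains a special
# character that is not escaped, or a backslash that escapes nothing special.
find_unescaped() {
    awk -F',' '
    function unescaped(s,    i, c, esc) {
        esc = 0
        for (i = 1; i <= length(s); i++) {
            c = substr(s, i, 1)
            if (esc) {
                # escaped mode: the escaped character must itself be special
                if (index(".^$%*+?!(){}[]|\\", c) == 0) return 1
                esc = 0
            } else if (c == "\\") {
                esc = 1                 # enter escaped mode
            } else if (index(".^$%*+?!(){}[]|", c) > 0) {
                return 1                # bare special character
            }
        }
        return esc                      # a trailing lone backslash also counts
    }
    unescaped($2)
    ' "$@"
}
```

It could be run as find_unescaped input.txt > output.txt.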
If you prefer to allow any "\x" and NOT only if "x" is a SPECIAL char:
change the "middle lines":
if ( IN_ESCAPED_MODE == 1)
{ if ( index(".^$%*+?!(){}[]|\\",char) > 0 )
{ IN_ESCAPED_MODE=0 ; continue ;
}
else
{ IN_ESCAPED_MODE=0 ; print $0; break;
}
}
into:
if ( IN_ESCAPED_MODE == 1)
{ IN_ESCAPED_MODE=0 ; continue ;
}
For historical reasons, here is the regexp (which worked in "most" cases but choked in some, for example if there was "\\"):
egrep '[^\][].^$%*+?!(){}[|]|[^\][\][^].^$%*+?!(){}[|\]' input.txt > output.txt
But that one will not display the line 12, for example...
A good read: http://www.regular-expressions.info/charclass.html .... and http://www.gnu.org/software/gawk/manual/html_node/Gory-Details.html (scary ...)
You can try the following:
awk '
{
line=$0
gsub(/\\[\^$%.*+?!(){}\[\]|\\]/,"")
if(/[\^$%.*+?!(){}\[\]|\\]/)
print line
}' input.txt
sed '/[]\\^$%.*+?!(){}[|]/ {
h
s/\\[]\\^$%.*+?!(){}[|]/_/g
/[]\\^$%.*+?!(){}[|]/ {
x
p
}
}' YourFile
Depending on the shell and the sed version, this could be interpreted differently (especially the \). It works on my AIX/KSH.

Generate JSON using Dynamic Variable in Shell script

I need to generate JSON output from my shell script.
I need to get Ram slot details of a particular machine and generate JSON using those details.
To get Ram details I am using system_profiler SPMemoryDataType
It produces details as follows.
BANK 0/DIMM0:
Size: 2 GB
Type: DDR3
Speed: 1600 MHz
Status: OK
Manufacturer: 0x802C
Part Number: 0x384A54463235363634485A2D3147364D3120
Serial Number: 0xE98388E6
BANK 1/DIMM0:
Size: 2 GB
Type: DDR3
Speed: 1600 MHz
Status: OK
Manufacturer: 0x802C
Part Number: 0x384A54463235363634485A2D3147364D3120
Serial Number: 0xE98388E5
From that I should form JSON like this
[
{"Bank":"0/DIMM0","Serial Number":"0xE98388E6","Status":"OK"},
{"Bank":"1/DIMM0","Serial Number":"0xE98388E5","Status":"OK"}
]
To extract separate details like bank, Serial Number, Status we can use
system_profiler SPMemoryDataType | awk '/BANK/'
system_profiler SPMemoryDataType | awk '/Serial/'
system_profiler SPMemoryDataType | awk '/Status/'
I am sure some kind of dynamic variable is needed to form JSON from the results, but since I am new to shell scripting I am confused. Is there any way to generate JSON from this output?
#!/usr/bin/awk -f
$1 == "BANK" {
    bank = $2
    sub(/:/, "", bank)
    while (getline > 0) {
        if ($1 == "Serial" && $2 == "Number:") {
            serial_number = $3
        } else if ($1 == "Status:") {
            status = $2
        }
        if (serial_number != "" && status != "") {
            entries[++e] = "{\"Bank\":\"" bank "\",\"Serial Number\":\"" serial_number "\",\"Status\":\"" status "\"}"
            break
        }
    }
    bank = serial_number = status = ""
}
END {
    print "["
    if (e > 0) {
        printf "%s", entries[1]
        for (i = 2; i <= e; ++i) {
            printf ",\n%s", entries[i]
        }
        print ""
    }
    print "]"
}
Usage:
awk -f script.awk file
system_profiler SPMemoryDataType | awk -f script.awk
Example output:
[
{"Bank":"0/DIMM0","Serial Number":"0xE98388E6","Status":"OK"},
{"Bank":"1/DIMM0","Serial Number":"0xE98388E5","Status":"OK"}
]
Using within a shell script:
#!/bin/bash
system_profiler SPMemoryDataType | awk '$1 == "BANK" {
    bank = $2
    sub(/:/, "", bank)
    while (getline > 0) {
        if ($1 == "Serial" && $2 == "Number:") {
            serial_number = $3
        } else if ($1 == "Status:") {
            status = $2
        }
        if (serial_number != "" && status != "") {
            entries[++e] = "{\"Bank\":\"" bank "\",\"Serial Number\":\"" serial_number "\",\"Status\":\"" status "\"}"
            break
        }
    }
    bank = serial_number = status = ""
}
END {
    print "["
    if (e > 0) {
        printf "%s", entries[1]
        for (i = 2; i <= e; ++i) {
            printf ",\n%s", entries[i]
        }
        print ""
    }
    print "]"
}'
A one-liner:
system_profiler SPMemoryDataType | awk '$1=="BANK"{bank=$2;sub(/:/,"",bank);while(getline>0){if($1=="Serial"&&$2=="Number:"){serial_number=$3}else if($1=="Status:"){status=$2};if(serial_number!=""&&status!=""){entries[++e]="{\"Bank\":\""bank"\",\"Serial Number\":\""serial_number"\",\"Status\":\""status"\"}";break}};bank=serial_number=status=""}END{print "[";if(e>0){printf "%s",entries[1];for(i=2;i<=e;++i){printf ",\n%s",entries[i]};print""};print "]"}'
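The scripts above paste the values into the JSON text as they are, which is fine for the sample data but would produce invalid JSON if a value ever contained a double quote or a backslash. A possible variant that escapes those two characters is sketched below; it does not handle control characters. The names ram_to_json, esc and emit are made up for this example, and it uses ordinary pattern rules instead of a getline loop.

```shell
# Hypothetical helper: turn "BANK ...:" blocks into a JSON array.
ram_to_json() {
    awk '
    # escape double quotes and backslashes only (control characters are not handled)
    function esc(s,    i, c, out) {
        out = ""
        for (i = 1; i <= length(s); i++) {
            c = substr(s, i, 1)
            if (c == "\"" || c == "\\") out = out "\\"
            out = out c
        }
        return out
    }
    # store the block collected so far, if any, and reset the fields
    function emit() {
        if (bank != "")
            entries[++e] = sprintf("{\"Bank\":\"%s\",\"Serial Number\":\"%s\",\"Status\":\"%s\"}", esc(bank), esc(serial), esc(status))
        bank = serial = status = ""
    }
    $1 == "BANK" { emit(); bank = $2; sub(/:$/, "", bank) }
    $1 == "Serial" && $2 == "Number:" { serial = $3 }
    $1 == "Status:" { status = $2 }
    END {
        emit()
        print "["
        for (i = 1; i <= e; i++) printf "%s%s\n", entries[i], (i < e ? "," : "")
        print "]"
    }
    ' "$@"
}
```

Usage would be the same as before: system_profiler SPMemoryDataType | ram_to_json.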
There are a few libraries for this. One of them is https://github.com/jeganathgt/libjson-sh . It is a standalone shell script library that provides a handy API for generating JSON output on the console.
Example:
json_init
json_add_string "serial" "$<cmd>"
json_add_string "bankinfo" "$<cmd>"
json_add_string "status" "$<cmd>"
json_dump

Move line(s) to follow another line in a file

I have a file that contains a line like this:
check=('78905905f5a4ed82160c327f3fd34cba')
I'd like to be able to move this line to follow a line that looks like this:
files=('somefile.txt')
The array, though, can at times span multiple lines, for example:
files=('somefile.txt'
'file2.png'
'another.txt'
'andanother...')
text
in between
check=('78905905f5a4ed82160c327f3fd34cba'
'5277a9164001a4276837b59dade26af2'
'3f8b60b6fbb993c18442b62ea661aa6b')
The array/line always ends in a ) and no text in between will contain a closed parenthesis.
I got some advice that awk can do this:
awk '/files/{
f=0
print $0
for(i=1;i<=d;i++){ print a[i] }
g=0
delete a # remove array after found
next
}
/check/{ f=1; g=1 }
f{ a[++d]=$0 }
!g' file
This will only span one line though. I was told to expand the search:
awk '/source/ && /\)$/{
f=0
print $0
for(i=1;i<=d;i++){ print a[i] }
g=0
delete a # remove array after found
next
}
/md5sum/ && /\)$/{ f=1; g=1 }
f{ a[++d]=$0 }
!g'
I'm just learning awk, so I'd appreciate help with this. Or if there is another tool that can do this, I'd like to hear about it. Someone told me that 'ed' has these types of capabilities.
To answer your last question first: yes, awk is the typical Unix tool for this. Other candidates are the incredibly powerful Perl, Python, or .. my favorite .. Ruby. One advantage of awk is that it's always there; it's part of the base system. Another way to solve this kind of problem is with an editor script that controls ed(1) or ex(1).
Ok, new program for the revised question. This program will move the "check" lines either up or down as necessary so that they follow the "files" lines.
BEGIN {
    checkAt = 0
    filesAt = 0
    scanning = 0
}
/check=\(/ {
    checkAt = NR
    scanning = 1
}
/files=\(/ {
    filesAt = NR
    scanning = 1
}
/\)$/ {
    if (scanning) {
        if (checkAt > filesAt) {
            checkEnd = NR
        } else {
            filesEnd = NR
        }
        scanning = 0
    }
}
{
    lines[NR] = $0
}
END {
    for (i = 1; i <= NR; ++i) {
        if (checkAt <= i && i <= checkEnd) {
            continue
        }
        print lines[i]
        if (i == filesEnd) {
            for (j = checkAt; j <= checkEnd; ++j) {
                print lines[j]
            }
        }
    }
}
I looked into doing this with Awk, but it looked like you wouldn't really get anything clever out of it. It would just be the same logic, with some Awk pain to go with it, so I did it in Perl :)
#!/usr/bin/perl
open(IN, $ARGV[0]) || die("Could not open file: " . $ARGV[0]);
my $buffer="";
foreach $line (<IN>) {
    if ($line =~ /^check=/) {
        $flag = 1;
        $buffer .= $line;
    } elsif ($flag == 1 && $line =~/\)/) {
        $flag = 0;
        $buffer .= $line;
    } elsif ($flag == 1) {
        $buffer .= $line;
    } elsif ($flag == 0 && $line =~ /^files=/) {
        $flag = 2;
        print $line;
    } elsif ($flag == 2 && $line =~ /\)/) {
        $flag = 0;
        print $line;
        if (length($buffer) > 0) {
            print $buffer;
            $buffer = "";
        }
    } else {
        print $line;
    }
}
And the output :)
Chill:~ rus$ cat test
check=('78905905f5a4ed82160c327f3fd34cba'
'5277a9164001a4276837b59dade26af2'
'3f8b60b6fbb993c18442b62ea661aa6b')
text in between
files=('somefile.txt'
'file2.png'
'another.txt'
'andanother...')
asdasdasd
check=('78905905f5a4ed82160c327f3fd34cba'
'5277a9164001a4276837b59dade26af2'
'3f8b60b6fbb993c18442b62ea661aa6b')
text in between
files=('somefile.txt'
'file2.png'
'another.txt'
'andanother...')
asdsd
check=('78905905f5a4ed82160c327f3fd34cba'
'5277a9164001a4276837b59dade26af2'
'3f8b60b6fbb993c18442b62ea661aa6b')
text in between
files=('somefile.txt'
'file2.png'
'another.txt'
'andanother...')
Chill:~ rus$ ./t.pl test
text in between
files=('somefile.txt'
'file2.png'
'another.txt'
'andanother...')
check=('78905905f5a4ed82160c327f3fd34cba'
'5277a9164001a4276837b59dade26af2'
'3f8b60b6fbb993c18442b62ea661aa6b')
asdasdasd
text in between
files=('somefile.txt'
'file2.png'
'another.txt'
'andanother...')
check=('78905905f5a4ed82160c327f3fd34cba'
'5277a9164001a4276837b59dade26af2'
'3f8b60b6fbb993c18442b62ea661aa6b')
asdsd
text in between
files=('somefile.txt'
'file2.png'
'another.txt'
'andanother...')
check=('78905905f5a4ed82160c327f3fd34cba'
'5277a9164001a4276837b59dade26af2'
'3f8b60b6fbb993c18442b62ea661aa6b')
ta da ?! :D
Here's how to do it with sed:
sed -e /^check=(/,/)/{H;d} -e /)/{G;s/\n//} < filename
This assumes that there are no right parentheses after the "files=..." If there are then you'll need more precision:
sed -e /^check=(/,/)/{H;d} -e /^files=(/,/)/{/)/{G;s/\n//}} < filename
EDIT:
Working in bash? All right, try this:
sed -e /^check=(/,/)/H -e /^check=(/,/)/d -e '/)/G;s/\n//' < filename
This seems to work, but it's not clear to me why this variant and not a few other obvious ones. This dance-of-the-special-characters is always a problem with regexs.
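One likely factor in which variants run at all is shell quoting: unquoted (, ), { and ; are interpreted by the shell before sed ever sees them. Below is a sketch of the first command with every expression single-quoted. The function name move_check_after_files is invented here. The same assumptions apply as above: the check block comes before the files block and spans several lines, and the only later line containing a ) is the one closing the files block.

```shell
# Hypothetical helper: move a multi-line check=(...) block so that it follows
# the first later line containing ")" (expected to close the files=(...) block).
move_check_after_files() {
    # 1st expression: append each line of the check block to the hold space and delete it
    # 2nd expression: on a line containing ")", append the hold space and drop the
    #                 empty line that H left at its start
    sed -e '/^check=(/,/)/{H;d;}' -e '/)/{G;s/\n//;}' "$@"
}
```

It could be called as move_check_after_files filename. A check=(...) that opens and closes on a single line would not end the range there, because sed only looks for the end address on the lines after the start line.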
@todd, I seem to have left you in the lurch after providing you the awk solution, haven't I? :)
Here's another method, this time without flags. There are some loose ends (hint: check the patterns p and q, and the output again) that I leave to you to tidy up.
gawk 'BEGIN{
RS="check=[(]"
q="files=(.*\047)" # pattern to replace files= part
p=".*(files=(.*\047)).*" # to get the whole files= part to variable
}
NR>1{
b=gensub(p, "\\1","g",$0) # get the files=part to var b
printf "%s\n\n",b
printf "check=("
gsub(q,"",$0)
print $0
}' file
NB: gensub is specific to gawk so if you have gawk, then that's alright
output
$ more file
check=('5277a9164001a4276837b59dade26af2'
'5277a9164001a4276837b59dade26af2'
'3f8b60b6fbb993c18442b62ea661aa6b')
text in between one
files=('somefile1.txt'
'file1.png'
'another1.txt'
'andanother1...')
asdasdasd blah blah
check=('78905905f5a4ed82160c327f3fd34cba'
'5277a9164001a4276837b59dade26af2'
'3f8b60b6fbb993c18442b62ea661aa6b')
text in between two
files=('somefile2.txt'
'file2.png'
'another2.txt'
'andanother2...')
asdsd blaasdf aslasdfaslj aslfjsldfsa 123e12
check=('78905905fblah blah5a4ed82160c327f3fd34cba'
'5277a9164001a4276837b59dade26af2'
'3f8b60b6fbb993c18442b62ea661aa6b')
text in between
files=('somefile3.txt'
'file3.png'
'another3.txt'
'andanother3...')
$ ./shell.sh
files=('somefile1.txt'
'file1.png'
'another1.txt'
'andanother1...'
check=('5277a9164001a4276837b59dade26af2'
'5277a9164001a4276837b59dade26af2'
'3f8b60b6fbb993c18442b62ea661aa6b')
text in between one
)
asdasdasd blah blah
files=('somefile2.txt'
'file2.png'
'another2.txt'
'andanother2...'
check=('78905905f5a4ed82160c327f3fd34cba'
'5277a9164001a4276837b59dade26af2'
'3f8b60b6fbb993c18442b62ea661aa6b')
text in between two
)
asdsd blaasdf aslasdfaslj aslfjsldfsa 123e12
files=('somefile3.txt'
'file3.png'
'another3.txt'
'andanother3...'
check=('78905905fblah blah5a4ed82160c327f3fd34cba'
'5277a9164001a4276837b59dade26af2'
'3f8b60b6fbb993c18442b62ea661aa6b')
text in between
)
This might work for you:
sed ':a;$!N;/^files=.*\ncheck=/{/.*)$/!ba;s/\([^)]*)\)\(.*\)\(\ncheck=.*\)/\1\3\2/p;d};/^files=.*/ba;P;D' file
