How do I search for a string from/after a certain line in a text file using a bash script?
E.g. I want to search for the first occurrence of the string "version:" not from the start of the file but from a given line, say line 35, which contains the text *-disk:0, so that I get the version of disk:0 only and nothing else.
My current approach is as follows, where line_no is the line number of the line containing disk:0. But sometimes a vendor name is also present between disk:0 and version:, and then this logic fails.
ver_line_no=$(echo $(( line_no + 6 )))
ver_line_text=`sed -n ${ver_line_no}p $1`
check_if_present=`echo $ver_line_text | grep "version:"`
Background:
I am trying to parse the output of the lshw command.
*-disk:0
description: ATA Disk
product: SAMSUNG
physical id: 0
bus info: scsi#z:y:z:0
logical name: /dev/sda
version: abc
serial: pqr
size: 2048GiB
capabilities: axy, pqr
configuration: pqr, abc, ghj
*-disk:1
description: ATA Disk
product: TOSHIBA
physical id: 0
bus info: scsi#p:q:z:0
logical name: /dev/sdb
version: nmh
serial: pqasd
size: 2048GiB
capabilities: axy, pqr
configuration: pqr, abc, ghj
This is the sample information.
I want to print the information in tabular format using a bash script.
You should be able to cut out the block you want with sed, then use grep:
sudo lshw | sed -n '/\*-disk:0/,/\s*\*/p' | grep 'version:'
The sed command suppresses automatic printing (-n), then selects the block between *-disk:0 and the next line containing a * and prints only that (p).
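If you also need the other fields for the table mentioned in the question, the same block-cutting idea extends to them. A minimal sketch, assuming the lshw layout shown above; the disk variable and the chosen field list are mine, not part of the original answer:
disk=0   # which *-disk:N block to inspect
sudo lshw | sed -n "/\*-disk:${disk}/,/\s*\*/p" | grep -E 'product:|version:|serial:'
The trailing grep also filters back out the header line of the next block that the sed range includes.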
sed solution:
If you want to search for the first occurrence after a given line number (e.g. 10):
l=10
lshw | sed -n "${l},$ {/version/{p;q}}"
If you want to search for the first occurrence after a given line's content (e.g. *-disk:0):
lshw | sed -n '/*-disk:0/,${/version/{p;q}}'
You may use this awk that searches for *-disk:0 in a file to toggle a flag to true:
awk -F: '$1 ~ /\*-disk$/{p = ($2 == 0)} p && /^[[:blank:]]*version:/' file
version: abc
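The same idea can be parameterised. A sketch of my own (not from the answer above) that takes the disk number as an awk variable:
awk -F: -v disk=1 '$1 ~ /\*-disk$/ {p = ($2 == disk)} p && /^[[:blank:]]*version:/' file
With disk=1 and the sample from the question this would print version: nmh instead.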
You need to print all lines up to the end. In sed, $ represents the last line.
sed -n '6,$p'
This will print lines from the 6th line to the end. Be aware of quoting, so that $ doesn't get expanded by the shell.
You want:
ver_line_no=$(( line_no + 6 ))
ver_line_text=$(sed -n "${ver_line_no}"',$p' "$1")
check_if_present=$(echo "$ver_line_text" | grep "version:")
Notes:
Backticks (`...`) are deprecated. Use $(...) command substitution instead.
Always quote your variables.
Doing echo $(( ... )) is just repeating yourself. Just use $(( ... )).
Sometimes on systems without sed you can use cut -d $'\n' -f6-.
A general solution using awk (assuming that every disk has a version).
File 'find_disk_version.awk'
/disk:/ {
    disk_found = "true"
    disk_name = $1
}
/version:/ {
    if (disk_found) {
        print disk_name " " $2
        disk_found = ""
    }
}
Test file 'test':
Something_else
version: ver_something_else
serial: blabla
configuration: foo, bar
*-disk:0
description: ATA Disk
version: ver.0
serial: pqr
*-disk:1
version: ver1
serial: pqasd
configuration: pqr, abc, ghj
Something_else_again
version: ver_somethingelse_again
serial: foobar
configuration: bar, foo
*-disk:2
version: ver2
serial: pqasd
configuration: pqr, abc, ghj
Output:
$ cat test | awk -f find_disk_version.awk
*-disk:0 ver.0
*-disk:1 ver1
*-disk:2 ver2
Instead of 'cat test' you can use your command 'lshw'.
P.S. The script will not work if there is a disk without a version line.
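A sketch of one way around that limitation (my own variation, untested against real lshw output): emit a placeholder whenever a disk block ends without a version line.
/disk:/ {
    # previous disk block ended without a version line
    if (disk_found) print disk_name " (no version)"
    disk_found = "true"
    disk_name = $1
}
/version:/ {
    if (disk_found) {
        print disk_name " " $2
        disk_found = ""
    }
}
END {
    # handle a trailing disk block without a version line
    if (disk_found) print disk_name " (no version)"
}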
I have the following multi-line output:
Volume Name: GATE02-SYSTEM
Size: 151934468096
Volume Name: GATE02-S
Size: 2000112582656
Is it possible to convert strings like:
Volume Name: GATE02-SYSTEM
Size: 141.5 Gb
Volume Name: GATE02-S
Size: 1862.75 Gb
with sed/awk? I found answers for number-only output, like:
echo "151934468096" | awk '{ byte =$1 /1024/1024**2 ; print byte " GB" }'
but can't figure out how to apply it to my case.
EDIT: I'd like to clarify my case. I have the following one-liner:
esxcfg-info | grep -A 17 "\\==+Vm FileSystem" | grep -B 15 "[I]s Accessible.*true" | grep -B 4 -A 11 "Head Extent.*ATA.*$" | sed -r 's/[.]{2,}/: /g;s/^ {12}//g;s/\|----//g' | egrep -i "([V]olume Name|^[S]ize:)"
which generates the output above. I want to know what to add after the last command to get the desired output.
Using (GNU) sed and bc:
... | sed 's#Size: \([0-9]*\)#printf "Size: %s" $(echo "scale = 2; \1 / 1024 ^ 3" | bc)GiB#e'
Breaking the substitution down, left to right:
1. Matching "Size:" lines.
2. Capturing the number of bytes (potentially vulnerable if empty or zero; strengthen for production use).
3. Replacing the match with printf output.
4. scale = sets the number of decimals for bc.
5. Using the sed capture group (the number of bytes).
6. The mathematical operation on the byte count.
7. Piping all of that to bc for evaluation.
8. The e flag telling sed to execute the "replace" part of the s statement in a subshell.
Another option is numfmt (where available, since GNU coreutils 8.21):
... | sed 's#Size: \([0-9]*\)#printf "%s" $(echo \1 | numfmt --to=iec-i --suffix=B)#e'
This doesn't give you control over the decimals, but "works" for all sizes (i.e. gives TiB where the number is big enough, and does not truncate to "0.00 GiB" for too-small numbers).
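As a quick standalone check, feeding the first size from the question straight to numfmt should print roughly 142GiB (exact rounding depends on numfmt's defaults):
echo 151934468096 | numfmt --to=iec-i --suffix=B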
printf "Volume Name: GATE02-SYSTEM\nSize: 151934468096\nVolume Name: GATE02-S\nSize: 2000112582656" |
awk '/^Size: / { $2 = $2/1024/1024**2 " Gb" }1'
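With the sample data this should reproduce the desired output from the question:
Volume Name: GATE02-SYSTEM
Size: 141.5 Gb
Volume Name: GATE02-S
Size: 1862.75 Gb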
I'm currently working on a project. For certain reasons I need to exclude from the output (or file) the first line that matches a pattern. The difficulty is that I need to exclude just one line, only the first match in the stream.
For example, if I have:
1 abc
2 qwerty
3 open
4 abc
5 talk
After the script has run I should have this:
2 qwerty
3 open
4 abc
5 talk
NOTE: I don't know anything in advance about the digits before the words, so I can't filter the output based on them.
I've written a small script with grep, but it cuts out every line that matches the pattern:
'some program' | grep -v "abc"
I've read about awk, sed, etc. but couldn't tell whether they can solve my problem.
Any help is appreciated, thank you.
Using awk:
some program | awk '{ if (/abc/ && !seen) { seen = 1 } else print }'
Alternatively, using only filters:
some program | awk '!/abc/ || seen { print } /abc/ && !seen { seen = 1 }'
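Against the sample data from the question, either form should drop only the first abc line:
$ awk '{ if (/abc/ && !seen) { seen = 1 } else print }' file
2 qwerty
3 open
4 abc
5 talk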
You can use the Ex editor. For example, to remove the first match of the pattern from the file:
ex +"/abc/d" -scwq file.txt
From the input (replace cat with your program):
ex +"/abc/d" +%p -scq! <(cat file.txt)
You can also read from stdin by replacing cat with /dev/stdin.
Explanation:
+cmd - execute Ex/Vim command
/pattern/d - find the pattern and delete,
%p - print the current buffer
-s - silent mode
-cq! - quit without saving (!)
<(cmd) - shell process substitution
You can also give sed the line numbers which you want to delete:
sed 1,2d
Instead of 1,2, use the line numbers that you want to delete. Otherwise you can delete by pattern:
sed '/pattern to match/d'
To delete only the first match, here we can use
sed '0,/abc/{//d;}'
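Note that the 0,/regex/ address form is a GNU sed extension. With the sample file it should delete only the first match:
$ sed '0,/abc/{//d;}' file
2 qwerty
3 open
4 abc
5 talk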
You can also use a list of commands { list; } to read the first line and print the rest:
command | { read first_line; cat -; }
Simple example:
$ cat file
1 abc
2 qwerty
3 open
4 abc
5 talk
$ cat file | { read first_line; cat -; }
2 qwerty
3 open
4 abc
5 talk
awk '!/1/' file
2 qwerty
3 open
4 abc
5 talk
That's all!
I have to count all '=' characters between two patterns, i.e. '{' and '}'.
Sample:
{
100="1";
101="2";
102="3";
};
{
104="1,2,3";
};
{
105="1,2,3";
};
Expected Output:
3
1
1
A very cryptic perl answer:
perl -nE 's/\{(.*?)\}/ say ($1 =~ tr{=}{=}) /ge'
The tr function returns the number of characters transliterated.
With the new requirements, we can make a couple of small changes:
perl -0777 -nE 's/\{(.*?)\}/ say ($1 =~ tr{=}{=}) /ges'
-0777 reads the entire file/stream into a single string
the s flag to the s/// function allows . to handle newlines like a plain character.
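For example, with the sample from the question saved in a file, this should print the expected counts:
$ perl -0777 -nE 's/\{(.*?)\}/ say ($1 =~ tr{=}{=}) /ges' file
3
1
1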
Perl to the rescue:
perl -lne '$c = 0; $c += ("$1" =~ tr/=//) while /\{(.*?)\}/g; print $c' < input
-n reads the input line by line
-l adds a newline to each print
/\{(.*?)\}/g is a regular expression. The ? makes the asterisk frugal, i.e. matching the shortest possible string.
The (...) parentheses create a capture group, referred to as $1.
tr is normally used to transliterate (i.e. replace one character by another), but here it just counts the number of equal signs.
+= adds the number to $c.
Awk is here too
grep -o '{[^}]\+}'|awk -v FS='=' '{print NF-1}'
example
echo '{100="1";101="2";102="3";};
{104="1,2,3";};
{105="1,2,3";};'|grep -o '{[^}]\+}'|awk -v FS='=' '{print NF-1}'
output
3
1
1
First some test input (a line with = signs outside the curly brackets as well as inside the content, one line without brackets, and one with only the two brackets):
echo '== {100="1";101="2";102="3=3=3=3";} =;
a=b
{c=d}
{}'
Handle line without brackets (put a dummy char so you will not end up with an empty string)
sed -e 's/^[^{]*$/x/'
Handle line without equal sign (put a dummy char so you will not end up with an empty string)
sed -e 's/{[^=]*}/x/'
Remove stuff outside the brackets
sed -e 's/.*{\(.*\)}/\1/'
Remove stuff inside the double quotes (do not count fields there)
sed -e 's/"[^"]*"//g'
Use #repzero method to count equal signs
awk -F "=" '{print NF-1}'
Combine stuff
echo -e '{100="1";101="2";102="3";};\na=b\n{c=d}\n{}' |
sed -e 's/^[^{]*$/x/' -e 's/{[^=]*}/x/' -e 's/.*{\(.*\)}/\1/' -e 's/"[^"]*"//g' |
awk -F "=" '{print NF-1}'
The ugly temporary x fields and the {} replacement can instead be handled inside awk:
echo -e '= {100="1";101="2=2=2=2";102="3";};\na=b\n{c=d}\n{}' |
sed -e 's/^[^{]*$//' -e 's/.*{\(.*\)}/\1/' -e 's/"[^"]*"//g' |
awk -F "=" '{if (NF>0) c=NF-1; else c=0; print c}'
or shorter
echo -e '= {100="1";101="2=2=2=2";102="3";};\na=b\n{c=d}\n{}' |
sed -e 's/^[^{]*$//' -e 's/.*{\(.*\)}/\1/' -e 's/"[^"]*"//g' |
awk -F "=" '{print (NF>0) ? NF-1 : 0; }'
No harder sed than done ... in.
Restricting this answer to the environment as tagged, namely:
linux shell unix sed wc
will actually not require the use of wc (or awk, perl, or any other app.).
Though echo is used, a file source can easily exclude its use.
As for bash, it is the shell.
The actual environment used is documented at the end.
NB. Exploitation of GNU specific extensions has been used for brevity
but appropriately annotated to make a more generic implementation.
Also brace bracketed { text } will not include braces in the text.
It is implicit that such braces should be present as {} pairs but
the text src. dangling brace does not directly violate this tenet.
This is a foray into the world of `sed`'ng to gain some fluency in its use for other purposes.
The ideas expounded upon here are used to cross pollinate another SO problem solution in order
to acquire more familiarity with vetting vagaries of vernacular version variances. Consequently
this pedantic exercise hopefully helps with the pedagogy of others beyond personal edification.
To test easily, at least in the environment noted below, judiciously highlight the appropriate
code section, carefully excluding a dangling pipe |, and then, at a CLI (command line interface),
drag & drop, copy & paste or use middle click to enter the code.
The other SO problem: linux - Is it possible to do simple arithmetic in sed addresses?
# _______________________________ always needed ________________________________
echo -e '\n
\n = = = {\n } = = = each = is outside the braces
\na\nb\n { } so therefore are not counted
\nc\n { = = = = = = = } while the ones here do count
{\n100="1";\n101="2";\n102="3";\n};
\n {\n104="1,2,3";\n};
a\nb\nc\n {\n105="1,2,3";\n};
{ dangling brace ignored junk = = = \n' |
# _____________ preparatory conditioning needed for final solutions _____________
sed ' s/{/\n{\n/g;
s/}/\n}\n/g; ' | # guarantee but one brace to a line
sed -n '/{/ h; # so sed addressing can "work" here
/{/,/}/ H; # use hHold buffer for only { ... }
/}/ { x; s/[^=]*//g; p } ' | # then make each {} set a line of =
# ____ stop code hi-lite selection in ^--^ here include quote not pipe ____
# ____ outputs the following exclusive of the shell " # " comment quotes _____
#
#
# =======
# ===
# =
# =
# _________________________________________________________________________
# ____________________________ "simple" GNU solution ____________________________
sed -e '/^$/ { s//0/;b }; # handle null data as 0 case: next!
s/=/\n/g; # to easily count an = make it a nl
s/\n$//g; # echo adds an extra nl - delete it
s/.*/echo "&" | sed -n $=/; # sed = command w/ $ counts last nl
e ' # who knew only GNU say you ah phoo
# 0
# 0
# 7
# 3
# 1
# 1
# _________________________________________________________________________
# ________________________ generic incomplete "solution" ________________________
sed -e '/^$/ { s//echo 0/;b }; # handle null data as 0 case: next!
s/=$//g; # echo adds an extra nl - delete it
s/=/\\\\n/g; # to easily count an = make it a nl
s/.*/echo -e & | sed -n $=/; '
# _______________________________________________________________________________
The paradigm used for the algorithm is instigated by the prolegomena study below.
The idea is to isolate groups of = signs between { } braces for counting.
These are found and each group is put on a separate line with ALL other adorning characters removed.
It is noted that sed can easily "count", actually enumerate, nl or \n line ends via =.
The first "solution" uses these sed commands:
print
branch w/o label starts a new cycle
h/Hold for filling this sed buffer
exchange to swap the hold and pattern buffers
= to enumerate the current sed input line
substitute s/.../.../; with global flag s/.../.../g;
and most particularly the GNU specific
evaluate (or execute; cannot remember the actual mnemonic but they are effectively synonymous)
The GNU specific execute command is avoided in the generic code. It does not print the answer but
instead produces code that will print the answer. Run it to observe. To fully automate this, many
mechanisms can be used not the least of which is the sed write command to put these lines in a
shell file to be executed or even embed the output in bash evaluation parentheses $( ) etc.
Note also that various sed example scripts can "count" and these too can be used efficaciously.
The interested reader can entertain these other pursuits.
prolegomena:
concept from counting # of lines between braces
sed -n '/{/=;/}/=;'
to
sed -n '/}/=;/{/=;' |
sed -n 'h;n;G;s/\n/ - /;
2s/^/ Between sets of {} \n the nl # count is\n /;
2!s/^/ /;
p'
testing "done in":
linuxuser@ubuntu:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.2 LTS
Release: 18.04
Codename: bionic
linuxuser@ubuntu:~$ sed --version -----> sed (GNU sed) 4.4
And for giggles an awk-only alternative:
echo '{
> 100="1";
> 101="2";
> 102="3";
> };
> {
> 104="1,2,3";
> };
> {
> 105="1,2,3";
> };' | awk 'BEGIN{RS="\n};";FS="\n"}{c=gsub(/=/,""); if(NF>2){print c}}'
3
1
1
In my Packages file I have multiple packages. I'm able to check whether a string is in the file, and if so, I would like to get the version of that package.
Package: depictiontest
Version: 1.0
Filename: ./debs/com.icr8zy.depictiontest.deb
Size: 810
Description: Do not install. Testing Depiction.
Name: Depiction Test
The above is one of many similar-looking package entries. Each time I detect that the package exists, I would like to get its Version. Is there any possible way?
Btw, this is what I use to check if the file is listed:
if grep -q "$filename" /location/Packages; then
#file exists
#get file version <-- stuck here
else
#file does not exist
fi
EDIT:
Sorry, but maybe I wasn't clear in explaining myself. I already have the Name of the package and would like to extract the Version of that package only. I do not need a loop to get all the Names and Versions. Hope this clears it up... :)
How do you extract the file name in the first place? Why not parse the whole file, then filter out nonexistent file names?
awk '/^Package:/ {p=$2}
     /^Version:/ {v=$2}
     /^Filename:/ {f=$2}
     /^$/ {print p, v, f}' Packages |
while read p v f; do
    test -e "$f" || continue
    echo "$p $v"
done
This is not robust with e.g. file names with spaces, but Packages files don't have file names with spaces anyway. (Your example filename is nonstandard, though; let's assume it's no worse than this.)
You want to make sure there's an empty line at the end of Packages, or force it with { sed '${/^$/d}' Packages; echo; } | awk ...
Edit: This assumes a fairly well-formed Packages file, with an empty line between records. If a record lacks one of these fields, the output will repeat the value from the previous record - that's nasty. If there are multiple adjacent empty lines, it will output the same package twice. Etc. If you want robust parsing, I'd switch to Perl or Python, or use a standard Debian tool (I'm sure there must be one).
With grep, you can pick a certain number of lines before or after a keyword.
egrep -A1 "^Package: depictiontest" /path/to/file
would yield 1 additional line after the match.
egrep -B1 "^Filename: .*depictiontest.*" /path/to/file
would yield 1 additional line before the match.
egrep "^(Package|Version): " /path/to/file
would yield Package and Version lines only; this relies on them appearing in a consistent order to easily find out which version belongs to which package.
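Combining the two ideas, a sketch that pulls just the version for one package (assuming, as in the sample, that the Version line follows within a few lines of the Package line; the -A 5 window is my guess, not a rule):
grep -A 5 "^Package: depictiontest" /location/Packages | grep -m 1 "^Version:" | cut -d' ' -f2
For the sample entry this should print 1.0.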
If the order is the same throughout, then you can parse the whole file and feed the values into an array -
awk -F": " '
/^Package/{p=$2;getline;v=$2;getline;f=$2;ary[p"\n"v"\n"f"\n"]}
END{for (x in ary) print x}' file
Test:
[jaypal:~/Temp] cat file
Package: depictiontest
Version: 1.0
Filename: ./debs/com.icr8zy.depictiontest.deb
Size: 810
Description: Do not install. Testing Depiction.
Name: Depiction2fdf Test
Package: depi444ctiontest
Version: 1.05
Filename: ./debs/coffm.icr8zy.depictiontest.deb
Size: 810
Description: Do not install. Testing Depiction.
Name: Depiction Test
Package: depic33tiontest
Version: 1.01
Filename: ./d3ebs/com.icr8zy.depictiontest.deb
Size: 810
Description: Do not install. Testing Depiction.
Name: Depiction Test
[jaypal:~/Temp] awk -F": " '/^Package/{p=$2;getline;v=$2;getline;f=$2;ary[p"\n"v"\n"f"\n"]}END{for (x in ary) print x}' file
depi444ctiontest
1.05
./debs/coffm.icr8zy.depictiontest.deb
depic33tiontest
1.01
./d3ebs/com.icr8zy.depictiontest.deb
depictiontest
1.0
./debs/com.icr8zy.depictiontest.deb
If "Version: ..." line is always exactly 1 line before the "Filename: ..." line, then you may try something like this:
line_number=$(grep -n "$filename" /location/Packages | head -1 | cut -d: -f1)
if (( $line_number > 0 )); then
#file exists
version=$(head -n $(( $line_number - 1 )) /location/Packages | tail -1 | cut -d' ' -f2)
else
#file doesn't exist
fi
The simplest awk implementation that I can think of:
$ awk -F':' -v package='depictiontest' '
$1 == "Package" {
trimmed_package_name = gensub(/^ */, "", "g", $2)
found_package = (trimmed_package_name == package)
}
found_package && $1 == "Version" {
trimmed_version_number = gensub(/^ */, "", "g", $2)
print trimmed_version_number
}
' Packages
1.0
This processes the file (Packages) line by line and sets the found_package flag if the line starts with 'Package' and the value after the field separator (-F), which is : (with leading whitespace trimmed), equals the value passed in via the package variable (-v). Then, if the flag is set and we find a line beginning with a 'Version' field, we print the value after the field separator (trimming leading whitespace). If another 'Package' field is found and the name is not the one we are looking for, the flag is reset and the subsequent version number will not be printed.
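Note that gensub() is specific to GNU awk. A portable sketch of the same logic using sub() instead (my rewrite, not part of the answer above):
awk -F':' -v package='depictiontest' '
    $1 == "Package" {
        name = $2; sub(/^ */, "", name)
        found_package = (name == package)
    }
    found_package && $1 == "Version" {
        version = $2; sub(/^ */, "", version)
        print version
    }
' Packages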
I need to get a row based on a column value, just like querying a database. I have command output like this:
Name ID Mem VCPUs State
Time(s)
Domain-0 0 15485 16 r-----
1779042.1
prime95-01 512 1
-b---- 61.9
Here I need to list only those rows where state is "r". Something like this,
Domain-0 0 15485 16
r----- 1779042.1
I have tried using "grep" and "awk" but have not been able to get it to work.
Any help is much appreciated.
Regards,
Raaj
There is a variety of tools available for filtering.
If you only want lines with "r-----", grep is more than enough:
command | grep "r-----"
Or
cat filename | grep "r-----"
grep can handle this for you:
yourcommand | grep -- 'r-----'
It's often useful to save the (full) output to a file to analyse later. For this I use tee.
yourcommand | tee somefile | grep 'r-----'
If you want to find the line containing "-b----" a little later on without re-running yourcommand, you can just use:
grep -- '-b----' somefile
No need for cat here!
I recommend putting -- after your call to grep since your patterns contain minus signs; if a minus sign were at the beginning of the pattern, it would look like an option argument to grep rather than part of the pattern.
try:
awk '$5 ~ /^r.*/ { print }'
Like this:
cat file | awk '$5 ~ /^r.*/ { print }'
grep solution:
command | grep -E "^([^ ]+ ){4}r"
What this does (-E switches on extended regexp):
The first caret (^) matches the beginning of the line.
[^ ] matches exactly one occurrence of a non-space character; the following modifier (+) allows it to also match more occurrences.
Grouped together with the trailing space in ([^ ]+ ), it matches any sequence of non-space characters followed by a single space. The modifier {4} requires this construct to be matched exactly four times.
The single "r" is then the literal character you are searching for.
In plain words this could be written as: "If the line starts (^) with four strings that are each followed by a space (([^ ]+ ){4}) and the next character is r, then the line matches."
A very good introduction into regular expressions has been written by Jan Goyvaerts (http://www.regular-expressions.info/quickstart.html).
Filtering with an awk command on Linux:
First find the matching row with this command and store it in file2:
awk '/Domain-0 0 15485 /' file1 >file2
Output:
Domain-0 0 15485 16
r----- 1779042.1
After that, run this awk command on file2:
awk '{print $1,$2,$3,$4,"\n",$5,$6}' file2
Final Output:
Domain-0 0 15485 16
r----- 1779042.1
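For what it's worth, the two steps can also be combined into a single awk call without the temporary file2 (a sketch using the same pattern and print action as above):
awk '/Domain-0 0 15485 / {print $1,$2,$3,$4,"\n",$5,$6}' file1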