Filtering a Block - Linux

I have multiple blocks matching the pattern below:
<APPLIANCE>
<ID>12233</ID>
<UUID>xxxx-xxxx-xxxx-xxxx-xxxxxxx</UUID>
<NAME>xxxxxxx</NAME>
<STATUS>Offline</STATUS>
</APPLIANCE>
<APPLIANCE>
<ID>12234</ID>
<UUID>xxxx-xxxx-xxxx-xxxx-xxxxxxx</UUID>
<NAME>yyyyy</NAME>
<STATUS>Offline</STATUS>
</APPLIANCE>
I want to extract the block with a particular ID and a particular name.
The output should display, for example:
<ID>12234</ID>
<NAME>yyyyy</NAME>
I want to do this using grep, sed, or awk.
Thanks.

This sed should work for you:
sed -n '/<ID>12234/,/<NAME>/{//p}' file
Within the range, the empty regex // re-uses the most recently matched expression, so only the two boundary lines (the <ID> and <NAME> lines) are printed.
But you'd better use an XML parser such as xmllint or xmlstarlet to parse valid XML files.
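For example, a sketch with xmlstarlet, assuming the <APPLIANCE> blocks are wrapped in a single root element so the file is well-formed XML:
xmlstarlet sel -t -m '//APPLIANCE[ID="12234"]' -c ID -n -c NAME -n file.xml
Here -m selects the matching APPLIANCE element and -c copies its ID and NAME children to the output.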

Related

extracting specific lines containing text and numbers from log file using awk

I haven't used my Linux skills in a while and I'm struggling with extracting certain lines out of a csv log file.
The file is structured as:
code,client_id,local_timestamp,operation_code,error_code,etc
I want to extract only those lines of the file with a specific code and a positive client_id greater than 0.
for example if I have the lines:
message_received,1,134,20,0,xxx
message_ack,0,135,10,1,xxx
message_received,0,140,20,1,xxx
message_sent,1,150,30,0,xxx
I only want to extract the lines having code message_received and client_id > 0, resulting in just the first line:
message_received,1,134,20,0,xxx
I want to use awk somewhat like:
awk '/message_received,[[:digit:]]>0/' mylog.csv, which I know isn't quite correct... but how do I achieve this in a one-liner?
This is probably what you want:
awk -F, '($1=="message_received") && ($2>0)' mylog.csv
(-F, sets the field separator to a comma, and a condition with no action block prints matching lines by default.)
If not, edit your question to clarify.

Using grep for multiple patterns from multiple lines in an output file

I have a data output something like this captured in a file.
List item1
attrib1: someval11
attrib2: someval12
attrib3: someval13
attrib4: someval14
List item2
attrib1: someval21
attrib2: someval12
attrib4: someval24
attrib3: someval23
List item3
attrib1: someval31
attrib2: someval32
attrib3: someval33
attrib4: someval34
I want to extract attrib1, attrib3, attrib4 from the list of data only if "attrib2 is someval12".
Note that attrib3 and attrib4 could be in any order after attrib2.
So far I have tried to use grep with the -A and -B options, but then I need to specify line numbers, and that is a sort of hardcoding which I don't want to do.
grep -B 1 -A 1 -A 2 "attrib2: someval12" | egrep -w "attrib1|attrib3|attrib4"
Can I use any other option of grep which doesn't involve specifying the number of lines before and after the occurrence for this example?
Grep and other tools (like join, sort, uniq) work on the principle "one record per line". It is therefore possible to use a 3-step pipe:
Convert each list item to a single line, using sed.
Do the filtering, using grep.
Convert back to the original format, using sed.
First you need to pick a character that is known not to occur in the input, and use it as separator character. For example, '|'.
Then, find the sed command for step 1, which transforms the input to the format
List item1|attrib1: someval11|attrib2: someval12|attrib3: someval13|attrib4: someval14|
List item2|attrib1: someval21|attrib2: someval12|attrib4: someval24|attrib3: someval23|
List item3|attrib1: someval31|attrib2: someval32|attrib3: someval33|attrib4: someval34|
Now step 2 is easy.
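Step 3 is just step 1 in reverse. Putting it all together, a sketch of the whole pipeline, assuming GNU sed and that '|' never occurs in the data:
sed ':a;N;$!ba;s/\n/|/g' file |      # step 1a: join the whole file into one '|'-separated line
  sed 's/|List /\nList /g' |         # step 1b: start a new line at each list item
  grep 'attrib2: someval12' |        # step 2: keep only the items with the required attrib2
  sed 's/|/\n/g' |                   # step 3: back to one field per line
  grep -E 'attrib1|attrib3|attrib4'  # finally, keep only the wanted attributes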

Bash script key/value pair regardless of bash version

I am writing a curl bash script to test web services. I will have file_1, which contains the URL paths
/path/to/url/1/{dynamic_path}.xml
/path/to/url/2/list.xml?{query_param}
Since the values in between {} are dynamic, I am creating a separate file which will have values for these params. The input would be in key-value pairs, i.e.,
dynamic_path=123
query_param=shipment
By combining two files, the input should become
/path/to/url/1/123.xml
/path/to/url/2/list.xml?shipment
This is the background of my problem. Now my questions:
I am doing it in a bash script. The approach I am using is to first read the parameter file, parse it on '=', and store the result as key/value pairs so it is easy to replace; i.e., for each URL I will find the substring between {} and use that text as the key to fetch the value from the array.
My approach sounds okay (at least to me), BUT I just realized that
declare -A input_map is only supported in bash 4.0 and higher. Now, I am not 100% sure what the target environment for my script will be, since it could run in multiple departments.
Is there anything better you could suggest ? Any other approach ? Any other design ?
P.S:
This is the first time I am working on a bash script.
Here's a risky way to do it, assuming the values are in a file named "values":
. values
eval "$( sed 's/^/echo "/; s/{/${/; s/$/"/' file_1 )"
Basically, stick a dollar sign in front of the braces and transform each line into an echo statement.
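For the sample file_1 this generates the following script, which the eval then executes with the sourced variables expanded:
echo "/path/to/url/1/${dynamic_path}.xml"
echo "/path/to/url/2/list.xml?${query_param}"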
More effort, with awk:
awk '
  NR==FNR { split($0, a, /=/); v[a[1]] = a[2]; next }  # first file: build the key/value map
  (i = index($0, "{")) && (j = index($0, "}")) {       # second file: locate the {...} placeholder
    key = substr($0, i+1, j-i-1)
    print substr($0, 1, i-1) v[key] substr($0, j+1)    # splice the looked-up value into the URL
  }
' values file_1
There are many ways to do this. You seem to be thinking of putting all inputs in a hashmap and then iterating over that hashmap. In shell scripting it's more common and practical to process things as a stream, using pipelines.
For example, your inputs could be in a csv file:
123,shipment
345,order
Then you could process this file like this:
while IFS=, read -r path param; do
  sed -e "s/{dynamic_path}/$path/" -e "s/{query_param}/$param/" file_1
done < input.csv
The output will be:
/path/to/url/1/123.xml
/path/to/url/2/list.xml?shipment
/path/to/url/1/345.xml
/path/to/url/2/list.xml?order
But this is just an example; there are many other ways.
You should definitely start by writing a proof of concept and test it on your deployment server. This example should work in old versions of bash too.
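If you do want direct lookups without declare -A, here is a minimal sketch that also works in old versions of bash, assuming the parameter file is named "values" and that keys never contain '=' or sed metacharacters:
lookup() {
  sed -n "s/^$1=//p" values   # print whatever follows "key=" on the matching line
}
path=$(lookup dynamic_path)    # -> 123
param=$(lookup query_param)    # -> shipment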

Extracting JSON variable using bash

I need to extract a variable from a JSON-encoded file and assign it to a variable in Bash.
excerpt...from file.json
"VariableA": "VariableA data",
"VariableB": [
"VariableB1",
"VariableB2",
"VariableB3",
"VariableB3"
],
I've gotten somewhere with this
variableA=$(fgrep -m 1 "VariableA" file.json )
but it returns the whole line. I just want the data.
For VariableB I need to replace the list with comma-separated values.
I've looked at awk, sed, grep, and regular expressions; given the learning curve, I really need to know which one to use, or whether there is a better solution.
Thanks for your suggestions...but this is perfect
git://github.com/kristopolous/TickTick.git
You are better off using a JSON parser. There are many listed at http://json.org/ including two for the BASH shell.
http://kmkeen.com/jshon/
https://github.com/dominictarr/JSON.sh
There is the powerful command-line JSON tool jq.
Extracting a single value is easy (-r gives raw output, without the JSON quotes):
variableA=$(jq -r .VariableA file.json)
For comma-separated array contents, try this:
variableB=$(jq -r '.VariableB | @csv' file.json)
or
variableB=$(jq '.VariableB | .[]' file.json | tr '\n' ',' | head -c-1)
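On the sample excerpt these would produce (a sketch, assuming file.json is a complete JSON document):
jq -r .VariableA file.json            # -> VariableA data
jq -r '.VariableB | @csv' file.json   # -> "VariableB1","VariableB2","VariableB3","VariableB3"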
If you're open to using Perl, it has an open() function that will pipe a file to the JSON function to_json. And if you want to extract JSON you can use the from_json function. You can check it out here:
http://search.cpan.org/~rjbs/perl-5.16.0/lib/open.pm
http://metacpan.org/pod/JSON#to_json
http://metacpan.org/pod/JSON#from_json (you might also try decode_json as well)
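A minimal sketch of the decode_json route, assuming the JSON module is installed and that file.json is a complete JSON document:
perl -MJSON -0777 -ne '
  my $d = decode_json($_);                      # parse the whole file at once
  print $d->{VariableA}, "\n";                  # -> VariableA data
  print join(",", @{ $d->{VariableB} }), "\n";  # -> comma-separated array contents
' file.json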

Extract the whole .xsl content from a .str file to an xsl/txt file

I am doing some forensics learning, and got a .str file that has an entire .xsl file in it.
I need to extract that whole .xsl file from the .str file. I have used something like:
cat pc1.str | grep "<From>" > talk.txt
The problem is that I get almost all the text, but not in a readable format. I think I am only getting the lines that contain <From>.
Can you help me to get the text from <?xml version="1.0"?> to </log>?
Edit for clarity: I want to get all the text, beginning from the xml declaration through the closing /log.
The .str file is created by strings.
Here is the actual file I am using:
https://www.dropbox.com/s/j02elywhkhpbqvg/pc1.str?dl=0
From line 20893696 to 20919817.
I'd probably use perl:
#!/usr/bin/perl
use strict;
use warnings;

while ( <> ) {
    print if m,<\?xml version, .. m,</log>,;
}
This makes use of the 'range' operator, which returns true while the input is between the two markers. By default it works on records separated by $/, which is a newline. If your data has newlines it's easy, but you can iterate based on bytes instead. (Just bear in mind that you may have to worry about a marker overlapping a boundary.)
E.g.
$/ = \80;
Will read 80 bytes at a time.
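A usage sketch, assuming the script above is saved as extract.pl:
perl extract.pl pc1.str > talk.xml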
If you want all the lines of your .str file from the line that contains <?xml version="1.0"?> to the first line that contains </log>, then this should work:
awk '/<\?xml version="1\.0"\?>/{p=1} p; /<\/log>/{exit}' pc1.str
Match the opening line and set p=1. If p is truthy, print the current line. Match the line with the closing tag and exit.
If you want output without the radix field from the file then something like this should work.
cut -f 2 pc1.str | awk '/<\?xml version="1\.0"\?>/{p=1} p; /<\/log>/{exit}'
This adds cut to trim off the first radix field (awk isn't as good at field ranges).
If you also want to ignore anything before the opening xml marker and after the closing </log> tag, something like this should work (untested):
cut -f 2 pc1.str | awk '/<\?xml version="1\.0"\?>/{p=1; $0=substr($0, index($0, "<?xml"))} {sub(/<\/log>.*/, "</log>")} p; /<\/log>/{exit}'
This uses substr and sub to trim the unwanted parts off those two lines.
