sort output of describe-instances? - aws-cli

I saw the previous question on this topic, but the answer was just "pipe it to a scripting language!", which I find unsatisfying. I know that JMESPath has sort_by, and sort, but I can't figure out how to use them.
I have
aws ec2 describe-instances \
--filters "Name=tag:Group,Values=production" "Name=instance-state-name,Values=running" "Name=tag:Name,Values=prod-*-${CURRENT_SHA}-*" \
--query 'Reservations[*].Instances[*].[LaunchTime,InstanceId,PrivateIpAddress,Tags[?Key==`Name`] | [0].Value]' \
--output table
And it outputs the right data, just in a random order. I want to sort by the last column of the data, Tag Name, aka Tags[?Key==`Name`], which in raw form looks like this:
{
"Tags": [{
"Value": "application-server-ab3634b34364a-2",
"Key": "Name"
}, {
"Value": "production",
"Key": "Group"
}]
}
Thoughts?

short answer
add
[] | sort_by(@, &[3])
at the end of your expression. The brackets ([]) flatten the structure, and sort_by(...) sorts the result (a four-column table) by the fourth column. The full query will be:
--query 'Reservations[*].Instances[*].[LaunchTime,InstanceId,PrivateIpAddress,Tags[?Key==`Name`] | [0].Value][] | sort_by(@, &[3])'
long answer
inspecting your current query result
According to the describe-instances docs, the structure of the describe-instances output looks like this:
{
"Reservations": [
{
"Instances": [
{
"LaunchTime": "..LaunchTime..",
"InstanceId": "R1I1",
"PrivateIpAddress": "..PrivateIpAddress..",
"Tags": [{"Key": "Name", "Value": "foo"}]
},
{
"LaunchTime": "..LaunchTime..",
"InstanceId": "R1I2",
"PrivateIpAddress": "..PrivateIpAddress..",
"Tags": [{"Key": "Name", "Value": "baz"}]
}
]
},
{
"Instances": [
{
"LaunchTime": "..LaunchTime..",
"InstanceId": "R2I1",
"PrivateIpAddress": "..PrivateIpAddress..",
"Tags": [{"Key": "Name", "Value": "bar"}]
}
]
}
]
}
Using your original query
--query 'Reservations[*].Instances[*].[LaunchTime,InstanceId,PrivateIpAddress,Tags[?Key==`Name`] | [0].Value]'
will output
[
[
[
"..LaunchTime..",
"R1I1",
"..PrivateIpAddress..",
"foo"
],
[
"..LaunchTime..",
"R1I2",
"..PrivateIpAddress..",
"baz"
]
],
[
[
"..LaunchTime..",
"R2I1",
"..PrivateIpAddress..",
"bar"
]
]
]
flattening the query result
You can see in the above result of your query that you're getting a list of tables ([[{},{}],[{}]]). I suppose you instead want a single non-nested table ([{},{},{}]). To achieve that, simply add [] at the end of your query, i.e.
--query 'Reservations[*].Instances[*].[LaunchTime,InstanceId,PrivateIpAddress,Tags[?Key==`Name`] | [0].Value][]'
This will flatten the structure, resulting in
[
[
"..LaunchTime..",
"R1I1",
"..PrivateIpAddress..",
"foo"
],
[
"..LaunchTime..",
"R1I2",
"..PrivateIpAddress..",
"baz"
],
[
"..LaunchTime..",
"R2I1",
"..PrivateIpAddress..",
"bar"
]
]
Now it's time to sort the table.
sorting the table
When using sort_by, don't forget to prefix the expression with & (ampersand). This creates a reference to that expression, which is then passed to sort_by.
example: data | sort_by(@, &@) is equivalent to data | sort(@).
The TagName in the table you create ([LaunchTime,InstanceId,PrivateIpAddress,TagName]) is the fourth column. You can get that column by piping the table to the expression [3]:
TableExpression | [3]
But instead, you want to sort the table by the fourth column. You can do so like this:
TableExpression | sort_by(@, &[3])
and the resulting query will be:
--query 'Reservations[*].Instances[*].[LaunchTime,InstanceId,PrivateIpAddress,Tags[?Key==`Name`] | [0].Value][] | sort_by(@, &[3])'
Query result:
[
[
"..LaunchTime..",
"R2I1",
"..PrivateIpAddress..",
"bar"
],
[
"..LaunchTime..",
"R1I2",
"..PrivateIpAddress..",
"baz"
],
[
"..LaunchTime..",
"R1I1",
"..PrivateIpAddress..",
"foo"
]
]
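To sanity-check the expected ordering without calling AWS, here is a minimal Python sketch that mimics the two steps on the sample data above. It is not JMESPath: the names nested, flat, and rows are made up for illustration, and the plain-Python operations only approximate what the trailing [] and sort_by(@, &[3]) do.

```python
from itertools import chain

# Nested result of the original query: one inner list per reservation.
nested = [
    [
        ["..LaunchTime..", "R1I1", "..PrivateIpAddress..", "foo"],
        ["..LaunchTime..", "R1I2", "..PrivateIpAddress..", "baz"],
    ],
    [
        ["..LaunchTime..", "R2I1", "..PrivateIpAddress..", "bar"],
    ],
]

# Rough equivalent of the trailing []: merge the per-reservation lists
# into a single table of rows.
flat = list(chain.from_iterable(nested))

# Rough equivalent of sort_by(@, &[3]): order the rows by the fourth column.
rows = sorted(flat, key=lambda row: row[3])

for row in rows:
    print(row[1], row[3])
```

The rows come out ordered by tag name (bar, baz, foo), matching the query result shown above.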

As an enhancement to @ColinK's answer, I wanted to sort a table that had custom column headers but struggled with the syntax. I eventually got it to work, so I thought I'd share in case someone else wants to do the same. I added a column for State and sorted by that column.
--query 'sort_by(Reservations[*].Instances[*].{LaunchTime:LaunchTime, ID:InstanceId,IP:PrivateIpAddress,State:State.Name,Name:Tags[?Key==`Name`] | [0].Value}[], &State)'

Here is another example that also works:
aws ec2 describe-instances --query 'Reservations[*].Instances[*].{Name:Tags[?Key==`Name`]|[0].Value,Instance:InstanceId} | sort_by(@, &[0].Name)'

The answer is to add | sort_by(@, &@[0][3])
aws ec2 describe-instances \
--filters "Name=tag:Group,Values=production" "Name=instance-state-name,Values=running" "Name=tag:Name,Values=prod-*-${CURRENT_SHA}-*" \
--query 'Reservations[*].Instances[*].[LaunchTime,InstanceId,PrivateIpAddress,Tags[?Key==`Name`] | [0].Value] | sort_by(@, &@[0][3])' \
--output table

Related

BASH How to treat JSON inside a variable to remove only a specific part of the text

I have a large JSON document in a variable, and I need to remove everything from one specific comma (there are countless other commas before it) up to the penultimate closing curly bracket.
In short, only the bold text (the text between ** and the next **).
Edit: the original JSON contains no **; I added the markers only to show where the part I want to remove starts and ends.
##################################################
}
]
}**,
"meta": {
"timeout": 0,
"priority": "LOW_PRIORITY",
"validationType": "SAME_FINGERS",
"labelFilters": [],
"externalIDs": [
{
"name": "chaveProcesso",
"key": "01025.2021.0002170"
}
]
}**
}
It would help if you showed more context, but basically you want something like:
jq 'del(.meta)'
or:
jq 'with_entries(select(.key != "meta"))'
eg:
#!/bin/sh
json='{
"foo": 5,
"meta": {
"timeout": 0,
"priority": "LOW_PRIORITY",
"validationType": "SAME_FINGERS",
"labelFilters": [],
"externalIDs": [
{
"name": "chaveProcesso",
"key": "01025.2021.0002170"
}
]
}
}'
echo "$json" | jq 'del(.meta)'
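If jq is not available, the same idea can be sketched with Python's standard json module. This is only an illustration under the same assumption as the jq answer (that "meta" is a top-level key); the shortened sample document and the names doc and result are made up here.

```python
import json

# Shortened stand-in for the JSON held in the shell variable.
raw = '{"foo": 5, "meta": {"timeout": 0, "priority": "LOW_PRIORITY"}}'

doc = json.loads(raw)

# Counterpart of jq 'del(.meta)': drop the top-level "meta" key if present.
doc.pop("meta", None)

result = json.dumps(doc)
print(result)
```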

Replacing a value for a given key in Kusto

I am trying to use the .set-or-replace command to amend the "subject" entry below from sample/consumption/backups to sample/consumption/backup, but I am not having much luck in the world of Kusto.
I can't seem to reference the nested fields within Records, such as data.
"source_": CustomEventRawRecords,
"Records": [
{
"metadataVersion": "1",
"dataVersion": "",
"eventType": "consumptionRecorded",
"eventTime": "1970-01-01T00:00:00.0000000Z",
"subject": "sample/consumption/backups",
"topic": "/subscriptions/1234567890id/resourceGroups/rg/providers/Microsoft.EventGrid/topics/webhook",
"data": {
"resourceId": "/subscriptions/1234567890id/resourceGroups/RG/providers/Microsoft.Compute/virtualMachines/vm"
},
"id": "1234567890id"
}
],
The command I've tried to get to work:
.set-or-replace [async] CustomEventRawRecords [with (subject = sample/consumption/backup [, ...])] <| QueryOrCommand
If you're already manipulating the data, why not turn it into a columnar representation? That way you can easily make the corrections you want, and you also get the full richness of the tabular operators plus an IntelliSense experience that helps you formulate queries.
Here's an example query that does that:
datatable (x: dynamic)[dynamic({"source_": "CustomEventRawRecords",
"Records": [
{
"metadataVersion": "1",
"dataVersion": "",
"eventType": "consumptionRecorded",
"eventTime": "1970-01-01T00:00:00.0000000Z",
"subject": "sample/consumption/backups",
"topic": "/subscriptions/1234567890id/resourceGroups/rg/providers/Microsoft.EventGrid/topics/webhook",
"data": {
"resourceId": "/subscriptions/1234567890id/resourceGroups/RG/providers/Microsoft.Compute/virtualMachines/vm"
},
"id": "1234567890id"
}
]})]
| extend records = x.Records
| mv-expand record=records
| extend subject = tostring(record.subject)
| extend subject = iff(subject == "sample/consumption/backups", "sample/consumption/backup", subject)
| extend metadataVersion = tostring(record.metadataVersion)
| extend dataVersion = tostring(record.dataVersion)
| extend eventType = tostring(record.eventType)
| extend topic= tostring(record.topic)
| extend data = record.data
| extend id = tostring(record.id)
| project-away x, records, record

jq parsing and linux formatting to desired output

I am trying to format json output and exclude an element when a condition is met.
1) In this case I'd like to exclude any element that contains "valueFrom" using jq
[{
"name": "var1",
"value": "var1value"
},
{
"name": "var2",
"value": "var2value"
},
{
"name": "var3",
"value": "var3value"
},
{
"name": "var4",
"value": "var4value"
},
{ # <<< exclude this element as valueFrom exists
"name": "var5",
"valueFrom": {
"secretKeyRef": {
"key": "var5",
"name": "var5value"
}
}
}
]
After excluding the element mentioned above I am trying to return a result set that looks like this.
var1: var1value
var2: var2value
var3: var3value
var4: var4value
Any feedback is appreciated. Thanks.
Select the array items that don't have the valueFrom key using a combination of select/1, has/1, and not/0. Then format the objects as you please.
$ jq -r '.[] | select(has("valueFrom") | not) | "\(.name): \(.value)"' input.json
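For comparison, the same filter-then-format logic can be sketched in plain Python. The input is a shortened copy of the question's sample (without the inline # comment, which is not valid JSON); the names items and lines are illustrative only.

```python
import json

# Shortened sample input from the question.
raw = """
[
  {"name": "var1", "value": "var1value"},
  {"name": "var2", "value": "var2value"},
  {"name": "var5", "valueFrom": {"secretKeyRef": {"key": "var5", "name": "var5value"}}}
]
"""

items = json.loads(raw)

# Counterpart of select(has("valueFrom") | not): keep objects without the key,
# then format each one as "name: value".
lines = [f"{item['name']}: {item['value']}" for item in items if "valueFrom" not in item]

print("\n".join(lines))
```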

jmespath sort whole string as one

I'm trying to sort a numerical field, but it seems to compare character by character, so '9' is 'higher' than '11' but lower than '91'.
Is there a way to sort by the whole string?
Example data:
{
"testing": [
{"name": "01"},
{"name": "3"},
{"name": "9"},
{"name": "91"},
{"name": "11"},
{"name": "2"}
]
}
Query:
reverse(sort_by(testing, &name))[*].[name]
result:
[
[
"91"
],
[
"9"
],
[
"3"
],
[
"2"
],
[
"11"
],
[
"01"
]
]
This can be tried at http://jmespath.org/
edit:
So I can get the correct output by piping it to sort -V but is there not an easier way?
Context
Jmespath latest version as of 2020-09-12
Use-case
DevBobDylan wants to sort string items in a JSON ArrayOfObject table
Solution
use Jmespath pipe operator to chain expressions together
Example
testing|[*].name|sort(@)
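Note that sorting the names as strings still compares them character by character. The difference from a numeric sort can be sketched in Python as below; the variable names are illustrative. Within JMESPath itself, the built-in to_number function is the likely counterpart, for example sort_by(testing, &to_number(name)), though that expression is untested here.

```python
data = {
    "testing": [
        {"name": "01"},
        {"name": "3"},
        {"name": "9"},
        {"name": "91"},
        {"name": "11"},
        {"name": "2"},
    ]
}

names = [item["name"] for item in data["testing"]]

# String comparison, as in the question: "9" sorts above "11".
lexicographic = sorted(names, reverse=True)

# Numeric comparison: convert each string to an int only for the sort key,
# so the original strings (including "01") are preserved in the output.
numeric = sorted(names, key=int, reverse=True)

print(lexicographic)
print(numeric)
```

The first list reproduces the order from the question (91, 9, 3, 2, 11, 01); the second gives 91, 11, 9, 3, 2, 01.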

Removing pattern from multiple lines using sed or awk in two places in the same line

I have a JSON file with 12,166,466 lines.
I want to remove the quotes from the values of two keys, so that
"timestamp": "1538564256", and "score": "10", become
"timestamp": 1538564256, and "score": 10,
Input:
{
"title": "DNS domain", ,
"timestamp": "1538564256",
"domain": {
"dns": [
"www.google.com"
]
},
"score": "10",
"link": "www.bit.ky/sdasd/asddsa"
"id": "c-1eOWYB9XD0VZRJuWL6"
}, {
"title": "DNS domain",
"timestamp": "1538564256",
"domain": {
"dns": [
"google.de"
]
},
"score": "10",
"link": "www.bit.ky/sdasd/asddsa",
"id": "du1eOWYB9XD0VZRJuWL6"
}
}
Expected output:
{
"title": "DNS domain", ,
"timestamp": 1538564256,
"domain": {
"dns": [
"www.google.com"
]
},
"score": 10,
"link": "www.bit.ky/sdasd/asddsa"
"id": "c-1eOWYB9XD0VZRJuWL6"
}, {
"title": "DNS domain",
"timestamp": 1538564256,
"domain": {
"dns": [
"google.de"
]
},
**"score": 10,**
"link": "www.bit.ky/sdasd/asddsa",
"id": "du1eOWYB9XD0VZRJuWL6"
}
}
I have tried:
sed -E '
s/"timestamp": "/"timestamp": /g
s/"score": "/"score": /g
'
The first part is quite straightforward, but how do I remove the closing quote before the comma at the end of the lines that contain "timestamp" and "score"? How can I do that with sed, awk, or another tool, keeping in mind that I have 12 million lines to process?
Assuming that you fix your JSON input file like this:
<file jq .
[
{
"title": "DNS domain",
"timestamp": "1538564256",
"domain": {
"dns": [
"www.google.com"
]
},
"score": "10",
"link": "www.bit.ky/sdasd/asddsa",
"id": "c-1eOWYB9XD0VZRJuWL6"
},
{
"title": "DNS domain",
"timestamp": "1538564256",
"domain": {
"dns": [
"google.de"
]
},
"score": "10",
"link": "www.bit.ky/sdasd/asddsa",
"id": "du1eOWYB9XD0VZRJuWL6"
}
]
You can use jq and its tonumber function to convert the wanted strings to numbers:
<file jq '.[].timestamp |= tonumber | .[].score |= tonumber'
If the JSON structure matches roughly your example (e. g., there won't be any other whitespace characters between "timestamp", the colon, and the value), then this awk should be ok. If available, using jq for JSON transformation is the better choice by far!
awk '{print gensub(/("(timestamp|score)": )"([0-9]+)"/, "\\1\\3", "g")}' file
Be warned that tonumber can lose precision. If using tonumber is inadmissible, and if the output is produced by jq (or otherwise linearized vertically), then using awk as proposed elsewhere on this page is a good way to go. (If your awk does not have gensub, then the awk program can be easily adapted.) Here is the same thing using sed, assuming its flag for extended regex processing is -E:
sed -E -e 's/"(timestamp|score)": "([0-9]+)"/"\1": \2/'
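Both points can be illustrated with a short Python sketch. The first half applies the same regular expression as the sed command to one sample line; the second half shows the kind of precision loss meant above, assuming the number is stored as an IEEE 754 double (which is what Python's float is). The 20-digit value is made up for the demonstration.

```python
import re

# Same pattern as the sed command: drop the quotes around an all-digit value
# of the "timestamp" or "score" key, leaving everything else untouched.
pattern = re.compile(r'"(timestamp|score)": "([0-9]+)"')

line = '"timestamp": "1538564256",'
unquoted = pattern.sub(r'"\1": \2', line)
print(unquoted)

# Precision loss: a double cannot represent every large integer exactly,
# so a round trip through float changes this (made-up) 20-digit value.
big = "12345678901234567891"
exact = int(big)
via_double = int(float(big))
print(exact == via_double)
```

A text-level rewrite such as the sed or awk command never parses the digits, so it cannot alter them.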
For reference, if there's any doubt about where the relevant keys are located, here's a filter in jq that is agnostic about that:
walk(if type == "object"
then if has("timestamp") then .timestamp|=tonumber else . end
| if has("score") then .score|=tonumber else . end
else . end)
If your jq does not have walk/1, then simply snarf its def from the web, e.g. from https://raw.githubusercontent.com/stedolan/jq/master/src/builtin.jq
If you wanted to convert all number-valued strings to numbers, you could write:
walk(if type=="object" then map_values(tonumber? // .) else . end)
This might work for you (GNU sed):
sed ':a;/"timestamp":\s*"1538564256",/{s/"//3g;:b;n;/timestamp/ba;/"score":\s*"10"/s/"//3g;Tb}' file
On encountering a line that contains "timestamp": "1538564256", remove the third and any later double quotes. Then read on until either another line containing timestamp appears (and repeat), or a line containing "score": "10" appears, and remove the third and any later double quotes there.
