JMESPath: sort whole string as one - aws-cli

I'm trying to sort a numerical field; however, it seems to compare each character in turn, so 9 is 'higher' than 11 but lower than 91.
Is there a way to sort by the whole string?
Example data:
{
  "testing": [
    {"name": "01"},
    {"name": "3"},
    {"name": "9"},
    {"name": "91"},
    {"name": "11"},
    {"name": "2"}
  ]
}
Query:
reverse(sort_by(testing, &name))[*].[name]
result:
[
  [
    "91"
  ],
  [
    "9"
  ],
  [
    "3"
  ],
  [
    "2"
  ],
  [
    "11"
  ],
  [
    "01"
  ]
]
This can be tried at http://jmespath.org/
edit:
So I can get the correct output by piping it to sort -V, but is there not an easier way?
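For reference, the behaviour is easy to reproduce with the Python jmespath package (a sketch; the package is jmespath on PyPI):

import jmespath

data = {"testing": [{"name": "01"}, {"name": "3"}, {"name": "9"},
                    {"name": "91"}, {"name": "11"}, {"name": "2"}]}
# sort_by compares the strings character by character, hence "9" > "11"
print(jmespath.search("reverse(sort_by(testing, &name))[*].name", data))
# ['91', '9', '3', '2', '11', '01']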

Context
JMESPath, latest version as of 2020-09-12
Use-case
DevBobDylan wants to sort string items in a JSON ArrayOfObject table
Solution
Use the JMESPath pipe operator to chain expressions together.
Example
testing|[*].name|sort(@)
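Note that sort() still compares strings lexicographically. To order the values numerically, convert them inside sort_by with the built-in to_number function, e.g.:

reverse(sort_by(testing, &to_number(name)))[*].name

With the example data this returns ["91", "11", "9", "3", "2", "01"].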
Related

Python: how to check whether a dictionary value is a list and convert it to a string?

I have the dictionary below, in which the values for some of the keys appear as lists.
I need to find the values that are of list type and convert them to strings.
details = {
    "dsply_nm": ["1981 test test"],
    "eff_dt": ["2021-04-21T00:01:00-04:00"],
    "exp_dt": ["2022-04-21T00:01:00-04:00"],
    "pwr": ["14"],
    "is_excl": ["true"],
    "is_incl": ["false"],
    "len": ["14"],
    "max_spd": ["14"],
    "id": "58",
    "type_nm": "single",
    "updt_dttm": "2021-04-21T13:40:11.148-04:00",
    "nbr": "3",
    "typ": "premium",
    "nm": "test"
}
The code below does not seem to work; it does not return the values as strings:
test = {}
for k, v in details.items():
    if isinstance(v, list):
        test[k] = "".join(str(v))
    test[k] = v
print(test)
The expected output is:
{
    "dsply_nm": "1981 test test",
    "eff_dt": "2021-04-21T00:01:00-04:00",
    "exp_dt": "2022-04-21T00:01:00-04:00",
    "pwr": "14",
    "is_excl": "true",
    "is_incl": "false",
    "len": "14",
    "max_spd": "14",
    "id": "58",
    "updt_dttm": "2021-04-21T13:40:11.148-04:00",
    "nbr": "3",
    "typ": "premium",
    "nm": "test"
}
Can anybody help with this?
Your code is almost correct: you forgot the else clause, and you should not wrap v in str() before joining.
This code works fine:
test = {}
for k, v in details.items():
    if isinstance(v, list):
        test[k] = "".join(v)
    else:
        test[k] = v
print(test)
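The same conversion can also be written as a dict comprehension (a one-line sketch of the identical logic):

test = {k: "".join(v) if isinstance(v, list) else v for k, v in details.items()}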

Using a grok pattern to skip part of a message

How can I extract just the sessionid number from a pattern in grok?
for example
"sessionid$:999"
I am trying to use %{DATA:line}, but it captures:
"line": [
[
" Sessionid$:999"
]
How can I get just the session number and ignore the "Sessionid$" prefix?
Thanks
Try this:
GROK pattern:
Sessionid.\s*:%{NUMBER:line}
OUTPUT:
{
  "line": [
    [
      "999"
    ]
  ],
  "BASE10NUM": [
    [
      "999"
    ]
  ]
}
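For reference, the same extraction as a plain regular expression (a Python sketch; grok's %{NUMBER} corresponds roughly to the \d+ group here):

import re

line = "Sessionid$:999"
# Mirrors the grok pattern Sessionid.\s*:%{NUMBER:line}; the '.' matches the '$'
m = re.search(r"Sessionid.\s*:(?P<line>\d+)", line)
if m:
    print(m.group("line"))  # 999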

How to search for a particular word in a non-iterable JSON object

I'm working on a resume parser. Now I am trying to make the code dynamic. So let's say there are two resumes in a folder; I parse them and return the required information from my API in JSON format, as below.
[
  {
    "dob": [],
    "education": [
      ["MCA", "2009"]
    ],
    "email": "Id:abcd#gmail.com",
    "mobile_number": "+911234567",
    "name": "abcd",
    "personal_website": [
      "abcd#gmail.com",
      "http://www.arthavidhya.com",
      "http://www.myhealthwatcher.com",
      "http://www.i-southernworld.com/",
      "http://www.i-chiletravel.com/"
    ],
    "skills": [
      "Bootstrap", "Javascript", "Js", "Jquery", "Interactive",
      "My-sql", "Css", "Ajax", "Apache", "Php", "Codignator"
    ]
  },
  {
    "dob": [],
    "education": [
      ["Btech", "2018"]
    ],
    "email": "xyz#gmail.com",
    "mobile_number": "+91987654321",
    "name": "xyz",
    "personal_website": [
      "xyz#gmail.com",
      "https://github.com/xyz"
    ],
    "skills": [
      "Sqlite", "Tensorflow", "C++", "C", "Php", "Android", "Mysql",
      "Flask", "Nlp", "Javascript", "Css", "Keras", "Machine learning",
      "Python"
    ]
  }
]
Now let's say I pass 'Keras' as input; then only the second file should be displayed. What logic can be used here? The difficulty I'm facing is that my output is not iterable. Can someone give me a hint on how to tackle this?
Please comment if my question is not clear or requires more details.
I think you can do something like this
import json

def function1(json_object, name):
    match = []
    for record in json_object:
        for key, value in record.items():
            if (isinstance(value, str) and value == name) or (isinstance(value, list) and name in value):
                match.append(record)
                break  # stop after the first matching field so a resume is added only once

    return match

def main():
    # DATA is the parsed list of resumes shown above
    print(json.dumps(function1(DATA, "Keras"), indent=2))
Basically we check whether each value in the record is a string or a list and try to match the name against the string or the list items.
It's up to you to make it case-insensitive or make any other change you want.
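For example, a case-insensitive variant of the same check (a sketch; it lowercases both sides before comparing):

def function1_ci(json_object, name):
    wanted = name.lower()
    match = []
    for record in json_object:
        for value in record.values():
            found = (isinstance(value, str) and value.lower() == wanted) or \
                    (isinstance(value, list) and any(isinstance(s, str) and s.lower() == wanted for s in value))
            if found:
                match.append(record)
                break  # one append per resume is enough
    return match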
Your output is very much iterable, because it is a list.
What you can do is loop through each resume and check whether the required skill exists in the skills part of the resume; if it does, add that resume to the list of resumes with that skill. Finally, return the list as JSON. I hope the code snippet below makes it clearer!
import json

# resumes is the JSON data you have; skill is a string of the skill you're looking for.
def match_skill(resumes, skill):
    resumes_with_skill = []
    for resume in resumes:
        if skill in resume['skills']:
            resumes_with_skill.append(resume)
    return json.dumps(resumes_with_skill)

match_skill(data, 'Keras')
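A small usage note (a sketch; data is the parsed resume list from the question): match_skill returns a JSON string, so parse it back with json.loads if you want to keep working with the result in Python:

matching = json.loads(match_skill(data, 'Keras'))
print([r['name'] for r in matching])  # ['xyz']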

Logstash grok: parse a line with the json filter

I am using the ELK stack (Elasticsearch, Kibana, Logstash, Filebeat) to collect logs. I have a log file with the following lines; every line contains a JSON object. My goal is to use Logstash grok to pull the key/value pairs out of the JSON and forward them to Elasticsearch.
2018-03-28 13:23:01 charge:{"oldbalance":5000,"managefee":0,"afterbalance":"5001","cardid":"123456789","txamt":1}
2018-03-28 13:23:01 manage:{"cuurentValue":5000,"payment":0,"newbalance":"5001","posid":"123456789","something":"new2","additionalFields":1}
I am using the Grok Debugger to build the pattern and check the result. My current pattern is:
%{TIMESTAMP_ISO8601} %{SPACE} %{WORD:$:data}:{%{QUOTEDSTRING:key1}:%{BASE10NUM:value1}[,}]%{QUOTEDSTRING:key2}:%{BASE10NUM:value2}[,}]%{QUOTEDSTRING:key3}:%{QUOTEDSTRING:value3}[,}]%{QUOTEDSTRING:key4}:%{QUOTEDSTRING:value4}[,}]%{QUOTEDSTRING:key5}:%{BASE10NUM:value5}[,}]
As one can see, this is hard-coded, but in a real log the keys in the JSON could be any word, the values could be integers, doubles or strings, and the number of keys varies, so my solution is not acceptable. My solution's result is shown below, just for reference. I am using grok patterns.
My question: is trying to extract the keys from the JSON wise at all, given that Elasticsearch itself works with JSON? Second, if I do take the keys/values out of the JSON, is there a correct, concise grok pattern for that?
The current pattern gives the following output when parsing the first of the lines above.
{
  "TIMESTAMP_ISO8601": [["2018-03-28 13:23:01"]],
  "YEAR": [["2018"]],
  "MONTHNUM": [["03"]],
  "MONTHDAY": [["28"]],
  "HOUR": [["13", null]],
  "MINUTE": [["23", null]],
  "SECOND": [["01"]],
  "ISO8601_TIMEZONE": [[null]],
  "SPACE": [[""]],
  "WORD": [["charge"]],
  "key1": [[""oldbalance""]],
  "value1": [["5000"]],
  "key2": [[""managefee""]],
  "value2": [["0"]],
  "key3": [[""afterbalance""]],
  "value3": [[""5001""]],
  "key4": [[""cardid""]],
  "value4": [[""123456789""]],
  "key5": [[""txamt""]],
  "value5": [["1"]]
}
Second edit:
Is it possible to use the json filter of Logstash? In my case the JSON is only part of the line/event; the whole event is not JSON.
===========================================================
Third edit:
The updated solution does not parse the JSON well. My filter is as follows:
filter {
  grok {
    match => {
      "message" => [
        "%{TIMESTAMP_ISO8601}%{SPACE}%{GREEDYDATA:json_data}"
      ]
    }
  }
}
filter {
  json {
    source => "json_data"
    target => "parsed_json"
  }
}
The result does not contain key/value pairs; json_data is still the message prefix plus the JSON string, so the JSON is not parsed.
The test data is below:
2018-03-28 13:23:01 manage:{"cuurentValue":5000,"payment":0,"newbalance":"5001","posid":"123456789","something":"new2","additionalFields":1}
2018-03-28 13:23:03 payment:{"cuurentValue":5001,"reload":0,"newbalance":"5002","posid":"987654321","something":"new3","additionalFields":2}
2018-03-28 13:24:07 management:{"cuurentValue":5002,"payment":0,"newbalance":"5001","posid":"123456789","something":"new2","additionalFields":1}
[2018-06-04T15:01:30,017][WARN ][logstash.filters.json ] Error parsing json {:source=>"json_data", :raw=>"manage:{\"cuurentValue\":5000,\"payment\":0,\"newbalance\":\"5001\",\"posid\":\"123456789\",\"something\":\"new2\",\"additionalFields\":1}", :exception=>#<LogStash::Json::ParserError: Unrecognized token 'manage': was expecting ('true', 'false' or 'null')
at [Source: (byte[])"manage:{"cuurentValue":5000,"payment":0,"newbalance":"5001","posid":"123456789","something":"new2","additionalFields":1}"; line: 1, column: 8]>}
[2018-06-04T15:01:30,017][WARN ][logstash.filters.json ] Error parsing json {:source=>"json_data", :raw=>"payment:{\"cuurentValue\":5001,\"reload\":0,\"newbalance\":\"5002\",\"posid\":\"987654321\",\"something\":\"new3\",\"additionalFields\":2}", :exception=>#<LogStash::Json::ParserError: Unrecognized token 'payment': was expecting ('true', 'false' or 'null')
at [Source: (byte[])"payment:{"cuurentValue":5001,"reload":0,"newbalance":"5002","posid":"987654321","something":"new3","additionalFields":2}"; line: 1, column: 9]>}
[2018-06-04T15:01:34,986][WARN ][logstash.filters.json ] Error parsing json {:source=>"json_data", :raw=>"management:{\"cuurentValue\":5002,\"payment\":0,\"newbalance\":\"5001\",\"posid\":\"123456789\",\"something\":\"new2\",\"additionalFields\":1}", :exception=>#<LogStash::Json::ParserError: Unrecognized token 'management': was expecting ('true', 'false' or 'null')
at [Source: (byte[])"management:{"cuurentValue":5002,"payment":0,"newbalance":"5001","posid":"123456789","something":"new2","additionalFields":1}"; line: 1, column: 12]>}
Please check the result:
You can use GREEDYDATA to assign the entire JSON block to a separate field, like this:
%{TIMESTAMP_ISO8601}%{SPACE}%{GREEDYDATA:json_data}
This will create a separate field for your JSON data:
{
  "TIMESTAMP_ISO8601": [["2018-03-28 13:23:01"]],
  "json_data": [["charge:{"oldbalance":5000,"managefee":0,"afterbalance":"5001","cardid":"123456789","txamt":1}"]]
}
Then apply a json filter on the json_data field as follows:
json {
  source => "json_data"
  target => "parsed_json"
}
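Note that %{GREEDYDATA:json_data} as written still captures the charge:/manage: prefix together with the JSON, which is exactly what the parser errors in the third edit show. A sketch of a pattern that captures the prefix separately, so json_data holds only the JSON (the msg_type field name is illustrative):

%{TIMESTAMP_ISO8601}%{SPACE}%{WORD:msg_type}:%{GREEDYDATA:json_data}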

Logstash parse date/time

I have the following line that I'm trying to parse with grok:
Hello|STATSTIME=20-AUG-15 12.20.03.051000 PM|World
I can parse the first part of it with grok like so:
match => ["message","%{WORD:FW}\|STATSTIME=%{MONTHDAY:MDAY}-%{WORD:MON}-%{INT:YY} %{INT:HH}"]
Anything further than that gives me an error. I can't figure out how to escape the : character (: does not work), and %{TIME:time} does not work either. I'd like to be able to get the whole thing as a timestamp, but can't get it broken up. Any ideas?
You can use a grok debugger to test grok expressions; the supported time formats are listed in the standard grok patterns.
To parse 12.20.03.051000, use:
%{INT:hour}\.%{INT:min}\.%{INT:sec}\.%{INT:ms}
The output will be something like this:
{
  "hour": [["12"]],
  "min": [["20"]],
  "sec": [["03"]],
  "ms": [["051000"]]
}
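To get the whole STATSTIME value as a single timestamp, it may be easier to capture it in one field (e.g. with %{DATA}) and parse it outside grok. For reference, parsing the same value in Python (a sketch):

from datetime import datetime

value = "20-AUG-15 12.20.03.051000 PM"
# %d day, %b abbreviated month, %y two-digit year, %I 12-hour clock,
# %M minutes, %S seconds, %f microseconds, %p AM/PM
ts = datetime.strptime(value, "%d-%b-%y %I.%M.%S.%f %p")
print(ts)  # 2015-08-20 12:20:03.051000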
