Replacing a value for a given key in Kusto - azure

I am trying to use the .set-or-replace command to amend the "subject" entry below from sample/consumption/backups to sample/consumption/backup, but I am not having much luck in the world of Kusto.
I can't seem to reference the nested fields within Records, such as data.
"source_": CustomEventRawRecords,
"Records": [
{
"metadataVersion": "1",
"dataVersion": "",
"eventType": "consumptionRecorded",
"eventTime": "1970-01-01T00:00:00.0000000Z",
"subject": "sample/consumption/backups",
"topic": "/subscriptions/1234567890id/resourceGroups/rg/providers/Microsoft.EventGrid/topics/webhook",
"data": {
"resourceId": "/subscriptions/1234567890id/resourceGroups/RG/providers/Microsoft.Compute/virtualMachines/vm"
},
"id": "1234567890id"
}
],
The command I've tried to get to work:
.set-or-replace [async] CustomEventRawRecords [with (subject = sample/consumption/backup [, ...])] <| QueryOrCommand

If you're already manipulating the data, why not turn it into a columnar representation? That way you can easily make the corrections you want, and you also get the full richness of the tabular operators, plus an IntelliSense experience that helps you formulate queries.
Here's an example query that does that:
datatable (x: dynamic)[dynamic({"source_": "CustomEventRawRecords",
"Records": [
{
"metadataVersion": "1",
"dataVersion": "",
"eventType": "consumptionRecorded",
"eventTime": "1970-01-01T00:00:00.0000000Z",
"subject": "sample/consumption/backups",
"topic": "/subscriptions/1234567890id/resourceGroups/rg/providers/Microsoft.EventGrid/topics/webhook",
"data": {
"resourceId": "/subscriptions/1234567890id/resourceGroups/RG/providers/Microsoft.Compute/virtualMachines/vm"
},
"id": "1234567890id"
}
]})]
| extend records = x.Records
| mv-expand record=records
| extend subject = tostring(record.subject)
| extend subject = iff(subject == "sample/consumption/backups", "sample/consumption/backup", subject)
| extend metadataVersion = tostring(record.metadataVersion)
| extend dataVersion = tostring(record.dataVersion)
| extend eventType = tostring(record.eventType)
| extend topic= tostring(record.topic)
| extend data = record.data
| extend id = tostring(record.id)
| project-away x, records, record
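If you also want to persist the corrected rows rather than just query them, a query shaped like the one above can be fed into .set-or-replace. Below is a minimal sketch using the azure-kusto-data Python client; the cluster URI, database name, target table, and the assumption that the raw JSON sits in a dynamic column x (as in the demo query) are placeholders rather than details from the original post:

# Sketch: materialize the corrected rows with a .set-or-replace control command.
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(
    "https://mycluster.kusto.windows.net"  # hypothetical cluster URI
)
client = KustoClient(kcsb)

# .set-or-replace is a management command, so it goes through execute_mgmt.
command = """
.set-or-replace CustomEventRawRecordsFixed <|
    CustomEventRawRecords
    | extend records = x.Records
    | mv-expand record = records
    | extend subject = tostring(record.subject)
    | extend subject = iff(subject == 'sample/consumption/backups',
                           'sample/consumption/backup', subject)
    | project-away records, record
"""
client.execute_mgmt("MyDatabase", command)  # hypothetical database name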

Related

Azure resource graph query re-write default tag response

I am trying to alter the default response of an Azure Resource Graph query into something similar to what the Azure portal uses. My query is:
resourcecontainers | where type == "microsoft.resources/subscriptions" | project name, tags
From where the response for tags is:
"tags": {
"TagA": "TeamA",
"TagB": "TeamB",
"TagC": "TeamC"
},
I would like to alter this into:
"tags": [
{
"name": "TagA",
"value": "TeamA"
},
{
"name": "TagB",
"value": "TeamB"
},
{
"name": "TagC",
"value": "TeamC"
}
]
How can I do this? All examples I have found are either for only one tag or for a static set of tags. Mine needs to support a dynamic number of tags.
To achieve the above requirement, you can use the query below:
resourcecontainers
| where type =~ 'microsoft.resources/subscriptions'
| mvexpand parsejson(tags)
| extend tagname = tostring(bag_keys(tags)[0])
| extend tagvalue = tostring(tags[tagname])
| project name, tagname, tagvalue
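If you need the exact array-of-objects shape from the question (one row per subscription with a tags array) rather than one row per tag, one option is to reshape the result client-side. Here is a sketch using the azure-mgmt-resourcegraph Python SDK; the credential type, subscription placeholder, and the reshaping step are my assumptions, not part of the answer above:

# Sketch: run the base query, then rebuild tags as [{"name": ..., "value": ...}, ...].
from azure.identity import AzureCliCredential
from azure.mgmt.resourcegraph import ResourceGraphClient
from azure.mgmt.resourcegraph.models import QueryRequest

client = ResourceGraphClient(AzureCliCredential())
request = QueryRequest(
    subscriptions=["<subscription-id>"],  # placeholder
    query="resourcecontainers | where type =~ 'microsoft.resources/subscriptions' | project name, tags",
)
response = client.resources(request)

for row in response.data:  # each row is a dict when the result format is objectArray
    tags = row.get("tags") or {}
    row["tags"] = [{"name": k, "value": v} for k, v in tags.items()]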

How can I sort price string in ElasticSearch?

In my example I tried to sort, but with no success. My problem is that my price is a string, formatted like 1.300,00. When I sort the string price I get, for example: 0,00 | 1,00 | 1.000,00 | 2,00.
I want to convert it to a double (or something similar) for sorting.
How can I do that?
It is not a good idea to keep price as a keyword in Elasticsearch; the best approach is to map the price as a scaled_float, like this:
New Mapping:
PUT [index_name]/_mapping
{
"properties": {
"price2": {
"type": "scaled_float",
"scaling_factor": 100
}
}
}
To solve your problem you can add the new mapping and convert your existing values from strings to numeric values. Since your prices use a dot as the thousands separator and a comma as the decimal separator (1.300,00), the script strips the dots and turns the comma into a decimal point:
Update by query:
POST [index_name]/_update_by_query
{
"query": {
"match_all": {}
},
"script": {
"source": "ctx._source['price2'] = ctx._source['price'].replace(',','')"
}
}
This query converts the keyword value into a numeric form and stores it in another field named price2; then you will need an ingest pipeline to do the same processing for new entries:
Ingest pipeline:
POST _ingest/pipeline/_simulate
{
"pipeline": {
"processors": [
{
"script": {
"description": "Extract 'tags' from 'env' field",
"lang": "painless",
"source": "ctx['price2'] = ctx['price'].replace(',','')"
}
}
]
},
"docs": [
{
"_source": {
"price": "5,000.00"
}
}
]
}
Note that _simulate only previews the pipeline; to apply it for real, create the pipeline with PUT _ingest/pipeline/<pipeline_name> and set it as the index's default_pipeline so new documents are processed automatically.
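Once price2 is mapped as a scaled_float, sorting becomes a plain numeric sort. A short sketch with the official elasticsearch Python client; the index name is a placeholder:

# Sketch: sort on the numeric price2 field instead of the keyword price.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
resp = es.search(
    index="my_index",  # placeholder index name
    body={"query": {"match_all": {}}, "sort": [{"price2": {"order": "asc"}}]},
)
for hit in resp["hits"]["hits"]:
    print(hit["_source"]["price"], "->", hit["_source"].get("price2"))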

How to give my own _id while inserting data in Elasticsearch?

I have a sample database as below:
SNO      Name    Address
99123    Mike    Texas
88124    Tom     California
I want to keep my SNO as the Elasticsearch _id to make it easier to update documents by SNO.
Python code to create an index:
abc = {
"settings": {
"number_of_shards": 2,
"number_of_replicas": 2
}
}
es.indices.create(index='test', body=abc)
I fetched the data from Postman as below:
{
"_index": "test",
"_id": "13",
"_data": {
"FirstName": "Sample4",
"LastName": "ABCDEFG",
"Designation": "ABCDEF",
"Salary": "99",
"DateOfJoining": "2020-05-05",
"Address": "ABCDE",
"Gender": "ABCDE",
"Age": "21",
"MaritalStatus": "ABCDE",
"Interests": "ABCDEF",
"timestamp": "2020-05-05T14:42:46.394115",
"country": "Nepal"
}
}
And the insert code in Python is below:
req_JSON = request.json
input_index = req_JSON['_index']
input_id = req_JSON['_id']
input_data = req_JSON['_data']
doc = input_data
res = es.index(index=input_index, body=doc)
I thought the _id would remain the same as what I had given, but it generated an auto _id.
You can simply do it like this:
res = es.index(index=input_index, body=doc, id=input_id)  # <-- pass the id explicitly
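Because the _id is now deterministic, re-indexing the same SNO simply overwrites the existing document, which is what makes updates by SNO easy. A minimal sketch (index name and field values are placeholders):

# Sketch: using SNO as _id turns es.index into an upsert keyed by SNO.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

es.index(index="test", id="99123", body={"Name": "Mike", "Address": "Texas"})
# Indexing again with the same id replaces the document instead of creating a new one.
es.index(index="test", id="99123", body={"Name": "Mike", "Address": "Austin, Texas"})
print(es.get(index="test", id="99123")["_source"])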

Type mismatch when importing JSON into Neo4J with Py2Neo

I have a JSON file, the head of which looks like this:
"exports": {
"type": "WordsAndPhrases",
"date": "2018-08-02T10:07:58.047669Z",
"relevantYears": "2012,2013,2014,2015,2016,2017",
"Words": {
"H1": "WORDS AND PHRASES:",
"Word": [
{
"Phrase": {
"id": "phrase_2011001932",
"title": "A common"
},
"Document": "Law of Property Act 1925, s 193(1) (as amended)",
"Refs": {
"CaseTitle": {
"id": "2011201246",
"title": "ADM Milling Ltd v Tewkesbury Town Council"
},
"title": "None",
"citations": "Lewison J [2011] EWHC 595 (Ch); [2012] Ch 99; [2011] 3 WLR 674, Ch D"
}
},
I am using the following script to import the JSON data into Neo4J:
import json
from py2neo import Graph, authenticate
authenticate("localhost:7474", "neo4j", "foobar")
graph = Graph()
with open('wp.json') as data_file:
json = json.load(data_file)
query = """
WITH {json} AS document
UNWIND document.exports.Words.Word AS Word
MERGE (Phrase:a {phrase: Word.Phrase.title})
MERGE (Document:b {document: Word.Document})
FOREACH (case in Word.Refs.CaseTitle.title | MERGE (Report:z {report: case}))
"""
# Send Cypher query.
print (graph.run(query, json = json).dump())
The first two MERGE queries work fine. However, the FOREACH query is proving to be problematic. I'm using the FOREACH query to deal with instances where there are multiple CaseTitle properties in a single block, for example:
{
"Phrase": {
"id": "phrase_2011002042",
"title": "Acts contrary to purposes and principles of United Nations"
},
"Document": "Council Directive 2004/83/EC, art 12(2)(c)",
"Refs": [
{
"CaseTitle": {
"id": "2011201814",
"title": "Federal Republic of Germany v B"
},
"title": "None",
"citations": "(Joined Cases C-57/09 and C-101/09); [2012] 1 WLR 1076, ECJ"
},
{
"CaseTitle": {
"id": "2016008987",
"title": "Commissaire général aux réfugiés et aux apatrides v Lounani"
},
"title": "None",
"citations": "EU:C:2017:71; [2017] 4 WLR 52, ECJ"
}
]
},
When I run the script, the following error occurs:
py2neo.database.status.CypherTypeError: Type mismatch: expected a map but was List{Map{title -> String("None"), CaseTitle -> Map{title -> String("Federal Republic of Germany v B"), id -> String("2011201814")}, citations -> String("(Joined Cases C-57/09 and C-101/09); [2012] 1 WLR 1076, ECJ")}, Map{title -> String("None"), CaseTitle -> Map{title -> String("Commissaire général aux réfugiés et aux apatrides v Lounani"), id -> String("2016008987")}, citations -> String("EU:C:2017:71; [2017] 4 WLR 52, ECJ")}}
The JSON appears to be valid. Can anyone recommend a way of dealing with this error?
Instead of Word.Refs.CaseTitle.title you should use Word.Refs[0].CaseTitle.title. In your JSON it is clear that Refs is an array, but you're treating it like an object. The error message says exactly this: when it tries to dereference CaseTitle under Refs, it expects a map, but what you gave it is a list of maps.
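Since Refs is a map for single-reference records but a list for multi-reference ones, another option (my suggestion, not part of the answer above) is to normalize it in Python before running the query, so FOREACH can iterate it uniformly:

# Sketch: coerce every Word.Refs into a list, then iterate it in Cypher.
import json
from py2neo import Graph, authenticate

authenticate("localhost:7474", "neo4j", "foobar")
graph = Graph()

with open('wp.json') as data_file:
    document = json.load(data_file)

for word in document['exports']['Words']['Word']:
    refs = word.get('Refs')
    if isinstance(refs, dict):  # a single reference arrives as a map
        word['Refs'] = [refs]

query = """
WITH {json} AS document
UNWIND document.exports.Words.Word AS Word
MERGE (Phrase:a {phrase: Word.Phrase.title})
MERGE (Document:b {document: Word.Document})
FOREACH (ref IN Word.Refs | MERGE (Report:z {report: ref.CaseTitle.title}))
"""
graph.run(query, json=document)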

How to search through data with arbitrary amount of fields?

I have a web-form builder for science events. The event moderator creates a registration form with an arbitrary number of boolean, integer, enum and text fields.
The created form is used to:
register a new member for an event;
search through registered members.
What is the best search tool for the second task (searching the members of an event)? Is Elasticsearch well suited for this?
I wrote a post about how to index arbitrary data into Elasticsearch and then to search it by specific fields and values. All this, without blowing up your index mapping.
The post is here: http://smnh.me/indexing-and-searching-arbitrary-json-data-using-elasticsearch/
In short, you will need to do the following steps to get what you want:
Create a special index described in the post.
Flatten the data you want to index using the flattenData function:
https://gist.github.com/smnh/30f96028511e1440b7b02ea559858af4.
Create a document with the original and flattened data and index it into Elasticsearch:
{
"data": { ... },
"flatData": [ ... ]
}
Optional: use Elasticsearch aggregations to find which fields and types have been indexed.
Execute queries on the flatData object to find what you need.
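For reference, the special index from step 1 is essentially a nested flatData field with one typed value field per data type. Here is a rough sketch of such a mapping via the Python client; the index name and the exact set of value fields are assumptions based on the field names used below:

# Sketch: nested flatData mapping with per-type value fields.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
es.indices.create(index="events", body={  # placeholder index name
    "mappings": {
        "properties": {
            "data": {"type": "object", "enabled": False},  # original doc, stored but not indexed
            "flatData": {
                "type": "nested",
                "properties": {
                    "key": {"type": "keyword"},
                    "type": {"type": "keyword"},
                    "key_type": {"type": "keyword"},
                    "value_string": {
                        "type": "text",
                        "fields": {"keyword": {"type": "keyword"}},
                    },
                    "value_long": {"type": "long"},
                    "value_boolean": {"type": "boolean"},
                },
            },
        }
    },
})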
Example
Based on your original question, let's assume that the first event moderator created a form with the following fields to register members for the science event:
name string
age long
sex long - 0 for male, 1 for female
In addition to this data, the related event probably has some sort of id, let's call it eventId. So the final document could look like this:
{
"eventId": "2T73ZT1R463DJNWE36IA8FEN",
"name": "Bob",
"age": 22,
"sex": 0
}
Now, before we index this document, we will flatten it using the flattenData function:
flattenData(document);
This will produce the following array:
[
{
"key": "eventId",
"type": "string",
"key_type": "eventId.string",
"value_string": "2T73ZT1R463DJNWE36IA8FEN"
},
{
"key": "name",
"type": "string",
"key_type": "name.string",
"value_string": "Bob"
},
{
"key": "age",
"type": "long",
"key_type": "age.long",
"value_long": 22
},
{
"key": "sex",
"type": "long",
"key_type": "sex.long",
"value_long": 0
}
]
Then we wrap this data in a document as shown before and index it.
Then the second event moderator creates another form that has a new field, a field with the same name and type, and also a field with the same name but a different type:
name string
city string
sex string - "male" or "female"
This event moderator decided that instead of having 0 and 1 for male and female, his form will allow choosing between two strings - "male" and "female".
Let's try to flatten the data submitted by this form:
flattenData({
"eventId": "F1BU9GGK5IX3ZWOLGCE3I5ML",
"name": "Alice",
"city": "New York",
"sex": "female"
});
This will produce the following data:
[
{
"key": "eventId",
"type": "string",
"key_type": "eventId.string",
"value_string": "F1BU9GGK5IX3ZWOLGCE3I5ML"
},
{
"key": "name",
"type": "string",
"key_type": "name.string",
"value_string": "Alice"
},
{
"key": "city",
"type": "string",
"key_type": "city.string",
"value_string": "New York"
},
{
"key": "sex",
"type": "string",
"key_type": "sex.string",
"value_string": "female"
}
]
Then, after wrapping the flattened data in a document and indexing it into Elasticsearch we can execute complicated queries.
For example, to find members named "Bob" registered for the event with ID 2T73ZT1R463DJNWE36IA8FEN we can execute the following query:
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "flatData",
"query": {
"bool": {
"must": [
{"term": {"flatData.key": "eventId"}},
{"match": {"flatData.value_string.keyword": "2T73ZT1R463DJNWE36IA8FEN"}}
]
}
}
}
},
{
"nested": {
"path": "flatData",
"query": {
"bool": {
"must": [
{"term": {"flatData.key": "name"}},
{"match": {"flatData.value_string": "bob"}}
]
}
}
}
}
]
}
}
}
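If you're using the Python client as in the earlier examples, executing that query is straightforward (the index name is assumed):

# Sketch: run the nested bool query above with the Python client.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
resp = es.search(index="events", body=query)  # 'query' holds the JSON body shown above
for hit in resp["hits"]["hits"]:
    print(hit["_source"]["data"])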
Elasticsearch automatically detects the field content in order to index it correctly, even if the mapping hasn't been defined previously. So yes, Elasticsearch suits these cases well.
However, you may want to fine-tune this behavior, or the default mapping applied by Elasticsearch may not correspond to what you need: in that case, take a look at the default mapping or, for even further control, the dynamic templates feature.
If you let your end users decide the keys you store things in, you'll have an ever-growing mapping and cluster state, which is problematic.
This case and a suggested solution is covered in this article on common problems with Elasticsearch.
Essentially, you want to have everything that can possibly be user-defined as a value. Using nested documents, you can have a key-field and differently mapped value fields to achieve pretty much the same.
