ArangoDB: group-by query on multiple fields

I have a database of products in an ArangoDB collection in which a product has multiple sizes. The issue is that for each size, the same product is repeated, but each product has a common group number. Like this:
{"name": "product1", "description": "someDescription", size: 5,price: 12 groupNumber: 12}
{"name": "product1", "description": "someDescription", size: 15, price: 26, groupNumber: 12}
{"name": "product1", "description": "someDescription", size: 25, price: 84, groupNumber: 12}
{"name": "product1", "description": "someDescription", size: 35, price: 106, groupNumber: 12}
{"name": "product2", "description": "someDescription", size: 5, price: 12, groupNumber: 11}
{"name": "product2", "description": "someDescription", size: 15, price: 22, groupNumber: 11}
{"name": "product2", "description": "someDescription", size: 25, price: 32, groupNumber: 11}
{"name": "product2", "description": "someDescription", size: 35, price: 43, groupNumber: 11}
I now have to display the list of products (on a web page), but each product should appear only once, with its sizes and prices collected into arrays, like this:
product1 someDescription sizes: 5,15,25,35, prices: 12,26,84,106
product2 someDescription sizes: 5,15,25,35, prices: 12,22,32,43
...
How do I do it?

Ignoring the groupNumber and grouping by name and description, the query looks like this:
FOR p IN products
  COLLECT description = p.description, name = p.name INTO groups
  RETURN {
    "name": name,
    "description": description,
    "prices": groups[*].p.price,
    "sizes": groups[*].p.size
  }
Given your (corrected) example data, the query returns:
[
  {
    "name": "product1",
    "description": "someDescription",
    "prices": [ 12, 84, 106, 26 ],
    "sizes": [ 5, 25, 35, 15 ]
  },
  {
    "name": "product2",
    "description": "someDescription",
    "prices": [ 43, 32, 22, 12 ],
    "sizes": [ 35, 25, 15, 5 ]
  }
]
The grouped values aren't sorted, but the positions of sizes and prices correspond, so you can exploit this fact to ZIP() the values into a size-price map:
FOR p IN products
  COLLECT description = p.description, name = p.name INTO groups
  RETURN {
    "name": name,
    "description": description,
    "size_price_map": ZIP(groups[*].p.size, groups[*].p.price)
  }
yielding:
[
  {
    "name": "product1",
    "description": "someDescription",
    "size_price_map": {
      "5": 12,
      "15": 26,
      "25": 84,
      "35": 106
    }
  },
  {
    "name": "product2",
    "description": "someDescription",
    "size_price_map": {
      "5": 12,
      "15": 22,
      "25": 32,
      "35": 43
    }
  }
]
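If you need the sizes and prices arrays of the first query to come out sorted, one option is to order them with subqueries over groups (a minimal sketch along the same lines, untested):
FOR p IN products
  COLLECT description = p.description, name = p.name INTO groups
  RETURN {
    "name": name,
    "description": description,
    "sizes": (FOR g IN groups SORT g.p.size RETURN g.p.size),
    "prices": (FOR g IN groups SORT g.p.size RETURN g.p.price)
  }
Sorting both subqueries by the same key keeps the positions of the two arrays in correspondence.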

Related

How to get a variable from a list inside a dictionary?

I have data like this:
{"result": [{"name": "Mil", "age": 21, "id": 1}, {"name": "Jen", "age": 23, "id": 2}, {"name": "Rosa", "age": 26, "id": 3}]}
How can I get just the 'age' value of each individual from data like this in Python?
You can access the ages like this:
d = {"result": [{"name": "Mil", "age": 21, "id": 1}, {"name": "Jen", "age": 23, "id": 2}, {"name": "Rosa", "age": 26, "id": 3}]}
ages = [res["age"] for res in d["result"]] # [21, 23, 26]
Indeed, d["result"] is a list of dicts. So, in a more expanded form, it is the same as:
d = {"result": [{"name": "Mil", "age": 21, "id": 1}, {"name": "Jen", "age": 23, "id": 2}, {"name": "Rosa", "age": 26, "id": 3}]}
results = d["result"] # a list
ages = []
for res in results:
age = res["age"]
ages.append(age)
print(ages) # [21, 23, 26]
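If you prefer a functional style, operator.itemgetter from the standard library does the same thing (a minimal sketch):
from operator import itemgetter

d = {"result": [{"name": "Mil", "age": 21, "id": 1}, {"name": "Jen", "age": 23, "id": 2}, {"name": "Rosa", "age": 26, "id": 3}]}
ages = list(map(itemgetter("age"), d["result"]))  # [21, 23, 26]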
You can also use this code:
dic = {"result": [{"name": "Mil", "age": 21, "id": 1}, {"name": "Jen", "age": 23, "id": 2}, {"name": "Rosa", "age": 26, "id": 3}]}
age_lst = []
for item in dic["result"]:
    age_lst.append(item["age"])
print(age_lst)  # [21, 23, 26]

timeUnit does not work after a flatten and fold transformation

Is it possible to use timeUnit after a flatten and fold transformation?
In the example below it doesn't work!
If I remove the timeUnit from the x axis it plots, but without the good things that come with the timeUnit.
Thanks
Here is example code that can be run in the editor linked below:
https://vega.github.io/editor/#/edited
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Sales in a Year.",
  "width": 500,
  "height": 200,
  "data": {
    "values": [
      {
        "timestamp": ["2019-01-01", "2019-02-01", "2019-03-01", "2019-04-01", "2019-05-01", "2019-06-01",
                      "2019-07-01", "2019-08-01", "2019-09-01", "2019-10-01", "2019-11-01", "2019-12-01"],
        "cars": [55, 43, 91, 81, 53, 19, 87, 52, 52, 44, 52, 52],
        "bikes": [12, 6, 2, 0, 0, 0, 0, 0, 0, 3, 9, 15]
      }
    ]
  },
  "transform": [
    {"flatten": ["timestamp", "cars", "bikes"]},
    {"fold": ["cars", "bikes"]}
  ],
  "mark": {"type": "bar", "tooltip": true, "cornerRadiusEnd": 4},
  "encoding": {
    "x": {
      "field": "timestamp",
      "timeUnit": "month",
      "type": "ordinal",
      "title": "",
      "axis": {"labelAngle": 0}
    },
    "y": {"field": "value", "type": "quantitative", "title": "Soiling Loss"},
    "color": {"field": "key", "type": "nominal"}
  }
}
For convenience, strings in input data with a simple temporal encoding are automatically parsed as dates, but such parsing is not applied to data that is the result of a transformation.
In this case, you can do the parsing manually with a calculate transform (view in editor):
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Sales in a Year.",
  "width": 500,
  "height": 200,
  "data": {
    "values": [
      {
        "timestamp": ["2019-01-01", "2019-02-01", "2019-03-01", "2019-04-01", "2019-05-01", "2019-06-01",
                      "2019-07-01", "2019-08-01", "2019-09-01", "2019-10-01", "2019-11-01", "2019-12-01"],
        "cars": [55, 43, 91, 81, 53, 19, 87, 52, 52, 44, 52, 52],
        "bikes": [12, 6, 2, 0, 0, 0, 0, 0, 0, 3, 9, 15]
      }
    ]
  },
  "transform": [
    {"flatten": ["timestamp", "cars", "bikes"]},
    {"fold": ["cars", "bikes"]},
    {"calculate": "toDate(datum.timestamp)", "as": "timestamp"}
  ],
  "mark": {"type": "bar", "tooltip": true, "cornerRadiusEnd": 4},
  "encoding": {
    "x": {
      "field": "timestamp",
      "timeUnit": "month",
      "type": "ordinal",
      "title": "",
      "axis": {"labelAngle": 0}
    },
    "y": {"field": "value", "type": "quantitative", "title": "Soiling Loss"},
    "color": {"field": "key", "type": "nominal"}
  }
}

Node.js: How to build a nested object using parent IDs

For my internship, I need to build a nested object using parent IDs; I don't want a children attribute array.
I have an array of objects with an id and a parentId, and I use the npm package flatnest to do it. This works for a one-level hierarchy, but the code must be adapted for multiple levels of hierarchy.
I don't know how to adapt it to multiple hierarchy levels.
This is my array of objects:
var fn = require("flatnest");
const flat =
[
{ "id": 1, "name": 'Restaurants', 'parentId': 0},
{ "id": 2, "name": 'family restaurant', 'parentId': 1, 'value':'Excellent'},
{ "id": 3, "name": 'Sun restaurant', 'parentId': 1,'value':""},
{ "id": 4, "name": 'Sun restaurant 1', 'parentId': 3, 'value':'Good'},
{ "id": 5, "name": 'Sun restaurant 2', 'parentId': 3, 'value':"bad"},
{ "id": 6, "name": 'Hotels', 'parentId': 0,'value':""},
{ "id": 7, "name": 'Space Hotel', 'parentId': 6,'value':""},
{ "id": 8, "name": 'Sun Hotel', 'parentId': 7,'value':'Nice'},
{ "id": 9, "name": 'Moon Hotel', 'parentId': 7,'value':""},
{ "id": 10, "name": 'Moon Hotel 1', 'parentId': 9, 'value':"Excellent"},
{ "id": 11, "name": 'Moon Hotel 2', 'parentId': 9, 'value':"Worst"},
];
To use the nest function of npm flatnest, I first have to flatten my array of objects (const flat).
My code to flatten it:
var transform = {};
for (var i = 0; i < flat.length; i++) {
  if (typeof flat[i + 1] !== 'undefined') {
    if (flat[i].id == flat[i + 1].parentId) {
      var t = flat[i].name;
      transform[t.concat(".").concat(flat[i + 1].name)] = flat[i + 1].value;
    } else {
      transform[t.concat(".").concat(flat[i + 1].name)] = flat[i + 1].value;
    }
  }
}
console.log(transform);
var nested = fn.nest(transform);
console.log(nested);
I expect the output of console.log(transform) to be:
{
  'Restaurants.family restaurant': 'Excellent',
  'Restaurants.Sun restaurant.Sun restaurant 1': 'Good',
  'Restaurants.Sun restaurant.Sun restaurant 2': 'bad',
  'Hotels.Space Hotel.Sun Hotel': 'Nice',
  'Hotels.Space Hotel.Moon Hotel.Moon Hotel 1': 'Excellent',
  'Hotels.Space Hotel.Moon Hotel.Moon Hotel 2': 'Worst'
}
Then, using the nest function:
var nested = fn.nest(transform)
console.log(nested)
The output must be exactly like this:
"Restaurants":{
"family restaurant":"Excellent",
"Sun restaurant":{
"Sun restaurant 1":"Good",
"Sun restaurant 2":"bad"
}
},
"Hotels":{
"Space Hotel":{
"Sun Hotel":"Nice",
"Moon Hotel":{
"Moon Hotel 1":"Excellent",
"Moon Hotel 2":"Worst"
}
}
}
}
but the actual output of console.log(transform) is:
{
  'Restaurants.family restaurant': 'Excellent',
  'Restaurant.Sun restaurant': '',
  'Sun restaurant.Sun restaurant 1': 'Good',
  'Sun restaurant.Sun restaurant 2': 'bad',
  'Sun restaurant.Hotels': '',
  'Hotels.Space Hotel': '',
  'Space Hotel.Sun Hotel': 'Nice',
  'Space Hotel.Moon Hotel': '',
  'Moon Hotel.Moon Hotel 1': 'Excellent',
  'Moon Hotel.Moon Hotel 2': 'Worst'
}
I'm not using flatnest, but the code below works for me. Please check and let me know if it doesn't work for any scenario.
const flat = [
  { "id": 1, "name": 'Restaurants', 'parentId': 0 },
  { "id": 2, "name": 'family restaurant', 'parentId': 1, 'value': 'Excellent' },
  { "id": 3, "name": 'Sun restaurant', 'parentId': 1, 'value': "" },
  { "id": 4, "name": 'Sun restaurant 1', 'parentId': 3, 'value': 'Good' },
  { "id": 5, "name": 'Sun restaurant 2', 'parentId': 3, 'value': "bad" },
  { "id": 6, "name": 'Hotels', 'parentId': 0, 'value': "" },
  { "id": 7, "name": 'Space Hotel', 'parentId': 6, 'value': "" },
  { "id": 8, "name": 'Sun Hotel', 'parentId': 7, 'value': 'Nice' },
  { "id": 9, "name": 'Moon Hotel', 'parentId': 7, 'value': "" },
  { "id": 10, "name": 'Moon Hotel 1', 'parentId': 9, 'value': "Excellent" },
  { "id": 11, "name": 'Moon Hotel 2', 'parentId': 9, 'value': "Worst" },
];
// Map from row id to the node created for that row (an object for
// branches, or the raw value for leaves).
const map = new Map();
// Assumes parents appear before their children in the array.
const result = flat.reduce((acc, curr) => {
  let val = {};
  if (curr.parentId == 0) {
    // Top-level node: attach directly to the result object.
    acc[curr.name] = val;
  } else if (map.get(curr.parentId)) {
    // Rows with a non-empty value become leaves; the rest stay objects.
    if (curr.value != '') val = curr.value;
    map.get(curr.parentId)[curr.name] = val;
  }
  map.set(curr.id, val);
  return acc;
}, {});
console.log(JSON.stringify(result));

How could I get the dict with the max value for a key from a list containing multiple dicts?

I have this list:
result = [{"name": "A", "score": 35, "other_details": ""},
          {"name": "A", "score": 60, "other_details": ""},
          {"name": "B", "score": 45, "other_details": ""},
          {"name": "B", "score": 34, "other_details": ""},
          {"name": "C", "score": 65, "other_details": ""}]
Now, I want to get the whole dictionary on the basis of maximum score for each name.
My expected output is:
[{"name": "A", "score": 60,"other_details": ""}]
[{"name": "B", "score": 45, "other_details": ""}]
[{"name":"C", "score": 65, "other_details": ""}]
Using itertools.groupby. Note that groupby only groups consecutive items, which is why the list is sorted by name first.
Ex:
from itertools import groupby

result = [{"name": "A", "score": 35, "other_details": ""},
          {"name": "A", "score": 60, "other_details": ""},
          {"name": "B", "score": 45, "other_details": ""},
          {"name": "B", "score": 34, "other_details": ""},
          {"name": "C", "score": 65, "other_details": ""}]

data_result = [
    max(v, key=lambda x: x["score"])
    for k, v in groupby(sorted(result, key=lambda x: x["name"]), lambda x: x["name"])
]
print(data_result)
Output:
[{'name': 'A', 'other_details': '', 'score': 60},
{'name': 'B', 'other_details': '', 'score': 45},
{'name': 'C', 'other_details': '', 'score': 65}]
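If you'd rather skip the sorting pass, an equivalent approach keeps the best row per name in a plain dict (a minimal sketch):
result = [{"name": "A", "score": 35, "other_details": ""},
          {"name": "A", "score": 60, "other_details": ""},
          {"name": "B", "score": 45, "other_details": ""},
          {"name": "B", "score": 34, "other_details": ""},
          {"name": "C", "score": 65, "other_details": ""}]

best = {}
for row in result:
    name = row["name"]
    # keep the row with the highest score seen so far for this name
    if name not in best or row["score"] > best[name]["score"]:
        best[name] = row
print(list(best.values()))
# [{'name': 'A', 'score': 60, 'other_details': ''},
#  {'name': 'B', 'score': 45, 'other_details': ''},
#  {'name': 'C', 'score': 65, 'other_details': ''}]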

Kibana: searching for a specific phrase returns no results, while another search returns the phrase

Looks like a simple use case, but for some reason I just can't figure out how to do this, or google a clear example.
Let's say I have a message stored in Logstash:
message: "info: 2015-11-28 22:02:19,232:common:INFO:ENV: Production User:None:Username:None:LOG: publishing to bus "
And I want to search in Kibana (version 4) for the phrase "publishing to bus".
I get a set of results.
But if I search for "None:LOG: publishing to bus", I get "No results found", even though this phrase obviously exists and is returned by the previous search.
So my question is basically: what is going on? What is the correct way to search for a possibly long phrase, and why does the second example fail?
EDIT:
The stored JSON:
{
  "_index": "logz-ngdxrkmolklnvngumaitximbohqwbocg-151206_v1",
  "_type": "django_logger",
  "_id": "AVF2DPxZZst_8_8_m-se",
  "_score": null,
  "_source": {
    "log": " publishing to bus {'user_id': 8866, 'event_id': 'aibRBPcLxcAzsEVRtFZVU5', 'timestamp': 1449384441, 'quotes': {}, 'rates': {u'EURUSD': Decimal('1.061025'), u'GBPUSD': Decimal('1.494125'), u'EURGBP': Decimal('0.710150')}, 'event': 'AccountInstrumentsUpdated', 'minute': 1449384420}",
    "logger": "common",
    "log_level": "INFO",
    "message": "2015-12-06 06:47:21,298:common:INFO:ENV: Production User:None:Username:None:LOG: publishing to bus {'user_id': 8866, 'event_id': 'aibRBPcLxcAzsEVRtFZVU5', 'timestamp': 1449384441, 'quotes': {}, 'rates': {u'EURUSD': Decimal('1.061025'), u'GBPUSD': Decimal('1.494125'), u'EURGBP': Decimal('0.710150')}, 'event': 'AccountInstrumentsUpdated', 'minute': 1449384420}",
    "type": "django_logger",
    "tags": ["celery"],
    "path": "//path/to/logs/out.log",
    "environment": "Staging",
    "#timestamp": "2015-12-06T06:47:21.298+00:00",
    "user_id": "None",
    "host": "path.to.host",
    "timestamp": "2015-12-06 06:47:21,298",
    "username": "None"
  },
  "fields": {
    "#timestamp": [1449384441298]
  },
  "highlight": {
    "message": [
      "2015-12-06 06:47:21,298:common:INFO:ENV: Staging User:None:Username:None:LOG: #kibana-highlighted-field#publishing#/kibana-highlighted-field# #kibana-highlighted-field#to#/kibana-highlighted-field# #kibana-highlighted-field#bus#/kibana-highlighted-field# {'user_id': **, 'event_id': 'aibRBPcLxcAzsEVRtFZVU5', 'timestamp': 1449384441, 'quotes': {}, 'rates': {u'EURUSD': Decimal('1.061025'), u'GBPUSD': Decimal('1.494125'), u'EURGBP': Decimal('0.710150')}, 'event': 'AccountInstrumentsUpdated', 'minute': 1449384420}"
    ]
  },
  "sort": [1449384441298]
}
According to Elasticsearch, the standard analyzer is used by default. The standard analyzer tokenizes the message field as follows:
"2015-12-06 06:47:21,298:common:INFO:ENV: Production
User:None:Username:None:LOG: publishing to bus {'user_id': 8866,
'event_id': 'aibRBPcLxcAzsEVRtFZVU5', 'timestamp': 1449384441,
'quotes': {}, 'rates': {u'EURUSD': Decimal('1.061025'), u'GBPUSD':
Decimal('1.494125'), u'EURGBP': Decimal('0.710150')}, 'event':
'AccountInstrumentsUpdated', 'minute': 1449384420}"
{
  "tokens": [
    {"token": "2015", "start_offset": 0, "end_offset": 4, "type": "<NUM>", "position": 0},
    {"token": "12", "start_offset": 5, "end_offset": 7, "type": "<NUM>", "position": 1},
    {"token": "06", "start_offset": 8, "end_offset": 10, "type": "<NUM>", "position": 2},
    {"token": "06", "start_offset": 11, "end_offset": 13, "type": "<NUM>", "position": 3},
    {"token": "47", "start_offset": 14, "end_offset": 16, "type": "<NUM>", "position": 4},
    {"token": "21,298", "start_offset": 17, "end_offset": 23, "type": "<NUM>", "position": 5},
    {"token": "common:info:env", "start_offset": 24, "end_offset": 39, "type": "<ALPHANUM>", "position": 6},
    {"token": "production", "start_offset": 41, "end_offset": 51, "type": "<ALPHANUM>", "position": 7},
    {"token": "user:none:username:none:log", "start_offset": 52, "end_offset": 79, "type": "<ALPHANUM>", "position": 8},
    {"token": "publishing", "start_offset": 81, "end_offset": 91, "type": "<ALPHANUM>", "position": 9},
    {"token": "to", "start_offset": 92, "end_offset": 94, "type": "<ALPHANUM>", "position": 10},
    {"token": "bus", "start_offset": 95, "end_offset": 98, "type": "<ALPHANUM>", "position": 11},
    {"token": "user_id", "start_offset": 100, "end_offset": 107, "type": "<ALPHANUM>", "position": 12},
    {"token": "8866", "start_offset": 109, "end_offset": 113, "type": "<NUM>", "position": 13},
    {"token": "event_id", "start_offset": 115, "end_offset": 123, "type": "<ALPHANUM>", "position": 14},
    {"token": "aibrbpclxcazsevrtfzvu5", "start_offset": 125, "end_offset": 147, "type": "<ALPHANUM>", "position": 15},
    {"token": "timestamp", "start_offset": 149, "end_offset": 158, "type": "<ALPHANUM>", "position": 16},
    {"token": "1449384441", "start_offset": 160, "end_offset": 170, "type": "<NUM>", "position": 17},
    {"token": "quotes", "start_offset": 172, "end_offset": 178, "type": "<ALPHANUM>", "position": 18},
    {"token": "rates", "start_offset": 184, "end_offset": 189, "type": "<ALPHANUM>", "position": 19},
    {"token": "ueurusd", "start_offset": 192, "end_offset": 199, "type": "<ALPHANUM>", "position": 20},
    {"token": "decimal", "start_offset": 201, "end_offset": 208, "type": "<ALPHANUM>", "position": 21},
    {"token": "1.061025", "start_offset": 209, "end_offset": 217, "type": "<NUM>", "position": 22},
    {"token": "ugbpusd", "start_offset": 220, "end_offset": 227, "type": "<ALPHANUM>", "position": 23},
    {"token": "decimal", "start_offset": 229, "end_offset": 236, "type": "<ALPHANUM>", "position": 24},
    {"token": "1.494125", "start_offset": 237, "end_offset": 245, "type": "<NUM>", "position": 25},
    {"token": "ueurgbp", "start_offset": 248, "end_offset": 255, "type": "<ALPHANUM>", "position": 26},
    {"token": "decimal", "start_offset": 257, "end_offset": 264, "type": "<ALPHANUM>", "position": 27},
    {"token": "0.710150", "start_offset": 265, "end_offset": 273, "type": "<NUM>", "position": 28},
    {"token": "event", "start_offset": 277, "end_offset": 282, "type": "<ALPHANUM>", "position": 29},
    {"token": "accountinstrumentsupdated", "start_offset": 284, "end_offset": 309, "type": "<ALPHANUM>", "position": 30},
    {"token": "minute", "start_offset": 311, "end_offset": 317, "type": "<ALPHANUM>", "position": 31},
    {"token": "1449384420", "start_offset": 319, "end_offset": 329, "type": "<NUM>", "position": 32}
  ]
}
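You can reproduce this token list yourself with the _analyze API (a sketch using the query-string form of the pre-5.x API; the host is hypothetical):
curl 'http://localhost:9200/_analyze?analyzer=standard&pretty' -d 'Production User:None:Username:None:LOG: publishing to bus'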
The phrase "Production User:None:Username:None:LOG: publishing to bus "
{"token": "production", "start_offset": 41, "end_offset": 51, "type": "<ALPHANUM>", "position": 7},
{"token": "user:none:username:none:log", "start_offset": 52, "end_offset": 79, "type": "<ALPHANUM>", "position": 8},
{"token": "publishing", "start_offset": 81, "end_offset": 91, "type": "<ALPHANUM>", "position": 9},
{"token": "to", "start_offset": 92, "end_offset": 94, "type": "<ALPHANUM>", "position": 10},
{"token": "bus", "start_offset": 95, "end_offset": 98, "type": "<ALPHANUM>", "position": 11}
So if you search for "publishing to bus", Elasticsearch matches the three tokens publishing, to and bus, and returns the document.
If you search for "None:LOG: publishing to bus", "None:LOG:" does not fully match the token user:none:username:none:log, so the document is not returned.
You can try "User:None:Username:None:LOG: publishing to bus" to get the result.
There are also some problems in Kibana with special characters such as :, | and -. When the analyzer encounters such a character it splits the text into separate tokens instead of keeping the whole string together, which is why it is easy to find publishing to bus, or None, or log on their own. The solution is to tell Elasticsearch that the field should not be analyzed.
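A minimal sketch of such a mapping for the pre-5.x API that Kibana 4 targets (the index name is hypothetical; django_logger is the document type from the example above). A common pattern is a not_analyzed "raw" subfield, so the analyzed field stays available for full-text search:
curl -XPUT 'http://localhost:9200/my_index' -d '
{
  "mappings": {
    "django_logger": {
      "properties": {
        "message": {
          "type": "string",
          "fields": {
            "raw": { "type": "string", "index": "not_analyzed" }
          }
        }
      }
    }
  }
}'
Exact phrase matches can then be run against message.raw, while message keeps working for ordinary full-text queries.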
