GROOVY: Split the string into substrings and pair them with the key

I need to split the string into substrings and pair them with the key value.
I'm new to the Groovy language, so I'd appreciate any help :)
I have this:
{"key": "a", "tag": ""},
{"key": "b", "tag": "one, two"}
I want to get this:
{"key": "a", "tag": ""},
{"key": "b", "tag": "one"}
{"key": "b", "tag": "two"}

Use String::split(String) to split the strings.
Use Collection::flatten(Closure) to convert each entry object into any number of output objects; for this particular problem, that means converting each entry with an unsplit tag value into a separate entry for each comma-separated value in tag. You can also use Java 8 streams and the flatMap method to achieve the same result; flatten is just specific to Groovy (not necessarily better).
I don't think it would be good to give you a full Groovy solution, though, so I'll leave that up to you; a rough sketch of the idea is below.
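As an illustration of the idea only (written in Python rather than Groovy, so the actual Groovy translation stays with you; the entry structure is assumed from the example above):

entries = [
    {"key": "a", "tag": ""},
    {"key": "b", "tag": "one, two"},
]

# The same split-then-flat-map idea: each entry expands into one entry per comma-separated tag.
result = [
    {"key": e["key"], "tag": t.strip()}
    for e in entries
    for t in e["tag"].split(",")  # "" splits to [""], so entries with an empty tag are kept
]

print(result)
# [{'key': 'a', 'tag': ''}, {'key': 'b', 'tag': 'one'}, {'key': 'b', 'tag': 'two'}]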

Related

ArangoDB bindVars with dot?

I would like to sort ArangoDB query results by various properties of a nested object. However, the bind variables do not seem to work with dots in the names, so
query: FOR a IN collection SORT @key ASC RETURN a
bindVars: @key = 'a.b.c.d'
(or) @key = 'a.x.y'
does not work.
Is there a way to "eval" the bound string to the nested property?
EDIT:
I found in the docs that
"key": [ "a", "b", "c" ] should work but it does not work for me.
The document reference (here: a) needs to remain in the query. It must not be part of the bind variable.
FOR a IN collection SORT a.@key ASC RETURN a
with bindVars:
{ "key": ["b", "c", "d"] }
If you want to sort by two attributes:
FOR a IN collection SORT a.@key1 ASC, a.@key2 DESC RETURN a
{ "key1": ["b", "c", "d"], "key2": ["x", "y"] }

How to get a JSON array in one row in Hive, not trying to explode

I think my question is very simple but I am not able to find any specific answers.
I have a JSON like this:
{"data":
{"updated": "yes",
"car": [{"type": "fancy", "name": "bmw"},
{"type": "normal", "name": "honda"}]
}
}
I want to put this in a Hive table as:
updated | car
yes     | "{"type": "fancy", "name": "bmw"},{"type": "normal", "name": "honda"}"
I have used from_json and to_json but can't get it to work. Again, I don't want to break it down into multiple rows based on the array elements.
Any help is appreciated.
to_json should work!
import org.apache.spark.sql.functions.{col, to_json}
df.select(col("data.updated").alias("updated"), to_json(col("data.car")).alias("car")).show(false)
Output:
+-------+----------------------------------------------------------------+
|updated|car |
+-------+----------------------------------------------------------------+
|yes |[{"name":"bmw","type":"fancy"},{"name":"honda","type":"normal"}]|
+-------+----------------------------------------------------------------+
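If you are on PySpark rather than Scala, the equivalent would look roughly like this (the input path is a placeholder, and multiLine is needed because the JSON above spans several lines):

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, to_json

spark = SparkSession.builder.appName("car-array-as-one-row").getOrCreate()

# Placeholder path; multiLine lets Spark read the pretty-printed JSON shown above.
df = spark.read.option("multiLine", True).json("/path/to/input.json")

df.select(
    col("data.updated").alias("updated"),
    to_json(col("data.car")).alias("car"),  # keep the whole array as a single JSON string
).show(truncate=False)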

Insert values into a list of dictionaries

I want to add dictionary values to my list of dictionaries. The keys are not unique but are the same across all the dictionaries, and these values are in turn added to the Google Sheet under their respective columns (keys) at a particular index.
I initially thought I would make a list of the keys and then the list of values to add and insert them, but I don't think that's the right approach.
I'm stuck!
list of dictionaries:
[
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7}
]
You need to create a list with the keys, like:
KEYS = ['name', 'age']
and for each set of new values:
values = ['jane', 15]
do:
list_of_dictionaries.append(dict(zip(KEYS, values)))
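Put together, a minimal runnable sketch using the sample data from the question (writing the rows back to the Google Sheet is out of scope here):

KEYS = ['name', 'age']

list_of_dictionaries = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7},
]

# zip pairs each key with its value; dict turns those pairs into a dictionary.
values = ['jane', 15]
list_of_dictionaries.append(dict(zip(KEYS, values)))

print(list_of_dictionaries[-1])  # {'name': 'jane', 'age': 15}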

Azure Stream Analytics query language get value by key from array of key value pairs

I am trying to extract a specific value from an array property in the Stream Analytics query language.
My data looks as follows:
"context": {
"custom": {
"dimensions": [{
"MacAddress": "ma"
},
{
"IpAddress": "ipaddr"
}]
}
}
I am trying to obtain a result that has "MacAddress", "IpAddress" as column titles and "ma", "ipaddr" as the values of a single row.
I am currently achieving this with this query:
SELECT
GetRecordPropertyValue(GetArrayElement(MySource.context.custom.dimensions, 0), 'MacAddress') AS MacAddress,
GetRecordPropertyValue(GetArrayElement(MySource.context.custom.dimensions, 1), 'IpAddress') AS IpAddress,
I am trying to use CROSS APPLY, but so far no luck. Below is the CROSS APPLY query:
SELECT
flat.ArrayValue.MacAddress as MacAddress,
flat.ArrayValue.IpAddress as IpAddress
FROM
[ffapi-track-events] as MySource
CROSS APPLY GetArrayElements(MySource.context.custom.dimensions) as flat
This one produces two rows instead of one:
MacAddress | IpAddress
ma         |
           | ipaddr
so what I'm missing is precisely the flattening into a single row when I write it like that.
I would like to avoid hardcoding the index 0, as it's not guaranteed that MacAddress won't switch places with IpAddress... So I need something like a FindElementInArray by condition, or some means to join with the dimensions array.
Is there such a thing?
Thank you.
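To make the desired output shape concrete, here it is as a plain Python sketch over the sample payload above; this only illustrates the goal, not a Stream Analytics solution:

dimensions = [{"MacAddress": "ma"}, {"IpAddress": "ipaddr"}]

# Merge the single-key records into one row, independent of their order in the array.
row = {key: value for record in dimensions for key, value in record.items()}

print(row)  # {'MacAddress': 'ma', 'IpAddress': 'ipaddr'}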

Logic for reverse search result

I have a use case where, for a given result value, I want to reverse-lookup all the defined search conditions that would yield this value as a result.
So, I have a set of search conditions defined in a table as a key-value list. Each entry in this table is a search query. Now, I have a random value in the dataset which could be a result of any of the search entries defined in the table. I want to look up that table so that, for this value, I can get all the possible search queries in whose results this value would appear.
The search table consists of the fields search_conditions and search_table, among other fields.
The schema would be like:
Search_Table
id (long)
search_table_id (long)
search_conditions (json array as text)
This is the value of one such search condition:
[
    {
        "key": "name",
        "operator": "equals",
        "value": "jeff"
    },
    {
        "key": "age",
        "operator": "between",
        "value": [20, 40]
    }
]
The value I have to search for can be, say, a random user {"name": "mr x", "age": 12}.
This may not be exactly a technology-based question, but its solution may require technology. Any help will be appreciated. The main concern is optimization, as this has to be done in real time.
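As a starting point, a brute-force matcher for the condition format shown above could look like the Python sketch below; only the equals and between operators are handled, the sample row is made up from the example, and the indexing/optimization needed for real-time use is deliberately left out:

import json

def matches(record, conditions):
    # True only if the record satisfies every condition in the list.
    for cond in conditions:
        value = record.get(cond["key"])
        if cond["operator"] == "equals":
            if value != cond["value"]:
                return False
        elif cond["operator"] == "between":
            low, high = cond["value"]
            if value is None or not (low <= value <= high):
                return False
        else:
            raise ValueError("unsupported operator: " + cond["operator"])
    return True

# Illustrative rows standing in for Search_Table; search_conditions is stored as JSON text.
rows = [
    {"id": 1, "search_conditions": '[{"key": "name", "operator": "equals", "value": "jeff"}, {"key": "age", "operator": "between", "value": [20, 40]}]'},
]

record = {"name": "mr x", "age": 12}
matching_ids = [row["id"] for row in rows if matches(record, json.loads(row["search_conditions"]))]
print(matching_ids)  # [] because the sample user matches neither condition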
