Filter the list generated by a Gremlin traversal and Groovy - groovy

I'm doing the following traversal:
g.V().has('Transfer','eventName','Airdrop').as('t1').
outE('sent_to').
inV().dedup().as('a2').
inE('sent_from').
outV().as('t2').
where('t1',eq('t2')).by('address').
outE('sent_to').
inV().as('a3').
select('a3','a2').
by('accountId').toList().groupBy { it.a3 }.collectEntries { [(it.key): [a2 : it.value.a2]]};
So as you can see I'm basically doing a traversal and at the end I'm using groovy with collectEntries to aggregate the results like I need them, which is aggregated by a3 in this case. The results look like this:
==>0xfe43502662ce2adf86d9d49f25a27d65c70a709d={a2=[0x99feb505a8ed9976cf19e757a9536117e6cdc5ba, 0x22019ad32ea3adabae68003bdefd099d7e5e3886]}
(This is GOOD, because the number of values in a2 is at least 2)
==>0x129e0131ea3cc16fe5252d7280bd1258f629f20f={a2=[0xf7958fad496d15cf9fd9e54c0012504f4fdb96ff]}
(This is NOT GOOD, I want to return in my list only those combinations where there are at lest 2 values for a2)
I have tried using filters and an additional where step in the traversal itself but I haven't been able to do it. I'm not sure if this is something I should skip using Groovy in my last line. Any help or orientation would be very much appreciated

I don't think you need to drop into Groovy to get the answer you want. It would be preferable to do this all in Gremlin especially since you intend to filter results which could yield some performance benefit. Gremlin has it's own group() step as well as methods for filtering the resulting Map:
g.V().has('Transfer','eventName','Airdrop').as('t1').
out('sent_to').
dedup().as('a2').
in('sent_from').as('t2').
where('t1',eq('t2')).by('address').
out('sent_to').inV().as('a3').
select('a3','a2').
by('accountId').
group().
by('a3').
by('a2').
unfold().
where(select(values).limit(local,2).count(local).is(gte(2)))
The idea is to build your Map with group() then deconstruct it to entries with unfold(). You the filter each entry with where() by selecting the values of the entry, which is a List of "a2" then counting the items locally in that List. I use limit(local,2) to avoid unnecessary iteration beyond 2 since the filter is gte(2).

The easiest way to do this is with findAll { }.
.groupBy { it.a3 }
.findAll { it.value.a2.size() > 1 }
.collectEntries { [(it.key): [a2: it.value.a2]] }
if some a2 are null, then value.a2 also evaluates to null and filters the results without the need for explicit nullchecks

Related

Not able to use for loop in ternary operator in arangodb

How do we write conditions in arango, that includes for loops. I can elaborate the requirement below.
My requirement is if a particular attribute(array type) exists in the arango collection, i would read data from the collection(that requires a loop) or else, might do the following :
return null
return empty string ""
do nothing.
Is this possible to achieve in arango?
The helping methods could be -->
-- has(collectionname, attributename)
-- The ternary operator ?:
let attribute1 = has(doc,"attribute1") ?(
for name in doc.attribute1.names
filter name.language == "xyz"
return name.name
) : ""
But this dosent work. Seems like arango compiler first attempts to compile the for loop, finds nulls and reports error as below. Instead, it should have compiled "has" function first for the ternary operator being used.
collection or array expected as operand to FOR loop; you provided a value of type 'null' (while executing)
If there is a better way of doing it, would appreciate the advice!!
Thanks in advance!
Nilotpal
Fakhrany here from ArangoDB.
Regarding your question, this is a known limitation.
From https://www.arangodb.com/docs/3.8/aql/fundamentals-limitations.html:
The following other limitations are known for AQL queries:
Subqueries that are used inside expressions are pulled out of these
expressions and executed beforehand. That means that subqueries do not
participate in lazy evaluation of operands, for example in the ternary
operator. Also see evaluation of subqueries.
Also noted here for the ternary operator:
https://www.arangodb.com/docs/3.8/aql/operators.html#ternary-operator.
An answer to the question what to do may be to use a FILTER before enumerating over the attributes:
FOR doc IN collection
/* the following filter will only let those documents passed in which "attribute1.names" is an array */
FILTER IS_ARRAY(doc.attribute1.names)
FOR name IN doc.attribute1.names
FILTER name.language == "xyz"
RETURN name.name
Other solutions are also possible. Depends a bit on the use case.

ArangoDB AQL: Find Gaps In Sequential Data

I've been given data to build an application that has sequential data in the form of part numbers of products: "000000", "000001", "000002", "000010", "000011" .... The previous application was an old MS Access database that didn't have any gap filling features in the part number generator, hence the gap between "000002" and "000010" (Yes, they are also strings, but I can work with that...).
We could continue to increment based on the last value and ignore the gaps, however, in an attempt to use all numbers available to us with our naming scheme, we'd like to be able to fill the gaps. Our naming scheme describes the "product family" with the first two digits such that: [00]0000 would be a different family from [02]0000.
I can find the starting and ending values using something like:
let query = `
LET first = (
MIN(
FOR part in part_search
SEARCH STARTS_WITH(part.PartNumber, #family)
RETURN part.PartNumber
)
)
LET last = (
MAX(
FOR part in part_search
SEARCH STARTS_WITH(part.PartNumber, #family)
RETURN part.PartNumber
)
)
RETURN { first, last }
`
The above example returns: {first: "000000", last: "000915"}
Using ArangoDB and AQL, how could I go about finding these gaps? I've found some SQL examples but I feel the features of AQL are a bit more limiting.
Thanks in advance!
To start with, I think your best bet for getting min/max values is using aggregates:
FOR part in part_search
SEARCH STARTS_WITH(part.PartNumber, #family)
COLLECT x = 1
AGGREGATE first = MIN(part.PartNumber), last = MAX(part.PartNumber)
RETURN {
first: first,
last: last
}
But that won't really help when trying to find gaps. And you're right - SQL has several logical constructs that could help (like using variables and cursor iteration), but even that would be a pattern I would discourage.
The better path might be to do a "brute force" approach - compare a table containing your existing numbers with a table of all numbers, using a native method like JOIN to find the difference. Here's how you might do that in AQL:
LET allNumbers = 0..9999
LET existingParts = (
FOR part in part_search
SEARCH STARTS_WITH(part.PartNumber, #family)
LET childId = RIGHT(part.PartNumber, 4)
RETURN TO_NUMBER(childId)
)
RETURN MINUS(allNumbers, existingParts)
The x..y construct creates a sequence (an array of numbers), which we use as the full set of possible numbers. Then, we want to return only the "non-family" part of the ID (I'm calling it "child"), which needs to be numeric to compare with the previous set. Then, we use MINUS to remove elements of existingParts from the allNumbers list.
One thing to note, that query would return only the "child" portion of the part number, so you would have to join it back to the family number later. Alternatively, you could also skip string-splitting, and get "fancy" with your list creation:
LET allNumbers = TO_NUMBER(CONCAT(#family, '0000'))..TO_NUMBER(CONCAT(#family, '9999'))
LET existingParts = (
FOR part in part_search
SEARCH STARTS_WITH(part.PartNumber, #family)
RETURN TO_NUMBER(part.PartNumber)
)
RETURN MINUS(allNumbers, existingParts)

Can a comparator function be made from two conditions connected by an 'and' in python (For sorting)?

I have a list of type:
ans=[(a,[b,c]),(x,[y,z]),(p,[q,r])]
I need to sort the list by using the following condition :
if (ans[j][1][1]>ans[j+1][1][1]) or (ans[j][1][1]==ans[j+1][1][1] and ans[j][1][0]<ans[j+1][1][0]):
# do something (like swap(ans[j],ans[j+1]))
I was able to implement using bubble sort, but I want a faster sorting method.
Is there a way to sort my list using the sort() or sorted() (Using comparator or something similar) functions while pertaining to my condition ?
You can create a comparator function that retuns a tuple; tuples are compared from left to right until one of the elements is "larger" than the other. Your input/output example is quite lacking, but I believe this will result into what you want:
def my_compare(x):
return x[1][1], x[1][0]
ans.sort(key=my_compare)
# ans = sorted(ans, key=my_compare)
Essentially this will first compare the x[1][1] value of both ans[j] and ans[j+1], and if it's the same then it will compare the x[1][0] value. You can rearrange and add more comparators as you wish if this didn't match your ues case perfectly.

Read multiple hash keys and keep only unique values

If I have data in Hiera like:
resource_adapter_instances:
'Adp1':
adapter_plan_dir: "/opt/weblogic/middleware"
adapter_plan: 'Plan_DB.xml'
'Adp2':
adapter_plan_dir: "/opt/weblogic/middleware"
adapter_plan: 'ODB_Plan_DB.xml'
'Adp3':
adapter_plan_dir: "/opt/weblogic/middleware"
adapter_plan: 'Plan_DB.xml'
And I need to transform this into an array like this, noting duplicates are removed:
[/opt/weblogic/middleware/Plan_DB.xml, /opt/weblogic/middleware/ODB_Plan_DB.xml]
I know I have to use Puppet's map but I am really struggling with it.
I tried this:
$resource_adapter_instances = hiera('resource_adapter_instances', {})
$resource_adapter_paths = $resource_adapter_instances.map |$h|{$h['adapter_plan_dir']},{$h['adapter_plan']}.join('/').uniq
notice($resource_adapter_instances)
But that doesn't work, and emits syntax errors. How do I do this?
You are on the right track. A possible solution is as follows:
$resource_adapter_instances = lookup('resource_adapter_instances', {})
$resource_adapter_paths =
$resource_adapter_instances.map |$x| {
[$x[1]['adapter_plan_dir'], $x[1]['adapter_plan']].join('/')
}
.unique
notice($resource_adapter_paths)
A few further notes:
The hiera function is deprecated so I rewrote using lookup and you should too.
Puppet's map function can be a little confusing - especially if you need to iterate with it through a nested Hash, as in your case. On each iteration, Puppet passes each key and value pair as an array in the form [key, value]. Thus, $x[0] gets your Hash key (Adp1 etc) and $x[1] gets the data on the right hand side.
Puppet's unique function is not uniq as in Bash, Ruby etc but actually is spelt out as unique.
Note I've rewritten it without the massively long lines. It's much easier to read.
If you puppet apply that you'll get:
Notice: Scope(Class[main]): [/opt/weblogic/middleware/Plan_DB.xml,
/opt/weblogic/middleware/ODB_Plan_DB.xml]

Cognos query calculation - how to obtain a null/blank value?

I have a query calculation that should throw me either a value (if conditions are met) or a blank/null value.
The code is in the following form:
if([attribute] > 3)
then ('value')
else ('')
At the moment the only way I could find to obtain the result is the use of '' (i.e. an empty character string), but this a value as well, so when I subsequently count the number of distinct values in another query I struggle to get the correct number (the empty string should be removed from the count, if found).
I can get the result with the following code:
if (attribute='') in ([first_query].[attribute]))
then (count(distinct(attribute)-1)
else (count(distinct(attribute))
How to avoid the double calculation in all later queries involving the count of attribute?
I use this Cognos function:
nullif(1, 1)
I found out that this can be managed using the case when function:
case
when ([attribute] > 3)
then ('value')
end
The difference is that case when doesn't need to have all the possible options for Handling data, and if it founds a case that is not in the list it just returns a blank cell.
Perfect for what I needed (and not as well documented on the web as the opposite case, i.e. dealing with null cases that should be zero).

Resources