saving intermediate steps in gremlin - groovy

I am writing a query which should detect certain loops within a graph, which means that I need I need to assign names to certain nodes within the path so that I can compare nodes later in the path with the saved ones. for example A -> B -> C -> A. Is this possible within gremlin?

It sounds like you're looking for something like this:
https://github.com/tinkerpop/gremlin/wiki/Except-Retain-Pattern
where you keep a list of previously traversed vertices and then utilize that list later in the traversal.

Related

How to use the load function in Octave not knowing the variable name of the saved data?

I have several 100 *.mat files with matrices with same shapes. But over time this matrices were saved with different names. Lets say for this example a or b.
How can I now load and process this data without knowing the name using a new name instead?
An if condition is not an option, because there are too many different names.
I already tryed:
data = load('example_file.mat')
but then I need again the old variable names to access the matrices with data.a or data.b ...
What I need is something like:
load('example_file.mat') as matrix
Is this possible in Octave?
Your attempt
data = load('example_file.mat')
is the right start. You can examine data to find out what variables it contains. Assuming there’s always a single matrix in the MAT-file,
names = fieldnames(data);
data = data.(names{1});

Python3 - Access multikey dict with single key

I want to map a timestamp t and an identifier id to a certain state of an object. I can do so by mapping a tuple (t,id) -> state_of_id_in_t. I can use this mapping to access one specific (t,id) combination.
However, sometimes I want to know all states (with matching timestamps t) of a specific id (i.e. id -> a set of (t, state_of_id_in_t)) and sometimes all states (with matching identifiers id) of a specific timestamp t (i.e. t -> a set of (id, state_of_id_in_t)). The problem is that I can't just put all of these in a single large matrix and do linear search based on what I want. The amount of (t,id) tuples for which I have states is very large (1m +) and very sparse (some timestamps have many states, others none etc.). How can I make such a dict, which can deal with accessing its contents by partial keys?
I created two distinct dicts dict_by_time an dict_by_id, which are dicts of dicts. dict_by_time maps a timestamp t to a dict of ids, which each point to a state. Similiarly, dict_by_id maps an id to a dict of timestamps, which each point to a state. This way I can access a state or a set of states however I like. Notice that the 'leafs' of both dicts (dict_by_time an dict_by_id) point to the same objects, so its just the way I access the states that's different, the states themselves however are the same python objects.
dict_by_time = {'t_1': {'id_1': 'some_state_object_1',
'id_2': 'some_state_object_2'},
't_2': {'id_1': 'some_state_object_3',
'id_2': 'some_state_object_4'}
dict_by_id = {'id_1': {'t_1': 'some_state_object_1',
't_2': 'some_state_object_3'},
'id_2': {'t_1': 'some_state_object_2',
't_2': 'some_state_object_4'}
Again, notice the leafs are shared across both dicts.
I don't think it is good to do it using two dicts, simply because maintaining both of them when adding new timestamps or identifiers result in double work and could easily lead to inconsistencies when I do something wrong. Is there a better way to solve this? Complexity is very important, which is why I can't just do manual searching and need to use some sort of HashMap magic.
You can always trade add complexity with lookup complexity. Instead of using a single dict, you can create a Class with an add method and a lookup method. Internally, you can keep track of the data using 3 different dictionaries. One uses the (t,id) tuple as key, one uses t as the key and one uses id as the key. Depending on the arguments given to lookup, you can return the result from one of the dictionaries.

Solr: find groups containing a value

I have a structure of documents indexed in Solr that are grouped together by a property.
Let's say I have a group consisting of three documents:
A -> B -> C
I want to performa query by a property value V that will return the whole group whether A or B or C contain the value V.
For example - the query will return my whole group (A,B,C) if B contains the value V.
Is this possible in solr?
Thanks!
If I understand correctly, yes, this possible. You can use Graph query parser to do this:
you index your docs with the right info on how docs are linked to each other in some fields (see sample in the docs).
then, you query like this:
q={!graph+from=in_edge+to=out_edge}id:A
where id:A is the query to get the starting set of docs, and the {!graph ...} is to get all docs reachable from the starting set.
Some caveats:
works only in standalone solr or single Solrcloud shard (though some graph features are available also in Streaming expressions, those would work across all shards, but have less features for now)
depending on how the graph of docs you want to reach looks like (and how you index the edge info), works case you might need to run two queries, one to get the docs 'from' the starting set and another to get the docs 'to' the starting set, for example.

Find non-[cycle detected] leaves in Paths to Root in Memory Snapshot

In VS2015 there is a useful tool for finding what is holding on to a reference to an object. However, in the spaghetti-like code that I am working in expanding this tree out has many leaves that end with [cycle detected] (e.g. A has a reference to B and B has a reference to A). For example:
Is there a way to filter out the branches that contain just leaves with cycles? Or a way to export the data here?

How is insert O(log(n)) in Data.Set?

When looking through the docs of Data.Set, I saw that insertion of an element into the tree is mentioned to be O(log(n)). However, I would intuitively expect it to be O(n*log(n)) (or maybe O(n)?), as referential transparency requires creating a full copy of the previous tree in O(n).
I understand that for example (:) can be made O(1) instead of O(n), as here the full list doesn't have to be copied; the new list can be optimized by the compiler to be the first element plus a pointer to the old list (note that this is a compiler - not a language level - optimization). However, inserting a value into a Data.Set involves rebalancing that looks quite complex to me, to the point where I doubt that there's something similar to the list optimization. I tried reading the paper that is referenced by the Set docs, but couldn't answer my question with it.
So: how can inserting an element into a binary tree be O(log(n)) in a (purely) functional language?
There is no need to make a full copy of a Set in order to insert an element into it. Internally, element are stored in a tree, which means that you only need to create new nodes along the path of the insertion. Untouched nodes can be shared between the pre-insertion and post-insertion version of the Set. And as Deitrich Epp pointed out, in a balanced tree O(log(n)) is the length of the path of the insertion. (Sorry for omitting that important fact.)
Say your Tree type looks like this:
data Tree a = Node a (Tree a) (Tree a)
| Leaf
... and say you have a Tree that looks like this
let t = Node 10 tl (Node 15 Leaf tr')
... where tl and tr' are some named subtrees. Now say you want to insert 12 into this tree. Well, that's going to look something like this:
let t' = Node 10 tl (Node 15 (Node 12 Leaf Leaf) tr')
The subtrees tl and tr' are shared between t and t', and you only had to construct 3 new Nodes to do it, even though the size of t could be much larger than 3.
EDIT: Rebalancing
With respect to rebalancing, think about it like this, and note that I claim no rigor here. Say you have an empty tree. Already balanced! Now say you insert an element. Already balanced! Now say you insert another element. Well, there's an odd number so you can't do much there.
Here's the tricky part. Say you insert another element. This could go two ways: left or right; balanced or unbalanced. In the case that it's unbalanced, you can clearly perform a rotation of the tree to balance it. In the case that it's balanced, already balanced!
What's important to note here is that you're constantly rebalancing. It's not like you have a mess of a tree, decided to insert an element, but before you do that, you rebalance, and then leave a mess after you've completed the insertion.
Now say you keep inserting elements. The tree's gonna get unbalanced, but not by much. And when that does happen, first off you're correcting that immediately, and secondly, the correction occurs along the path of the insertion, which is O(log(n)) in a balanced tree. The rotations in the paper you linked to are touching at most three nodes in the tree to perform a rotation. so you're doing O(3 * log(n)) work when rebalancing. That's still O(log(n)).
To add extra emphasis to what dave4420 said in a comment, there are no compiler optimizations involved in making (:) run in constant time. You could implement your own list data type, and run it in a simple non-optimizing Haskell interpreter, and it would still be O(1).
A list is defined to be an initial element plus a list (or it's empty in the base case). Here's a definition that's equivalent to native lists:
data List a = Nil | Cons a (List a)
So if you've got an element and a list, and you want to build a new list out of them with Cons, that's just creating a new data structure directly from the arguments the constructor requires. There is no more need to even examine the tail list (let alone copy it), than there is to examine or copy the string when you do something like Person "Fred".
You are simply mistaken when you claim that this is a compiler optimization and not a language level one. This behaviour follows directly from the language level definition of the list data type.
Similarly, for a tree defined to be an item plus two trees (or an empty tree), when you insert an item into a non-empty tree it must either go in the left or right subtree. You'll need to construct a new version of that tree containing the element, which means you'll need to construct a new parent node containing the new subtree. But the other subtree doesn't need to be traversed at all; it can be put in the new parent tree as is. In a balanced tree, that's a full half of the tree that can be shared.
Applying this reasoning recursively should show you that there's actually no copying of data elements necessary at all; there's just the new parent nodes needed on the path down to the inserted element's final position. Each new node stores 3 things: an item (shared directly with the item reference in the original tree), an unchanged subtree (shared directly with the original tree), and a newly created subtree (which shares almost all of its structure with the original tree). There will be O(log(n)) of those in a balanced tree.

Resources