Search in a hierarchical tree in Neo4j

I have a database organized as several hierarchical trees.
Nodes are identified by number, and nodes whose numbers begin with the same digits are connected by relationships. For example: (5)-[connect]-(50)-[connect]-(507)... etc. I want to search for a particular node, for example node 301, starting from its top-level parent, node 3. How do I write this query in Cypher?

If you want to search for a specific node starting from the first parent, I would suggest the following query:
MATCH (n {number:1})-[:CONNECT*0..]->(n1) return n, n1;
This query finds the node with property number = 1 and then every descendant reachable through CONNECT relationships. Because the lower bound is 0, the start node itself is also returned as n1. To search for a specific child node, add its property to the end of the pattern:
MATCH (n {number:1})-[:CONNECT*0..]->(n1 {number:101}) return n, n1;
The *0.. part controls the search depth: *0..n limits the search to at most n hops. Relationship types are case-sensitive, so write the type exactly as it is stored in your database. The documentation for the MATCH clause is a good place to start: https://neo4j.com/docs/developer-manual/current/cypher/clauses/match/
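To make the depth bounds concrete, here is a minimal Python sketch of which nodes a pattern such as -[:CONNECT*0..n]-> can reach. It does not show how Neo4j evaluates the query. The `children` dictionary and the `descendants` helper are hypothetical names, and the sample tree is made up from the numbering scheme in the question.

```python
from collections import deque

# Hypothetical sample tree: parent number -> child numbers (CONNECT relationships).
# Assumed to be a real tree, so no cycle check is needed.
children = {
    3: [30, 31],
    30: [301, 302],
    31: [310],
}

def descendants(root, max_depth=None):
    """Return every node reachable from root within max_depth hops.

    Depth 0 is the root itself, mirroring the lower bound in *0..
    max_depth=None means no upper bound, like *0.. without a limit.
    """
    found = []
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        found.append(node)
        if max_depth is not None and depth == max_depth:
            continue  # do not expand past the upper bound
        for child in children.get(node, []):
            queue.append((child, depth + 1))
    return found

print(301 in descendants(3))               # True
print(301 in descendants(3, max_depth=1))  # False
```

With an upper bound of 1, only the direct children of node 3 are visited, so node 301 (two hops away) is not found.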

Related

Apache TinkerPop Gremlin: how to filter the starting vertex from a path result

Hi all, I am new to Gremlin and I need some help.
I want to understand whether there is a way to do an aggregation over the full graph, per node, based on property values of the neighbors.
Example:
I want to calculate the mean amount of money spent by the "known" customers of each customer.
My graph structure is as follows:
Customer1 --> Phone_Number <-- Customer2
So the result I want is:
Customer1 Mean(Neighbors.value('money'))
Customer2 Mean(Neighbors.value('money'))
...
The neighbors of a customer (the base node) are all other customers (excluding the customer itself) connected to one or more of the phones associated with the base node.
I understand how to exclude the base node if I know its ID, but is there a way to do the calculation across the full graph while automatically excluding the "starting" node?
That is, ignore CustomerK along the whole path whenever we start from CustomerK?
Another small question: is there a way to filter vertices in the path by a property value of the starting vertex? For example, ignore all customers that are older than the starting vertex.
You can exclude the start vertex by labeling it with the as step and then filtering with the neq predicate,
like this:
g.V().hasLabel('Customer').as('V').
  project('name', 'mean').
    by('name').
    by(out().in().where(neq('V')).dedup().values('money').mean())
example: https://gremlify.com/as
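To filter by a property, you can do something similar with where(gt('V')).by('age'). Note that gt keeps the customers who are older than the starting vertex; use lte instead if you want to drop them. The coalesce step supplies a fallback value when a customer has no matching neighbors:
g.V().hasLabel('Customer').as('V').
  project('name', 'mean').
    by('name').
    by(coalesce(
      out().in().where(neq('V')).where(gt('V')).by('age').
        dedup().values('money').mean(),
      constant('no values')
    ))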
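As a rough cross-check of what the traversal computes, here is a small Python sketch of the same idea on in-memory data: for each customer, collect the other customers that share at least one phone, deduplicate them, and average their money. The sample data and the `neighbor_mean` helper are hypothetical and only illustrate the logic; they are not part of Gremlin.

```python
from statistics import mean

# Hypothetical sample data: customer -> phone numbers, and customer -> money spent.
phones = {
    "Customer1": {"555-0100"},
    "Customer2": {"555-0100", "555-0199"},
    "Customer3": {"555-0199"},
    "Customer4": {"555-0142"},
}
money = {"Customer1": 10.0, "Customer2": 20.0, "Customer3": 60.0, "Customer4": 5.0}

def neighbor_mean(start):
    """Mean money of the other customers sharing at least one phone with start."""
    neighbors = {  # a set, so each neighbor counts once (the role of dedup())
        other
        for other, other_phones in phones.items()
        if other != start and phones[start] & other_phones  # other != start plays neq('V')
    }
    if not neighbors:
        return "no values"  # same fallback as constant('no values')
    return mean(money[c] for c in neighbors)

for customer in phones:
    print(customer, neighbor_mean(customer))
```

Customer2 shares a phone with both Customer1 and Customer3, so its result is the mean of their two values, while Customer4 shares no phone and falls back to the placeholder.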

Combining new ArangoSearch views and graph traversals

I've read through the ArangoDB 3.4 docs and the ArangoSearch view tutorial, but I'm still unclear on if/how views can be combined with graph traversals. There is an example of a graph/view join in the tutorial; however, what I need to do is to simply filter the candidate pool resulting from a traversal with a view-based text search. For example:
"for i in 2..2 outbound start_doc edges1, inbound edges2 [filter by view] return i"
The initial 2-hop traversal from the "start_doc" vertex will result in a much smaller candidate pool than the entire collection. I want to then perform a text search on this candidate pool using a configured view (probably "text_en" analyzer).
Would I just define the view expression after the traversal? Or would I need to use a "union_distinct" function to combine the traversal and the search results? (This seems like it would be very inefficient, given a potentially very large result set from the view.)
Thanks!
This is how I solved a similar problem; perhaps it will work for you too:
for i in 2..2 outbound start_doc edges1, inbound edges2
  filter (
    for x in view
      search i._key == x._key and search_condition
      limit 1
      return x
  ) != []
  return i
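The shape of this query is a per-candidate check: each vertex produced by the traversal is kept only if the view also returns it for the search condition. A minimal Python sketch of that filtering logic, with hypothetical sample data and a hypothetical `matches_search` helper standing in for the subquery against the view:

```python
# Hypothetical stand-ins: keys reached by the 2-hop traversal, and indexed text per key.
traversal_result = ["doc1", "doc2", "doc3"]
text_of = {
    "doc1": "graph databases",
    "doc2": "cooking",
    "doc3": "graph search",
    "doc9": "graph theory",  # matches the search but was never reached by the traversal
}

def matches_search(key, term):
    # Plays the role of the subquery: look up this one key and test the search condition.
    return term in text_of.get(key, "")

# Keep only the traversal candidates that also satisfy the search.
filtered = [key for key in traversal_result if matches_search(key, "graph")]
print(filtered)  # ['doc1', 'doc3']
```

Only the traversal candidates are ever tested, so documents that match the search but lie outside the traversal (doc9 here) never enter the result.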

Cypher Query Return matched Nodes and optional relationships

I am trying to find the best way to return all matched nodes together with any relationships they might have.
Here's my problem:
I need to return all users who created a project, so:
match (u : User)-[r:CREATE]->(p: Project) return u, collect(p)
Simple enough, but a User could also have other relationships, and I would like to include them or optionally check for them (return true/false).
For example, a User could have a RECOMMEND relationship. I don't want to filter by it, but I do want to check whether it exists and, if so, with which node.
Ideally my return table would look like this:
USER1 - PROJECT(S) - RECOMMENDED USER
USER2 - PROJECT(S) - NULL (nobody is recommending)
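OPTIONAL MATCH matches the pattern if it exists and binds its variables to null if it does not. Use DISTINCT in the aggregations so that a project is not repeated once per recommendation row. Because collect() skips nulls, users with no RECOMMEND relationship get an empty list rather than null:
MATCH (u:User)-[r:CREATE]->(p:Project)
OPTIONAL MATCH (u)-[:RECOMMEND]->(rec)
RETURN u, collect(DISTINCT p), collect(DISTINCT rec)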
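OPTIONAL MATCH behaves much like a left outer join: the rows from the first MATCH are always kept, and the optional part only adds data where it exists. The following Python sketch mimics that behavior on plain lists. The sample relationships and the function name are hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical sample data standing in for CREATE and RECOMMEND relationships.
created = [("user1", "projectA"), ("user1", "projectB"), ("user2", "projectC")]
recommends = [("user1", "user9")]

def users_with_projects_and_recommendations():
    rows = {}
    # MATCH: only users that created at least one project appear at all.
    for user, project in created:
        row = rows.setdefault(user, {"projects": [], "recommended": []})
        row["projects"].append(project)
    # OPTIONAL MATCH: a missing relationship leaves the list empty
    # instead of removing the user from the result.
    for user, row in rows.items():
        row["recommended"] = [rec for u, rec in recommends if u == user]
    return rows

for user, row in users_with_projects_and_recommendations().items():
    print(user, row["projects"], row["recommended"])
```

Here user2 still appears with its project even though it has no recommendation, which is the behavior the question asks for.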

Search Documents from two collections in MarkLogic

In MarkLogic, I want to search across two collections by joining the id element of a document in collection1 to the id element of a document in collection2. When they match, I need the resulting documents from both collections.
I have the code below, but it is very slow. How can I use cts:search or search:search to achieve the same result?
for $i in collection('demographic')/individual,
$j in collection('membership')/membership[enrolleIndividualId/id/text() = $i/individual/id/text()])
return {$i,$j}
Update:
I should note that your sample is not valid XQuery: return element root { $i, $j } would be valid. Also, you should not use the /text() node selector, as its behavior can be counterintuitive. You can compare elements directly in an XPath predicate ([enrolleIndividualId/id eq $i/individual/id]). Use /fn:string() in place of /text() if you need the contents of an element as a string. I'd also recommend using the atomic equality operator eq in place of the sequence equality operator = when directly comparing individual elements.
Original Answer:
There are several approaches to implementing joins in MarkLogic, but I would first question your data model. From the names of the elements in your sample query, it looks like you are using a relational model (individuals have memberships). MarkLogic is a document database, and it's optimized for denormalized documents. You will be much better served to process your data and generate new individual documents that each contain the relevant membership data.
That being said, here's how you could join your documents:
First, you will need range indices to write performant joins. If the id element from your sample query is not unique to individuals, you will need path range indices on enrolledIndividualId/id and individual/id; otherwise, a simple element range index on id will do.
The most common join pattern in MarkLogic uses a "shotgun-OR" query; first retrieving values from the lexicon backing a range index, and then constructing an or-query from those values to retrieve the relevant documents. This won't work directly in your case, as you want to retrieve both sides of the join. You can either run a search for each pair of documents, or run a single search for one side, and then an additional document read for each document.
pairs:
for $value in cts:values(cts:path-reference("individual/id"))
return
  cts:search(/,
    cts:or-query((
      cts:and-query((
        cts:collection-query("demographic"),
        cts:path-range-query("individual/id", "=", $value))),
      cts:and-query((
        cts:collection-query("membership"),
        cts:path-range-query("enrolledIndividualId/id", "=", $value))))),
    "unfiltered")
shotgun-OR plus iteration:
for $doc in
  cts:search(/,
    cts:and-query((
      cts:collection-query("demographic"),
      cts:path-range-query("individual/id", "=",
        cts:values(cts:path-reference("individual/id"))))),
    "unfiltered")
return
  cts:search(/,
    cts:and-query((
      cts:collection-query("membership"),
      cts:path-range-query("enrolledIndividualId/id", "=", $doc/individual/id))),
    "unfiltered")
As you can see, each approach requires I/O proportional to the number of docs/values you want to join. If you only needed the shotgun-OR (i.e., a query for documents based on criteria from other documents), you would only need to make two requests: the initial cts:values() call to retrieve values from a lexicon, and the cts:search() call using a query built from those values.
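The two-request shape of a plain shotgun-OR can be sketched in Python on in-memory lists. This only illustrates the pattern and uses no MarkLogic API; the sample documents, the flattened field names and the `shotgun_or` helper are hypothetical.

```python
# Hypothetical in-memory stand-ins for the two collections.
demographic = [{"id": "1", "name": "Ann"}, {"id": "2", "name": "Bob"}]
membership = [
    {"enrolledIndividualId": "1", "plan": "gold"},
    {"enrolledIndividualId": "3", "plan": "basic"},
]

def shotgun_or(matching_name):
    # Request 1: collect the join values from one side
    # (the role played by the cts:values() lexicon call).
    ids = {d["id"] for d in demographic if d["name"] == matching_name}
    # Request 2: a single lookup for every document whose key is in that set
    # (the role played by one cts:search() with a query built from the values).
    return [m for m in membership if m["enrolledIndividualId"] in ids]

print(shotgun_or("Ann"))
```

The number of requests stays at two no matter how many values are collected, which is why this pattern is cheaper than running one search per pair.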
Note: the cts:query objects used in these examples could be used in conjunction with the Search API by means of the search:resolve() function.
Given your apparent data model, you will be much better served by processing your data into individual, de-normalized documents.

Hierarchical full-text search in Neo4j

I have a parent/child graph in Neo4j: Decision nodes (a parent with a list of child decisions), each with a name property (string) that I use for search. The following query correctly finds the decisions whose name contains the search term:
START d=node:node_auto_index({autoIndexQuery}) MATCH (d:Decision) RETURN d
I want to extend this query to find decisions that have the search term in their name AND ALSO have the search term in the names of their children.
The name of the relationship is CONTAINS (Decision CONTAINS decisions).
I think the following query should work:
START parent=node:node_auto_index({autoIndexQuery})
WITH parent
START child=node:node_auto_index({autoIndexQuery})
MATCH (parent:Decision)-[:CONTAINS]->(child:Decision)
WHERE parent <> child
RETURN parent, child;
One issue here is that a full-text query condition (I think) can only appear in a START block. This means you need to match both parent and child that way, then connect them with MATCH.
This might take some time to complete, depending on how many nodes match, since most of the work is checking whether the parent/child relationship exists between every pair of matching nodes. But it should get the job done.
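The cost described above comes from the pairwise check between the two result sets. A small Python sketch of that logic, using made-up decision names and a hypothetical `matching_pairs` helper (a substring test stands in for the index query):

```python
# Hypothetical sample data: decision names and CONTAINS relationships (parent, child).
names = {1: "budget plan", 2: "budget review", 3: "hiring", 4: "budget cuts"}
contains = {(1, 2), (1, 3), (3, 4)}

def matching_pairs(term):
    # One lookup stands in for both index queries, since they use the same term.
    hits = [node for node, name in names.items() if term in name]
    # Check every candidate pair; this is the step that grows with the hit count.
    return [(p, c) for p in hits for c in hits if p != c and (p, c) in contains]

print(matching_pairs("budget"))  # [(1, 2)]
```

Node 4 matches the term, but its parent (node 3) does not, so the pair is not returned; only pairs where both ends match and are directly connected survive.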
