How can I add a new Vertex only if it is possible to add an Edge too? - tinkerpop3

I need to add a new Vertex with an Edge.
My code is something like:
g.V().addV("Vertex").addE("MyEdge").to(V().has("OtherVertex", "name", "test"))
If V().has("OtherVertex", "name", "test") returns a Vertex, everything works fine. My problem is that if the OtherVertex doesn't exist, Gremlin adds the new Vertex without any edges. I would like to add the new Vertex only if I can create the Edge.
I am using Gremlin Server for development. My guess is that I could use transactions, but I am not sure whether AWS Neptune supports them yet.
Any suggestion?
Thanks.
To avoid transactions, I realized that I can select OtherVertex first. If it doesn't exist, the query will not create a new Vertex:
g.V().has("OtherVertex", "name", "test").as('t').addV("Vertex").as('v').addE("MyEdge").from('v').to('t')
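The reason this ordering works can be sketched with a small in-memory model. This is a hedged illustration in plain Python, not TinkerPop code: the function name add_vertex_with_edge and the dict/tuple layout are made up for the example. It only shows that when the target lookup comes first, the additions run once per match and therefore not at all when nothing matches.

```python
# Toy in-memory model, not TinkerPop code: vertices are dicts, edges are tuples.

def add_vertex_with_edge(vertices, edges, target_name):
    """Add a 'Vertex' plus a 'MyEdge' edge for each matching OtherVertex."""
    # Step 1: look up the edge target first (models V().has(...).as('t')).
    targets = [v for v in vertices
               if v["label"] == "OtherVertex" and v.get("name") == target_name]

    next_id = max((v["id"] for v in vertices), default=0) + 1
    created = []
    # Step 2: the additions run once per target, so zero targets means zero writes.
    for target in targets:
        new_vertex = {"id": next_id, "label": "Vertex"}
        next_id += 1
        vertices.append(new_vertex)
        edges.append(("MyEdge", new_vertex["id"], target["id"]))
        created.append(new_vertex)
    return created


vertices = [{"id": 1, "label": "OtherVertex", "name": "test"}]
edges = []

add_vertex_with_edge(vertices, edges, "missing")
print(len(vertices), len(edges))  # 1 0

add_vertex_with_edge(vertices, edges, "test")
print(len(vertices), len(edges))  # 2 1
```

The same model also shows a side effect worth keeping in mind: if more than one OtherVertex matches, one new Vertex is created per match.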

As you wrote, this is the correct approach:
g.V().has("OtherVertex", "name", "test").as('t').
addV("Vertex").as('v').addE("MyEdge").from('v').to('t')
I would just add something in relation to your initial attempt that showed:
g.V().addV("Vertex")
I think you simply meant to start with:
g.addV("Vertex")
If you go with the former you create some unintended problems:
gremlin> g = TinkerGraph.open().traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin> g.V().addV('person')
gremlin> g.V().addV('person')
gremlin> g
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
Note that nothing gets added for an empty graph. That is because V() returns no vertices and therefore there is nothing in the pipeline to trigger addV() with. Let's assume you have some data though:
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V().addV('person')
==>v[13]
==>v[14]
==>v[15]
==>v[16]
==>v[17]
==>v[18]
gremlin> g.V().addV('person')
==>v[19]
==>v[20]
==>v[21]
==>v[22]
==>v[23]
==>v[24]
==>v[25]
==>v[26]
==>v[27]
==>v[28]
==>v[29]
==>v[30]
Now, there's an even worse problem in that we're adding one vertex for every existing vertex as V() now returns all the vertices in the graph on each execution.
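The two console sessions above can be mimicked with a toy model. The sketch below is plain Python under simplifying assumptions (a graph is just a list, and V() takes a snapshot of it); the helper names are invented for the illustration and are not part of any Gremlin library. It reproduces the counts seen above: nothing added to an empty graph, then 6 and 12 additions on a graph that starts with 6 vertices.

```python
# Toy model only, not TinkerPop code. A "graph" is a list of vertex labels,
# and each step turns a stream of traversers into a new stream.

def V(graph):
    # Snapshot of the vertices that exist when the step starts.
    return list(graph)

def addV(graph, traversers, label):
    # One new vertex per incoming traverser; none if the stream is empty.
    added = []
    for _ in traversers:
        graph.append(label)
        added.append(label)
    return added

def add_via_V(graph, label):
    """Models g.V().addV(label)."""
    return addV(graph, V(graph), label)

def add_via_source(graph, label):
    """Models g.addV(label): the source itself supplies a single traverser."""
    return addV(graph, [None], label)


empty = []
add_via_V(empty, "person")
print(len(empty))   # 0

modern = ["v"] * 6
add_via_V(modern, "person")
print(len(modern))  # 12
add_via_V(modern, "person")
print(len(modern))  # 24
add_via_source(modern, "person")
print(len(modern))  # 25
```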
As a final note, please see this Gremlin Recipe for ways to do get-or-create type operations as it's related to this conditional sort of mutation that you're doing now.

Related

ArangoDB - Swap _to and _from values for edge using AQL

Is there a clean way to swap the _to and _from values for an edge using AQL? According to Arango's documentation on Edges:
To change edge endpoints you would need to remove old document/edge and insert new one. Other fields can be updated as in default collection.
So what I was able to come up with was a query that looks like this:
FOR edge IN edge_collection
FILTER [some criteria]
LET tempEdge = KEEP(edge, ATTRIBUTES(edge, true))
LET newEdge = MERGE([{'_key':edge._key}, {'_from':edge._to}, {'_to':edge._from}, tempEdge])
REPLACE newEdge IN edge_collection
RETURN NEW
To explain my own solution a bit, I used the ATTRIBUTES(edge, true) function to get the names of all of the Attributes on the Edge, and the true parameter removed the internal attributes (like _key, _id, _to, etc.). Read more about ATTRIBUTES here.
Then the KEEP(edge, [attributes]) function returns a new Document that only has the Attributes specified in the given array, which thanks to the ATTRIBUTES function in this case, is everything but the internal fields. Read more about KEEP here.
Then I use the MERGE function to combine the _key from the original edge, swap the _to and _from values, and all of the non-internal attributes. Read more about MERGE here.
Lastly, I use REPLACE which removes the original edge and adds the new one in, just like Arango requires. Read more about REPLACE here.
Like I said, this appears to work, but the MERGE in particular feels like the wrong way to go about doing what I did. Is there an easier way to set values on an Object? For instance, something that would let me just make a call similar to: tempEdge._from = edge._to?
Yes, there is a simpler solution:
FOR edge IN edge_collection
FILTER [some criteria]
UPDATE edge WITH {_from: edge._to, _to: edge._from} IN edge_collection
RETURN NEW
_from and _to can be updated (in contrast to the system attributes _id, _key and _rev), so you don't need to replace the whole document. And since UPDATE merges the changes into the existing document, you only need to specify the new values for _from and _to.
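The difference between merging and replacing can be pictured with plain dictionaries. This is only a rough Python analogy, not ArangoDB code: the functions update and replace and the sample edge are assumptions made for the illustration, and real AQL handles system attributes with more care than this.

```python
# Toy model of the two write styles, using plain dicts.

def update(stored, patch):
    """Models UPDATE ... WITH: the patch is merged into the stored document."""
    return {**stored, **patch}

def replace(stored, new_doc):
    """Models REPLACE: user attributes come only from new_doc; _key is kept."""
    return {"_key": stored["_key"], **new_doc}


edge = {"_key": "e1", "_from": "users/a", "_to": "users/b", "weight": 3}
swap = {"_from": edge["_to"], "_to": edge["_from"]}

print(update(edge, swap))
# {'_key': 'e1', '_from': 'users/b', '_to': 'users/a', 'weight': 3}
print(replace(edge, swap))
# {'_key': 'e1', '_from': 'users/b', '_to': 'users/a'}
```

In the merge style the weight attribute survives without being mentioned, which is why the KEEP/ATTRIBUTES/MERGE steps are unnecessary there.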

how to do Gremlin contain search for both number and string

Neptune 1.0.2.1 + Gremlin + nodejs.
I have a vertex with a property, e.g. vertex label Device and property Test. The Test property can store different types of data, e.g. numbers and strings:
Vertex 1 - Test = ['ABCD','xyz']
Vertex 2 - Test = [123,'XYZ']
I want to do a 'containing' search, e.g. Test=A or Test=123, regardless of the data type.
I was trying
queryText = 'BC' // this throws an error
or queryText = 123 // this actually works
// I expect both cases to return a result.
g.V().hasLabel('Device').or(__.has('Test', parseFloat(queryText)), __.has('Test', textP.containing(queryText)));
but I get an 'InternalFailureException' error.
Is it possible to write a single query that works regardless of the data type?
If not, can I at least make textP.containing work across multiple queries, assuming I know the data type? Right now the containing search throws an error if the property contains a number.
It looks like you have the closing bracket in the wrong place inside the or() step. You need to close the first has step before the comma.
In your example
g.V().hasLabel('Device').or(__.has('Test', parseFloat(queryText), __.has('Test', textP.containing(queryText))));
Which should be
g.V().hasLabel('Device').or(__.has('Test', parseFloat(queryText)), __.has('Test', textP.containing(queryText)));
EDITED and UPDATED
With the corrected query and the additional clarification that the data model contains different types for the same property key, I was able to reproduce what you are seeing. However, the same behavior can be seen using TinkerGraph as well as Neptune. The error message generated is a little different but the meaning is the same. Given that TinkerGraph behaves the same way, I am of the opinion that Neptune is behaving consistently with the "reference" implementation. That said, this raises a question as to whether the TextP predicates should be smarter and check the type of the property before attempting the test.
gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin> g.addV('test').property('x',12.5)
==>v[0]
gremlin> g.addV('test').property('x','ABCDEF')
==>v[2]
gremlin> g.V().hasLabel('test').or(has('x',12.3),has('x',TextP.containing('CDE')))
java.math.BigDecimal cannot be cast to java.lang.String
Type ':help' or ':h' for help.
Display stack trace? [yN]
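What a type-aware text predicate would look like can be sketched outside Gremlin. The Python below is a hypothetical model, not the TextP implementation: naive_containing, safe_containing, matches and matches_any are names invented for the example. It contrasts a substring test that fails on numbers with one that checks the type first, then combines it with a numeric equality test the way the or() step in the query does.

```python
# Toy predicates, not TinkerPop code.

def naive_containing(value, needle):
    # Assumes value is a string; raises TypeError for numbers.
    return needle in value

def safe_containing(value, needle):
    # Type check first, so non-string values simply do not match.
    return isinstance(value, str) and needle in value

def matches(value, query):
    """Numeric equality OR substring match, whichever applies to the value."""
    try:
        number = float(query)
    except ValueError:
        number = None
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if number is not None and is_number and value == number:
        return True
    return safe_containing(value, str(query))

def matches_any(values, query):
    # A multi-valued property matches if any of its values matches.
    return any(matches(v, query) for v in values)


print(matches_any(["ABCD", "xyz"], "BC"))   # True
print(matches_any([123, "XYZ"], "123"))     # True
print(matches_any([123, "XYZ"], "BC"))      # False
```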
ADDITIONAL UPDATE
I created a Jira issue so the Apache TinkerPop community can consider making a change to the TextP predicates.
https://issues.apache.org/jira/browse/TINKERPOP-2375

How to get [a b] as my output when I pass a,b using values(a,b) in the same order?

I am using the Gremlin Console to check whether my query works. I get the required data, but in the reverse order.
At the end of my query, I am using
values('id','name').fold()
But the output I am getting is [name id].
How can I get the output as [id name]?
Thank you.
Gremlin typically doesn't do anything to preserve order and relies on the order provided by the underlying graph database. If you need order then you need to specify the order in some way. I think that you could use union() for that:
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V(1).union(values('name'),values('age')).fold()
==>[marko,29]
gremlin> g.V(1).union(values('age'),values('name')).fold()
==>[29,marko]
I'm not sure if it's a great idea to rely on union() though. TinkerGraph should respect the order specified there, but I'm not sure if other graphs will. Perhaps a better solution would be to be explicit with order():
gremlin> g.V(1).properties('name','age').order().by(key).value()
==>29
==>marko
That's an alphabetical sort by key though, not the order you typed them. Ultimately, if you're just validating results, it would probably be better to return a Map with project():
gremlin> g.V(1).project('name','age').by('name').by('age')
==>[name:marko,age:29]
gremlin> g.V(1).project('age','name').by('age').by('name')
==>[age:29,name:marko]
gremlin> g.V(1).project('age','name').by('age').by('name').next().getClass()
==>class java.util.LinkedHashMap
As you can see, project() preserves order, as it uses a LinkedHashMap and executes the by() modulators in the order they are defined. What's neat about that is that if you really wanted the List form you were asking about in your initial question, you could then just grab the values from the Map:
gremlin> g.V(1).project('name','age').by('name').by('age').select(values)
==>[marko,29]
gremlin> g.V(1).project('age','name').by('age').by('name').select(values)
==>[29,marko]
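The idea of building an insertion-ordered map and then taking its values can be mirrored in a few lines of Python, whose dicts also keep insertion order. This is only an analogy: the project function below is a stand-in written for the illustration, and the marko dict is sample data taken from the console output above.

```python
def project(vertex, *keys):
    """Models project(k1, k2, ...).by(k1).by(k2): one entry per key, in key order."""
    return {key: vertex[key] for key in keys}


marko = {"name": "marko", "age": 29}

print(list(project(marko, "name", "age").values()))  # ['marko', 29]
print(list(project(marko, "age", "name").values()))  # [29, 'marko']
```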
Hopefully one of these approaches works for your situation.

gremlin outputs different from as seen on the internet, I think in bytes

How can I get Gremlin to output normal indices along with v?
Currently it outputs something like this:
gremlin> g.V
WARN com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx - Query requires iterating over all vertices [()]. For better performance, use indexes
gremlin> juno = g.addVertex(null);
==>v[128824]
gremlin> june = g.addVertex(null);
==>v[128828]
gremlin> jape = g.addVertex(null);
==>v[128832]
But from what I saw on the internet, the output should look something like this when a vertex is added to the graph:
gremlin> g.V
WARN com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx - Query requires iterating over all vertices [()]. For better performance, use indexes
gremlin> juno = g.addVertex(null);
==>v[1]
gremlin> june = g.addVertex(null);
==>v[2]
gremlin> jape = g.addVertex(null);
==>v[3]
The same problem occurs when I try to load about 10000 vertices. All of these vertices have an _id field, but after loading this field is gone, and the vertices have not been loaded with this id either. The same is true of the _type field: it is also not present after loading.
I need the id and type because they map to something in another table too.
Here is a look at my Rexster Dog House showing the 3 loaded vertices:
http://i.imgur.com/xly0jf8.png
So I am a bit confused about all of this.
Thanks in advance
When vertices are added to Titan, an Element identifier is assigned. That value is up to Titan and you should not expect it to start at "1" or any other specific number. If you need a number like that, you should add it yourself as a property.
With respect to the _id and _type fields, I'm assuming that you are referring to fields found in JSON output from Rexster. Note that those are Rexster fields that are appended to the output. _id is always there and should map directly to Vertex.id() or Edge.id() depending on the data you are returning. _type just refers to whether the JSON returned is representative of a "vertex" or an "edge". That data is not stored in Titan itself.

How to find the information about index created on Titan/Rexster graph database with cassandra as datastore

I have a Rexster/Titan + Cassandra configuration. I have created a unique index over vertex properties. How can I verify that the index was created properly, and also check other properties such as uniqueness and any other information about the created index?
As you are using Titan, you can use the TitanManagement API:
gremlin> g = TitanFactory.open('conf/titan-berkeleydb-es.properties')
==>titangraph[berkeleyje:/home/smallette/jvm/titan-0.5.4-hadoop1/conf/../db/berkeley]
gremlin> GraphOfTheGodsFactory.load(g)
==>null
gremlin> mgmt = g.getManagementSystem()
==>com.thinkaurelius.titan.graphdb.database.management.ManagementSystem@6ac756b
gremlin> mgmt.getGraphIndexes(Vertex.class).collect{[it.name,it.fieldKeys.collect{it.cardinality}]}
==>[name, [SINGLE]]
==>[vertices, [SINGLE]]
gremlin> mgmt.rollback()
==>null
You can either issue the query as I did from the Gremlin Console or you should be able to simply issue the same query like that to Rexster's Gremlin Extension to get that result. Be sure to call rollback (or commit) to close the management API transaction, especially if using Rexster (Rexster doesn't auto-manage those).

Resources