Gremlin: SetProperty iteratively to existing graph database - groovy

I am trying to run JUNG's PageRank algorithm on my existing Neo4j graph database and save each node's score as a property for future reference.
So I created the following Groovy file:
import edu.uci.ics.jung.algorithms.scoring.PageRank
g = new Neo4jGraph('/path/to/graph.db')              // open the embedded Neo4j database
j = new GraphJung(g)                                 // expose the Blueprints graph to JUNG
pr = new PageRank<Vertex,Edge>(j, 0.15d)             // alpha (jump probability) = 0.15
pr.evaluate()                                        // run the algorithm
g.V.sideEffect{it.pagerank = pr.getVertexScore(it)}  // store each vertex's score as a property
and ran it through Gremlin.
It runs smoothly, and if I check the property via g.v(2381).map(), I get what I'd expect.
However, when I quit Gremlin and start up my Neo4j server, these modifications are non-existent.
Can anyone explain why and how to fix this?
My hunch is that it has something to do with my graph in gremlin being embedded:
gremlin> g
==>neo4jgraph[EmbeddedGraphDatabase [/path/to/graph.db]]
Any ideas?

You will need a g.shutdown() at the end of your Groovy script. Without g.shutdown(), changes to the graph may remain only in memory; re-initializing the graph from disk (/path/to/graph.db in your case) then loses the changes that were never flushed. g.shutdown() flushes the current transaction from memory to disk, which ensures your changes persist and are there when you access the database again.
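For illustration, the tail of the script with the fix applied (a sketch; the explicit iterate() forces the pipeline to run when executed as a script rather than line-by-line in the console):

g.V.sideEffect{it.pagerank = pr.getVertexScore(it)}.iterate()
g.shutdown()  // commits the open transaction and flushes changes to disk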
Hope this helps.
Note: your hunch about the embedded database is correct. This issue will not occur if you use Neo4j's REST interface, because every REST API request is treated as a single transaction.

Related

Azure Cosmos DB Python SDK : Query items from change feed using checkpoints?

I'm a newbie to Cosmos DB... please shed some light.
@Matias Quaranta - thank you for the samples.
From the official samples it seems like the Change feed can be queried either from the beginning or from a specific point in time.
options["startFromBeginning"] = True
or
options["startTime"] = time
What other options does the QueryItemsChangeFeed method support?
Does it support querying from a particular check point within a partition?
Glad the samples are useful. Strictly speaking, the concept of "checkpoints" does not exist in the Change Feed. Checkpointing is basically you storing the last processed batch or continuation after every execution, in case your process halts.
When the process starts again, you can take your stored continuation and use it.
This is what the Change Feed Processor Library and our Azure Cosmos DB Trigger for Azure Functions do for you internally.
To pass the continuation in Python, you can use options['continuation'], and you can read the token back from the 'x-ms-continuation' response header.
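As a rough sketch of that checkpoint pattern with the pydocumentdb client used in the official samples (HOST, MASTER_KEY, and collection_link are placeholders; for a partitioned collection you would also set options['partitionKeyRangeId'] per partition key range):

import pydocumentdb.document_client as document_client

client = document_client.DocumentClient(HOST, {'masterKey': MASTER_KEY})

# First run: read from the beginning and remember where we stopped.
options = {'startFromBeginning': True}
docs = list(client.QueryItemsChangeFeed(collection_link, options))
checkpoint = client.last_response_headers.get('x-ms-continuation')
# ... persist 'checkpoint' somewhere durable ...

# Later run: resume from the stored checkpoint instead of the beginning.
options = {'continuation': checkpoint}
new_docs = list(client.QueryItemsChangeFeed(collection_link, options))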
Referring to the sample code ReadFeedForTime: I have tried options["startTime"], but it doesn't work; the response is the same list of documents as when starting from the beginning.

Node.js - Scaling with Redis atomic updates

I have a Node.js app that performs the following:
get data from Redis
perform a calculation on the data
write the new result back to Redis
This process may take place several times per second. The issue I now face is that I wish to run multiple instances of this process, and I am seeing out-of-date data being written back, because each instance updates after another instance has already read the old value.
How would I make the above process atomic?
I cannot simply wrap the whole operation in a Redis transaction, as I need the result of the GET (which inside MULTI would only be returned at EXEC time) before I can process and update.
Can anyone advise?
Apologies for the lack of clarity with the question.
After further reading, indeed I can use transactions. The part I was struggling to understand was that I need to separate the read from the update: WATCH the key, GET it, then wrap just the update in the transaction. The transaction then fails at EXEC if another client has updated the key in the meantime.
So the workflow is:
WATCH key
GET key
MULTI
SET key
EXEC
Hopefully this is useful for anyone else looking for an atomic get-and-update.
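A minimal sketch of that workflow using the node_redis client (atomicUpdate and compute are illustrative names, not library API; the retry on a null reply is the key part):

var redis = require('redis');
var client = redis.createClient();

function atomicUpdate(key, compute, done) {
  client.watch(key, function (err) {
    if (err) return done(err);
    client.get(key, function (err, value) {
      if (err) return done(err);
      client.multi()
        .set(key, compute(value))        // queue the update
        .exec(function (err, replies) {  // EXEC runs atomically
          if (err) return done(err);
          // replies === null: the WATCHed key changed, the transaction
          // was aborted, so retry the whole read-modify-write cycle
          if (replies === null) return atomicUpdate(key, compute, done);
          done(null, replies);
        });
    });
  });
}

atomicUpdate('counter', function (v) { return Number(v || 0) + 1; }, function (err) {
  if (err) console.error(err);
});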
Redis supports atomic transactions: http://redis.io/topics/transactions

Hazelcast Map reload on demand

Hazelcast 3.2-RC1 evaluation:
I am not able to find any Hazelcast API to reload, i.e. trigger MapLoader (loadAllKeys(), loadAll()), on demand.
I see this autoload (ALL) happens only when the server starts, but I need a way to reload on demand in order to re-synchronize with the underlying database.
Map.clear() clears all the data, but I cannot find any way to reload automatically, short of writing additional code to fetch the data and push it into the cache.
Can someone advise if there are any workarounds?
Thanks
The documentation says that the MapStore is called if a key is not in memory. So after you clear the map, it will be repopulated simply by calling get() on it; you will then only hold in memory the data that is actually used.
On the other hand, MapStore is called "when the map is first touched/used". Maybe you can create a new Hazelcast map and switch over to the new map.
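As a minimal sketch of that read-through reload (the map name "products" and its MapLoader configuration are assumptions for illustration):

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;

public class ReloadOnDemand {
    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        // "products" is assumed to be configured with a MapLoader in hazelcast.xml
        IMap<Long, String> products = hz.getMap("products");

        products.clear();              // drop all in-memory entries
        String p = products.get(42L);  // cache miss -> MapLoader.load(42L) re-reads the database
    }
}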
See http://www.hazelcast.org/docs/latest/manual/html-single/hazelcast-documentation.html#persistence for more information.
Regards
Thorsten

How to delete and create graphs in titan/cassandra?

I am just wondering how to delete an entire graph from Titan/Cassandra.
I've set up Titan with Elasticsearch, Cassandra, and Rexster from the website. I created a simple graph called "graph". The problem I encountered is that I did not realize that
Once an index has been created for a key, it can never be removed.
So when experimenting I created a lot of random indexes, good and bad. I tried renaming the graph in rexster-cassandra-es.xml, thinking it would lose the reference, but it just continued under a different Rexster path.
Also, how do you create new databases? When I started Rexster it just created a database named "graph" for me, and I noticed that it allows you to choose which database to use.
Thank you very much.
You can do a few things, but the easiest might be to simply shut down Cassandra/Elasticsearch/Rexster and delete the data directories. Restart it all and get a fresh start. You could also drop the "titan" keyspace in Cassandra and delete the indices in Elasticsearch; the instructions for those approaches are in the respective documentation for each component.
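For the second approach, a sketch of the two commands involved (assuming Titan's default "titan" keyspace and index name, and Elasticsearch on localhost:9200):

DROP KEYSPACE titan;                          -- run inside cqlsh, removes Titan's Cassandra data
curl -XDELETE 'http://localhost:9200/titan'   # run from a shell, removes Titan's Elasticsearch index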

Presenting background loading data through a ContentProvider Cursor

I'm currently re-factoring an Android project that in a few places loads data on background threads in order to update list views. The API that is being called to collect the data has a callback mechanism, so when a lot of data is returned (which takes a long time) I can handle the results asynchronously.
In the old code, this data was packaged up as an appropriate object and passed to a handler on the UI thread, to be inserted into the list view's adapter. This worked well, but I've decided that presenting the data through a ContentProvider would make the project easier to maintain and extend.
This means I need to provide the data as a Cursor object when requested via the query method.
So far I've been unable to update the data in the Cursor after returning it. Does this mean that all of the data needs to be collected before returning the Cursor? The Android LoaderThrottleSupport sample suggests that it doesn't, but I have yet to get it working with anything other than an SQL backend.
Has anyone else tried to present non-SQL backed asynchronous data in this sort of way?
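For reference, a minimal shape of the query() override being described, built on a MatrixCursor (the Item type, cachedItems list, and column names are hypothetical stand-ins for whatever the background callbacks have collected so far):

@Override
public Cursor query(Uri uri, String[] projection, String selection,
                    String[] selectionArgs, String sortOrder) {
    // "_id" is expected by CursorAdapter; "title" is a hypothetical column
    MatrixCursor cursor = new MatrixCursor(new String[] { "_id", "title" });
    for (Item item : cachedItems) {  // data gathered so far by the async callbacks
        cursor.addRow(new Object[] { item.id, item.title });
    }
    // let later arrivals trigger a requery via notifyChange(uri, null)
    cursor.setNotificationUri(getContext().getContentResolver(), uri);
    return cursor;
}

With this shape the Cursor itself is a snapshot; fresh data is surfaced by calling getContext().getContentResolver().notifyChange(uri, null) from the callback, prompting the loader to run query() again.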
