Script to run Rexster as a daemon in Linux

I'm setting up the Titan graph database for the first time in a production environment on Debian virtual machines, and I am using Rexster to provide the interface into Titan. However, after searching around I cannot find any scripts that allow Rexster to run as a daemon in the background. As per "titan rexster with external cassandra instance", I have split off Cassandra, Elasticsearch, and Rexster to start as their own processes. Cassandra and Elasticsearch conveniently have Debian packages that deploy the daemon scripts out of the box; however, there is nothing for Rexster. Has anyone made a script that allows Rexster to run as a daemon?
Looking at the rexster.sh script in the Titan download zip (../$titan_base/bin/), it calls java to start Rexster, so I'm thinking that some kind of wrapper like JSVC could be used to start it, unless there is an easier way?

A simple, generic tool to handle this is Daemonize. More details in this post.
If your Debian is new enough to be using systemd, look into creating a service unit file for Rexster. Once it exists, the key commands would be:
systemctl start rexster.service
systemctl enable rexster.service
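As a rough illustration of what such a unit could look like, the sketch below generates a rexster.service file. The install path (/opt/titan), the service user (rexster), the config file name, and the -s/-c arguments to rexster.sh are all assumptions, not details from the question; check them against your own installation before using it.

```shell
#!/bin/sh
# Sketch: generate a systemd unit file for Rexster.
# Assumptions (not from the original post): install path, service user,
# config file name, and the rexster.sh start arguments.
TITAN_HOME="${TITAN_HOME:-/opt/titan}"
REXSTER_USER="${REXSTER_USER:-rexster}"
UNIT_FILE="${UNIT_FILE:-./rexster.service}"

cat > "$UNIT_FILE" <<EOF
[Unit]
Description=Rexster graph server (Titan)
After=network.target

[Service]
Type=simple
User=$REXSTER_USER
WorkingDirectory=$TITAN_HOME
ExecStart=$TITAN_HOME/bin/rexster.sh -s -c $TITAN_HOME/conf/rexster-cassandra-es.xml
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

echo "Wrote $UNIT_FILE"
```

After reviewing the generated file, copy it to /etc/systemd/system/ and run systemctl daemon-reload so that the start and enable commands above can find it. Type=simple assumes rexster.sh keeps the JVM in the foreground; a script that forks would need a different service type.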

Related

Creating Linux Service for Cassandra DDAC

Creating Linux service for DataStax Distribution of Apache Cassandra (DDAC)
Hi,
Installed DataStax Distribution of Apache Cassandra (DDAC), the Cassandra community version by DataStax.
Used this link:
https://docs.datastax.com/en/ddac/doc/datastax_enterprise/install/installDDAC.html
At the end of the instructions, it says to start Cassandra with an interactive command, not as a service:
$ bin/cassandra
Also, there is no option to start Cassandra as a service using:
$ service cassandra start
I get:
Failed to start cassandra.service: Unit not found.
Does DDAC support starting as a service?
Regards,
You are right: DDAC only documents launching the process from the command line. If you want to run it as a service, my guess is that DataStax provides that as part of their enterprise product.
You can still create the systemd service unit yourself; there are multiple examples on GitHub, like this one.

Linux Hadoop Services monitoring tool and restart if down

I have configured Hadoop 2.7.5 with HBase. It is a 5-node cluster in fully distributed mode. I have to monitor the Hadoop/HBase daemons and want to trigger some action (e.g. send mail) if a daemon goes down. Is there any built-in solution?
Also, I want to start Hadoop at boot time. How can I do this?
I am assuming that you aren't using a major distribution like Cloudera or Hortonworks; they have this feature built into their stack.
For automated starts at boot time, you can use init.d (or systemd).
For an example of emailing out in the event of a failure, there is a scripting solution in this answer: Bash script to monitor process and sendmail if failed.
In enterprise organisations, most will have a monitoring solution in place, such as Tivoli, which you can hook into.
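A minimal sketch of the scripting approach, assuming pgrep is available and that the daemons can be recognised by their Java class names on the command line. The function names check_daemon and alert are hypothetical, and the alert here only prints a message; swap in your mail command or a restart as needed.

```shell
#!/bin/sh
# check_daemon PATTERN ACTION...
# Runs ACTION (any command, with PATTERN appended as its last argument)
# when no process command line matches PATTERN. Returns 1 in that case.
check_daemon() {
    pattern="$1"
    shift
    if pgrep -f "$pattern" > /dev/null 2>&1; then
        return 0
    fi
    "$@" "$pattern"
    return 1
}

# Hypothetical alert action: prints a message instead of sending mail.
alert() {
    echo "ALERT: no process matching '$1' on $(uname -n)"
}

# Assumed daemon names; adjust to the roles each node actually runs.
# Note that "NameNode" also matches SecondaryNameNode.
for daemon in NameNode DataNode HMaster HRegionServer; do
    check_daemon "$daemon" alert
done
```

Run from cron every few minutes, this covers the "action if a daemon goes down" part; starting the daemons at boot is still a job for init.d or systemd.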

titan rexster with external cassandra instance

I have a cassandra cluster (2.1.0) running fine.
After installing Titan 5.1 and editing titan-cassandra.properties to point to the cluster hostname list rather than localhost, I run the following:
titan.sh -c conf/titan-cassandra.properties start
It recognizes the running Cassandra instance and starts Elasticsearch, but times out while connecting to Rexster.
If I run it with a local Cassandra, everything runs fine using the following:
titan.sh start
Do I need to make any change in the Rexster properties to point to the running Cassandra cluster?
Thanks in advance
Titan Server started by titan.sh represents a quick way to get started with Titan/Rexster/ES. It is designed to simplify running all those things with one startup script. Once you start breaking things apart (e.g. a separate Cassandra cluster), you might not want to use titan.sh anymore, because it still forks a Cassandra process when it starts up. Presumably, you don't need that anymore, given that you have a separate Cassandra cluster.
Given the more advanced nature of your environment, I would simply download Rexster and configure it to connect to your cluster.

Where to find Titan error logs?

I'm using Titan with Cassandra, Elasticsearch and Rexster.
Everything is properly set up and I can add/remove nodes and edges to the graph through Rexster as well as the REST API.
When it crashes, I have to kill java and run it again. The error that I get in Rexster is:
Could not get the vertices of graphs from Rexster.
It happens often and I don't know what the problem is. I'm not sure what part of the stack -- Titan, Rexster or Elasticsearch -- fails.
Where can I find a log file that I could look at to find out what the problem is?
I assume that you are using Titan Server distribution. By default there should be a log directory in the root of your titan installation directory. It should contain two files:
cassandra.log - obviously for cassandra
rexstitan.log - Rexster logs. As Rexster hosts Titan, the Titan logging messages should be in here as well.
It also depends on your Titan configuration, namely whether Cassandra is remote or embedded. Cassandra logs are usually stored in /var/log/cassandra/. Check there as well.
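To narrow down which part of the stack is failing, it can help to pull the most recent error lines from each log file in those directories. The sketch below assumes the logs end in .log and that failures show up as lines containing ERROR or Exception; the function name recent_errors and the /opt/titan default are hypothetical.

```shell
#!/bin/sh
# recent_errors LOG_DIR [LINES]
# Prints the last LINES (default 20) error-looking lines of every *.log
# file in LOG_DIR. Directories without matching files print nothing.
recent_errors() {
    log_dir="$1"
    lines="${2:-20}"
    for f in "$log_dir"/*.log; do
        [ -f "$f" ] || continue
        echo "== $f =="
        grep -E 'ERROR|Exception' "$f" | tail -n "$lines"
    done
}

# Assumed locations: the Titan install root is a guess, the Cassandra
# path is the one mentioned above.
recent_errors "${TITAN_HOME:-/opt/titan}/log"
recent_errors /var/log/cassandra
```

Comparing the timestamps of the last errors in rexstitan.log and cassandra.log against the time of the crash should indicate which component failed first.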

How to setup Titan with embedded Cassandra and Rexster

I am trying to setup Titan (server 0.4.4) with Cassandra embedded. My
environment is Windows 8.1 x64 + Cygwin.
The install is in E:\titan-server-0.4.4.
I also need to be able to access this setup via Rexster.
For my configuration, I referred to https://github.com/thinkaurelius/titan/wiki/Using-Cassandra.
I've modified graph configuration
E:\titan-server-0.4.4\conf\rexster-cassandra-es.xml
graph section to
<graph>
    <graph-name>graph</graph-name>
    <graph-type>com.thinkaurelius.titan.tinkerpop.rexster.TitanGraphConfiguration</graph-type>
    <graph-read-only>false</graph-read-only>
    <properties>
        <auto-type>none</auto-type>
        <storage.batch-loading>true</storage.batch-loading>
        <storage.cassandra-config-dir>file:///E:\titan-server-0.4.4\conf\cassandra.yaml</storage.cassandra-config-dir>
        <storage.backend>embeddedcassandra</storage.backend>
        <storage.index.search.backend>elasticsearch</storage.index.search.backend>
        <storage.index.search.directory>../db/es</storage.index.search.directory>
        <storage.index.search.client-only>false</storage.index.search.client-only>
        <storage.index.search.local-mode>true</storage.index.search.local-mode>
    </properties>
    <extensions>
        <allows>
            <allow>tp:gremlin</allow>
        </allows>
    </extensions>
</graph>
(Note
<auto-type>none</auto-type>
<storage.batch-loading>true</storage.batch-loading>
these are to allow bulk insert. The whole idea of embedded Cassandra is to improve the insertion performance.)
However, when I tried starting the service with ./bin/titan.sh -v start, the start failed with:
org.apache.cassandra.exceptions.ConfigurationException:
localhost/127.0.0.1:7000 is in use by another process. Change
listen_address:storage_port in cassandra.yaml to values that do not
conflict with other services
at org.apache.cassandra.net.MessagingService.getServerSocket(MessagingService.java:439)
at org.apache.cassandra.net.MessagingService.listen(MessagingService.java:387)
at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:549)
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:514)
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:411)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:278)
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:366)
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:409)
at com.thinkaurelius.titan.diskstorage.cassandra.utils.CassandraDaemonWrapper.start(CassandraDaemonWrapper.java:51)
at com.thinkaurelius.titan.diskstorage.cassandra.embedded.CassandraEmbeddedStoreManager.<init>(CassandraEmbeddedStoreManager.java:102)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at com.thinkaurelius.titan.diskstorage.Backend.instantiate(Backend.java:344)
at com.thinkaurelius.titan.diskstorage.Backend.getImplementationClass(Backend.java:367)
at com.thinkaurelius.titan.diskstorage.Backend.getStorageManager(Backend.java:311)
at com.thinkaurelius.titan.diskstorage.Backend.<init>(Backend.java:121)
at com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration.getBackend(GraphDatabaseConfiguration.java:1173)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.<init>(StandardTitanGraph.java:75)
at com.thinkaurelius.titan.core.TitanFactory.open(TitanFactory.java:40)
at com.thinkaurelius.titan.tinkerpop.rexster.TitanGraphConfiguration.configureGraphInstance(TitanGraphConfiguration.java:25)
at com.tinkerpop.rexster.config.GraphConfigurationContainer.getGraphFromConfiguration(GraphConfigurationContainer.java:119)
at com.tinkerpop.rexster.config.GraphConfigurationContainer.<init>(GraphConfigurationContainer.java:54)
at com.tinkerpop.rexster.server.XmlRexsterApplication.reconfigure(XmlRexsterApplication.java:99)
at com.tinkerpop.rexster.server.XmlRexsterApplication.<init>(XmlRexsterApplication.java:47)
at com.tinkerpop.rexster.Application.<init>(Application.java:96)
at com.tinkerpop.rexster.Application.main(Application.java:188)
localhost/127.0.0.1:7000 is in use by another process. Change
listen_address:storage_port in cassandra.yaml to values that do not
conflict with other services
I tried modifying the ports in "E:\titan-server-0.4.4\conf\cassandra.yaml", but after some investigation I realized that the port is actually taken by Cassandra itself, i.e. in this configuration, ./bin/titan.sh -v start tries to start multiple instances of Cassandra?!
I copied cassandra.yaml to cassandra2.yaml with different port settings and specified the path to cassandra2.yaml in the graph configuration XML.
After this, I was able to start Rexster with Titan and embedded Cassandra by running ./bin/titan.sh -v start.
However, I strongly believe that something is wrong with this setup. Besides, the system does not behave well: sometimes I cannot save a graph in Rexster's (web-based) Gremlin shell by using g.commit(). The command succeeds, but nothing gets saved.
So what is the right way to run Titan with embedded Cassandra? What is the configuration supposed to be?
If you use Titan server via the shell or bat script, it will automatically start a Cassandra instance for you and attempt to connect to it over localhost.
When you configure the graph to use embedded Cassandra as well, the two instances naturally conflict, which is why port 7000 is reported as already in use.
Is there a particular reason you want to use embedded Cassandra? I'd strongly encourage you to try the out-of-the-box version first. Embedded Cassandra is mostly meant for low-latency applications and requires a solid understanding of the JVM.
Good luck!
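To confirm that the conflict really is a second Cassandra holding the storage port, a quick check before starting anything can help. This is a sketch that relies on bash's /dev/tcp pseudo-device (so it needs bash, which Cygwin provides); the function name port_in_use is hypothetical, and 7000 is the storage port from the error message above.

```shell
#!/usr/bin/env bash
# port_in_use HOST PORT
# Succeeds if something accepts TCP connections on HOST:PORT.
# Uses bash's /dev/tcp redirection inside a subshell so that the
# connection is closed again as soon as the check finishes.
port_in_use() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

if port_in_use 127.0.0.1 7000; then
    echo "port 7000 is already taken, probably by another Cassandra instance"
else
    echo "port 7000 is free"
fi
```

If the port is already taken before titan.sh runs, some other Cassandra is still alive; if it only becomes taken during startup, the script's own forked Cassandra is the one the embedded instance collides with.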
