Upgrade JanusGraph from 0.2.2 to 0.5.2 - Cassandra

I'm new to JanusGraph. I need to upgrade JanusGraph from 0.2.2 (storage: Cassandra, index: ES) to the latest stable version (0.5.2). I've gone through the docs/forums on how to initiate the process (I've only found the changelog), but I wasn't able to figure out a clear/direct solution: whether to go for an incremental upgrade (0.2.2 > 0.x.x > 0.5.2) or a direct upgrade (install 0.5.2 and try to load the Cassandra data some way, if that works).
I tried the second option: I downloaded the latest JanusGraph (both the base and -full distributions), installed the latest Cassandra (3.11) and ES (6.x, 7.x), and copied the old Cassandra data to the new Cassandra data directory (/var/lib/cassandra). I started both servers, JanusGraph and Cassandra, and they came up fine. But when I tried to interact with JanusGraph (via Gremlin Server), I got an error like "Gremlin groovy script engine - Illegal Argument exception".
I figured out that this is how it should not be done, and that I need to do an incremental upgrade with a proper export/import of the data.
Can someone help me with how to proceed with an incremental upgrade? How can I export/import all the JanusGraph/Gremlin Server data?

You will need to stop the 0.2 instance, set the configuration option graph.allow-upgrade=true in janusgraph.properties, then start a new 0.5 instance on top of the same Cassandra (or, if needed, migrate the old Cassandra/ES data to newer Cassandra/ES instances first).
After that, it is good practice to stop this 0.5 instance, remove the graph.allow-upgrade setting, and restart it for normal use, setting it again only when the next upgrade is needed.
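For illustration, a minimal sketch of what that looks like in janusgraph.properties on the new 0.5.2 instance; the backend and hostname values below are assumptions, keep whatever your existing configuration uses:
# janusgraph.properties (illustrative values)
storage.backend=cql
storage.hostname=127.0.0.1
index.search.backend=elasticsearch
index.search.hostname=127.0.0.1
# temporary toggle that lets the 0.5 instance upgrade the metadata written by 0.2
graph.allow-upgrade=true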

I almost forgot to write an answer (late, but it might be useful).
First of all, no incremental upgrades are required; you can upgrade with simple export/import commands.
There are three different formats available as of now: JSON (GraphSON), XML (GraphML), and binary (Gryo).
Gremlin commands (gremlin-cli):
// Export from the old version (0.2.2)
graph = JanusGraphFactory.open('conf/gremlin-server/janusgraph-cql-es-server.properties')
graph.io(IoCore.gryo()).writeGraph('janusgraph_dump_2020_09_30_local.gryo')
graph.tx().commit()
// Import into the new version (0.5.2)
graph = JanusGraphFactory.open('conf/gremlin-server/janusgraph-cql-es-server.properties')
graph.io(IoCore.gryo()).readGraph('janusgraph_dump_2020_09_30_local.gryo')
graph.tx().commit()
// IoCore.graphson() and IoCore.graphml() work the same way for the other two formats
This solved my problem.

Related

Where is slow queries data in OpsCenter read from?

Since our former data model is not very good, the Slow Queries panel shows some queries that are performing slowly.
As I am planning to redesign the data model, I want to clear out the old information displayed in this panel, so that I only see information about my new data model. However, I do not know where OpsCenter reads this data from.
My idea is that if this information is stored in a table or file, I can truncate or delete it. Or am I totally wrong with that assumption, and could this instead be done through a configuration file modification or something similar?
OpsCenter Version: 6.0.3
Cassandra Version: 2.1.15.1423
DataStax Enterprise Version: 4.8.10
It follows dse_perf.node_slow_log. Each node tracks new events in that log as they occur and stores its own top X. When you view the panel in the UI, OpsCenter fetches the top X from each node and merges them. To "reset" it, you can truncate the log and restart the DataStax agents to clear their current top X. A feature to do the reset for you is planned, but in 6.0.3 it's a little difficult.
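For illustration, a rough sketch of that reset, assuming cqlsh access and the stock datastax-agent service name (which can vary by install); run the restart on every node:
# truncate the slow-query log that OpsCenter reads from
cqlsh -e "TRUNCATE dse_perf.node_slow_log;"
# restart the agent on each node to clear its in-memory top X
sudo service datastax-agent restart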

Get cache entries(keys, values) list on particular node in Apache ignite

Is there any option in ignitevisorcmd to see which entries (key/value details) are present on a particular node? I tried the cache -scan -c=mycache -id8=12345678 command, but it prints entries from all the other nodes as well for mycache, instead of printing data only for node 12345678.
The current version of Visor Cmd does not support this, but I think it would be easy to implement. I have created an issue in the Ignite JIRA, which you may track or even contribute to.

How to migrate data from Cassandra 2.1.9 to a fresh 3.5 installation

I tried to use sstableloader to load data into Cassandra 3.5. The data was captured using nodetool snapshot under Cassandra 2.1.9. All the tables loaded fine except one. It's small, only 2 columns and 20 rows, so I entered this bug: https://issues.apache.org/jira/browse/CASSANDRA-11806. The bug was quickly closed as a duplicate, but it doesn't seem to be one, since the original case is about upgrading a node in place, not loading data with sstableloader.
Even so, I tried to apply the advice given there to run upgradesstables.
The directions given for upgrading from one version of Cassandra to another seem sketchy at best. Here's what I did, based on my working backup/restore procedure and info gathered from various Cassandra docs on how to upgrade:
Snapshot the data from prod (Cassandra 2.1.9), as usual
Restore data to Cassandra 2.1.14 running on my workstation
Verify the restore to 2.1.14 (it worked)
Copy data/data/makeyourcase into a Cassandra 3.5 install
Fire up Cassandra 3.5
Run nodetool upgradesstables to upgrade the sstables to 3.5
nodetool upgradesstables fails:
>./bin/nodetool upgradesstables
error: Unknown column role in table makeyourcase.roles
-- StackTrace --
java.lang.AssertionError: Unknown column role in table makeyourcase.roles
So, the questions: Is it possible to upgrade directly from 2.1.x to 3.5? What's the actual upgrade process? The process at http://docs.datastax.com/en/latest-upgrade/upgrade/cassandra/upgradeCassandraDetails.html is seemingly missing important details.
This turned out to be a problem with the changing state of the table over time.
Since the table was small, I was able to migrate the data by using COPY to export it to CSV and then import it into the new version.
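A minimal sketch of that COPY round trip, using the makeyourcase.roles table from the error above (the CSV path is illustrative):
# on the old 2.1.x node: export the table to CSV
cqlsh -e "COPY makeyourcase.roles TO '/tmp/roles.csv' WITH HEADER = true;"
# on the new 3.5 node, after recreating the schema: import it
cqlsh -e "COPY makeyourcase.roles FROM '/tmp/roles.csv' WITH HEADER = true;"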
Have a look at https://issues.apache.org/jira/browse/CASSANDRA-11806 for discussion of another workaround and a coming bug fix.

Loading Cassandra data with SStableloader from different Cassandra cluster

I have two independent machines running Cassandra, and I want to migrate the data from one machine to the other.
So I first took a snapshot of my Cassandra cluster on machine 1, according to the DataStax documentation.
Then I moved the data to machine 2, where I'm trying to import it with sstableloader.
As a note: the keyspace (open_weather) and table name (raw_weather_data) on machine 2 have been created and are the same as on machine 1.
The command I'm using looks as follows:
bin/sstableloader -d localhost "path_to_snapshot"/open_weather/raw_weather_data
I then get the following error:
Established connection to initial hosts
Opening sstables and calculating sections to stream
For input string: "CompressionInfo.db"
java.lang.NumberFormatException: For input string: "CompressionInfo.db"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.parseInt(Integer.java:615)
at org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:276)
at org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:235)
at org.apache.cassandra.io.sstable.Component.fromFilename(Component.java:120)
at org.apache.cassandra.io.sstable.SSTable.tryComponentFromFilename(SSTable.java:160)
at org.apache.cassandra.io.sstable.SSTableLoader$1.accept(SSTableLoader.java:84)
at java.io.File.list(File.java:1161)
at org.apache.cassandra.io.sstable.SSTableLoader.openSSTables(SSTableLoader.java:78)
at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:162)
at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:106)
Unfortunately, I have no idea why.
I'm not sure if it's related to the issue, but somehow the *.db files on machine 1 are named rather "strangely" compared to the *.db files I already have on machine 2.
*.db files from machine 1:
la-53-big-CompressionInfo.db
la-53-big-Data.db
...
la-54-big-CompressionInfo.db
...
*.db files from machine 2:
open_weather-raw_weather_data-ka-5-CompressionInfo.db
open_weather-raw_weather_data-ka-5-Data.db
What am I missing? Any help would be highly appreciated, and I'm open to other suggestions as well. The COPY command will most probably not work, since as far as I know it is limited to 99,999,999 rows.
P.S. I didn't want to create an overly huge post, but if you need any further information to help me out, just let me know.
EDIT:
Note that I'm using Cassandra in stand-alone mode.
EDIT2:
After installing the same version, 2.1.4, on my destination machine (machine 2), I still get the same errors: with sstableloader I still get the above-mentioned error, and when copying the files manually (as described by LHWizard), I still get empty tables after starting Cassandra again and performing a SELECT.
Regarding the initial tokens, I get a huge list of tokens if I run nodetool ring on machine 1. I'm not sure what to do with those.
Your data is already in the form of a snapshot (or backup). What I have done in the past is the following:
1. Install the same version of Cassandra on the restore node.
2. Edit cassandra.yaml on the restore node - make sure that cluster_name and snitch are the same.
3. Edit the seeds: list and any other properties that were altered in the original node.
4. Get the schema from the original node using cqlsh DESC KEYSPACE.
5. Start Cassandra on the restore node and import the schema.
6. Stop Cassandra and delete the contents of the /var/lib/cassandra/data/, commitlog/, and saved_caches/ folders. (Steps 6 & 7 may not be completely necessary, but this is what I do.)
7. Restart Cassandra on the restore node to recreate the correct folders, then stop it.
8. Copy the contents of the snapshots folder to each corresponding table folder on the restore node, then start Cassandra. You probably want to run nodetool repair. (See the sketch after this answer.)
You don't really need to bulk import the data; it's already in the correct format if you are using the same version of Cassandra, although you didn't specify that in your original question.
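For concreteness, a rough shell sketch of steps 6-8, assuming a package install with a cassandra service; the keyspace/table paths are illustrative (on 2.1 the table folder name carries a UUID suffix, so adjust to whatever the restart created):
# step 6: stop cassandra and clear out data, commitlog and saved caches
sudo service cassandra stop
sudo rm -rf /var/lib/cassandra/data/* /var/lib/cassandra/commitlog/* /var/lib/cassandra/saved_caches/*
# step 7: start once so cassandra recreates the folder layout, then stop again
sudo service cassandra start
sudo service cassandra stop
# step 8: copy each table's snapshot files into its freshly created folder
sudo cp /path/to/snapshot/open_weather/raw_weather_data/* \
    /var/lib/cassandra/data/open_weather/raw_weather_data-*/
sudo service cassandra start
nodetool repair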

How can I check if I need to execute `nodetool upgradesstables` on a node?

How can I check if I need to execute nodetool upgradesstables on a node? Or will it only do anything if necessary (and simply ignore the command)?
You need to run nodetool upgradesstables after upgrading Cassandra to a version that needs your sstables converted to the latest format. Usually this only happens on new major releases. But you should make sure to read the upgrade instructions in NEWS.txt (e.g. the 2.1.8 version of the file) to find out exactly whether this step is required.
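As a quick heuristic, a sketch of how you might check from the data directory itself; the two-letter prefix of the sstable file names encodes the format generation (e.g. ka- for 2.1.x, la- for 2.2.x, as in the file listings above), though treat this as an assumption and defer to the upgrade notes. The keyspace/table names here are illustrative:
# list sstable files and look at their version prefix
ls /var/lib/cassandra/data/my_keyspace/my_table-*/ | head
# files whose prefix predates the current release's format still need:
nodetool upgradesstables
# note: by default upgradesstables skips sstables already on the current
# format; pass -a to force-rewrite everything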
