Why can't I cancel CouchDB 2.1.1 replication? - couchdb

Just a couple of days ago, I migrated from CouchDB 1.6.1 to 2.1.1 in our production environment. The couchup replicate step was taking forever, and then I noticed that most of our continuous pull replications were still running, so I figured they must be slowing it down.
So I deleted all the continuous replication documents from the _replicator database, hoping that would stop the replication. It didn't. I even restarted CouchDB, and the continuous replication is still occurring.
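For reference, this is roughly what that deletion looked like (a minimal sketch; the document ID and revision below are placeholders, since deleting a _replicator document requires passing its current _rev):
# Fetch the replication document to get its current revision (doc ID is a placeholder)
curl -u couchdbadmin:XXXXXXXX 'http://localhost:5984/_replicator/youribm-timeinout-continuous-replication-from-primary'
# Delete it using the _rev returned above (1-abc123 is a placeholder)
curl -u couchdbadmin:XXXXXXXX -X DELETE 'http://localhost:5984/_replicator/youribm-timeinout-continuous-replication-from-primary?rev=1-abc123'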
If I look in the Replication section in Fauxton, both tabs are empty. But when I look in the Tasks section, continuous replication is still running, 3 days later.
How do I stop continuous replication?
===========================================
UPDATED 2018/03/27 21:00 UTC
I'm trying to delete the job from _scheduler, but it doesn't allow anything but GET and HEAD, even though I'm logged in as the server admin. (FWIW, _scheduler is not showing up as a database.)
# curl -k -X GET 'http://localhost:5984/_scheduler/jobs/5e9bf92325eee5caff026e47197f9c3d+continuous+create_target' -u couchdbadmin:XXXXXXXX
{"database":"_replicator","id":"5e9bf92325eee5caff026e47197f9c3d+continuous+create_target","pid":"<0.606.0>","source":"http://couchdbadmin:*****#prdpccggww3n03.w3-969.ibm.com:5984/youribm-timeinout/","target":"http://couchdbadmin:*****#localhost:5984/youribm-timeinout/","user":null,"doc_id":"youribm-timeinout-continuous-replication-from-primary","history":[{"timestamp":"2018-03-24T00:01:41Z","type":"started"},{"timestamp":"2018-03-24T00:01:41Z","type":"added"}],"node":"couchdb#127.0.0.1","start_time":"2018-03-24T00:01:41Z"}
# curl -k -X DELETE 'http://localhost:5984/_scheduler/jobs/5e9bf92325eee5caff026e47197f9c3d+continuous+create_target' -u couchdbadmin:XXXXXXXX
{"error":"method_not_allowed","reason":"Only GET,HEAD allowed"}

Related

Does presto support adding data sources dynamically?

Does Presto support adding data sources dynamically?
If it doesn't, how can I add a new catalog by watching .properties files, without restarting the cluster?
Currently Presto does not support adding or removing catalogs without a server restart. There is a long-running open issue that discusses the challenges of implementing it: https://github.com/prestodb/presto/issues/2445. I think the best you can do currently is to push the .properties changes to all nodes in the cluster and restart the Presto daemons. You could invoke a graceful shutdown on the worker nodes to minimize query failures, and have something like monit automatically bring the Presto server back up once it has shut down.
curl -v -XPUT --data '"SHUTTING_DOWN"' -H "Content-type: application/json" http://node-id:8081/v1/info/state
Restarting the Presto daemon on the coordinator would still cause a brief outage unless you have a coordinator HA setup.
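A rough sketch of that roll-out, assuming you distribute the catalog file and restart nodes one at a time (the node list, catalog path, wait time, and 'presto' service name are assumptions about your environment, not Presto requirements):
# Push an updated catalog file to each node, drain it gracefully, then restart Presto
for node in $(cat nodes.txt); do
  scp mynewcatalog.properties "$node":/etc/presto/catalog/
  # Graceful shutdown: the worker stops accepting new tasks and exits once running ones finish
  curl -v -X PUT --data '"SHUTTING_DOWN"' -H "Content-Type: application/json" "http://$node:8081/v1/info/state"
  sleep 120   # crude wait for the drain to complete; monit could instead restart it automatically
  ssh "$node" 'sudo systemctl start presto'
done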

Does a Cassandra upgrade require running nodetool upgradesstables for a cluster holding TTL'd data?

I am running a 3-node Apache Cassandra cluster as Docker containers, holding time-series data with a 45-day TTL.
I am planning to upgrade the current Cassandra version 2.2.5 to the Cassandra 3.11.4 release. The following steps are identified for the upgrade:
1. Backup existing data
2. Flush one of the Cassandra nodes
3. bin/nodetool -h cassandra1 -u ca_itoa -pw ca_itoa drain
4. Stop the cassandra1 node
5. Start the new Cassandra 3.11.4 container
6. Upgrade the SSTables
7. bin/nodetool -u ca_itoa -pw ca_itoa upgradesstables
8. Check the node status. Repeat the process for the rest of the nodes.
I have a few questions about the upgrade process:
Are the steps correct?
Is it mandatory to run the upgradesstables command? It is time-consuming, and I want to see if I can avoid it. The data has a TTL set. Will Cassandra continue writing in the new SSTable format while the old SSTable data gets cleaned up as it expires? The assumption is that, after 45 days, all SSTables would be in the new, shiny format.
Just some additional thoughts:
For Step #6, you actually don't have to run upgradesstables right away. In fact, if you're upgrading a production system, it's probably better that you don't until the application team verifies that they can connect OK. Remember, older versions of the driver which work with 2.2 may not work with 3.11.4.
To this end, I would wait until the entire cluster is running on the new version before running upgradesstables on each node.
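For example, once every node is on 3.11.4, something like this can drive the per-node conversion, one node at a time to limit the extra compaction load (the hostnames are placeholders; the credentials are the ones from the question):
# Run upgradesstables node by node after the whole cluster is on 3.11.4
for host in cassandra1 cassandra2 cassandra3; do
  bin/nodetool -h "$host" -u ca_itoa -pw ca_itoa upgradesstables
  # Confirm the node is still healthy (UN) before moving on to the next one
  bin/nodetool -h "$host" -u ca_itoa -pw ca_itoa status
done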
Is it mandatory to run the upgradesstables command?
As each Cassandra version is capable of reading its own SSTable format as well as the prior major version's, I guess it's not mandatory. But it's definitely something that you should want to do, especially when upgrading to 3.x.
Cassandra 3 contains a significant upgrade to the storage engine, which results in a much smaller disk footprint. One cluster I upgraded saw a 90% reduction in disk needs.
Plus, you'd be incurring additional latency when reading records which may be spread across the old SSTable files as well as the new. Reads for records across multiple files are bad enough as it is. But now you'd be forcing Cassandra to read and collate results from two formats.
So while I wouldn't say it's "mandatory," I'd definitely say it qualifies as a "good idea."
Yes, you need to run nodetool upgradesstables on each node after the Cassandra upgrade, since you are upgrading from 2.2.x to 3.11.4 and the SSTable file format and extension will change. You may run this process in the background and it will not create any issues. Please refer to this link for more details: https://blog.thethings.io/upgrading-apache-cassandra-cluster/
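If you want to verify how far the conversion has progressed, you can check the version prefix of the SSTable Data files on each node; as far as I recall, 2.2 writes la-* files while 3.11 writes ma-* through md-* files (the data directory path is an assumption):
# Count SSTables per format version; 'la' is the old 2.2 format, 'ma'..'md' are 3.x formats
find /var/lib/cassandra/data -name '*-Data.db' | sed 's|.*/||' | cut -d- -f1 | sort | uniq -c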

How long does a MemSQL upgrade take?

I have started an offline upgrade process to upgrade my MemSQL cluster from 5.8 to 6.5. The data size is around 300 GB; it's been 5 hours already, but I have lost all access to the cluster and there is no way to check the status.
memsql-ops memsql-list returns all leaves and aggregators as online.
But memsql> SHOW LEAVES; returns an empty set, and my master aggregator was automatically converted to a child aggregator, so now I don't have any master aggregator.
I can't execute any command (like AGGREGATOR SET AS MASTER) against the child aggregator; it says 'memsql is not running as an aggregator' or 'memsql node is not running', and a SQL query returns "The database 'xxx' is not available for queries, as it is waiting for the Master Aggregator to bring it online. Run SHOW DATABASES EXTENDED ..."
Also, performing any management command like memsql-ops restart returns "Job cannot run because there is a MemSQL upgrade intention with ID xxx in progress".
Any information about this will be helpful, as I am not able to find any related information online.
Thanks in advance...
We debugged the issue in the MemSQL public chat, and it was found that the Master Aggregator was running an unsupported beta version of MemSQL (6.0.0), which prevented the upgrade and then corrupted the database post-upgrade.
For future readers: please audit that you are not running beta versions of MemSQL on production clusters. If you are, not only will the upgrade likely break, but it may not be possible to recover your data on a non-beta cluster.
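A quick way to do that audit is to ask every node for its version before you start the upgrade; a sketch using the MySQL client (the host list, port, and credentials are placeholders):
# Print the MemSQL version reported by each aggregator and leaf
for host in master-agg child-agg leaf1 leaf2 leaf3; do
  echo -n "$host: "
  mysql -h "$host" -P 3306 -u root -pXXXX -N -e 'SELECT @@memsql_version;'
done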

Cassandra cluster not restarting after improper shutdown

I have a test cluster of 3 machines, 2 of which are seeds, all CentOS 7 and all Cassandra 3.4.
Yesterday all was fine, they were chatting, and I had the "brilliant" idea to power all those machines off to simulate a power failure.
As the newbie that I am, I simply powered the machines back on and expected some kind of super-magic, but here it is: my cluster is not up again, and each individual node refuses to connect.
And yes, my firewalld is disabled.
My question: what damage was done, and how can I recover the cluster to its previous running state?
Since you abruptly shut down your cluster, the nodes were not able to drain themselves.
Don't worry, it is unlikely any data loss happened because of this, as Cassandra maintains commit logs and will read from them when it is restarted.
First, find your seed node IPs from cassandra.yaml.
Start your seed node first.
Check the startup logs in cassandra.log and system.log and wait for it to start up completely; it will take some time, as it will read pending actions from the commit log and replay them.
Once it finishes starting up, start the other nodes and tail their log files.
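A sketch of that sequence on each machine, seed nodes first (the config path, service name, and log path are assumptions about how Cassandra was installed):
# Confirm which nodes are listed as seeds
grep 'seeds:' /etc/cassandra/cassandra.yaml
# Start Cassandra and watch the commit log replay finish
sudo systemctl start cassandra
tail -f /var/log/cassandra/system.log
# Once the seeds are up, start the remaining nodes the same way, then verify
nodetool status    # all nodes should eventually show UN (Up/Normal)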

Cassandra hangs on arbitrary commands

We're hosting a Cassandra 2.0.2 cluster on AWS. We've recently started upgrading from normal to SSD drives by bootstrapping new nodes and decommissioning old ones. It went fairly well, aside from two nodes hanging forever on decommission. Now that the 6 new nodes are operational, we noticed that some of our old tools, using phpcassa, stopped working. Nothing has changed with security groups, all TCP/UDP ports are open, telnet can connect via 9160, and cqlsh can 'use' a cluster and select data; however, 'describe cluster' fails, and in cassandra-cli 'show keyspaces' also fails. By fail, I mean the command never exits back to the prompt, nor returns any results. The queries work perfectly from the new nodes, but even the old nodes waiting to be decommissioned cannot perform them. The production system, also using phpcassa, does normal data requests and works fine.
All the Cassandras have the same config, the same version, and were installed from the same package. All nodes were recently restarted, due to a seed node change.
Versions:
Connected to ### at ####.compute-1.amazonaws.com:9160.
[cqlsh 4.1.0 | Cassandra 2.0.2 | CQL spec 3.1.1 | Thrift protocol 19.38.0]
I've run out of ideas. Any hints would be greatly appreciated.
Update:
After a bit of random investigating, here's a bit more detailed description.
If I cassandra-cli to any machine, and do "show keyspaces", it works.
If I cassandra-cli to a remote machine, and do "show keyspaces", it hangs indefinitely.
If I cqlsh to a remote Cassandra and do a describe keyspaces, it hangs; after Ctrl+C, repeating the same query gets an instant response.
If I cqlsh to a local cassandra, and do a describe keyspaces, it works.
If I cqlsh to a local Cassandra and do a select * from Keyspace limit x, it will return data up to a certain limit. I was able to return data with limit 760; 761 would fail.
If I do a consistency all, and select the same, it hangs.
If I do a trace, different machines return the data, though sometimes source_elapsed is "null"
Not to forget, applications querying the cluster sometimes do get results, after several attempts.
Update 2
Further experimenting introduced failed bootstrapping of two nodes: one hung on bootstrap for 4 days and eventually failed, possibly due to a rolling restart, and the other plainly failed after 2 days. Repairs wouldn't function and introduced "Stream failed" errors, as well as "Exception in thread Thread[StorageServiceShutdownHook,5,main] java.lang.NullPointerException". Also, after executing a repair, we started getting "Read an invalid frame size of 0. Are you using tframedtransport on the client side?", so..
Solution
Switch rpc_server_type from hsha to sync. All problems gone. We worked with hsha for months without issues.
If someone also stumbles here:
http://planetcassandra.org/blog/post/hsha-thrift-server-corruption-cassandra-2-0-2-5/
In cassandra.yaml:
Switch rpc_server_type from hsha to sync.
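For anyone applying this, the change is a one-line edit followed by a rolling restart; a sketch (the config path and service name are assumptions for your setup):
# Switch the Thrift server implementation from hsha to sync, then restart the node
sudo sed -i 's/^rpc_server_type: hsha/rpc_server_type: sync/' /etc/cassandra/cassandra.yaml
sudo service cassandra restart
# Verify Thrift is serving again before moving on to the next node
nodetool statusthrift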
