We have a couple of production CouchDB databases that have blown out to 30 GB and need to be compacted. These are used by a 24/7 operations website and are replicated with another server using continuous replication.
From tests I've done it'll take about 3 mins to compact these databases.
Is it safe to compact one side of the replication while the production site and replication are still running?
Yes, this is perfectly safe.
Compaction works by writing the live data out to a new database file and then switching the database over to it once it has caught up. This follows from CouchDB's very firm rule that the internals of a database file never get updated in place; the file is only ever appended to and fsynced. This is why you can rudely kill CouchDB's processes and it doesn't have to recover or rebuild the database the way other solutions would.
This also means that you need extra disk space available to rewrite the file, so trying to compact a CouchDB database to get out of full-disk warnings is usually a non-starter.
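If it helps, both the trigger and the status check are single HTTP calls; a rough sketch ($HOST and the db name are placeholders, and admin credentials are assumed):
# Start compaction; it runs in the background and the call returns immediately
curl -X POST -H "Content-Type: application/json" $HOST/db/_compact
# Poll the database info: "compact_running" flips back to false when it finishes,
# and the file-size field (disk_size or sizes.file, depending on version) shrinks
curl -X GET $HOST/db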
Also, replication works from the database's internal sequence index (a B+ tree); the replicator is not streaming the entire database file from disk onto the network pipe.
Lastly, there will of course be an increase in system resource utilization. However, your tests should have shown you roughly what this costs on your system versus an idle CouchDB, which you can use to determine how close you're pushing your system to its breaking point.
I have been working with CouchDB for a while now, replicating databases and writing views to fetch data.
From observing its replication behaviour, here is what answers your question:
During replication, previous revisions of documents are not replicated to the destination; only the current revision of each document is.
Compacting a database only removes those previous revisions, so it will not cause any problems.
Compaction runs only on the database you trigger it on, so it should not affect the replica that is continuously listening for changes, because the changes feed reports the current revision of each document, not the old ones. You can verify this yourself:
Running the query below shows the database's changes as a sequence of updates; it is driven by the latest revision of each document, not by previous revisions (so compaction will do no harm):
curl -X GET $HOST/db/_changes
The result is simple:
{"results":[
],
"last_seq":0}
More info can be found here: CouchDB Replication Basics
I hope this helps you understand it. In short, the answer to your question is yes: it is safe to compact a database while continuous replication is running.
I'm quite sure that I want to delete a database in order to release its resources. I'll never need replication, the old revisions, or the logs again. But despite frequently deleting the database and recreating another, the disk usage gradually increases.
How can I simply get rid of the whole database and its effects on the disk?
Deleting the database via DELETE /db-name removes the database's data and associated indexes on disk. The database is as deleted as it's going to be.
If you are using the purge feature to remove all the documents in a database, consider instead a DELETE followed by a PUT to recreate it.
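For example, a minimal sketch with curl ($HOST and the db name are placeholders; an admin user is assumed):
# Drop the database and its file on disk
curl -X DELETE $HOST/db-name
# Recreate it empty
curl -X PUT $HOST/db-name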
Logs are a different matter, as they are not database-specific but belong to the database engine itself. It might be that you need to clear old logs, but you will probably only be able to do that on a time basis rather than per database.
I have a Node.js application that receives data via a WebSocket connection and pushes each message to an Azure Redis cache. It stores a persistent array of messages in a variable for downstream use, and at regular intervals syncs that array from the cache. It's a bit convoluted, but at a later point I want to separate the half of the application that writes to the cache from the half that reads from it.
At around 02:00 GMT, based on the Azure portal stats, I appear to have started getting "cache misses" on that sync, which lasted for a couple of hours before I started getting "cache hits" again sometime around 05:00.
The cache misses correspond to a sudden increase in CPU usage, which peaks at around 05:00. And when I say peaks, I mean it hits 81%, vs a previous max of about 6%.
So sometime around 05:00 the CPU peaks, then drops back to normal and the "cache misses" go away; but looking at the cache memory usage, I drop from about 37.4 MB used to about 3.85 MB used (which I suspect is the "empty" state), and the list being used by this application was emptied.
The only commands the application runs against the cache are LPUSH and LRANGE; nothing in it has any capability to remove data. And in case anybody was wondering, memory usage did not ramp up when the CPU did, so there's nothing to suggest that rogue additions of data cropped up.
It's only on the Basic plan, so I'm not expecting it to be invulnerable or anything, but even without the replication features of the Standard plan I had expected that it wouldn't be in a position to completely wipe itself; I was under the impression that Redis periodically writes its data to disk and restores from that when it recovers from an error.
All of which is my way of asking:
Does anybody have any idea what might have happened here?
If this is something that others have accidentally triggered themselves, are there any gotchas I should be looking out for in my other applications that use the same cache which could have caused it to fail so catastrophically?
I would welcome a chorus of people telling me that the Standard plan won't suffer from this sort of issue, because I've already forked out for it and it would be nice to feel like that was the right call.
Many thanks in advance.
Here are my thoughts:
Azure Redis Cache stores information in memory. By default it doesn't save a "backup" to disk, so your information existed only in memory; for some reason the server got restarted, and you lost your data.
PS: See this feedback item; there is no option yet to persist information to disk with Azure Redis Cache: http://feedback.azure.com/forums/169382-cache/suggestions/6022838-redis-cache-should-also-support-persistence
Make sure you don't use the Basic plan. The Basic plan doesn't come with an SLA, and in my experience it loses data quite often.
The Standard plan provides an SLA and uses two Redis Cache instances. It's quite stable and it hasn't lost our data, although that is still possible.
Now, if you're going to use Azure Redis as a database rather than as a cache, you need the data persistence feature, which is already available in the Azure Redis Cache Premium tier: https://azure.microsoft.com/en-us/documentation/articles/cache-premium-tier-intro (see "Redis data persistence").
James, using the Standard tier should give you much improved availability.
With the Basic tier, any Azure Fabric update to the master node (or a hardware failure) will cause you to lose all data.
Azure Redis Cache does not support persistence (writing to disk/blob) yet, even in the Standard tier. But the Standard tier does give you a replicated slave node that can take over if your master goes down.
I am a novice programmer in Node.js. I have a few queries regarding process-related issues like locking and race conditions in Node.js and MongoDB.
My code works perfectly in the local environment, but when I move to production and come across a large number of requests, I might encounter certain issues.
How do we avoid write-level race conditions for Mongo slaves located in different regions? I.e., say one piece of data is being written locally, but the true value for it is being written remotely and arrives with a delay.
Considering we have Node processes located regionally, would each write need to hit the Mongo master located in another region, which then routes the request to a regional slave? This considerably increases the latency of each write; how do we avoid it? Can we have direct writes to regional slaves from local processes and some kind of replication to maintain data consistency?
I use a Node REST API and mongoose as the MongoDB driver. Any help would be deeply appreciated. Thank you.
MongoDB's automatic failover and high availability features are provided by what's called replication. The standard MongoDB terms are "primary" for master and "secondary" for slave, so I'll use those terms to be consistent with the documentation and the user base at large. I think both of your questions are answered by one fact: in a replica set, the primary is the only member that accepts writes from clients, ever. The secondaries get the data replicated to them asynchronously a short time later. To answer the questions directly:
There are no writes to secondaries except the internal replication of writes from the primary, so no "race condition" between writes can arise.
All writes must go to the primary. The replication system then distributes the data to the secondaries asynchronously. You can read from secondaries, but it isn't a best practice despite its occasional utility. I'd suggest reading about replica set read preference and Asya Kamsky's blog post about scaling with replica sets before deciding to read from secondaries.
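For completeness, a driver such as mongoose is normally handed the replica set topology and read preference in the connection string, and it will still route every write to the current primary no matter what you set; a sketch, with hostnames and the set name as made-up placeholders:
mongodb://node1.example.com:27017,node2.example.com:27017/mydb?replicaSet=rs0&readPreference=primary
With mongoose you would pass that string to mongoose.connect(); readPreference only affects where reads go, and primary is both the default and the safe choice.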
I have recently started working with Firebird DB v2.1 on a Linux Redhawk 5.4.11 system. I am trying to create a monitor script that gets kicked off via a cron job. However, I am running into a few issues and I was hoping for some advice...
First off, I have read through most of the documentation that comes with Firebird and a lot of the documentation provided on their site. I have tried using the gstat tool which is supplied, but that didn't seem to give me the kind of information I was looking for. I then ran across the README.monitoring_tables file, which seemed to be exactly what I wanted to monitor. Yet this is where I started to hit a snag...
After logging into the DB via isql, I ran SELECT MON$PAGE_READS, MON$PAGE_WRITES FROM MON$IO_STATS; and got some numbers which seemed okay. However, upon running the command again, the data appeared to be stale because the numbers were not updating. I waited 1 minute, 5 minutes, 15 minutes, and the data was the same each time. Only once I logged off and back on and ran the command again did the data change. It appears that the data only refreshes on a re-login, and even then I am not sure it is correct.
My question is: am I even doing this correctly? Are these commands truly monitoring my DB, or just monitoring the command itself? Also, why does it take a re-login to refresh the statistics? One thing I was worried about was inconsistency in my data; in other words, my system was running, yet each time I logged on the reads/writes were not increasing linearly. They would vary from 10k to 500 to 2k. Any advice or help would be appreciated!
When you query a monitoring table, a snapshot of the monitoring information is created so the contents of the monitoring tables are stable for the rest of the transaction. You need to commit and start a new transaction if you want fresh information. Firebird always uses a transaction (and isql implicitly starts a transaction if none was started explicitly).
This is also documented in doc/README.monitoring_tables (at least in the Firebird 2.5 version):
A snapshot is created the first time any of the monitoring tables is being selected from in the given transaction and it's preserved until the transaction ends, so multiple queries (e.g. master-detail ones) will always return the consistent view of the data. In other words, the monitoring tables always behave like a snapshot (aka consistency) transaction, even if the host transaction has been started with another isolation level. To refresh the snapshot, the current transaction should be finished and the monitoring tables should be queried in the new transaction context.
(emphasis mine)
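As a concrete illustration for a cron-driven monitor script: each isql run below is its own connection, and the COMMIT in the middle shows how to refresh the snapshot within a single session. This is only a sketch; the isql path, database path, user, and password are placeholders.
#!/bin/sh
# Every new transaction gets a fresh snapshot of the MON$ tables
/opt/firebird/bin/isql -user SYSDBA -password masterkey /data/mydb.fdb <<'EOF'
SELECT MON$PAGE_READS, MON$PAGE_WRITES FROM MON$IO_STATS;
COMMIT;
-- a new transaction starts here, so this SELECT sees fresh counters
SELECT MON$PAGE_READS, MON$PAGE_WRITES FROM MON$IO_STATS;
EOF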
Note that depending on your monitoring needs, you should also look at the trace functionality that was introduced in Firebird 2.5.
I wrote a service which does the following on startup:
Boots up CouchDB
Makes a pull _replicate (+continuous) request to CouchDB
Monitors _active_tasks for the 'progress' value to reach 100 before considering itself ready (roughly the check sketched just after this list)
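The check in that last step boils down to polling the tasks endpoint, roughly like this ($HOST is a placeholder, and admin access is assumed):
# Replication tasks appear here with a "progress" field (0-100) once they are running
curl -X GET $HOST/_active_tasks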
However, the database I'm dealing with is fairly large and so the replication task takes a very long time to reach 100 even though I only turned off the database recently, and it had a continuous replication task before that, so it should be almost entirely up to date. That is, the incremental replication should be quick.
Why could it be taking so long considering it's already almost up to date, and is there anything I can do to either speed it up OR allow my service to consider itself as ready before "progress" reaches 100? The latter seems unlikely as I do want it to be fully up to date.
Thanks :)