LMDB using extra space when overwriting entries - lmdb

In LMDB, if you overwrite a previous entry in the same transaction, it seems that the previous entry's space is not released until the data is committed. In my application, I encountered a case where LMDB ran out of space because a particular entry was overwritten so many times, so I just increased the database size to get around this problem.
Is there a way to free unused space in LMDB to prevent this situation from happening?

There is no compaction mechanism in LMDB. If there are active readers while a writer is modifying entries, those entries are copied to new pages in order to preserve consistent reads. When the readers are done, the old pages are returned to the free page list and become available to later modifications.
So to limit excess storage consumption, perform multiple small write transactions without active readers.

Related

Cassandra: how to automatically delete old records to avoid disk space shortage?

We are using TWCS for time series data with a default TTL of 30 days and a compaction window size of 1 day.
Unfortunately, there are cases when the incoming data rate gets higher and there is not much disk space left to write it. At the same time, due to budget constraints, adding new nodes to the cluster is not an option. Currently we resort to manually deleting old sstables, but it is error prone.
What is the best way in the TWCS case to make Cassandra delete, say, all records that are older than a certain date? I mean not creating tombstones in a new sstable, but actually deleting old records from disk to free up space.
Of course, I can reduce the TTL, but it will affect only new records (so it will help only in the long run, not immediately), and when there is not much incoming data, records will be stored for a shorter period than they could be.
Basically, that's the intent of TTLs: to automatically remove the old data. An explicit deletion always creates a tombstone, and it won't work well with TWCS. So right now the solution would be to stop the node, remove old files to free space, start the node, and repeat on all nodes. But you're doing that already.

Best practices for ArangoDB compaction on-demand for file space reclamation

Part of my evaluation of ArangoDB involves importing a few CSV files of over 1M rows into a staging area, then deleting the resulting collections or databases. I will need to do this repeatedly for the production processes I envision.
I understand that the ArangoDB service invokes compaction periodically per this page:
https://docs.arangodb.com/3.3/Manual/Administration/Configuration/Compaction.html
After deleting a database, I waited over 24 hours and no disk space has been reclaimed, so I'm not sure this automated process is working.
I'd like answers to these questions:
What are the default values for the automatic compaction parameters shown in the link above?
Other than observing a change in file space, how do I know that a compaction worked? Is there a log file or other place that would indicate this?
How can I execute a compaction on-demand? All the references I found that discussed such a feature indicated that it was not possible, but they were from several years ago and I'm hoping this feature has been added.
Thanks!
The GET route /_api/collection/{collection-name}/figures contains a sub-attribute compactionStatus in the attribute figures with time and message of the last compaction for debugging purposes. There is also some other information in the response that you might be interested in. See if doCompact is set to true at all.
https://docs.arangodb.com/3.3/HTTP/Collection/Getting.html#return-statistics-for-a-collection
You can run arangod --help-compaction to see the startup options for compaction including the default values. This information is also available online in the 3.4 docs:
https://docs.arangodb.com/3.4/Manual/Programs/Arangod/Options.html#compaction-options
The PUT route /_api/collection/{collection-name}/rotate, quoting the documentation directly:
Rotates the journal of a collection. The current journal of the
collection will be closed and made a read-only datafile. The purpose
of the rotate method is to make the data in the file available for
compaction (compaction is only performed for read-only datafiles, and
not for journals)
Saving new data in the collection subsequently will create a new journal file automatically if there is no current journal.
https://docs.arangodb.com/3.3/HTTP/Collection/Modifying.html#rotate-journal-of-a-collection
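As a rough sketch, the two routes mentioned above could be called from Python's standard library along these lines. The base URL `http://localhost:8529`, the collection name `staging`, and the helper names are assumptions for illustration only; authentication is left out, and this is not taken from any official ArangoDB client.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:8529"  # assumed address of a local server


def figures_request(collection, base_url=BASE_URL):
    """Build the GET request for a collection's figures."""
    url = f"{base_url}/_api/collection/{quote(collection, safe='')}/figures"
    return urllib.request.Request(url, method="GET")


def rotate_request(collection, base_url=BASE_URL):
    """Build the PUT request that rotates a collection's journal."""
    url = f"{base_url}/_api/collection/{quote(collection, safe='')}/rotate"
    return urllib.request.Request(url, data=b"", method="PUT")


def send(request):
    """Send a request and decode the JSON body (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server running, `send(rotate_request("staging"))` would rotate the journal, and `send(figures_request("staging"))` would return the response in which the `compactionStatus` sub-attribute of `figures` can be inspected.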

Single Threaded LMDB

If you are using LMDB from only a single thread, and don't care about database persistence at all, is there any reason to open and close transactions?
Will it cause a performance issue to do all operations within a single transaction? Is there a performance hit from opening and closing too many transactions?
I am finding that my LMDB database is slowing down dramatically once it grows larger than available RAM, but neither my SSD nor CPU are near their capacity.
If the transaction is not committed, there is no guarantee that a reader (in a different process) can read the item. Write transactions should be committed at some point, so the data is available to other readers.
The database slowdown could simply be due to non-sequential writes. According to this post (https://ayende.com/blog/163330/degenerate-performance-scenario-for-lmdb), non-sequential writes take longer.
If you don't commit, your db just grows in memory, which will result in the OS starting to swap once you run out of memory, which hits the disk, which is slow.
If you don't need persistence at all, then use an in-memory hash map; LMDB really doesn't provide you with anything in that case. If you do want persistence but don't care about losing data, then choose a reasonable commit interval (it depends on the value size, so experiment) and commit, for example, after every 1000 values or so.
If you commit too infrequently you just incur the whole cost of disk access at a single point in time, so I think it makes more sense to spread that load a bit.
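The "commit every N values" idea can be sketched independently of any particular binding. In the minimal example below, `commit_batch` is a hypothetical callback that is expected to open one write transaction, store the batch, and commit; with the py-lmdb binding that would typically mean a `with env.begin(write=True) as txn:` block containing `txn.put(...)` calls, but that part is an assumption and is not shown here.

```python
from itertools import islice


def batches(items, size):
    """Yield lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def write_in_batches(items, commit_batch, size=1000):
    """Call `commit_batch` once per batch; return the number of commits."""
    commits = 0
    for batch in batches(items, size):
        commit_batch(batch)  # hypothetical: one write transaction per batch
        commits += 1
    return commits
```

The batch size of 1000 is just the figure suggested above; it is worth measuring a few values against the actual record size.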

physical disk space management of cassandra

Recently I have been looking into Cassandra from our new project's perspective and have learned a lot from this community and its wiki. But I have not found anything about how updates are managed in Cassandra in terms of physical disk space, though it seems to be very similar to how record deletes are managed using compaction.
Suppose there are 100 records with 5 column values each. When all changes are flushed to disk, all records are written adjacently. When a delete is performed, it is first marked in the memtable, and the record is physically deleted some time later, as set in the configuration or when the memtable is full. The compaction process then reclaims the space.
Now the question: on one hand, being schemaless, there is no fixed number of columns at the beginning. On the other hand, when compaction takes place, does it place records adjacently on disk like a traditional RDBMS to speed up reads? For an RDBMS this is easy, because it allocates a fixed amount of space according to the declared column datatypes.
But how exactly does Cassandra place records on disk during compaction (for both updates and deletes) to speed up reads?
One more question related to compaction: when there are no delete queries, but an update query updates an existing record with variable-length data or inserts an altogether new column, how does compaction make space available on disk between the already existing data rows?
Rows and columns are stored in sorted order in an SSTable. This allows a compaction of multiple SSTables to output a new (sorted) SSTable with only sequential disk IO. The new SSTable is written to a new file in free space on the disks. This process doesn't depend on the number of rows or columns, just on them being stored in sorted order. So yes, in all SSTables (even those resulting from compactions) rows and columns will be arranged in sorted order on disk.
What's more, as you hint at in your question, updates are no different from inserts: they do not overwrite the value on disk, but instead get buffered in a Memtable, then get flushed into a new SSTable. When the new SSTable eventually gets compacted with the SSTable containing the original value, the newer value will annihilate the old one, i.e. the old value will not be output from the compaction. Timestamps are used to decide which value is newest.
Deletes are handled in the same fashion, effectively inserting an "anti-value", or tombstone. The limitation of this process is that it can require significant space overhead. Deletes are effectively 'lazy', so the space doesn't get freed until some time later. Also, while the output of the compaction can be the same size as the input, the old SSTables cannot be deleted until the new one is completed, so this can reduce disk utilisation to 50%.
In the system described above, a new value for an existing key can be a different size from the existing value without padding to some pre-determined length, as the new value does not get written over the old value on update, but to a new SSTable.
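To make the merge step concrete, here is a toy sketch of compacting sorted tables. The `(key, timestamp, value)` tuple layout, the `TOMBSTONE` sentinel, and the `compact` function are all invented for illustration and do not reflect Cassandra's on-disk format. It also drops tombstones immediately, whereas a real cluster keeps them for a grace period.

```python
import heapq
from itertools import groupby
from operator import itemgetter

TOMBSTONE = object()  # illustrative marker for a deleted value


def compact(*sstables):
    """Merge tables of (key, timestamp, value) tuples sorted by key.

    Keeps only the newest version of each key and drops keys whose
    newest version is a tombstone.
    """
    # heapq.merge streams the inputs in key order without re-sorting them
    merged = heapq.merge(*sstables, key=itemgetter(0))
    output = []
    for key, versions in groupby(merged, key=itemgetter(0)):
        _, timestamp, value = max(versions, key=itemgetter(1))
        if value is not TOMBSTONE:
            output.append((key, timestamp, value))
    return output


old = [("a", 1, "x"), ("b", 1, "y"), ("c", 1, "z")]
new = [("a", 2, "a longer value"), ("c", 3, TOMBSTONE)]
result = compact(old, new)
```

Here `result` is `[("a", 2, "a longer value"), ("b", 1, "y")]`: the update to `a` replaces the old value regardless of its length, and `c` disappears because its newest version is a tombstone.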

CouchDB .view file growing out of control?

I recently encountered a situation where my CouchDB instance used all available disk space on a 20GB VM instance.
Upon investigation I discovered that a directory in /usr/local/var/lib/couchdb/ contained a bunch of .view files, the largest of which was 16GB. I was able to remove the *.view files to restore normal operation. I'm not sure why the .view files grew so large and how CouchDB manages .view files.
A bit more information. I have a VM running Ubuntu 9.10 (karmic) with 512MB and CouchDB 0.10. The VM has a cron job which invokes a Python script which queries a view. The cron job runs once every five minutes. Every time the view is queried the size of a .view file increases. I've written a job to monitor this on an hourly basis and after a few days I don't see the file rolling over or otherwise decreasing in size.
Does anyone have any insights into this issue? Is there a piece of documentation I've missed? I haven't been able to find anything on the subject but that may be due to looking in the wrong places or my search terms.
CouchDB is very disk hungry, trading disk space for performance. Views will increase in size as items are added to them. You can recover disk space that is no longer needed with cleanup and compaction.
Every time you create, update, or delete a document, the view indexes will be updated with the relevant changes to the documents. The update to the view happens when it is queried. So if you are making lots of document changes, you should expect your index to grow, and it will need to be managed with compaction and cleanup.
If your views are very large for a given set of documents then you may have poorly designed views. Alternatively your design may just require large views and you will need to manage that as you would any other resource.
It would be easier to tell what is happening if you could describe what document updates (inc create and delete) are happening and what your view functions are emitting, especially for the large view.
Your .view files grow each time you access a view because CouchDB updates views on access. CouchDB views need compaction, just like databases do. If you have frequent changes to your documents, resulting in changes in your view, you should run view compaction from time to time. See http://wiki.apache.org/couchdb/HTTP_view_API#View_Compaction
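A minimal sketch of how view compaction and view cleanup could be triggered from a script, assuming the `POST /{db}/_compact/{design-doc}` and `POST /{db}/_view_cleanup` routes and a server at `http://127.0.0.1:5984`. The helper names, the database name `yourdb`, and the design document name `docs` are placeholders, and authentication is omitted.

```python
import urllib.request

COUCH_URL = "http://127.0.0.1:5984"  # assumed local CouchDB address


def _post(url):
    """Build an empty POST request with a JSON content type."""
    return urllib.request.Request(
        url,
        data=b"",
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def view_compaction_request(db, design_doc, base_url=COUCH_URL):
    """Request that compacts the view index of one design document."""
    return _post(f"{base_url}/{db}/_compact/{design_doc}")


def view_cleanup_request(db, base_url=COUCH_URL):
    """Request that removes index files no longer used by any view."""
    return _post(f"{base_url}/{db}/_view_cleanup")
```

Passing either request to `urllib.request.urlopen` sends it; a cron job like the one that queries the view could do this on a schedule.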
To reduce the size of your views, have a look at the data you are emitting. When you emit(foo, doc), the entire document is copied to the view so it is instantly available when you query the view. The function(doc) { emit(doc.title, doc); } will result in a view as big as the database itself. You could instead emit(doc.title, null); and use the include_docs option to let CouchDB fetch the document from the database when you access the view (which will result in a slight performance penalty). See http://wiki.apache.org/couchdb/HTTP_view_API#Querying_Options
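For a view that emits only the key, the query side might look like the sketch below. The `view_url` helper, the database name `yourdb`, the design document `docs`, and the view `by_title` are hypothetical names chosen for the example; only the `include_docs` query option comes from the answer above.

```python
from urllib.parse import urlencode


def view_url(db, design_doc, view, include_docs=False,
             base_url="http://127.0.0.1:5984"):
    """Build a view query URL, optionally asking for full documents."""
    url = f"{base_url}/{db}/_design/{design_doc}/_view/{view}"
    params = {}
    if include_docs:
        # CouchDB expects a JSON boolean in the query string
        params["include_docs"] = "true"
    return f"{url}?{urlencode(params)}" if params else url


url = view_url("yourdb", "docs", "by_title", include_docs=True)
```

Each returned row then carries the document under a `doc` field in addition to the emitted key, so the index itself stays small.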
Use sequential or monotonic IDs for documents instead of random ones
Yes, CouchDB is very disk hungry, and it needs regular compaction. But there is another thing that can help reduce disk usage, especially when that usage is unnecessary.
CouchDB uses B+ trees for storing data/documents, which is a very good data structure for retrieval performance. However, a B+ tree trades disk space for that performance. With completely random IDs, the B+ tree fans out quickly. As the minimum fill rate is 1/2 for every internal node, the nodes mostly end up about half full (the data spreads evenly due to its randomness), which generates more internal nodes. New insertions can also cause a rewrite of the full tree. That's what randomness can cause ;)
Using sequential or monotonic IDs instead avoids all of this.
I've had this problem too, trying out CouchDB for a browser-based game.
We had about 100.000 unexpected visitors on the first day of a site launch, and within 2 days the CouchDB database was taking about 40GB in space. This made the server crash because the HD was completely full.
Compaction brought that back to about 50MB. I also set the _revs_limit (which defaults to 1000) to 10 since we didn't care about revision history, and it has been running perfectly since. After almost 1M users, the database size is usually about 2-3GB. When I run compaction it's about 500MB.
Setting document revision limit to 10:
curl -X PUT -d "10" http://dbuser:dbpassword@127.0.0.1:5984/yourdb/_revs_limit
Or without user:password (not recommended):
curl -X PUT -d "10" http://127.0.0.1:5984/yourdb/_revs_limit
