As far as I know, IBM uses a triple write to its persistent recovery log, with steps as below:
assuming the page (4 KB) already has 1 KB of data in it:
load the page from disk first
modify the page to add the new contents (3 KB) and write the modified page to some other place
write the page to disk (in place)
if the page was broken, it can be restored from that backup copy when recovering from a power failure.
This is claimed to prevent partial writes, because writing the data directly in place could corrupt the original data in the page if the machine crashes mid-write and the file system/disk doesn't support atomic writes.
My question is: why not just write an append-only log, like a WAL, instead of updating the page in place?
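To make the comparison concrete, this is roughly what the two strategies look like at the file level (a hypothetical sketch with made-up file names and offsets, not IBM's actual implementation):

# in-place update guarded by a copy (the triple write described above)
dd if=page.new of=page.bak bs=4096 count=1 conv=fsync                  # 1. write the modified 4K page somewhere else first
dd if=page.new of=datafile bs=4096 seek=42 count=1 conv=notrunc,fsync  # 2. then overwrite the page in place
# after a crash: if the page at offset 42*4K in datafile is torn, restore it from page.bak

# append-only log (WAL style): existing bytes are never overwritten
cat record.bin >> recovery.log && sync recovery.log
# after a crash: a torn record at the tail is discarded; earlier records are untouched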
I'm working with Azure Data Factory to copy .txt files from an FTP site. I'm using a binary copy (binary datasets on both sides), but ADF is showing incredibly slow throughput (90 KB/s), so it takes hours to transfer a 4 GB file, which isn't particularly large.
The FTP site is in the US while ADF is located in Europe, but from a VM in the same Europe data center I can download the file from the FTP site in a few minutes. Something seems not quite right; any idea why ADF is not able to retrieve a 4 GB .txt file? I'm copying to Blob storage and am using the Azure IR for compute.
The pipeline runs for 6-7 hours, which seems absurd for a reasonably sized file. I have tried different formats (reading directly as delimited text), etc., but it continues to be absurdly slow. I'm assuming the FTP site has reasonable download speeds, considering I can retrieve the file on a desktop in 4-5 minutes. When watching the run in ADF monitoring, I can see it is continually "reading from source", yet the amount of data read is not changing, so I'm wondering if it is dropping the connection.
Any thoughts would help!
I've solved this by reading the file in its native format (delimited text) instead of using a binary transfer, and then writing it out to Blob as chunked-up Parquet. I've set a maximum number of rows per output file, which forces physical files to be committed as the copy progresses.
For whatever reason, ADF was struggling to read the entire 4 GB file and then write it out in one go. It seems counterintuitive to move away from a binary copy, but it appears to be the only option in this case.
I am trying to understand the difference between the file-system-consistent and crash-consistent backups provided by Azure. Most of the information I can find is from this link. I see that an application-consistent backup ensures that all in-memory data and pending I/O are accounted for, perhaps by using a quiescing process, so that a proper snapshot can be taken. However, I'm a bit confused about the other two. I see that crash-consistent is the one that doesn't consider in-memory data and pending I/O and only backs up what has already been written to disk. But then what exactly is meant by a file-system-consistent backup? I can't find a definition. As a result, when the docs mention that Linux VM backups are by default file-system consistent if not using pre/post scripts, I don't understand the implications. Any help much appreciated.
A simple example to mark the difference: when a recovery point is file-system consistent, no file system check needs to be performed to make sure the file system is not corrupted. In the case of crash consistency, after the VM boots up a file system check may be performed, and depending on its outcome there can potentially be data loss due to file system corruption. So it is always better to strive for file system consistency.
I'm quite sure that I want to delete a database in order to release my resources. I'll never need replication, nor the old version, nor the logs anymore. But despite my frequent deletion of the database and recreation of another, the disk space used increases gradually.
How can I simply get rid of the whole database and its effects on the disk?
Deleting the database via DELETE /db-name removes the database's data and associated indexes on disk. The database is as deleted as it's going to be.
If you are using the purge feature to remove all the documents in the database, instead consider a DELETE followed by a PUT to recreate it.
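For example (placeholder host and database name; CouchDB will also expect admin credentials):

curl -X DELETE "$HOST/db-name"   # removes the database, its data and its indexes
curl -X PUT "$HOST/db-name"      # recreates it empty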
Logs are a different matter, as they are not database-specific but belong to the database engine itself. It might be that you need to clear old logs, but you will probably only be able to do that on a time basis rather than per database.
AFAIK all disk reads on Linux end up in the page cache.
Is there a way to prevent reads done by a backup process from getting into the page cache?
Imagine:
A server runs fine, since most operations don't need to touch the disk because enough memory is available.
Now the backup process starts and does a lot of reading. The bytes it reads end up in memory (the page cache), although nobody will want to read the same bytes again in the next few hours.
The backup data fills up the memory, and more important pages get dropped from the cache.
Server performance gets worse, since more operations need to touch the disk because the relevant pages were dropped from the cache.
My preferred solution:
Tell Linux that the reads done by the backup process don't need to be stored in the page cache.
If you are using rsync, there is the --drop-cache flag, according to this question.
There is also the nocache utility, which will
"minimize the effect an application has on the Linux file system cache".
Its stated use case: "backup processes that should not interfere with the present state of the cache."
Using dd, there is direct I/O to bypass the cache, according to this question.
dd also has a nocache option; check info coreutils 'dd invocation' for details. Example invocations of all three are sketched below.
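A rough sketch of what those invocations could look like (hypothetical paths, adjust to your own backup job):

# rsync run through the nocache wrapper, so the pages it reads are dropped again afterwards
nocache rsync -a /var/lib/myapp/ /backup/myapp/

# dd with direct I/O, bypassing the page cache entirely (block sizes must be suitably aligned)
dd if=/dev/sda1 of=/backup/sda1.img bs=1M iflag=direct

# dd asking the kernel to discard cached pages for the files it touches
dd if=/var/lib/myapp/data.db of=/backup/data.db bs=1M iflag=nocache oflag=nocache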
We have a couple of production CouchDB databases that have blown out to 30 GB and need to be compacted. They are used by a 24/7 operations website and are replicated with another server using continuous replication.
From the tests I've done, it'll take about 3 minutes to compact these databases.
Is it safe to compact one side of the replication while the production site and replication are still running?
Yes, this is perfectly safe.
Compaction works by constructing the new compacted state in a new database file and then updating pointers. This is because CouchDB has a very firm rule that the internals of the database file are never updated, only appended to, with an fsync. This is why you can rudely kill CouchDB's processes and it doesn't have to recover or rebuild the database like you would in other solutions.
This means that you need extra disk space available to re-write the file. So trying to compact a CouchDB database to prevent full-disk warnings is usually a non-starter.
Also, replication uses the internal representation of the sequence tree (a B+tree); the replicator is not streaming the entire database file from disk onto the network pipe.
Lastly, there will of course be an increase in system resource utilization. However, your tests should have shown you roughly what this costs on your system versus an idle CouchDB, which you can use to judge how close you are pushing your system to the breaking point.
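For reference, compaction is triggered with a plain POST against the database, and you can watch its progress via the database info document (placeholder host and database name):

curl -X POST "$HOST/db-name/_compact" -H "Content-Type: application/json"
curl -s "$HOST/db-name"   # the "compact_running" field tells you whether compaction is still in progress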
I have been working with CouchDB for a while, replicating databases and writing views to fetch data.
I have observed its replication behavior, and that observation can answer your question:
In the replication process, previous revisions of documents are not replicated to the destination; only the current revision is replicated.
Compacting a database only removes those previous revisions, so it will not cause any problem.
Compaction is done only on the database it is run against, so it should not affect its replica, which is continuously listening for changes. Replication listens for the current revision changes, not the previous revisions. To verify this you can look at the changes feed:
Firing this query will show you the changes for all sequences of the database. It works only on the basis of the latest revision changes, not the previous ones (so I think compaction will not do any harm):
curl -X GET $HOST/db/_changes
The result is simple:
{"results":[
],
"last_seq":0}
More info can be found here: CouchDB Replication Basics
This might help you understand it. In short, the answer to your question is YES, it is safe to compact a database during continuous replication.