When I execute sed with the -i option (in-place edit) against a very large file, is there any way to know how the target file is processed?
e.g. does sed create an intermediate file in /tmp, or is everything processed in memory and swapped out, etc.?
strace shows that even for small files the original file is read, the results are written to a temporary file, and the temporary file is then renamed to the name of the original file.
So I would assume that this is the same behaviour for larger files: a temporary file will be created.
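You can confirm this yourself with strace; a minimal sketch, where bigfile.txt and the s/foo/bar/ expression are just placeholders. With GNU sed the temporary file typically appears in the same directory as the original rather than in /tmp, so the final rename stays on the same filesystem:
# Trace only file-related syscalls; look for an openat() on a sed?????? temp file
# followed by a rename() onto bigfile.txt
strace -e trace=%file sed -i 's/foo/bar/' bigfile.txt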
Related
I find myself in a situation similar to this question:
Linux: Overwrite all files in folder with specified data?
The answers there work nicely; however, they are for typed-out text. Allow me to provide context.
I have a Linux system with the following file structure (files & folders irrelevant to the question removed):
root/
empty.svg
svg/
257238.svg
297522.svg
a7yf872.svg
236y27fh.svg
38277.svg
... (~200 other .svg files with arbitrary names)
2903852.svg
The framework I am working with requires those .svg files to exist with those specific filenames, but it obviously does not care about the SVG images they contain. I do not plan on using these files and they take up a hefty amount of space on disk, so I wish to replace them all with an empty SVG, i.e. the empty.svg file in my root directory, which is a 12x12 transparent SVG (124 bytes). That way the framework shouldn't error out like it did when I tried simply overwriting the raw data of those SVGs with plain text using the answer from the question linked above. I've tried many methods, trying to be creative with my basic Linux command-line knowledge, but with no success. How do I accomplish this?
TL;DR: How to recursively overwrite all files in a folder with the raw data of another file from Linux CLI?
As in the linked question, you can use the tee command, but instead of echo use cat to copy the file's contents (cat reads a file and writes it to standard output):
cat empty.svg | tee svg/257238.svg svg/297522.svg <etc>
But if there are a lot of files in the svg directory, it is easier to use a loop to automate the previous command:
for f in svg/*; do
    # only touch .svg files; overwrite each one with the contents of empty.svg
    if [[ "$f" == *.svg ]]; then
        cat empty.svg > "$f"
    fi
done
Here the > redirection overwrites each matching file with the output of cat, i.e. the contents of empty.svg. If your svg directory also contains subdirectories, see the find-based sketch below.
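A minimal sketch, under the assumption that every *.svg file anywhere under svg/ should be overwritten and that find and sh are available:
# Recursively overwrite every .svg under svg/ with the contents of empty.svg
find svg/ -type f -name '*.svg' -exec sh -c 'cat empty.svg > "$1"' _ {} \;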
In a particular directory, I made a file named "fileName" and added contents to it. When I typed cat fileName, its contents were printed on the terminal. Then I used the following command:
cat fileName>fileName
No error was shown. But now when I try to see the contents of the file using
cat fileName
nothing is shown in the terminal, and the file is empty (I checked it). What is the reason for this?
The > redirection to the same file creates/truncates the file before the cat command is invoked, because the shell sets up the redirection before running the command. You could avoid this by using an intermediate file (write to the intermediate file, then copy it back to the actual file), or you could use tee like:
cat fileName | tee fileName
To expand on SMA's answer: the file is truncated because redirection is handled by the shell, which opens the file for writing before invoking the command. When you run cat file > file, the shell truncates and opens the file for writing, sets stdout to the file, and then executes ["cat", "file"]. So you will have to use some other command for the task, like tee.
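A quick way to see that it is the shell, not the command, doing the truncating (the filename is just an example):
echo hello > file
wc -c < file        # 6 bytes
: > file            # ':' does nothing at all, yet the shell still truncates 'file'
wc -c < file        # 0 bytes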
The answers given here are wrong. You will have a problem with truncation regardless of whether you use the redirection or the pipeline, although it may APPEAR to work sometimes, depending on the size of the file or the length of your pipeline. It is a race condition: the reader may get a chance to read some or all of the file before the writer starts, but the point of a pipeline is to run all of its commands at the same time, so they start together, and the first thing the tee executable does is open the output file (truncating it in the process). The only way you would not have a problem in this scenario is if the end of the pipeline loaded the entire output into memory and only wrote it to the file on shutdown. That is unlikely to happen and defeats the point of having a pipeline.
The proper way to make this reliable is to write to a temp file and then rename the temp file back to the original filename:
TMP="$(mktemp fileName.XXXXXXXX)"
cat fileName | grep something | tee "${TMP}"
mv "${TMP}" fileName
I have a file named test:
[test#mypc ~]$ ls -i
4982967 test
Then I use vim to change its content and enter :w to save it.
It now has a different inode:
[test#mypc ~]$ ls -i
4982968 test
That means it is already a different file. Why does vim save to another file when I use :w to save to the original one?
You see, echo to a file will not change the inode, which is expected:
[test#mypc ~]$ echo v >> test
[test#mypc ~]$ ls -i
4982968 test
Vim is trying to protect you from disk and OS problems. It writes out a complete copy of the file and, when it is satisfied that this has finished properly, renames the copy to the required filename. Hence the new inode number.
If there were a crash during the save process, the original file would remain untouched, possibly saving you from losing the file completely.
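If you need vim to preserve the inode (for example because other hard links or a file watcher point at it), the 'backupcopy' option controls this behaviour; a sketch, assuming a reasonably recent vim (see :help backupcopy):
# make vim overwrite the original file in place, keeping the inode
# (trading away some of the crash protection described above)
vim -c 'set backupcopy=yes' test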
My project's C source code file got corrupted while making a tgz of the files. I wanted to make a *.tgz of 4 files. The file names are common.c, common.h, myfile.c and myfile.h. I mistyped the tar command; I used the following tar command by mistake:
tar -cvf common.* myfile.* project.tgz
This corrupted the common.c file, since tar treated it as the archive to create. Is there any way to recover it?
If the file is really important, then unmount the related block device and look through it with the strings command. If the file contained something relatively unique, you can grep for it, and with a bit of luck you can recover most of the file.
This can also work with mounted block devices, but the chance of losing the data is higher. It only works on systems where you can access the block device directly.
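A minimal sketch of that approach, assuming the filesystem lives on /dev/sdX1 (replace with your actual device, and run as root) and that common.c contained a reasonably unique string such as a function name:
umount /dev/sdX1                       # stop the filesystem from overwriting the freed blocks
strings /dev/sdX1 | grep -B 50 -A 200 'some_unique_function_name' > recovered.txt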
I'm creating a zip file by specifying the file locations relative to the current directory. Here is an example of the command I'm running:
zip priv/purchases/test.zip priv/audio/5001.mp3 priv/audio/5002.mp3
When the files are compressed, the archive maintains their relative paths. Thus I get a file hierarchy of:
/priv
/audio
/5001.mp3
/5002.mp3
I'd like the files to end up at the root of the archive instead. I've read the man page, and I guess I should be using the -j flag.
-j seems to work but it ALSO includes the file structure. Why?
Well, don't I feel silly. Apparently if you don't remove the previously created zip file, zip just adds the new entries to it. Shoot! Haha.
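For reference, this is roughly what worked, using the paths from the question; delete any old test.zip first, since zip updates an existing archive rather than replacing it:
rm -f priv/purchases/test.zip
zip -j priv/purchases/test.zip priv/audio/5001.mp3 priv/audio/5002.mp3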