While loop with cp gives partial copies of files (Linux)

I am trying to copy a list of files in varying directories, selected by their sample name, using the following script. Although the files are copied, they are only partially copied: each source file has 64k lines, but only exactly 40k lines end up in the copy.
while read sample
do
echo copying ${sample}
cp ${sample_dir}/*${sample}*/file.tsv ${output_dir}/${sample}.file.tsv
done < ${input_list}/sample_list.txt
Am I missing something obvious here? Does the cp command have limits on how many lines it can copy?
Cheers,

I don't think the cp command has a limit on the size of a file it can copy (unless ulimit imposes restrictions), nor on the number of lines. Check whether the problem is related to creating a new file at the larger size.
Check the limits across the system using the ulimit command:
ulimit -a
and verify that the file size is not limited to 40k; if it is unlimited there is no issue on that front (you should see something like: file size (blocks, -f) unlimited).
Try the rsync command and see if it works for you:
rsync -avh source destination
Could you verify a few things:
1) Verify that it is not a file read error: cat the file and save the output to another file (either manually or via a redirect).
cat *.tsv > /tmp/verify-size
Then verify the size of that file:
du -h /tmp/verify-size ---> This should be 64k
2) Create a large dummy file with a size > 40k (or the exact size of the .tsv, 64k):
dd if=/dev/zero of=/tmp/verify-newfile-size bs=64000 count=1
du -h /tmp/verify-newfile-size ---> This should be 64k
If this new file creation succeeds (i.e. you are able to create a file of any size), then try the cp command on it and verify the size.
-OR- Try with the dd command:
dd if=/tmp/verify-newfile-size of=/tmp/verify-newfile-size2 bs=64000 count=1
3) Try with an atomic command:
mv /tmp/verify-newfile-size /tmp/verify-newfile-size3
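If those checks pass, here is a minimal sketch (assuming bash and the same variables and paths as in the question) that verifies each copy inside the loop by comparing line counts, so you can see exactly which sample gets truncated:
while read -r sample
do
    src=( ${sample_dir}/*${sample}*/file.tsv )          # glob expansion; assumes exactly one match per sample
    echo "copying ${sample}"
    cp "${src[0]}" "${output_dir}/${sample}.file.tsv"
    # print line counts of source and copy side by side to catch truncation immediately
    wc -l "${src[0]}" "${output_dir}/${sample}.file.tsv"
done < "${input_list}/sample_list.txt"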

Related

Clear log file while the logging process is running

Assume that, before clearing, the log file xxx.log is 1.5M in size.
I'm using the command sudo cp /dev/null xxx.log to clear the log file.
After running the script, the file size changes to 0.
But if I trigger some activity, the log grows back to 1.5M with a lot of white space at the start.
When I googled it, the advice was to stop the writing process first!
But I don't want to stop the writing process; is there any other way?
When the log size increases, it is the logical size that increases: you have created a sparse file. You can check this using ls -ls xxx.log or du xxx.log. The file does not actually use 1.5MB on disk.
(edited because of negative feedback)
What I was trying to explain is that when you cp /dev/null xxx.log, you truncate the file. But the application continues to write to it at its current position, so you create a sparse file: it is empty from the start of the file up to the application's last write.
ls -lh will print what I called the logical size, while du -hs will only print the number of blocks actually allocated.
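To see the effect yourself, here is a minimal sketch (demo.log is a hypothetical file, and dd stands in for the application that keeps writing at its old offset):
dd if=/dev/zero of=demo.log bs=1M count=1                        # the "application" has written 1M of log
cp /dev/null demo.log                                            # truncate it, as in the question
dd if=/dev/zero of=demo.log bs=1M count=1 seek=1 conv=notrunc    # the writer continues at its old 1M offset
ls -lh demo.log                                                  # logical size: about 2M
du -h demo.log                                                   # allocated blocks: about 1M on filesystems that support sparse files (the first 1M is a hole)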

Does anyone know how sed -i is implemented?

When I execute sed with the -i option (in-place replacement) against a very large file, is there any way to know how the target file is processed?
e.g. is an intermediate file created in /tmp, or is it processed in memory and swapped, etc.?
strace shows that, even for small files, the original file is read, the results are written to a temporary file, and that file is then renamed to the name of the original file.
So I would assume that the behaviour is the same for larger files: a temporary file will be created.
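If you want to check this yourself, a minimal sketch (the file name and substitution are placeholders) that traces the relevant system calls:
printf 'hello\n' > /tmp/demo.txt
strace -e trace=open,openat,rename,renameat sed -i 's/hello/world/' /tmp/demo.txt
# expect an openat() of a temporary file (something like /tmp/sedXXXXXX, created next to
# the original), followed by a rename() of that temporary file onto /tmp/demo.txt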

Why would vim create a new file every time you save a file?

I have a file named test:
[test#mypc ~]$ ls -i
4982967 test
Then I use vim to change its content and enter :w to save it.
It now has a different inode:
[test#mypc ~]$ ls -i
4982968 test
That means it's already a different file. Why would vim save to another file when I use :w to save to the original one?
Note that echo to a file does not change the inode, which is what I expected:
[test#mypc ~]$ echo v >> test
[test#mypc ~]$ ls -i
4982968 test
It is trying to protect you from disk and OS problems. It writes out a complete copy of the file and, when it is satisfied that this has finished properly, renames the copy to the required filename. Hence the new inode number.
If there were a crash during the save process, the original file would remain untouched, possibly saving you from losing the file completely.
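The same "write a complete copy, then rename it over the original" strategy is easy to mimic in the shell; a minimal sketch (test.tmp is a hypothetical name) showing why the inode changes:
ls -i test                 # note the current inode
cp test test.tmp           # write the complete new contents to a separate file
echo "edited" >> test.tmp  # stand-in for the actual editing
mv test.tmp test           # the rename atomically replaces the directory entry
ls -i test                 # the file now has test.tmp's inode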

Recover files deleted with rsync -avz --delete

Is it possible to recover files deleted with rsync -avz --delete?
If it is, what are some suggested tools to do so?
I am assuming you ran rsync on some Unix system.
If you don't have a backup of your file system,
then it's a long, tedious process to recover deleted files from a Unix file system.
High-level steps:
find the partition where your file resided
create image of entire partition % dd if=/partition of=partition.img ..
(this assumes you have enough space to store this somewhere locally in a different partition, or you can copy it over to different system % dd if=/partition | ssh otherhost "dd of=partition.img")
open the img file in a hex editor
(this assumes you know the contents of the files that you've lost and can identify them when you see the content.)
note the byte offset and length of your file
use grep -b to locate (by byte offset) the contents of your missing file and pull them out.
enjoy!
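A minimal sketch of those steps (the device name, search phrase, and offset are placeholders; work from a different partition or machine so you do not overwrite the data you want back):
dd if=/dev/sdXN of=partition.img bs=4M                      # image the partition that held the files
grep -a -b 'a phrase from the lost file' partition.img      # -b prints the byte offset of each match
# suppose grep reported a match at byte offset 123456789:
dd if=partition.img of=recovered.txt bs=1 skip=123456789 count=65536   # carve out a chunk starting there
# then trim recovered.txt by hand in a text or hex editor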
I wasn't able to get extundelete to work, so I ended up using photorec + find/grep in order to recover my important files.

How to recover deleted files in a Linux filesystem (a bit faster)?

If I launch the following command to recover a lost file on Linux:
grep -a -B 150 -A 600 "class SuperCoolClass" /dev/sda10 > /tmp/SuperCoolClass.repair
Do I really need the "-a"? We need to recover some erased files (sabotage) from "sda10", and we have a bunch of them to recover; I believe removing the -a would make it faster.
I believe the files are still on the disk and are not binary.
thx
The file you are working on is /dev/sda10, which grep assumes contains binary data. In order to treat it as text (which is what you are looking for) you need the -a, otherwise grep will just print Binary file /dev/sda10 matches.
In addition, since the task is I/O-bound rather than CPU-bound, dropping -a would not be a big performance gain in any case.
In the future it's quite easy to test something like this yourself (a consolidated sketch follows the list):
create a dummy 10MB disk: dd if=/dev/zero of=testfs bs=1024 count=10000
create filesystem: mkfs.ext4 testfs
mount via loopback: mount -o loop ./testfs /mnt/test/
copy some stuff on the dummy filesystem
unmount: umount /mnt/test
run grep on the test file with different options
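Put together, a minimal sketch of that test (paths are placeholders; mkfs and mount need root):
dd if=/dev/zero of=testfs bs=1024 count=10000    # 10MB dummy disk image
mkfs.ext4 -F testfs                              # -F because testfs is a regular file, not a block device
mkdir -p /mnt/test
mount -o loop ./testfs /mnt/test
cp /etc/hostname /mnt/test/sample.txt            # put some known text on the filesystem
rm /mnt/test/sample.txt                          # "delete" it
umount /mnt/test
time grep -a 'some text from the file' testfs > /dev/null   # with -a
time grep    'some text from the file' testfs > /dev/null   # without -a, for comparison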
EDIT
It just occurred to me that maybe you are looking for the /usr/bin/strings command instead.
Something like:
extract all printable strings from ruined disk: /usr/bin/strings -a /dev/sda10 > /tmp/recovery
grep on the text only many times for different strings: grep "whatever" /tmp/recovery > /tmp/recovery.whatever
To recover a text file (only a text file) you accidentally deleted / overwrote (provided you remember a phrase in that text file):
Ensure the safety of files by unmounting the directory with
umount /home/johndoe.
Find out which hard disk partition the folder is on, say sda3.
Switch to a terminal as root.
Run
grep -a -A800 -B800 'search this phrase' /dev/sda3 | strings>recovery_log.txt
This will take a while. You can go through the file recovery_log.txt using any text editor, even while the command is running.
