Copying sectors? - linux

Is there a script I can use to copy particular sectors of my hard disk?
I have two partitions, say A and B, on my hard disk, both the same size. I want to run a program that copies data from the starting sector of A to the starting sector of B, and continues until the end sector of A has been copied to the end sector of B.
Looking for possible solutions...
Thanks a lot

How about using dd? The following copies 1024 blocks (of 512 bytes each, which is the usual sector size) from the sda1 partition to the sdb1 partition, starting 4096 blocks into sda1:
dd if=/dev/sda1 of=/dev/sdb1 bs=512 count=1024 skip=4096
PS. I suppose this should rather be a SuperUser or ServerFault question.

If you want to access the hard drive directly, not via partitions, then, well, just do that. Something like
dd if=/dev/sda of=/dev/sda bs=512 count=1024 skip=XX seek=YY
should copy 1024 sectors starting at sector XX to sectors YY->YY+1024. Of course, if the sector ranges overlap, results are probably not going to be pretty.
(Personally, I wouldn't attempt this without first taking a backup of the disk, but YMMV)
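To try the skip/seek idea without touching a real disk, here is a minimal sketch that runs against a scratch image file. The file name disk.img and all sector numbers are made up for illustration, and it assumes GNU dd. On a regular file you need conv=notrunc, otherwise dd truncates the file at the seek position.

```shell
#!/bin/sh
# Sketch: copy a range of 512-byte "sectors" inside one image file.
# disk.img is a hypothetical stand-in for /dev/sda; all numbers are illustrative.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
img="$workdir/disk.img"

# Build a 64-sector image: sectors 0-31 hold 'A' bytes, sectors 32-63 are zero.
head -c $((32 * 512)) /dev/zero | tr '\0' 'A' > "$img"
head -c $((32 * 512)) /dev/zero >> "$img"

# Copy 8 sectors starting at sector 4 (skip) to sector 40 (seek).
# conv=notrunc keeps dd from truncating the regular file at the seek offset.
dd if="$img" of="$img" bs=512 count=8 skip=4 seek=40 conv=notrunc 2>/dev/null

# Read back the destination range and count the non-zero bytes in it.
copied=$(dd if="$img" bs=512 skip=40 count=8 2>/dev/null | tr -d '\0' | wc -c | tr -d ' ')
size=$(wc -c < "$img" | tr -d ' ')
echo "copied=$copied size=$size"
```

The source and destination ranges here do not overlap, which sidesteps the overlap problem mentioned above.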

I am not sure whether what you are looking for is a partition copier.
If that is what you mean, try Clonezilla.
(It shows the exact command it uses, so you can use it to find out how to do the same in a script afterwards.)

Related

After dd completion, should records In = records out

I am using the following command, where sda (500GB) is my laptop hard drive (unmounted) and sdc (500GB) is my external USB hard drive:
dd if=/dev/sda of=/dev/sdc bs=4096
When complete this returns
122096647+0 records in
122096646+0 records out
50010782016 bytes (500GB) copied, 10975.5 s, 45.6 MB/s
This shows records in != records out
fdisk -l
returns
Device     Boot  Start   End        Blocks     Id  System
/dev/sda1  *     2048    718847     358407     7   HPFS/NTFS/exFAT
/dev/sda2        718848  977102847  488192000  7   HPFS/NTFS/exFAT
/dev/sdc1  *     2048    718847     358407     7   HPFS/NTFS/exFAT
/dev/sdc2        718848  977102847  976384000  7   HPFS/NTFS/exFAT
This also shows a difference between the Blocks values.
Another question: is it normal for dd to take 3 hours for a 500GB copy (laptop SSD to an ordinary non-SSD USB hard drive)?
My physical sector size on Windows is 4096 bytes, while the logical sector size is 512 bytes.
Is it normal for dd to take 3 hours? Yes. dd can take very long because you are copying everything off the drive, bit by bit. You also need to consider how the source (sda) is connected to the destination (sdc). You mention sdc is your external USB hard drive, so what is the maximum transfer speed of that USB connection? It is also unlikely that the transfer will run at that maximum the whole time. If it is USB 2.0 then yes, it can take very long.
This is why I hate dd. It is often used when it should not be, and differences between source and destination, such as partition sizes, types and block sizes, cause problems.
In most cases you are better off using cp -rp or tar.
If you are trying to clone a drive that holds a bootable Linux operating system, you do not need dd; there are better ways.
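As a rough sanity check on the three-hour figure, you can divide the drive size by the rate dd reported. This is only back-of-the-envelope arithmetic; the round 500 GB size is my assumption, the 45.6 MB/s comes from the dd output in the question.

```shell
#!/bin/sh
# Estimate copy time from size and throughput (integer arithmetic).
bytes=500000000000        # nominal 500 GB drive (assumed round figure)
rate=45600000             # 45.6 MB/s, the rate dd reported
seconds=$((bytes / rate))
minutes=$((seconds / 60))
echo "$seconds s, about $minutes min"
```

That lands at roughly three hours, so the duration is consistent with the reported throughput rather than a sign that something went wrong.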

DD Image larger than source

I created an image file using dd on my disk /dev/sda, which fdisk says is 500107862016 bytes in size. The resulting image file is 500108886016 bytes, which is exactly 1024000 bytes larger.
Why is the image file 1MB larger than my source disk? Is there something related to the fact that I specified bs=1M in my dd command?
When I restore the image file onto another identical disk, I get "dd: error writing ‘/dev/sda’: No space left on device" error. Is this a problem? Will my new disk be corrupted?
conv=noerror makes dd(1) continue after a read error, which is not what you want. conv=sync pads every incomplete input block (mainly the last one) with zeros up to the full block size, so these appended zeros are probably what makes your file larger than the actual disk.
You don't need any of the conv options you used. No conversion is required, and without them dd(1) simply writes the last block short when the source is not a whole number of blocks (which is the case here).
Just retry your command with:
dd if=/dev/sda of=yourfile.img
and then
dd if=yourfile.img of=/dev/sdb
If you want to use a larger buffer size (not needed here, since you are reading a block device and the kernel doesn't impose a block size for that), use a multiple of the sector size that also divides the whole disk size (something like one full track, which is absurd, as today's disk tracks are purely logical and bear no relationship to the actual disk geometry).
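The padding effect is easy to reproduce on a small scratch file. This is a sketch assuming GNU dd, and assuming the original command used bs=1M together with conv=sync, as the answer above suggests. The file names are made up.

```shell
#!/bin/sh
# Sketch: show how conv=sync pads the final partial block with zeros.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# 1000-byte source: one full 512-byte block plus a 488-byte partial block.
head -c 1000 /dev/zero > "$workdir/src.bin"

dd if="$workdir/src.bin" of="$workdir/plain.img" bs=512 2>/dev/null
dd if="$workdir/src.bin" of="$workdir/padded.img" bs=512 conv=sync 2>/dev/null

plain=$(wc -c < "$workdir/plain.img" | tr -d ' ')
padded=$(wc -c < "$workdir/padded.img" | tr -d ' ')
echo "plain=$plain padded=$padded"

# Same rounding applied to the disk size from the question with bs=1M (1048576 bytes).
disk=500107862016
bs=1048576
rounded=$(( (disk + bs - 1) / bs * bs ))
echo "rounded=$rounded"
```

Rounding the disk size up to the next multiple of 1 MiB gives exactly the image size reported in the question, which supports the padding explanation.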

Does tee forward data that has not made it into the file?

I'm zeroing a new hard disk like so:
pv /dev/zero | tee /dev/sdb | sha1sum -
The idea is that I will zero the disk and simultaneously compute a checksum of however many zeros got written. Then I'll sha1sum the block device and see if it matches the data that I originally wrote to it.
The question is, what happens when "tee" runs out of space on the device and terminates? Say the block device is exactly 1 million bytes; tee will obviously fill it with 1 million zero bytes, but will it forward exactly 1 million zero bytes to sha1sum?
Answer to the original question:
No, tee will not stop writing to stdout at precisely the point at which a write to a file specified in an argument fails.
But I don't see that it matters much. It appears that your goal is to ensure that the entire disk has been overwritten with zeros, without worrying about how big the disk is. So reading the disk and comparing every block read to a block of zeros should suffice. You can do that with cmp /dev/sdb /dev/zero. If it says "EOF on /dev/sdb", then all the bytes were 0.
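Here is a small sketch of the cmp check, run against a scratch file instead of a real device. The name zeros.img is a hypothetical stand-in for /dev/sdb. It relies on cmp reporting "EOF on ..." when the shorter input ends without any difference found.

```shell
#!/bin/sh
# Sketch: verify that a file (or device) contains only zero bytes using cmp.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
img="$workdir/zeros.img"     # hypothetical stand-in for /dev/sdb

head -c 1048576 /dev/zero > "$img"

# cmp exits non-zero when it hits end-of-file on the shorter input, so capture
# its message instead of letting set -e abort the script.
msg=$(LC_ALL=C cmp "$img" /dev/zero 2>&1 || true)
case "$msg" in
  *"EOF on"*) result="all zero" ;;
  *)          result="non-zero data found" ;;
esac
echo "$result"
```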
For what it's worth, I thought of another way to do the same thing, albeit a little indirectly:
pv /dev/zero | dd bs=100M of=/dev/sdb 2> log
dd's report (captured in "log") should contain a precise count of bytes written, and you can use that to compute the sha1sum (or, alternatively, diff the block device against a generated stream of exactly that many zeros).
(bs=100M is because dd's default block size is 512 bytes, which turns out not to be performant in my use case)
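A sketch of that second approach on a scratch file, assuming GNU dd and its English summary line ("N bytes ... copied"). The stream is bounded with head here because a regular file never fills up the way a device does; target.img is a made-up name.

```shell
#!/bin/sh
# Sketch: take the byte count from dd's report and use it to build the expected checksum.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
target="$workdir/target.img"   # hypothetical stand-in for /dev/sdb
log="$workdir/log"

# Write a bounded stream of zeros; a real device would end the stream by filling up.
head -c 3000000 /dev/zero | LC_ALL=C dd bs=64K of="$target" 2> "$log"

# GNU dd's summary line starts with the byte count: "3000000 bytes (...) copied, ..."
written=$(awk '/bytes/ { print $1; exit }' "$log")

# Hash exactly that many zeros and compare with what actually landed on the target.
expected=$(head -c "$written" /dev/zero | sha1sum | cut -d' ' -f1)
actual=$(sha1sum < "$target" | cut -d' ' -f1)
echo "written=$written"
if [ "$expected" = "$actual" ]; then echo "checksums match"; fi
```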

Use Linux dd to copy and read file at specified location

I have a destination drive which I know uses 512-byte sectors. I want to transfer, let's say, a 150-byte file with dd to this drive at a given location, let's say start sector 2099200, and then read back exactly the same number of bytes as the file size (150 bytes) from the same sector. I tried something like this:
sudo dd if=my.txt of=/dev/sdb obs=512 seek=2099199
sudo dd if=/dev/sdb of=my.txt obs=150 count=1 ibs=512 skip=2099199
It almost works, but I can't make it transfer only 150 bytes:
1+0 records in
3+1 records out
512 bytes (512 B) copied
What is wrong, and how do I do what I need? Maybe I've got it wrong and some other solution would be better, but I need to be file system independent.
From the man page:
count=BLOCKS
copy only BLOCKS input blocks
When you copy the file back from the drive, you are copying 512 bytes because you specify 512-byte input blocks with the ibs option and you copy one whole block with the count option. Instead, you could just specify the number of bytes you wish to copy as your ibs value:
sudo dd if=/dev/sdb of=my.txt ibs=150 count=1 skip=2099199
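EDIT: As pointed out in the comments, this method would require recomputing the skip value, because skip is counted in ibs-sized blocks. An alternative is to read one whole sector and let a second dd keep only the first 150 bytes:
sudo dd if=/dev/sdb ibs=512 count=1 skip=2099199 | dd bs=1 count=150 of=my.txt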
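If you want to rehearse the round trip before pointing it at /dev/sdb, the following sketch does the same thing against a scratch image. The names disk.img and my.bin and the sector number are illustrative only, and it assumes GNU coreutils. It uses head -c to trim the sector, which is just one more way to keep the first N bytes.

```shell
#!/bin/sh
# Sketch: write a short file at a sector offset, then read back exactly its length.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
img="$workdir/disk.img"      # hypothetical stand-in for /dev/sdb
sector=2048                  # illustrative start sector (the question uses 2099199)

truncate -s $((4096 * 512)) "$img"          # 2 MiB scratch "drive"
head -c 150 /dev/urandom > "$workdir/my.bin"
len=$(wc -c < "$workdir/my.bin" | tr -d ' ')

# Write at the sector offset; conv=notrunc leaves the rest of the image untouched.
dd if="$workdir/my.bin" of="$img" bs=512 seek="$sector" conv=notrunc 2>/dev/null

# Read one whole sector, then keep only the first $len bytes of it.
dd if="$img" bs=512 skip="$sector" count=1 2>/dev/null | head -c "$len" > "$workdir/readback.bin"

if cmp -s "$workdir/my.bin" "$workdir/readback.bin"; then roundtrip=ok; else roundtrip=mismatch; fi
echo "roundtrip=$roundtrip"
```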

Quickly create a large file on a Linux system

How can I quickly create a large file on a Linux (Red Hat Linux) system?
dd will do the job, but reading from /dev/zero and writing to the drive can take a long time when you need a file several hundreds of GBs in size for testing... If you need to do that repeatedly, the time really adds up.
I don't care about the contents of the file, I just want it to be created quickly. How can this be done?
Using a sparse file won't work for this. I need the file to be allocated disk space.
dd from the other answers is a good solution, but it is slow for this purpose. In Linux (and other POSIX systems), we have fallocate, which reserves the desired space without actually writing to it. It works with most modern disk-based file systems and is very fast.
For example:
fallocate -l 10G gentoo_root.img
This is a common question -- especially in today's environment of virtual environments. Unfortunately, the answer is not as straight-forward as one might assume.
dd is the obvious first choice, but dd is essentially a copy and that forces you to write every block of data (thus, initializing the file contents)... And that initialization is what takes up so much I/O time. (Want to make it take even longer? Use /dev/random instead of /dev/zero! Then you'll use CPU as well as I/O time!) In the end though, dd is a poor choice (though essentially the default used by the VM "create" GUIs). E.g:
dd if=/dev/zero of=./gentoo_root.img bs=4k iflag=fullblock,count_bytes count=10G
truncate is another choice -- and is likely the fastest... But that is because it creates a "sparse file". Essentially, a sparse file is a section of disk that has a lot of the same data, and the underlying filesystem "cheats" by not really storing all of the data, but just "pretending" that it's all there. Thus, when you use truncate to create a 20 GB drive for your VM, the filesystem doesn't actually allocate 20 GB, but it cheats and says that there are 20 GB of zeros there, even though as little as one track on the disk may actually (really) be in use. E.g.:
truncate -s 10G gentoo_root.img
fallocate is the final -- and best -- choice for use with VM disk allocation, because it essentially "reserves" (or "allocates") all of the space you're seeking, but it doesn't bother to write anything. So, when you use fallocate to create a 20 GB virtual drive space, you really do get a 20 GB file (not a "sparse file"), and you won't have bothered to write anything to it -- which means virtually anything could be in there -- kind of like a brand new disk! E.g.:
fallocate -l 10G gentoo_root.img
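To see the difference between the sparse and the preallocated variant yourself, compare the apparent size with the number of allocated blocks. This is a sketch assuming GNU coreutils (truncate, stat -c) and util-linux fallocate. The sizes are scaled down to 10M and the file names are made up. Since fallocate can fail on filesystems without preallocation support, the script treats that as a normal outcome.

```shell
#!/bin/sh
# Sketch: apparent size vs. allocated blocks for truncate and fallocate.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

truncate -s 10M "$workdir/sparse.img"
sparse_size=$(stat -c %s "$workdir/sparse.img")     # apparent size in bytes
sparse_blocks=$(stat -c %b "$workdir/sparse.img")   # allocated blocks (unit given by %B)
echo "truncate:  size=$sparse_size blocks=$sparse_blocks"

# fallocate may be unsupported on the current filesystem, so do not abort on failure.
if fallocate -l 10M "$workdir/alloc.img" 2>/dev/null; then
    echo "fallocate: size=$(stat -c %s "$workdir/alloc.img") blocks=$(stat -c %b "$workdir/alloc.img")"
else
    echo "fallocate: not supported on this filesystem"
fi
```

On a filesystem that supports both, the truncate file should show few or no allocated blocks while the fallocate file shows blocks covering the full size.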
Linux & all filesystems
xfs_mkfile 10240m 10Gigfile
Linux & some filesystems (ext4, xfs, btrfs and ocfs2)
fallocate -l 10G 10Gigfile
OS X, Solaris, SunOS and probably other UNIXes
mkfile 10240m 10Gigfile
HP-UX
prealloc 10Gigfile 10737418240
Explanation
Try mkfile <size> myfile as an alternative to dd. With the -n option the size is noted, but disk blocks aren't allocated until data is written to them. Without the -n option, the space is zero-filled, which means writing to the disk, which means taking time.
mkfile is derived from SunOS and is not available everywhere. Most Linux systems have xfs_mkfile which works exactly the same way, and not just on XFS file systems despite the name. It's included in xfsprogs (for Debian/Ubuntu) or similar named packages.
Most Linux systems also have fallocate, which only works on certain file systems (such as btrfs, ext4, ocfs2, and xfs), but is the fastest, as it allocates all the file space (creates non-holey files) but does not initialize any of it.
truncate -s 10M output.file
will create a 10 M file instantaneously (M stands for 1024*1024 bytes, MB stands for 1000*1000 - same with K, KB, G, GB...)
EDIT: as many have pointed out, this will not physically allocate the file on your device. With this you could actually create an arbitrary large file, regardless of the available space on the device, as it creates a "sparse" file.
For example, notice that no HDD space is consumed by this command:
### BEFORE
$ df -h | grep lvm
/dev/mapper/lvm--raid0-lvm0
7.2T 6.6T 232G 97% /export/lvm-raid0
$ truncate -s 500M 500MB.file
### AFTER
$ df -h | grep lvm
/dev/mapper/lvm--raid0-lvm0
7.2T 6.6T 232G 97% /export/lvm-raid0
So, when doing this, you will be deferring physical allocation until the file is accessed. If you're mapping this file to memory, you may not have the expected performance.
But this is still a useful command to know. For example, when benchmarking transfers using files, the specified size of the file will still get moved.
$ rsync -aHAxvP --numeric-ids --delete --info=progress2 \
root@mulder.bub.lan:/export/lvm-raid0/500MB.file \
/export/raid1/
receiving incremental file list
500MB.file
524,288,000 100% 41.40MB/s 0:00:12 (xfr#1, to-chk=0/1)
sent 30 bytes received 524,352,082 bytes 38,840,897.19 bytes/sec
total size is 524,288,000 speedup is 1.00
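The M versus MB distinction mentioned above is easy to check on a scratch file. A quick sketch, assuming GNU truncate and stat; the file names are made up.

```shell
#!/bin/sh
# Sketch: binary (M) vs. decimal (MB) size suffixes in GNU truncate.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

truncate -s 10M  "$workdir/binary.file"    # 10 * 1024 * 1024 bytes
truncate -s 10MB "$workdir/decimal.file"   # 10 * 1000 * 1000 bytes

binary=$(stat -c %s "$workdir/binary.file")
decimal=$(stat -c %s "$workdir/decimal.file")
echo "10M=$binary 10MB=$decimal"
```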
Where seek is the size of the file you want in bytes - 1.
dd if=/dev/zero of=filename bs=1 count=1 seek=1048575
Examples where seek is the size of the file you want in bytes
#kilobytes
dd if=/dev/zero of=filename bs=1 count=0 seek=200K
#megabytes
dd if=/dev/zero of=filename bs=1 count=0 seek=200M
#gigabytes
dd if=/dev/zero of=filename bs=1 count=0 seek=200G
#terabytes
dd if=/dev/zero of=filename bs=1 count=0 seek=200T
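Both seek variants can be checked on small scratch files. This sketch assumes GNU dd and stat; the file names a and b are made up. Like truncate, these commands produce sparse files, so only the apparent size is being set.

```shell
#!/bin/sh
# Sketch: set a file's size with dd seek, with and without writing a final byte.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Write one byte at offset 1048575: resulting size is 1048576 bytes.
dd if=/dev/zero of="$workdir/a" bs=1 count=1 seek=1048575 2>/dev/null

# Write nothing, just extend the file to the seek position: 200K = 204800 bytes.
dd if=/dev/zero of="$workdir/b" bs=1 count=0 seek=200K 2>/dev/null

size_a=$(stat -c %s "$workdir/a")
size_b=$(stat -c %s "$workdir/b")
echo "a=$size_a b=$size_b"
```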
From the dd manpage:
BLOCKS and BYTES may be followed by the following multiplicative suffixes: c=1, w=2, b=512, kB=1000, K=1024, MB=1000*1000, M=1024*1024, GB=1000*1000*1000, G=1024*1024*1024, and so on for T, P, E, Z, Y.
To make a 1 GB file:
dd if=/dev/zero of=filename bs=1G count=1
I don't know a whole lot about Linux, but here's the C Code I wrote to fake huge files on DC Share many years ago.
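#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int i;
    FILE *fp;

    fp = fopen("bigfakefile.txt", "w");
    if (fp == NULL)
        return EXIT_FAILURE;
    /* Jump 1 MiB ahead, write a single byte, and repeat: the gaps stay as holes. */
    for (i = 0; i < (1024 * 1024); i++) {
        fseek(fp, (1024 * 1024), SEEK_CUR);
        fprintf(fp, "C");
    }
    fclose(fp);
    return EXIT_SUCCESS;
}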
You can also use the "yes" command. The syntax is fairly simple:
# yes >> myfile
Press "Ctrl + C" to stop it, otherwise it will eat up all your available space.
To empty the file afterwards, run:
# > myfile
I don't think you're going to get much faster than dd. The bottleneck is the disk; writing hundreds of GB of data to it is going to take a long time no matter how you do it.
But here's a possibility that might work for your application. If you don't care about the contents of the file, how about creating a "virtual" file whose contents are the dynamic output of a program? Instead of open()ing the file, use popen() to open a pipe to an external program. The external program generates data whenever it's needed. Once the pipe is open, the program that opened it can read from it like a regular file stream, with one caveat: a pipe is not seekable, so fseek() and rewind() will not work on it. You'll need to use pclose() instead of fclose() when you're done with the pipe.
If your application needs the file to be a certain size, it will be up to the external program to keep track of where in the "file" it is and send an eof when the "end" has been reached.
One approach: if you can guarantee unrelated applications won't use the files in a conflicting manner, just create a pool of files of varying sizes in a specific directory, then create links to them when needed.
For example, have a pool of files called:
/home/bigfiles/512M-A
/home/bigfiles/512M-B
/home/bigfiles/1024M-A
/home/bigfiles/1024M-B
Then, if you have an application that needs a 1G file called /home/oracle/logfile, execute a "ln /home/bigfiles/1024M-A /home/oracle/logfile".
If it's on a separate filesystem, you will have to use a symbolic link.
The A/B/etc files can be used to ensure there's no conflicting use between unrelated applications.
The link operation is about as fast as you can get.
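A scaled-down sketch of the pool idea follows. The directory layout and file names are invented for the demo, and 1M files stand in for the 1024M ones. A hard link gives the second name the same inode, so no data is copied.

```shell
#!/bin/sh
# Sketch: hand out a pre-created pool file via a hard link.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir "$workdir/bigfiles" "$workdir/oracle"     # hypothetical pool and application dirs

# Pre-create a pool file (scaled down; created with truncate only to keep the demo fast).
truncate -s 1M "$workdir/bigfiles/1M-A"

# "Allocate" it to the application: same filesystem, so a hard link works.
ln "$workdir/bigfiles/1M-A" "$workdir/oracle/logfile"

pool_inode=$(stat -c %i "$workdir/bigfiles/1M-A")
app_inode=$(stat -c %i "$workdir/oracle/logfile")
links=$(stat -c %h "$workdir/oracle/logfile")
echo "pool_inode=$pool_inode app_inode=$app_inode links=$links"
```

In real use the pool files would be fully written or preallocated ahead of time, so that the application gets space that is already allocated.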
The GPL mkfile is just a (ba)sh script wrapper around dd; BSD's mkfile just memsets a buffer with non-zero and writes it repeatedly. I would not expect the former to out-perform dd. The latter might edge out dd if=/dev/zero slightly since it omits the reads, but anything that does significantly better is probably just creating a sparse file.
Absent a system call that actually allocates space for a file without writing data (and Linux and BSD lack this, probably Solaris as well) you might get a small improvement in performance by using ftruncate(2)/truncate(1) to extend the file to the desired size, mmap the file into memory, then write non-zero data to the first bytes of every disk block (use fpathconf to find the disk block size).
This is the fastest I could do (which is not fast) with the following constraints:
The goal of the large file is to fill a disk, so can't be compressible.
Using ext3 filesystem. (fallocate not available)
This is the gist of it...
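#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int32_t buf[256]; // One block: 256 * 4 = 1024 bytes.
    for (int i = 0; i < 256; ++i)
    {
        buf[i] = rand(); // random to be non-compressible.
    }
    FILE* file = fopen("/file/on/your/system", "wb");
    if (file == NULL)
    {
        return EXIT_FAILURE;
    }
    int blocksToWrite = 1024 * 1024; // 1024 * 1024 blocks of 1024 bytes = 1 GB
    for (int i = 0; i < blocksToWrite; ++i)
    {
        fwrite(buf, sizeof(int32_t), 256, file);
    }
    fclose(file);
    return EXIT_SUCCESS;
}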
In our case this is for an embedded linux system and this works well enough, but would prefer something faster.
FYI, the command dd if=/dev/urandom of=outputfile bs=1024 count=XX was so slow as to be unusable.
Shameless plug: OTFFS provides a file system providing arbitrarily large (well, almost. Exabytes is the current limit) files of generated content. It is Linux-only, plain C, and in early alpha.
See https://github.com/s5k6/otffs.
So I wanted to create a large file with repeated ascii strings. "Why?" you may ask. Because I need to use it for some NFS troubleshooting I'm doing. I need the file to be compressible because I'm sharing a tcpdump of a file copy with the vendor of our NAS. I had originally created a 1g file filled with random data from /dev/urandom, but of course since it's random, it means it won't compress at all and I need to send the full 1g of data to the vendor, which is difficult.
So I created a file with all the printable ascii characters, repeated over and over, to a limit of 1g in size. I was worried it would take a long time. It actually went amazingly quickly, IMHO:
cd /dev/shm
date
time yes $(for ((i=32;i<127;i++)) do printf "\\$(printf %03o "$i")"; done) | head -c 1073741824 > ascii1g_file.txt
date
Wed Apr 20 12:30:13 CDT 2022
real 0m0.773s
user 0m0.060s
sys 0m1.195s
Wed Apr 20 12:30:14 CDT 2022
Copying it from an nfs partition to /dev/shm took just as long as with the random file (which one would expect, I know, but I wanted to be sure):
cp ascii1g_file.txt /home/greygnome/
uptime; free -m; sync; echo 1 > /proc/sys/vm/drop_caches; free -m; date; dd if=/home/greygnome/ascii1g_file.txt of=/dev/shm/outfile bs=16384 2>&1; date; rm -f /dev/shm/outfile
But while doing that I ran a simultaneous tcpdump:
tcpdump -i em1 -w /dev/shm/dump.pcap
I was able to compress the pcap file down to 12M in size! Awesomesauce!
Edit: Before you ding me because the OP said, "I don't care about the contents," know that I posted this answer because it's one of the first replies to "how to create a large file linux" in a Google search. And sometimes, disregarding the contents of a file can have unforeseen side effects.
Edit 2: And fallocate seems to be unavailable on a number of filesystems, and creating a 1GB compressible file in 1.2s seems pretty decent to me (aka, "quickly").
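For anyone who wants to try the idea at a smaller scale first, here is a POSIX sh sketch that builds the same printable-ASCII pattern, writes 1 MiB of it, and checks how well it compresses. The 1 MiB size and the file name are my choices for the demo, and gzip is assumed to be installed.

```shell
#!/bin/sh
# Sketch: generate a compressible file of repeated printable ASCII, scaled down to 1 MiB.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
out="$workdir/ascii1m_file.txt"    # hypothetical output name

# Build the 95 printable ASCII characters (codes 32..126) from their octal escapes.
pattern=""
i=32
while [ "$i" -lt 127 ]; do
    pattern="$pattern$(printf '%b' "\\0$(printf '%03o' "$i")")"
    i=$((i + 1))
done

# yes repeats the pattern line forever; head cuts the stream at exactly 1 MiB.
yes "$pattern" | head -c 1048576 > "$out"

size=$(wc -c < "$out" | tr -d ' ')
packed=$(gzip -c "$out" | wc -c | tr -d ' ')
echo "size=$size gzip=$packed"
```

Quoting the pattern keeps the shell from word-splitting or glob-expanding it, which the unquoted one-liner above leaves to chance.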
You could use https://github.com/flew-software/trash-dump
It lets you create a file of any size filled with random data.
Here's a command you can run after installing trash-dump (it creates a 1GB file):
$ trash-dump --filename="huge" --seed=1232 --noBytes=1000000000
BTW, I created it.
