I am trying to image a hard disk which is failing.
The disk routinely drops out during the imaging process, which causes the program to fail; when the disk is re-recognised by the system, it appears under a different address (/dev/sdb becomes /dev/sde).
I have tried imaging each partition independently, but on a 500GB disk I am struggling to get past 100GB per session before the disk drops (I think the head is going, as it clicks).
My question is: using dd, is there a way to image the disk in, say, 50GB parts, so that I can capture the whole disk over a number of images and then consolidate them?
Or, better still, is there a way to force the disk to re-identify at its previous location?
I have found little information on this topic so any insight would be useful.
Thanks.
When the device is lost, your stream is lost too. You cannot recover it even if the device gets the same name assigned. However, you might want to employ udev rules to get the same name back, just for your convenience.
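If you go the udev route, a rule along these lines pins a stable symlink to the disk; the rule file name and serial number below are placeholders for your disk's values (udevadm info on the device will show the real serial):

# /etc/udev/rules.d/99-rescue-disk.rules  (example file name)
# Match the failing disk by serial number and always expose it as /dev/rescuedisk
KERNEL=="sd?", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-XYZ123", SYMLINK+="rescuedisk"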
In dd, you can use four useful parameters:
bs=BYTES the size of a "block"
skip=N number of blocks to skip in input
seek=N number of blocks to skip in output
count=N number of blocks to be copied (we don't need it here)
Also, dd has a somewhat hidden feature for providing progress reports. You can either pass status=progress or send a signal to the process. The latter is more complicated, but it lets you define the frequency of the reports. For example, you can run this in another terminal:
for ((;;)); do sleep 1; kill -USR1 `pidof -s dd`; done
Putting all of this together: you can use bs=4M as a reasonable block size. Run the aforementioned command in a secondary terminal, then start dd, initially with
dd bs=4M seek=0 skip=0 if=/dev/… of=…
After it fails the first time, use the last block number that dd successfully copied as the value for both seek and skip. You can be a bit conservative here (decrease the number slightly) to ensure you don't leave any "holes" in your output.
Repeat until the whole disk is done. Good luck!
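To avoid retyping the numbers on every retry, the repetition can be scripted. This is only a sketch: the /dev/disk/by-id path (which, unlike /dev/sdb, stays stable across reconnects), the image location, the state file, and the 500GB total are all assumptions you would adjust:

#!/bin/bash
BS=4M
CHUNK=12800                        # 50 GB expressed in 4M blocks (50*1024/4)
TOTAL=128000                       # ~500 GB in 4M blocks (500*1024/4)
DISK=/dev/disk/by-id/ata-EXAMPLE   # placeholder: by-id paths survive sdb -> sde renames
IMG=/mnt/backup/disk.img
STATE=/mnt/backup/disk.offset

OFFSET=$(cat "$STATE" 2>/dev/null || echo 0)
while [ "$OFFSET" -lt "$TOTAL" ]; do
    # conv=notrunc keeps the chunks already written to the image intact
    dd bs=$BS skip="$OFFSET" seek="$OFFSET" count="$CHUNK" \
       if="$DISK" of="$IMG" conv=notrunc || break    # stop when the disk drops
    OFFSET=$((OFFSET + CHUNK))
    echo "$OFFSET" > "$STATE"                        # record the resume point
done
echo "copied up to block $OFFSET; re-run after the disk comes back"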
I have a data API, implemented in Rust as an independent service process, which produces stream data. I plan to write several client processes to read that data, because the clients have functionality based on the Apache Arrow data types.
I think this might be a single-producer, multiple-consumer problem. What is the best practice for exchanging this Apache Arrow data between different processes with low latency?
The easiest way to do this is to send the data across a socket, preferably with something like Flight (Arrow's RPC framework). That would be my recommendation until you prove that its performance is insufficient.
Since these are on the same machine, you can potentially use something like memory-mapped files. The producer first creates a memory-mapped file (you need to know the size up front; I'm not sure exactly how to determine it, but you can easily find the size of the buffers and make a conservative guess for the metadata) and then writes the data to it. Make sure to write the data in the Arrow format with no compression. This involves one memory copy, from user space to kernel space. You would then send the filename to the consumers over a socket.
Then, on the consumers, you can open the file as a memory-mapped file and work with the data (presumably in a read-only fashion, since there are multiple consumers). This read will be zero-copy, as the file should still be in the kernel's page cache. If the kernel is swapping, however, this approach is likely to backfire.
I hope this is the right place to ask this question.
While trying to move a big directory "mydirname" (about 900GB) on a remote Linux server from /abc/source to /xyz/target, I used the following command in the source directory:
mv mydirname /xyz/target/ &
However, after a while the process got interrupted with errors:
mv: cannot stat `mydirname/GS9/set04/trans/run.3/acc': Stale file handle
mv: cannot stat `mydirname/GS9/set04/trans/run.4/amc': Stale file handle
...
and many more such messages mentioning different subdirectory locations.
The problem is that the process had moved about 300GB of data, but there are many directories which were not fully moved. A similar problem occurred with another transfer (about 500GB) running on the same machine.
Also, I am no longer in the same working session; I have disconnected from and reconnected to the remote server.
It would be great if you could help with the following queries:
Is it possible that some of the files are not fully transferred? (I have seen such cases with the cp command, where an interrupted process results in a smaller file at the destination.)
How can I resume the process so that I do not lose any data? Will the mv command be enough, or is there a special command that can work in the background?
Or is there a command to undo the process and restore mydirname to its original source location?
Use "rsync" to complete a job like this:
rsync -av --delete mydirname/ /xyz/target
It will verify that all files have been moved with the proper length and correct timestamps, and it will delete any leftover garbage in the target.
You can test first with a "dry run" to see what the damage is:
rsync -avn --delete mydirname/ /xyz/target
This goes through the whole rsync process but doesn't actually change anything. It's usually a good idea to run this test to check your command syntax and see whether it will do what you expect.
The "rsync" command is actually more like a copy "cp" than a move "mv". It will leave the source files in place and you can delete them later when you are satisfied that everthing has transferred correctly.
I'm using Ubuntu 14.04, and I made an empty directory in /tmp with the mkdir command:
cd /tmp
mkdir foo
and then I checked its size using ls:
ls -ldh foo
and the result shows that the size of the directory is 4KB, although it has nothing inside!
then I created an empty file with touch:
touch empty
and then I checked its size:
ls -l empty
The result shows that the empty file is 0B, which differs from the empty directory.
I've read about some Q&A's saying that the 4KB is the metadata of the directory. But if it is the metadata, what kind of information is stored inside and why it is so huge, and why an empty file don't have such kind of metadata? If it is not the metadata, what does the 4KB mean?
I'm going to break this question down into 3 parts, 2 of which I can answer...
Part 1: why isn't an empty directory size 0?
Because it contains . and .. so it's not really empty.
Part 2: Why is 4K the minimum?
Because that's the filesystem's block size. You can set it smaller when you create the filesystem, but there is overhead. The filesystem must remember a free-or-in-use flag for every block, so smaller blocks = more blocks = more overhead. (In the early days of ext2, the default block size was 1K. Disks were small enough that the space saved by not allocating a multiple of 4K for every file was more important than the space used for the free block map.)
Block sizes over 4K aren't possible because 4K is the page size (the smallest unit of virtual memory) on most processors, and Linux wasn't designed to deal with filesystem blocks bigger than memory pages.
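You can check both numbers on your own system, for example like this (assuming an ext4 filesystem on /dev/sda1, which you would adjust):

getconf PAGESIZE                                # CPU page size, typically 4096
sudo tune2fs -l /dev/sda1 | grep 'Block size'   # ext2/3/4 filesystem block size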
Part 3: When you ls -l a regular file, you get the actual number of bytes used, but when you ls -ld a directory, you get the number of bytes allocated. Why?
This part I don't know. For regular files, there is an allocation size you can view with ls -s, and the two sizes actually tell you different things. But for directories, the -l size is just a redundant copy of the -s size. Presumably the kernel could report a size indicating how much of the 4K block is actually used, but it doesn't; I don't know why.
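You can see the two kinds of size side by side; a hypothetical session using the /tmp example from the question:

$ cd /tmp
$ ls -s1 empty foo          # allocated size, in 1K blocks
0 empty
4 foo
$ stat -c '%n: %s bytes, %b blocks' foo empty   # %b counts 512-byte units
foo: 4096 bytes, 8 blocks
empty: 0 bytes, 0 blocks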
The metadata a directory contains is a series of directory entries. It's not empty upon creation, because two dirents are created immediately: one for the directory itself, called ".", and one for its parent directory, called "..".
I would like to know how to copy a Linux partition (for example /dev/sda1) to a USB stick, and then boot from the USB stick.
I tried just copying it with the cp command, but when I tried to boot from the stick, it booted into the partition I had copied (/dev/sda1) and not the USB.
In short, what I want is to create a USB stick with my Linux partition on it that I can boot from on any computer.
Thank you.
cp is great for copying files, but you should consider it too high-level for copying partitions. When you copy a partition, you read from a device file and write to another device file, a normal file, or whatever. With cp, many file attributes might be changed: modification time, owner, permissions, etc. That isn't acceptable for partition copies; for example, files owned by root should still be owned by root, and ~/.ssh/config should still have permissions 600.
The program for this task is dd, which copies bit by bit. You specify an input file and an output file:
dd if=/dev/sda of=/dev/sdf bs=512
This copies the contents of /dev/sda to /dev/sdf, reading 512 bytes at a time (bs = block size). After some time it will finish and report some statistics. To get statistics during the copy, send the SIGUSR1 signal to the dd process, or pass status=progress with GNU dd.
Please beware that dd is a dangerous tool if used incorrectly: for example, it won't ask for permission to overwrite your 10,000-picture vacation album; it simply does it. Make sure to specify the correct device files!
You also have to take care that the sizes of source and destination fit: the destination must be at least as large as the source. Copying a 500GB hard disk to a 4GB USB stick won't work.
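You can verify this before writing anything; blockdev prints device sizes in bytes (the device names here are just the examples from above):

blockdev --getsize64 /dev/sda    # size of the source disk, in bytes
blockdev --getsize64 /dev/sdf    # size of the destination stick, in bytes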
Copying a whole hard disk also copies the boot loader. One possible issue is that entries in the boot loader configuration may reference the wrong disks. However, starting the boot loader itself should be no problem (provided the architecture matches). If you use GRUB, you even get a command line, which you can use to boot the system manually.
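From that GRUB command line, a manual boot looks roughly like this; the device, partition, and kernel paths are illustrative, and tab completion on your system will reveal the real ones:

grub> set root=(hd1,1)
grub> linux /vmlinuz root=/dev/sdb1 ro
grub> initrd /initrd.img
grub> boot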
Also, change your BIOS settings so that the first boot device is USB.
Let's assume I have a hard drive with some Linux distribution on it. My task is to set up a similar system (with a similar distro, kernel version, software versions, etc.) on another hard drive. How can I do that if:
Case a: I'm allowed to use any software I want (including software like VirtualBox to make a full image of the system)?
Case b: I'm not allowed to use anything but standard Linux utilities to retrieve the characteristics I need, and must then install a "fresh" system on the other hard drive manually?
Thanks for reading. It's very hard for me to express what I mean; I hope you understand.
One word: CloneZilla
It can clone partitions and disks, and it copies the boot record. You can boot it from CD, from a USB drive, or even over the network (PXE).
You could go with dd, but it's slow because it copies everything, even the empty space on the disk, and if your partitions are not the same size you can run into various problems, so I do not recommend dd.
You could also boot the system from a live CD like Knoppix, mount the partitions, and copy everything using cp -a, running something like watch df in a second terminal to monitor the progress. But even then you need to fix up the bootloader after the copy is done.
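A minimal sketch of that manual route from a live CD; the device names and mount points are examples:

mount /dev/sda1 /mnt/src          # source root partition
mount /dev/sdb1 /mnt/dst          # destination, already partitioned and formatted
cp -a /mnt/src/. /mnt/dst/        # -a preserves owners, permissions, and symlinks
watch df -h /mnt/dst              # run in a second terminal to watch progress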
I used to use various manual ways to clone Linux systems in the past, until I discovered CloneZilla. Life is much easier since then.
Easiest way is to use dd from the command prompt.
dd if=/dev/sda of=/dev/sdb bs=8192
dd (the "disk duplicator") is used for exactly this purpose; I would check the man page to ensure my blocksize argument is correct, though. The other two arguments are if (in file) and of (out file). The of= hard drive should be the same size as, or larger than, the if= hard drive.
You can create an exact copy of the system on the first disk with dd or cpio and a live CD.
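For the cpio variant, the classic copy-pass idiom looks like this; the mount points are examples, and -xdev keeps find from crossing into other filesystems:

cd /mnt/src
find . -xdev -print0 | cpio -pdm0 /mnt/dst   # -p copy-pass, -d make dirs, -m keep mtimes, -0 null-separated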