Linux: redirecting 100GB of stdout to a file fails

I have this command that writes over 100GB of data to a file.
zfs send snap1 > file
Something appears to go wrong several hours into the process. E.g., if I run the job twice, the output is slightly different. If I try to process the file with
zfs receive snap2 < file
an error is reported after several hours.
For debugging purposes, I'm guessing that there's some low-probability failure in the shell redirection. Has anyone else seen problems with redirecting massive amounts of data? Any suggestions about how to proceed?
Debugging this is tedious because small examples work, and running the large case takes over 3 hours each time.
Earlier I had tried pipes:
zfs send snap1 | zfs receive snap2
However this always failed with much smaller examples, for which
zfs send snap1 > file; zfs receive snap2 < file
worked. (I posted a question about that, but got no useful responses.) This is another reason that I suspect the shell.
Thanks.

The probability that the failure is in the shell (or the OS) is negligible compared to the probability of a bug in zfs or a problem in how you are using it.
It only takes a few minutes to test your hypothesis: compile this stupid little program:
#include <unistd.h>
#include <string.h>

#define BUF (1 << 20)   /* 1 MiB per read/write */
#define INPUT 56        /* byte value every buffer is filled with */

int main(int argc, char *argv[])
{
    char buf[BUF], rbuf[BUF], *a, *b;
    ssize_t len;
    int i;

    memset(buf, INPUT, sizeof(buf));
    if (argc == 1) {
        /* Reader: compare everything arriving on stdin against the pattern. */
        while ((len = read(0, rbuf, sizeof(rbuf))) > 0) {
            a = buf;
            b = rbuf;
            for (i = 0; i < len; ++i) {
                if (*a != *b)
                    return 1;   /* a byte got corrupted somewhere */
                ++a;
                ++b;
            }
        }
    } else {
        /* Writer: pump the pattern to stdout forever. */
        while (write(1, buf, sizeof(buf)) > 0)
            ;
    }
    return 0;
}
Then try mkfifo a; ./a.out w > a in one shell and pv < a | ./a.out in another, and see how long it takes to get any bit flip.
It should get into the TiB region relatively quickly...

Related

'echo' calls .write function INFINITE times

Context
I wrote a Linux device driver in which the read and write functions are implemented. The problem is with the write function; here is the relevant portion of the code:
ssize_t LED_01_write(struct file *filp, const char __user *buf, size_t count, loff_t *f_pos)
{
    int retval = 0;
    PDEBUG(" reading from user space -> writing in kernel space\n");
    //struct hello_dev *dev = filp->private_data;

    if (count > COMMAND_MAX_LENGHT) {
        printk(KERN_WARNING "[LEO] LED_01: trying to write more than possible. Aborting write\n");
        retval = -EFBIG;
        goto out;
    }

    if (down_interruptible(&(LED_01_devices->sem_LED_01))) {
        printk(KERN_WARNING "[LEO] LED_01: Device was busy. Operation aborted\n");
        return -ERESTARTSYS;
    }

    if (copy_from_user((void *)&(LED_01_devices->LED_value), buf, count)) {
        printk(KERN_WARNING "[LEO] LED_01: can't use copy_from_user.\n");
        retval = -EPERM;
        goto out_and_Vsem;
    }

    write_status_to_LED();
    PDEBUG(" Value inserted: %u \n", LED_01_devices->LED_value);

out_and_Vsem:
    write_times++;
    up(&(LED_01_devices->sem_LED_01));
out:
    return retval;
}
Question
If I use the module from a compiled C program, it works properly, as expected.
When I execute echo -n 1 > /dev/LED_01 from the command line, it writes infinitely and doesn't stop even with Ctrl+C; I need to reboot.
Here is a snippet of the test function that works properly:
// ON
result = write(fd, (void *)ON_VALUE, 1);
if (result != 0) {
    printf("Oh dear, something went wrong with write()! %s\n", strerror(errno));
} else {
    printf("write operation executed successfully (%u)\n", ON_VALUE[0]);
}
Is the problem in the driver or in the way I use echo?
If you need the whole source code, all the files used are stored in this git repository folder.
The value returned by the kernel's .write function is interpreted as:
an error code, if it is less than zero (<0),
the number of bytes written, if it is greater than or equal to zero (>=0).
So, to tell user space that all bytes have been written, the .write function should return its count parameter.
Returning zero from .write makes little sense: every "standard" utility like echo will simply call write() again.
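A minimal sketch of the fix, assuming a successful call should consume all count bytes (identifiers taken from the driver above):

    if (copy_from_user((void *)&(LED_01_devices->LED_value), buf, count)) {
        printk(KERN_WARNING "[LEO] LED_01: can't use copy_from_user.\n");
        retval = -EPERM;
        goto out_and_Vsem;
    }
    write_status_to_LED();
    retval = count;   /* report that all 'count' bytes were accepted */

With retval = count, echo sees a complete write and terminates; with the original retval = 0 it keeps retrying forever.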

Why does my process take too long to die?

Basically I'm using Linux 2.6.34 on PowerPC (Freescale e500mc). I have a process (a kind of VM that was developed in-house) that uses about 2.25 G of mlocked VM. When I kill it, I notice that it takes upwards of 2 minutes to terminate.
I investigated a little. First, I closed all open file descriptors, but that didn't seem to make a difference. Then I added some printks in the kernel, and through them I found that all of the delay comes from the kernel unlocking my VMAs. The delay is uniform across pages, which I verified by repeatedly checking the locked page count in /proc/meminfo. I've checked with programs that allocate that much memory, and they all die as soon as I signal them.
What do you think I should check now? Thanks for your replies.
Edit: I had to find a way to share more information about the problem, so I wrote the program below:
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <string.h>
#include <errno.h>
#include <signal.h>
#include <sys/time.h>

#define MAP_PERM_1 (PROT_WRITE | PROT_READ | PROT_EXEC)
#define MAP_PERM_2 (PROT_WRITE | PROT_READ)
#define MAP_FLAGS  (MAP_ANONYMOUS | MAP_FIXED | MAP_PRIVATE)
#define PG_LEN 4096
#define align_pg_32(addr) ((addr) & 0xFFFFF000)
#define num_pg_in_range(start, end) (((end) - (start) + 1) >> 12)

static inline void __force_pgtbl_alloc(unsigned int start)
{
    volatile int *s = (int *)start;
    *s = *s;
}

int __map_a_page_at(unsigned int start, int whichperm)
{
    int perm = whichperm ? MAP_PERM_1 : MAP_PERM_2;

    if (MAP_FAILED == mmap((void *)start, PG_LEN, perm, MAP_FLAGS, 0, 0)) {
        fprintf(stderr, "mmap failed at 0x%x: %s.\n", start, strerror(errno));
        return 0;
    }
    return 1;
}

int __mlock_page(unsigned int addr)
{
    if (mlock((void *)addr, (size_t)PG_LEN) < 0) {
        fprintf(stderr, "mlock failed on page: 0x%x: %s.\n", addr, strerror(errno));
        return 0;
    }
    return 1;
}

void sigint_handler(int p)
{
    struct timeval start = {0, 0}, end = {0, 0}, diff = {0, 0};

    gettimeofday(&start, NULL);
    munlockall();
    gettimeofday(&end, NULL);
    timersub(&end, &start, &diff);
    printf("Munlock'd entire VM in %ld secs %ld usecs.\n",
           diff.tv_sec, diff.tv_usec);
    exit(0);
}

int make_vma_map(unsigned int start, unsigned int end)
{
    int num_pg = num_pg_in_range(start, end);

    if (end < start) {
        fprintf(stderr, "Bad range: start: 0x%x end: 0x%x.\n", start, end);
        return 0;
    }
    for (; num_pg; num_pg--, start += PG_LEN) {
        /* Alternate permissions so adjacent pages land in separate VMAs. */
        if (__map_a_page_at(start, num_pg % 2) && __mlock_page(start))
            __force_pgtbl_alloc(start);
        else
            return 0;
    }
    return 1;
}

void display_banner()
{
    printf("-----------------------------------------\n");
    printf("Virtual memory allocator. Ctrl+C to exit.\n");
    printf("-----------------------------------------\n");
}

int main()
{
    unsigned int vma_start, vma_end, input = 0;
    int start_end = 0; // 0: start; 1: end;

    display_banner();
    // Bind SIGINT handler.
    signal(SIGINT, sigint_handler);
    while (1) {
        if (!start_end)
            printf("start:\t");
        else
            printf("end:\t");
        scanf("%i", &input);
        if (start_end) {
            vma_end = align_pg_32(input);
            make_vma_map(vma_start, vma_end);
        } else {
            vma_start = align_pg_32(input);
        }
        start_end = !start_end;
    }
    return 0;
}
As you can see, the program accepts ranges of virtual addresses, each range defined by a start and an end address. Each range is then subdivided into page-sized VMAs by giving different permissions to adjacent pages. Interrupting the program (with SIGINT) triggers a call to munlockall(), and the time that call takes to complete is duly noted.
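A typical interaction looks like this (using the same range as in the test described below):
start:	0x30000000
end:	0x35000000
(press Ctrl+C when done; the handler prints the munlockall() timing and exits)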
Now, when I run it on the Freescale e500mc with Linux 2.6.34 over the range 0x30000000-0x35000000, I get a total munlockall() time of almost 45 seconds. However, if I do the same thing with smaller start-end ranges in random order (that is, not necessarily increasing addresses), such that the total number of pages (and locked VMAs) is roughly the same, I observe a total munlockall() time of no more than 4 seconds.
I tried the same thing on x86_64 with Linux 2.6.34, with my program compiled with the -m32 flag, and the variation, though not as pronounced as on ppc, is still about 8 seconds for the first case and under a second for the second.
I also tried the program on Linux 2.6.10 on one end and on 3.19 on the other, and these monumental differences don't exist there; what's more, munlockall() always completes in under a second.
So, it seems that the problem, whatever it is, exists only around the 2.6.34 version of the Linux kernel.
You said the VM was developed in-house. Does this mean you have access to the source? I would start by checking to see if it has anything to stop it from immediately terminating to avoid data loss.
Otherwise, could you potentially try to provide more information? You may also want to check out https://unix.stackexchange.com/, as they would be better suited to help with any issues the Linux kernel may be having.

Zero bytes lost in Valgrind

What does it mean when Valgrind reports 0 bytes lost, like here:
==27752== 0 bytes in 1 blocks are definitely lost in loss record 2 of 1,532
I suspect it is just an artifact from creative use of malloc, but it is good to be sure (-;
EDIT: Of course the real question is whether it can be ignored or it is an effective leak that should be fixed by freeing those buffers.
Yes, this is a real leak, and it should be fixed.
When you malloc(0), malloc may either give you NULL, or an address that is guaranteed to be different from that of any other object.
Since you are likely on Linux, you get the second. There is no space wasted for the allocated buffer itself, but libc has to do some housekeeping, and that does waste space, so you can't go on doing malloc(0) indefinitely.
You can observe it with:
#include <stdio.h>
#include <stdlib.h>

int main() {
    unsigned long i;
    for (i = 0; i < (size_t)-1; ++i) {
        void *p = malloc(0);
        if (p == NULL) {
            fprintf(stderr, "Ran out of memory on %lu iteration\n", i);
            break;
        }
    }
    return 0;
}
gcc t.c && bash -c 'ulimit -v 10240 && ./a.out'
Ran out of memory on 202751 iteration
It looks like you allocated a block with 0 size and then didn't subsequently free it.
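Since malloc(0) may legitimately return a non-NULL pointer, the fix is simply to free() that pointer like any other allocation; free(NULL) is a no-op, so no special case is needed. A minimal sketch:

#include <stdlib.h>

int main() {
    void *p = malloc(0);  /* may be NULL or a unique, zero-size allocation */
    /* ... use p ... */
    free(p);              /* releases the bookkeeping; safe even if p is NULL */
    return 0;
}

With the matching free() in place, Valgrind no longer reports the block as definitely lost.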

Tools to reduce risk regarding password security and HDD slack space

Down at the bottom of this essay is a comment about a spooky way to beat passwords: scan the entire HDD of a user, including dead space, swap space, etc., and just try everything that looks like it might be a password.
The question, part 1: are there any tools around (a live CD, for instance) that will scan an unmounted file system and zero everything that can be zeroed? (Note: I'm not trying to find passwords.)
This would include:
Slack space that is not part of any file
Unused parts of the last block used by a file
Swap space
Hibernation files
Dead space inside of some types of binary files (like .DOC)
The tool (aside from the last case) would not modify anything that can be detected via the file system API. I'm not looking for a block device find/replace but rather something that just scrubs everything that isn't part of a file.
Part 2: how practical would such a program be? How hard would it be to write? How common is it for file formats to contain uninitialized data?
One (risky and costly) way to do this would be to use a file system aware backup tool (one that only copies the actual data) to back up the whole disk, wipe it clean and then restore it.
I don't understand your first question (do you want to modify the file system? Why? Isn't this dead space exactly where you want to look?)
Anyway, here's an example of such a tool:
#include <stdio.h>
#include <alloca.h>
#include <string.h>
#include <ctype.h>

/* Number of bytes we read at once, >2*maxlen */
#define BUFSIZE (1024*1024)

/* Replace this with a function that tests the password consisting of the first len bytes of pw */
int testPassword(const char *pw, int len) {
    /*char* buf = alloca(len+1);
    memcpy(buf, pw, len);
    buf[len] = '\0';
    printf("Testing %s\n", buf);*/
    int rightLen = strlen("secret");
    return len == rightLen && memcmp(pw, "secret", len) == 0;
}

int main(int argc, char *argv[]) {
    int minlen = 5; /* We know the password is at least 5 characters long */
    int maxlen = 7; /* ... and at most 7. Modify to find longer ones */
    int avlen = 0;  /* available length - the number of bytes we already tested and think could belong to a password */
    int i;
    char *curstart;
    char *curp;
    FILE *f;
    size_t bytes_read;
    char *buf = alloca(BUFSIZE + maxlen);

    if (argc != 2) {
        printf("Usage: %s disk-file\n", argv[0]);
        return 1;
    }
    f = fopen(argv[1], "rb");
    if (f == NULL) {
        printf("Couldn't open %s\n", argv[1]);
        return 2;
    }
    for (;;) {
        /* Copy the rest of the buffer to the front */
        memcpy(buf, buf + BUFSIZE, maxlen);
        bytes_read = fread(buf + maxlen, 1, BUFSIZE, f);
        if (bytes_read == 0) {
            /* Read the whole file */
            break;
        }
        for (curstart = buf; curstart < buf + bytes_read;) {
            for (curp = curstart + avlen; curp < curstart + maxlen; curp++) {
                /* Let's assume the password just contains letters and digits. Use isprint() otherwise. */
                if (!isalnum((unsigned char)*curp)) {
                    curstart = curp + 1;
                    break;
                }
            }
            avlen = curp - curstart;
            if (avlen < minlen) {
                /* Nothing to test here, move along */
                curstart = curp + 1;
                avlen = 0;
                continue;
            }
            for (i = minlen; i <= avlen; i++) {
                if (testPassword(curstart, i)) {
                    char *found = alloca(i + 1);
                    memcpy(found, curstart, i);
                    found[i] = '\0';
                    printf("Found password: %s\n", found);
                }
            }
            avlen--;
            curstart++;
        }
    }
    fclose(f);
    return 0;
}
Installation:
Start a Linux Live CD
Copy the program to the file hddpass.c in your home directory
Open a terminal and type the following
su || sudo -s # Makes you root so that you can access the HDD
apt-get install -y gcc # Install gcc
This works only on Debian/Ubuntu et al, check your system documentation for others
gcc -o hddpass hddpass.c # Compile.
./hddpass /dev/YOURDISK # The disk is usually sda, hda on older systems
Look at the output
Test (copy to console, as root):
gcc -o hddpass hddpass.c
</dev/zero head -c 10000000 >testdisk # Create an empty 10MB file
mkfs.ext2 -F testdisk # Create a file system
rm -rf mountpoint; mkdir -p mountpoint
mount -o loop testdisk mountpoint # needs root rights
</dev/urandom head -c 5000000 >mountpoint/f # Write stuff to the disk
echo asddsasecretads >> mountpoint/f # Write the password into the file
# On some file systems, you could even remove the file.
umount mountpoint
./hddpass testdisk # prints secret
Test it yourself on an Ubuntu Live CD:
# Start a console and type:
wget http://phihag.de/2009/so/hddpass-testscript.sh
sh hddpass-testscript.sh
Therefore, it's relatively easy. As I found out myself, ext2 (the file system I used) overwrites deleted files. However, I'm pretty sure some file systems don't. Same goes for the pagefile.
How common is it for file formats to contain uninitialized data?
Less and less common, I would've thought. The classic "offender" is older versions of MS Office applications that (essentially) did a memory dump to disk as their "quicksave" format. No serialisation, no selection of what to dump, and a memory allocator that doesn't zero newly allocated memory pages. That led not only to juicy things from previous versions of the document (so the user could use undo), but also to juicy snippets from other applications.
How hard would it be to write?
Something that clears out unallocated disk blocks shouldn't be that hard. It'd need to run either off-line or as a kernel module, so as not to interfere with normal file-system operations, but most file systems have an "allocated"/"not allocated" structure that is fairly straightforward to parse. Swap is harder, but as long as you're OK with having it cleared on boot (or shutdown), it's not too tricky. Clearing out the tail block is trickier, definitely not something I'd want to try to do on-line, but it shouldn't be TOO hard to make it work for off-line cleaning.
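For the "clears out unallocated disk blocks" part, a crude user-space approximation (no file-system parsing at all, run on the mounted file system) is to fill the free space with zeros and then delete the fill file; it does nothing for tail slack or swap, and the file name below is just an example:

#include <stdio.h>
#include <unistd.h>

int main(void) {
    static char zeros[1 << 20];              /* 1 MiB of zero bytes */
    FILE *f = fopen("zerofill.tmp", "wb");   /* example name on the target FS */
    if (!f) { perror("fopen"); return 1; }
    while (fwrite(zeros, 1, sizeof zeros, f) == sizeof zeros)
        ;                                    /* keep writing until the FS is full */
    fflush(f);
    fsync(fileno(f));                        /* make sure the zeros hit the disk */
    fclose(f);
    unlink("zerofill.tmp");                  /* the zeroed blocks become free space again */
    return 0;
}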
How practical would such a program be?
Depends on your threat model, really. I'd say that on one end it'd not give you much at all, while on the other it's a definite help in keeping information out of the wrong hands. But I can't give a hard and fast answer.
Well, if I was going to code it for a boot CD, I'd do something like this:
File is 101 bytes but takes up a 4096-byte cluster.
Copy the file "A" to "B" which has nulls added to the end.
Delete "A" and overwrite it's (now unused) cluster.
Create "A" again and use the contents of "B" without the tail (remember the length).
Delete "B" and overwrite it.
Not very efficient, and it would need a tweak to make sure you don't try to copy the first (and therefore full) clusters of a file. Otherwise, you'll run into slowness and failure if there's not enough free space.
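A minimal user-space sketch of the copy-and-pad step above, assuming a 4096-byte cluster; overwriting the freed cluster afterwards would still need a separate free-space wipe:

#include <stdio.h>
#include <string.h>

/* Copy argv[1] to argv[2], zero-padding the final 4096-byte block. */
int main(int argc, char *argv[]) {
    char buf[4096];
    size_t n;
    FILE *in, *out;
    if (argc != 3) {
        fprintf(stderr, "Usage: %s src dst\n", argv[0]);
        return 1;
    }
    in = fopen(argv[1], "rb");
    out = fopen(argv[2], "wb");
    if (!in || !out) { perror("fopen"); return 1; }
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (n < sizeof buf)
            memset(buf + n, 0, sizeof buf - n);  /* zero the tail slack */
        fwrite(buf, 1, sizeof buf, out);         /* always write a full block */
    }
    fclose(in);
    fclose(out);
    return 0;
}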
Are there open-source tools that do this efficiently?

How to get the size of a gunzipped file in vim

When viewing (or editing) a .gz file, vim knows to locate gunzip and display the file properly.
In such cases, getfsize(expand("%")) would be the size of the gzipped file.
Is there a way to get the size of the expanded file?
[EDIT]
Another way to solve this might be getting the size of current buffer, but there seems to be no such function in vim. Am I missing something?
There's no easy way to get the uncompressed size of a gzipped file short of uncompressing it and using the getfsize() function, which might not be what you want. I took a look at RFC 1952 - GZIP File Format Specification, and the only thing that might be useful is the ISIZE field, which contains "...the size of the original (uncompressed) input data modulo 2^32".
EDIT:
I don't know if this helps, but here's some proof-of-concept C code I threw together that retrieves the value of the ISIZE field in a gzip'd file. It works for me using Linux and gcc, but your mileage may vary. If you compile the code, and then pass in a gzip'd filename as a parameter, it will tell you the uncompressed size of the original file.
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>

int main(int argc, char *argv[])
{
    FILE *fp = NULL;
    int i = 0;

    if (argc != 2) {
        fprintf(stderr, "Must specify file to process.\n");
        return -1;
    }

    // Open the file for reading
    if ((fp = fopen(argv[1], "r")) == NULL) {
        fprintf(stderr, "Unable to open %s for reading: %s\n", argv[1], strerror(errno));
        return -1;
    }

    // Look at the first two bytes and make sure it's a gzip file
    int c1 = fgetc(fp);
    int c2 = fgetc(fp);
    if (c1 != 0x1f || c2 != 0x8b) {
        fprintf(stderr, "File is not a gzipped file.\n");
        return -1;
    }

    // Seek to four bytes from the end of the file
    fseek(fp, -4L, SEEK_END);

    // Array containing the last four bytes
    unsigned char read[4];
    for (i = 0; i < 4; ++i) {
        int charRead = 0;
        if ((charRead = fgetc(fp)) == EOF) {
            // This shouldn't happen
            fprintf(stderr, "Read end-of-file");
            exit(1);
        }
        else
            read[i] = (unsigned char)charRead;
    }

    // Assemble the last four bytes into an unsigned int. ISIZE is stored
    // least-significant byte first, so build the value explicitly instead
    // of relying on the host's byte order.
    unsigned int intval = read[0]
                        | ((unsigned int)read[1] << 8)
                        | ((unsigned int)read[2] << 16)
                        | ((unsigned int)read[3] << 24);
    printf("The uncompressed filesize was %u bytes (0x%08x hex)\n", intval, intval);

    fclose(fp);
    return 0;
}
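To try it out (the file names here are just examples):
gcc -o gzsize gzsize.c
./gzsize somefile.gz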
This appears to work for getting the byte count of a buffer
(line2byte(line("$")+1)-1)
If you're on Unix/Linux, try
:%!wc -c
That's in bytes. (It also works on Windows if you have e.g. Cygwin installed.) Then hit u to get your content back.
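If you'd rather not modify the buffer at all, a variant that should work in most vim builds is
:w !wc -c
which pipes the buffer to wc as a write, so nothing is replaced and no undo is needed.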
HTH
From within the vim editor, try this:
<Esc>:!wc -c my_zip_file.gz
That will display the number of bytes in the file.
