Large file support - Linux

Could someone please explain what exactly the O_LARGEFILE option does to support opening large files?
And can there be any side effects of compiling with the -D_FILE_OFFSET_BITS=64 flag? In other words, when compiling with this option, is there anything we have to watch out for?

Use _FILE_OFFSET_BITS in preference to O_LARGEFILE. Both are used on 32-bit systems to allow opening files so large that they exceed the range of a 32-bit file offset.
No, you don't have to do anything special. If you are on 64-bit Linux it makes no difference anyway.

From man 2 open:
O_LARGEFILE
(LFS) Allow files whose sizes cannot be represented in an off_t (but can be represented in an off64_t) to be opened. The _LARGEFILE64_SOURCE macro must be defined in order to obtain this definition. Setting the _FILE_OFFSET_BITS feature test macro to 64 (rather than using O_LARGEFILE) is the preferred method of accessing large files on 32-bit systems (see feature_test_macros(7)).
Edit: (ie. RTM :P)
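As a hedged example (the file name big.dat is a placeholder), this is all that's needed on a 32-bit system once you build with gcc -D_FILE_OFFSET_BITS=64; O_LARGEFILE never appears in the source:
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("big.dat", O_RDONLY);   /* transparently handles files > 2 GB */
    if (fd == -1) {
        perror("open");
        return 1;
    }
    struct stat st;
    if (fstat(fd, &st) == 0)
        printf("size: %lld bytes\n", (long long)st.st_size);   /* st.st_size is 64-bit here */
    close(fd);
    return 0;
}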

Related

Can we convert an ELF from one CPU architecture to another, in Linux? [duplicate]

How can I run x86 binaries (for example, an .exe file) on ARM? As I see on Wikipedia, I need to convert the binary data for the emulated platform into binary data suitable for execution on the targeted platform. But the question is: how can I do that? Do I need to open the file in a hex editor and change it? Or something else?
To successfully do this, you'd have to do two things: one relatively easy, one very hard. Neither is something you want to do by hand in a hex editor.
1. Convert the machine code from x86 to ARM. This is the easy one, because you should be able to map each x86 opcode to one or more ARM opcodes. There are different ways to do this, some more efficient than others, but it can be done with a pretty straightforward mapping.
2. Remap function calls (and other jumps). This one is hard, because monkeying with the opcodes is going to change all the offsets for the jump and return points. If you have dynamically linked libraries (.so), and we assume that all the libraries are available at exactly the same version in both places (a sketchy assumption at best), you'd have to remap the loads.
It's essentially a machine->machine compiler and linker.
So, can you do it? Sure.
Is it easy? No.
There may be a commercial tool out there, but I'm not aware of it.
You cannot do this with a binary (see note 1 below); here, binary means an object with no symbol information, unlike an ELF file. Even with an ELF file, this is difficult to impossible. The issue is separating code from data. If you could resolve this issue, you could make decompilers and other tools.
Even if you have an ELF file, a compiler will insert constants used in the code into the text segment. You have to look at many op-codes and do a reverse basic-block analysis to figure out where a function starts and ends.
A better mechanism is to emulate the x86 on the ARM. Here, you can use JIT technology to do the translation as it is encountered, but you approximately double the code space. Also, the code will execute horribly. The ARM has 16 registers while the x86 is register starved (it usually has hidden registers), and a big part of a compiler's job is allocating those registers. QEMU is one technology that does this. I am unsure whether it goes in the x86-to-ARM direction; as noted, it will have a tough job.
Note 1: The x86 has variable-length op-codes. In order to recognize a function prologue and epilogue, you would have to scan an image multiple times. To do this, I think the problem would be something like O(n!) where n is the number of bytes in the image, and then you might have trouble with in-line assembler and library routines coded in assembler. It may be possible, but it is extremely hard.
To run an ARM executable on an x86 machine, all you need is qemu-user.
Example:
You have busybox compiled for the AArch64 architecture (ARM64) and you want to run it on an x86_64 Linux system.
Assuming a static compile, this runs ARM64 code on an x86 system:
$ qemu-aarch64-static ./busybox
And this runs x86_64 code on an ARM system:
$ qemu-x86_64-static ./busybox
What I am curious about is whether there is a way to embed both in a single program.
Read the x86 binary file as UTF-8, then copy from the ELF header to the last character. Then go to the ARM binary and delete as you copy in the x86 content. Then paste the x86 content from the clipboard at the head. I tried it and it's working.

Is a core dump executable by itself?

The Wikipedia page on Core dump says
In Unix-like systems, core dumps generally use the standard executable
image-format:
a.out in older versions of Unix,
ELF in modern Linux, System V, Solaris, and BSD systems,
Mach-O in OS X, etc.
Does this mean a core dump is executable by itself? If not, why not?
Edit: Since @WumpusQ.Wumbley mentions a coredump_filter in a comment, perhaps the above question should be: can a core dump be produced such that it is executable by itself?
In older Unix variants it was the default to include the text as well as the data in the core dump, but the dump was in a.out format, not ELF. Today's default behavior (in Linux for sure, not 100% sure about BSD variants, Solaris, etc.) is to have the core dump in ELF format without the text sections, but that behavior can be changed.
However, a core dump cannot be executed directly in any case without some help. The reason is that two things are missing from a simple core file. One is the entry point, the other is code to restore the CPU state to the state at or just before the dump occurred (by default the text sections are missing as well).
In AIX there used to be a utility called undump, but I have no idea what happened to it. It doesn't exist in any standard Linux distribution I know of. As mentioned above (@WumpusQ), there is also an attempt at a similar project for Linux mentioned in the comments above; however, that project is not complete and doesn't restore the CPU state to the original state. It is, however, still good enough for some specific debugging cases.
It is also worth mentioning that there are other ELF-formatted files that cannot be executed either and are not core files, such as object files (compiler output) and .so (shared object) files. Those require a linking stage before they can be run, to resolve external addresses.
I emailed this question to the creator of the undump utility for his expertise, and got the following reply:
As mentioned in some of the answers there, it is possible to include the code sections by setting the coredump_filter, but it's not the default for Linux (and I'm not entirely sure about BSD variants and Solaris). If the various code sections are saved in the original core dump, there is really nothing missing in order to create the new executable. It does, however, require some changes in the original core file (such as including an entry point and pointing that entry point to code that will restore CPU registers). If the core file is modified in this way it will become an executable and you'll be able to run it. Unfortunately, though, some of the state is not going to be saved, so the new executable will not be able to run directly. Open files, sockets, pipes, etc. are not going to be open and may even point to other FDs (which could cause all sorts of weird things). However, it will most probably be enough for most debugging tasks, such as running small functions from gdb (so that you don't get the "not running an executable" errors).
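For reference, the coredump_filter mentioned above is a per-process bitmask in /proc; as a hedged one-liner (PID is a placeholder, and 0x37 adds file-backed private mappings, i.e. the program text, to the default mask):
$ echo 0x37 > /proc/<PID>/coredump_filter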
As others have said, I don't think you can execute a core dump file without the original binary.
In case you're interested in debugging the binary (and it has debugging symbols included, in other words it is not stripped), then you can run gdb binary core.
Inside gdb you can use the bt command (backtrace) to get the stack trace from when the application crashed.
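For example (the binary and core file names are placeholders):
$ gdb ./myapp core
(gdb) bt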

What is the fastest way to get just the preprocessed source code with MSVC?

I'm trying to find the fastest way to get the complete preprocessed source code (I don't need #line information or other comments, just the raw source code) for a C source file.
I have the following little test program which just includes the Windows header file (mini.c):
#include <windows.h>
Using Microsoft Visual Studio 2005, I then run this command:
cl /nologo /P mini.c
This takes 6 seconds to generate a 2.5MB mini.i file; changing this to
cl /nologo /EP mini.c > mini.i
(which skips comments and #line information) needs just 0.5 seconds to write 2.1MB of output.
Is anybody aware of good techniques for improving this even further, without using precompiled headers?
The reason I'm asking is that I wrote a variant of the popular ccache tool for MSVC. As part of the work, the program needs to compute a hash sum of the preprocessed source code (and a few other things). I'd like to make this as fast as possible.
Maybe there is a dedicated preprocessor binary available, or other command line switches which might help?
UPDATE: One idea which just came to my mind: defining the WIN32_LEAN_AND_MEAN macro strips out lots of rarely needed code. This speeds up the above preprocessor run by a factor of approximately 3x.
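For reference, a minimal sketch of what that looks like in mini.c (the macro only has to be defined before the include):
#define WIN32_LEAN_AND_MEAN   /* excludes rarely used parts of <windows.h>, e.g. DDE, RPC and Winsock */
#include <windows.h>
Preprocessing with cl /nologo /EP mini.c > mini.i as before then has far less text to expand.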
You're repeatedly processing the same source file (<windows.h>) which in turn pulls in a lot of other files. This <windows.h> is located in the default SDK directory.
Now, processing this unchanging file takes serious time. Yet you can rely on it not changing - it's part of the public interface, after all. Hence you could preprocess it - strip out comments, for instance - and pass that version to cl /EP.
Of course, this is typically an I/O-bound task, but with a significant CPU part mixed in. An approach that processes multiple sources in parallel will help the total throughput; measuring single-source preprocessing times isn't too relevant.
Finally, measure the time to write the output to NUL. You shouldn't be including the time needed to write mini.i to disk, since you intend to pipe the output to md5sum.
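For instance, to time just the preprocessing step without the disk write (the hash can then be computed from the piped stream):
cl /nologo /EP mini.c > NUL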
There's a preprocess-to-file project setting that you can use. In addition, some of the newer MSVC versions offer multithreaded compilation.
The /P option has been there for years; it creates the .i file and no object file.

open limitation based on file size

Is there any limitation on "open" based on file size?
My file size is 2 GB; will it open successfully, and can there be any timing issue?
The filesystem is rootfs.
From the open man page:
O_LARGEFILE
(LFS) Allow files whose sizes cannot be represented in an off_t (but can be represented in an off64_t) to be opened. The _LARGEFILE64_SOURCE macro must be defined in order to obtain this definition. Setting the _FILE_OFFSET_BITS feature test macro to 64 (rather than using O_LARGEFILE) is the preferred method of accessing large files on 32-bit systems (see feature_test_macros(7)).
On a 64-bit system, off_t will be 64 bits and you'll have no problem. On a 32-bit system, you'll need the suggested workaround to allow for files larger than 2 GB.
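As a hedged sanity check (assuming a C11 compiler; the file name offcheck.c is illustrative), a tiny translation unit can confirm at compile time that the 64-bit off_t is in effect:
/* Build with: gcc -D_FILE_OFFSET_BITS=64 -c offcheck.c */
#include <sys/types.h>

_Static_assert(sizeof(off_t) == 8, "off_t is not 64 bits");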
rootfs may not support large files; consider using a proper filesystem instead (tmpfs is almost the same as rootfs, but with more features).
rootfs is intended only for booting and early use.

How to create a file of size more than 2GB in Linux/Unix?

I have this homework where I have to transfer a very big file from one source to multiple machines using a BitTorrent-like algorithm. Initially I cut the file into chunks and transfer the chunks to all the targets. The targets have the intelligence to share the chunks they have with other targets. It works fine. I wanted to transfer a 4GB file, so I tarred four 1GB files together. It didn't error out when I created the 4GB tar file, but at the other end, while assembling all the chunks back into the original file, it errors out saying the file size limit was exceeded. How can I go about solving this 2GB limitation problem?
I can think of two possible reasons:
You don't have Large File Support enabled in your Linux kernel.
Your application isn't compiled with large file support (you might need to pass gcc extra flags to tell it to use 64-bit versions of certain file I/O functions, e.g. gcc -D_FILE_OFFSET_BITS=64; a sketch follows below).
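A minimal sketch of the second point (the output name big.out and the program name mkbig.c are placeholders); built with gcc -D_FILE_OFFSET_BITS=64 -o mkbig mkbig.c, it creates a sparse 3 GB file even on a 32-bit system:
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    int fd = open("big.out", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) {
        perror("open");
        return 1;
    }
    /* Extend the file to 3 GB; this fails with EFBIG if large file
       support is missing or the process file-size limit is too low. */
    off_t size = (off_t)3 * 1024 * 1024 * 1024;
    if (ftruncate(fd, size) == -1) {
        perror("ftruncate");
        return 1;
    }
    close(fd);
    return 0;
}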
This depends on the filesystem type. When using ext3, I have no such problems with files that are significantly larger.
If the underlying disk is FAT, NTFS or CIFS (SMB), you must also make sure you use the latest version of the appropriate driver. There are some older drivers that have file-size limits like the ones you experience.
Could this be related to a system limits configuration? Check with:
$ ulimit -a
Then look at /etc/security/limits.conf, where an entry like the following caps the file size for a given user:
vi /etc/security/limits.conf
vivek hard fsize 1024000
If you do not want any limit, remove the fsize entry from /etc/security/limits.conf.
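For a quick check of just the file-size limit in the current shell (it reports "unlimited" when no cap is set):
$ ulimit -f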
If your system supports it, you can get hints with: man largefile.
You should use fseeko and ftello; see fseeko(3).
Note that you should define _FILE_OFFSET_BITS to 64 before including any system header:
#define _FILE_OFFSET_BITS 64
#include <stdio.h>
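A minimal sketch of how that fits together (the file name big.dat is a placeholder; this assumes the default glibc feature macros so that fseeko/ftello are declared):
#define _FILE_OFFSET_BITS 64   /* must come before any system header */
#include <stdio.h>
#include <sys/types.h>

int main(void)
{
    FILE *fp = fopen("big.dat", "rb");
    if (!fp) {
        perror("fopen");
        return 1;
    }
    /* Seek past the 2 GB mark, which would overflow a 32-bit offset. */
    if (fseeko(fp, (off_t)3 * 1024 * 1024 * 1024, SEEK_SET) == 0)
        printf("now at offset %lld\n", (long long)ftello(fp));
    fclose(fp);
    return 0;
}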
