File size can't exceed 2 Go - linux

I am doing Monte Carlo simulations. I am trying to direct the results of my program into a Huge file using fprintf to avoid tabs because it necessitate much memory size.
The problem is that, when the data size on file achieve 2Go, the program can't write on it anymore. I did some research in this and other sites but I didn't get a helpful response to my problem.
I am using Ubuntu 12.04 LTS with file type ext4 and the partition size is 88 Go. I am not good at computer sciences and I don't know even what means ext but I saw that this type of file can support individual files with 16 Go at least.
So can anyone tell me what to do?

The maximal file size limit for a 32 bit is 2^31 (2 GiB), but using the LFS interface on filesystems that support LFS applications can handle files as large as 263 bytes.

Thank you for your answer it was so helpful. I changed fopen with fopen64 and i used -D_FILE_OFFSET_BITS=64 when compiling, and all got fine :)

Related

If the size of the file exceeds the maximum size of the file system, what happens?

For example, In FAT32 partition, The maximum file size is 4GB. but I was able to create a 5GB file with vim and I saved the file and opened it again, the console output was broken like a staircase. I have three questions.
If the size of the file exceeds the maximum size of the file system, what happens?
In my case, Why break?
In Unix system call, stat() can succeed up to a 2GB(2^31 - 1). Does this have anything to do with the file system? Is there a relationship between the limits of data in stat() and the limits of each feature in the file system?
If the size of the file exceeds the maximum size of the file system, what happens?
By definition, that can never happens. What really happens is that some system call (probably write(2) ...) is failing, and the code doing that should take care of that case.
Notice that FAT32 filesystems restrict the maximal size of files to 2Gigabytes. Use a better file system on your USB key if you want more (or split(1) large files in smaller chunks before copying them to your FAT32-formatted USB key).
If using <stdio.h> notice that fflush(3), fprintf(3), fclose(3) (and most other standard functions) can fail (e.g. because they will do some failing write(2)).
the console output was broken like a staircase
probably because your pseudoterminal was in some broken state. See stty(1), reset(1), termios(3) and read the tty demystified.
In Unix system call, stat() can succeed up to a 2GB(2^31 - 1)
You are misunderstanding stat(2). Read again its documentation
Read Advanced Linux Programming then syscalls(2).
I was able to create a 5GB file with vim
To understand the behavior of vim read first its documentation then study its source code (it is free software, and you can and perhaps should study its code).
You could also use strace(1) to understand what system calls are done by some command or process.

Need advice for disk access program

I'm envisioning a program I will need to write and need some advice on the language. I will need to be doing raw disk access so I can display hex data, scroll or jump around on the disk, and do calculations from the data. I have been using Java the most and it's portability between OSes for my other projects is certainly a benefit, but raw disk access either isn't possible, would require JNI, or may be possible on *nix when you can access disks as "files". I keep reading different things. By the way I can handle this type of work using Files in Java, but in this project I need to be able to access the disk so disk imaging to files beforehand isn't needed.
It would be nice to make it as portable as I could since there is a real benefit to using different OSes, but it may not be worth it and I should just stick with Windows and a native compiling language. Is there any existing JNI code that could help? I have experience in other languages but I haven't used C++ in a long time. Should I forget about Java and tryout C#? Someone told me that Python has libraries available for this type of thing despite it being an interpreted language so what about Python? What would be best for the project? What would be good for me to learn?
Searching around for raw disk access, Java, Python, does not seem to give any useful results. Thanks for any help!
EDIT
It seems like this will be quite involved, learning what I need to know, and then learning that. It's too bad I couldn't use disk images instead because then I'd be able to start working on it immediately in Java, which I'm comfortable with and I know I could make a good product. I've gotten great throughput in other raw data processing projects with Java so that doesn't worry me. Plus it would be truly portable. Hmm might have to consider it more. I'd probably need a big azz storage system to hold all the images though :)
UPDATE
Just a note for anyone that finds this question... I have figured out this works just by specifying the disk for the File using the PhysicalDrive notation (in Windows) like the answer below by hunsricker. However there are some issues. First if you do a "exists" check File.exists(), it says the file does not exist. Also, the file size is zero, and when I get a "java.io.IOException: The drive cannot find the sector requested" is the way I know I'm at the end of the file. And the worst part- I was getting some odd runtime errors doing this when I was reading some bytes and skipping some (64) bytes in a loop. I altered my program a bit to read different amounts and that changed where the error occurred. I was using BufferedInputStream instead of RandomAccessFile like hunsricker below by the way, not sure if it makes a difference. My only answer for this issue is that since I'm doing physical disk access, it doesn't like that I am not reading in even 512 byte sectors or 1K blocks or such. Indeed when I read even 1K, 2K, 512bytes, etc., and don't skip anything, it works fine and runs to the end. The errors I saw were java.io.ioexception "incorrect function" and java.io.ioexception "the parameter is incorrect". There was no rhyme or reason to them. Then I made image files of the same data and ran my program on those and it would do any combination of reading and skipping bytes with no problem. Physical disk access was more picky I guess.
I was looking by myself for a possibility to access raw data of a physical drive. And now as I got it to work, I just want to tell you how. You can access raw disk data directly from within java ... just run the following code with administrator priviliges:
File diskRoot = new File ("\\\\.\\PhysicalDrive0");
RandomAccessFile diskAccess = new RandomAccessFile (diskRoot, "r");
byte[] content = new byte[1024];
diskAccess.readFully (content);
So you will get the first kB of your first physical drive on the system. To access logical drives - as mentioned above - just replace 'PhysicalDrive0' with the drive letter e.g. 'D:'
oh yes ... I tried with Java 1.7 on a Win 7 system ...
RageDs link brougth me to the solution ... thank you :-)
Disk access will depend on the disk's particular drivers. And since this is such a low-level task, I doubt Java/Python would have such support (these languages are generally used for fast, high-level software package development). Since you will probably not be aware of the disks' particular hardware implementations, you will probably have to end up using an operating system API (which is OS-dependent of course). I would recommend looking into C and/or the particular assembly language for the architecture you plan to do this work on. Then, I would recommend continuing your search to find the appropriate API for your target OS.
EDIT
For Windows, a good place to start is here. More specifically, MSDN's CreateFile() is probably a function you would be interested in.

open limitation based on file size

Is there any limitation on "open" based on file size. ?
My file size is 2 GB will it open successfully and is there any timing issue can come ?
filesystem is rootfs.
From the open man page:
O_LARGEFILE
(LFS) Allow files whose sizes cannot be represented in an off_t
(but can be represented in an off64_t) to be opened. The
_LARGEFILE64_SOURCE macro must be defined in order to obtain
this definition. Setting the _FILE_OFFSET_BITS feature test
macro to 64 (rather than using O_LARGEFILE) is the preferred
method of obtaining method of accessing large files on 32-bit
systems (see feature_test_macros(7)).
On a 64-bit system, off_t will be 64 bits and you'll have no problem. On a 32-bit system, you'll need the suggested workaround to allow for files larger than 2 GB.
rootfs may not support large files; consider using a proper filesystem instead (tmpfs is almost the same as rootfs, but with more features).
rootfs is intended only for booting and early use.

What are the available interactive languages that run in tiny memory? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 4 years ago.
Improve this question
I am looking for general purpose programming languages that
have an interactive (live coding) prompt
work in 32 KB of RAM by itself or 8 KB when the compiler is hosted on a separate machine
run on a microcontroller with as little as 8-32 KB RAM total (without an MMU).
Below is my list so far, what am I missing?
Python: The PyMite VM needs 64K flash, 8K RAM. Targets LPC, SAM7 and ATmegas with 8K or more. Hosted.
Lua: The eLua FAQ recommends 256K flash, 64K RAM.
FORTH: amforth needs 8K flash, 150 bytes RAM, 30 bytes EEPROM on an ATmega.
Scheme: armpit Scheme The smallest target is the LPC2103 with 32K Flash, 4K SRAM.
C: Interactive C runs on 68HC11 with no flash and 32K SRAM. Hosted.
C: picoc an open source, cross-compiling, interactive C system. When compiled for AVR, it takes 63K flash, 8K RAM. The RAM could be reduced with effort to keep tables in flash.
C++: AngelScript an open source, byte-code based, C/C++ like scripting language with easy native calls.
Tcl: TinyTCL runs on DOS, 60K binary. Looks easy to port.
BASIC: TinyBasic: Initializes with a 64K heap, might be adjustable.
Lisp
PostScript: (I haven't found a FOSS implementation for low memory yet)
Shell: bitlash: An interactive command shell for Arduino (ATmega). See also AVRSH.
A homebrew Forth runtime can be implemented in very little memory indeed. I know someone who made one on a Cosmac in the 1970s. The core runtime was just 30 bytes.
I hear that CHIP-8, XPL0, PicoC, and Objective Caml have been ported to graphing calculators.
The Wikipedia "Lego Mindstorms" article lists a bunch of programming languages that allegedly run on the Lego RCX or Lego NXT platform.
Do any of them meet your "live coding" criteria?
You might want to check out the other microcontroller Forths at the Forth wiki . It lists at least 4 Forths for the Atmel AVR: amforth (which you already mention), PFAVR, avrforth, and ByteForth.
(Links to those interpreters, as well as this StackOverflow question, are included in the "Embedded Systems" wikibook).
I would recommend LUA (or eLUA http://www.eluaproject.net/ ). I've "ported" LUA to a Cortex-M3 a while back. From the top of my head it had a flash size of 60~100KB and needed about 20KB RAM to run. I did strip down to the bare essentials, but depending on your application, that might be enough. There's still room for optimization, especially about RAM requirements, but I doubt you can run it comfortable in 8KB.
Some AVR interpreters/VMs:
http://www.cqham.ru/tbcgroup/index_eng.htm
http://www.jcwolfram.de/projekte/avr/chipbasic2/main.php
http://www.jcwolfram.de/projekte/avr/chipbasic8/main.php
http://www.jcwolfram.de/projekte/avr/main.php
http://code.google.com/p/python-on-a-chip/
http://www.avrfreaks.net/index.php?module=Freaks%20Academy&func=viewItem&item_id=688&item_type=project
http://www.avrfreaks.net/index.php?module=Freaks%20Academy&func=viewItem&item_id=626&item_type=project
http://www.avrfreaks.net/index.php?module=Freaks%20Academy&func=viewItem&item_id=460&item_type=project
http://www.harbaum.org/till/nanovm/index.shtml
Wren fits your criteria -- by default it's configured to use just 4k of RAM. AFAIK it hasn't seen any actual use, since the guy I wrote it for decided he didn't need an interpreter running wholly on the target system after all.
The language is influenced most obviously by ML and Forth.
Have you considered a port in C of Tiny Basic? Or, perhaps rewriting the UCSD Pascal p-machine to your architecture from Z-80?
Seriously, though, JavaScript would make a good embedded scripting language, but I've no clue what the minimum memory requirements are for the VM + GC, nor how difficult to remove OS dependencies. I played with NJS a while back, which could possibly fit your needs. This one is interesting in that the compiler is written in JavaScript (self hosting).
You can take a look at very powerful AvrCo Multitasking Pascal for AVR. You can try it at http://www.e-lab.de. MEGA8/88 version is free. There are tons of drivers and simulator with JTAG debugger and nice live or simulated visualizations of all standard devices (LCDCHAR, LCDGRAPH, 7SEG, 14SEG, LEDDOT, KEYBOARD, RC5, SERVO, STEPPER...).
You're missing EmbedVM, homepage here, svn repo here. Remember to check out both [1,2] videos on the front page ;)
From the homepage:
EmbedVM is a small embeddable virtual machine for microcontrollers
with a C-like language frontend. It has been tested with GCC and AVR
microcontrollers. But as the Virtual machine is rather simple it
should be easy to port it to other architectures.
The VM simulates a 16bit CPU that can access up to 64kB of memory. It
can only operate on 16bit values and arrays of 16bit and 8bit values.
There is no support for complex data structures (struct, objects,
etc.). A function can have a maximum of 32 local variables and 32
arguments.
Besides the memory for the VM, a small structure holding the VM state
and the reasonable amount of memory the EmbedVM functions need on the
stack there are no additional memory requirements for the VM.
Especially the VM does not depend on any dymaic memory management.
EmbedVM is optimized for size and simplicity, not execution speed. The
VM itself takes up about 3kB of program memory on an AVR
microcontroller. On an AVR ATmega168 running at 16MHz the VM can
execute about 75 VM instructions per millisecond.
All memory accesses done by the VM are parformed using user callback
functions. So it is possible to have some or all of the VM memory on
external memory devices, flash memory, etc. or "memory-map" hardware
functions to the VM.
The compiler is a UNIX/Linux commandline tool that reads in a *.evm
file and generates bytecode in vaious formats (binary file, intel hex,
C array initializers and a special debug output format). It also
generates a symbol file that can be used to access data in the VM
memory from the host application.
The C-like language looks like this: http://svn.clifford.at/embedvm/trunk/examples/numberquizz/vmcode.evm
I would recommend MY-BASIC, runs with in minimum 8 KB RAM, and easy to port.
There's also JavaScript, via Espruino.
This is built specifically for Microcontrollers and there are builds for various different chips (mainly STM32s) that fit a full system into as little as 8kB RAM.
Have you considered simply using the /bin/sh supplied by busybox? Or on of the smaller scripting languages they recommend?
Prolog - http://www.gprolog.org/
According to a google search "prolog small" the size of the executable can be made quite small by avoiding linking the built-in predicates.
None of the languages in the list in the question or in the answers proved satisfactory for the requirement of super easy compilation and integration into an existing micro controller project (disclosure: I didn't actually try every single one of the suggestions).
I found instead tinyscript which is a single .c+.h file that compiled with the rest of the source files on my project with the only additional configuration required being to provide a void outchar(int c) which can be empty if you don't require output from the scripts.
For me speed of execution is far less important than ease of build and integration and interop with C, as my use case is mainly just calling some C functions in order.
I have been using in my previous work busybox on a BlackFin.
we compiled perl + php for it, after changing s/fork/vfork/g it worked pretty good... more or less. Not having an MMU is not a good idea. The memory fragmentation will kill the server pretty easily. All I did was:
for i in `seq 1 100`; do wget http://black-fin-ip/test.php; done
It died while I was walking to my boss and telling him that the server is going to die in production :)
I would suggest use python. But now the only problem is the memory overhead right? So I have great idea for people who may be stuck in this problem later on.
First thing's first, write a bf interpreter(or just get source code from somewhere). The interpreter will be really small. Also bf is a Turing complete language. Now you need to write your code in python and then transpiler it to bf using bfpy( https://github.com/felko/bfpy/blob/master/README.md ). I've given you the solution with the least overhead and I am pretty sure a bf interpreter will easily stay under 10KB of ram usage.
Erlang - http://erlang.org/
it can fit in 2MB
http://www.experts123.com/q/is-erlang-small-enough-for-embedded-systems.html

How to create a file of size more than 2GB in Linux/Unix?

I have this home work where I have to transfer a very big file from one source to multiple machines using bittorrent kinda of algorithm. Initially I am cutting the files in to chunks and I transfer chunks to all the targets. Targets have the intelligence to share the chunks they have with other targets. It works fine. I wanted to transfer a 4GB file so I tarred four 1GB files. It didn't error out when I created the 4GB tar file but at the other end while assembling all the chunks back to the original file it errors out saying file size limit exceeded. How can I go about solving this 2GB limitation problem?
I can think of two possible reasons:
You don't have Large File Support enabled in your Linux kernel
Your application isn't compiled with large file support (you might need to pass gcc extra flags to tell it to use 64-bit versions of certain file I/O functions. e.g. gcc -D_FILE_OFFSET_BITS=64)
This depends on the filesystem type. When using ext3, I have no such problems with files that are significantly larger.
If the underlying disk is FAT, NTFS or CIFS (SMB), you must also make sure you use the latest version of the appropriate driver. There are some older drivers that have file-size limits like the ones you experience.
Could this be related to a system limitation configuration ?
$ ulimit -a
vi /etc/security/limits.conf
vivek hard fsize 1024000
If you do not want any limit remove fsize from /etc/security/limits.conf.
If your system supports it, you can get hints with: man largefile.
You should use fseeko and ftello, see fseeko(3)
Note you should define #define _FILE_OFFSET_BITS 64
#define _FILE_OFFSET_BITS 64
#include <stdio.h>

Resources