Make/Execvp Error in Cygwin

The following error occurs in make, while trying to do incremental builds:
make[2]: execvp: C:/path/to/compiler.exe: Message too long
I suspect my problem here is the argument length for execvp. Any idea what that limit is? How would one go about changing that?
Some curious extra information: the same command succeeds when previous make dependencies are in a folder with a shorter name. Is the amount of memory available to execvp somehow affected by previous commands?
E.g. chopping 17 characters off the path to the incremental build files (of which there are hundreds) saves about 12k characters, and the 6k char command line to the compiler succeeds. Without reducing that path, the same command line fails.

CreateProcess() from Windows has the following limitations:
1) lpCommandLine [in, out, optional]
The command line to be executed. The maximum length of this string is 32,768 characters, including the Unicode terminating null character.
2) The ANSI version of this function, CreateProcessA fails if the total size of the environment block for the process exceeds 32,767 characters.
I had a similar problem caused by limitation 2), but I found no good solution. Recompiling Cygwin to use the Unicode version of CreateProcess() would probably help. For me it was sufficient to remove something from the environment.
Krzysztof Nowak

I'm getting this error because my %PATH% (which is taken from $PATH) is too long.
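To check whether the environment is what pushes you over limitation 2), you can measure it. Below is a minimal Python sketch, not anything Cygwin or make provides: the helper names (env_block_chars, largest_entries) are made up for illustration, and the size is only an estimate that assumes each entry is stored as "NAME=value" plus a terminating NUL, with one extra NUL ending the block.

```python
import os

# Limit quoted above for the environment block passed to CreateProcessA.
ENV_BLOCK_LIMIT = 32767


def env_block_chars(env):
    """Estimate the size of a Windows-style environment block.

    Assumption: each entry is "NAME=value" followed by a NUL, and the
    whole block ends with one additional NUL.
    """
    return sum(len(name) + 1 + len(value) + 1 for name, value in env.items()) + 1


def largest_entries(env, count=5):
    """Return the `count` biggest variables as (name, size) pairs, largest first."""
    sizes = {name: len(name) + 1 + len(value) + 1 for name, value in env.items()}
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)[:count]


if __name__ == "__main__":
    total = env_block_chars(os.environ)
    print(f"environment block: about {total} of {ENV_BLOCK_LIMIT} characters")
    for name, size in largest_entries(os.environ):
        print(f"  {name}: {size}")
```

Run from the same shell that launches make, this shows which variables (often PATH) are worth trimming. Variables that make exports to its child processes are added on top of what the shell reports.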


Debugging "too few bytes" error in GHCI

I'm running a bunch of Get monads with runGetState at various points in my code. They run on a lazy ByteString returned by readFile. There's a main function that calls a bunch of very short functions, each of which does a little reading.
When I run main in GHCI, I get the following:
<interactive>: too few bytes. Failed reading at byte position 1
That's all the information it provides. I have two questions:
Is there any way to obtain more debugging information from this error? Can I determine which particular invocation of runGetState failed? A line number would be very helpful. Any other debugging info I could get?
Any thoughts on why it might have failed at byte position 1? Is that zero-based? I.e. did it successfully read byte 0 but fail on 1, or did it fail on the first byte? For what it's worth, I can do print theLazyByteString, and it does print 33026, which is what I expected. So the file is not empty and appears to have been successfully opened for reading. My assumption is that "byte position 1" doesn't actually refer to a point early in the file itself, but the beginning of a runGetState invocation later on.
"too few bytes. Failed reading at byte position" is the error you get (in binary < 0.6) when getBytes is called with an argument larger than the remaining input, or when getLazyByteStringNul is called and the remaining input doesn't contain a 0 byte. You also get it when some client code calls fail "too few bytes".
Is there any way to obtain more debugging information from this error?
No, that's all you can get from that error; it doesn't know more than that.
Can I determine which particular invocation of runGetState failed? A line number would be very helpful. Any other debugging info I could get?
That is possible. You can use the ghci debugger (set breakpoints on the candidates and step through them), or you can insert some trace calls (import Debug.Trace) at strategic points in the source to see where you are.
Any thoughts on why it might have failed at byte position 1? Is that zero-based? I.e. did it successfully read byte 0 but fail on 1, or did it fail on the first byte?
It's zero-based (the number is the number of bytes read before). As to why it failed, I can't tell without seeing the source and the input.
My assumption is that "byte position 1" doesn't actually refer to a point early in the file itself, but the beginning of a runGetState invocation later on.
Not unlikely. That depends on what offset argument you pass to the runGetState calls.
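To make the "zero-based, number of bytes read before" point concrete, here is a small analogy in Python. It is not the binary package's API: the Reader class and TooFewBytes exception are invented for illustration, and the offset parameter only plays the role of the offset argument passed to runGetState.

```python
class TooFewBytes(Exception):
    """Raised when a read asks for more bytes than remain."""


class Reader:
    def __init__(self, data, offset=0):
        self.data = data
        self.consumed = 0      # bytes read so far from `data`
        self.offset = offset   # starting position supplied by the caller

    def get_bytes(self, n):
        if n > len(self.data) - self.consumed:
            # The reported position is the count of bytes consumed before
            # the failing read, plus the caller-supplied starting offset.
            position = self.offset + self.consumed
            raise TooFewBytes(
                f"too few bytes. Failed reading at byte position {position}"
            )
        chunk = self.data[self.consumed:self.consumed + n]
        self.consumed += n
        return chunk
```

In this model "byte position 1" can arise in two ways: one byte was read successfully and the next read asked for too much, or the very first read failed on a reader that was started with offset 1. That is why the offsets you pass matter when interpreting the message.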

gdb break when program opens specific file

Back story: While running a program under strace I notice that '/dev/urandom' is being open'ed. I would like to know where this call is coming from (it is not part of the program itself, it is part of the system).
So, using gdb, I am trying to break (using catch syscall open) program execution when the open call is issued, so I can see a backtrace. The problem is that open is being called a lot, several hundred times, so I can't narrow down the specific call that is opening /dev/urandom. How should I go about narrowing down the specific call? Is there a way to filter by arguments, and if so how do I do it for a syscall?
Any advice would be helpful -- maybe I am going about this all wrong.
GDB is a pretty powerful tool, but has a bit of a learning curve.
Basically, you want to set up a conditional breakpoint.
First use the -i flag to strace or objdump -d to find the address of the open function or more realistically something in the chain of getting there, such as in the plt.
Set a breakpoint at that address (if you have debug symbols, you can use those instead, omitting the *, but I'm assuming you don't, though you may well have them for library functions if nothing else).
break * 0x080482c8
Next you need to make it conditional
(Ideally you could compare a string argument to a desired string. I wasn't getting this to work within the first few minutes of trying)
Let's hope we can assume the string is a constant somewhere in the program or one of the libraries it loads. You could look in /proc/pid/maps to get an idea of what is loaded and where, then use grep to verify the string is actually in a file, objdump -s to find its address, and gdb to verify that you've actually found it in memory by combining the high part of the address from maps with the low part from the file. (EDIT: it's probably easier to use ldd on the executable than look in /proc/pid/maps)
Next you will need to know something about the ABI of the platform you are working on, specifically how arguments are passed. I've been working on ARM lately, and that's very nice as the first few arguments just go in registers r0, r1, r2... etc. x86 is a bit less convenient - it seems they go on the stack, i.e., *($esp+4), *($esp+8), *($esp+12).
So let's assume we are on an x86, and we want to check that the first argument in esp+4 equals the address we found for the constant we are trying to catch it passing. Only, esp+4 is a pointer to a char pointer. So we need to dereference it for comparison.
cond 1 *(char **)($esp+4)==0x8048514
Then you can type run and hope for the best
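If you hit your breakpoint condition, and what you see with info registers and the x command (to examine memory) looks right, then you can use bt to print the call stack, or finish to run back up it one frame at a time, until you find something you recognize. (The return command also pops frames, but it abandons the rest of the function instead of executing it.)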
(Adapted from a question edit)
Following Chris's answer, here is the process that eventually got me what I was looking for:
(I am trying to find what functions are calling the open syscall on "/dev/urandom")
use ldd on executable to find loaded libraries
grep through each lib (shell command) looking for 'urandom'
open library file in hex editor and find address of string
find out how parameters are passed in syscalls (for open, file is the first parameter; on x86_64 it is passed in rdi, but your mileage may vary)
now we can set the conditional breakpoint: break open if $rdi == _addr_
run program and wait for break to hit
run bt to see backtrace
After all this I find that glib's g_random_int() and g_rand_new() use urandom. Gtk+ and ORBit were calling these functions -- if anybody was curious.
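The "grep each library, then find the string's address in a hex editor" steps above can be sketched as a short Python script. This is an illustration only: find_string_offsets is a made-up helper, and the library paths in the usage example are hypothetical placeholders for whatever ldd prints on your system.

```python
def find_string_offsets(path, needle):
    """Return every file offset at which the byte string `needle` occurs."""
    with open(path, "rb") as handle:
        data = handle.read()
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets


if __name__ == "__main__":
    # Hypothetical paths; substitute the libraries listed by ldd.
    for library in ["/usr/lib/libexample.so", "/usr/lib/libother.so"]:
        try:
            hits = find_string_offsets(library, b"/dev/urandom")
        except OSError:
            continue  # library not present on this machine
        for offset in hits:
            print(f"{library}: file offset {offset:#x}")
```

These are offsets within the file, not run-time addresses. As described above, you still need to combine them with where the library is mapped (from /proc/pid/maps) and check the result in gdb before using it in a breakpoint condition.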
Like Andre Puel said:
break open if strcmp($rdi,"/dev/urandom") == 0
Might do the job.

Maximum number of Bash arguments != max num cp arguments?

I have recently been copying and moving a large number of files (~400,000). I know that there are limitations on the number of arguments that can be expanded on the Bash command line, so I have been using xargs to limit the numbers produced.
Out of curiosity, I wondered what the maximum number of arguments that I could use was, and I found this post saying that it was system-dependent, and that I could run this command to find out:
$ getconf ARG_MAX
To my surprise, the answer I got back was:
Just over 2.6 million. As I said, the number of files that I am manipulating is much less than this -- around 400k. I definitely need to use the xargs method of moving and copying these files, because I tried using a normal mv * ... or cp * ... and got an 'Argument list too long' error.
So, do the mv and cp commands have their own fixed limit on the number of arguments that I can use (I couldn't find anything in their man pages), or am I missing something?
As Ignacio said, ARG_MAX is the maximum length of the buffer of arguments passed to exec(), not the maximum number of files (this page has a very in-depth explanation). Specifically, it lists fs/exec.c as checking the following condition:
(PAGE_SIZE*MAX_ARG_PAGES-sizeof(void *)) / sizeof(void *)
And, it seems, you have some additional limitations:
On a 32-bit Linux, this is ARG_MAX/4-1 (32767). This becomes relevant if the average length of arguments is smaller than 4.
Since Linux 2.6.23, this function tests if the number exceeds MAX_ARG_STRINGS in <linux/binfmts.h> (0x7FFFFFFF = 2^31-1 = 2147483647).
And as additional limit, one argument must not be longer than MAX_ARG_STRLEN (131072).
ARG_MAX is the maximum length of the arguments to the exec(3) functions. A shell is not required to support passing this length of arguments from its command line.
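A quick calculation shows why 400,000 files can overflow a limit of about 2.6 million: the limit counts bytes, not arguments. The Python sketch below is a simplified estimate. The helper arg_bytes is invented for illustration, the 7-character file names are hypothetical, and the real accounting also includes the environment and per-argument pointers, so actual headroom is smaller.

```python
import os


def arg_bytes(args):
    """Bytes needed for the argument strings: each one plus its trailing NUL."""
    return sum(len(os.fsencode(arg)) + 1 for arg in args)


def fits(args, limit):
    """True if the argument strings alone stay within `limit` bytes."""
    return arg_bytes(args) <= limit


if __name__ == "__main__":
    # Hypothetical file names of 7 characters each, e.g. "f000001".
    names = [f"f{i:06d}" for i in range(400_000)]
    assumed_limit = 2_600_000  # "just over 2.6 million", rounded down
    print("bytes needed:", arg_bytes(names))
    print("fits in assumed limit:", fits(names, assumed_limit))
    try:
        print("this system's ARG_MAX:", os.sysconf("SC_ARG_MAX"))
    except (AttributeError, ValueError, OSError):
        print("ARG_MAX is not available through os.sysconf on this platform")
```

With names this short the strings alone already need 3,200,000 bytes, so the glob expansion fails even though the argument count is far below 2.6 million. Longer names, or globbing with a directory prefix, make it worse.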

Doing file operations with 64-bit addresses in C + MinGW32

I'm trying to read in a 24 GB XML file in C, but it won't work. I'm printing out the current position using ftell() as I read it in, but once it gets to a big enough number, it goes back to a small number and starts over, never even getting 20% through the file. I assume this is a problem with the range of the variable that's used to store the position (long), which can go up to about 4,000,000,000, while my file is 25,000,000,000 bytes in size. A long long should work, but how would I change what my compiler (Cygwin/mingw32) uses or get it to have fopen64?
The ftell() function returns a long, which is only 32 bits wide on 32-bit systems (and on Windows), so it tops out at 2^31 - 1 bytes (2 GB); even an unsigned 32-bit value would stop at 2^32 bytes (4 GB). Either way, the file offset for a 24 GB file cannot fit in a 32-bit long.
You may have the ftell64() function available, or the standard fgetpos() function may return a larger offset to you.
You might try using the OS provided file functions CreateFile and ReadFile. According to the File Pointers topic, the position is stored as a 64bit value.
Unless you can use a 64-bit method as suggested by Loadmaster, I think you will have to break the file up.
This resource seems to suggest it is possible using _telli64(). I can't test this though, as I don't use mingw.
I don't know of any way to do this in one file. It's a bit of a hack, but if splitting the file up properly isn't a real option, you could write a few functions that temporarily split the file: one that uses ftell() to move through the current piece and switches to the next piece when it reaches the split point, and another that stitches the pieces back together before exiting. It's an absolutely botched-up approach, but if no better solution comes to light it could be a way to get the job done.
I found the answer. Instead of using fopen, fseek, fread, fwrite... I'm using _open, _lseeki64, read, write. And I am able to write and seek in > 4GB files.
Edit: It seems the latter functions are about 6x slower than the former ones. I'll give the bounty to anyone who can explain that.
Edit: Oh, I learned here that read() and friends are unbuffered. What is the difference between read() and fread()?
Even if the ftell() in the Microsoft C library returns a 32-bit value and thus obviously will return bogus values once you reach 2 GB, just reading the file should still work fine. Or do you need to seek around in the file, too? For that you need _ftelli64() and _fseeki64().
Note that unlike some Unix systems, you don't need any special flag when opening the file to indicate that it is in some "64-bit mode". The underlying Win32 API handles large files just fine.
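The "goes back to a small number and starts over" symptom is what you would expect when a 64-bit offset is squeezed into 32 bits. Here is a small Python sketch of that arithmetic. It only models the wrap-around (wrap_int32 is a made-up helper) and does not call any C library function.

```python
def wrap_int32(offset):
    """Value seen when `offset` is stored in a signed 32-bit integer."""
    low = offset & 0xFFFFFFFF          # keep the low 32 bits
    return low - 0x100000000 if low >= 0x80000000 else low


if __name__ == "__main__":
    for offset in (2**31 - 1, 2**31, 2**32, 2**32 + 10, 25_000_000_000):
        print(f"{offset:>14,} -> {wrap_int32(offset):>14,}")
```

The last position a 32-bit long can report correctly is 2^31 - 1. One byte later the value turns negative, and at 2^32 it is back to 0, which is well before 20% of a 25,000,000,000-byte file (5,000,000,000 bytes). A 64-bit offset, as returned by _ftelli64(), does not have this problem.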

Core dump file name truncated

Given the configuration in /proc/sys/kernel/core_pattern set to /cores/core.%e.%p, core dumps are named according to pattern, however for processes running executables with long names e.g. SampleCrashApplication, the generated core file will contain a truncated executable name: /cores/core.SampleCrashAppl.9933
What is causing this? The man core page talks only about the maximum size of the resulting core filename being 128 (64 for kernels before 2.6.19).
The code for this can be found in exec.c here.
The code is going to copy the corename based on the pattern up to the first percentage (giving /cores/core.). At the percentage it's going to increment and process the 'e'. The code for processing the 'e' part prints out the pattern using snprintf based on the current->comm structure.
This is the executable name (excluding path) TRUNCATED to the value TASK_COMM_LEN. Since this is defined as 16 characters (at least in the Kernel I found) then SampleCrashApplication is truncated to 15 + 1 characters (1 for the null byte at the end) which explains why you get your truncated core dump name.
As to why this structure truncates the name to TASK_COMM_LEN, that's a deeper question, but it's something internal to the kernel and there's some discussion here.
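The truncation described above can be reproduced with a few lines of Python. This is a simplified model, not the kernel's code: core_name and comm_name are invented helpers, only the %e, %p and %% specifiers are handled, and TASK_COMM_LEN is assumed to be 16 as quoted in the answer.

```python
import os

TASK_COMM_LEN = 16  # value quoted above; includes the terminating NUL


def comm_name(executable):
    """Executable name without its path, cut to TASK_COMM_LEN - 1 characters."""
    return os.path.basename(executable)[:TASK_COMM_LEN - 1]


def core_name(pattern, executable, pid):
    """Expand %e, %p and %% in a core_pattern-style string (sketch only)."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%" and i + 1 < len(pattern):
            spec = pattern[i + 1]
            if spec == "e":
                out.append(comm_name(executable))
            elif spec == "p":
                out.append(str(pid))
            elif spec == "%":
                out.append("%")
            # Any other specifier is simply dropped in this sketch.
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(core_name("/cores/core.%e.%p", "SampleCrashApplication", 9933))
```

With the pattern and executable name from the question, the 22-character name is cut to its first 15 characters, "SampleCrashAppl", which matches the core file name that was observed.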
