Dancer randomly hangs when reading GET request - linux

I am playing with Perl Dancer on Linux, and all is nice and dandy if the browser connects to the server directly via LAN. However, when I connect via WAN and the browser is IE9, the busy cursor occasionally won't go away.
I can provoke this by reloading the page approximately 10 times in a row. I get this problem even when I wait several seconds between reloads. The page itself is awfully simple and passes the W3C check.
It makes no difference whether I run Dancer as root, or whether the port is 80 or 3000. I also tested frequent reloading of a page served by Apache, and there does not seem to be an issue.
I ran strace, and I have the impression that the request data is sometimes not available at the time Dancer tries to read it. This is what the trace looks like.
When it works:
{sa_family=AF_INET, sin_port=htons(52073), sin_addr=inet_addr("78.42.213.92")}, [16]) = 4
ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfab5028) = -1 ENOTTY (Inappropriate ioctl for device)
_llseek(4, 0, 0xbfab5070, SEEK_CUR) = -1 ESPIPE (Illegal seek)
ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfab5028) = -1 ENOTTY (Inappropriate ioctl for device)
_llseek(4, 0, 0xbfab5070, SEEK_CUR) = -1 ESPIPE (Illegal seek)
fcntl64(4, F_SETFD, FD_CLOEXEC) = 0
getpeername(4, {sa_family=AF_INET, sin_port=htons(52073), sin_addr=inet_addr("78.42.213.92")}, [16]) = 0
read(4, "G", 1) = 1
read(4, "E", 1) = 1
read(4, "T", 1) = 1
When it hangs:
{sa_family=AF_INET, sin_port=htons(52225), sin_addr=inet_addr("78.42.213.92")}, [16]) = 4
ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfab5028) = -1 ENOTTY (Inappropriate ioctl for device)
_llseek(4, 0, 0xbfab5070, SEEK_CUR) = -1 ESPIPE (Illegal seek)
ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfab5028) = -1 ENOTTY (Inappropriate ioctl for device)
_llseek(4, 0, 0xbfab5070, SEEK_CUR) = -1 ESPIPE (Illegal seek)
fcntl64(4, F_SETFD, FD_CLOEXEC) = 0
getpeername(4, {sa_family=AF_INET, sin_port=htons(52225), sin_addr=inet_addr("78.42.213.92")}, [16]) = 0
read(4,
and then it sits there forever. Any idea what I can do?

I ran into a similar problem with IE9 connecting to a Catalyst dev server. Eric Lawrence (IE Team Lead!?) suggested it might be due to IE9's background connection feature. IE9 opens a background TCP connection to speed up future requests to the server, but this causes problems for single-threaded servers: the server blocks reading from the idle background connection, so the real request is never served. If you're running Dancer's default dev server (HTTP::Server::Simple::PSGI), you won't be able to handle IE9.
I worked around it by proxying from Apache. It makes dev a little more of a hassle, but only when I have to test IE9.
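To illustrate the failure mode seen in the strace output (a blocking read() on a connection that never sends anything), here is a minimal C sketch of a read guarded by poll(). The helper name read_with_timeout and its return codes are made up for this example; it is not how Dancer or HTTP::Server::Simple is implemented, only a sketch of how a single-threaded server could avoid waiting forever on an idle connection.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for data on fd, then read it.
 * Returns the number of bytes read (0 means EOF), -1 on error,
 * or -2 if nothing arrived before the timeout expired. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready;

    /* Retry if a signal interrupts the wait. */
    do {
        ready = poll(&pfd, 1, timeout_ms);
    } while (ready < 0 && errno == EINTR);

    if (ready < 0)
        return -1;          /* poll() itself failed */
    if (ready == 0)
        return -2;          /* idle connection: give up instead of blocking */
    return read(fd, buf, len);
}
```

A server loop could close the connection when the helper returns -2 and go back to accept(), so that an idle background connection delays a real request by at most the timeout.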

Process killed by SIGHUP after read returns ERESTARTSYS

We have some application which calls a PHP script which connects to an Oracle DB to do certain things. :) This does not work out well sometimes.
We are now running the PHP part via strace from the beginning.
This is how it looks when everything works ok (everything works out, the DB connection is built, the query executed, the DB is again disconnected, etc.):
10:30:17.935486 connect(8, {sa_family=AF_INET, sin_port=htons(1521), sin_addr=inet_addr("10.1.1.55")}, 16) = -1 EINPROGRESS (Operation now in progress)
10:30:17.935546 times(NULL) = 2908590046
10:30:17.935569 brk(0xda4000) = 0xda4000
10:30:17.935594 poll([{fd=8, events=POLLOUT}], 1, 60000) = 1 ([{fd=8, revents=POLLOUT}])
10:30:17.940338 getsockopt(8, SOL_SOCKET, SO_ERROR, [519270883345301504], [4]) = 0
10:30:17.940368 fcntl(8, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
10:30:17.940388 fcntl(8, F_SETFL, O_RDWR) = 0
10:30:17.940408 getsockname(8, {sa_family=AF_INET, sin_port=htons(62498), sin_addr=inet_addr("192.168.22.30")}, [16]) = 0
10:30:17.940437 getsockopt(8, SOL_SOCKET, SO_SNDBUF, [-4193870156763480064], [4]) = 0
10:30:17.940458 getsockopt(8, SOL_SOCKET, SO_RCVBUF, [-4193870156763409068], [4]) = 0
10:30:17.940483 setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0
10:30:17.940506 fcntl(8, F_SETFD, FD_CLOEXEC) = 0
10:30:17.940652 rt_sigaction(SIGPIPE, {0x1, ~[ILL ABRT BUS FPE SEGV USR2 TERM XCPU XFSZ SYS RTMIN RT_1], SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x7f7198b2b920}, {0x1, [PIPE], SA_RESTORER|SA_RESTART, 0x7f7198b2b920}, 8) = 0
10:30:17.940725 write(8, "\x00\xe8\x00\x00\x01\x00\x00\x00\x01\x3b\x01\x2c\x0c\x41\x20\x00\xff\xff\x7f\x08\x00\x00\x01\x00\x00\xa2\x00\x46\x00\x00\x08\x00"..., 232) = 232
10:30:17.940781 read(8, "\x00\x08\x00\x00\x0b\x00\x00\x00", 8208) = 8
10:30:17.974177 write(8, "\x00\xe8\x00\x00\x01\x00\x00\x00\x01\x3b\x01\x2c\x0c\x41\x20\x00\xff\xff\x7f\x08\x00\x00\x01\x00\x00\xa2\x00\x46\x00\x00\x08\x00"..., 232) = 232
10:30:17.974247 read(8, "\x00\x29\x00\x00\x02\x00\x00\x00\x01\x3b\x0c\x41\x00\x00\x00\x00\x01\x00\x00\x00\x00\x29\x51\x41\x00\x00\x00\x00\x00\x00\x00\x00"..., 8208) = 41
10:30:17.976465 write(8, "\x00\x00\x00\xa4\x06\x20\x00\x00\x00\x00\xde\xad\xbe\xef\x00\x9a\x00\x00\x00\x00\x00\x04\x00\x00\x04\x00\x03\x00\x00\x00\x00\x00"..., 164) = 164
....
This is how it looks when everything does not work ok:
10:23:24.888170 connect(8, {sa_family=AF_INET, sin_port=htons(1521), sin_addr=inet_addr("10.1.1.55")}, 16) = -1 EINPROGRESS (Operation now in progress)
10:23:24.888241 times(NULL) = 2908548738
10:23:24.888263 brk(0xda4000) = 0xda4000
10:23:24.888287 poll([{fd=8, events=POLLOUT}], 1, 60000) = 1 ([{fd=8, revents=POLLOUT}])
10:23:24.889769 getsockopt(8, SOL_SOCKET, SO_ERROR, [519270883345301504], [4]) = 0
10:23:24.889807 fcntl(8, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
10:23:24.889827 fcntl(8, F_SETFL, O_RDWR) = 0
10:23:24.889845 getsockname(8, {sa_family=AF_INET, sin_port=htons(62473), sin_addr=inet_addr("192.168.22.30")}, [16]) = 0
10:23:24.889873 getsockopt(8, SOL_SOCKET, SO_SNDBUF, [-8374476973480591360], [4]) = 0
10:23:24.889892 getsockopt(8, SOL_SOCKET, SO_RCVBUF, [-8374476973480520364], [4]) = 0
10:23:24.889915 setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0
10:23:24.889936 fcntl(8, F_SETFD, FD_CLOEXEC) = 0
10:23:24.890062 rt_sigaction(SIGPIPE, {0x1, ~[ILL ABRT BUS FPE SEGV USR2 TERM XCPU XFSZ SYS RTMIN RT_1], SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x7f2ee24b4920}, {0x1, [PIPE], SA_RESTORER|SA_RESTART, 0x7f2ee24b4920}, 8) = 0
10:23:24.890129 write(8, "\x00\xe8\x00\x00\x01\x00\x00\x00\x01\x3b\x01\x2c\x0c\x41\x20\x00\xff\xff\x7f\x08\x00\x00\x01\x00\x00\xa2\x00\x46\x00\x00\x08\x00"..., 232) = 232
10:23:24.890186 read(8, 0xd705a6, 8208) = ? ERESTARTSYS (To be restarted)
10:23:24.907853 --- SIGHUP (Hangup) # 0 (0) ---
10:23:24.908708 +++ killed by SIGHUP +++
This happens sometimes and the application (or at least the PHP script and the connection to the DB) just gets killed. That's bad.
What do you make of the above straces?
Can we tell who is killed by who?
Why would read() return ERESTARTSYS?
What does SIGHUP (Hangup) # 0 (0) tell us exactly?
Your process got sent a SIGHUP, which caused the normal action of exiting.
You can't tell who sent it from this output. Try a newer version of strace; as far as I can tell, every version since 4.6 (from 2011) displays more information. The version of strace you are using predates that, and the # 0 (0) shows the PC of the process when the signal was received and the address associated with the signal from siginfo_t. Neither will tell you anything about this problem.
A newer version will supply something like this:
--- SIGHUP {si_signo=SIGHUP, si_code=SI_USER, si_pid=25064, si_uid=1000} ---
--- SIGHUP {si_signo=SIGHUP, si_code=SI_KERNEL} ---
The first is another process sending the SIGHUP. The second is one sent automatically by the kernel because of certain events.
The latter can happen when the controlling terminal of the process closes or when the session leader exits because its terminal closed. If you determine it's the kernel sending the signal, then I'd look at your process while it's running and examine the "sid" and "tty" columns in the ps output. That will tell you the session leader and terminal responsible for causing the SIGHUP to be sent. Maybe sometimes your script has a controlling terminal and sometimes not?
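If upgrading strace is not an option, the same siginfo_t fields can be captured by the process itself. The following is a minimal C sketch, assuming you can run a small C wrapper or test program in place of the PHP script; the names on_sighup, install_sighup_logger and the three global variables are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Filled in by the handler. Only simple stores are done in signal context. */
volatile sig_atomic_t hup_seen;
volatile sig_atomic_t last_hup_code;   /* SI_USER, SI_KERNEL, ... */
volatile sig_atomic_t last_hup_pid;    /* sender pid, meaningful for SI_USER */

static void on_sighup(int signo, siginfo_t *info, void *ctx)
{
    (void)signo;
    (void)ctx;
    last_hup_code = info->si_code;
    last_hup_pid = (sig_atomic_t)info->si_pid;
    hup_seen = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_sighup_logger(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sighup;
    sa.sa_flags = SA_SIGINFO | SA_RESTART;  /* SA_SIGINFO gives us siginfo_t */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGHUP, &sa, NULL);
}
```

After a SIGHUP arrives, the main program can log last_hup_code and last_hup_pid from normal (non-signal) context to see whether another process or the kernel sent it.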
The session leader would usually be the parent process that started your script, or the parent of that process, and so on up the chain. Looking at the ps output and "sid" will tell you. If that leader process exits and has a controlling terminal, everything under it gets a SIGHUP. The way to solve this is either to have the leader not exit until the PHP process has finished, or to detach from that session or terminal at some point. Usually a daemon or server process should not be associated with a terminal. See daemon() and setsid().
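As a rough illustration of the setsid() approach, here is a minimal C sketch. The function name child_detaches_from_session is made up for this example; it only forks a child, calls setsid() in it, and reports whether the child ended up leading its own session. A real launcher would go on to exec the worker (for example the PHP script) in the child instead of exiting.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child, detach it with setsid(), and report
 * whether the child became the leader of a new session.
 * Returns 1 if it did, 0 if it did not, -1 on error. */
int child_detaches_from_session(void)
{
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* A freshly forked child is not a process group leader,
         * so setsid() is allowed to succeed here. */
        if (setsid() == (pid_t)-1)
            _exit(2);
        /* In the new session the child has no controlling terminal.
         * A real launcher would exec the worker program at this point. */
        _exit(getsid(0) == getpid() ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```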

odd delay in strace log when apache serving static content

Using Firefox's developer tools, I noticed that static image loads from my site spent between 60 and 200 ms in the "Connect" phase. I took an strace log and can see where the delay happens, but I am having trouble understanding why. Here is what I see in strace (log slightly edited to remove actual file paths):
30727 22:20:52.526972 getsockname(12, {sa_family=AF_INET6, sin6_port=htons(80), inet_pton(AF_INET6, "::ffff:192.168.100.62", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
30727 22:20:52.527203 fcntl(12, F_GETFL) = 0x2 (flags O_RDWR)
30727 22:20:52.527275 fcntl(12, F_SETFL, O_RDWR|O_NONBLOCK) = 0
30727 22:20:52.527419 read(12, "GET /images/blah/bla"..., 8000) = 540
30727 22:20:52.527737 brk(0x7f72665b7000) = 0x7f72665b7000
30727 22:20:52.527972 stat("/websites/imagesrepository/blah/blah/image.png", {st_mode=S_IFREG|0775, st_size=$
30727 22:20:52.528160 open("/.htaccess", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
30727 22:20:52.528254 open("/websites/.htaccess", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
30727 22:20:52.528335 open("/websites/imagesrepository/.htaccess", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
30727 22:20:52.528417 open("/websites/imagesrepository/blah/.htaccess", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
30727 22:20:52.528501 open("/websites/imagesrepository/blah/blah/.htaccess", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
30727 22:20:52.529090 stat("/websites/somefile.ext", 0x7f7266582d90) = -1 ENOENT (No such file or directory)
30727 22:20:52.529144 stat("/websites/someotherfile.ext", 0x7f72665812d0) = -1 ENOENT (No such file or directory)
30727 22:20:52.529431 open("/websites/imagesrepository/blah/blah/image.png", O_RDONLY|O_CLOEXEC) = 13
30727 22:20:52.529489 fcntl(13, F_GETFD) = 0x1 (flags FD_CLOEXEC)
30727 22:20:52.529519 fcntl(13, F_SETFD, FD_CLOEXEC) = 0
30727 22:20:52.529582 close(13) = 0
30727 22:20:52.529669 writev(12, [{"HTTP/1.1 304 Not Modified\r\nDate:"..., 209}], 1) = 209
30727 22:20:52.529773 write(10, "104.189.168.66 - - [04/Mar/2015:"..., 270) = 270
30727 22:20:52.529842 shutdown(12, 1 /* send */) = 0
30727 22:20:52.529905 poll([{fd=12, events=POLLIN}], 1, 2000) = 1 ([{fd=12, revents=POLLIN|POLLHUP}])
30727 22:20:52.687759 read(12, "", 512) = 0
30727 22:20:52.687849 close(12) = 0
30727 22:20:52.687951 read(8, 0x7fffc3eb081f, 1) = -1 EAGAIN (Resource temporarily unavailable)
30727 22:20:52.688062 semop(3145795, {{0, -1, SEM_UNDO}}, 1 <unfinished ...>
It looks like the delay occurs here:
30727 22:20:52.529905 poll([{fd=12, events=POLLIN}], 1, 2000) = 1 ([{fd=12, revents=POLLIN|POLLHUP}])
30727 22:20:52.687759 read(12, "", 512) = 0
File descriptor 12 is the socket the request came in on. So it looks like before closing the socket Apache polls for some event and then tries a read? Why? I have KeepAlive off in my config file (due to incompatibility with certain older mobile clients) if that matters.
EDITED TO ADD:
I dug into the Apache 2.2 source and this behavior seems to be intentional. The relevant function is ap_lingering_close() in server/connection.c:
/* we now proceed to read from the client until we get EOF, or until
 * MAX_SECS_TO_LINGER has passed. the reasons for doing this are
 * documented in a draft:
 *
 * http://www.ics.uci.edu/pub/ietf/http/draft-ietf-http-connection-00.txt
 *
 * in a nutshell -- if we don't make this effort we risk causing
 * TCP RST packets to be sent which can tear down a connection before
 * all the response data has been sent to the client.
 */
#define SECONDS_TO_LINGER 2
AP_DECLARE(void) ap_lingering_close(conn_rec *c)
{
    char dummybuf[512];
    apr_size_t nbytes;
    apr_time_t timeup = 0;
    apr_socket_t *csd = ap_get_module_config(c->conn_config, &core_module);
    if (!csd) {
        return;
    }
    ap_update_child_status(c->sbh, SERVER_CLOSING, NULL);
#ifdef NO_LINGCLOSE
    ap_flush_conn(c); /* just close it */
    apr_socket_close(csd);
    return;
#endif
    /* Close the connection, being careful to send out whatever is still
     * in our buffers. If possible, try to avoid a hard close until the
     * client has ACKed our FIN and/or has stopped sending us data.
     */
    /* Send any leftover data to the client, but never try to again */
    ap_flush_conn(c);
    if (c->aborted) {
        apr_socket_close(csd);
        return;
    }
    /* Shut down the socket for write, which will send a FIN
     * to the peer.
     */
    if (apr_socket_shutdown(csd, APR_SHUTDOWN_WRITE) != APR_SUCCESS
        || c->aborted) {
        apr_socket_close(csd);
        return;
    }
    /* Read available data from the client whilst it continues sending
     * it, for a maximum time of MAX_SECS_TO_LINGER. If the client
     * does not send any data within 2 seconds (a value pulled from
     * Apache 1.3 which seems to work well), give up.
     */
    apr_socket_timeout_set(csd, apr_time_from_sec(SECONDS_TO_LINGER));
    apr_socket_opt_set(csd, APR_INCOMPLETE_READ, 1);
    /* The common path here is that the initial apr_socket_recv() call
     * will return 0 bytes read; so that case must avoid the expensive
     * apr_time_now() call and time arithmetic. */
    do {
        nbytes = sizeof(dummybuf);
        if (apr_socket_recv(csd, dummybuf, &nbytes) || nbytes == 0)
            break;
        if (timeup == 0) {
            /*
             * First time through;
             * calculate now + 30 seconds (MAX_SECS_TO_LINGER).
             *
             * If some module requested a shortened waiting period, only wait
             * for 2s (SECONDS_TO_LINGER). This is useful for mitigating
             * certain DoS attacks.
             */
            if (apr_table_get(c->notes, "short-lingering-close")) {
                timeup = apr_time_now() + apr_time_from_sec(SECONDS_TO_LINGER);
            }
            else {
                timeup = apr_time_now() + apr_time_from_sec(MAX_SECS_TO_LINGER);
            }
            continue;
        }
    } while (apr_time_now() < timeup);
    apr_socket_close(csd);
    return;
}
Still: is it normal for this approach to add 50-150ms to every request? That seems excessive.
EDITED AGAIN TO ADD:
Based on this ancient discussion among Apache devs:
http://webmail.dev411.com/t/apache/dev/9727khxsk3/apache-1-2b7-dev-performance/oldest
and specifically this comment from Roy T. Fielding:
http://webmail.dev411.com/t/apache/dev/9727khxsk3/apache-1-2b7-dev-performance/oldest#19970209aph7g21hj2jepa04nk28ywj6cr
it sounds like this may be a bug. If KeepAlive is Off then it sounds like there's no need to do a lingering close because the client has already been told the connection won't be persistent.
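For readers who want to see the shutdown/poll/read/close sequence from the strace log without the APR layer, here is a minimal C sketch on a plain socket. It is not Apache's code: the name lingering_close and its return codes are made up for this example, and the timeout applies to each wait rather than to the whole linger period as in ap_lingering_close().

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: half-close fd, then drain it until the peer closes.
 * Returns 0 if the peer sent EOF, 1 if we gave up after timeout_ms of
 * silence, -1 on error. The socket is closed in every case. */
int lingering_close(int fd, int timeout_ms)
{
    char dummy[512];
    int result;

    /* Shut down our sending side; this is what sends the FIN. */
    if (shutdown(fd, SHUT_WR) < 0) {
        close(fd);
        return -1;
    }

    for (;;) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int ready = poll(&pfd, 1, timeout_ms);
        ssize_t n;

        if (ready < 0 && errno == EINTR)
            continue;                       /* interrupted, wait again */
        if (ready < 0) { result = -1; break; }
        if (ready == 0) { result = 1; break; }  /* peer stayed silent */

        n = read(fd, dummy, sizeof dummy);
        if (n == 0) { result = 0; break; }  /* EOF: peer closed its side */
        if (n < 0) { result = -1; break; }
        /* n > 0: discard the data and keep waiting for EOF */
    }

    close(fd);
    return result;
}
```

With this shape, the time spent in poll() is simply the time the client takes to close its side of the connection after receiving the FIN, which is consistent with the gap between the poll() and read() lines in the log.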

In Linux initrd image, ethernet fails to work

I am working on the SABRE SD development board, which uses the i.MX6 Quad processor. I have developed an initrd image for this board. The kernel boots and the initrd image is mounted successfully. Even the FEC Ethernet driver loads properly.
But during the init process, DHCP fails to obtain an IP address for the Ethernet device.
Running strace on the 'dhcp' command produced the following log.
In the log, a select system call times out, causing the error. A relevant portion of the log is given below.
socket(PF_INET, SOCK_RAW, IPPROTO_RAW) = 6
ioctl(6, SIOCGIFINDEX, {ifr_name="eth0", ifr_index=2}) = 0
ioctl(6, SIOCGIFHWADDR, {ifr_name="eth0", ifr_hwaddr=00:04:9f:02:b3:81}) = 0
close(6) = 0
clock_gettime(CLOCK_MONOTONIC, {53, 815520338}) = 0
write(1, "Sending discover...\n", 20Sending discover...
) = 20
socket(PF_PACKET, SOCK_DGRAM, 8) = 6
bind(6, {sa_family=AF_PACKET, proto=0x800, if2, pkttype=PACKET_HOST, addr(6)={0, ffffffffffff}, 20) = 0
sendto(6, "E\0\0014\0\0\0\0#\21y\272\0\0\0\0\377\377\377\377\0D\0C\1 ,h\1\1\6\0"..., 308, 0, {sa_family=AF_PACKET, proto=0x800, if2, pkttype=PACKET_HOST, add8
close(6) = 0
fcntl64(5, F_SETFD, FD_CLOEXEC) = 0
clock_gettime(CLOCK_MONOTONIC, {53, 990583005}) = 0
select(6, [3 5], NULL, NULL, {3, 0}) = 0 (Timeout)
But when the same rootfs that is used in the initrd image is booted from the SD card instead, the dhcp command does not fail.
Can anyone help me with some clues?
with regards,
Vivek

Allocating Memory to a recursive function

I wrote a simple program as below and straced it.
#include <stdio.h>
#include <fcntl.h>   /* declares open() */
int foo(int i)
{
    int k=9;
    if(i==10)
        return 1;
    else
        foo(++i);
    open("1",1);
}
int main()
{
    foo(1);
}
My intention was to check how memory is allocated on the stack for a function's variables (int k in this case). I used an open system call as a marker. The output of strace was as follows:
execve("./a.out", ["./a.out"], [/* 25 vars */]) = 0
brk(0) = 0x8653000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb777e000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=95172, ...}) = 0
mmap2(NULL, 95172, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7766000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0000\226\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1734120, ...}) = 0
mmap2(NULL, 1743580, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb75bc000
mmap2(0xb7760000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a4) = 0xb7760000
mmap2(0xb7763000, 10972, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7763000
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75bb000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb75bb900, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
mprotect(0xb7760000, 8192, PROT_READ) = 0
mprotect(0x8049000, 4096, PROT_READ) = 0
mprotect(0xb77a1000, 4096, PROT_READ) = 0
munmap(0xb7766000, 95172) = 0
open("1", O_WRONLY) = -1 ENOENT (No such file or directory)
open("1", O_WRONLY) = -1 ENOENT (No such file or directory)
open("1", O_WRONLY) = -1 ENOENT (No such file or directory)
open("1", O_WRONLY) = -1 ENOENT (No such file or directory)
open("1", O_WRONLY) = -1 ENOENT (No such file or directory)
open("1", O_WRONLY) = -1 ENOENT (No such file or directory)
open("1", O_WRONLY) = -1 ENOENT (No such file or directory)
open("1", O_WRONLY) = -1 ENOENT (No such file or directory)
open("1", O_WRONLY) = -1 ENOENT (No such file or directory)
exit_group(-1) = ?
Towards the end of the strace output you can see that no system call is made between the open system calls. So how is memory allocated on the stack for the function being called, without a system call?
Stack memory for the main thread is allocated by the kernel during the execve() system call. During this call, other mappings defined in the executable file (and possibly also for the dynamic linker specified in the executable) are also setup. For ELF files, this is done in fs/binfmt_elf.c.
Stack memory for other threads is mmap()ed by the thread support library, which is usually part of the C runtime library.
You should also note that on virtual memory systems, the main thread stack is grown by the kernel in response to page faults, up to a configurable limit (shown by ulimit -s).
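The limit that ulimit -s reports can also be read from inside a program with getrlimit(). This is a minimal C sketch; the helper name get_stack_limits is made up for this example. Note that getrlimit() reports bytes, while ulimit -s typically prints KiB.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/resource.h>

/* Hypothetical helper: fetch the soft and hard stack size limits in bytes.
 * Either value may be RLIM_INFINITY, meaning "unlimited".
 * Returns 0 on success, -1 on failure. */
int get_stack_limits(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    *soft = rl.rlim_cur;   /* the limit the kernel enforces when growing the stack */
    *hard = rl.rlim_max;   /* ceiling up to which the soft limit may be raised */
    return 0;
}
```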
Your (single threaded) program stack size is fixed so there is no further allocation to expect.
You can query and increase this size with the ulimit -s command.
Note that even if you set this limit to "unlimited", there always will be a practical limit:
With 32 bit processes, unless you are low on RAM/swap, the virtual memory space limitation will cause address collisions
With 64 bit processes, memory (RAM + swap) exhaustion will thrash your system and eventually crash your program.
Whatever the case, there are never explicit system calls to expect that would increase the stack size, it is only set when the program starts.
Note also that the stack memory is handled exactly like heap memory, i.e. only the part of it that has been accessed is mapped to real memory (either RAM or swap). This means the stack kind of grows on demand but no other mechanism than standard virtual memory management is handling that.
Stack usage and allocation (at least on Linux) works this way:
A little bit of stack is allocated.
A guard range is set up after the "other" part of the program, at about 1/4 of the address space.
If the stack is used up to its top and beyond, it is automatically enlarged.
This continues until either the ulimit limit is reached (and the process gets a SIGSEGV) or, if there is no such limit, the stack hits the guard range (and the process gets a SIGBUS).
Your program doesn't begin to make any open calls until the recursion "bottoms out". At that point, the stack is allocated, and it's just popping out of the nesting.
Why don't you step through it with a debugger?
Do you want to find out where variables are allocated in the stack frames created for functions?
I have revised your program to show you the memory address of your stack variable k and of a parameter variable kk:
//Show stack location for a local variable, k, and a parameter, kk
#include <stdio.h>
#include <fcntl.h> //declares open()
int foo(int i)
{
    int k=9;
    if(i>=10) //relax the condition, safer
        return 1;
    else
        foo(++i);
    open("1",1);
    return i;
}
int bar(int kk, int i)
{
    int k=9;
    //print the addresses of the stack variable and the parameter;
    //the cast avoids passing a pointer where %x expects an unsigned int
    printf("&k: %lx, &kk: %lx\n",(unsigned long)&k,(unsigned long)&kk);
    if(i<10) //relax the condition, safer
        bar(k,++i);
    else
        return 1;
    return k;
}
int main()
{
    //foo(1);
    bar(0,1);
}
And the output, on my system,
$ ./foo
&k: bfa8064c, &kk: bfa80660
&k: bfa8061c, &kk: bfa80630
&k: bfa805ec, &kk: bfa80600
&k: bfa805bc, &kk: bfa805d0
&k: bfa8058c, &kk: bfa805a0
&k: bfa8055c, &kk: bfa80570
&k: bfa8052c, &kk: bfa80540
&k: bfa804fc, &kk: bfa80510
&k: bfa804cc, &kk: bfa804e0
&k: bfa8049c, &kk: bfa804b0

How is the process table updated by Linux?

Let's say I have a C program named hello.out. It prints "hello world" every 5 seconds, forever. When I execute that program, the running instance is called a process. Now, when I execute it in a terminal (in the foreground), what happens to the process table? Who fills in the process table and adds an entry to it? How does that happen? I understand there is a task struct that maintains it. Who fills in this task struct? Is it the loader?
strace ./hello.out
execve("./hello.out", ["./hello.out"], [/* 54 vars */]) = 0
brk(0) = 0x9e3e000
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7713000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=176783, ...}) = 0
mmap2(NULL, 176783, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb76e7000
close(3) = 0
open("/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\3\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\20\250\366K4\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=2012656, ...}) = 0
mmap2(0x4bf51000, 1772124, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x4bf51000
mprotect(0x4c0fb000, 4096, PROT_NONE) = 0
mmap2(0x4c0fc000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1aa) = 0x4c0fc000
mmap2(0x4c0ff000, 10844, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4c0ff000
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb76e6000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb76e66c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
mprotect(0x4c0fc000, 8192, PROT_READ) = 0
mprotect(0x4bf49000, 4096, PROT_READ) = 0
munmap(0xb76e7000, 176783) = 0
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 7), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7712000
write(1, "Hello world\r\n", 13Hello world
) = 13
exit_group(0) = ?
+++ exited with 0 +++
Where is the fork here, and what is happening? I can see these are system calls. I can make out that printf results in a write call to stdout (near the end). But where is the fork call? What is this brk? Why is execve first? What is mmap2 doing here? And fstat? I am trying to decode this, but I cannot understand it in detail. Please help me.
When you invoke a program in a terminal, you are actually asking the shell to run your program. For example, suppose you run:
./hello.out
The shell reads this command and creates a process using the fork() function. fork() in turn invokes the sys_clone() system call (since glibc version 2.3.3, the fork() wrapper calls clone with arguments that make it behave like the fork() system call). sys_clone() invokes do_fork(). do_fork() creates the process with all its resources (most of them copied from the parent) and stores all information about the process in an object of type struct task_struct.
After fork() returns, the shell invokes another system call, execve(), in the child process to replace the address space of the newly created process with the program hello.out. The execve() system call reads the contents of hello.out with the help of the kernel's ELF code, fills in the memory, and invokes the entry point of the program. During this step the name of the program, hello.out, is also written into the process structure. You can call the part of the kernel that loads the program the loader. If hello.out uses any shared libraries, this is done with the help of the dynamic linker. This is also why your trace starts at execve(): the fork happens in the parent (here strace, normally the shell) before the traced program starts.
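The fork-then-execve sequence described above can be sketched in a few lines of C. This is a simplified illustration of what a shell does, not the source of any real shell; the function name run_program is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run the program at path the way a shell would.
 * Returns the child's exit status, 127 if execve() failed in the child,
 * or -1 if fork()/waitpid() failed or the child was killed by a signal. */
int run_program(const char *path, char *const argv[])
{
    int status;
    pid_t pid = fork();            /* kernel creates a new task_struct */

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace this process image with the new program.
         * This is the execve() that appears first in the strace output. */
        execve(path, argv, environ);
        _exit(127);                /* only reached if execve() failed */
    }

    /* Parent (the "shell"): wait for the foreground child to finish. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_program("./hello.out", argv) with argv set to { "./hello.out", NULL } would correspond to typing ./hello.out at the prompt.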
