Fail to wake up from epoll_wait when other process closes fifo - linux

I'm seeing different epoll and select behavior in two different binaries and was hoping for some debugging help. In the following, epoll_wait and select will be used interchangeably.
I have two processes, one writer and one reader, that communicate over a fifo. The reader performs an epoll_wait to be notified of writes. I would also like to know when the writer closes the fifo, and it appears that epoll_wait should notify me of this as well. The following toy program, which behaves as expected, illustrates what I'm trying to accomplish:
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/stat.h>
#include <unistd.h>
int
main(int argc, char** argv)
{
const char* filename = "tempfile";
char buf[1024];
memset(buf, 0, sizeof(buf));
struct stat statbuf;
if (!stat(filename, &statbuf))
unlink(filename);
mkfifo(filename, S_IRUSR | S_IWUSR);
pid_t pid = fork();
if (!pid) {
int fd = open(filename, O_WRONLY);
printf("Opened %d for writing\n", fd);
sleep(3);
close(fd);
} else {
int fd = open(filename, O_RDONLY);
printf("Opened %d for reading\n", fd);
static const int MAX_LENGTH = 1;
struct epoll_event init;
struct epoll_event evs[MAX_LENGTH];
int efd = epoll_create(MAX_LENGTH);
int i;
for (i = 0; i < MAX_LENGTH; ++i) {
init.data.u64 = 0;
init.data.fd = fd;
init.events = EPOLLIN | EPOLLPRI | EPOLLHUP;
epoll_ctl(efd, EPOLL_CTL_ADD, fd, &init);
}
while (1) {
int nfds = epoll_wait(efd, evs, MAX_LENGTH, -1);
printf("%d fds ready\n", nfds);
int nread = read(fd, buf, sizeof(buf));
if (nread < 0) {
perror("read");
exit(1);
} else if (!nread) {
printf("Child %d closed the pipe\n", pid);
break;
}
printf("Reading: %s\n", buf);
}
}
return 0;
}
However, when I do this with another reader (whose code I'm not privileged to post, but which makes the exact same calls--the toy program is modeled on it), the process does not wake when the writer closes the fifo. The toy reader also gives the desired semantics with select. The real reader configured to use select also fails.
What might account for the different behavior of the two? For any provided hypotheses, how can I verify them? I'm running Linux 2.6.38.8.

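strace is a good way to confirm that the system calls are invoked correctly (i.e. the parameters are what you expect and no call returns an unexpected error).
In addition, I would recommend using lsof to check that no other process still has that FIFO open. The reader sees end-of-file (and EPOLLHUP) only after every write end has been closed, so a stray descriptor, for example one inherited across a fork, keeps epoll_wait blocked.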
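The stray-writer hypothesis can be checked in isolation. The following is a minimal sketch, assuming Linux; the function names and the scratch path under /tmp are made up for illustration. It opens one reader and two writers on a FIFO and samples the read end with poll() after each writer is closed.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Sample the read end without blocking: 1 if POLLHUP is set, 0 if not, -1 on error. */
static int reader_sees_hup(int rfd)
{
    struct pollfd p = { .fd = rfd, .events = POLLIN };
    int n = poll(&p, 1, 0);             /* timeout 0: just look at the current state */
    if (n < 0)
        return -1;
    return (n > 0 && (p.revents & POLLHUP)) ? 1 : 0;
}

/*
 * Open one reader and two writers on a scratch FIFO, then close the writers
 * one at a time. hup[0] is the reader's state after the first close, hup[1]
 * after the second. Returns 0 on success, -1 on a setup error.
 */
static int fifo_hup_demo(int hup[2])
{
    char dir[] = "/tmp/fifo-demo-XXXXXX";   /* hypothetical scratch location */
    char path[64];
    int rc = -1;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(path, sizeof(path), "%s/fifo", dir);
    if (mkfifo(path, S_IRUSR | S_IWUSR) == 0) {
        /* O_NONBLOCK lets the read-side open succeed before any writer exists,
           and makes the write-side opens fail instead of hanging if it did not. */
        int rfd = open(path, O_RDONLY | O_NONBLOCK);
        int w1 = open(path, O_WRONLY | O_NONBLOCK);
        int w2 = open(path, O_WRONLY | O_NONBLOCK);  /* stands in for the stray descriptor */
        if (rfd >= 0 && w1 >= 0 && w2 >= 0) {
            close(w1);
            hup[0] = reader_sees_hup(rfd);  /* one writer is still open */
            close(w2);
            hup[1] = reader_sees_hup(rfd);  /* the last writer is gone */
            w1 = w2 = -1;
            rc = 0;
        }
        if (rfd >= 0) close(rfd);
        if (w1 >= 0) close(w1);
        if (w2 >= 0) close(w2);
        unlink(path);
    }
    rmdir(dir);
    return rc;
}
```

On Linux, hup[0] should come back 0 and hup[1] should come back 1: the hang-up is reported only when the last write end is closed, which is why lsof is worth running against the real reader.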

Related

UART reply includes previous command?

I am trying to read from a UART device in a Linux environment using a C program, but I get different results than when I communicate with the UART using screen.
The C code I use to test the UART communication is the following:
#include <stdio.h>
#include <stdlib.h>
#include <libgen.h>
#include <unistd.h>
#include <string.h>
#include <strings.h>
#include <getopt.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <termios.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/file.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/select.h>
#include <signal.h>
#include <ctype.h>
bool loop;
void sigHandler(int32_t sig)
{
if(sig == SIGINT)
{
printf("Caught SIGINT");
loop = false;
}
}
int main(int argc, char *argv[])
{
char *devname = argv[1];
int fd = -1;
int nread = -1;
int nwrite = -1;
int ret;
struct termios t_new = {0};
struct termios t_old = {0};
signal(SIGINT, sigHandler);
fd = open(devname, O_RDWR | O_NOCTTY |O_NONBLOCK);
if(fd >= 0)
{
printf("TTY open ! Configuring TTY");
}
else
{
fd = -1;
return 1;
}
ret = tcgetattr(fd, &t_old);
if(ret < 0)
{
perror("tcgetattr ");
close(fd);
fd = -1;
return 1;
}
t_new = t_old;
t_new.c_cflag = (B9600 | CS8 | CREAD );
t_new.c_oflag = 0;
t_new.c_iflag = 0;
t_new.c_lflag = 0;
ret = tcsetattr(fd, TCSANOW, &t_new);
loop = true;
while(loop)
{
char s[] = "at+gmi=?\r\n";
nwrite = write(fd, s, strlen(s));
if(nwrite == strlen(s))
{
fd_set rfd;
struct timeval tm = {.tv_sec = 0, .tv_usec = 500000};
FD_ZERO(&rfd);
FD_SET(fd, &rfd);
char buffer[64] = {0};
if(select(fd + 1, &rfd, NULL, NULL, &tm) > 0)
nread = read(fd, buffer, sizeof(buffer));
if(nread > 0)
printf("Reply is: %s\n", buffer);
}
usleep(500000);
}
}
But when I read the reply, it always includes the string I have sent.
I don't experience this problem using screen.
What is the best way to read from a UART in C on Linux?
Could the multiplexed approach (using select) be causing the problem?
EDIT
For completeness, the output is:
Reply is: at+gmi=?
OK
Also, sometimes I don't read anything.
But when I read the reply, it always includes the string I have sent.
Since your termios configuration obliterated the local echo attributes and you're sending an AT modem command, you should try sending an ATE0 command to disable echoing by the modem.
I don't experience this problem using screen.
This observation confirms that the connected modem has its echoing enabled.
The AT command is echoed (by the modem) as you type, but you don't object to this received data in this situation (because you want to see what you type).
If the modem did not have echoing enabled, then you would be complaining that what you type in screen was not visible.
IOW echo is desired when using a terminal emulator program (such as screen), but echoing needs to be disabled when sending data by a program.
What is the best way to read from a UART in C on Linux?
(Technically you are not reading from a "UART", but rather from a serial terminal that fully buffers all input and output.)
Code that conforms to the POSIX standard, as described in Setting Terminal Modes Properly and Serial Programming Guide for POSIX Operating Systems, would be far better than what you have now.
I'm surprised that it works at all (e.g. CREAD is not enabled).
Could the multiplexed approach (using select) be causing the problem?
Not the echo "problem".
Your program does not do anything that requires using select() and nonblocking mode.
Also, sometimes I don't read anything.
When you write code that is not POSIX compliant, you should not expect reliable program behavior.

epoll: difference between level triggered and edge triggered when EPOLLONESHOT specified

What's the difference between level triggered and edge triggered mode, when EPOLLONESHOT specified?
There's a similar question already here. The answer by "Crouching Kitten" doesn't seem to be right (and as I understand, the other answer doesn't answer my question).
I've tried the following:
server sends 2 bytes to a client, while client waits in epoll_wait
client returns from epoll_wait, then reads 1 byte.
client re-arms the event (because of EPOLLONESHOT)
client calls epoll_wait again. Here, for both cases (LT & ET), epoll_wait doesn't wait, but returns immediately (contrary to the answer by "Crouching Kitten")
client can read the second byte
Is there any difference between LT & ET, when EPOLLONESHOT specified?
I think the bottom-line answer is "there is no difference".
Looking at the code, it seems that the fd remembers the last set bits before being disabled by the one-shot. It remembers it was one shot, and it remembers whether it was ET or not.
Which is futile, because the fd is disabled until modified, and the next call to EPOLL_CTL_MOD will erase all of that, and replace with whatever the new MOD says.
Having said that, I do not understand why anyone would want both EPOLLET and EPOLLONESHOT. To me, the whole point of EPOLLET is that, under certain programming models (namely, microthreads), it follows the semantics perfectly. This means that I can add the fd to the epoll at the very start, and then never have to perform another epoll-related system call.
EPOLLONESHOT, on the other hand, is used by people who want to keep a very strict control over when the fd is watched and when it isn't. That, by definition, is the opposite of what EPOLLET is used for. I just don't think the two are conceptually compatible.
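The experiment from the question can be reproduced in a single process. This is a minimal sketch, assuming Linux and using a pipe instead of the client/server sockets; the function name ready_after_rearm is made up for illustration.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Queue two bytes on a pipe, consume one after the first wakeup, re-arm with
 * EPOLL_CTL_MOD, and check again without blocking. Pass 0 for level-triggered
 * or EPOLLET for edge-triggered. Returns the result of the second epoll_wait
 * (the number of ready descriptors), or -1 if any step fails.
 */
static int ready_after_rearm(unsigned int extra)
{
    int p[2], rc = -1;
    struct epoll_event ev = { 0 }, out;
    char byte;

    if (pipe(p) < 0)
        return -1;
    int efd = epoll_create1(0);
    if (efd < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    ev.events = EPOLLIN | EPOLLONESHOT | extra;
    ev.data.fd = p[0];

    if (epoll_ctl(efd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], "ab", 2) == 2 &&
        epoll_wait(efd, &out, 1, 1000) == 1 &&          /* first wakeup disarms the fd */
        read(p[0], &byte, 1) == 1 &&                    /* leave one byte unread */
        epoll_ctl(efd, EPOLL_CTL_MOD, p[0], &ev) == 0)  /* re-arm */
        rc = epoll_wait(efd, &out, 1, 0);               /* timeout 0: do not block */

    close(efd);
    close(p[0]);
    close(p[1]);
    return rc;
}
```

Both ready_after_rearm(0) and ready_after_rearm(EPOLLET) should return 1, matching what the question observed: EPOLL_CTL_MOD re-checks the descriptor, so the unread byte is reported immediately in either mode.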
The other poster said "I do not understand why anyone would want both EPOLLET and EPOLLONESHOT." Actually, according to epoll(7), there is a use case for that:
Since even with edge-triggered epoll, multiple events can be generated upon receipt of multiple chunks of data, the caller has the option to specify the EPOLLONESHOT flag, to tell epoll to disable the associated file descriptor after the receipt of an event with epoll_wait(2).
The key point is whether epoll treats the combinations EPOLLET | EPOLLONESHOT and level-triggered | EPOLLONESHOT as special cases. As far as I know, it does not; epoll handles the flags separately. (There is no EPOLLLT flag; level-triggered is simply the default when EPOLLET is absent.) The only place ET and LT differ is the function ep_send_events: if EPOLLET is not set and the fd is not one-shot, the function calls list_add_tail to put the epitem back on the ready list of the eventpoll object, so a level-triggered fd is reported again on the next epoll_wait.
EPOLLONESHOT, for its part, just disables the fd after an event is delivered. So I think the difference between the two combinations is the ordinary difference between ET and LT. You can check the result with the code below:
// server.cc
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/epoll.h>
#include <pthread.h>
#define MAX_EVENT_NUMBER 1024
int setnonblocking(int fd)
{
int old_option = fcntl(fd, F_GETFL);
int new_option = old_option | O_NONBLOCK;
fcntl(fd, F_SETFL, new_option);
return old_option;
}
void addfd(int epollfd, int fd, bool oneshot)
{
epoll_event event;
event.data.fd = fd;
event.events = EPOLLIN | EPOLLET;
if(oneshot)
event.events |= EPOLLONESHOT;
epoll_ctl(epollfd, EPOLL_CTL_ADD, fd, &event);
setnonblocking(fd);
}
// reset the fd with EPOLLONESHOT
void reset_oneshot(int epollfd, int fd)
{
epoll_event event;
event.data.fd = fd;
event.events = EPOLLIN | EPOLLET | EPOLLONESHOT;
epoll_ctl(epollfd, EPOLL_CTL_MOD, fd, &event);
}
int main(int argc, char** argv)
{
if(argc <= 2)
{
printf("usage: %s ip_address port_number\n", basename(argv[0]));
return 1;
}
const char* ip = argv[1];
int port = atoi(argv[2]);
int ret = 0;
struct sockaddr_in address;
bzero(&address, sizeof(address));
address.sin_family = AF_INET;
inet_pton(AF_INET, ip, &address.sin_addr);
address.sin_port = htons(port);
int listenfd = socket(PF_INET, SOCK_STREAM, 0);
assert(listenfd >= 0);
ret = bind(listenfd, (struct sockaddr*)&address, sizeof(address));
assert(ret != -1);
ret = listen(listenfd, 5);
assert(ret != -1);
epoll_event events[MAX_EVENT_NUMBER];
int epollfd = epoll_create(5);
addfd(epollfd, listenfd, false);
while(1)
{
printf("next loop: -----------------------------");
int ret = epoll_wait(epollfd, events, MAX_EVENT_NUMBER, -1);
if(ret < 0)
{
printf("epoll failure\n");
break;
}
for(int i = 0; i < ret; i++)
{
int sockfd = events[i].data.fd;
if(sockfd == listenfd)
{
printf("into listenfd part\n");
struct sockaddr_in client_address;
socklen_t client_addrlength = sizeof(client_address);
int connfd = accept(listenfd, (struct sockaddr*)&client_address,
&client_addrlength);
printf("receive connfd: %d\n", connfd);
addfd(epollfd, connfd, true);
// reset_oneshot(epollfd, listenfd);
}
else if(events[i].events & EPOLLIN)
{
printf("into linkedfd part\n");
printf("start new thread to receive data on fd: %d\n", sockfd);
char buf[2];
memset(buf, '\0', 2);
// just read one byte, and reset the fd with EPOLLONESHOT, check whether still EPOLLIN event
int ret = recv(sockfd, buf, 2 - 1, 0);
if(ret == 0)
{
close(sockfd);
printf("foreigner closed the connection\n");
break;
}
else if(ret < 0)
{
if(errno == EAGAIN)
{
printf("wait for the client to send new data, check the oneshot mechanism\n");
sleep(10);
reset_oneshot(epollfd, sockfd);
printf("read later\n");
break;
}
}
else {
printf("receive the content: %s\n", buf);
reset_oneshot(epollfd, sockfd);
printf("reset the oneshot successfully\n");
}
}
else
printf("something unknown happened\n");
}
sleep(1);
}
close(listenfd);
return 0;
}
The client is:
from socket import *
import sys
import time
long_string = b"this is a long content which need two time to fetch"
def sendOneTimeThenSleepAndClose(ip, port):
s = socket(AF_INET, SOCK_STREAM);
a = s.connect((ip, int(port)));
print("connect success: {}".format(a));
data = s.send(b"this is test");
print("send successfully");
time.sleep(50);
s.close();
sendOneTimeThenSleepAndClose('127.0.0.1', 9999)

Proper implementation of an inter process communication (IPC)

Is the following a proper implementation of an inter-process communication?
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/poll.h>
#include <sys/stat.h>
#include <sys/types.h>
int main(int argc, char** argv) {
if (argc > 1) {
//Sending side
struct stat buffer;
if (stat("/tmp/PROCAtoPROCB", &buffer) != 0)
mkfifo("/tmp/PROCAtoPROCB", (mode_t)0600);
int fdFIFO = open("/tmp/PROCAtoPROCB", O_WRONLY | O_NONBLOCK);
if (fdFIFO >= 0) {
//Send the whole string; sizeof(argv[1]) would only be the size of a pointer
write(fdFIFO, argv[1], strlen(argv[1]));
close(fdFIFO);
}
} else {
//Receiving side
int fdFIFO = -1;
struct stat buffer;
if (stat("/tmp/PROCAtoPROCB", &buffer) != 0)
mkfifo("/tmp/PROCAtoPROCB", (mode_t)0600);
while (1) {
struct pollfd pollfds[1];
if (fdFIFO == -1)
fdFIFO = open("/tmp/PROCAtoPROCB", O_RDONLY | O_NONBLOCK);
pollfds[0].fd = fdFIFO;
pollfds[0].events = POLLIN;
poll(pollfds, 1, -1);
if (pollfds[0].revents & POLLIN) {
char buf[1024];
int n = (int)read(fdFIFO, buf, sizeof(buf) - 1);
close(fdFIFO);
fdFIFO = -1;
if (n > 0) {
buf[n] = '\0'; /* read() does not NUL-terminate */
printf("Other process says %s\n", buf);
}
}
printf("End of loop\n");
}
}
return 0;
}
It seems to be working but I'm wondering if there could be a race condition leading to hanging. One constraint is that both processes need to be started independently and in any order.
Some stress tests showed no problem so the implementation seems OK if somebody wants to reuse the code.

Can not get proper response from select() using writefds

The parent receives SIGPIPE when sending characters through a FIFO to a child process that has been killed.
I am trying to avoid this using the select() function. In the attached sample code,
select() returns OK even after the child at the other end of the pipe has been terminated.
Tested in
RedHat EL5 (Linux 2.6.18-194.32.1.el5)
GNU C Library stable release version 2.5
Any help appreciated. Thank you.
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/stat.h>
#include <unistd.h>
static void sigpipe_fct();
main()
{
struct stat st;
int i, fd_out, fd_in, child;
char buf[1024];
#define p_out "/tmp/pout"
signal(SIGPIPE, sigpipe_fct);
if (stat(p_out, &st) != 0) {
mknod(p_out, S_IFIFO, 0);
chmod(p_out, 0666);
}
/* start receiving process */
if ((child = fork()) == 0) {
if ((fd_in = open(p_out, O_RDONLY)) < 0) {
perror(p_out);
exit(1);
}
while(1) {
i = read(fd_in, buf, sizeof(buf));
fprintf(stderr, "child %d read %.*s\n", getpid(), i, buf);
lseek(fd_in, 0, 0);
}
}
else {
fprintf(stderr,
"reading from %s - exec \"kill -9 %d\" to test\n", p_out, child);
if ((fd_out = open(p_out, O_WRONLY + O_NDELAY)) < 0) { /* output */
perror(p_out);
exit(1);
}
while(1) {
if (SelectChkWrite(fd_out) == fd_out) {
fprintf(stderr, "SelectChkWrite() success write abc\n");
write(fd_out, "abc", 3);
}
else
fprintf(stderr, "SelectChkWrite() failed\n");
sleep(3);
}
}
}
static void sigpipe_fct()
{
fprintf(stderr, "SIGPIPE received\n");
exit(-1);
}
SelectChkWrite(ch)
int ch;
{
#include <sys/select.h>
fd_set writefds;
int i;
FD_ZERO(&writefds);
FD_SET (ch, &writefds);
i = select(ch + 1, NULL, &writefds, NULL, NULL);
if (i == -1)
return(-1);
else if (FD_ISSET(ch, &writefds))
return(ch);
else
return(-1);
}
From the Linux select(3) man page:
A descriptor shall be considered ready for writing when a call to an
output function with O_NONBLOCK clear would not block, whether or not
the function would transfer data successfully.
When the pipe is closed, it won't block, so it is considered "ready" by select.
BTW, having #include <sys/select.h> inside your SelectChkWrite() function is extremely bad form.
Although select() and poll() are both in the POSIX standard, select() is much older and more limited than poll(). In general, I recommend people use poll() by default and only use select() if they have a good reason. (See here for one example.)
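For the original goal of avoiding SIGPIPE, two approaches are common. The following is a minimal sketch, assuming Linux; the helper names reader_gone and write_no_sigpipe are made up for illustration. poll() can distinguish "writable" from "no reader left" through its revents flags, which select() cannot, and ignoring SIGPIPE turns the signal into an ordinary EPIPE error from write().

```c
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <unistd.h>

/* 1 if no process has the read side of the pipe/FIFO behind wfd open, 0 if one does, -1 on error. */
static int reader_gone(int wfd)
{
    struct pollfd p = { .fd = wfd, .events = POLLOUT };
    int n = poll(&p, 1, 0);             /* timeout 0: just look at the current state */
    if (n < 0)
        return -1;
    /* Linux sets POLLERR on the write end once no reader has it open. */
    return (n > 0 && (p.revents & (POLLERR | POLLHUP))) ? 1 : 0;
}

/* Write without being killed by SIGPIPE: returns bytes written, or -1 with errno == EPIPE. */
static ssize_t write_no_sigpipe(int wfd, const void *buf, size_t len)
{
    signal(SIGPIPE, SIG_IGN);           /* process-wide; a real program would do this once at startup */
    return write(wfd, buf, len);
}
```

The poll() check is inherently racy, since the reader can exit between the check and the write, so handling EPIPE from write() is the part that actually prevents the crash.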

Passing struct through socket via recv C/C++

Hello, I am trying to pass a struct like this:
typedef struct t_timeSliceRequest{
unsigned int processId;
unsigned int timeRequired;
int priority;
}timeSliceRequest;
struct t_timeSliceRequest request = { 1,2,1 };
sendFlag = send(socketID,(timeSliceRequest *) &request, sin_size ,0);
and on server side
recvFlag = recv(socketID,(timeSliceRequest *) &request,sin_size,0);
but it is receiving garbage, and recv even returns -1. Please help.
This is my full code:
#include<sys/socket.h>
#include<sys/types.h>
#include<string.h>
#include<stdio.h>
#include<arpa/inet.h>
#include<time.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <netinet/in.h>
enum priority_e{ high, normal, low };
typedef struct t_timeSliceRequest{
unsigned int processId;
unsigned int timeRequired;
int priority;
}timeSliceRequest;
typedef struct t_TimeSliceResponse {
timeSliceRequest original_req;
// Unix time stamp of when process was started on server
unsigned int time_started;
// Waiting and running time till end of CPU burst
unsigned int ttl;
} TimeSliceResponse;
int main(int argc, char ** argv){
int socketID = 0, clientID = 0;
char sendBuffer[1024], recvBuffer[1024];
time_t time;
struct sockaddr_in servAddr, clientAddr;
struct t_timeSliceRequest request = {1,1,0};
memset(sendBuffer, '0', sizeof(sendBuffer));
memset(recvBuffer, '0', sizeof(recvBuffer));
fprintf(stdout,"\n\n --- Server starting up --- \n\n");
fflush(stdout);
socketID = socket(AF_INET, SOCK_STREAM, 0);
if(socketID == -1){
fprintf(stderr, " Can't create Socket");
fflush(stdout);
}
servAddr.sin_family = AF_INET;
servAddr.sin_port = htons(5000);
servAddr.sin_addr.s_addr = htonl(INADDR_ANY);
int bindID, sin_size, recvFlag;
bindID = bind(socketID, (struct sockaddr *)&servAddr, sizeof(servAddr)); // Casting sockaddr_in on sockaddr and binding it with socket id
if(bindID!=-1){
fprintf(stdout," Bind SucessFull");
fflush(stdout);
listen(socketID,5);
fprintf(stdout, " Server Waiting for connections\n");
fflush(stdout);
while(1){
sin_size = sizeof(struct sockaddr_in);
clientID = accept(socketID, (struct sockaddr *) &clientAddr, &sin_size);
fprintf(stdout,"\n I got a connection from (%s , %d)", inet_ntoa(clientAddr.sin_addr), ntohs(clientAddr.sin_port));
fflush(stdout);
sin_size = sizeof(request);
recvFlag = recv(socketID, &request,sin_size,0);
perror("\n Err: ");
fprintf(stdout, "\n recvFlag: %d", recvFlag);
fprintf(stdout, "\n Time Slice request received:\n\tPid: %d \n\tTime Required: %d ", ntohs(request.processId), ntohs(request.timeRequired));
fflush(stdout);
snprintf(sendBuffer, sizeof(sendBuffer), "%.24s\n", ctime(&time));
write(clientID, sendBuffer, strlen(sendBuffer));
close(clientID);
sleep(1);
}
}else{
fprintf(stdout, " Unable to Bind");
}
close(socketID);
return 0;
}
And Client Code is:
#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <arpa/inet.h>
enum priority_e{ high = +1, normal = 0, low = -1};
typedef struct t_timeSliceRequest{
unsigned int processId;
unsigned int timeRequired;
int priority;
}timeSliceRequest;
int main(int argc, char *argv[])
{
int socketID = 0 /*Socket Descriptor*/, n = 0;
char recvBuffer[1024];
memset(recvBuffer, '0',sizeof(recvBuffer));
struct sockaddr_in servAddr;
struct t_timeSliceRequest request = { 1,2,high };
if(argc!=2){
fprintf(stderr,"\n Usage: %s <ip of server> \n",argv[0]);
return 1;
}
socketID = socket(AF_INET, SOCK_STREAM, 0);
if(socketID == -1){
fprintf(stderr, "\n Can't create socket \n");
return 1;
}
servAddr.sin_family = AF_INET;
servAddr.sin_port = htons(5000);
if(inet_pton(AF_INET, argv[1], &servAddr.sin_addr)==-1){
fprintf(stderr, "\n Unable to convert given IP to Network Form \n inet_pton Error");
return 1;
}
int connectFlag, sendFlag = 0;
connectFlag = connect(socketID, (struct sockaddr *)&servAddr, sizeof(servAddr));
if(connectFlag == -1){
fprintf(stderr, " Connection Failed\n");
return 1;
}
int sin_size = sizeof(struct t_timeSliceRequest);
fprintf(stdout, " \n %d \n %d \n %d", request.processId, request.timeRequired, request.priority);
sendFlag = send(socketID, &request, sin_size ,0);
fprintf(stdout, "\nSend Flag: %d\n", sendFlag);
n = read(socketID, recvBuffer, sizeof(recvBuffer)-1);
recvBuffer[n] = 0;
fprintf(stdout, "%s",recvBuffer);
if(n < 0){
fprintf(stderr, " Read error\n");
}
return 0;
}
This is the full code; it gives 'Transport endpoint is not connected'.
Keep in mind that sending structs like this over the network may lead to interoperability problems:
if source and destination have different endianness, you're going to receive wrong data (consider using functions like htonl to convert the data to network byte order)
your struct needs to be packed, otherwise different compilers can align the members of the struct differently (see this to get an idea about aligning the variables)
In any case, ENOTCONN suggests an error establishing the connection between the two hosts.
The "Transport endpoint is not connected" error (ENOTCONN) is returned when you try to send or receive on a socket that is not connected to a peer.
If it's a server side, you should use the socket descriptor that is returned by accept call. In case of a client - you should use a socket that is returned by the successful call to connect.
Btw, sending a structure the way you do is quite dangerous. Compilers might insert padding bytes between structure members (invisible to your program, but they take up space in the structure) to satisfy the alignment rules of the target platform. Besides, different platforms might have different endianness, which can scramble your structure completely. If your client and server are compiled for different machines, the structure layout and endianness can be incompatible. To solve the padding problem, you can use packed structures. How to declare a structure as packed depends on the compiler. For GCC this can be done by adding a special attribute to the structure.
Another way to solve this problem is to put each individual field of a structure to a raw byte-buffer manually. The receiving side should take all this data out in exactly the same way as the data was originally put into that buffer. This approach can be tricky, since you need to take into account a network byte order when saving multi-byte values (like int, long etc). There is a special set of functions like htonl, htons, ntohs etc for that.
Updated
In your server:
recvFlag = recv(socketID, &request,sin_size,0);
Here it should be
recvFlag = recv(clientID, &request,sin_size,0);
socketID is a passive (listening) socket and can only accept connections; it cannot send or receive data.
What is more, the result of accept isn't checked for -1.
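The manual byte-buffer approach described above could look roughly like this. It is a minimal sketch: the struct mirrors the one in the question but with fixed-width types, and the names req_encode, req_decode, recv_all and REQ_WIRE_SIZE are made up for illustration.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define REQ_WIRE_SIZE 12   /* three 32-bit fields, no padding on the wire */

struct time_slice_request {   /* hypothetical fixed-width version of the question's struct */
    uint32_t process_id;
    uint32_t time_required;
    int32_t  priority;
};

/* Copy each field into the buffer in network byte order. */
static void req_encode(const struct time_slice_request *r, unsigned char out[REQ_WIRE_SIZE])
{
    uint32_t f[3] = { htonl(r->process_id), htonl(r->time_required),
                      htonl((uint32_t)r->priority) };
    memcpy(out, f, REQ_WIRE_SIZE);
}

/* Reverse of req_encode: read the fields back in the same order. */
static void req_decode(const unsigned char in[REQ_WIRE_SIZE], struct time_slice_request *r)
{
    uint32_t f[3];
    memcpy(f, in, REQ_WIRE_SIZE);
    r->process_id = ntohl(f[0]);
    r->time_required = ntohl(f[1]);
    r->priority = (int32_t)ntohl(f[2]);
}

/* recv() may return fewer bytes than requested on a stream socket; loop until done. */
static int recv_all(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n <= 0)
            return -1;     /* error, or the peer closed the connection early */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

On the server, recv_all would be called with the descriptor returned by accept (clientID in the question), followed by req_decode on the received bytes.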
