Related
I am trying on a problem with writer and reader. I am trying with windows semaphore functionality.
It is very simple as follows
char n[200];
volatile HANDLE hSem=NULL; // handle to semaphore
The write function for console. Which release the semaphore for the read process.
void * write_message_function ( void *ptr )
{
/* do the work */
while(1){
printf("Enter a string");
scanf("%s",n);
ReleaseSemaphore(hSem,1,NULL); // unblock all the threads
}
pthread_exit(0); /* exit */
}
The print message wait for the release from the write message to print the message.
void * print_message_function ( void *ptr )
{
while(1){
WaitForSingleObject(hSem,INFINITE);
printf("The string entered is :");
printf("==== %s\n",n);
}
pthread_exit(0); /* exit */
}
The main function launch the application.
int main(int argc, char *argv[])
{
hSem=CreateSemaphore(NULL,0,1,NULL);
pthread_t thread1, thread2; /* thread variables */
/* create threads 1 and 2 */
pthread_create (&thread1, NULL, print_message_function, NULL);
pthread_create (&thread2, NULL, write_message_function, NULL);
pthread_join(thread1, NULL);
pthread_join(thread2, NULL);
/* exit */
CloseHandle(hSem);
}
The program executes but does not show the string input console.
ReleaseSemaphore in write_message_function will force the following:
The print_message_function will start the output and
The write_message_function will output and scan for input.
These two things occur at the same time. Using the semaphore to trigger the output is
fine. However, using MaximumCount=1 is a waste of capabilities, you may have mutiple inputs before an output occurs.
But the main problem here is that the I/O resources and the use of char n[200]; are not
implemented thread-safe. See What is meant by “thread-safe” code? for details. You'd still have to protect these resources by for example a mutex or a critical section.
I'm implementing a misc device driver for linux.
This driver implements file_operations::poll and
I want to make it so that poll(2) would return POLLHUP if the descriptor is closed.
Supposed driver client code(userland code) follows.
void ThreadA(int fd){
// initialization codes...
pfd[0].fd = fd;
pfd[0].event = POLLIN;
int r = poll(pfd, 1, -1);
if(r > 0 && pfd[0].revent & POLLHUP){
// Detect fd is closed
return; // Exit thread
}
}
void ThreadB(int fd){
// waiting some events. ex.signals
// I expect close(fd) will cause poll(2) return and ThreadA will exit.
close(fd);
return;
}
But I could not implement this behavior to in my driver code.
Calling poll(2) never returns even if descriptor is closed. so threadA never exits.
Trivial test driver code follows.
static wait_queue_head_t q;
static int CLOSED = 0;
int my_open(struct inode *a, struct file *b){
printk(KERN_DEBUG "my_open");
return 0;
}
int my_release(struct inode *a, struct file *b){
printk(KERN_DEBUG "my_release");
CLOSED = 1;
wake_up_interruptible(&q);
// I expect this call will wake up q and recall my_poll.
// but doesn't
return 0;
}
unsigned int my_poll(struct file *a, struct poll_table_struct *b){
printk(KERN_DEBUG "my_poll");
poll_wait(file, &q, a);
if(CLOSED != 0)
return POLLHUP;
return 0;
}
static const struct file_operations my_fops = {
.owner = THIS_MODULE,
.open = &my_open,
.release = &my_release,
.poll = &my_poll
};
static struct miscdevice mydevice =
{
.minor = MISC_DYNAMIC_MINOR,
.name = "TESTDEV",
.fops = &my_fops
};
static int __init myinit(void){
init_waitqueue_head(&q);
misc_register(&mydevice);
return 0;
}
static void __exit myexit(void){
misc_deregister(&mydevice);
}
module_init(myinit);
module_exit(myexit);
MODULE_LICENSE("GPL");
I think calling wake_up_interruptible() doesn't effect in my_release().
so that my_poll() will never be recalled and poll(2) will never return.
How should I implement my_poll() in a correct manner?
My test environment:
kernel is linux-3.10.20
The manual page for close warns:
It is probably unwise to close file descriptors while they may be in use by
system calls in other threads in the same process. Since a file descriptor
may be reused, there are some obscure race conditions that may cause unintended
side effects.
I suspect that something in the higher levels of the kernel is cancelling your poll operation when you execute close, before the release() function actually gets called. I'd think about solving your problem a different way.
I'm trying to write some code to communicate with wpa_supplicant using DBUS. As I'm working in an embedded system (ARM), I'd like to avoid the use of Python or the GLib. I'm wondering if I'm stupid because I really have the feeling that there is no nice and clear documentation about D-Bus. Even with the official one, I either find the documentation too high level, or the examples shown are using Glib! Documentation I've looked at: http://www.freedesktop.org/wiki/Software/dbus
I found a nice article about using D-Bus in C: http://www.matthew.ath.cx/articles/dbus
However, this article is pretty old and not complete enough! I also found the c++-dbus API but also here, I don't find ANY documentation! I've been digging into wpa_supplicant and NetworkManager source code but it's quite a nightmare! I've been looking into the "low-level D-Bus API" as well but this doesn't tell me how to extract a string parameter from a D-Bus message! http://dbus.freedesktop.org/doc/api/html/index.html
Here is some code I wrote to test a little but I really have trouble to extract string values. Sorry for the long source code but if someone want to try it ... My D-Bus configuration seems fine because it "already" catches "StateChanged" signals from wpa_supplicant but cannot print the state:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <string.h>
#include <dbus/dbus.h>
//#include "wpa_supp_dbus.h"
/* Content of wpa_supp_dbus.h */
#define WPAS_DBUS_SERVICE "fi.epitest.hostap.WPASupplicant"
#define WPAS_DBUS_PATH "/fi/epitest/hostap/WPASupplicant"
#define WPAS_DBUS_INTERFACE "fi.epitest.hostap.WPASupplicant"
#define WPAS_DBUS_PATH_INTERFACES WPAS_DBUS_PATH "/Interfaces"
#define WPAS_DBUS_IFACE_INTERFACE WPAS_DBUS_INTERFACE ".Interface"
#define WPAS_DBUS_NETWORKS_PART "Networks"
#define WPAS_DBUS_IFACE_NETWORK WPAS_DBUS_INTERFACE ".Network"
#define WPAS_DBUS_BSSIDS_PART "BSSIDs"
#define WPAS_DBUS_IFACE_BSSID WPAS_DBUS_INTERFACE ".BSSID"
int running = 1;
void stopLoop(int sig)
{
running = 0;
}
void sendScan()
{
// TODO !
}
void loop(DBusConnection* conn)
{
DBusMessage* msg;
DBusMessageIter args;
DBusMessageIter subArgs;
int argType;
int i;
int buffSize = 1024;
char strValue[buffSize];
const char* member = 0;
sendScan();
while (running)
{
// non blocking read of the next available message
dbus_connection_read_write(conn, 0);
msg = dbus_connection_pop_message(conn);
// loop again if we haven't read a message
if (!msg)
{
printf("No message received, waiting a little ...\n");
sleep(1);
continue;
}
else printf("Got a message, will analyze it ...\n");
// Print the message member
printf("Got message for interface %s\n",
dbus_message_get_interface(msg));
member = dbus_message_get_member(msg);
if(member) printf("Got message member %s\n", member);
// Check has argument
if (!dbus_message_iter_init(msg, &args))
{
printf("Message has no argument\n");
continue;
}
else
{
// Go through arguments
while(1)
{
argType = dbus_message_iter_get_arg_type(&args);
if (argType == DBUS_TYPE_STRING)
{
printf("Got string argument, extracting ...\n");
/* FIXME : got weird characters
dbus_message_iter_get_basic(&args, &strValue);
*/
/* FIXME : segmentation fault !
dbus_message_iter_get_fixed_array(
&args, &strValue, buffSize);
*/
/* FIXME : segmentation fault !
dbus_message_iter_recurse(&args, &subArgs);
*/
/* FIXME : deprecated!
if(dbus_message_iter_get_array_len(&args) > buffSize)
printf("message content to big for local buffer!");
*/
//printf("String value was %s\n", strValue);
}
else
printf("Arg type not implemented yet !\n");
if(dbus_message_iter_has_next(&args))
dbus_message_iter_next(&args);
else break;
}
printf("No more arguments!\n");
}
// free the message
dbus_message_unref(msg);
}
}
int main(int argc, char* argv[])
{
DBusError err;
DBusConnection* conn;
int ret;
char signalDesc[1024]; // Signal description as string
// Signal handling
signal(SIGKILL, stopLoop);
signal(SIGTERM, stopLoop);
// Initialize err struct
dbus_error_init(&err);
// connect to the bus
conn = dbus_bus_get(DBUS_BUS_SYSTEM, &err);
if (dbus_error_is_set(&err))
{
fprintf(stderr, "Connection Error (%s)\n", err.message);
dbus_error_free(&err);
}
if (!conn)
{
exit(1);
}
// request a name on the bus
ret = dbus_bus_request_name(conn, WPAS_DBUS_SERVICE, 0, &err);
if (dbus_error_is_set(&err))
{
fprintf(stderr, "Name Error (%s)\n", err.message);
dbus_error_free(&err);
}
/* Connect to signal */
// Interface signal ..
sprintf(signalDesc, "type='signal',interface='%s'",
WPAS_DBUS_IFACE_INTERFACE);
dbus_bus_add_match(conn, signalDesc, &err);
dbus_connection_flush(conn);
if (dbus_error_is_set(&err))
{
fprintf(stderr, "Match Error (%s)\n", err.message);
exit(1);
}
// Network signal ..
sprintf(signalDesc, "type='signal',interface='%s'",
WPAS_DBUS_IFACE_NETWORK);
dbus_bus_add_match(conn, signalDesc, &err);
dbus_connection_flush(conn);
if (dbus_error_is_set(&err))
{
fprintf(stderr, "Match Error (%s)\n", err.message);
exit(1);
}
// Bssid signal ..
sprintf(signalDesc, "type='signal',interface='%s'",
WPAS_DBUS_IFACE_BSSID);
dbus_bus_add_match(conn, signalDesc, &err);
dbus_connection_flush(conn);
if (dbus_error_is_set(&err))
{
fprintf(stderr, "Match Error (%s)\n", err.message);
exit(1);
}
// Do main loop
loop(conn);
// Main loop exited
printf("Main loop stopped, exiting ...\n");
dbus_connection_close(conn);
return 0;
}
Any pointer to any nice, complete, low-level C tutorial is strongly appreciated! I'm also planning to do some remote method call, so if the tutorial covers this subject it would be great! Saying I'm not very smart because I don't get it with the official tutorial is also appreciated :-p!
Or is there another way to communicate with wpa_supplicant (except using wpa_cli)?
EDIT 1:
Using 'qdbusviewer' and the introspection capabilty, this helped me a lot discovering what and how wpa_supplicant works using dbus. Hopping that this would help someone else!
Edit 2:
Will probably come when I'll find a way to read string values on D-Bus!
You have given up the tools that would help you to learn D-Bus more easily and are using the low level libdbus implementation, so maybe you deserve to be in pain. BTW, are you talking about ARM, like a cell phone ARM ? With maybe 500 Mhz and 256 MB RAM ? In this case the processor is well suited to using glib, Qt or even python. And D-Bus is most useful when you're writing asynchronous event driven code, with an integrated main loop, for example from glib, even when you're using the low level libdbus (it has functions to connect to the glib main loop, for example).
Since you're using the low level library, then documentation is what you already have:
http://dbus.freedesktop.org/doc/api/html/index.html
Also, libdbus source code is also part of the documentation:
http://dbus.freedesktop.org/doc/api/html/files.html
The main entry point for the documentation is the Modules page (in particular, the public API section):
http://dbus.freedesktop.org/doc/api/html/modules.html
For message handling, the section DBusMessage is the relevant one:
DBusMessage
There you have the documentation for functions that parse item values. In your case, you started with a dbus_message_iter_get_basic. As described in the docs, retrieving the string requires a const char ** variable, since the returned value will point to the pre-allocated string in the received message:
So for int32 it should be a "dbus_int32_t*" and for string a "const char**". The returned value is by reference and should not be freed.
So you can't define an array, because libdbus won't copy the text to your array. If you need to save the string, first get the constant string reference, then strcpy to your own array.
Then you tried to get a fixed array without moving the iterator. You need a call to the next iterator (dbus_message_iter_next) between the basic string and the fixed array. Same right before recursing into the sub iterator.
Finally, you don't call get_array_len to get the number of elements on the array. From the docs, it only returns byte counts. Instead you loop over the sub iterator using iter_next the same way you should have done with the main iterator. After you have iterated past the end of the array, dbus_message_iter_get_arg_type will return DBUS_TYPE_INVALID.
For more info, read the reference manual, don't look for a tutorial. Or just use a reasonable d-bus implementation:
https://developer.gnome.org/gio/2.36/gdbus-codegen.html
GIO's GDBus automatically creates wrappers for your d-bus calls.
http://qt-project.org/doc/qt-4.8/intro-to-dbus.html
http://dbus.freedesktop.org/doc/dbus-python/doc/tutorial.html
etc.
You don't need to use/understand working of dbus If you just need to write a C program to communicate with wpa_supplicant. I reverse engineered the wpa_cli's source code. Went through its implementation and used functions provided in wpa_ctrl.h/c. This implementation takes care of everything. You can use/modify whatever you want, build your executable and you're done!
Here's the official link to wpa_supplicant's ctrl_interface:
http://hostap.epitest.fi/wpa_supplicant/devel/ctrl_iface_page.html
I doubt this answer will still be relevant to the author of this question,
but for anybody who stumbles upon this like I did:
The situation is now better than all those years ago if you don't want to include GTK/QT in your project to access dbus.
There is dbus API in Embedded Linux Library by Intel (weird I remember it being open, maybe it is just for registered users now?)
and systemd sd-bus library now offers public API. You probably run systemd anyway unless you have a really constrained embedded system.
I have worked with GDbus, dbus-cpp and sd-bus and although I wanted a C++ library,
I found sd-bus to be the simplest and the least problematic experience.
I did not try its C++ bindings but they also look nice
#include <stdio.h>
#include <systemd/sd-bus.h>
#include <stdlib.h>
const char* wpa_service = "fi.w1.wpa_supplicant1";
const char* wpa_root_obj_path = "/fi/w1/wpa_supplicant1";
const char* wpa_root_iface = "fi.w1.wpa_supplicant1";
sd_bus_error error = SD_BUS_ERROR_NULL;
sd_bus* system_bus = NULL;
sd_event* loop = NULL;
sd_bus_message* reply = NULL;
void cleanup() {
sd_event_unref(loop);
sd_bus_unref(system_bus);
sd_bus_message_unref(reply);
sd_bus_error_free(&error);
}
void print_error(const char* msg, int code) {
fprintf(stderr, "%s %s\n", msg, strerror(-code));
exit(EXIT_FAILURE);
}
const char* get_interface(const char* iface) {
int res = sd_bus_call_method(system_bus,
wpa_service,
wpa_root_obj_path,
wpa_root_iface,
"GetInterface",
&error,
&reply,
"s",
"Ifname", "s", iface,
"Driver", "s", "nl80211");
if (res < 0) {
fprintf(stderr, "(get) error response: %s\n", error.message);
return NULL;
}
const char* iface_path;
/*
* an object path was returned in reply
* this works like an iterator, if a method returns (osu), you could call message_read_basic in succession
* with arguments SD_BUS_TYPE_OBJECT_PATH, SD_BUS_TYPE_STRING, SD_BUS_TYPE_UINT32 or you could
* call sd_bus_message_read() and provides the signature + arguments in one call
* */
res = sd_bus_message_read_basic(reply, SD_BUS_TYPE_OBJECT_PATH, &iface_path);
if (res < 0) {
print_error("getIface: ", res);
return NULL;
}
return iface_path;
}
const char* create_interface(const char* iface) {
int res = sd_bus_call_method(system_bus,
wpa_service,
wpa_root_obj_path,
wpa_root_iface,
"CreateInterface",
&error,
&reply,
"a{sv}", 2, //pass array of str:variant (dbus dictionary) with 2
//entries to CreateInterface
"Ifname", "s", iface, // "s" variant parameter contains string, then pass the value
"Driver", "s", "nl80211");
if (res < 0) {
fprintf(stderr, "(create) error response: %s\n", error.message);
return NULL;
}
const char* iface_path;
res = sd_bus_message_read_basic(reply, SD_BUS_TYPE_OBJECT_PATH, &iface_path);
if (res < 0) {
print_error("createIface: ", res);
}
return iface_path;
}
int main() {
int res;
const char* iface_path;
//open connection to system bus - default either opens or reuses existing connection as necessary
res = sd_bus_default_system(&system_bus);
if (res < 0) {
print_error("open: ", res);
}
//associate connection with event loop, again default either creates or reuses existing
res = sd_event_default(&loop);
if (res < 0) {
print_error("event: ", res);
}
// get obj. path to the wireless interface on dbus so you can call methods on it
// this is a wireless interface (e.g. your wifi dongle) NOT the dbus interface
// if you don't know the interface name in advance, you will have to read the Interfaces property of
// wpa_supplicants root interface — call Get method on org.freedesktop.DBus properties interface,
// while some libraries expose some kind of get_property convenience function sd-bus does not
const char* ifaceName = "wlp32s0f3u2";
if (!(iface_path = get_interface(ifaceName))) { //substitute your wireless iface here
// sometimes the HW is present and listed in "ip l" but dbus does not reflect that, this fixes it
if (!(iface_path = create_interface(ifaceName))) {
fprintf(stderr, "can't create iface: %s" , ifaceName);
cleanup();
return EXIT_FAILURE;
}
}
/*
call methods with obj. path iface_path and dbus interface of your choice
this will likely be "fi.w1.wpa_supplicant1.Interface", register for signals etc...
you will need the following to receive those signals
*/
int runForUsec = 1000000; //usec, not msec!
sd_event_run(loop, runForUsec); //or sd_event_loop(loop) if you want to loop forever
cleanup();
printf("Finished OK\n");
return 0;
}
I apologize if the example above does not work perfectly. It is an excerpt from an old project I rewrote to C from C++ (I think it's C(-ish), compiler does not protest and you asked for C) but I can't test it as all my dongles refuse to work with my desktop right now. It should give you a general idea though.
Note that you will likely encounter several magical or semi-magical issues.
To ensure smooth developing/testing do the following:
make sure other network management applications are disabled (networkmanager, connman...)
restart the wpa_supplicant service
make sure the wireless interface is UP in ip link
Also, because is not that well-documented right now:
You can access arrays and inner variant values by sd_bus_message_enter_container
and _exit counterpart. sd_bus_message_peek_type might come handy while doing that.
Or sd_bus_message_read_array for a homogenous array.
The below snippet works for me
if (argType == DBUS_TYPE_STRING)
{
printf("Got string argument, extracting ...\n");
char* strBuffer = NULL;
dbus_message_iter_get_basic(&args, &strBuffer);
printf("Received string: \n %s \n",strBuffer);
}
I'm trying to write a simple thread pool program in pthread. However, it seems that pthread_cond_signal doesn't block, which creates a problem. For example, let's say I have a "producer-consumer" program:
pthread_cond_t my_cond = PTHREAD_COND_INITIALIZER;
pthread_mutex_t my_cond_m = PTHREAD_MUTEX_INITIALIZER;
void * liberator(void * arg)
{
// XXX make sure he is ready to be freed
sleep(1);
pthread_mutex_lock(&my_cond_m);
pthread_cond_signal(&my_cond);
pthread_mutex_unlock(&my_cond_m);
return NULL;
}
int main()
{
pthread_t t1;
pthread_create(&t1, NULL, liberator, NULL);
// XXX Don't take too long to get ready. Otherwise I'll miss
// the wake up call forever
//sleep(3);
pthread_mutex_lock(&my_cond_m);
pthread_cond_wait(&my_cond, &my_cond_m);
pthread_mutex_unlock(&my_cond_m);
pthread_join(t1, NULL);
return 0;
}
As described in the two XXX marks, if I take away the sleep calls, then main() may stall because it has missed the wake up call from liberator(). Of course, sleep isn't a very robust way to ensure that either.
In real life situation, this would be a worker thread telling the manager thread that it is ready for work, or the manager thread announcing that new work is available.
How would you do this reliably in pthread?
Elaboration
#Borealid's answer kind of works, but his explanation of the problem could be better. I suggest anyone looking at this question to read the discussion in the comments to understand what's going on.
In particular, I myself would amend his answer and code example like this, to make this clearer. (Since Borealid's original answer, while compiled and worked, confused me a lot)
// In main
pthread_mutex_lock(&my_cond_m);
// If the flag is not set, it means liberator has not
// been run yet. I'll wait for him through pthread's signaling
// mechanism
// If it _is_ set, it means liberator has been run. I'll simply
// skip waiting since I've already synchronized. I don't need to
// use pthread's signaling mechanism
if(!flag) pthread_cond_wait(&my_cond, &my_cond_m);
pthread_mutex_unlock(&my_cond_m);
// In liberator thread
pthread_mutex_lock(&my_cond_m);
// Signal anyone who's sleeping. If no one is sleeping yet,
// they should check this flag which indicates I have already
// sent the signal. This is needed because pthread's signals
// is not like a message queue -- a sent signal is lost if
// nobody's waiting for a condition when it's sent.
// You can think of this flag as a "persistent" signal
flag = 1;
pthread_cond_signal(&my_cond);
pthread_mutex_unlock(&my_cond_m);
Use a synchronization variable.
In main:
pthread_mutex_lock(&my_cond_m);
while (!flag) {
pthread_cond_wait(&my_cond, &my_cond_m);
}
pthread_mutex_unlock(&my_cond_m);
In the thread:
pthread_mutex_lock(&my_cond_m);
flag = 1;
pthread_cond_broadcast(&my_cond);
pthread_mutex_unlock(&my_cond_m);
For a producer-consumer problem, this would be the consumer sleeping when the buffer is empty, and the producer sleeping when it is full. Remember to acquire the lock before accessing the global variable.
I found out the solution here. For me, the tricky bit to understand the problem is that:
Producers and consumers must be able to communicate both ways. Either way is not enough.
This two-way communication can be packed into one pthread condition.
To illustrate, the blog post mentioned above demonstrated that this is actually meaningful and desirable behavior:
pthread_mutex_lock(&cond_mutex);
pthread_cond_broadcast(&cond):
pthread_cond_wait(&cond, &cond_mutex);
pthread_mutex_unlock(&cond_mutex);
The idea is that if both the producers and consumers employ this logic, it will be safe for either of them to be sleeping first, since the each will be able to wake the other role up. Put it in another way, in a typical producer-consumer sceanrio -- if a consumer needs to sleep, it's because a producer needs to wake up, and vice versa. Packing this logic in a single pthread condition makes sense.
Of course, the above code has the unintended behavior that a worker thread will also wake up another sleeping worker thread when it actually just wants to wake the producer. This can be solved by a simple variable check as #Borealid suggested:
while(!work_available) pthread_cond_wait(&cond, &cond_mutex);
Upon a worker broadcast, all worker threads will be awaken, but one-by-one (because of the implicit mutex locking in pthread_cond_wait). Since one of the worker threads will consume the work (setting work_available back to false), when other worker threads awake and actually get to work, the work will be unavailable so the worker will sleep again.
Here's some commented code I tested, for anyone interested:
// gcc -Wall -pthread threads.c -lpthread
#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <assert.h>
pthread_cond_t my_cond = PTHREAD_COND_INITIALIZER;
pthread_mutex_t my_cond_m = PTHREAD_MUTEX_INITIALIZER;
int * next_work = NULL;
int all_work_done = 0;
void * worker(void * arg)
{
int * my_work = NULL;
while(!all_work_done)
{
pthread_mutex_lock(&my_cond_m);
if(next_work == NULL)
{
// Signal producer to give work
pthread_cond_broadcast(&my_cond);
// Wait for work to arrive
// It is wrapped in a while loop because the condition
// might be triggered by another worker thread intended
// to wake up the producer
while(!next_work && !all_work_done)
pthread_cond_wait(&my_cond, &my_cond_m);
}
// Work has arrived, cache it locally so producer can
// put in next work ASAP
my_work = next_work;
next_work = NULL;
pthread_mutex_unlock(&my_cond_m);
if(my_work)
{
printf("Worker %d consuming work: %d\n", (int)(pthread_self() % 100), *my_work);
free(my_work);
}
}
return NULL;
}
int * create_work()
{
int * ret = (int *)malloc(sizeof(int));
assert(ret);
*ret = rand() % 100;
return ret;
}
void * producer(void * arg)
{
int i;
for(i = 0; i < 10; i++)
{
pthread_mutex_lock(&my_cond_m);
while(next_work != NULL)
{
// There's still work, signal a worker to pick it up
pthread_cond_broadcast(&my_cond);
// Wait for work to be picked up
pthread_cond_wait(&my_cond, &my_cond_m);
}
// No work is available now, let's put work on the queue
next_work = create_work();
printf("Producer: Created work %d\n", *next_work);
pthread_mutex_unlock(&my_cond_m);
}
// Some workers might still be waiting, release them
pthread_cond_broadcast(&my_cond);
all_work_done = 1;
return NULL;
}
int main()
{
pthread_t t1, t2, t3, t4;
pthread_create(&t1, NULL, worker, NULL);
pthread_create(&t2, NULL, worker, NULL);
pthread_create(&t3, NULL, worker, NULL);
pthread_create(&t4, NULL, worker, NULL);
producer(NULL);
pthread_join(t1, NULL);
pthread_join(t2, NULL);
pthread_join(t3, NULL);
pthread_join(t4, NULL);
return 0;
}
I am currently using select loop to manage sockets in a proxy. One of the requirements of this proxy is that if the proxy sends a message to the outside server and does not get a response in a certain time, the proxy should close that socket and try to connect to a secondary server. The closing happens in a separate thread, while the select thread blocks waiting for activity.
I am having trouble figuring out how to detect that this socket closed specifically, so that I can handle the failure. If I call close() in the other thread, I get an EBADF, but I can't tell which socket closed. I tried to detect the socket through the exception fdset, thinking it would contain the closed socket, but I'm not getting anything returned there. I have also heard calling shutdown() will send a FIN to the server and receive a FIN back, so that I can close it; but the whole point is me trying to close this as a result of not getting a response within the timeout period, so I cant do that, either.
If my assumptions here are wrong, let me know. Any ideas would be appreciated.
EDIT:
In response to the suggestions about using select time out: I need to do the closing asynchronously, because the client connecting to the proxy will time out and I can't wait around for the select to be polled. This would only work if I made the select time out very small, which would then constantly be polling and wasting resources which I don't want.
Generally I just mark the socket for closing in the other thread, and then when select() returns from activity or timeout, I run a cleanup pass and close out all dead connections and update the fd_set. Doing it any other way opens you up to race conditions where you gave up on the connection, just as select() finally recognized some data for it, then you close it, but the other thread tries to process the data that was detected and gets upset to find the connection closed.
Oh, and poll() is generally better than select() in terms of not having to copy as much data around.
You cannot free a resource in one thread while another thread is or might be using it. Calling close on a socket that might be in use in another thread will never work right. There will always be potentially disastrous race conditions.
There are two good solutions to your problem:
Have the thread that calls select always use a timeout no greater than the longest you're willing to wait to process a timeout. When a timeout occurs, indicate that some place the thread that calls select will notice when it returns from select. Have that thread do the actual close of the socket in-between calls to select.
Have the thread that detects the timeout call shutdown on the socket. This will cause select to return and then have that thread do the close.
How to cope with EBADF on select():
int fopts = 0;
for (int i = 0; i < num_clients; ++i) {
if (fcntl(client[i].fd, F_GETFL, &fopts) < 0) {
// call close(), FD_CLR(), and remove i'th element from client list
}
}
This code assumes you have an array of client structures which have "fd" members for the socket descriptor. The fcntl() call checks whether the socket is still "alive", and if not, we do what we have to to remove the dead socket and its associated client info.
It's hard to comment when seeing only a small part of the elephant but maybe you are over complicating things?
Presumably you have some structure to keep track of each socket and its info (like time left to receive a reply). You can change the select() loop to use a timeout. Within it check whether it is time to close the socket. Do what you need to do for the close and don't add it to the fd sets the next time around.
If you use poll(2) as suggested in other answers, you can use the POLLNVAL status, which is essentially EBADF, but on a per-file-descriptor basis, not on the whole system call as it is for select(2).
Use a timeout for the select, and if the read-ready/write-ready/had-error sequences are all empty (w.r.t that socket), check if it was closed.
Just run a "test select" on every single socket that might have closed with a zero timeout and check the select result and errno until you found the one that has closed.
The following piece of demo code starts two server sockets on separate threads and creates two client sockets to connect to either server socket. Then it starts another thread, that will randomly kill one of the client sockets after 10 seconds (it will just close it). Closing either client socket causes select to fail with error in the main thread and the code below will now test which of the two sockets has actually closed.
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <stdint.h>
#include <pthread.h>
#include <stdbool.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
static void * serverThread ( void * threadArg )
{
int res;
int connSo;
int servSo;
socklen_t addrLen;
struct sockaddr_in soAddr;
uint16_t * port = threadArg;
servSo = socket(PF_INET, SOCK_STREAM, 0);
assert(servSo >= 0);
memset(&soAddr, 0, sizeof(soAddr));
soAddr.sin_family = AF_INET;
soAddr.sin_port = htons(*port);
// Uncommend line below if your system offers this field in the struct
// and also needs this field to be initialized correctly.
// soAddr.sin_len = sizeof(soAddr);
res = bind(servSo, (struct sockaddr *)&soAddr, sizeof(soAddr));
assert(res == 0);
res = listen(servSo, 10);
assert(res == 0);
addrLen = 0;
connSo = accept(servSo, NULL, &addrLen);
assert(connSo >= 0);
for (;;) {
char buffer[2048];
ssize_t bytesRead;
bytesRead = recv(connSo, buffer, sizeof(buffer), 0);
if (bytesRead <= 0) break;
printf("Received %zu bytes on port %d.\n", bytesRead, (int)*port);
}
free(port);
close(connSo);
close(servSo);
return NULL;
}
static void * killSocketIn10Seconds ( void * threadArg )
{
int * so = threadArg;
sleep(10);
printf("Killing socket %d.\n", *so);
close(*so);
free(so);
return NULL;
}
int main ( int argc, const char * const * argv )
{
int res;
int clientSo1;
int clientSo2;
int * socketArg;
uint16_t * portArg;
pthread_t killThread;
pthread_t serverThread1;
pthread_t serverThread2;
struct sockaddr_in soAddr;
// Create a server socket at port 19500
portArg = malloc(sizeof(*portArg));
assert(portArg != NULL);
*portArg = 19500;
res = pthread_create(&serverThread1, NULL, &serverThread, portArg);
assert(res == 0);
// Create another server socket at port 19501
portArg = malloc(sizeof(*portArg));
assert(portArg != NULL);
*portArg = 19501;
res = pthread_create(&serverThread1, NULL, &serverThread, portArg);
assert(res == 0);
// Create two client sockets, one for 19500 and one for 19501
// and connect both to the server sockets we created above.
clientSo1 = socket(PF_INET, SOCK_STREAM, 0);
assert(clientSo1 >= 0);
clientSo2 = socket(PF_INET, SOCK_STREAM, 0);
assert(clientSo2 >= 0);
memset(&soAddr, 0, sizeof(soAddr));
soAddr.sin_family = AF_INET;
soAddr.sin_port = htons(19500);
res = inet_pton(AF_INET, "127.0.0.1", &soAddr.sin_addr);
assert(res == 1);
// Uncommend line below if your system offers this field in the struct
// and also needs this field to be initialized correctly.
// soAddr.sin_len = sizeof(soAddr);
res = connect(clientSo1, (struct sockaddr *)&soAddr, sizeof(soAddr));
assert(res == 0);
soAddr.sin_port = htons(19501);
res = connect(clientSo2, (struct sockaddr *)&soAddr, sizeof(soAddr));
assert(res == 0);
// We want either client socket to be closed locally after 10 seconds.
// Which one is random, so try running test app multiple times.
socketArg = malloc(sizeof(*socketArg));
srandomdev();
*socketArg = (random() % 2 == 0 ? clientSo1 : clientSo2);
res = pthread_create(&killThread, NULL, &killSocketIn10Seconds, socketArg);
assert(res == 0);
for (;;) {
int ndfs;
int count;
fd_set readSet;
// ndfs must be the highest socket number + 1
ndfs = (clientSo2 > clientSo1 ? clientSo2 : clientSo1);
ndfs++;
FD_ZERO(&readSet);
FD_SET(clientSo1, &readSet);
FD_SET(clientSo2, &readSet);
// No timeout, that means select may block forever here.
count = select(ndfs, &readSet, NULL, NULL, NULL);
// Without a timeout count should never be zero.
// Zero is only returned if select ran into the timeout.
assert(count != 0);
if (count < 0) {
int error = errno;
printf("Select terminated with error: %s\n", strerror(error));
if (error == EBADF) {
fd_set closeSet;
struct timeval atonce;
FD_ZERO(&closeSet);
FD_SET(clientSo1, &closeSet);
memset(&atonce, 0, sizeof(atonce));
count = select(clientSo1 + 1, &closeSet, NULL, NULL, &atonce);
if (count == -1 && errno == EBADF) {
printf("Socket 1 (%d) closed.\n", clientSo1);
break; // Terminate test app
}
FD_ZERO(&closeSet);
FD_SET(clientSo2, &closeSet);
// Note: Standard requires you to re-init timeout for every
// select call, you must never rely that select has not changed
// its value in any way, not even if its all zero.
memset(&atonce, 0, sizeof(atonce));
count = select(clientSo2 + 1, &closeSet, NULL, NULL, &atonce);
if (count == -1 && errno == EBADF) {
printf("Socket 2 (%d) closed.\n", clientSo2);
break; // Terminate test app
}
}
}
}
// Be a good citizen, close all sockets, join all threads
close(clientSo1);
close(clientSo2);
pthread_join(killThread, NULL);
pthread_join(serverThread1, NULL);
pthread_join(serverThread2, NULL);
return EXIT_SUCCESS;
}
Sample output for running this test code twice:
$ ./sockclose
Killing socket 3.
Select terminated with error: Bad file descriptor
Socket 1 (3) closed.
$ ./sockclose
Killing socket 4.
Select terminated with error: Bad file descriptor
Socket 1 (4) closed.
However, if your system supports poll(), I would strongly advise you to consider using this API instead of select(). Select is a rather ugly, legacy API from the past, only left there for backward compatibility with existing code. Poll has a much better interface for this task and it has an extra flag to directly signal you that a socket has closed locally: POLLNVAL will be set on revents if this socket has been closed, regardless which flags you requested on events, since POLLNVAL is an output only flags, that means it is ignored when being set on events. If the socket was not closed locally but the remote server has just closed the connection, the flag POLLHUP will be set in revents (also an output only flag). Another advantage of poll is that the timeout is simply an int value (milliseconds, fine grained enough for real network sockets) and that there are no limitations to the number of sockets that can be monitored or their numeric value range.