I should state that I'm not asking about specific implementation details (yet), but just a general overview of what's going on. I understand the basic concept behind a socket, and need clarification on the process as a whole. My (probably very wrong) understanding is currently this:
A socket is constantly listening for clients that want to connect (in its own thread). When a connection occurs, an event is raised that spawns another thread to perform the connection process. During the connection process the client is assigned its own socket with which to communicate with the server. The server then waits for data from the client, and when data arrives an event is raised which spawns a thread to read the data from a stream into a buffer.
My questions are:
How off is my understanding?
Does each client socket require its own thread to listen for data on?
How is data routed to the correct client socket? Is this something taken care of by the guts of TCP/UDP/kernel?
In this threaded environment, what kind of data is typically being shared, and what are the points of contention?
Any clarifications and additional explanation would be greatly appreciated.
Regarding the question about what data is typically shared and points of contention, I realize this is more of an implementation detail than it is a question regarding general process of accepting connections and sending/receiving data. I had looked at a couple implementations (SuperSocket and Kayak) and noticed some synchronization for things like session cache and reusable buffer pools. Feel free to ignore this question. I've appreciated all your feedback.

One thread per connection is bad design (not scalable, overly complex) but unfortunately way too common.
A socket server works more or less like this:
A listening socket is set up to accept connections, and added to a socket set
The socket set is checked for events
If the listening socket has pending connections, new sockets are created by accepting the connections, and then added to the socket set
If a connected socket has events, the relevant IO functions are called
The socket set is checked for events again
This happens in one thread. You can easily handle thousands of connected sockets in a single thread, and there are few valid reasons for making this more complex by introducing threads.
while running
    select on socketset
    for each socket with events
        if socket is listener
            accept new connected socket
            add new socket to socketset
        else if socket is connection
            if event is readable
                read data
                process data
            else if event is writable
                write queued data
            else if event is closed connection
                remove socket from socketset
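As a rough illustration, the loop above could look like this in Python using the standard `select` module. This is only a sketch under assumptions: the "process data" step is replaced by a plain echo, and the function name `serve` and its `stop_after` parameter are hypothetical, added so the loop can be stopped in a demo.

```python
import select
import socket

def serve(listener, stop_after=None):
    """Single-threaded select() loop that echoes data back to each client.

    stop_after is a hypothetical demo knob: return once that many
    connections have been closed by their peers.
    """
    listener.setblocking(False)
    sockets = {listener}   # the "socket set"
    outgoing = {}          # connected socket -> bytes queued for writing
    closed = 0
    while stop_after is None or closed < stop_after:
        want_write = [s for s in outgoing if outgoing[s]]
        readable, writable, _ = select.select(list(sockets), want_write, [], 1.0)
        for s in readable:
            if s is listener:
                # Pending connection: accept it and add it to the socket set.
                conn, _addr = s.accept()
                conn.setblocking(False)
                sockets.add(conn)
                outgoing[conn] = b""
                continue
            try:
                data = s.recv(4096)
            except ConnectionError:
                data = b""
            if data:
                outgoing[s] += data      # "process data": here, just echo
            else:
                # Empty read: the peer closed the connection.
                sockets.discard(s)
                outgoing.pop(s, None)
                s.close()
                closed += 1
        for s in writable:
            if outgoing.get(s):          # skip sockets closed earlier in this pass
                sent = s.send(outgoing[s])
                outgoing[s] = outgoing[s][sent:]
```

To try it, bind and `listen()` on a TCP socket, pass it to `serve`, and connect a few clients; each one should get its own bytes back, all handled by the one thread.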
The IP stack takes care of all the details of which packets go to which "socket" in which order. Seen from the application's point of view, a socket represents a reliable ordered byte stream (TCP) or an unreliable unordered sequence of packets (UDP).
EDIT: In response to updated question.
I don't know either of the libraries you mention, but on the concepts you mention:
A session cache typically keeps data associated with a client, and can reuse this data for multiple connections. This makes sense when your application logic requires state information, but it's a layer higher than the actual networking end. In the above sample, the session cache would be used by the "process data" part.
Buffer pools are also an easy and often effective optimization for a high-traffic server. The concept is very easy to implement: instead of allocating/deallocating space for storing data you read/write, you fetch a preallocated buffer from a pool, use it, then return it to the pool. This avoids the (sometimes relatively expensive) backend allocation/deallocation mechanisms. This is not directly related to networking; you can just as well use buffer pools for e.g. something that reads chunks of files and processes them.
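A minimal sketch of such a pool in Python, to make the idea concrete. The class name `BufferPool` and its methods are made up for illustration and are not taken from SuperSocket or Kayak. It uses `queue.LifoQueue`, which is thread-safe, so the synchronization you noticed in those libraries is hidden inside the queue here.

```python
import queue

class BufferPool:
    """Hypothetical fixed-size pool of reusable bytearray buffers."""

    def __init__(self, count, size):
        self._size = size
        self._free = queue.LifoQueue()
        for _ in range(count):
            self._free.put(bytearray(size))

    def acquire(self):
        # Reuse a preallocated buffer if one is free.
        try:
            return self._free.get_nowait()
        except queue.Empty:
            # Pool exhausted: fall back to a fresh allocation.
            return bytearray(self._size)

    def release(self, buf):
        # Return the buffer so a later acquire() can reuse it.
        self._free.put(buf)
```

A reader would typically call `buf = pool.acquire()`, fill it with something like `sock.recv_into(buf)`, process the bytes, and then call `pool.release(buf)`.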

How off is my understanding?
Pretty far.
Does each client socket require its own thread to listen for data on?
How is data routed to the correct client socket? Is this something taken care of by the guts of TCP/UDP/kernel?
TCP/IP is a number of layers of protocol. There's no "kernel" to it. It's pieces, each with a separate API to the other pieces.
The IP address is handled in one place.
The port # is handled in another place.
The IP addresses are matched up with MAC addresses to identify a particular host. The port # is what ties a TCP (or UDP) socket to a particular piece of application software.
In this threaded environment, what kind of data is typically being shared, and what are the points of contention?
What threaded environment?
Data sharing? What?
Contention? The physical channel is the number one point of contention. (Ethernet, for example, depends on collision detection.) After that, well, every part of the computer system is a scarce resource shared by multiple applications and is a point of contention.


server.listen(5) vs multithreading in socket programming

I am working on socket programming in python. I am a bit confused with the concept of s.listen(5) and multithreading.
As far as I know, s.listen(5) is used so that the server can listen to up to 5 clients.
And multithreading is also used so that the server can be connected to many clients.
Please explain under which conditions we use multithreading.
Thanks in advance
You will need to use multithreading to handle multiple clients. When you accept a connection you receive a new socket instance that represents the connection with that new client. Now let's suppose you are writing a chat server and need to receive data from one client and send it to all connected clients. Without multithreading you would need a single loop that walks over the connected clients, reads from each one, and then sends the data to all of them, which is inefficient. You would also hit another problem: unless the sockets are non-blocking, the accept call blocks until a new client tries to connect. It's all about architecture, performance and good practices.
A good reading about multithreading follow this link
As far as I know, s.listen(5) is used so that the server can listen to up to 5 clients.
No. s.listen(5) declares a backlog of size 5. That means the listening socket will hold up to 5 connection requests in a pending state before they are accepted. Each time a connection request is accepted it is no longer in the pending backlog. So there is no limit (other than the server resources) to the number of accepted connections.
A common use of multithreading is to start a new thread after a connection has been accepted to process that connection. An alternative is to use select on a single thread to process all the connections in the same thread. That used to be the rule before multithreading became common, but it can lead to more complex programs.
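A small sketch of the thread-per-connection pattern, assuming a trivial echo protocol. The helper names `handle`, `accept_loop` and `start_server` are made up for this example. The backlog stays at 5, yet more than 5 clients can be connected at once, because each accepted connection leaves the pending queue.

```python
import socket
import threading

def handle(conn):
    # One thread per accepted connection: echo until the client closes.
    with conn:
        while True:
            try:
                data = conn.recv(4096)
            except ConnectionError:
                break
            if not data:
                break
            conn.sendall(data)

def accept_loop(listener):
    while True:
        try:
            conn, _addr = listener.accept()
        except OSError:
            break  # the listening socket was closed
        threading.Thread(target=handle, args=(conn,), daemon=True).start()

def start_server(backlog=5):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(backlog)          # pending-connection queue size, not a client limit
    threading.Thread(target=accept_loop, args=(listener,), daemon=True).start()
    return listener
```

Calling `start_server()` and then opening, say, eight client connections to `listener.getsockname()` should leave all eight usable at the same time.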

Is it safe in ZeroMQ to zmq_poll() a REP socket + send() from multiple threads?

I am wondering if a ZeroMQ REP socket is allowed to be poll()-ed on incoming data in one thread and used to send data from the other thread.
The idea I am trying to follow is the following:
A REP socket is not going to receive anything, as long as it did not send a reply to the incoming request. Thus if a zmq_poll() was called for such a socket, it'd just block (until timeout or forever).
Now, while this socket is a part of the zmq_poll() call for incoming data, what happens if another thread prepares a reply and uses this socket to send it?
Is it safe to do so, or are race conditions possible then?
ZeroMQ has been built on a set of a few maxims.
Zero-sharing is one of these core maxims.
While a user can experiment with sharing at her/his own risk, ZeroMQ best practices avoid doing that, except for very few and very specific cases, and not at the socket level. Sockets are deliberately not thread-safe, for the sake of higher overall performance and lower latency.
This is why a question like "What happens if another thread ..." may sound legitimate, yet it falls outside the domain of ZeroMQ best practices.

What kind of server needs select

I know that a server normally opens one port and listens on it.
Today I learned that Unix-like systems have a function called select. With select we can listen on multiple sockets.
I just can't imagine a case where we need to use select. If we have two sockets, it means that we are listening on two ports, right? So I have a question:
What kind of server would open more than one port but receive and process the same type of requests?
Using select helps with handling reads and writes on multiple sockets. It doesn't have to be multiple server sockets. The most typical use is for multiplexing a large number of client sockets.
You have a server with one listening socket. Each time you accept a connection, you add the new client socket to the multiplexing pool. select then returns any time any of those sockets has data available to read. The big win is that you're doing all this with one thread.
You also get a socket for each connection that you've accepted on the listening (server) socket.
selecting among these (client) sockets and the server socket (readable => new connection) allows you to write apps such as chat servers efficiently.
Ummm... remember the difference between ports and sockets.
A "port" is like a telephone-number. But a single phone-number could be handling any number of "calls!"
A "socket," then, represents a single telephone-call: a currently active connection between this server and a particular client. Each connection, by definition, "takes place over a particular port," but any number of connections might exist at the same time.
(The "accept" operation corresponds to: picking up the phone.)
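A quick way to see the port/socket distinction for yourself is the following Python sketch (the function name `demo` is just for illustration). It accepts three connections on one listening port and checks that every accepted socket reports the same local port, while each has a different peer address.

```python
import socket

def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen()
    port = listener.getsockname()[1]

    # Three "calls" to the same "phone number".
    clients = [socket.create_connection(("127.0.0.1", port)) for _ in range(3)]
    accepted = [listener.accept()[0] for _ in clients]

    local_ports = {s.getsockname()[1] for s in accepted}
    peers = {s.getpeername() for s in accepted}

    for s in clients + accepted + [listener]:
        s.close()
    return port, local_ports, peers
```

Here `local_ports` should contain only the listening port, and `peers` should hold three distinct (address, port) pairs, one per connection.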
So, then, what select() buys you is the ability to monitor any number of sockets at one time. It examines all the sockets, waits (if necessary) for something to happen on any one of them, and then tells you which sockets are ready. Now, the design of your server becomes "a simple loop." No matter how many sockets you're listening to, and no matter how many of them have messages waiting, you service the ready sockets one at a time.
It's basically the case that "every server out there will use a select() loop at its heart, unless there's an exceptionally wonderful reason not to."
Take a look here:
One traditional way to write network servers is to have the main
server block on accept(), waiting for a connection. Once a connection
comes in, the server fork()s, the child process handles the connection
and the main server is able to service new incoming requests.
With select(), instead of having a process for each request, there is
usually only one process that "multi-plexes" all requests, servicing
each request as much as it can.
So one main advantage of using select() is that your server will only
require a single process to handle all requests. Thus, your server
will not need shared memory or synchronization primitives for
different 'tasks' to communicate.
One major disadvantage of using select(), is that your server cannot
act like there's only one client, like with a fork()'ing solution. For
example, with a fork()'ing solution, after the server fork()s, the
child process works with the client as if there was only one client in
the universe -- the child does not have to worry about new incoming
connections or the existence of other sockets. With select(), the
programming isn't as transparent.

winsock application and multithreading - listening to socket event from another thread

assume we have an application which uses winsock to implement tcp communication.
for each socket we create a thread and block-receiving on it.
when data arrives, we would like to notify other threads (listening threads).
i was wondering what is the best way to implement this:
move away from this design and use a non-blocking socket, then the listening thread will have to iterate constantly and call a non-blocking receive, thus making it thread safe (no extra threads for the sockets)
use asynchronous procedure calls to notify listening threads - which again will have to alert-wait for apc to queue for them.
implement some thread safe message queue, where each socket thread will post messages to it, and the listener, again, will go over it every interval and pull data from it.
also, i read about WSAAsyncSelect, but i saw that this is used to send messages to a window. isnt there something similar for other threads? (well i guess apcs are...)
Use I/O completion ports. See the CreateIoCompletionPort() and the GetQueuedCompletionStatus() functions of the Win32 API (under File Management functions). In this instance, the socket descriptors are used in place of file handles.
You'll always be better off abstracting the mechanics of socket API (listening, accepting, reading & writing) in a separate layer from the application logic. Have an object that captures the state of a connection, which is created during an incoming connection and you can maintain buffers in this object for the incoming and outgoing traffic. This will allow your network interface layer to be independent of the application code. This will also make the code cleaner by separating the application functionality from the underlying communication mechanism.
The choice between blocking and non-blocking sockets depends on the level of scalability that your application needs to achieve. If your application needs to support hundreds of incoming connections, adopting a thread-per-socket approach is not going to be very wise. You'll be better off with an I/O completion ports based implementation, which will make your app immensely scalable at the cost of added code complexity. However, if you only foresee a few tens of connections at any point in time, you can go for an asynchronous sockets model using Win32 events or messages. The Win32 events based approach doesn't scale very well beyond a certain limit, as you would have to manage multiple threads if the number of concurrent sockets exceeds 63 (WaitForMultipleObjects can only wait on a maximum of 64 handles). The Windows message based mechanism doesn't have this limitation. OTOH, the Win32 event based approach does not require a GUI window to work.
Check out WSAEventSelect along with WSAAsyncSelect API documentation in MSDN.
You might want to take a look at boost::asio package as well. It provides a neat (though a little complex) C++ abstraction over sockets API.

choose between tcp "long" connection and "short" connection for internal service

I have an app in which the web server redirects some requests to backend servers, and the backend servers (Linux) do complicated computations and respond to the web server.
For the tcp socket connection management between web server and backend server, I think there are two basic strategies:
"short" connection: that is, one connection per request. This seems very easy for socket management and simplifies the whole program structure. After accept, we just get some thread to process the request and finally close the socket.
"long" connection: that is, one tcp connection can carry multiple requests, one after another. It seems this strategy could make better use of socket resources and bring some performance improvement (I am not quite sure). BUT it seems this brings a lot more complexity than the "short" connection. For example, since the socket fd may now be used by multiple threads, synchronization must be involved. And there is more: handling socket failures, message sequencing...
Are there any suggestions for choosing between these two strategies?
UPDATE: #SargeATM's answer reminds me that I should say more about the backend service.
Each request is kind of context-free. The backend service can do its calculation based on one single request message. It seems to be something stateless.
Without getting into the architecture of the backend which I think heavily influences this decision, I prefer short connections for stateless "quick" request/response type traffic and long connections for stateful protocols like a synchronization or file transfer.
I know there is some tcp overhead for establishing a new connection (if it isn't local host) but that has never been anything I have had to optimize in my applications.
Ok, I will get a little into architecture since this is important. I always use threads not per request but per function. So I would have one thread that listens on the socket, another thread that reads packets off all the active connections, another thread doing the backend calculations, and a last thread saving to a database if needed. This keeps things clean and simple. It is easy to measure slow spots, to maintain, and to optimize later if and when needed.
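To illustrate the thread-per-function idea, here is a small Python sketch in which each stage runs in its own thread and hands work to the next one through a `queue.Queue`. The names `stage` and `run_pipeline` are invented for this example, and the socket-reading stage is replaced by a plain list of requests to keep it short.

```python
import queue
import threading

def stage(work, inbox, outbox):
    """Run one pipeline function in its own thread until a None sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:
            # Sentinel: pass the shutdown signal downstream and stop.
            if outbox is not None:
                outbox.put(None)
            break
        result = work(item)
        if outbox is not None:
            outbox.put(result)

def run_pipeline(requests, compute, save):
    to_compute, to_save = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=stage, args=(compute, to_compute, to_save)),
        threading.Thread(target=stage, args=(save, to_save, None)),
    ]
    for t in threads:
        t.start()
    for r in requests:        # stands in for the thread reading from sockets
        to_compute.put(r)
    to_compute.put(None)      # no more requests
    for t in threads:
        t.join()
```

For example, `run_pipeline([1, 2, 3], lambda x: x * x, results.append)` with `results = []` pushes each request through the compute thread and then the save thread, in order.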
What about a third option... no connection!
If your job description and job results are both small, UDP sockets may be a good idea. You have even fewer resources to manage, as there's no need to bind the request/response to a file descriptor, which gives you some flexibility for the future. Imagine you have more backend services and would like to do some load balancing – a busy service can send the job to another one along with the UDP address of the job submitter. The latter just waits for the result and doesn't care where you performed the task.
Obviously you'd have to deal with lost, duplicated and out of order packets, but as a reward you don't have to deal with broken connections. Out of order packets are probably not a big deal if you can fit the request and response in one UDP message, duplication can be taken care of by some job ids, and lost packet... well, they can be simply resent ;-)
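A rough Python sketch of that idea, assuming a made-up wire format: a 4-byte big-endian job id followed by the payload, with the reply echoing the same id. The function names `submit` and `serve_one` and the retry parameters are hypothetical. Lost packets are handled by resending, and replies carrying a different job id are ignored.

```python
import socket
import struct

def submit(sock, server_addr, job_id, payload, retries=3, timeout=0.5):
    """Send one request datagram and wait for the reply with the same job id."""
    sock.settimeout(timeout)
    packet = struct.pack("!I", job_id) + payload
    for _ in range(retries):
        sock.sendto(packet, server_addr)
        try:
            while True:
                reply, _addr = sock.recvfrom(65535)
                if len(reply) >= 4 and struct.unpack("!I", reply[:4])[0] == job_id:
                    return reply[4:]
                # Reply for some other job (stale or duplicate): ignore it.
        except socket.timeout:
            continue  # request or reply was lost: resend
    raise TimeoutError(f"no reply for job {job_id}")

def serve_one(sock, work):
    """Handle a single request: run work() on the payload and echo the job id back."""
    request, addr = sock.recvfrom(65535)
    job_id = request[:4]
    sock.sendto(job_id + work(request[4:]), addr)
```

Note that a resent request may be executed twice by the backend; since the requests here are stateless that is usually harmless, but a real service might want to remember recent job ids.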
Consider this!
Well, you are right.
The biggest problem with persistent connections will be making sure that the app gets a "clean" connection from the pool, without any leftover data from another request.
There are a lot of ways to deal with that problem, but in the end it is better to close() a tainted connection and open a new one than to try to clean it...
