Haskell concurrency and Handles - haskell

I'm writing a little notification server to push data to a client. The basic architecture looks something like this (pared down pseudo-code):
acceptConnections sock = forever $ do
connection <- accept sock
forkIO (handleConnection connection)
handleConnection connection = do
connectionHandle <- socketToHandle connection ReadWriteMode
handleMessage connectionHandle
hClose connectionHandle
handleMessage connectionHandle = forever $ do
message <- hGetLine connectionHandle
if shouldPushMessage message
then hPutStrLn targetConnection message
else return ()
Where targetConnection (in handleMessage) is from a separate connection and is hanging up handleMessage in a different thread waiting for its buffer to be filled. I would think this would cause a problem as I have 2 threads accessing the same Handle. So, my question is, why isn't this a problem? Or is it, and I just haven't seen it turn into an issue yet? In my actual application, when I grab targetConnection, I do so through a map I access via an MVar, but it's not being safely accessed at the hGetLine call.
Disclaimer: I'm a complete Haskell and multi-threaded newb
Thanks for any explanations/insight!

Handle, as implemented in GHC, is already an MVar wrapping over the underlying IODevice. I didn't quite get what you're doing (not saying it was unclear, I'm a little ill so perhaps I'm slow) but am guessing GHCs built in thread-safe handling of Handle is saving you.

Related

Which high level interface to use instead of Network for sending and haskell `Strings`

The first line of documentation of the Network module reads:
This module is kept for backwards-compatibility. New users are encouraged to use Network.Socket instead.
The Network library has functions that allow to send and receive Strings in a convenient manner, for instance:
h <- connectTo "localhost" (PortNumber 9090)
-- ...
line <- hGetLine h
Without using this library (and using Network.Socket instead), the code above will become something like:
addrinfos <- getAddrInfo Nothing (Just "") (Just "9090")
let serveraddr = head addrinfos
sock <- socket (addrFamily serveraddr) Stream defaultProtocol
connect sock (addrAddress serveraddr)
-- ...
msg <- recv sock Size
-- What `Size` should be? The line above should be probably repeated until all the data is received.
Which is is quite low level, and requires coding/decoding a String's. So my question is, if all I want to do is to send and receive strings (with some encoding) over a socket what are the alternatives for accomplishing this task, provided that Network is not an option?
EDIT: although pipes are conduits are good options, the project I'm working on makes heavy use of these Network functions, so I ended up developing a library for sending and receiving Text using similar functions as in the Network library.

Why should buffering not be used in the following example?

I was reading this tutorial:
http://www.catonmat.net/blog/simple-haskell-tcp-server/
To learn the basics of Haskell's Network module. He wrote a small function called sockHandler:
sockHandler :: Socket -> IO ()
sockHandler sock = do
(handle, _, _) <- accept sock
hSetBuffering handle NoBuffering
forkIO $ commandProcessor handle
sockHandler sock
That accepts a connection, and forks it to a new thread. While breaking down the code, he says:
"Next we use hSetBuffering to change buffering mode for the client's socket handle to NoBuffering, so we didn't have buffering surprises."
But doesn't elaborate on that point. What surprises is he talking about? I Google'd it, and saw a few security articles (Which I'm guessing is related to the cache being intercepted), but nothing seemingly related to the content of this tutorial.
What is the issue? I thought about it, but I don't think I have enough networking experience to fill in the blanks.
Thank you.
For the sake of illustration, suppose the protocol allows the server to query the client for some information, e.g. (silly example follows)
hPutStr sock "Please choose between A or B"
choice <- hGetLine sock
case decode choice of
Just A -> handleA
Just B -> handleB
Nothing -> protocolError
Everything looks fine... but the server seems to hang. Why? This is because the message was not really sent over the network by hPutStr, but merely inserted in a local buffer. Hence, the other end never receives the query, so does not reply, causing the server to get stuck in its read.
A solution here would be to insert an hFlush sock before reading. This has to be manually inserted at the "right" points, and is prone to error. A lazier option would be to disable buffering entirely -- this is safer, albeit it severely impacts performance.

Why is chat server example on haskell.org thread safe?

I'm new to Haskell and I can't figure out what I'm not understanding about this example on the Haskell wiki: http://www.haskell.org/haskellwiki/Implement_a_chat_server
The specific code in question is this:
runConn :: (Socket, SockAddr) -> Chan Msg -> -> IO ()
runConn (sock, _) chan = do
let broadcast msg = writeChan chan msg
hdl <- socketToHandle sock ReadWriteMode
hSetBuffering hdl NoBuffering
chan' <- dupChan chan
-- fork off thread for reading from the duplicated channel
forkIO $ fix $ \loop -> do
line <- readChan chan'
hPutStrLn hdl line
loop
-- read lines from socket and echo them back to the user
fix $ \loop -> do
line <- liftM init (hGetLine hdl)
broadcast line
loop
The code above has one thread writing to the handle hdl at the same time (potentially) as another thread is reading from it. Is this safe?
I suspect the nature of forkIO (being internal to Haskell and not a system thread library or process) is what makes this work, but I'm not sure.
I checked the documentation of forkIO for any mention of IO handles
but found nothing. I also checked the documentation of System.IO but couldn't find any mention of using handles between threads without using locking.
So can someone tell me how I should know when something like this is safe when the docs don't mention anything about thread safety?
It's not the nature of forkIO that makes this works but the nature of MVar that is used to implement both Chan and Handle.
If you want to understand how Chan works take a look at this section "MVar as building blocks: Unbounded Channels" in chapter 7 of the excellent book "Parallel and Concurrent Programming in Haskell" by Simon Marlow. In the same chapter there is a section about forkIO and MVar that will help you understand how Handle can be implemented in a thread safe way.
Chapter 12 talks specifically about various ways to implement network servers, including a chat server that is implemented using STM instead of Chans.
If it wasn't safe, blocking sockets would be almost impossible to use. If your protocol is asynchronous and you're using blocking sockets, you need a thread blocking on read pretty much all the time. If you then needed to send a message to the other side, how could you do it? Wait for the other side to send you a message?

Asynchronous UDP server/client as base for IPC in Haskell

I want to put together the basics for asynchronous UDP IPC in Haskell. For this the sender/receiver should issue e.g. an synchronous receive (or send, depending from what side you view it) thread and carry on with other tasks.
This might involve to define a new data type that consists of optional message/data serial numbers and some sort of buffer so that the send thread can stop sending when it gets a notification from the receiver that it cannot cope with the speed.
I aim to make this as light weight & asynchronous as possible.
I have tried a number of things such as starting a new receive thread for every packet (took this approach from a paper about multi player online games), but this was grinding almost everything to a halt.
Below is my innocent first take on this. Any help on e.g. creating buffers, creating serial numbers or a DCCP implementation (that I could not find) in Haskell appreciated. - I would not like to get into opinionated discussions about UDP vs TCP etc..
My snippet stops working once something gets out of sync e.g. when no data arrives any more or when less data arrives than expected. I am looking as said for some way of lightweight (featherweight :D) sync between the send and the receive thread of for an example of such.
main = withSocketsDo $ do
s <- socket AF_INET Datagram defaultProtocol
hostAddr <- inet_addr host
done <- newEmptyMVar
let p = B.pack "ping"
thread <- forkIO $ receiveMessages s done
forM_ [0 .. 10000] $ \i -> do
sendAllTo s (B.pack "ping") (SockAddrInet port hostAddr)
takeMVar done
killThread thread
sClose s
return()
receiveMessages :: Socket -> MVar () -> IO ()
receiveMessages socket done = do
forM_ [0 .. 10000] $ \i -> do
r <- recvFrom socket 1024
print (r) --this is a placeholder to make the fun complete
putMVar done ()
If you don't trust your messenger, you can never agree on anything -- not even a single bit like "are we done yet"!

Automatically reconnect a Haskell Network connection in an idiomatic way

I've worked my way through Don Stewart's Roll your own IRC bot tutorial, and am playing around with some extensions to it. My current code is essentially the same as the "The monadic, stateful, exception-handling bot in all its glory"; it's a bit too long to paste here unless someone requests it.
Being a Comcast subscriber, it's particularly important that the bot be able to reconnect after periods of poor connectivity. My approach is to simply time the PING requests from the server, and if it goes without seeing a PING for a certain time, to try reconnecting.
So far, the best solution I've found is to wrap the hGetLine in the listen loop with System.Timeout.timeout. However, this seems to require defining a custom exception so that the catch in main can call main again, rather than return (). It also seems quite fragile to specify a timeout value for each individual hGetLine.
Is there a better solution, perhaps something that wraps an IO a like bracket and catch so that the entire main can handle network timeouts without the overhead of a new exception type?
How about running a separate thread that performs all the reading and writing and takes care of periodically reconnecting the handle?
Something like this
input :: Chan Char
output :: Chan Char
putChar c = writeChan output c
keepAlive = forever $ do
h <- connectToServer
catch
(forever $
do c <- readChan output; timeout 4000 (hPutChar h c); return ())
(\_ -> return ())
The idea is to encapsulate all the difficulty with periodically reconnecting into a separate thread.

Resources