Automatically reconnect a Haskell Network connection in an idiomatic way

I've worked my way through Don Stewart's "Roll your own IRC bot" tutorial, and am playing around with some extensions to it. My current code is essentially the same as the version in "The monadic, stateful, exception-handling bot in all its glory"; it's a bit too long to paste here unless someone requests it.
Being a Comcast subscriber, it's particularly important that the bot be able to reconnect after periods of poor connectivity. My approach is to simply time the PING requests from the server, and if the bot goes a certain time without seeing a PING, to try reconnecting.
So far, the best solution I've found is to wrap the hGetLine in the listen loop with System.Timeout.timeout. However, this seems to require defining a custom exception so that the catch in main can call main again, rather than return (). It also seems quite fragile to specify a timeout value for each individual hGetLine.
Is there a better solution, perhaps something that wraps an IO a like bracket and catch so that the entire main can handle network timeouts without the overhead of a new exception type?

How about running a separate thread that performs all the reading and writing and takes care of periodically reconnecting the handle?
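Something like this:
import Control.Concurrent.Chan (Chan, readChan, writeChan)
import Control.Exception (IOException, catch)
import Control.Monad (forever, void)
import System.IO (Handle, hPutChar)
import System.Timeout (timeout)
-- Other threads queue characters here instead of writing to the Handle.
sendChar :: Chan Char -> Char -> IO ()
sendChar = writeChan
-- connectToServer opens a fresh Handle; the outer loop reconnects on I/O errors.
keepAlive :: IO Handle -> Chan Char -> IO ()
keepAlive connectToServer output = forever $ do
    h <- connectToServer
    catch
        (forever $ do
            c <- readChan output
            void (timeout 4000 (hPutChar h c)))  -- timeout is in microseconds
        (\e -> const (return ()) (e :: IOException))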
The idea is to encapsulate all the difficulty with periodically reconnecting into a separate thread.
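On the question's point about avoiding a custom exception type, here is a minimal sketch of one possible approach: turn a timeout into an ordinary IOException with userError, and let a generic retry loop catch IOException. The names withReconnect, orTimeout and demo are made up for illustration, and the "connection" in demo is just an attempt counter standing in for a real Handle.

```haskell
import Control.Concurrent (threadDelay)
import Control.Exception (IOException, try)
import Data.IORef (modifyIORef', newIORef, readIORef)
import System.Timeout (timeout)

-- Turn a timeout into an ordinary IOException, so no new exception type is needed.
orTimeout :: Int -> IO a -> IO a
orTimeout micros action = do
    r <- timeout micros action
    maybe (ioError (userError "timed out")) return r

-- Run `session` on a fresh connection. If it fails with an IOException,
-- wait `delayMicros` microseconds, reconnect, and try again.
withReconnect :: Int -> IO conn -> (conn -> IO a) -> IO a
withReconnect delayMicros connect session = loop
  where
    loop = do
        result <- try (connect >>= session)
        case result of
            Right value -> return value
            Left err -> do
                let _ = err :: IOException  -- only I/O errors trigger a reconnect
                threadDelay delayMicros
                loop

-- Usage sketch: the first two "connections" time out, the third succeeds.
demo :: IO Int
demo = do
    attempts <- newIORef (0 :: Int)
    let connect = modifyIORef' attempts (+ 1) >> readIORef attempts
    withReconnect 1000 connect $ \n ->
        if n < 3
            then orTimeout 1000 (threadDelay 1000000 >> return n)
            else return n
```

In a real bot, connect would open the socket and session would be the listen loop, with each hGetLine wrapped in orTimeout.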

Related

TMVar, but without the buffer?

I'm trying to do communication between Haskell lightweight threads. Threads want to send each other messages for communication and synchronisation.
I was originally using TMVar for this, but I've just realised that the semantics are wrong: a TMVar will store one message in it internally, so posting a message to an empty TMVar won't block. It'll only block if you post a message to a full TMVar.
Can anyone suggest a similar STM IPC construct which:
will cause all writes to block until the message is consumed;
will cause all reads to block until a message is provided?
i.e. a zero-length pipe would be ideal; but I don't think BoundedChan would be happy if I gave it a capacity of 0. (Also, it's not STM.)

If I understand your problem correctly, I don't think you can, since the transactional guarantees mean that transaction A can't read from transaction B's write until transaction B is committed, at which point it can no longer block.
TMVar is the closest you're going to get if you're using STM. With IO, you may be able to build a structure which only completes a write when a reader is available (this structure may already exist, but I'm not aware of it).
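For the IO-only route, one way such a structure might look is sketched below. This is a hypothetical Rendezvous type built from plain MVars, not an existing library API, and it does not try to be safe against asynchronous exceptions.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar

-- A hypothetical rendezvous channel built from MVars (IO only, not STM).
data Rendezvous a = Rendezvous
    { sendLock :: MVar ()   -- serialises senders so acknowledgements cannot be mixed up
    , slot     :: MVar a    -- carries the message
    , ack      :: MVar ()   -- signals that the message was taken
    }

newRendezvous :: IO (Rendezvous a)
newRendezvous = Rendezvous <$> newMVar () <*> newEmptyMVar <*> newEmptyMVar

-- Blocks until a receiver has taken the message.
send :: Rendezvous a -> a -> IO ()
send r x = withMVar (sendLock r) $ \_ -> do
    putMVar (slot r) x
    takeMVar (ack r)

-- Blocks until a sender provides a message.
receive :: Rendezvous a -> IO a
receive r = do
    x <- takeMVar (slot r)
    putMVar (ack r) ()
    return x

-- Usage sketch: send returns only after the forked receiver took the value.
demo :: IO Int
demo = do
    r <- newRendezvous
    result <- newEmptyMVar
    _ <- forkIO (receive r >>= putMVar result)
    send r 42
    takeMVar result
```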

I'd suggest to reformulate the two requirements:
will cause all writes to block until the message is consumed;
will cause all reads to block until a message is provided.
The problem is with the terms "block" and "consumed/provided". With STM there is no notion of blocking, there is just retry, which has different semantics: it restarts the current transaction - it doesn't wait until something happens (this could cause deadlocks). So we can't say "block until ...", we can only say something like "the transaction succeeds only when ...".
Similarly, what does "until a message is consumed/provided" mean? Since transactions are atomic, it can only be "until the transaction that consumed/provided a message succeeded".
So let's try to reformulate:
will cause all writes to retry until a transaction that consumes the message succeeds;
will cause all reads to retry until a transaction that provides a message succeeds.
But now the first point doesn't make sense: If a write retries, there is no message to be consumed, the transaction didn't pause, it's been discarded and started over - possibly producing a different message!
In other words: data can leave an STM transaction only when the transaction succeeds (commits). This is by design - transactions are always atomic from the point of view of the outside world and of other transactions - you can never observe the results of only part of a transaction. You can never observe two transactions interacting.
So a 0-length queue is a bad analogy - it would never allow any data to pass through. At the end of any transaction it would have to be empty, so no data would ever get through.
Nevertheless I believe it'll be possible to reformulate the requirements according to your goals and subsequently find a solution.

You say you would be happy with one side or the other being in IO rather than STM. So then it is not too hard to code this up. Let's start with the version that has receiving in IO. To make this happen, the receiver will have to initiate the handshake.
type SynchronousVar a = TChan (TMVar a)
send :: SynchronousVar a -> a -> STM ()
receive :: SynchronousVar a -> IO a
send svar a = do
    tmvar <- readTChan svar
    putTMVar tmvar a
receive svar = do
    tmvar <- newEmptyTMVarIO
    atomically $ writeTChan svar tmvar
    atomically $ takeTMVar tmvar
A similar protocol can be written that has sending start the handshake.
type SynchronousVar a = TChan (a, TMVar ())
send :: SynchronousVar a -> a -> IO ()
receive :: SynchronousVar a -> STM a
send svar a = do
    tmvar <- newEmptyTMVarIO
    atomically $ writeTChan svar (a, tmvar)
    atomically $ takeTMVar tmvar
receive svar = do
    (a, tmvar) <- readTChan svar
    putTMVar tmvar ()
    return a
Probably, if you really need synchronous communication, this is because you want two-way communication (i.e. the action that's running in IO wants to know something about the thread it's synchronizing with). It is not hard to extend the above protocol to pass off a tad more information about the synchronization (by adding it to the one-tuple in the former case or to the TMVar in the latter case).
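As a rough usage sketch of the receiver-initiated variant, assuming send has type STM () and using a made-up demo function: a forked thread calls receive in IO while the main thread sends from inside a transaction. The send transaction retries until the receiver has posted its TMVar on the channel.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM

type SynchronousVar a = TChan (TMVar a)

-- Succeeds only once some receiver has announced itself on the channel.
send :: SynchronousVar a -> a -> STM ()
send svar a = do
    tmvar <- readTChan svar
    putTMVar tmvar a

-- Announce a private reply slot, then wait for a sender to fill it.
receive :: SynchronousVar a -> IO a
receive svar = do
    tmvar <- newEmptyTMVarIO
    atomically $ writeTChan svar tmvar
    atomically $ takeTMVar tmvar

-- Usage sketch: one receiver thread, one sending transaction.
demo :: IO String
demo = do
    svar <- newTChanIO
    result <- newEmptyMVar
    _ <- forkIO (receive svar >>= putMVar result)
    atomically (send svar "hello")
    takeMVar result
```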

Why should buffering not be used in the following example?

I was reading this tutorial:
http://www.catonmat.net/blog/simple-haskell-tcp-server/
To learn the basics of Haskell's Network module. He wrote a small function called sockHandler:
sockHandler :: Socket -> IO ()
sockHandler sock = do
    (handle, _, _) <- accept sock
    hSetBuffering handle NoBuffering
    forkIO $ commandProcessor handle
    sockHandler sock
That accepts a connection, and forks it to a new thread. While breaking down the code, he says:
"Next we use hSetBuffering to change buffering mode for the client's socket handle to NoBuffering, so we didn't have buffering surprises."
But doesn't elaborate on that point. What surprises is he talking about? I Google'd it, and saw a few security articles (Which I'm guessing is related to the cache being intercepted), but nothing seemingly related to the content of this tutorial.
What is the issue? I thought about it, but I don't think I have enough networking experience to fill in the blanks.
Thank you.

For the sake of illustration, suppose the protocol allows the server to query the client for some information, e.g. (silly example follows)
hPutStr sock "Please choose between A or B"
choice <- hGetLine sock
case decode choice of
    Just A -> handleA
    Just B -> handleB
    Nothing -> protocolError
Everything looks fine... but the server seems to hang. Why? This is because the message was not really sent over the network by hPutStr, but merely inserted in a local buffer. Hence, the other end never receives the query, so does not reply, causing the server to get stuck in its read.
A solution here would be to insert an hFlush sock before reading. This has to be inserted manually at the "right" points, and is prone to error. A lazier option would be to disable buffering entirely -- this is safer, although it can severely hurt performance.
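One way to make the flush harder to forget is to bundle it into the helper that sends a query. The sketch below assumes a read/write Handle for the connection; sendLine and ask are made-up names, not part of the tutorial.

```haskell
import System.IO

-- Write one line and flush, so the text actually leaves the local buffer.
sendLine :: Handle -> String -> IO ()
sendLine h msg = do
    hPutStrLn h msg
    hFlush h

-- Send a query and then block for the peer's one-line reply.
-- Without the flush inside sendLine, the peer might never see the query.
ask :: Handle -> String -> IO String
ask h prompt = do
    sendLine h prompt
    hGetLine h
```

With helpers like these the handle can keep its default buffering, and only the points where the protocol waits for the other side pay for a flush.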

Long polling in Yesod

Can I do long polling in Yesod, or any other Haskell web framework with comparable database facilities?
To be precise, I want to delay a HTTP response until something interesting happens. There should also be a timeout after which the client will be served a response saying "nothing happened" and then the client will issue the same request.
To make life even more complicated, the app I have in mind is serving all its stuff over both HTTP/HTML5 and a really compact UDP protocol to MIDP clients. Events from either protocol can release responses in either protocol.
TIA,
Adrian.

I can't answer all the issues of the more complicated UDP stuff, but the short answer is that, yes, Yesod supports long polling. You can essentially do something like:
myHandler = do
    mres <- timeout timeoutInMicroseconds someAction
    case mres of
        Nothing -> return nothingHappenedResponse
        Just res -> doSomething res
You'll probably want to use System.Timeout.Lifted from the lifted-base package.

Michael's answer hits the timeout requirement. For general clients you do not want to keep HTTP responses waiting for more than about 60 seconds as they may be connecting through a proxy or similar which tend to get impatient after about that long. If you're on a more tightly controlled network then you may be able to relax this timeout. One minor correction is that the parameter to timeout is in microseconds not nanoseconds.
For the 'wait for something interesting to happen' part, we use the check combinator from Control.Concurrent.STM (which wraps up retry) so our handler thread waits on a TVar:
someAction = do
    interestingStuff <- atomically $ do
        currentStuff <- readTVar theStuff
        check $ isInteresting currentStuff
        return currentStuff
    respondWith interestingStuff
Meanwhile, other threads (incl HTTP handlers) are updating theStuff :: TVar Stuff - each update triggers a new calculation of isInteresting and potentially a response if it returns True.
This is compatible with serving the same information over UDP: simply share theStuff between your UDP server threads and the Yesod threads.
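Putting the timeout and the STM wait together outside of Yesod, a minimal self-contained sketch might look like this. It uses plain IO and System.Timeout rather than a Yesod handler, and waitForInteresting and demo are hypothetical names.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.STM
import System.Timeout (timeout)

-- Wait until the TVar holds a value satisfying the predicate, or give up
-- after `micros` microseconds (Nothing means "nothing happened").
waitForInteresting :: Int -> (a -> Bool) -> TVar a -> IO (Maybe a)
waitForInteresting micros isInteresting var =
    timeout micros $ atomically $ do
        current <- readTVar var
        check (isInteresting current)
        return current

-- Usage sketch: the first wait times out, the second is released by a writer thread.
demo :: IO (Maybe Int, Maybe Int)
demo = do
    var <- newTVarIO 0
    quiet <- waitForInteresting 10000 (> 0) var
    _ <- forkIO (threadDelay 10000 >> atomically (writeTVar var 7))
    woken <- waitForInteresting 5000000 (> 0) var
    return (quiet, woken)
```

The writer thread here plays the role of the UDP server or another HTTP handler updating the shared TVar.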

Detecting I/O exceptions in a lazy String from hGetContents?

hGetContents returns a lazy String object that can be used in purely functional code to read from a file handle. If an I/O exception occurs while reading this lazy string, the underlying file handle is closed silently and no additional characters are added to the lazy string.
How can this I/O exception be detected?
As a concrete example, consider the following program:
import System.IO -- for stdin
lengthOfFirstLine :: String -> Int
lengthOfFirstLine "" = 0
lengthOfFirstLine s = (length . head . lines) s
main :: IO ()
main = do
    lazyStdin <- hGetContents stdin
    print (lengthOfFirstLine lazyStdin)
If an exception occurs while reading the first line of the file, this program will print the number of characters until the I/O exception occurs. Instead I want the program to crash with the appropriate I/O exception. How could this program be modified to have that behavior?
Edit: Upon closer inspection of the hGetContents implementation, it appears that the I/O exception is not ignored but rather bubbles up through the calling pure functional code to whatever IO code happened to trigger evaluation, which has the opportunity to then handle it. (I was not previously aware that pure functional code could raise exceptions.) Thus this question is a misunderstanding.
Aside: It would be best if this exceptional behavior were verified empirically. Unfortunately it is difficult to simulate a low level I/O error.

Lazy IO is considered a pitfall by many Haskellers, and the usual advice is to stay away from it. Your case illustrates why.
There is a non-lazy alternative to the hGetContents function. It works on Text, which is also generally preferred over String. For convenience, there are alternative preludes that replace String with Text: basic-prelude and classy-prelude.

"Aside: It would be best if this exceptional behavior were verified empirically. Unfortunately it is difficult to simulate a low level I/O error."
I was wondering about the same thing, found this old question, and decided to perform an experiment.
I ran this little program on Windows; it listens for a connection and reads from it lazily:
import System.IO
import Network
import Control.Concurrent
main :: IO ()
main = withSocketsDo (do
    socket <- listenOn (PortNumber 19999)
    print "created socket"
    (h,_,_) <- accept socket
    print "accepted connection"
    contents <- hGetContents h
    print contents)
From a Linux machine, I opened a connection using nc:
nc -v mymachine 19999
Connection to mymachine 19999 port [tcp/*] succeeded!
And then used Windows Sysinternal's TCPView utility to forcibly close the connection. The result was:
Main.exe: <socket: 348>: hGetContents: failed (Unknown error)
It appears that I/O exceptions do bubble up.
A further experiment: I added a delay just after the hGetContents call:
...
    contents <- hGetContents h
    threadDelay (60 * 1000^2)
    print contents)
With this change, killing the connection doesn't immediately raise an exception because, thanks to lazy I/O, nothing is actually read until print executes.
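If the goal is to have a read error surface at a known point, one option is to force the part of the lazy string you need inside try. The sketch below uses a hypothetical firstLineLength helper; note that it only forces the input up to the first newline, so an error later in the stream would still go unnoticed here.

```haskell
import Control.Exception (IOException, evaluate, try)
import System.IO

-- Length of the first line, forcing the lazy read inside `try` so that a
-- read error shows up here as a Left instead of later, in pure code.
firstLineLength :: Handle -> IO (Either IOException Int)
firstLineLength h = try $ do
    contents <- hGetContents h
    evaluate (length (takeWhile (/= '\n') contents))
```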

Implement main server loop in Haskell?

What is the generally accepted way to implement the main loop of a server that needs to wait on a heterogeneous set of events? That is, the server should wait (not busy-wait) until one of the following occurs:
new socket connection
data available on an existing socket
OS signal
third-party library callbacks

I think you're thinking in terms of a C paradigm with a single thread, nonblocking I/O, and a select() call.
You can manage to write something like that in Haskell, but Haskell has much more to offer:
lightweight threads
safe and efficient concurrent data primitives like MVar and Chan
the Big Gun: Software Transactional Memory
I recommend you fork a new thread for every separate point of contact with the outside world, and keep everything coordinated with STM.
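A rough sketch of that shape, with one lightweight thread per event source feeding a shared STM queue, could look like the following. The ServerEvent constructors, spawnSource, nextEvent and demo are all invented for illustration, and the TMVars in demo stand in for real sockets and signal handlers.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM
import Control.Monad (forever)

-- Hypothetical event type; the payloads stand in for real connection data.
data ServerEvent
    = NewConnection String
    | GotSignal Int
    deriving (Eq, Show)

-- Fork one lightweight thread per event source; each blocks on its own
-- source and forwards whatever it produces to the shared queue.
spawnSource :: TQueue ServerEvent -> IO ServerEvent -> IO ()
spawnSource queue waitForEvent = do
    _ <- forkIO $ forever $ waitForEvent >>= atomically . writeTQueue queue
    return ()

-- The main loop blocks (without busy-waiting) until any source reports.
nextEvent :: TQueue ServerEvent -> IO ServerEvent
nextEvent = atomically . readTQueue

-- Usage sketch: two fake sources driven by TMVars.
demo :: IO [ServerEvent]
demo = do
    queue <- newTQueueIO
    connSource <- newEmptyTMVarIO
    sigSource <- newEmptyTMVarIO
    spawnSource queue (NewConnection <$> atomically (takeTMVar connSource))
    spawnSource queue (GotSignal <$> atomically (takeTMVar sigSource))
    atomically (putTMVar sigSource 15)
    first <- nextEvent queue
    atomically (putTMVar connSource "client-1")
    second <- nextEvent queue
    return [first, second]
```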

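Use takeMVar and putMVar to synchronize between threads. Each blocks the calling thread until the operation can proceed: takeMVar waits while the MVar is empty, and putMVar waits while it is full.
See the GHC documentation for Control.Concurrent.MVar.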

I'd like to make it clear I think the two solutions posted first are better than this one for the specific problem you have, but here's a way to solve the type of problem you presented.
A simple way round this is to take your definitions like
data SocketConn = ....
data DataAvail = ...
data OSSignal = ...
data Callback = ...
and define the unsimplified version of
data ServerEvent = Soc SocketConn | Dat DataAvail | Sig OSSignal | Call Callback
handleEvent :: ServerEvent -> IO ()
handleEvent (Soc s) = ....
handleEvent (Dat d) = ....
handleEvent (Sig o) = ....
handleEvent (Call c) = ....
Like I say, read up on the other answers!

Software Transactional Memory (STM) is the main way to do a multi-way wait.
However, by the looks of things, in your case you probably just want to spawn a separate Haskell thread for each task, and let each such thread block while there's nothing happening.
You wouldn't want to create a thousand OS threads, but a thousand Haskell threads is no trouble at all.
(If these threads need to coordinate from time to time, then again, STM is probably the simplest, most reliable way to do that.)
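For the multi-way wait itself, the usual STM building block is orElse. A minimal sketch, assuming two sources of different types and a made-up waitEither helper:

```haskell
import Control.Concurrent.STM

-- Wait on two differently-typed sources at once. The TMVar is tried first;
-- if it is empty the queue is tried, and if both are empty the whole
-- transaction retries until one of them changes.
waitEither :: TMVar a -> TQueue b -> STM (Either a b)
waitEither var queue =
    (Left <$> takeTMVar var) `orElse` (Right <$> readTQueue queue)
```

Running it with atomically (waitEither var queue) suspends the calling thread without busy-waiting while both sources are empty.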
