I am trying to understand the following code (taken from https://github.com/zadarnowski/postgresql-wire-protocol) which asynchronously sends and receives PSQL wire protocol messages. I am wondering how and when it would be useful to have this notion of an asynchronous send and receive. If I am getting the responses back on a separate thread, how do I know which request any given response belongs to.
Additionally, if a particular message caused an error, how would I know which message caused that error?
module Database.PostgreSQL.Protocol.Client.Connection (
Frontend
) where
import Control.Concurrent.Chan
import Control.Concurrent.MVar
data FrontendSession = FrontendSession {
frontendSendQueue :: Chan FrontendMessage,
frontendRecvQueue :: Chan BackendMessage,
frontendProcQueue :: Chan BackendMessageHandler,
frontendSendThread :: ThreadId,
frontendRecvThread :: ThreadId,
frontendProcThread :: ThreadId
}
newtype BackendMessageHandler :: Consume (BackendMessage -> IO BackendMessageHandler) | Pass
beginFrontendSession :: (Int32 -> IO Lazy.ByteString) -> (Lazy.ByteString -> IO ()) -> IO FrontendSession
beginFrontendSession readData writeData = do
sendQueue <- newChan
recvQueue <- newChan
procQueue <- newChan
sendThread <- forkIO $ forever $ readChan sendQueue >>= writeData . toLazyByteString . frontendMessage
recvThread <- forkIO $ forever $ readBackendMessage readData >>= writeChan recvQueue
procThread <- forkIO $ let getNextMessage h = readChan recvQueue >>= apply h
apply (Consume h) m = h m >>= getNextMessage
apply (Pass) m = readChan procQueue >>= flip apply m
in getNextMessage Pass
return FrontendSession {
frontendSendQueue = sendQueue,
frontendRecvQueue = recvQueue,
frontendProcQueue = procQueue,
frontendSendThread = sendThread,
frontendRecvThread = recvThread
}
The library in question appears to be in its very early stages. The file you are asking about doesn't even parse; and after fixing the parse errors doesn't compile. I doubt very much whether it's possible to give serious answers to any question about this codebase for some time yet. Still, I will make an attempt.
I am wondering how and when it would be useful to have this notion of an asynchronous send and receive. If I am getting the responses back on a separate thread, how do I know which request any given response belongs to.
It's not clear to me that "asynchronous" implies "you must send and receive on separate threads". However, it may imply that doing so is allowed; in which case presumably your two threads will need to be in communication with each other somehow.
Additionally, if a particular message caused an error, how would I know which message caused that error?
Presumably messages will include an identifier of some sort, and the error would report that identifier.
Related
I have a Haskell-based web service that performs a calculation that for some input can take a really long time to finish. ("really long" here means over a minute)
Because performing that calculation takes all the CPU available on the server, I place incoming requests in a queue (well, actually a stack for reasons that have to do with the typical client, but that's besides the point) when they arrive and service them when the currently running calculation finishes.
My problem is that the clients don't always wait long enough, and sometimes time out on their end, disconnect, and try a different server (well, they try again and hit the elb, and usually get a different instance). Also, occasionally the calculation the web client was asking for will become obsolete because of external factors and the web client will be killed.
In those cases I'd really like to be able to detect that the web client has gone away before I pull the next request off the stack and start the (expensive) calculation. Unfortunately, my experience with snap leads me to believe that there's no way in that framework to ask "is the client's TCP connection still connected?" and I haven't found any documentation for other web frameworks that cover the "client disconnected" case.
So is there a Haskell web framework that makes it easy to detect whether a web client has disconnected? Or failing that, is there one that at least makes it possible?
(I understand that it may not be possible to be absolutely certain in all cases whether a TCP client is still there without sending data to the other end; however, when the client actually sends RST packets to the server and the server's framework doesn't let the application code determine that the connection is gone, that's a problem)
Incidentally, though one might suspect that warp's onClose handler would let you do this, this fires only when a response is ready and written to the client so is useless as a way of aborting a calculation in progress. There also seems to be no way to get access to the accepted socket so as to set SO_KEEPALIVE or similar. (There are ways to access the initial listening socket, but not the accepted one)
So I found an answer that works for me and it might work for someone else.
It turns out that you can in fact mess around enough with the internals of Warp to do this, but then what you're left with is a basic version of Warp and if you need things like logging, etc., will need to add other packages on to that.
Also, note that so-called "half-closed" connections (when the client closes their sending end, but is still waiting for data) will be detected as closed, interrupting your calculation. I don't know of any HTTP clients that deal in half-closed connections, but just something to be aware of.
Anyway, what I did was first copy the functions runSettings and runSettingsSocket exposed by Network.Wai.Handler.Warp and Network.Wai.Handler.Warp.Internal and made versions that called a function I supplied instead of WarpI.socketConnection, so that I have the signature:
runSettings' :: Warp.Settings -> (Socket -> IO (IO WarpI.Connection))
-> Wai.Application -> IO ()
This required copying out a few helper methods, like setSocketCloseOnExec and windowsThreadBlockHack. The double-IO signature there might look weird, but it's what you want - the outer IO is run in the main thread (that calls accept) and the inner IO is run in the per-connection thread that is forked after accept returns. The original Warp function runSettings is equivalent to:
\set -> runSettings' set (WarpI.socketConnection >=> return . return)
Then I did:
data ClientDisappeared = ClientDisappeared deriving (Show, Eq, Enum, Ord)
instance Exception ClientDisappeared
runSettingsSignalDisconnect :: Warp.Settings -> Wai.Application -> IO ()
runSettingsSignalDisconnect set =
runSettings' set (WarpI.socketConnection >=> return . wrapConn)
where
-- Fork a 'monitor' thread that does nothing but attempt to
-- perform a read from conn in a loop 1/sec, and wrap the receive
-- methods on conn so that they first consume from the stuff read
-- by the monitoring thread. If the monitoring thread sees
-- end-of-file (signaled by an empty string read), raise
-- ClientDisappered on the per-connection thread.
wrapConn conn = do
tid <- myThreadId
nxtBstr <- newEmptyMVar :: IO (MVar ByteString)
semaphore <- newMVar ()
readerCount <- newIORef (0 :: Int)
monitorThread <- forkIO (monitor tid nxtBstr semaphore readerCount)
return $ conn {
WarpI.connClose = throwTo monitorThread ClientDisappeared
>> WarpI.connClose conn
, WarpI.connRecv = newRecv nxtBstr semaphore readerCount
, WarpI.connRecvBuf = newRecvBuf nxtBstr semaphore readerCount
}
where
newRecv :: MVar ByteString -> MVar () -> IORef Int
-> IO ByteString
newRecv nxtBstr sem readerCount =
bracket_
(atomicModifyIORef' readerCount $ \x -> (succ x, ()))
(atomicModifyIORef' readerCount $ \x -> (pred x, ()))
(withMVar sem $ \_ -> do w <- tryTakeMVar nxtBstr
case w of
Just w' -> return w'
Nothing -> WarpI.connRecv conn
)
newRecvBuf :: MVar ByteString -> MVar () -> IORef Int
-> WarpI.Buffer -> WarpI.BufSize -> IO Bool
newRecvBuf nxtBstr sem readerCount buf bufSize =
bracket_
(atomicModifyIORef' readerCount $ \x -> (succ x, ()))
(atomicModifyIORef' readerCount $ \x -> (pred x, ()))
(withMVar sem $ \_ -> do
(fulfilled, buf', bufSize') <-
if bufSize == 0 then return (False, buf, bufSize)
else
do w <- tryTakeMVar nxtBstr
case w of
Nothing -> return (False, buf, bufSize)
Just w' -> do
let wlen = B.length w'
if wlen > bufSize
then do BU.unsafeUseAsCString w' $ \cw' ->
copyBytes buf (castPtr cw') bufSize
putMVar nxtBstr (B.drop bufSize w')
return (True, buf, 0)
else do BU.unsafeUseAsCString w' $ \cw' ->
copyBytes buf (castPtr cw') wlen
return (wlen == bufSize, plusPtr buf wlen,
bufSize - wlen)
if fulfilled then return True
else WarpI.connRecvBuf conn buf' bufSize'
)
dropClientDisappeared :: ClientDisappeared -> IO ()
dropClientDisappeared _ = return ()
monitor tid nxtBstr sem st =
catch (monitor' tid nxtBstr sem st) dropClientDisappeared
monitor' tid nxtBstr sem st = do
(hitEOF, readerCount) <- withMVar sem $ \_ -> do
w <- tryTakeMVar nxtBstr
case w of
-- No one picked up our bytestring from last time
Just w' -> putMVar nxtBstr w' >> return (False, 0)
Nothing -> do
w <- WarpI.connRecv conn
putMVar nxtBstr w
readerCount <- readIORef st
return (B.null w, readerCount)
if hitEOF && (readerCount == 0)
-- Don't signal if main thread is also trying to read -
-- in that case, main thread will see EOF directly
then throwTo tid ClientDisappeared
else do threadDelay oneSecondInMicros
monitor' tid nxtBstr sem st
oneSecondInMicros = 1000000
Assuming that 'web service' means HTTP(S)-based clients, one option is to use a RESTful approach. Instead of assuming that clients are going to stay connected, the service could accept the request and return 202 Accepted. As the HTTP status code specification outlines:
The request has been accepted for processing, but the processing has not been completed [...]
The 202 response is intentionally non-committal. Its purpose is to allow a server to accept a request for some other process (perhaps a batch-oriented process that is only run once per day) without requiring that the user agent's connection to the server persist until the process is completed. The entity returned with this response SHOULD include an indication of the request's current status and either a pointer to a status monitor or some estimate of when the user can expect the request to be fulfilled.
The server immediately responds with a 202 Accepted response, and also includes a URL that the client can use to poll for status. One option is to put this URL in the response's Location header, but you can also put the URL in a link in the response's body.
The client can poll the status URL for status. Once the calculation finishes, the status resource can provide a link to the finished result.
You can add cache headers to the status resource and final result if you're concerned that the clients will be polling too hard.
REST in Practice outlines the general concepts, while the RESTful Web Services Cookbook has lots of good details.
I'm not saying that you can't do something with either HTTP or TCP/IP (I don't know), but if you can't, then the above is a tried-and-true solution to similar problems.
Obviously, this is completely independent on programming language, but it's been my experience that REST and algebraic data types go well together.
I wonder how I/O were done in Haskell in the days when IO monad was still not invented. Anyone knows an example.
Edit: Can I/O be done without the IO Monad in modern Haskell? I'd prefer an example that works with modern GHC.
Before the IO monad was introduced, main was a function of type [Response] -> [Request]. A Request would represent an I/O action like writing to a channel or a file, or reading input, or reading environment variables etc.. A Response would be the result of such an action. For example if you performed a ReadChan or ReadFile request, the corresponding Response would be Str str where str would be a String containing the read input. When performing an AppendChan, AppendFile or WriteFile request, the response would simply be Success. (Assuming, in all cases, that the given action was actually successful, of course).
So a Haskell program would work by building up a list of Request values and reading the corresponding responses from the list given to main. For example a program to read a number from the user might look like this (leaving out any error handling for simplicity's sake):
main :: [Response] -> [Request]
main responses =
[
AppendChan "stdout" "Please enter a Number\n",
ReadChan "stdin",
AppendChan "stdout" . show $ enteredNumber * 2
]
where (Str input) = responses !! 1
firstLine = head . lines $ input
enteredNumber = read firstLine
As Stephen Tetley already pointed out in a comment, a detailed specification of this model is given in chapter 7 of the 1.2 Haskell Report.
Can I/O be done without the IO Monad in modern Haskell?
No. Haskell no longer supports the Response/Request way of doing IO directly and the type of main is now IO (), so you can't write a Haskell program that doesn't involve IO and even if you could, you'd still have no alternative way of doing any I/O.
What you can do, however, is to write a function that takes an old-style main function and turns it into an IO action. You could then write everything using the old style and then only use IO in main where you'd simply invoke the conversion function on your real main function. Doing so would almost certainly be more cumbersome than using the IO monad (and would confuse the hell out of any modern Haskeller reading your code), so I definitely would not recommend it. However it is possible. Such a conversion function could look like this:
import System.IO.Unsafe
-- Since the Request and Response types no longer exist, we have to redefine
-- them here ourselves. To support more I/O operations, we'd need to expand
-- these types
data Request =
ReadChan String
| AppendChan String String
data Response =
Success
| Str String
deriving Show
-- Execute a request using the IO monad and return the corresponding Response.
executeRequest :: Request -> IO Response
executeRequest (AppendChan "stdout" message) = do
putStr message
return Success
executeRequest (AppendChan chan _) =
error ("Output channel " ++ chan ++ " not supported")
executeRequest (ReadChan "stdin") = do
input <- getContents
return $ Str input
executeRequest (ReadChan chan) =
error ("Input channel " ++ chan ++ " not supported")
-- Take an old style main function and turn it into an IO action
executeOldStyleMain :: ([Response] -> [Request]) -> IO ()
executeOldStyleMain oldStyleMain = do
-- I'm really sorry for this.
-- I don't think it is possible to write this function without unsafePerformIO
let responses = map (unsafePerformIO . executeRequest) . oldStyleMain $ responses
-- Make sure that all responses are evaluated (so that the I/O actually takes
-- place) and then return ()
foldr seq (return ()) responses
You could then use this function like this:
-- In an old-style Haskell application to double a number, this would be the
-- main function
doubleUserInput :: [Response] -> [Request]
doubleUserInput responses =
[
AppendChan "stdout" "Please enter a Number\n",
ReadChan "stdin",
AppendChan "stdout" . show $ enteredNumber * 2
]
where (Str input) = responses !! 1
firstLine = head . lines $ input
enteredNumber = read firstLine
main :: IO ()
main = executeOldStyleMain doubleUserInput
I'd prefer an example that works with modern GHC.
For GHC 8.6.5:
import Control.Concurrent.Chan(newChan, getChanContents, writeChan)
import Control.Monad((<=<))
type Dialogue = [Response] -> [Request]
data Request = Getq | Putq Char
data Response = Getp Char | Putp
runDialogue :: Dialogue -> IO ()
runDialogue d =
do ch <- newChan
l <- getChanContents ch
mapM_ (writeChan ch <=< respond) (d l)
respond :: Request -> IO Response
respond Getq = fmap Getp getChar
respond (Putq c) = putChar c >> return Putp
where the type declarations are from page 14 of How to Declare an Imperative by Philip Wadler. Test programs are left as an exercise for curious readers :-)
If anyone is wondering:
-- from ghc-8.6.5/libraries/base/Control/Concurrent/Chan.hs, lines 132-139
getChanContents :: Chan a -> IO [a]
getChanContents ch
= unsafeInterleaveIO (do
x <- readChan ch
xs <- getChanContents ch
return (x:xs)
)
yes - unsafeInterleaveIO does make an appearance.
#sepp2k already clarified how this works, but i wanted to add a few words
I'm really sorry for this. I don't think it is possible to write this function without unsafePerformIO
Of course you can, you should almost never use unsafePerformIO
http://chrisdone.com/posts/haskellers
I'm using slightly different Request type constructor, so that it does not take channel version (stdin / stdout like in #sepp2k's code). Here is my solution for this:
(Note: getFirstReq doesn't work on empty list, you would have to add a case for that, bu it should be trivial)
data Request = Readline
| PutStrLn String
data Response = Success
| Str String
type Dialog = [Response] -> [Request]
execRequest :: Request -> IO Response
execRequest Readline = getLine >>= \s -> return (Str s)
execRequest (PutStrLn s) = putStrLn s >> return Success
dialogToIOMonad :: Dialog -> IO ()
dialogToIOMonad dialog =
let getFirstReq :: Dialog -> Request
getFirstReq dialog = let (req:_) = dialog [] in req
getTailReqs :: Dialog -> Response -> Dialog
getTailReqs dialog resp =
\resps -> let (_:reqs) = dialog (resp:resps) in reqs
in do
let req = getFirstReq dialog
resp <- execRequest req
dialogToIOMonad (getTailReqs dialog resp)
Since sodium has been deprecated by the author I'm trying to port my code to reactive-banana. However, there seem to be some incongruencies between the two that I'm having a hard time overcomming.
For example, in sodium it was easy to retrieve the current value of a behaviour:
retrieve :: Behaviour a -> IO a
retrieve b = sync $ sample b
I don't see how to do this in reactive-banana
(The reason I want this is because I'm trying to export the behaviour as a dbus property. Properties can be queried from other dbus clients)
Edit: Replaced the word "poll" as it was misleading
If you have a Behaviour modelling the value of your property, and you have an Event modelling the incoming requests for the property's value, then you can just use (<#) :: Behavior b -> Event a -> Event b1 to get a new event occurring at the times of your incoming requests with the value the property has at that time). Then you can transform that into the actual IO actions you need to take to reply to the request and use reactimate as usual.
1 https://hackage.haskell.org/package/reactive-banana-1.1.0.0/docs/Reactive-Banana-Combinators.html#v:-60--64-
For conceptual/architectural reasons, Reactive Banana has functions from Event to Behavior, but not vice versa, and it makes sense too, given th nature and meaning of FRP. I'm quite sure you can write a polling function, but instead you should consider changing the underlying code to expose events instead.
Is there a reason you can't change your Behavior into an Event? If not, that would be a good way to go about resolving your issue. (It might in theory even reveal a design shortcoming you have been overlooking so far.)
The answer seems to be "it's sort of possible".
sample corresponds to valueB, but there is no direct equivalent to sync.
However, it can be re-implemented with the help of execute:
module Sync where
import Control.Monad.Trans
import Data.IORef
import Reactive.Banana
import Reactive.Banana.Frameworks
data Network = Network { eventNetwork :: EventNetwork
, run :: MomentIO () -> IO ()
}
newNet :: IO Network
newNet = do
-- Create a new Event to handle MomentIO actions to be executed
(ah, call) <- newAddHandler
network <- compile $ do
globalExecuteEV <- fromAddHandler ah
-- Set it up so it executes MomentIO actions passed to it
_ <- execute globalExecuteEV
return ()
actuate network
return $ Network { eventNetwork = network
, run = call -- IO Action to fire the event
}
-- To run a MomentIO action within the context of the network, pass it to the
-- event.
sync :: Network -> MomentIO a -> IO a
sync Network{run = call} f = do
-- To retrieve the result of the action we set up an IORef
ref <- newIORef (error "Network hasn't written result to ref")
-- (`call' passes the do-block to the event)
call $ do
res <- f
-- Put the result into the IORef
liftIO $ writeIORef ref res
-- and read it back once the event has finished firing
readIORef ref
-- Example
main :: IO ()
main = do
net <- newNet -- Create an empty network
(bhv1, set1) <- sync net $ newBehavior (0 :: Integer)
(bhv2, set2) <- sync net $ newBehavior (0 :: Integer)
set1 3
set2 7
let sumB = (liftA2 (+) bhv1 bhv2)
print =<< sync net (valueB sumB)
set1 5
print =<< sync net (valueB sumB)
return ()
I have a TChan as input for a thread which should behave like this:
If sombody writes to the TChan within a specific time, the content should be retrieved. If there is nothing written within the specified time, it should unblock and continue with Nothing.
My attempt on this was to use the timeout function from System.Timeout like this:
timeout 1000000 $ atomically $ readTChan pktChannel
This seemed to work but now I discovered, that I am sometimes loosing packets (they are written to the channel, but not read on the other side. In the log I get this:
2014.063.11.53.43.588365 Pushing Recorded Packet: 2 1439
2014.063.11.53.43.592319 Run into timeout
2014.063.11.53.44.593396 Run into timeout
2014.063.11.53.44.593553 Pushing Recorded Packet: 3 1439
2014.063.11.53.44.597177 Sending Recorded Packet: 3 1439
Where "Pushing Recorded Packet" is the writing from the one thread and "Sending Recorded Packet" is the reading from the TChan in the sender thread. The line with Sending Recorded Packet 2 1439 is missing, which would indicate a successful read from the TChan.
It seems that if the timeout is received at the wrong point in time, the channel looses the packet. I suspect that the threadKill function used inside timeout and STM don't play well together.
Is this correct? Does somebody have another solution that does not loose the packet?
Use registerDelay, an STM function, to signal a TVar when the timeout is reached. You can then use the orElse function or the Alternative operator <|> to select between the next TChan value or the timeout.
import Control.Applicative
import Control.Monad
import Control.Concurrent
import Control.Concurrent.STM
import System.Random
-- write random values after a random delay
packetWriter :: Int -> TChan Int -> IO ()
packetWriter maxDelay chan = do
let xs = randomRs (10000 :: Int, maxDelay + 50000) (mkStdGen 24036583)
forM_ xs $ \ x -> do
threadDelay x
atomically $ writeTChan chan x
-- block (retry) until the delay TVar is set to True
fini :: TVar Bool -> STM ()
fini = check <=< readTVar
-- Read the next value from a TChan or timeout
readTChanTimeout :: Int -> TChan a -> IO (Maybe a)
readTChanTimeout timeoutAfter pktChannel = do
delay <- registerDelay timeoutAfter
atomically $
Just <$> readTChan pktChannel
<|> Nothing <$ fini delay
-- | Print packets until a timeout is reached
readLoop :: Show a => Int -> TChan a -> IO ()
readLoop timeoutAfter pktChannel = do
res <- readTChanTimeout timeoutAfter pktChannel
case res of
Nothing -> putStrLn "timeout"
Just val -> do
putStrLn $ "packet: " ++ show val
readLoop timeoutAfter pktChannel
main :: IO ()
main = do
let timeoutAfter = 1000000
-- spin up a packet writer simulation
pktChannel <- newTChanIO
tid <- forkIO $ packetWriter timeoutAfter pktChannel
readLoop timeoutAfter pktChannel
killThread tid
The thumb rule of concurrency is: if adding a sleep in some point inside an IO action matters, your program is not safe.
To understand why the code timeout 1000000 $ atomically $ readTChan pktChannel does not work, consider the following alternative implementation of atomically:
atomically' :: STM a -> IO a
atomically' action = do
result <- atomically action
threadDelay someTimeAmount
return result
The above is equal to atomically, but for an extra innocent delay. Now it is easy to see that if timeout kills the thread during the threadDelay, the atomic action has completed (consuming a message from the channel), yet timeout will return Nothing.
A simple fix to timeout n $ atomically ... could be the following
smartTimeout :: Int -> STM a -> IO (Maybe a)
smartTimeout n action = do
v <- atomically $ newEmptyTMvar
_ <- timeout n $ atomically $ do
result <- action
putTMvar v result
atomically $ tryTakeTMvar v
The above uses an extra transactional variable v to do the trick. The result value of the action is stored into v inside the same atomic block in which the action is run. The return value of timeout is not trusted, since it does not tell us if action was run or not. After that, we check the TMVar v, which will be full if and only if action was run.
Instead of TChan a, use TChan (Maybe a) . Your normal producer (of x) now writes Just x. Fork an extra "ticking" process that writes Nothing to the channel (every x seconds). Then have a reader for the channel, and abort if you get two successive Nothing. This way, you avoid exceptions, which may cause data to get lost in your case (but I am not sure).
I'm working on a haskell network application and I use the actor pattern to manage multithreading. One thing I came across is how to store for example a set of client sockets/handles. Which of course must be accessible for all threads and can change when clients log on/off.
Since I'm coming from the imperative world I thought about some kind of lock-mechanism but when I noticed how ugly this is I thought about "pure" mutability, well actually it's kind of pure:
import Control.Concurrent
import Control.Monad
import Network
import System.IO
import Data.List
import Data.Maybe
import System.Environment
import Control.Exception
newStorage :: (Eq a, Show a) => IO (Chan (String, Maybe (Chan [a]), Maybe a))
newStorage = do
q <- newChan
forkIO $ storage [] q
return q
newHandleStorage :: IO (Chan (String, Maybe (Chan [Handle]), Maybe Handle))
newHandleStorage = newStorage
storage :: (Eq a, Show a) => [a] -> Chan (String, Maybe (Chan [a]), Maybe a) -> IO ()
storage s q = do
let loop = (`storage` q)
(req, reply, d) <- readChan q
print ("processing " ++ show(d))
case req of
"add" -> loop ((fromJust d) : s)
"remove" -> loop (delete (fromJust d) s)
"get" -> do
writeChan (fromJust reply) s
loop s
store s d = writeChan s ("add", Nothing, Just d)
unstore s d = writeChan s ("remove", Nothing, Just d)
request s = do
chan <- newChan
writeChan s ("get", Just chan, Nothing)
readChan chan
The point is that a thread (actor) is managing a list of items and modifies the list according to incoming requests. Since thread are really cheap I thought this could be a really nice functional alternative.
Of course this is just a prototype (a quick dirty proof of concept).
So my question is:
Is this a "good" way of managing shared mutable variables (in the actor world) ?
Is there already a library for this pattern ? (I already searched but I found nothing)
Regards,
Chris
Here is a quick and dirty example using stm and pipes-network. This will set up a simple server that allows clients to connect and increment or decrement a counter. It will display a very simple status bar showing the current tallies of all connected clients and will remove client tallies from the bar when they disconnect.
First I will begin with the server, and I've generously commented the code to explain how it works:
import Control.Concurrent.STM (STM, atomically)
import Control.Concurrent.STM.TVar
import qualified Data.HashMap.Strict as H
import Data.Foldable (forM_)
import Control.Concurrent (forkIO, threadDelay)
import Control.Monad (unless)
import Control.Monad.Trans.State.Strict
import qualified Data.ByteString.Char8 as B
import Control.Proxy
import Control.Proxy.TCP
import System.IO
main = do
hSetBuffering stdout NoBuffering
{- These are the internal data structures. They should be an implementation
detail and you should never expose these references to the
"business logic" part of the application. -}
-- I use nRef to keep track of creating fresh Ints (which identify users)
nRef <- newTVarIO 0 :: IO (TVar Int)
{- hMap associates every user (i.e. Int) with a counter
Notice how I've "striped" the hash map by storing STM references to the
values instead of storing the values directly. This means that I only
actually write the hashmap when adding or removing users, which reduces
contention for the hash map.
Since each user gets their own unique STM reference for their counter,
modifying counters does not cause contention with other counters or
contention with the hash map. -}
hMap <- newTVarIO H.empty :: IO (TVar (H.HashMap Int (TVar Int)))
{- The following code makes heavy use of Haskell's pure closures. Each
'let' binding closes over its current environment, which is safe since
Haskell is pure. -}
let {- 'getCounters' is the only server-facing command in our STM API. The
only permitted operation is retrieving the current set of user
counters.
'getCounters' closes over the 'hMap' reference currently in scope so
that the server never needs to be aware about our internal
implementation. -}
getCounters :: STM [Int]
getCounters = do
refs <- fmap H.elems (readTVar hMap)
mapM readTVar refs
{- 'init' is the only client-facing command in our STM API. It
initializes the client's entry in the hash map and returns two
commands: the first command is what the client calls to 'increment'
their counter and the second command is what the client calls to log
off and delete
'delete' command.
Notice that those two returned commands each close over the client's
unique STM reference so the client never needs to be aware of how
exactly 'init' is implemented under the hood. -}
init :: STM (STM (), STM ())
init = do
n <- readTVar nRef
writeTVar nRef $! n + 1
ref <- newTVar 0
modifyTVar' hMap (H.insert n ref)
let incrementRef :: STM ()
incrementRef = do
mRef <- fmap (H.lookup n) (readTVar hMap)
forM_ mRef $ \ref -> modifyTVar' ref (+ 1)
deleteRef :: STM ()
deleteRef = modifyTVar' hMap (H.delete n)
return (incrementRef, deleteRef)
{- Now for the actual program logic. Everything past this point only uses
the approved STM API (i.e. 'getCounters' and 'init'). If I wanted I
could factor the above approved STM API into a separate module to enforce
the encapsulation boundary, but I am lazy. -}
{- Fork a thread which polls the current state of the counters and displays
it to the console. There is a way to implement this without polling but
this gets the job done for now.
Most of what it is doing is just some simple tricks to reuse the same
console line instead of outputting a stream of lines. Otherwise it
would be just:
forkIO $ forever $ do
ns <- atomically getCounters
print ns
-}
forkIO $ (`evalStateT` 0) $ forever $ do
del <- get
lift $ do
putStr (replicate del '\b')
putStr (replicate del ' ' )
putStr (replicate del '\b')
ns <- lift $ atomically getCounters
let str = show ns
lift $ putStr str
put $! length str
lift $ threadDelay 10000
{- Fork a thread for each incoming connection, which listens to the client's
commands and translates them into 'STM' actions -}
serve HostAny "8080" $ \(socket, _) -> do
(increment, delete) <- atomically init
{- Right now, just do the dumb thing and convert all keypresses into
increment commands, with the exception of the 'q' key, which will
quit -}
let handler :: (Proxy p) => () -> Consumer p Char IO ()
handler () = runIdentityP loop
where
loop = do
c <- request ()
unless (c == 'q') $ do
lift $ atomically increment
loop
{- This uses my 'pipes' library. It basically is a high-level way to
say:
* Read binary packets from the socket no bigger than 4096 bytes
* Get the first character from each packet and discard the rest
* Handle the character using the above 'handler' function -}
runProxy $ socketReadS 4096 socket >-> mapD B.head >-> handler
{- The above pipeline finishes either when the socket closes or
'handler' stops looping because it received a 'q'. Either case means
that the client is done so we log them out using 'delete'. -}
atomically delete
Next up is the client, which simply opens a connections and forwards all key presses as single packets:
import Control.Monad
import Control.Proxy
import Control.Proxy.Safe
import Control.Proxy.TCP.Safe
import Data.ByteString.Char8 (pack)
import System.IO
main = do
hSetBuffering stdin NoBuffering
hSetEcho stdin False
{- Again, this uses my 'pipes' library. It basically says:
* Read characters from the console using 'commands'
* Pack them into a binary format
* send them to a server running at 127.0.0.1:8080
This finishes looping when the user types a 'q' or the connection is
closed for whatever reason.
-}
runSafeIO $ runProxy $ runEitherK $
try . commands
>-> mapD (\c -> pack [c])
>-> connectWriteD Nothing "127.0.0.1" "8080"
commands :: (Proxy p) => () -> Producer p Char IO ()
commands () = runIdentityP loop
where
loop = do
c <- lift getChar
respond c
unless (c == 'q') loop
It's pretty simple: commands generates a stream of Chars, which then get converted to ByteStrings and then sent as packets to the server.
If you run the server and a few clients and have them each type in a few keys, your server display will output a list showing how many keys each client typed:
[1,6,4]
... and if some of the clients disconnect they will be removed from the list:
[1,4]
Note that the pipes component of these examples will simplify greatly in the upcoming pipes-4.0.0 release, but the current pipes ecosystem still gets the job done as is.
First, I'd definitely recommend using your own specific data type for representing commands. When using (String, Maybe (Chan [a]), Maybe a) a buggy client can crash your actor simply by sending an unknown command or by sending ("add", Nothing, Nothing), etc. I'd suggest something like
data Command a = Add a | Remove a | Get (Chan [a])
Then you can pattern match on commands in storage in a save way.
Actors have their advantages, but also I feel that they have some drawbacks. For example, getting an answer from an actor requires sending it a command and then waiting for a reply. And the client can't be completely sure that it gets a reply and that the reply will be of some specific type - you can't say I want only answers of this type (and how many of them) for this particular command.
So as an example I'll give a simple, STM solution. It'd be better to use a hash table or a (balanced tree) set, but since Handle implements neither Ord nor Hashable, we can't use these data structures, so I'll keep using lists.
module ThreadSet (
TSet, add, remove, get
) where
import Control.Monad
import Control.Monad.STM
import Control.Concurrent.STM.TVar
import Data.List (delete)
newtype TSet a = TSet (TVar [a])
add :: (Eq a) => a -> TSet a -> STM ()
add x (TSet v) = readTVar v >>= writeTVar v . (x :)
remove :: (Eq a) => a -> TSet a -> STM ()
remove x (TSet v) = readTVar v >>= writeTVar v . delete x
get :: (Eq a) => TSet a -> STM [a]
get (TSet v) = readTVar v
This module implements a STM based set of arbitrary elements. You can have multiple such sets and use them together in a single STM transaction that succeeds or fails at once. For example
-- | Ensures that there is exactly one element `x` in the set.
add1 :: (Eq a) => a -> TSet a -> STM ()
add1 x v = remove x v >> add x v
This would be difficult with actors, you'd have to add it as another command for the actor, you can't compose it of existing actions and still have atomicity.
Update: There is an interesting article explaining why Clojure designers chose not to use actors. For example, using actors, even if you have many reads and only very little writes to a mutable structure, they're all serialized, which can greatly impact performance.