Cloud Haskell hanging forever when sending messages to ManagedProcess - haskell

The Problem
Hello! I'm writing a simple Server - Worker program in Cloud Haskell. The problem is that when I try to create a ManagedProcess, my example hangs forever after the server-discovery step, even when using callTimeout (which should give up after 100 ms). The code is very simple, but I cannot find anything wrong with it.
I've also posted the question on the mailing list, but as far as I know the SO community, I can get an answer a lot faster here. If I get an answer from the mailing list, I will post it here as well.
Source Code
The Worker.hs:
{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE TemplateHaskell #-}
module Main where
import Network.Transport (EndPointAddress(EndPointAddress))
import Control.Distributed.Process hiding (call)
import Control.Distributed.Process.Platform hiding (__remoteTable)
import Control.Distributed.Process.Platform.Async
import Control.Distributed.Process.Platform.ManagedProcess
import Control.Distributed.Process.Platform.Time
import Control.Distributed.Process.Platform.Timer (sleep)
import Control.Distributed.Process.Closure (mkClosure, remotable)
import Network.Transport.TCP (createTransport, defaultTCPParameters)
import Control.Distributed.Process.Node hiding (call)
import Control.Concurrent (threadDelay)
import GHC.Generics (Generic)
import Data.Binary (Binary)
import Data.Typeable (Typeable)
import Data.ByteString.Char8 (pack)
import System.Environment (getArgs)
import qualified Server as Server
main = do
    [host, port, serverAddr] <- getArgs
    Right transport <- createTransport host port defaultTCPParameters
    node <- newLocalNode transport initRemoteTable
    let addr  = EndPointAddress (pack serverAddr)
        srvID = NodeId addr
    _ <- forkProcess node $ do
        sid <- discoverServer srvID
        liftIO $ putStrLn "x"
        liftIO $ print sid
        r <- callTimeout sid (Server.Add 5 6) 100 :: Process (Maybe Double)
        liftIO $ putStrLn "x"
        liftIO $ threadDelay (10 * 1000 * 1000)
    threadDelay (10 * 1000 * 1000)
    return ()
discoverServer srvID = do
    whereisRemoteAsync srvID "serverPID"
    reply <- expectTimeout 100 :: Process (Maybe WhereIsReply)
    case reply of
        Just (WhereIsReply _ msid) -> case msid of
            Just sid -> return sid
            Nothing  -> discoverServer srvID
        Nothing -> discoverServer srvID
The Server.hs:
{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE TemplateHaskell #-}
module Server where
import Control.Distributed.Process hiding (call)
import Control.Distributed.Process.Platform hiding (__remoteTable)
import Control.Distributed.Process.Platform.Async
import Control.Distributed.Process.Platform.ManagedProcess
import Control.Distributed.Process.Platform.Time
import Control.Distributed.Process.Platform.Timer (sleep)
import Control.Distributed.Process.Closure (mkClosure, remotable)
import Network.Transport.TCP (createTransport, defaultTCPParameters)
import Control.Distributed.Process.Node hiding (call)
import Control.Concurrent (threadDelay)
import GHC.Generics (Generic)
import Data.Binary (Binary)
import Data.Typeable (Typeable)
data Add = Add Double Double
    deriving (Typeable, Generic)

instance Binary Add
launchServer :: Process ProcessId
launchServer = spawnLocal $ serve () (statelessInit Infinity) server >> return ()
  where
    server = statelessProcess
        { apiHandlers = [ handleCall_ (\(Add x y) -> liftIO (putStrLn "!") >> return (x + y)) ]
        , unhandledMessagePolicy = Drop
        }
main = do
    Right transport <- createTransport "127.0.0.1" "8080" defaultTCPParameters
    node <- newLocalNode transport initRemoteTable
    _ <- forkProcess node $ do
        self <- getSelfPid
        register "serverPID" self
        liftIO $ putStrLn "x"
        mid <- launchServer
        liftIO $ putStrLn "y"
        r <- call mid (Add 5 6) :: Process Double
        liftIO $ print r
        liftIO $ putStrLn "z"
        liftIO $ threadDelay (10 * 1000 * 1000)
        liftIO $ putStrLn "z2"
    threadDelay (10 * 1000 * 1000)
    return ()
We can run them as follows:
runhaskell Server.hs
runhaskell Worker.hs 127.0.0.2 8080 127.0.0.1:8080:0
The Results
When we run the programs, we get the following output:
from Server:
x
y
!
11.0 -- this shows that inside the same process we were able to use the "call" function
z
-- waiting - all the output above came from tests inside the server; now it waits for external messages
from Worker:
x
pid://127.0.0.1:8080:0:10 -- this is the process id of the server obtained with whereisRemoteAsync
-- waiting forever on the "callTimeout sid (Server.Add 5 6) 100" code!
As a side note, I've found that sending messages with send (from Control.Distributed.Process) and receiving them with expect works. But sending them with call (from Control.Distributed.Process.Platform) and trying to receive them with the ManagedProcess API handlers hangs the call forever (even with callTimeout!).

Your client is getting an exception, which you cannot easily observe because you are running your client inside forkProcess. If you want to do that, that is fine, but then you need to monitor or link to that process. In this case, simply using runProcess would be much simpler. If you do that, you will see this exception:
Worker.hs: trying to call fromInteger for a TimeInterval. Cannot guess units
callTimeout does not take an Integer; it takes a TimeInterval, which is constructed with the functions in the Time module. TimeInterval has a pseudo-Num instance that, it seems, does not actually support fromInteger. I would consider that a bug, or at least bad form (in Haskell), but in any case the way to fix your code is simply:
r <- callTimeout sid (Server.Add 5 6) (milliSeconds 100) :: Process (Maybe Double)
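For completeness, here is a sketch of the Worker's client logic with both suggestions applied - runProcess instead of forkProcess (so the exception surfaces in the terminal) and milliSeconds for the timeout. Untested, but it uses only the imports Worker.hs already has:
main = do
    [host, port, serverAddr] <- getArgs
    Right transport <- createTransport host port defaultTCPParameters
    node <- newLocalNode transport initRemoteTable
    let srvID = NodeId (EndPointAddress (pack serverAddr))
    -- runProcess blocks until the process finishes, so failures no longer
    -- disappear silently inside a forked process
    runProcess node $ do
        sid <- discoverServer srvID
        liftIO $ print sid
        r <- callTimeout sid (Server.Add 5 6) (milliSeconds 100) :: Process (Maybe Double)
        liftIO $ print r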
To fix the problem with the client calling into the server, you need to register the pid of the server process you spawned rather than the main process you spawn it from - i.e. change
self <- getSelfPid
register "serverPID" self
liftIO $ putStrLn "x"
mid <- launchServer
liftIO $ putStrLn "y"
to
mid <- launchServer
register "serverPID" mid
liftIO $ putStrLn "y"
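Putting it together, the relevant part of the server's main then looks roughly like this (a sketch of the fixed block, untested):
_ <- forkProcess node $ do
    mid <- launchServer          -- spawn the managed process first
    register "serverPID" mid     -- register the server's pid, not our own
    liftIO $ putStrLn "y"
    r <- call mid (Add 5 6) :: Process Double
    liftIO $ print r
    liftIO $ threadDelay (10 * 1000 * 1000)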

Related

Haskell Persistent Library - How do I get data from my database to my frontend?

Hi and thanks for your time.
I'm trying to create a website that features a button that increments a counter. I want the current counter value to be persistent, and if somebody visits my page, the current counter should be displayed.
A request should be sent every time I click the button to increment the counter. The request does not contain any information about the counter value. The server - in my case a warp web server - should update the counter value in the database, read the value after the update, and then send it to the frontend if successful, or an error message if not.
So far, only the updating works, since I have not managed to figure out how to get the data from the database to the frontend.
Here is the code from my Repository module that should do the updating:
{-# LANGUAGE EmptyDataDecls, FlexibleContexts, GADTs, GeneralizedNewtypeDeriving#-}
{-# LANGUAGE MultiParamTypeClasses, OverloadedStrings, QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell, TypeFamilies, DataKinds, FlexibleInstances#-}
{-# LANGUAGE DerivingStrategies, StandaloneDeriving, UndecidableInstances #-}
module Repository (increaseCounter) where
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Logger (runStderrLoggingT)
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
import Control.Monad.Reader
import Data.Text
import Data.Maybe
-- setting up the Counter entity with a unique key so I can use the getBy function
share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Counter
    counterName String
    counterCount Int Maybe
    UniqueCounterName counterName
    deriving Show
|]
increaseCounter :: IO ()
increaseCounter =
    runStderrLoggingT $ withSqliteConn "//absolute/path/database.db" $ runSqlConn $ do
        runMigration migrateAll -- only for developing
        updateWhere [CounterCounterName ==. "unique name"] [CounterCounterCount +=. Just 1]
        counterEntity <- getBy $ UniqueCounterName name
        liftIO $ print counterEntity
This compiles and actually persists the counter and updates the value every time it's called. But as you can tell from the types, after the update it only prints the counter value to the console.
I seem to have problems understanding how to use the data that is returned from the getBy function.
The docs say:
getBy :: (PersistUniqueRead backend, MonadIO m, PersistRecordBackend record backend)
      => Unique record -> ReaderT backend m (Maybe (Entity record))
Is the 'backend m' basically a nested monad?
Assuming I simply want to send the value of the counter if it is Just Int and I want to return -1 if it is Nothing.
I assume I cannot modify the increaseCounter function so that its type is Maybe Int. But how do I pass functions into the monad / access the value inside, so I can send a response to the frontend?
If this question is too superficial and/or I lack too much knowledge to proceed at this point, can you recommend good sources of information? Something like a good tutorial or YouTube channel?
Thanks!
You can ignore all the monadic parts of getBy's type signature. Provided you get your code to type check, counterEntity has type Maybe (Entity Counter), and that's all that's important here.
The counterEntity is Nothing if the query fails (i.e., no record in the table for that counter). Otherwise, it's Just an Entity Counter containing the retrieved record:
case counterEntity of
    Just e -> ...
This e :: Entity Counter can be turned into a Counter via entityVal. The desired field of that Counter can be extracted with counterCounterCount. The result will be a Maybe Int because you've tagged that field as Maybe, so you'll have another layer of Maybe to unpack:
case counterEntity of
    Nothing -> -1 -- no record for this counter
    Just e  -> case counterCounterCount (entityVal e) of
        Nothing -> -1 -- record, but counter value missing
        Just v  -> v
You'll want to return this value from increaseCounter, so the final version will look like this:
increaseCounter :: IO Int
increaseCounter =
    runStderrLoggingT $ withSqliteConn "//absolute/path/database.db" $ runSqlConn $ do
        runMigration migrateAll -- only for developing
        updateWhere [CounterCounterName ==. "unique name"] [CounterCounterCount +=. Just 1]
        counterEntity <- getBy $ UniqueCounterName "unique name"
        return $ case counterEntity of
            Nothing -> -1
            Just e  -> case counterCounterCount . entityVal $ e of
                Nothing -> -1
                Just v  -> v
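As an aside, the two layers of Maybe can also be collapsed with fromMaybe and the Maybe monad (a sketch; Data.Maybe is already imported in your module):
return $ fromMaybe (-1) (counterCounterCount . entityVal =<< counterEntity)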
Wherever you previously successfully used increaseCounter to increase the counter, you'll now want to write:
updatedCounterValue <- increaseCounter
and you can pass the plain old updatedCounterValue :: Int to the front end.
You might find it more sensible to use upsertBy, which can insert the counter record if it's missing and update it otherwise. It also returns the inserted/updated entity, saving you a separate getBy call. I also don't understand why you've tagged counterCount with Maybe. Why would you insert a counter into your table with no value? Wouldn't "0" be a better starting value if you wanted to indicate "no count"?
So, I'd rewrite the schema as:
Counter
    counterName String
    counterCount Int -- no Maybe
    UniqueCounterName counterName
    deriving Show
and the increaseCounter function as:
increaseCounter :: IO Int
increaseCounter =
    runStderrLoggingT $ withSqliteConn "//absolute/path/database.db" $ runSqlConn $ do
        runMigration migrateAll -- only for developing
        let name = "unique name"
        counterEntity <- upsertBy (UniqueCounterName name)
                                  (Counter name 1)
                                  [CounterCounterCount +=. 1]
        return $ counterCounterCount (entityVal counterEntity)
This either inserts a count of 1 or increments an existing count.
Finally, as a general design approach, it's probably better to move the database migration and connection setup into the main function, and maybe use a pool of connections, something like:
#!/usr/bin/env stack
-- stack --resolver lts-18.0 script
-- --package warp
-- --package persistent
-- --package persistent-sqlite
{-# LANGUAGE EmptyDataDecls, FlexibleContexts, GADTs, GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses, OverloadedStrings, QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell, TypeFamilies, DataKinds, FlexibleInstances #-}
{-# LANGUAGE DerivingStrategies, StandaloneDeriving, UndecidableInstances #-}
import Control.Monad.Logger (runStderrLoggingT)
import Database.Persist
import Database.Persist.TH
import Database.Persist.Sqlite
import Control.Monad.Reader
import Network.HTTP.Types
import Network.Wai
import Network.Wai.Handler.Warp
import qualified Data.ByteString.Lazy.Char8 as C8

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Counter
    counterName String
    counterCount Int
    UniqueCounterName counterName
    deriving Show
|]

increaseCounter :: ReaderT SqlBackend IO Int
increaseCounter = do
    let name = "unique name"
    counterEntity <- upsertBy (UniqueCounterName name)
                              (Counter name 1)
                              [CounterCounterCount +=. 1]
    return $ counterCounterCount (entityVal counterEntity)

main :: IO ()
main = runStderrLoggingT $ withSqlitePool "some_database.db" 5 $ \pool -> do
    runSqlPool (runMigration migrateAll) pool
    let runDB act = runSqlPool act pool
    liftIO $ run 3000 $ \req res -> do
        count <- runDB $ increaseCounter
        res $ responseLBS
            status200
            [("Content-Type", "text/plain")]
            (C8.pack $ show count ++ "\n")
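A quick way to smoke-test the script (assuming it is saved as counter.hs and stack is installed; each request should return the next count):
$ stack counter.hs &
$ curl http://127.0.0.1:3000/
1
$ curl http://127.0.0.1:3000/
2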

Haskell server does not reply to client

I tried building a simple client-server program following this tutorial about Haskell's network-conduit library.
This is the client, which concurrently sends a file to the server and receives the answer:
{-# LANGUAGE OverloadedStrings #-}
import Control.Concurrent.Async (concurrently)
import Data.Functor (void)
import Conduit
import Data.Conduit.Network
main = runTCPClient (clientSettings 4000 "localhost") $ \server ->
    void $ concurrently
        (runConduitRes $ sourceFile "input.txt" .| appSink server)
        (runConduit $ appSource server .| stdoutC)
And this is the server, which counts the occurrences of each word and sends the result back to the client:
{-# LANGUAGE OverloadedStrings #-}
import Data.ByteString.Char8 (pack)
import Data.Foldable (toList)
import Data.HashMap.Lazy (empty, insertWith)
import Data.Word8 (isAlphaNum)
import Conduit
import Data.Conduit.Network
import qualified Data.Conduit.Combinators as CC
main = runTCPServer (serverSettings 4000 "*") $ \appData -> do
    hashMap <- runConduit $ appSource appData
                         .| CC.splitOnUnboundedE (not . isAlphaNum)
                         .| foldMC insertInHashMap empty
    runConduit $ yield (pack $ show $ toList hashMap)
              .| iterMC print
              .| appSink appData

insertInHashMap x v = do
    return (insertWith (+) v 1 x)
The problem is that the server doesn't reach the yield phase until I manually shut down the client, and therefore it never answers. I noticed that if I remove the concurrency from the client and keep only the part that sends data to the server, everything works fine.
So, how can I preserve the receiving part of the client without breaking the flow?
You have a deadlock: the client is waiting for the server to respond before it closes the connection, but the server is unaware that the client is done sending data and is waiting for more. This is basically the problem described at https://cr.yp.to/tcpip/twofd.html:
When the generate-data program finishes, the same fd is still open in the consume-data program, so the kernel has no idea that it should send a FIN.
In your case, the fix needs to go on the client side. You need to call shutdown with ShutdownSend on the socket once conduit is done sending the contents of input.txt over it.
Here's one way to do so (I'm not sure if there's a nicer one):
{-# LANGUAGE OverloadedStrings #-}
import Control.Concurrent.Async (concurrently)
import Data.Functor (void)
import Data.Foldable (traverse_)
import Conduit
import Data.Conduit.Network
import Data.Streaming.Network (appRawSocket)
import Network.Socket (shutdown, ShutdownCmd(..))
main = runTCPClient (clientSettings 4000 "localhost") $ \server ->
    void $ concurrently
        ((runConduitRes $ sourceFile "input.txt" .| appSink server) >> doneWriting server)
        (runConduit $ appSource server .| stdoutC)

doneWriting = traverse_ (`shutdown` ShutdownSend) . appRawSocket
Side note: you don't really need concurrency in the client in this case, since there will never be anything to read from the server until you're done writing to the server. You could just do the reading after the writing and shutdown.
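A sketch of that sequential variant, reusing the doneWriting helper from above (untested, but it only reorders the pieces already shown):
main = runTCPClient (clientSettings 4000 "localhost") $ \server -> do
    runConduitRes $ sourceFile "input.txt" .| appSink server
    doneWriting server
    runConduit $ appSource server .| stdoutC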

Getting "stuck" using STM

I have the following Scotty app which tries to use STM to keep a count of API calls served:
{-# LANGUAGE OverloadedStrings #-}
module Main where
import Web.Scotty
import Data.Monoid (mconcat)
import Control.Concurrent.STM
import Control.Monad.IO.Class
main :: IO ()
main = do
    counter <- newTVarIO 0
    scotty 3000 $
        get "/:word" $ do
            liftIO $ atomically $ do
                counter' <- readTVar counter
                writeTVar counter (counter' + 1)
            liftIO $ do
                counter' <- atomically (readTVar counter)
                print counter'
            beam <- param "word"
            html $ mconcat ["<h1>Scotty, ", beam, " me up!</h1>"]
I "load test" the API like this:
ab -c 100 -n 100000 http://127.0.0.1:3000/z
However, the API serves roughly 16 thousand requests and then gets "stuck" - ab stops with the error apr_socket_recv: Operation timed out (60).
I think I'm misusing STM, but not sure what I'm doing wrong. Any suggestions?
Quick guess here. 16,000 is about the number of available TCP ports. Is it possible you have not closed any connections and have therefore run out of open ports for ab?
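If that guess is right, re-running the benchmark with HTTP keep-alive enabled (so ab reuses connections instead of consuming a new client port per request) should get much further:
ab -k -c 100 -n 100000 http://127.0.0.1:3000/z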

LiftIO, do block, and syntax

I'm getting to grips with writing an API in Haskell using Scotty. My files are provided below. My questions are:
In the routes definition, I'm extracting the result of liftIO whatsTheTime in a do block. This works, but it seems verbose. Is there a nicer syntax?
In the whatsTheTime definition, I need to use fromString. I'd have thought OverloadedStrings would take care of that, but that's not the case. I'd really appreciate it if somebody pointed out why it doesn't work without fromString.
In a stack project, if I need a directive like OverloadedStrings, do I need to include it in every file that needs it, or just at the top of the main entry point?
Api.hs:
{-# LANGUAGE OverloadedStrings #-}
module Api
( whatsTheTime
) where
import Data.Time (getCurrentTime)
import Web.Scotty
import Data.String
whatsTheTime :: IO (ActionM ())
whatsTheTime = do
    time <- getCurrentTime
    return $ text $ fromString ("The time is now " ++ show time)
Main.hs:
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE DeriveGeneric #-}
module Main where
import Api
import Web.Scotty
import Control.Monad.IO.Class
routes = do
    get "/" $ do
        res <- liftIO whatsTheTime
        res

main :: IO ()
main = do
    putStrLn "Starting server..."
    scotty 3000 routes
(1) This:
do
    res <- liftIO whatsTheTime
    res
Desugars to this:
liftIO whatsTheTime >>= \ res -> res
If you look at the type of \ m -> m >>= id:
(Monad m) => m (m a) -> m a
That’s exactly the type of join (Hoogle), so you can use:
get "/" $ join $ liftIO whatsTheTime
join is a common idiom for “execute this action which returns an action, and also execute the returned action”.
(2) OverloadedStrings is for overloading of string literals. You have an overloaded literal "The time is now ", but you constrain it to be of type String by using it as an operand of (++) with a String (the result of show time). You can instead pack the result of show time into a Text, using fromString or Data.Text.pack:
import Data.Monoid ((<>))
import qualified Data.Text as Text
-- ...
return $ text $ "The time is now " <> Text.pack (show time)
(3) LANGUAGE pragmas operate per file; as #mgsloan notes, you can add OverloadedStrings to the default-extensions: field of your library or executable in your .cabal file.
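For example, the relevant stanza of the .cabal file might look like this (a sketch; adjust to your project's layout):
library
  default-extensions: OverloadedStrings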

How to exit a conduit when using mergeSources

I have a simple forked conduit setup, with two inputs feeding a single output:
{-# LANGUAGE OverloadedStrings #-}
import Control.Concurrent (threadDelay)
import Control.Monad.IO.Class
import Control.Monad.Trans.Resource
import qualified Data.ByteString as B
import Data.Conduit
import Data.Conduit.TMChan
import Data.Conduit.Network
main :: IO ()
main = do
    runTCPClient (clientSettings 3000 "127.0.0.1") $ \server -> do
        runResourceT $ do
            input <- mergeSources
                [ transPipe liftIO (appSource server)
                , infiniteSource
                ] 2
            input $$ transPipe liftIO (appSink server)

infiniteSource :: MonadIO m => Source m B.ByteString
infiniteSource = do
    liftIO $ threadDelay 10000000
    yield "infinite source"
    infiniteSource
(Here I connect to a TCP socket, then combine the socket input with a timed infinite source, then respond back to the socket.)
This works great until the connection is dropped. Because the second input still exists, the conduit keeps running. (In this case, the program does end when the timed input fires and there is no socket to write to, but this isn't always the case in my real example.)
What is the proper way to shut down the full conduit when one of the inputs is closed?
I tried to brute force a crash by adding the following
crashOnEndOfStream :: MonadIO m => Conduit B.ByteString m B.ByteString
crashOnEndOfStream = do
    awaitForever $ yield
    error "the peer connection has disconnected"  -- tried with error
    liftIO $ exitWith ExitSuccess                 -- also tried with exitWith
but because the input conduit runs in a thread, the executable was immune to runtime exceptions shutting it down (plus, there is probably a smoother way to shut stuff down than halting the program).
The Source created by mergeSources keeps a count of unclosed sources. It's only closed when the count reaches 0, i.e. when every upstream source is closed. This mechanism and the underlying TBMChannel are hidden from user code, so you have no way to change its behavior.
One possible solution is to create the channel and the source manually with some medium-level functions exported by Data.Conduit.TMChan, so you can finalize the source by closing the TBMChannel. I haven't tested the code below since your program is not runnable on my machine.
{-# LANGUAGE OverloadedStrings #-}
import Control.Concurrent (threadDelay)
import Control.Monad.IO.Class
import Control.Monad.Trans.Resource
import qualified Data.ByteString as B
import Data.Conduit
import Data.Conduit.Network
import Data.Conduit.TMChan
main :: IO ()
main = do
    runTCPClient (clientSettings 3000 "127.0.0.1") $ \server -> do
        runResourceT $ do
            -- create the TBMChannel
            chan <- liftIO $ newTBMChanIO 2
            let
                -- everything piped to the sink will appear at the source
                chanSink   = sinkTBMChan chan True
                chanSource = sourceTBMChan chan
            tid1 <- resourceForkIO $ appSource server $$ chanSink
            tid2 <- resourceForkIO $ infiniteSource $$ chanSink
            chanSource $$ transPipe liftIO (appSink server)
            -- and call 'closeTBMChan chan' when you want to exit.
            -- 'chanSource' will be closed when the underlying TBMChannel is closed.

infiniteSource :: MonadIO m => Source m B.ByteString
infiniteSource = do
    liftIO $ threadDelay 10000000
    yield "infinite source"
    infiniteSource
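To actually close the channel when the socket side finishes, one option (a sketch, untested; you also need atomically from Control.Concurrent.STM and closeTBMChan from Control.Concurrent.STM.TBMChan in stm-chans, which Data.Conduit.TMChan builds on) is to replace the first resourceForkIO line with:
tid1 <- resourceForkIO $
    (appSource server >> liftIO (atomically (closeTBMChan chan))) $$ chanSink
so that chanSource - and with it the whole merged pipeline - terminates once the server's source is exhausted.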
