I tried building a simple client-server program following this tutorial about Haskell's network-conduit library.
This is the client, which concurrently sends a file to the server and receives the answer:
{-# LANGUAGE OverloadedStrings #-}
import Control.Concurrent.Async (concurrently)
import Data.Functor (void)
import Conduit
import Data.Conduit.Network
main = runTCPClient (clientSettings 4000 "localhost") $ \server ->
void $ concurrently
(runConduitRes $ sourceFile "input.txt" .| appSink server)
(runConduit $ appSource server .| stdoutC)
And this is the server, which counts the occurrences of each word and sends the result back to the client:
{-# LANGUAGE OverloadedStrings #-}
import Data.ByteString.Char8 (pack)
import Data.Foldable (toList)
import Data.HashMap.Lazy (empty, insertWith)
import Data.Word8 (isAlphaNum)
import Conduit
import Data.Conduit.Network
import qualified Data.Conduit.Combinators as CC
main = runTCPServer (serverSettings 4000 "*") $ \appData -> do
hashMap <- runConduit $ appSource appData
.| CC.splitOnUnboundedE (not . isAlphaNum)
.| foldMC insertInHashMap empty
runConduit $ yield (pack $ show $ toList hashMap)
.| iterMC print
.| appSink appData
insertInHashMap x v = do
return (insertWith (+) v 1 x)
The problem is that the server doesn't reach the yield phase until I manually shut down the client and therefore never answers to it. I noticed that removing the concurrency from the client and keeping only the part in which it sends data to the server, everything works fine.
So, how can I preserve the receiving part of the client without breaking the flow?
You have a deadlock: the client is waiting for the server to respond before it closes the connection, but the server is unaware that the client is done sending data and is waiting for more. This is basically the problem described at https://cr.yp.to/tcpip/twofd.html:
When the generate-data program finishes, the same fd is still open in the consume-data program, so the kernel has no idea that it should send a FIN.
In your case, the fix needs to go on the client side. You need to call shutdown with ShutdownSend on the socket once conduit is done sending the contents of input.txt over it.
Here's one way to do so (I'm not sure if there's a nicer one):
{-# LANGUAGE OverloadedStrings #-}
import Control.Concurrent.Async (concurrently)
import Data.Functor (void)
import Data.Foldable (traverse_)
import Conduit
import Data.Conduit.Network
import Data.Streaming.Network (appRawSocket)
import Network.Socket (shutdown, ShutdownCmd(..))
main = runTCPClient (clientSettings 4000 "localhost") $ \server ->
void $ concurrently
((runConduitRes $ sourceFile "input.txt" .| appSink server) >> doneWriting server)
(runConduit $ appSource server .| stdoutC)
doneWriting = traverse_ (`shutdown` ShutdownSend) . appRawSocket
Side note: you don't really need concurrency in the client in this case, since there will never be anything to read from the server until you're done writing to the server. You could just do the reading after the writing and shutdown.
Related
In my program I am starting external process and communicate with it via stdin and stdout. I'm feeding the input through conduit (producer) started from STMs TQueue. It worked like a charm until I've decided to bump lts version. It worked great with lts <= 8.24.
Here is the minimized program that reproduces my problem:
#!/usr/bin/env stack
-- stack --resolver lts-10.4 --install-ghc runghc --package conduit-extra --package stm-conduit
{-# LANGUAGE OverloadedStrings #-}
import Control.Concurrent
import Control.Monad.STM
import Control.Concurrent.STM.TQueue
import Data.Conduit
import qualified Data.Conduit.Binary as CB
import qualified Data.Conduit.List as CL
import Data.Conduit.Process (CreateProcess (..),
proc, sourceProcessWithStreams)
import qualified Data.Conduit.TQueue as CTQ
import qualified Data.ByteString.Char8 as BS
import Data.Monoid ((<>))
main :: IO ()
main = do
putStrLn "Enter \"exit\" to exit."
q <- open
putStrLn "connection opened"
loop q
where loop q = do
s <- BS.getLine
case s of
"exit" -> return ()
req -> do
atomically $ writeTQueue q req
loop q
open :: IO (TQueue BS.ByteString)
open = do
req <- atomically newTQueue
let chat :: CreateProcess
chat = proc "cat" []
input :: Producer IO BS.ByteString
input = toProducer
$ CTQ.sourceTQueue req
-- .| CL.mapM_ (\bs -> BS.putStrLn (("queue: " :: BS.ByteString) <> bs))
output :: Consumer BS.ByteString IO ()
output = toConsumer
$ CL.mapM_ BS.putStrLn
_ <- forkIO (sourceProcessWithStreams chat input output output >> pure ())
pure req
With newer lts it seems like the problem is not with communication via TQueue, as uncommenting the line which prints content from input conduit gives shows data from the queue. It looks like the spawned process never receives anything on it's stdin.
Furthermore writing to spawned cat stdin from console, like so:
echo "test" > /proc/<pid of spawned cat>/fd/0
produces output in my program.
Am I missing something that changed between versions?
So the issue was that default behaviour of sinkHandle was changed to not flush after every chunk of data.
I've fixed the issue by first porting to Data.Conduit.Process.Typed and then rolling my own variant of createSink that is using sinkHandleFlush instead of sinkHandle.
I have a simple forked conduit setup, with two inputs feeding one single output....
{-# LANGUAGE OverloadedStrings #-}
import Control.Concurrent (threadDelay)
import Control.Monad.IO.Class
import Control.Monad.Trans.Resource
import qualified Data.ByteString as B
import Data.Conduit
import Data.Conduit.TMChan
import Data.Conduit.Network
main::IO ()
main = do
runTCPClient (clientSettings 3000 "127.0.0.1") $ \server -> do
runResourceT $ do
input <- mergeSources [
transPipe liftIO (appSource server),
infiniteSource
] 2
input $$ transPipe liftIO (appSink server)
infiniteSource::MonadIO m=>Source m B.ByteString
infiniteSource = do
liftIO $ threadDelay 10000000
yield "infinite source"
infiniteSource
(here I connect to a tcp socket, then combine the socket input with a timed infinite source, then respond back to the socket)
This works great, until the connection is dropped.... Because the second input still exists, the conduit keeps running. (In this case, the program does end when the timed input fires and there is no socket to write to, but this isn't always the case in my real example).
What is the proper way to shut down the full conduit when one of the inputs is closed?
I tried to brute force a crash by adding the following
crashOnEndOfStream::MonadIO m=>Conduit B.ByteString m B.ByteString
crashOnEndOfStream = do
awaitForever $ yield
error "the peer connection has disconnected" --tried with error
liftIO $ exitWith ExitSuccess --also tried with exitWith
but because the input conduit runs in a thread, the executable was immune to runtime exceptions shutting it down (plus, there is probably a smoother way to shut stuff down than halting the program).
the Source created by mergeSources keeps a count of unclosed sources. It's only closed when the count reaches 0 i.e. every upstream source is closed. This mechanism and the underlying TBMChannel is hidden from user code so you have no way to change its behavior.
One possible solution is to create the channel and the source manually with some medium-level functions exported by Data.Conduit.TMChan, so you can finalize the source by closing the TBMChannel. I haven't tested the code below since your program is not runnable on my machine.
{-# LANGUAGE OverloadedStrings #-}
import Control.Concurrent (threadDelay)
import Control.Monad.IO.Class
import Control.Monad.Trans.Resource
import qualified Data.ByteString as B
import Data.Conduit
import Data.Conduit.Network
import Data.Conduit.TMChan
main::IO ()
main = do
runTCPClient (clientSettings 3000 "127.0.0.1") $ \server -> do
runResourceT $ do
-- create the TBMChannel
chan <- liftIO $ newTBMChanIO 2
let
-- everything piped to the sink will appear at the source
chanSink = sinkTBMChan chan True
chanSource = sourceTBMChan chan
tid1 <- resourceForkIO $ appSource server $$ chanSink
tid2 <- resourceForkIO $ infiniteSource $$ chanSink
chanSource $$ transPipe liftIO (appSink server)
-- and call 'closeTBMChan chan' when you want to exit.
-- 'chanSource' will be closed when the underlying TBMChannel is closed.
infiniteSource :: MonadIO m => Source m B.ByteString
infiniteSource = do
liftIO $ threadDelay 10000000
yield "infinite source"
infiniteSource
Running websockets Haskell example, show below, obviously does not work since there is no main function.
running ghc --make Main.hs -o Main confirms that
Meow function requires websockets connection. How to open the connection?
The library used is https://github.com/jaspervdj/websockets.
{-# LANGUAGE OverloadedStrings #-}
import Control.Monad (forever)
import qualified Data.Text as T
import qualified Network.WebSockets as WS
meow :: WS.Connection -> IO ()
meow conn = forever $ do
msg <- WS.receiveData conn
WS.sendTextData conn $ msg `T.append` ", meow"
If you have a look at the example folder:
https://github.com/jaspervdj/websockets/blob/master/example/client.hs
main = withSocketsDo $ WS.runClient "echo.websocket.org"
80 "/" app
app is of type app :: WS.ClientApp () which is a synonym for Connection -> IO a
runClient will open the socket connection for you. If you want to know how, take a look at the source of the function (https://hackage.haskell.org/package/websockets-0.9.3.0/docs/src/Network-WebSockets-Client.html#runClient).
As an aside, withSocketDo belongs to socket. You will find the explanation here: https://hackage.haskell.org/package/network-2.6.0.2/docs/Network-Socket.html#v:withSocketsDo
There are other examples here: http://jaspervdj.be/websockets/example.html
I haven't used websocket and it is usually not such a good idea to answer unfamiliar topic. Hope to be of some help.
The Problem
Hello! I'm writing in Cloud Haskell a simple Server - Worker program. The problem is, that when I try to create ManagedProcess, after the server disovery step, my example hangs forever even while using callTimeout (which should break after 100 ms). The code is very simple, but I cannot find anything wrong with it.
I've posted the question on the mailing list also, but as far as I know the SO community, I canget the answer a lot faster here. If I get the answer from mailing list, I will postit here also.
Source Code
The Worker.hs:
{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE TemplateHaskell #-}
module Main where
import Network.Transport (EndPointAddress(EndPointAddress))
import Control.Distributed.Process hiding (call)
import Control.Distributed.Process.Platform hiding (__remoteTable)
import Control.Distributed.Process.Platform.Async
import Control.Distributed.Process.Platform.ManagedProcess
import Control.Distributed.Process.Platform.Time
import Control.Distributed.Process.Platform.Timer (sleep)
import Control.Distributed.Process.Closure (mkClosure, remotable)
import Network.Transport.TCP (createTransport, defaultTCPParameters)
import Control.Distributed.Process.Node hiding (call)
import Control.Concurrent (threadDelay)
import GHC.Generics (Generic)
import Data.Binary (Binary)
import Data.Typeable (Typeable)
import Data.ByteString.Char8 (pack)
import System.Environment (getArgs)
import qualified Server as Server
main = do
[host, port, serverAddr] <- getArgs
Right transport <- createTransport host port defaultTCPParameters
node <- newLocalNode transport initRemoteTable
let addr = EndPointAddress (pack serverAddr)
srvID = NodeId addr
_ <- forkProcess node $ do
sid <- discoverServer srvID
liftIO $ putStrLn "x"
liftIO $ print sid
r <- callTimeout sid (Server.Add 5 6) 100 :: Process (Maybe Double)
liftIO $ putStrLn "x"
liftIO $ threadDelay (10 * 1000 * 1000)
threadDelay (10 * 1000 * 1000)
return ()
discoverServer srvID = do
whereisRemoteAsync srvID "serverPID"
reply <- expectTimeout 100 :: Process (Maybe WhereIsReply)
case reply of
Just (WhereIsReply _ msid) -> case msid of
Just sid -> return sid
Nothing -> discoverServer srvID
Nothing -> discoverServer srvID
The Server.hs:
{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE TemplateHaskell #-}
module Server where
import Control.Distributed.Process hiding (call)
import Control.Distributed.Process.Platform hiding (__remoteTable)
import Control.Distributed.Process.Platform.Async
import Control.Distributed.Process.Platform.ManagedProcess
import Control.Distributed.Process.Platform.Time
import Control.Distributed.Process.Platform.Timer (sleep)
import Control.Distributed.Process.Closure (mkClosure, remotable)
import Network.Transport.TCP (createTransport, defaultTCPParameters)
import Control.Distributed.Process.Node hiding (call)
import Control.Concurrent (threadDelay)
import GHC.Generics (Generic)
import Data.Binary (Binary)
import Data.Typeable (Typeable)
data Add = Add Double Double
deriving (Typeable, Generic)
instance Binary Add
launchServer :: Process ProcessId
launchServer = spawnLocal $ serve () (statelessInit Infinity) server >> return () where
server = statelessProcess { apiHandlers = [ handleCall_ (\(Add x y) -> liftIO (putStrLn "!") >> return (x + y)) ]
, unhandledMessagePolicy = Drop
}
main = do
Right transport <- createTransport "127.0.0.1" "8080" defaultTCPParameters
node <- newLocalNode transport initRemoteTable
_ <- forkProcess node $ do
self <- getSelfPid
register "serverPID" self
liftIO $ putStrLn "x"
mid <- launchServer
liftIO $ putStrLn "y"
r <- call mid (Add 5 6) :: Process Double
liftIO $ print r
liftIO $ putStrLn "z"
liftIO $ threadDelay (10 * 1000 * 1000)
liftIO $ putStrLn "z2"
threadDelay (10 * 1000 * 1000)
return ()
We can run them as follow:
runhaskell Server.hs
runhaskell Worker.hs 127.0.0.2 8080 127.0.0.1:8080:0
The Results
When we run the programs, we got following results:
from Server:
x
y
!
11.0 -- this one shows that inside the same process we were able to use the "call" function
z
-- waiting - all the output above were tests from inside the server now it waits for external messages
from Worker:
x
pid://127.0.0.1:8080:0:10 -- this is the process id of the server optained with whereisRemoteAsync
-- waiting forever on the "callTimeout sid (Server.Add 5 6) 100" code!
As a sidenote - I've found out that, when sending messages with send (from Control.Distributed.Process) and reciving them with expect works. But sending them with call (from Control.Distributed.Process.Platform) and trying to recive them with ManagedProcess api handlers - hangs the call forever (even using callTimeout!)
Your client is getting an exception, which you are not able to observe easily because you are running your client in a forkProcess. If you want to do that it is fine but then you need to monitor or link to that process. In this case, simply using runProcess would be much simpler. If you do that, you will see you get this exception:
Worker.hs: trying to call fromInteger for a TimeInterval. Cannot guess units
callTimeout does not take an Integer, it takes a TimeInterval which are constructed with the functions in the Time module. This is a pseudo-Num - it does not actually support fromInteger it seems. I would consider that a bug or at least bad form (in Haskell) but in any case the way to fix your code is simply
r <- callTimeout sid (Server.Add 5 6) (milliSeconds 100) :: Process (Maybe Double)
To fix the problem with the client calling into the server, you need to register the pid of the server process you spawned rather than the main process you spawn it from - i.e. change
self <- getSelfPid
register "serverPID" self
liftIO $ putStrLn "x"
mid <- launchServer
liftIO $ putStrLn "y"
to
mid <- launchServer
register "serverPID" mid
liftIO $ putStrLn "y"
I'm trying to POST a multipart form request to an internal website which should reply with an XML response. Using another simple script I have in Python with the requests library, everything works fine, however, using http-conduit I keep receiving an exception ExpectedBlankAfter100Continue.
If I replace the internal url with "https://httpbin.org/post", I also receive a reply back without issue.
Is there something I'm doing wrong? It seems like either a bug in the library or the site is not behaving as expected. If the latter is the case, is there an option for me to disable this check in http-conduit?
Sample code:
{-# LANGUAGE OverloadedStrings #-}
import Network
import Network.HTTP.Conduit
import Network.HTTP.Client.MultipartFormData
import qualified Data.ByteString.Lazy.Char8 as BL
import Data.Maybe (fromJust)
import System.Environment (getArgs)
import Control.Monad.IO.Class
main = do
[x] <- getArgs
--let url = "https://url.net/api.asp"
let url = "https://httpbin.org/post"
withSocketsDo $ withManager $ \m -> do
r <- flip httpLbs m =<< (formDataBody (request $ BL.pack x) $ fromJust $ parseUrl url)
liftIO $ BL.putStrLn $ responseBody r
request :: BL.ByteString -> [Part]
request x = <code removed>
This sounds like the server is returning a malformed 100-continue response. But there's not enough information here to properly debug this, it's probably better to handle this in a Github issue.