I am curious to know the best way to send 10 GB once over the network in Haskell. I'd like to send it as binary without having to write an encoder/decoder by hand. Is it best to keep the data strict or lazy?
You can also use the pipes library. Here is a zlib-compressed network example:
module Main where
--------------------------------------------------------------------------------
import Pipes
import qualified Pipes.Prelude as P
import Pipes.Network.TCP
import Data.ByteString
import Pipes.ByteString as PB
import System.Environment
import Pipes.Zlib
--------------------------------------------------------------------------------

myReceive :: (MonadIO m) => m ()
myReceive = serve (Host "127.0.0.1") "8000" $ \(sock, remote) -> do
  Prelude.putStrLn $ "TCP connection established from " ++ show remote
  runEffect $ decompress defaultWindowBits (fromSocket sock 4096) >-> PB.stdout

mySend :: IO ()
mySend = connect "127.0.0.1" "8000" $ \(sock, remote) -> do
  Prelude.putStrLn $ "Connection established to " ++ show remote
  runEffect $ compress bestCompression defaultWindowBits PB.stdin >-> toSocket sock

main :: IO ()
main = do
  args <- getArgs
  case args of
    ["send"]    -> mySend
    ["receive"] -> myReceive
    _           -> Prelude.putStrLn "Usage: netsend <send|receive>"
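Since the question is about shipping a 10 GB file, here is a hedged variant of mySend that streams a named file instead of stdin. The name mySendFile is mine, not part of the answer, and it assumes the same imports as above plus import System.IO (withFile, IOMode(..)). PB.fromHandle reads the file in fixed-size chunks, so the whole file never has to be in memory at once:

mySendFile :: FilePath -> IO ()
mySendFile path = connect "127.0.0.1" "8000" $ \(sock, remote) -> do
  Prelude.putStrLn $ "Connection established to " ++ show remote
  -- Stream the file in constant memory; PB.fromHandle yields chunks lazily.
  withFile path ReadMode $ \h ->
    runEffect $ compress bestCompression defaultWindowBits (PB.fromHandle h) >-> toSocket sock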
When running this TCP server, which uses Network.Simple.TCP's serve function, I get an "invalid argument" error from accept. This happens both in the whittled-down example below and in the program it was extracted from, which receives protobuf messages (rather than text messages) over TCP; the error is the same in both:
#!/usr/bin/env stack
{- stack script --nix --resolver lts-14.27
   --nix-packages zlib
   --no-nix-pure
   --package bytestring
   --package classy-prelude
   --package conduit
   --package exceptions
   --package mtl
   --package network
   --package network-simple
   --package stm
   --package stm-conduit
   --package text
   --package unliftio
   --ghc-options -Wall
-}
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}

module Main where

import ClassyPrelude hiding (hClose)
import Conduit
import Control.Concurrent.STM.TBQueue (TBQueue, writeTBQueue)
import Control.Monad.Catch (MonadMask)
import Control.Monad.Writer
import Data.Bits (shiftR, (.&.))
import qualified Data.ByteString.Char8 as B
import Data.Conduit.Async (gatherFrom)
import qualified Data.Conduit.List as CL
import Data.Function ((&))
import qualified Data.Text as T
import GHC.IO.Handle (Handle, hClose)
import qualified Network.Simple.TCP as TCP
import qualified Network.Socket as NS
import UnliftIO.Concurrent (ThreadId, forkIO, threadDelay)

type Error = [String]
type Result r = Writer Error r

runResult :: Result r -> (r, Error)
runResult = runWriter

getPort :: NS.ServiceName
getPort = "29876"

waitForever :: IO ()
waitForever = do
  threadDelay 10000
  waitForever

-- | This signature is meant to simulate the same function from the proto-lens
-- library, but without dealing with protobufs for binary data.
decodeMessageDelimitedH :: Handle -> IO (Either String String)
decodeMessageDelimitedH h = do
  sOut <- B.hGetLine h
  pure $ Right $ B.unpack sOut

protoServe :: forall m. (MonadMask m, MonadResource m, MonadUnliftIO m) =>
     (String -> Result [String])
  -> ConduitT () [String] m ()
protoServe fromProto = start .| mapMC logFilterRead
    .| CL.catMaybes .| mapMC msgToRecs
  where
    port = trace "getting protobuf port" getPort
    start = do
      let enQserver = serveTBQ TCP.HostIPv4 port (decodeProto . fst)
      gatherFrom 10000 enQserver
    decodeProto :: NS.Socket -> m (Either String String)
    decodeProto sock = bracket
        connHandleIO
        (liftIO . hClose)
        (liftIO . decodeMessageDelimitedH)
      where
        connHandleIO :: m Handle
        connHandleIO = liftIO $ sockToHandle sock
    logFilterRead :: Either String String -> m (Maybe String)
    logFilterRead pEi = case pEi of
      Right p -> pure $ Just p
      Left err -> trace err $ pure Nothing
    msgToRecs :: String -> m [String]
    msgToRecs p = case runResult $ fromProto p of
      (rs, rErr) -> do
        when (not $ null rErr) $ pure $ trace (intercalate "\n" rErr) ()
        pure $ trace "completed msgToRecs" rs

-- | The handle only needs a read-view of the socket. Note that a TBQueue is
-- mutable but has STM's runtime safety checks in place.
sockToHandle :: NS.Socket -> IO Handle
sockToHandle sock = NS.socketToHandle sock ReadMode

-- | Based on serve and listen from Network.Simple.TCP.
-- Unlike `serve`, which never returns, `serveTBQ` immediately returns
-- a `TBQueue` of results.
serveTBQ :: forall a m. (MonadMask m, MonadUnliftIO m)
  => TCP.HostPreference -- ^ Host to bind.
  -> NS.ServiceName -- ^ Server service port name or number to bind.
  -> ((NS.Socket, NS.SockAddr) -> m a)
  -- ^ Computation to run in a different thread once an incoming connection is
  -- accepted. Takes the connection socket and remote end address.
  -> TBQueue a -- ^ Enqueue computation results to this queue.
  -> m ()
  -- ^ Returns a FIFO (queue) of results from concurrent requests.
serveTBQ hp port rFun tbq = do
  _ <- async $ withRunInIO $ \run -> TCP.serve hp port $ \(lsock, _) -> do
    run $ void $ acceptTBQ lsock rFun tbq
  putStrLn $ T.pack "exiting serveTBQ"

-- | Based on acceptFork from Network.Simple.TCP.
acceptTBQ :: forall a m.
     MonadUnliftIO m
  => NS.Socket -- ^ Listening and bound socket.
  -> ((NS.Socket, NS.SockAddr) -> m a)
  -- ^ Computation to run in a different thread once an incoming connection is
  -- accepted. Takes the connection socket and remote end address.
  -> TBQueue a
  -> m ThreadId
acceptTBQ lsock rFun tbq = mask $ \restore -> do
  (csock, addr) <- trace ("running restore-accept on lsock: " <> show lsock) $
    restore (liftIO $ NS.accept lsock)
  onException (forkIO $ finally
    (restore $ do
      rVal <- trace "retrieved rVal in finally-restore" rFun (csock, addr)
      atomically $ writeTBQueue tbq rVal)
    (TCP.closeSock csock))
    (TCP.closeSock csock)

retryForever :: forall m a. MonadUnliftIO m => m a -> m a
retryForever prog = catchAny prog progRetry
  where
    progRetry :: SomeException -> m a
    progRetry ex = do
      putStrLn $ pack $ show ex
      threadDelay 4000000
      retryForever prog

-- | Safer interface to sinkNull.
sinkUnits :: MonadResource m => ConduitT () Void m ()
sinkUnits = sinkNull

main :: IO ()
main = retryForever $ do
  putStrLn $ T.pack "starting tcp server"
  let myProtoServe = protoServe (pure . words)
  myProtoServe .| mapMC (putStrLn . T.pack . intercalate "_") .| sinkUnits & runConduitRes
  putStrLn $ T.pack "tcp server exited"
  waitForever
When running the above server and executing a netcat command feeding in some text over TCP, e.g. netcat 127.0.0.1 29876 < .bashrc (substitute .bashrc with any text file), I see output like the following:
starting tcp server
exiting serveTBQ
getting protobuf port
tcp server exited
running restore-accept on lsock: <socket: 16>
tcpConduitServer-exe: Network.Socket.accept: invalid argument (Invalid argument)
I'm not sure what would cause this behavior, though I'm not experienced with TCP. Since I'm using serve rather than writing my own TCP logic, I'm a bit surprised to be getting a TCP error.
Here is a repository containing the script (as well as the non-script version):
https://github.com/bbarker/tcpConduitServer
It's because you call NS.accept on a socket that is already connected to the client.
As documented at Network.Simple.TCP.serve, TCP.serve accepts each incoming connection itself and invokes its third argument in a different thread, passing it the already-accepted socket. The lsock in serveTBQ is therefore a socket connected to the client, not a listening socket on which you can accept new connections.
When you write a TCP server, you need to 1) create a socket, 2) bind it to an address and a port, 3) listen on it, and then 4) accept connections on it. TCP.serve does all of these steps for you, so all you need to do is read from and write to the socket that TCP.serve passes to your callback.
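For reference, here is one minimal, untested sketch of that idea against the script above (my own simplification, not necessarily the author's eventual fix): drop acceptTBQ entirely and enqueue the result straight from the already-accepted socket that TCP.serve hands to the callback.

serveTBQ :: forall a m. (MonadMask m, MonadUnliftIO m)
  => TCP.HostPreference
  -> NS.ServiceName
  -> ((NS.Socket, NS.SockAddr) -> m a)
  -> TBQueue a
  -> m ()
serveTBQ hp port rFun tbq = do
  _ <- async $ withRunInIO $ \run ->
    TCP.serve hp port $ \(csock, addr) ->
      -- csock is already accepted (and will be closed by TCP.serve afterwards);
      -- do not call NS.accept on it again.
      run $ do
        rVal <- rFun (csock, addr)
        atomically $ writeTBQueue tbq rVal
  putStrLn $ T.pack "exiting serveTBQ"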
I'm writing a socket server with runTCPServer from conduit-extra (formerly known as network-conduit). My goal is to interact with my editor using this server --- activate the server from the editor (most likely just by calling an external command), use it, and terminate the server when the work is done.
For simplicity, I start with a simple echo server, and let's say I'd like to shut down the whole process when the connection is closed.
So I tried:
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Conduit
import Data.Conduit.Network
import Data.ByteString (ByteString)
import Control.Monad.IO.Class (liftIO)
import System.Exit (exitSuccess)
import Control.Exception

defaultPort :: Int
defaultPort = 4567

main :: IO ()
main = runTCPServer (serverSettings defaultPort "*") $ \appData ->
  appSource appData $$ conduit =$= appSink appData

conduit :: ConduitM ByteString ByteString IO ()
conduit = do
  msg <- await
  case msg of
    Nothing -> liftIO $ do
      putStrLn "Nothing left"
      exitSuccess
      -- I'd like the server to shut down here
    (Just s) -> do
      yield s
      conduit
But this doesn't work -- the program continues to accept new connections. If I am not mistaken, this is because the thread listening to the connection we're dealing with exits with exitSuccess, but the entire process doesn't. So this is totally understandable, but I haven't been able to find a way to exit the whole process.
How do I terminate a server run by runTCPServer? Is runTCPServer something that's supposed to serve forever?
Here's a simple implementation of the idea described in the comments:
-- Needs: import Control.Concurrent.MVar
main = do
  mv <- newEmptyMVar
  tid <- forkTCPServer (serverSettings defaultPort "*") $ \appData ->
    appSource appData $$ conduit mv =$= appSink appData
  () <- takeMVar mv -- <-- wait for the done signal
  return ()

conduit :: MVar () -> ConduitM ByteString ByteString IO ()
conduit mv = do
  msg <- await
  case msg of
    Nothing -> liftIO $ do
      putStrLn "Nothing left"
      putMVar mv () -- <-- signal that we're done
    (Just s) -> do
      yield s
      conduit mv
I have a simple forked conduit setup, with two inputs feeding one single output....
{-# LANGUAGE OverloadedStrings #-}

import Control.Concurrent (threadDelay)
import Control.Monad.IO.Class
import Control.Monad.Trans.Resource
import qualified Data.ByteString as B
import Data.Conduit
import Data.Conduit.TMChan
import Data.Conduit.Network

main :: IO ()
main = do
  runTCPClient (clientSettings 3000 "127.0.0.1") $ \server -> do
    runResourceT $ do
      input <- mergeSources [
          transPipe liftIO (appSource server),
          infiniteSource
        ] 2
      input $$ transPipe liftIO (appSink server)

infiniteSource :: MonadIO m => Source m B.ByteString
infiniteSource = do
  liftIO $ threadDelay 10000000
  yield "infinite source"
  infiniteSource
(Here I connect to a TCP socket, combine the socket input with a timed infinite source, and respond back to the socket.)
This works great until the connection is dropped. Because the second input still exists, the conduit keeps running. (In this case, the program does end when the timed input fires and there is no socket to write to, but that isn't always the case in my real example.)
What is the proper way to shut down the full conduit when one of the inputs is closed?
I tried to brute-force a crash by adding the following:
crashOnEndOfStream :: MonadIO m => Conduit B.ByteString m B.ByteString
crashOnEndOfStream = do
  awaitForever $ yield
  error "the peer connection has disconnected" -- tried with error
  liftIO $ exitWith ExitSuccess               -- also tried with exitWith
but because the input conduit runs in a thread, the executable was immune to runtime exceptions shutting it down (plus, there is probably a smoother way to shut stuff down than halting the program).
The Source created by mergeSources keeps a count of unclosed sources. It is only closed when the count reaches 0, i.e. when every upstream source is closed. This mechanism and the underlying TBMChan are hidden from user code, so you have no way to change the behavior.
One possible solution is to create the channel and the source manually with some medium-level functions exported by Data.Conduit.TMChan, so that you can finalize the source by closing the TBMChan yourself. I haven't tested the code below, since your program is not runnable on my machine.
{-# LANGUAGE OverloadedStrings #-}

import Control.Concurrent (threadDelay)
import Control.Monad.IO.Class
import Control.Monad.Trans.Resource
import qualified Data.ByteString as B
import Data.Conduit
import Data.Conduit.Network
import Data.Conduit.TMChan

main :: IO ()
main = do
  runTCPClient (clientSettings 3000 "127.0.0.1") $ \server -> do
    runResourceT $ do
      -- create the TBMChannel
      chan <- liftIO $ newTBMChanIO 2
      let
        -- everything piped to the sink will appear at the source
        chanSink = sinkTBMChan chan True
        chanSource = sourceTBMChan chan
      tid1 <- resourceForkIO $ appSource server $$ chanSink
      tid2 <- resourceForkIO $ infiniteSource $$ chanSink
      chanSource $$ transPipe liftIO (appSink server)
      -- and call 'closeTBMChan chan' when you want to exit.
      -- 'chanSource' will be closed when the underlying TBMChannel is closed.

infiniteSource :: MonadIO m => Source m B.ByteString
infiniteSource = do
  liftIO $ threadDelay 10000000
  yield "infinite source"
  infiniteSource
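If you want the shutdown to be explicit, and to ensure that only the socket side can end the stream, here is an untested sketch in the same spirit. The helper name pipeThenClose is mine; it assumes closeTBMChan is in scope via Data.Conduit.TMChan's re-exports (otherwise import Control.Concurrent.STM.TBMChan) and that atomically is imported, e.g. from Control.Concurrent.STM:

-- Hypothetical helper: drain 'src' into the shared channel, then close the
-- channel so every consumer of 'sourceTBMChan chan' sees end-of-stream.
pipeThenClose :: MonadIO m => Source m B.ByteString -> TBMChan B.ByteString -> m ()
pipeThenClose src chan = do
  src $$ sinkTBMChan chan False -- False: do not auto-close on sink exit
  liftIO $ atomically $ closeTBMChan chan

The socket reader would then be forked as, e.g., tid1 <- resourceForkIO $ pipeThenClose (transPipe liftIO (appSource server)) chan, while the infinite source keeps feeding sinkTBMChan chan False so it cannot end the stream on its own.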
Is there something that is like the opposite of liftIO? I'm using websockets, and I want to be able to listen for messages from the server in a separate thread. Here's what I'm doing:
import Network.WebSockets
import qualified Data.Text as T
import Control.Monad.IO.Class
import Control.Monad
import Control.Concurrent
import Control.Applicative

printMessages :: WebSockets Hybi00 ()
printMessages = forever $ do
  resp <- receiveDataMessage
  liftIO $ print resp

run :: WebSockets Hybi00 ()
run = do
  liftIO . forkIO $ printMessages
  forever $ do
    line <- liftIO getLine
    sendTextData . T.pack $ line

main = connect "0.0.0.0" 8080 "/" run
So printMessages listens for messages from the server and keeps printing them out. The problem is that forkIO expects an action of type IO (). Is there any way for me to run printMessages in the IO monad?
If I'm understanding this right, the reason you want to receive messages in another thread is because the main thread will be waiting for user input to send.
From a look at the documentation, it seems like you'll have an easier time if you reverse the roles of the threads: receive in the main thread, and send asynchronously from the other.
Then you can use getSink :: Protocol p => WebSockets p (Sink p) to grab a sink before forking, and use it with sendSink :: Sink p -> Message p -> IO (), which lives in IO, avoiding the whole problem of mixing monads.
In other words, restructure your code to something like this:
sendMessages :: Sink Hybi00 -> IO ()
sendMessages sink = forever $ do
  line <- getLine
  let msg = textData . T.pack $ line
  sendSink sink msg

run :: WebSockets Hybi00 ()
run = do
  sink <- getSink
  liftIO . forkIO $ sendMessages sink
  forever $ do
    resp <- receiveDataMessage
    liftIO $ print resp

main = connect "0.0.0.0" 8080 "/" run
To learn the basics of the conduit library, I used network-conduit to make a simple echo server:
import Control.Monad.IO.Class
import qualified Data.ByteString.Char8 as BS
import Data.Conduit
import Data.Conduit.Network

-- A conduit that prints the input it receives on the console
-- and passes it through.
echo :: (MonadIO m) => Conduit BS.ByteString m BS.ByteString
echo = do
  yield (BS.pack "Anything you type will be echoed back.\n")
  -- Print the received data to the console as well:
  awaitForever (\x -> liftIO (BS.putStr x) >> yield x)

echoApp :: (MonadIO m) => Application m
echoApp appData = appSource appData $= echo $$ appSink appData

-- Listen on port 4545:
main :: IO ()
main = runTCPServer (serverSettings 4545 HostAny) echoApp
It does what I wanted, but when the client closes its part of the connection, the server is still waiting for input instead of writing out any remaining data and closing its sending part of the connection too:
$ nc localhost 4545 <<<"Hello world!"
Anything you type will be echoed back.
Hello world!
I tried removing echo and doing just
echoApp appData = appSource appData $$ appSink appData
but the problem is still there. What am I doing wrong?
I'm not sure what you mean by "the server won't respond to it"? I'd guess you're expecting the server to shut down after the client disconnects. If so, that's not the intention of the library: it continues to serve connections in an infinite loop as long as they keep coming in. Using addCleanup, you can see that the individual connection handlers do in fact terminate, e.g.:
echo :: (MonadIO m) => Conduit BS.ByteString m BS.ByteString
echo = addCleanup (const $ liftIO $ putStrLn "Stopping") $ do
  yield (BS.pack "Anything you type will be echoed back.\n")
  -- Print the received data to the console as well:
  awaitForever (\x -> liftIO (BS.putStr x) >> yield x)
It turned out that the problem wasn't with network-conduit; that part works correctly. The problem was with nc, which doesn't close its sending side of the socket after it has sent all the data. I made a test Python script, and it works against the server as expected:
#!/usr/bin/env python
import socket

HOST = 'localhost'
PORT = 4545

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((HOST, PORT))
s.sendall('Hello, world')
# This was missing in `nc`:
s.shutdown(socket.SHUT_WR)

print 'Received'
data = s.recv(1024)
while data:
    print data,
    data = s.recv(1024)
s.close()