pipes `take` with default value - haskell

I have a producer p with type Producer Message IO (Producer SB.ByteString IO ()).
Now I need optionally skip some messages and optionally process certain number of messages:
let p =
(processMBFile f >->
P.drop (optSkip opts)
>->
(case optLimit opts of
Nothing -> cat
Just n -> ptake n)
)
in
do
restp <- runEffect $ for p processMessage
I could not use take from Pipes.Prelude because it returns () while I need to return an empty producer. The way I quickly haked it is by replacing take with my own implementation ptake:
emptyP :: Producer SB.ByteString IO ()
emptyP = return ()
ptake :: Monad m => Int -> Pipe a a m (Producer SB.ByteString IO ())
ptake = go
where
go 0 = return emptyP
go n = do
a <- await
yield a
go (n-1)
My question is: if there is a more elegant way to do this?

ptake differs from take only in the return value, so it can be implemented in terms of it using (<$):
ptake n = emptyP <$ take n

Related

Pipes (Haskell lib) - piping pipes with different state monad

My goal is to have the last value produced equal to 80 (40 + 40) (see code below)...
import Pipes
import Pipes.Prelude
import Pipes.Lift
import Control.Monad.State.Strict
data Input = A Integer | B Integer | C Integer
main :: IO ()
main = runEffect $ each [A 10,B 2,C 3,A 40,A 40] >-> pipeline >-> print
pipeline :: Pipe Input Integer IO ()
pipeline = for cat $ \case
A x -> yield x >-> accumulate
B x -> yield x
C x -> yield x
accumulate :: Pipe Integer Integer IO ()
accumulate = evalStateP 0 accumulate'
accumulate' :: Pipe Integer Integer (StateT Integer IO) ()
accumulate' = go
where
go = do
x <- await
lift $ modify (+x)
r <- lift get
yield r
go
With this example Input As are not accumulated...yield x >-> accumulate on Input A does do what I'm expected, the stream is a new one each time...
Piping pipes with different state monad sequentially works well but here somehow I want to nest them in the case pattern (like a substream somehow)...
The problem is that you call evalStateP too early, discarding state you want to preserve across calls to accumulate. Try something like this:
pipeline :: Pipe Input Integer IO ()
pipeline = evalStateP 0 $ for cat $ \case
A x -> yield x >-> accumulate
B x -> yield x
C x -> yield x
accumulate :: Pipe Integer Integer (StateT Integer IO) ()
accumulate = for cat $ \x -> do
modify (+x)
r <- get
yield r
Note that Proxy has a MonadState instance, so you don't need to lift state operations manually if you use mtl.

Can we access the output from a replicateM defined in a do-block

Assume i have something like this
main = do
input_line <- getLine
let n = read input_line :: Int
replicateM n $ do
input_line <- getLine
let x = read input_line :: Int
return ()
***putStrLn $ show -- Can i access my replicateM here?
return ()
Can i access the result of my replicateM such as if it was a returned value, and for example print it out. Or do i have to work with the replicateM inside the actual do-block?
Specialized to IO
replicateM :: Int -> IO a -> IO [a]
which means that it returns a list. So in your example you could do:
results <- replicateM n $ do
input_line <- getLine
let x = read input_line :: Int
return x -- <- we have to return it if we want to access it
print results
replicateM n a returns a list of the values returned by a. In your case that'd just be a list of units because you have the return () at the end, but if you replace that with return x, you'll get a list of the read integers. You can then just use <- to get it out of the IO.
You can also simplify your code by using readLine instead of getLine and read. Similarly putStrLn . show can be replaced with print.
main = do
n <- readLn
ints <- replicateM n readLn :: IO [Int]
print ints
Of course. Its type is replicateM :: Monad m => Int -> m a -> m [a]. It means it can appear to the right of <- in a do block:
do
....
xs <- replicateM n $ do { ... }
....
xs will be of type [a], as usual for binding the results from Monad m => m [a].
With your code though, where you show return () in that nested do, you'll get ()s replicated n times in your xs. Presumably in the real code you will return something useful there.

Chunk data with Conduit

Here is an example of a conduit combinator that supposed to yield downstream when a complete message is received from upstream:
import qualified Data.ByteString as BS
import Data.Conduit
import Data.Conduit.Combinators
import Data.Conduit.Network
message :: Monad m => ConduitT BS.ByteString BS.ByteString m ()
message = loop
where
loop = await >>= maybe (return ()) go
go x = if (BS.isSuffixOf "|" x)
then yield (BS.init x) >> loop
else leftover x
Server code itself looks like following:
main :: IO ()
main = do
runTCPServer (serverSettings 5000 "!4") $ \ appData -> runConduit $
(appSource appData)
.| message
.| (appSink appData)
For some reason telnet 127.0.0.1 5000 disconnects after sending any message:
telnet 127.0.0.1 5000
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
123|
Connection closed by foreign host.
Please advice, what am I doing wrong here?
Update
More importantly what I try doing here is wait for completion signal | and then yield the complete message downstream. Here is the evolution of message combinator:
message :: Monad m => ConduitT BS.ByteString BS.ByteString m ()
message = do
minput <- await
case minput of
Nothing -> return ()
Just input -> do
case BS.breakSubstring "|" input of
("", "") -> return ()
("", "|") -> return ()
("", xs) -> leftover $ BS.tail xs
(x, "") -> leftover x -- problem is in this leftover
(x, xs) -> do
yield x
leftover $ BS.tail xs
message
The idea I had is that if there is nothing coming from the upstream combinator will have to wait until there will be something, such that it can send a complete message downstream. But it seams that conduit starts spinning on CPU a lot on that leftover call in the above message combinator.
Finally figured out that it was necessary to await instead of leftover on the base case. Here is how working message combinator looks like:
message :: Monad m => ConduitT BS.ByteString BS.ByteString m ()
message = do
minput <- await
case minput of
Nothing -> return ()
Just input -> process input >> message
where
process input =
case BS.breakSubstring "|" input of
("", "") -> return ()
("", "|") -> return ()
("", xs) -> leftover $ BS.tail xs
(x, "") -> do
minput <- await
case minput of
Nothing -> return ()
Just newInput -> process $ BS.concat [x, newInput]
(x, xs) -> do
yield x
leftover $ BS.tail xs
A bit of boilerplate that can probably be cleaned up, but it works.
Print x in go to debug.
...
go x = do
liftIO (Prelude.print x)
if ...
The socket receives a bytestring that ends with \r\n, so you go to the else branch, which terminates the session.

MVars are blocking indefinitely; but only in certain scenarios.

First, because this is about a specific case, I haven't reduced the code at all, so it will be quite long, and in 2 parts (Helper module, and the main).
SpawnThreads in ConcurHelper takes a list of actions, forks them, and gets an MVar containing the result of the action. It them combines the results, and returns the resulting list. It works fine in certain cases, but blocks indefinitely on others.
If I give it a list of putStrLn actions, it executes them fine, then returns the resulting ()s (yes, I know running print commands on different threads at the same time is bad in most cases).
If I try running multiTest in Scanner though (which takes either scanPorts or scanAddresses, the scan range, and the number of threads to use; then splits the scan range over the threads, and passes the list of actions to SpawnThreads), it will block indefinitely. The odd thing is, according to the debug prompts scattered around ConcurHelper, on each thread, ForkIO is returning before the MVar is filled. This would make sense if it wasn't in a do block, but shouldn't the actions be performed sequentially? (I don't know if this is related to the problem or not; it's just something I noticed while attempting to debug it).
I've thought it out step by step, and if it's executing in the order laid out in spawnThreads, the following should happen:
An empty MVar should be created inside forkIOReturnMVar, and passed to mVarWrapAct.
mVarWrapAct should execute the action, and put the result in the MVar (this is where the problem seems to lie. "MVar filled" is never shown, suggesting the MVar is never put into)
getResults should then take from the resulting list of MVars, and return the results
If point #2 isn't the issue, I can see where the problem would be (and if it is the issue, I can't see why putMVar never executes. Inside the scanner module, the only real function of interest for this question is multiTest. I only included the rest so it could be run).
To do a simple test, you can run the following:
spawnThreads [putStrLn "Hello", putStrLn "World"] (should return [(),()])
multiTest (scanPorts "127.0.0.1") 1 (0,5) (Creates the MVar, hangs for a sec, then crashes with the aforementioned error)
Any help in understanding whats going on here would be appreciated. I can't see what the difference between the 2 use cases are.
Thank you
(And I'm using this atrocious exception handling system because IO errors don't give codes for specific network exceptions, so I've been left with parsing messages to find out what happened)
Main:
module Scanner where
import Network
import Network.Socket
import System.IO
import Control.Exception
import Control.Concurrent
import ConcurHelper
import Data.Maybe
import Data.Char
import NetHelp
data NetException = NetNoException | NetTimeOut | NetRefused | NetHostUnreach
| NetANotAvail | NetAccessDenied | NetAddrInUse
deriving (Show, Eq)
diffExcept :: Either SomeException Handle -> Either NetException Handle
diffExcept (Right h) = Right h
diffExcept (Left (SomeException m))
| err == "WSAETIMEDOUT" = Left NetTimeOut
| err == "WSAECONNREFUSED" = Left NetRefused
| err == "WSAEHOSTUNREACH" = Left NetHostUnreach
| err == "WSAEADDRNOTAVAIL" = Left NetANotAvail
| err == "WSAEACCESS" = Left NetAccessDenied
| err == "WSAEADDRINUSE" = Left NetAddrInUse
| otherwise = error $ show m
where
err = reverse . dropWhile (== ')') . reverse . dropWhile (/='W') $ show m
extJust :: Maybe a -> a
extJust (Just a) = a
selectJusts :: IO [Maybe a] -> IO [a]
selectJusts mayActs = do
mays <- mayActs; return . map extJust $ filter isJust mays
scanAddresses :: Int -> Int -> Int -> IO [String]
scanAddresses port minAddr maxAddr =
selectJusts $ mapM (\addr -> do
let sAddr = "192.168.1." ++ show addr
print $ "Trying " ++ sAddr ++ " " ++ show port
connection <- testConn sAddr port
if isJust connection
then do hClose $ extJust connection; return $ Just sAddr
else return Nothing) [minAddr..maxAddr]
scanPorts :: String -> Int -> Int -> IO [Int]
scanPorts addr minPort maxPort =
selectJusts $ mapM (\port -> do
--print $ "Trying " ++ addr ++ " " ++ show port
connection <- testConn addr port
if isJust connection
then do hClose $ extJust connection; return $ Just port
else return Nothing) [minPort..maxPort]
main :: IO ()
main = do
withSocketsDo $ do
putStrLn "Scan Addresses or Ports? (a/p)"
choice <- getLine
if (toLower $ head choice) == 'a'
then do
putStrLn "On what port?"
sPort <- getLine
addrs <- scanAddresses (read sPort :: Int) 0 255
print addrs
else do
putStrLn "At what address?"
address <- getLine
ports <- scanPorts address 0 9999
print ports
main
testConn :: HostName -> Int -> IO (Maybe Handle)
testConn host port = do
result <- try $ timedConnect 1 host port
let result' = diffExcept result
case result' of
Left e -> do putStrLn $ "\t" ++ show e; return Nothing
Right h -> return $ Just h
setPort :: AddrInfo -> Int -> AddrInfo
setPort addInf nPort = case addrAddress addInf of
(SockAddrInet _ host) -> addInf { addrAddress = (SockAddrInet (fromIntegral nPort) host)}
getHostAddress :: HostName -> Int -> IO SockAddr
getHostAddress host port = do
addrs <- getAddrInfo Nothing (Just host) Nothing
let adInfo = head addrs
newAdInfo = setPort adInfo port
return $ addrAddress newAdInfo
timedConnect :: Int -> HostName -> Int -> IO Handle
timedConnect time host port = do
s <- socket AF_INET Stream defaultProtocol
setSocketOption s RecvTimeOut time; setSocketOption s SendTimeOut time
addr <- getHostAddress host port
connect s addr
socketToHandle s ReadWriteMode
multiTest :: (Int -> Int -> IO a) -> Int -> (Int, Int) -> IO [a]
multiTest partAction threads (mi,ma) =
spawnThreads $ recDiv [mi,perThread..ma]
where
perThread = ((ma - mi) `div` threads) + 1
recDiv [] = []
recDiv (curN:restN) =
partAction (curN + 1) (head restN) : recDiv restN
Helper:
module ConcurHelper where
import Control.Concurrent
import System.IO
spawnThreads :: [IO a] -> IO [a]
spawnThreads actions = do
ms <- mapM (\act -> do m <- forkIOReturnMVar act; return m) actions
results <- getResults ms
return results
forkIOReturnMVar :: IO a -> IO (MVar a)
forkIOReturnMVar act = do
m <- newEmptyMVar
putStrLn "Created MVar"
forkIO $ mVarWrapAct act m
putStrLn "Fork returned"
return m
mVarWrapAct :: IO a -> MVar a -> IO ()
mVarWrapAct act m = do a <- act; putMVar m a; putStrLn "MVar filled"
getResults :: [MVar a] -> IO [a]
getResults mvars = do
unpacked <- mapM (\m -> do r <- takeMVar m; return r) mvars
putStrLn "MVar taken from"
return unpacked
Your forkIOReturnMVar isn't exception safe: whenever act throws, the MVar isn't going to be filled.
Minimal example
import ConcurHelper
main = spawnThreads [badOperation]
where badOperation = do
error "You're never going to put something in the MVar"
return True
As you can see, badOperation throws, and therefore the MVar won't get filled in mVarWrapAct.
Fix
Fill the MVar with an appropriate value if you encounter an exception. Since you cannot provide a default value for all possible types a, it's better to use MVar (Maybe a) or MVar (Either b a) as you already do in your network code.
In order to catch the exceptions, use one of the operations provided in Control.Exception. For example, you could use onException:
mVarWrapAct :: IO a -> MVar (Maybe a) -> IO ()
mVarWrapAct act m = do
onException (act >>= putMVar m . Just) (putMVar m Nothing)
putStrLn "MVar filled"
However, you might want to preserve the actual exception for more information. In this case you could simply use catch together with Either SomeException a :
mVarWrapAct :: IO a -> MVar (Either SomeException a) -> IO ()
mVarWrapAct act m = do
catch (act >>= putMVar m . Right) (putMVar m . Left)
putStrLn "MVar filled"

Why can't I compare result of lookup to Nothing in Haskell?

I have the following code:
import System.Environment
import System.Directory
import System.IO
import Data.List
dispatch :: [(String, [String] -> IO ())]
dispatch = [ ("add", add)
, ("view", view)
, ("remove", remove)
, ("bump", bump)
]
main = do
(command:args) <- getArgs
let result = lookup command dispatch
if result == Nothing then
errorExit
else do
let (Just action) = result
action args
errorExit :: IO ()
errorExit = do
putStrLn "Incorrect command"
add :: [String] -> IO ()
add [fileName, todoItem] = appendFile fileName (todoItem ++ "\n")
view :: [String] -> IO ()
view [fileName] = do
contents <- readFile fileName
let todoTasks = lines contents
numberedTasks = zipWith (\n line -> show n ++ " - " ++ line) [0..] todoTasks
putStr $ unlines numberedTasks
remove :: [String] -> IO ()
remove [fileName, numberString] = do
handle <- openFile fileName ReadMode
(tempName, tempHandle) <- openTempFile "." "temp"
contents <- hGetContents handle
let number = read numberString
todoTasks = lines contents
newTodoItems = delete (todoTasks !! number) todoTasks
hPutStr tempHandle $ unlines newTodoItems
hClose handle
hClose tempHandle
removeFile fileName
renameFile tempName fileName
bump :: [String] -> IO ()
bump [fileName, numberString] = do
handle <- openFile fileName ReadMode
(tempName, tempHandle) <- openTempFile "." "temp"
contents <- hGetContents handle
let number = read numberString
todoTasks = lines contents
bumpedItem = todoTasks !! number
newTodoItems = [bumpedItem] ++ delete bumpedItem todoTasks
hPutStr tempHandle $ unlines newTodoItems
hClose handle
hClose tempHandle
removeFile fileName
renameFile tempName fileName
Trying to compile it gives me the following error:
$ ghc --make todo
[1 of 1] Compiling Main ( todo.hs, todo.o )
todo.hs:16:15:
No instance for (Eq ([[Char]] -> IO ()))
arising from a use of `=='
Possible fix:
add an instance declaration for (Eq ([[Char]] -> IO ()))
In the expression: result == Nothing
In a stmt of a 'do' block:
if result == Nothing then
errorExit
else
do { let (Just action) = ...;
action args }
In the expression:
do { (command : args) <- getArgs;
let result = lookup command dispatch;
if result == Nothing then
errorExit
else
do { let ...;
.... } }
I don't get why is that since lookup returns Maybe a, which I'm surely can compare to Nothing.
The type of the (==) operator is Eq a => a -> a -> Bool. What this means is that you can only compare objects for equality if they're of a type which is an instance of Eq. And functions aren't comparable for equality: how would you write (==) :: (a -> b) -> (a -> b) -> Bool? There's no way to do it.1 And while clearly Nothing == Nothing and Just x /= Nothing, it's the case that Just x == Just y if and only if x == y; thus, there's no way to write (==) for Maybe a unless you can write (==) for a.
There best solution here is to use pattern matching. In general, I don't find myself using that many if statements in my Haskell code. You can instead write:
main = do (command:args) <- getArgs
case lookup command dispatch of
Just action -> action args
Nothing -> errorExit
This is better code for a couple of reasons. First, it's shorter, which is always nice. Second, while you simply can't use (==) here, suppose that dispatch instead held lists. The case statement remains just as efficient (constant time), but comparing Just x and Just y becomes very expensive. Second, you don't have to rebind result with let (Just action) = result; this makes the code shorter and doesn't introduce a potential pattern-match failure (which is bad, although you do know it can't fail here).
1:: In fact, it's impossible to write (==) while preserving referential transparency. In Haskell, f = (\x -> x + x) :: Integer -> Integer and g = (* 2) :: Integer -> Integer ought to be considered equal because f x = g x for all x :: Integer; however, proving that two functions are equal in this way is in general undecidable (since it requires enumerating an infinite number of inputs). And you can't just say that \x -> x + x only equals syntactically identical functions, because then you could distinguish f and g even though they do the same thing.
The Maybe a type has an Eq instance only if a has one - that's why you get No instance for (Eq ([[Char]] -> IO ())) (a function can't be compared to another function).
Maybe the maybe function is what you're looking for. I can't test this at the moment, but it should be something like this:
maybe errorExit (\action -> action args) result
That is, if result is Nothing, return errorExit, but if result is Just action, apply the lambda function on action.

Resources