Haskell: System.Process merge stdout and stderr

I want to invoke a process from within a haskell program and capture stdout as well as stderr.
What I do:
(_, stdout, stderr) <- readProcessWithExitCode "command" [] ""
The problem: this way stdout and stderr are captured separately, but I want the messages to appear in the order they were written (otherwise I would simply use stdout ++ stderr, which separates error messages from their stdout counterparts).
I know I could achieve this by piping the output into a file, e.g.
tmp <- openFile "temp.file" ...
createProcess (proc "command" []) { std_out = UseHandle tmp
                                  , std_err = UseHandle tmp }
So my current workaround is to pipe outputs to a tempfile and read it back in. However I'm looking for a more direct approach.
If I were sure to be on Unix, I'd simply invoke a shell command à la
command 2>&1
and that's it. However, I'd like to have this as portable as possible.
What I need this for: I've built a tiny haskell cgi script (just to play with it) which invokes a certain program and prints the output. I want to html-escape the output, thus I can't simply pipe it to stdout.
I was thinking: Maybe it's possible to create an in-memory-handle, like a PipedInputStream/PipedOutputStream in Java, or ArrayInputStream/ArrayOutputStream which allows for processing IO streams within memory. I looked around for a function :: Handle on hoogle, but did not find anything.
Maybe there is another Haskell module out there which allows me to merge two streams?

You can use pipes to concurrently merge two input streams. The first trick is to read from two streams concurrently, which you can do using the stm package:
import Control.Applicative
import Control.Proxy
import Control.Concurrent
import Control.Concurrent.STM
import System.Process
toTMVarC :: (Proxy p) => TMVar a -> () -> Consumer p a IO r
toTMVarC tmvar () = runIdentityP $ forever $ do
    a <- request ()
    lift $ atomically $ putTMVar tmvar a
fromTMVarS :: (Proxy p) => TMVar a -> () -> Producer p a IO r
fromTMVarS tmvar () = runIdentityP $ forever $ do
    a <- lift $ atomically $ takeTMVar tmvar
    respond a
I will soon provide the above primitives in a pipes-stm package, but use the above for now.
Then you just feed each Handle to a separate TMVar and read from both concurrently:
main = do
    -- Request pipes explicitly; with the default (Inherit) both handles are Nothing
    (_, mStdout, mStderr, _) <- createProcess (proc "ls" [])
        { std_out = CreatePipe, std_err = CreatePipe }
    case (,) <$> mStdout <*> mStderr of
        Nothing -> return ()
        Just (stdout, stderr) -> do
            out <- newEmptyTMVarIO
            err <- newEmptyTMVarIO
            forkIO $ runProxy $ hGetLineS stdout >-> toTMVarC out
            forkIO $ runProxy $ hGetLineS stderr >-> toTMVarC err
            let combine () = runIdentityP $ forever $ do
                    str <- lift $ atomically $
                        takeTMVar out `orElse` takeTMVar err
                    respond str
            runProxy $ combine >-> putStrLnD
Just replace putStrLnD with however you want to process the input.
To learn more about the pipes package, just read Control.Proxy.Tutorial.

On POSIX systems you can use createPipe and fdToHandle from System.Posix.IO to create a pair of new handles (I'm not sure where to close those handles and fds, though):
import System.Exit (ExitCode)
import System.IO (hGetContents)
import System.Posix.IO (createPipe, fdToHandle)
import System.Process (CreateProcess (..), StdStream (..), createProcess, proc, waitForProcess)
readProcessWithMergedOutput :: String -> IO (ExitCode, String)
readProcessWithMergedOutput command = do
    (p_r, p_w) <- createPipe
    h_r <- fdToHandle p_r
    h_w <- fdToHandle p_w
    (_, _, _, h_proc) <- createProcess (proc command [])
        { std_out = UseHandle h_w
        , std_err = UseHandle h_w
        }
    ret_code <- waitForProcess h_proc
    content <- hGetContents h_r
    return (ret_code, content)
For Windows, this post implemented a cross-platform createPipe.
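As a possible alternative to the POSIX-only version: newer releases of the process package (1.2.1.0 and later, as far as I know) export their own createPipe from System.Process, which returns a pair of Handles directly. The following is a minimal sketch under that assumption, not a tested drop-in. The name readProcessMerged is made up for illustration. It relies on the documented behaviour that createProcess closes handles passed via UseHandle in the parent, and it forces the output before waiting so that a child producing more than a pipe buffer's worth of data does not block.

```haskell
import Control.Exception (evaluate)
import System.Exit (ExitCode (..))
import System.IO (hGetContents)
import System.Process
  ( CreateProcess (..), StdStream (..), createPipe, createProcess
  , proc, waitForProcess )

-- | Hypothetical helper: run a command with stdout and stderr sharing
-- one pipe, returning the exit code and the interleaved output.
readProcessMerged :: FilePath -> [String] -> IO (ExitCode, String)
readProcessMerged cmd args = do
  (readEnd, writeEnd) <- createPipe
  (_, _, _, ph) <- createProcess (proc cmd args)
    { std_out = UseHandle writeEnd
    , std_err = UseHandle writeEnd
    }
  -- createProcess closes UseHandle handles in the parent, so the read
  -- end reaches EOF once the child has exited.
  output <- hGetContents readEnd
  -- Read everything before waiting for the child to exit.
  _ <- evaluate (length output)
  code <- waitForProcess ph
  return (code, output)
```

The String result can then be HTML-escaped before it is printed, which is what the CGI use case in the question needs.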

Related

conditional standard handle redirection in Haskell

I want to read a file, process it, and write the results to another file; the input file name is to be supplied through a console argument, and the output file name is generated from the input file name.
The catch is I want it to transparently “fail over” to stdin/stdout if no arguments are supplied; essentially, in case a file name is supplied, I redirect stdin/stdout to the respective file names so I can transparently use interact whether the file name was supplied or not.
Here's the code hacked together with dummy output in a superfluous else. What would be the proper, idiomatic way of doing it?
It probably could have something to do with Control.Monad's when or guard, as was pointed out in a similar question, but maybe somebody wrote this already.
import System.IO
import Data.Char(toUpper)
import System.Environment
import GHC.IO.Handle
main :: IO ()
main = do
    args <- getArgs
    if(not $ null args) then
        do
            print $ "working with "++ (head args)
            finHandle <- openFile (head args) ReadMode --open the supplied input file
            hDuplicateTo finHandle stdin --bind stdin to finName's handle
            foutHandle <- openFile ((head args) ++ ".out") WriteMode --open the output file for writing
            hDuplicateTo foutHandle stdout --bind stdout to the outgoing file
        else print "working through stdin/redirect" --get to know
    interact ((++) "Here you go---\n" . map toUpper)
There's nothing very special about interact - here is its definition:
interact :: (String -> String) -> IO ()
interact f = do s <- getContents
                putStr (f s)
How about something like this:
import System.Environment
import Data.Char
main = do
    args <- getArgs
    let (reader, writer) =
            case args of
                [] -> (getContents, putStr)
                (path : _) -> let outpath = path ++ ".output"
                              in (readFile path, writeFile outpath)
    contents <- reader
    writer (process contents)
process :: String -> String
process = (++) "Here you go---\n" . map toUpper
Based on the command line arguments we set reader and writer to the IO-actions which will read the input and write the output.
This seems fairly idiomatic to me already. The one note I have is to avoid head, as it is an unsafe function (it can throw a runtime error). In this case it is fairly easy to do so by using case to pattern match.
main :: IO ()
main = do
    args <- getArgs
    case args of
        fname:_ -> do
            print $ "working with " ++ fname
            finHandle <- openFile fname ReadMode
            hDuplicateTo finHandle stdin
            foutHandle <- openFile (fname ++ ".out") WriteMode
            hDuplicateTo foutHandle stdout
        [] -> do
            print "working through stdin/redirect"
    interact ((++) "Here you go---\n" . map toUpper)

How to pipe output from an IO action into a process in haskell

I want to create a process and write some text from my haskell program into the process's stdin periodically (from an IO action).
The following works correctly in GHCi but doesn't work correctly when built and run. In GHCi the value from the IO action is fed in periodically. When built and run, however, it seems to pause for arbitrarily long periods of time when writing to the process's stdin.
I've used createProcess (from System.Process) to create the handle and tried hPutStrLn (buffering set to NoBuffering; LineBuffering didn't work either).
So I'm trying the process-streaming package and pipes but can't seem to get anything to work at all.
The real question is this: How do I create a process from Haskell and write to it periodically?
Minimal example that exhibits this behavior:
import System.Process
import System.IO (BufferMode (..), hSetBuffering)
import Data.IORef
import qualified Data.Text as T -- from the text package
import qualified Data.Text.IO as TIO
import Control.Concurrent.Timer -- from the timers package
import Control.Concurrent.Suspend -- from the suspend package
main = do
    (Just hin, _,_,_) <- createProcess_ "bgProcess" $
        (System.Process.proc "grep" ["10"]) { std_in = CreatePipe }
    ref <- newIORef 0 :: IO (IORef Int)
    flip repeatedTimer (msDelay 1000) $ do
        x <- atomicModifyIORef' ref $ \x -> (x + 1, x)
        hSetBuffering hin NoBuffering
        TIO.hPutStrLn hin $ T.pack $ show x
Any help will be greatly appreciated.
This is a pipes Producer that emits a sequence of numbers with a one-second delay between them:
{-# language NumDecimals #-}
import Control.Concurrent
import Pipes
import qualified Data.ByteString.Char8 as Bytes
periodic :: Producer Bytes.ByteString IO ()
periodic = go 0
  where
    go n = do
        d <- liftIO (pure (Bytes.pack (show n ++ "\n"))) -- put your IO action here
        Pipes.yield d
        liftIO (threadDelay 1e6)
        go (succ n)
And, using process-streaming, we can feed the producer to an external process like this:
import System.Process.Streaming
main :: IO ()
main = do
    executeInteractive (shell "grep 10"){ std_in = CreatePipe } (feedProducer periodic)
I used executeInteractive, which sets std_in automatically to NoBuffering.
Also, if you pipe std_out and want to process each match immediately, be sure to pass the --line-buffered option to grep (or use the stdbuf command) to ensure that matches are immediately available at the output.
What about using threadDelay, e.g.:
import Control.Monad (forever)
import Control.Concurrent (threadDelay)
...
forever $ do
    x <- atomicModifyIORef' ref $ \x -> (x + 1, x)
    hSetBuffering hin NoBuffering
    TIO.hPutStrLn hin $ T.pack $ show x
    threadDelay 1000000 -- 1 sec
Spawn this off in another thread if you need to do other work at the same time.
You can remove the need for the IORef with:
loop h x = do
    hSetBuffering h NoBuffering
    TIO.hPutStrLn h $ T.pack $ show x
    threadDelay 1000000
    loop h (x+1)
And, of course, you only need to do the hSetBuffering once - e.g. do it just before you enter the loop.
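Putting those remarks together, a minimal sketch could look like the following. The helper name feedPeriodically is hypothetical, and it uses plain String instead of Text to stay within base. Buffering is configured once before the loop, and each line is flushed explicitly so it reaches the child process straight away.

```haskell
import Control.Concurrent (threadDelay)
import Control.Monad (forM_)
import System.IO

-- | Hypothetical helper: write each line to the handle, flushing and
-- then pausing for the given number of microseconds after every line.
feedPeriodically :: Int -> Handle -> [String] -> IO ()
feedPeriodically delayMicros h ls = do
  hSetBuffering h LineBuffering   -- set once, before entering the loop
  forM_ ls $ \l -> do
    hPutStrLn h l
    hFlush h                      -- make sure the line is sent now
    threadDelay delayMicros
```

With the handle from the question this might be called as feedPeriodically 1000000 hin (map show [0 :: Int ..]), forked into its own thread with forkIO if the main thread has other work to do.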

How do I break out of a loop in Haskell?

The current version of the Pipes tutorial uses the following two functions in one of its examples:
stdout :: () -> Consumer String IO r
stdout () = forever $ do
    str <- request ()
    lift $ putStrLn str
stdin :: () -> Producer String IO ()
stdin () = loop
  where
    loop = do
        eof <- lift $ IO.hIsEOF IO.stdin
        unless eof $ do
            str <- lift getLine
            respond str
            loop
As mentioned in the tutorial itself, P.stdin is a bit more complicated because it needs to check for the end of input.
Are there any nice ways to rewrite P.stdin to not need a manual tail recursive loop and use higher order control flow combinators like P.stdout does? In an imperative language I would use a structured while loop or a break statement to do the same thing:
while(not IO.isEOF(IO.stdin) ){
    str <- getLine()
    respond(str)
}
forever(){
    if(IO.isEOF(IO.stdin) ){ break }
    str <- getLine()
    respond(str)
}
I prefer the following:
import Control.Monad
import Control.Monad.Trans.Either
loop :: (Monad m) => EitherT e m a -> m e
loop = liftM (either id id) . runEitherT . forever
-- I'd prefer 'break', but that's in the Prelude
quit :: (Monad m) => e -> EitherT e m r
quit = left
You use it like this:
import Pipes
import qualified System.IO as IO
stdin :: () -> Producer String IO ()
stdin () = loop $ do
    eof <- lift $ lift $ IO.hIsEOF IO.stdin
    if eof
        then quit ()
        else do
            str <- lift $ lift getLine
            lift $ respond str
See this blog post where I explain this technique.
The only reason I don't use that in the tutorial is that I consider it less beginner-friendly.
Looks like a job for whileM_:
stdin () = whileM_ (lift . fmap not $ IO.hIsEOF IO.stdin) (lift getLine >>= respond)
or, using do-notation similarly to the original example:
stdin () =
    whileM_ (lift . fmap not $ IO.hIsEOF IO.stdin) $ do
        str <- lift getLine
        respond str
The monad-loops package offers also whileM which returns a list of intermediate results instead of ignoring the results of the repeated action, and other useful combinators.
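To show what such a combinator does, here is a small sketch that defines a local stand-in with the same shape as whileM_, so it runs with base alone. It is not the monad-loops source, and forEachLine is a made-up name. The recursion still exists, but it is written once inside the combinator rather than in every loop that needs it, and the example works on a plain Handle in IO rather than inside a pipe.

```haskell
import Control.Monad (when)
import System.IO

-- | Local stand-in for whileM_: repeat the body for as long as the
-- condition action returns True.
whileM_ :: Monad m => m Bool -> m a -> m ()
whileM_ cond body = go
  where
    go = do
      continue <- cond
      when continue (body >> go)

-- | Hypothetical helper: apply an action to every line read from a
-- handle, stopping at end of input.
forEachLine :: Handle -> (String -> IO ()) -> IO ()
forEachLine h act = whileM_ (not <$> hIsEOF h) (hGetLine h >>= act)
```

For example, forEachLine stdin putStrLn echoes standard input line by line until EOF.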
Since there is no implicit control flow, there is no such thing as "break". Moreover, your sample is already a small block that will be used in more complicated code.
If you want to stop "producing strings", that should be supported by your abstraction, i.e. some "management" of "pipes" using a special monad in Consumer and/or other monads related to it.
You can simply import System.Exit and use exitWith ExitSuccess, e.g.:
if (input == 'q')
    then exitWith ExitSuccess
    else print 5 -- (anything)

Catching/hijacking stdout in haskell

How can I define 'catchOutput' so that running main outputs only 'bar'?
That is, how can I access both the output stream (stdout) and the actual output of an io action separately?
catchOutput :: IO a -> IO (a,String)
catchOutput = undefined
doSomethingWithOutput :: IO a -> IO ()
doSomethingWithOutput io = do
    (_ioOutp, stdOutp) <- catchOutput io
    if stdOutp == "foo"
        then putStrLn "bar"
        else putStrLn "fail!"
main = doSomethingWithOutput (putStr "foo")
The best hypothetical "solution" I've found so far includes diverting stdout, inspired by this, to a file stream and then reading from that file (Besides being super-ugly I haven't been able to read directly after writing from a file. Is it possible to create a "custom buffer stream" that doesn't have to store in a file?). Although that feels 'a bit' like a side track.
Another angle seems to use 'hGetContents stdout' if that is supposed to do what I think it should. But I'm not given permission to read from stdout. Although googling it seems to show that it has been used.
I used the following function for a unit test of a function that prints to stdout.
import GHC.IO.Handle
import System.IO
import System.Directory
catchOutput :: IO () -> IO String
catchOutput f = do
    tmpd <- getTemporaryDirectory
    (tmpf, tmph) <- openTempFile tmpd "haskell_stdout"
    stdout_dup <- hDuplicate stdout
    hDuplicateTo tmph stdout
    hClose tmph
    f
    hDuplicateTo stdout_dup stdout
    str <- readFile tmpf
    removeFile tmpf
    return str
I am not sure about the in-memory file approach, but it works okay for a small amount of output with a temporary file.
There are some packages on Hackage that promise to do that: io-capture and silently. silently seems to be maintained and works on Windows too (io-capture only works on Unix). With silently, you use capture:
import System.IO.Silently
main = do
    (output, _) <- capture $ putStr "hello"
    putStrLn $ output ++ " world"
Note that it works by redirecting output to a temporary file and then reading it back. But as long as it works!
Why not just use a writer monad instead? For example,
import Control.Monad.Writer
doSomethingWithOutput :: WriterT String IO a -> IO ()
doSomethingWithOutput io = do
    (_, res) <- runWriterT io
    if res == "foo"
        then putStrLn "bar"
        else putStrLn "fail!"
main = doSomethingWithOutput (tell "foo")
Alternatively, you could modify your inner action to take a Handle to write to instead of stdout. You can then use something like knob to make an in-memory file handle which you can pass to the inner action, and check its contents afterward.
As @hammar pointed out, you can use a knob to create an in-memory file, but you can also use hDuplicate and hDuplicateTo to change stdout to the memory file, and back again. Something like the following completely untested code:
catchOutput io = do
    knob <- newKnob (pack [])
    let before = do
            h <- newFileHandle knob "<stdout>" WriteMode
            stdout' <- hDuplicate stdout
            hDuplicateTo h stdout
            hClose h
            return stdout'
        after stdout' = do
            hDuplicateTo stdout' stdout
            hClose stdout'
    -- bracket (not bracket_) is needed so that 'after' receives the saved handle
    a <- bracket before after (const io)
    bytes <- Data.Knob.getContents knob
    return (a, unpack bytes)

Parallel IO Causes Random Text Output in Terminal

I'm using
import Control.Concurrent.ParallelIO.Global
main = parallel_ (map processI [1..(sdNumber runParameters)]) >> stopGlobalPool
where
processI :: Int -> IO ()
is some function, which reads data from file, processes it and writes it to another file. No output to terminal. The problem is when I run the program with +RTS -N8 the terminal is flooded with random text like
piptufuht teata thtsieieo ocnsno e nscsdeoe qnqvuduee ernvnstetiirioasanlil lolwynya. .s
w
a s s uY Ysosopuuue's'nvpvdeeee n dpdp rerdodoub beada
bub lel y
What is happening? Without +RTS there is no clutter. I couldn't reproduce the behavior with a more simple (suitable to post here) program.
GHC 7.0.3 if that matters
Buffering is probably preventing you from constructing a simple test case. I was able to reproduce it with this (only when run with +RTS -Nsomething):
import Control.Concurrent
import System.IO
main :: IO ()
main = do
    hSetBuffering stdout NoBuffering
    forkIO $ putStrLn "foo"
    forkIO $ putStrLn "bar"
    forkIO $ putStrLn "baz"
    threadDelay 1000 -- Allow things to print
As Thomas mentioned, you'll probably need to sequence this somehow, though I'm not sure how writing straight to files would change this. Here's a simple example how you can sequence this with a Chan. I'm sure there's a better way to do this, this is just an example of how I got this to not garble the output.
import Control.Concurrent
import Control.Concurrent.Chan
import System.IO
main :: IO ()
main = do
    hSetBuffering stdout NoBuffering
    ch <- newChan -- Things written here are picked up by stuffWriter
    forkIO $ stuffWriter ch -- Fire up concurrent stuffWriter
    forkIO $ writeChan ch "foo"
    forkIO $ writeChan ch "bar"
    forkIO $ writeChan ch "baz"
    threadDelay 1000 -- Allow things to print
-- | Write all the things!
stuffWriter :: Chan String -> IO ()
stuffWriter ch = do
    readChan ch >>= putStrLn -- Block, then write once I've got something
    stuffWriter ch -- loop... looking for more things to write
Now your writes are synchronous (stuffWriter writes things one at a time), and you should have no more garbling.
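Another way to serialize the writes, shown here only as a sketch, is to guard the handle with an MVar used as a lock instead of routing everything through a Chan. The name lockedPutStrLn is made up for this example. Each thread takes the lock for the duration of one line, so lines from different threads cannot interleave character by character.

```haskell
import Control.Concurrent
import System.IO

-- | Hypothetical helper: write one line while holding the lock, so
-- concurrent writers emit whole lines one at a time.
lockedPutStrLn :: MVar () -> Handle -> String -> IO ()
lockedPutStrLn lock h s = withMVar lock $ \_ -> hPutStrLn h s
```

The lock would be created once with newMVar () and shared by all threads, for example forkIO $ lockedPutStrLn lock stdout "foo". The order of the lines still depends on scheduling; only the garbling within a line goes away.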
