Interacting with a subprocess while capturing stderr haskell

So I have a Haskell program that interacts with a subprocess using the System.Process.Typed library. I am trying to capture the stderr of the subprocess during the entire duration of the subprocess's lifespan. The current method doesn't work if the subprocess finishes before I get to line *. I think to do this I need to use STM but I don't know anything about STM and so was wondering if there was a simpler way.
fun :: MyType -> IO MyOtherType
fun inparam = withProcessWait config $ \process -> do
    hPutStrLn (getStdin process) (getStr1 inparam)
    hFlush (getStdin process)
    response1 <- hGetLine (getStdout process)
    hPutStrLn (getStdin process) (getStr2 inparam)
    hFlush (getStdin process)
    response2 <- hGetLine (getStdout process)
    err <- hGetContents (getStderr process) -- this is line *
    hClose (getStdin process)
    exitCode <- timedWaitExitCode 100 process
    return $ MyOtherType response1 response2 err
  where
    config = setStdin createPipe
           $ setStdout createPipe
           $ setStderr createPipe
           $ fromString (fp inparam)
Thank you in advance.
Edit 1: Fixed the * label
Edit 2: When I try to run the code I get Exception: [..] hGetContents: illegal operation (delayed read on closed handle)

You did not specify what exactly “doesn’t work” in your code, so I’ll try to guess. One potential issue that I can immediately see is that you are returning values that you read from file handles (response1, response2, err) from your function. The problem here is that Haskell is a lazy language, so the values that you return are not actually read from those handles until they are really needed. And by the time they are needed, the child process has exited and the handles are closed, so it is impossible to read from them.
The simplest fix would be to force those entire strings to be read before you “return” from your function. One standard recipe for this is to apply force (from Control.DeepSeq, in the deepseq package) and pass the result to evaluate (from Control.Exception). This makes your program actually read the values and keep them in memory, so the handles can be closed safely.
So, instead of:
value <- hGetContents handle
you should do:
value' <- hGetContents handle
value <- evaluate $ force value'
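Applied to the setting in the question, the recipe might look like the sketch below. It assumes a POSIX sh is available and uses a made-up child command; talkToChild and childConfig are illustrative names, not part of typed-process. Note that stdin is closed before stderr is read: forcing the whole string blocks until the child closes its stderr, which for many programs only happens once they see end-of-input and exit.

```haskell
import Control.DeepSeq (force)
import Control.Exception (evaluate)
import System.IO (Handle, hClose, hFlush, hGetContents, hGetLine, hPutStrLn)
import System.Process.Typed

-- Hypothetical child: echoes one line back on stdout, then writes to stderr.
childConfig :: ProcessConfig Handle Handle Handle
childConfig = setStdin createPipe
            $ setStdout createPipe
            $ setStderr createPipe
            $ shell "read line; echo \"got: $line\"; echo warning >&2"

talkToChild :: String -> IO (String, String)
talkToChild input = withProcessWait childConfig $ \process -> do
    hPutStrLn (getStdin process) input
    hFlush (getStdin process)
    reply <- hGetLine (getStdout process)
    hClose (getStdin process)                  -- signal end-of-input to the child
    lazyErr <- hGetContents (getStderr process)
    err <- evaluate (force lazyErr)            -- read all of stderr before the handles are closed
    return (reply, err)

main :: IO ()
main = talkToChild "hello" >>= print
```

If the child can write a lot to stderr while the conversation on stdin/stdout is still going on, the stderr pipe may fill up and stall the child. For that case typed-process also provides byteStringOutput, which collects a stream in a background thread.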

Related

How do I keep a spawned process alive in Haskell?

I'm trying to set up a "bridge" between my Haskell code and an interactive command-line process. More specifically, I'm trying to run an Elm REPL and send/receive through stdin/stdout. I wasn't sure exactly which library to use for this, but I went with typed-process.
The issue I have is that my Haskell program finishes (or quits) while the REPL process is still running. How do I avoid this?
Also, another problem is that the REPL process isn't getting any input from the stdin handle.
My current code looks like this:
run :: Document -> IO (Result () Text)
run (Document moduleName tests) = do
    let config = createConfig
    p <- startProcess config
    hSetBuffering (getStdin p) NoBuffering
    hSetBuffering (getStdout p) NoBuffering
    Data.Text.IO.hPutStr (getStdin p) "True\n"
    Data.Text.IO.hGetChunk (getStdout p) >>= print
    _ <- waitExitCode p
    return (Ok ())

{-| Config for process.
-}
createConfig =
    shell "elm repl"
        |> setStdin createPipe
        |> setStdout createPipe
        |> setStderr closed

From the docs, it seems that stopProcess forces the process to stop (it sends SIGTERM on unix). This is because the docs state that it calls terminateProcess, and then waits.
We only want to wait without terminating the process. I would try waitExitCode or similar functions, instead.
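As a rough illustration of waiting without terminating, here is a minimal sketch that uses cat as a stand-in for the interactive program (an assumption made for the demo; echoThroughCat is a hypothetical name). It flushes after writing so the child actually receives the line, closes stdin so the child can finish on its own, and then blocks in waitExitCode.

```haskell
import System.Exit (ExitCode (..))
import System.IO (hClose, hFlush, hGetLine, hPutStrLn)
import System.Process.Typed

-- Send one line to "cat", read the echoed line back, then wait for a clean exit.
echoThroughCat :: String -> IO (String, ExitCode)
echoThroughCat input = do
    let config = setStdin createPipe
               $ setStdout createPipe
               $ proc "cat" []
    p <- startProcess config
    hPutStrLn (getStdin p) input
    hFlush (getStdin p)            -- push the buffered line through the pipe
    reply <- hGetLine (getStdout p)
    hClose (getStdin p)            -- end-of-input lets cat exit by itself
    exitCode <- waitExitCode p     -- blocks until the child exits; does not kill it
    return (reply, exitCode)

main :: IO ()
main = echoThroughCat "True" >>= print
```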

Haskell hGetContents error

Is the following error due to lazy evaluation?
epubParsing :: FilePath -> IO [String]
epubParsing f = do
    h <- openFile f ReadMode
    hSetEncoding h utf8
    content <- hGetContents h
    hClose h
    return . fromJust $ scrapeStringLike content paragraphS
I get an error: hGetContents: illegal operation (delayed read on closed handle)
Why?

Calling hGetContents puts the handle into a special "semi-closed" state. You cannot perform any explicit operations on it after that. In particular, you don't manually close it; it automatically gets closed in the background when you read to the end of the string. You can just remove that hClose and it will work.
This is one of the pitfalls of lazy I/O, and one of the reasons people advise against it: it makes the timing of your I/O operations unpredictable.
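If you would rather keep an explicit close, another option is to make sure the whole file has been read before the handle goes away. The sketch below uses only base; readFileStrictly and the demo file name are made up for illustration.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Read a whole UTF-8 file and only then let withFile close the handle.
readFileStrictly :: FilePath -> IO String
readFileStrictly path = withFile path ReadMode $ \h -> do
    hSetEncoding h utf8
    content <- hGetContents h
    _ <- evaluate (length content)   -- walks the whole list, so the file is read to the end
    return content

main :: IO ()
main = do
    writeFile "strict-demo.txt" "line one\nline two\n"
    readFileStrictly "strict-demo.txt" >>= putStr
```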

Understanding `withFile` with Example

I implemented withFile in Haskell:
withFile' :: FilePath -> IOMode -> (Handle -> IO a) -> IO a
withFile' path iomode f = do
    handle <- openFile path iomode
    result <- f handle
    hClose handle
    return result
When I ran the main provided by Learn You a Haskell, it printed out the content of "girlfriend.txt," as expected:
import System.IO
main = do
    withFile' "girlfriend.txt" ReadMode (\handle -> do
        contents <- hGetContents handle
        putStr contents)
I wasn't sure if my withFile' would've worked with the last 2 lines: (1) closing the handle and (2) returning the result as an IO a.
Why didn't the following happen?
1. result gets lazily bound to f handle
2. hClose handle closes the file handle
3. result gets return'd, which results in the actual evaluation of f handle. Since handle was closed, an error gets thrown.

Lazy IO is notoriously confusing.
It depends on whether putStr executes before hClose or not.
Notice the difference between the first and second uses (the brackets are unnecessary but clarifying in the second example).
ghci> withFile' "temp.hs" ReadMode (hGetContents >=> putStr) -- putStr
import System.IO
import Control.Monad
withFile' :: FilePath -> IOMode -> (Handle -> IO a) -> IO a
withFile' path iomode f = do
    handle <- openFile path iomode
    result <- f handle
    hClose handle
    return result
ghci> (withFile' "temp.hs" ReadMode hGetContents) >>= putStr
ghci>
In both cases, the f passed in gets a chance to run before the handle is closed. Because of lazy evaluation, hGetContents only reads the file if it needs to, i.e. is forced to in order to produce output for some other function.
In the first example, since f is (hGetContents >=> putStr), the full contents of the file must be read in order to execute putStr.
In the second example, nothing needs to be evaluated after hGetContents in order to return result, which is a lazy list. (I can quite happily return (show [1..]) which will only fail to terminate if I choose to use the entire output.) This is seen as a problem for lazy IO, which is fixed by alternatives such as strict IO, pipes or conduit.
Maybe returning the empty string for a file when the handle was closed prematurely is a bug, but certainly running the entirety of f before closing it is not.

Equational reasoning means that you can reason about Haskell code by just inlining and substituting things (with certain caveats, but they don't apply here).
This means that all I need to do to understand your code is to take the withFile' here:
import System.IO
main = do
    withFile' "girlfriend.txt" ReadMode (\handle -> do
        contents <- hGetContents handle
        putStr contents)
... and inline its definition:
main = do
    handle <- openFile "girlfriend.txt" ReadMode
    contents <- hGetContents handle
    result <- putStr contents
    hClose handle
    return result
Once you inline its definition, it's easier to see what is going on. putStr evaluates the entire contents of the file before you close the handle, so there is no error. Also, result is not what you think it is: it's the return value of putStr, which is just (), not the contents of the file.

Most IO actions are not lazily executed.
IO action execution is different from normal Haskell evaluation of values. IO execution is only ever carried out by the outer driver that is trying to execute all the effects of main; it does so in the correct order implied by the monadic sequencing of IO actions.
The driver's need to know what the next IO action is ultimately triggers all evaluation of lazy values in Haskell; if it were happy with an unevaluated lazy value and moved on to the next thing without fully evaluating and executing it, then it would just leave main unevaluated and no Haskell program could ever do anything.
The Haskell value resulting from executing an IO action may of course be an unevaluated lazy value, but each IO action itself is evaluated and executed by the driver (including all sub-actions sequenced with do blocks or binds).
So result doesn't get lazily bound to f handle completely unevaluated; f handle is evaluated to come up with the sub actions hGetContents handle and putStr contents. These are both fully executed before the outer driver moves on to hClose handle, so everything's okay.
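The distinction between executing an action and evaluating its result can be seen in a small sketch (bumpAndReturnThunk and demo are made-up names). The effect on the counter happens when the action runs, even though the value it returns is never forced.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- An action whose effect runs immediately but whose result stays unevaluated.
bumpAndReturnThunk :: IORef Int -> IO Int
bumpAndReturnThunk counter = do
    modifyIORef counter (+ 1)                      -- effect: performed when the action is executed
    return (error "this result is never forced")   -- value: left as an unevaluated thunk

demo :: IO Int
demo = do
    counter <- newIORef 0
    _unused <- bumpAndReturnThunk counter   -- executes the effect, does not force the result
    readIORef counter

main :: IO ()
main = demo >>= print   -- prints 1
```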
Note however that hGetContents is special. Quoting from the documentation:
Computation hGetContents hdl returns the list of characters corresponding to the unread portion of the channel or file managed by hdl, which is put into an intermediate state, semi-closed. In this state, hdl is effectively closed, but items are read from hdl on demand and accumulated in a special list returned by hGetContents hdl.
Any operation that fails because a handle is closed, also fails if a handle is semi-closed. The only exception is hClose. A semi-closed handle becomes closed:
if hClose is applied to it;
if an I/O error occurs when reading an item from the handle;
or once the entire contents of the handle has been read.
Once a semi-closed handle becomes closed, the contents of the associated list becomes fixed. The contents of this final list is only partially specified: it will contain at least all the items of the stream that were evaluated prior to the handle becoming closed.
So executing hGetContents handle actually results in a partially evaluated list, whose lazy evaluation is tied to further IO operations under the hood. This is impossible to do yourself without using the Unsafe family of operations, since it is essentially bypassing the type system and can result in exactly the sort of problem you were concerned about; if you had attempted the following code:
main = do
    text <- withFile' "girlfriend.txt" ReadMode (\handle -> do
        contents <- hGetContents handle
        return contents)
    putStr text
(where the function passed to withFile' tries to return the file contents, and they are passed to putStr after the withFile' call), then the putStr would be executed after hClose, and the file may well not have been fully read before it was closed.

Using Data.Binary.decodeFile, encountered error "demandInput: not enough bytes"

I'm attempting to use the encodeFile and decodeFile functions in Data.Binary to save a very large datastructure so that I don't have to recompute it every time I run this program. The relevant encoding- and decoding-functions are as follows:
writePlan :: IO ()
writePlan = do (d, _, bs) <- return subjectDomain
               outHandle <- openFile "outputfile" WriteMode
               ((ebsP, aP), cacheData) <- preplanDomain d bs
               putStrLn "Calculated."
               let toWrite = ((map pseudofyOverEBS ebsP, aP),
                              pseudofyOverMap cacheData) :: WrittenData
                 in do encodeFile preplanFilename $ encode toWrite
                       putStrLn "Done."

readPlan :: IO (([EvaluatedBeliefState], [Action]), MVar HeuCache)
readPlan = do (d, _, _) <- return subjectDomain
              inHandle <- openFile "outputfile" ReadMode
              ((ebsP, aP), cacheData) <- decodeFile preplanFilename :: IO WrittenData
              fancyCache <- newMVar (M.empty, depseudofyOverMap cacheData)
              return $! ((map depseudofyOverEBS ebsP, aP), fancyCache)
The program to calculate and write the file (using writePlan) executes without error, outputting a gigantic binary file. However, when I run the program which takes in this file, executing readPlan results in the error (the program name is "Realtime"):
Realtime: demandInput: not enough bytes
I can't make head nor tail of this, and scouring Google has turned up no substantial documentation or discussion of this message. Any insight would be appreciated!

I am very late to the party, but found this while looking for help with a similar issue. I'm working with the incremental interface for Data.Binary.Get. The function named in the error message is defined in Data.Binary.Get.Internal. I am guessing here, but decodeFile runs a parser over the file, and the error is probably thrown because the file does not parse completely (i.e. the parser expects more input but has already reached EOF).
Hope that helps anyone with this/similar issues!
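One detail in the question's writePlan may be relevant: it passes encode toWrite to encodeFile, which already encodes its argument. The file would then hold an encoded ByteString rather than an encoded WrittenData, and decoding it as WrittenData could plausibly run out of input. A minimal round trip without the extra encode looks like this sketch, where Cache, roundTrip and the file name are stand-ins invented for the example:

```haskell
import Data.Binary (decodeFile, encodeFile)
import qualified Data.Map as M

-- Stand-in for the question's WrittenData type.
type Cache = M.Map String [Int]

-- Write a value with encodeFile and read it back with decodeFile at the same type.
roundTrip :: FilePath -> Cache -> IO Cache
roundTrip path cache = do
    encodeFile path cache   -- pass the value itself; encodeFile does the encoding
    decodeFile path

main :: IO ()
main = do
    let cache = M.fromList [("a", [1, 2, 3]), ("b", [])]
    restored <- roundTrip "cache-demo.bin" cache
    print (restored == cache)
```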

Catching/hijacking stdout in haskell

How can I define 'catchOutput' so that running main outputs only 'bar'?
That is, how can I access both the output stream (stdout) and the actual output of an io action separately?
catchOutput :: IO a -> IO (a,String)
catchOutput = undefined

doSomethingWithOutput :: IO a -> IO ()
doSomethingWithOutput io = do
    (_ioOutp, stdOutp) <- catchOutput io
    if stdOutp == "foo"
        then putStrLn "bar"
        else putStrLn "fail!"

main = doSomethingWithOutput (putStr "foo")
The best hypothetical "solution" I've found so far includes diverting stdout, inspired by this, to a file stream and then reading from that file (Besides being super-ugly I haven't been able to read directly after writing from a file. Is it possible to create a "custom buffer stream" that doesn't have to store in a file?). Although that feels 'a bit' like a side track.
Another angle seems to use 'hGetContents stdout' if that is supposed to do what I think it should. But I'm not given permission to read from stdout. Although googling it seems to show that it has been used.

I used the following function for a unit test of a function that prints to stdout.
import GHC.IO.Handle
import System.IO
import System.Directory

catchOutput :: IO () -> IO String
catchOutput f = do
    tmpd <- getTemporaryDirectory
    (tmpf, tmph) <- openTempFile tmpd "haskell_stdout"
    stdout_dup <- hDuplicate stdout
    hDuplicateTo tmph stdout
    hClose tmph
    f
    hDuplicateTo stdout_dup stdout
    str <- readFile tmpf
    removeFile tmpf
    return str
I am not sure about the in-memory file approach, but it works okay for a small amount of output with a temporary file.

There are some packages on Hackage that promise to do that: io-capture and silently. silently seems to be maintained and works on Windows too (io-capture only works on Unix). With silently, you use capture:
import System.IO.Silently

main = do
    (output, _) <- capture $ putStr "hello"
    putStrLn $ output ++ " world"
Note that it works by redirecting output to a temporary file and then reading it back. But as long as it works!

Why not just use a writer monad instead? For example,
import Control.Monad.Writer

doSomethingWithOutput :: WriterT String IO a -> IO ()
doSomethingWithOutput io = do
    (_, res) <- runWriterT io
    if res == "foo"
        then putStrLn "bar"
        else putStrLn "fail!"

main = doSomethingWithOutput (tell "foo")
Alternatively, you could modify your inner action to take a Handle to write to instead of stdout. You can then use something like knob to make an in-memory file handle which you can pass to the inner action, and check its contents afterward.
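The Handle-passing idea can be sketched without any extra in-memory handle library by using a temporary file as the target. This is only an illustration: greet and captureFrom are hypothetical names, and the temporary file is an assumption swapped in for knob.

```haskell
import Control.Exception (evaluate)
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO

-- The inner action writes to whatever Handle it is given instead of stdout.
greet :: Handle -> IO ()
greet h = hPutStr h "foo"

-- Run an action against a temporary file and return what it wrote.
captureFrom :: (Handle -> IO a) -> IO (a, String)
captureFrom action = do
    tmpDir <- getTemporaryDirectory
    (path, h) <- openTempFile tmpDir "capture.txt"
    result <- action h
    hClose h                        -- flushes the written data to the file
    out <- readFile path
    _ <- evaluate (length out)      -- finish reading before the file is removed
    removeFile path
    return (result, out)

main :: IO ()
main = do
    (_, out) <- captureFrom greet
    putStrLn (if out == "foo" then "bar" else "fail!")
```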

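As @hammar pointed out, you can use a knob to create an in-memory file, but you can also use hDuplicate and hDuplicateTo to change stdout to the memory file, and back again. Something like the following completely untested code:
import Control.Exception (bracket)
import Data.ByteString.Char8 (pack, unpack)
import Data.Knob (newFileHandle, newKnob)
import qualified Data.Knob
import GHC.IO.Handle (hDuplicate, hDuplicateTo)
import System.IO

catchOutput :: IO a -> IO (a, String)
catchOutput io = do
    knob <- newKnob (pack [])
    let before = do
            h <- newFileHandle knob "<stdout>" WriteMode
            stdout' <- hDuplicate stdout
            hDuplicateTo h stdout
            hClose h
            return stdout'
        after stdout' = do
            hDuplicateTo stdout' stdout
            hClose stdout'
    -- bracket (not bracket_) so that the saved handle reaches the cleanup action
    a <- bracket before after (\_ -> io)
    bytes <- Data.Knob.getContents knob
    return (a, unpack bytes)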
