Piping from stdin to process - haskell

I'm trying to pipe the stdin of my program to an external process using the following:
import System.IO
import System.Posix.IO
import System.Posix.Process
import qualified System.Process as P
import Control.Concurrent (forkIO, killThread)
import Control.Monad
main :: IO ()
main = do
  h <- fdToHandle stdInput
  (Just hIn, _, _, p) <-
    P.createProcess (P.shell "runhaskell echo.hs"){ P.std_in = P.CreatePipe }
  hSetBuffering hIn NoBuffering
  tid <- forkIO $ getInput hIn
  e <- P.waitForProcess p
  killThread tid
  print e
getInput hin = do
  forever $ do
    l <- getLine
    hPutStrLn hin l
Here echo.hs just echoes stdin to stdout. If I wait a couple of seconds between lines of input, I get the following error:
pipes.hs: <stdin>: hGetLine: invalid argument (Bad file descriptor)
When I tried compiling with ghc pipes.hs, the compiled program would not redirect stdin to the stdin of echo.hs at all.

Your fdToHandle stdInput call creates a new Handle pointing at file descriptor 0 (stdin) of the original process. After a while, the garbage collector notices that this Handle is no longer used and collects it, which in turn closes the underlying file descriptor. Your getLine call (which uses System.IO.stdin) then fails: the System.IO.stdin Handle is still open, but the file descriptor it points at has been closed.
FWIW, I'd recommend using binary I/O on the handles to avoid issues with character encodings.
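A minimal sketch of one way to apply this, assuming the goal is only to forward the program's own stdin to the child: skip fdToHandle entirely and read from System.IO.stdin, so no second Handle owns file descriptor 0. The names copyHandle and pipeStdinTo are made up for illustration, and the byte-wise copy (via the bytestring package) follows the binary I/O suggestion.

```haskell
import qualified Data.ByteString as B
import Control.Concurrent (forkIO, killThread)
import System.IO
import qualified System.Process as P

-- | Copy bytes from src to dst until src reaches EOF, then close dst
-- so the reader on the other end sees EOF as well.
-- (Hypothetical helper, not part of any library.)
copyHandle :: Handle -> Handle -> IO ()
copyHandle src dst = loop
  where
    loop = do
      chunk <- B.hGetSome src 4096      -- returns as soon as some bytes are available
      if B.null chunk
        then hClose dst                 -- empty chunk means EOF
        else B.hPut dst chunk >> hFlush dst >> loop

-- | Run a shell command with this program's stdin forwarded to it.
-- Note: no fdToHandle here, so nothing can close fd 0 behind our back.
pipeStdinTo :: String -> IO ()
pipeStdinTo cmd = do
  (Just hIn, _, _, p) <-
    P.createProcess (P.shell cmd) { P.std_in = P.CreatePipe }
  hSetBinaryMode stdin True
  hSetBinaryMode hIn True
  hSetBuffering hIn NoBuffering
  tid <- forkIO (copyHandle stdin hIn)
  e <- P.waitForProcess p
  killThread tid
  print e
```

With this in place, `main = pipeStdinTo "runhaskell echo.hs"` would correspond to the setup in the question.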

Related

Haskell. How to make my program to terminate?

I want to make an "asynchronous" reading of stdin for a few seconds. The way I am doing it is by forking getContents and writing to a Chan. After 5 seconds, I kill the thread and read the channel.
From my understanding, the code below should just print whatever is in chan and terminate, but it keeps waiting for input and ^C must be pressed to finish. This is the behaviour you get when doing getContents >>= print on its own, so I have two guesses and no clue about either:
The thread is not killed and getContents keeps running, asking for more input
Something weird is happening with multithreading (see considerations below)
-- OS: Ubuntu 22.04.1 LTS
-- Gnome Terminal
-- ghc 9.2.5
import Control.Concurrent.Chan ( newChan, readChan, writeChan, Chan )
import Control.Concurrent (threadDelay, forkIO, killThread)
import System.IO (getContents, BufferMode (..), stdin, hSetBuffering)
main :: IO ()
main = do
  hSetBuffering stdin NoBuffering
  chan <- newChan
  putStrLn "start"
  threadId <- forkIO $ getContents >>= writeChan chan
  threadDelay 5000000
  putStrLn "\nend"
  killThread threadId
  a <- readChan chan
  print a
Some considerations:
Using getLine makes it work only if Enter is pressed, but I'd like to just "stream" stdin into the channel.
hSetBuffering stdin NoBuffering is necessary; otherwise the program hangs (I guess waiting for end of input?).
Using getContents' causes a "thread blocked indefinitely in an MVar operation" error, which according to the documentation is due to the channel being empty. I guess getContents' never actually terminates.
Last but most importantly, the behaviour differs depending on how the program is compiled and run:
ghc -threaded main.hs && ./main +RTS -N2 won't print anything and will hang until ^C is pressed (same thing with -N1, -N3, etc.).
runghc main.hs will actually print whatever was typed on stdin during the 5 seconds (i.e. whatever is in chan) and then hang.
Just to clarify, here are the outputs:
> runghc main.hs
start
abc # user input
end
"abc^C" # ^C is done manually, to terminate the program, and the last \" is for formatting purposes
> ghc -threaded main.hs && ./main +RTS -N2
start
abc # user input
end
^C # ^C is done manually, to terminate the program
So the question is simple. How do I make my program to end?
The thread getContents >>= writeChan chan is not an infinite loop that constantly adds content to chan. getContents creates a thunk, which is put in chan, and the thread terminates near instantaneously. Then in the main thread readChan gets that thunk, and print a forces it. It's the forcing of the thunk which prompts reading stdin, hence your program just blocks for more input until EOF or it gets killed.
What you want to do is to explicitly take small bits of input and write them into the channel. However, in the main thread, the channel does not give you a way to tell when it's ended. A workaround is to use an IORef String instead as a channel. Write to it by explicitly appending to the stored string, and readIORef will give you whatever content was written so far.
import Control.Concurrent (threadDelay, forkIO, killThread)
import Control.Monad (forever)
import Data.IORef
import System.IO (BufferMode (..), stdin, hSetBuffering)
main :: IO ()
main = do
  hSetBuffering stdin NoBuffering
  buf <- newIORef []
  putStrLn "start"
  -- Read one character at a time and prepend it to the buffer
  threadId <- forkIO $ forever $ do
    c <- getChar
    atomicModifyIORef' buf (\cs -> (c : cs, ()))
  threadDelay 5000000
  putStrLn "\nend"
  killThread threadId
  -- Characters were prepended, so reverse to restore input order
  a <- reverse <$> readIORef buf
  print a
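If you would rather keep a channel-like interface, one possible variation (a sketch, not what the answer above uses) is a TQueue from the stm package, which can be drained without blocking via flushTQueue. The helper name collectFor is hypothetical, and whether killThread can interrupt a blocked read may depend on the runtime and on how stdin is attached.

```haskell
import Control.Concurrent (forkIO, killThread, threadDelay)
import Control.Concurrent.STM (atomically, flushTQueue, newTQueueIO, writeTQueue)
import Control.Monad (forever)

-- | Run @produce@ repeatedly in a background thread for @micros@
-- microseconds, then return everything it produced, oldest first.
-- (Hypothetical helper for illustration.)
collectFor :: Int -> IO a -> IO [a]
collectFor micros produce = do
  q <- newTQueueIO
  tid <- forkIO $ forever $ do
    x <- produce                     -- e.g. getChar
    atomically (writeTQueue q x)
  threadDelay micros
  killThread tid
  atomically (flushTQueue q)         -- never blocks; returns [] if nothing arrived
```

For the stdin case this could be used as `hSetBuffering stdin NoBuffering >> collectFor 5000000 getChar >>= print`, with hSetBuffering and stdin imported from System.IO.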

How to reliably determine if a Handle is a terminal in Haskell?

I am using System.IO.hIsTerminalDevice to determine if a Handle is a terminal and apply colorization if this is the case. I noticed that when forking a process using CreatePipe as the stream for the new process's stdin and stdout, this function returns True, which seems to be the wrong answer: a pipe should not be considered a terminal. I have tried to track down the issue by looking at the System.IO and System.Posix.IO source code, but it ends up in the C pipe function and leads me nowhere.
Is there a better way to tell if a handle is a terminal? Or am I doing something wrong?
Update
Here are 2 programs that are supposed to expose the behaviour I observed:
import Control.Monad
import System.IO
import System.Process
main = do
  (in_, Just out, Just err, h) <- createProcess $ (proc "./test2" [])
    { std_in = CreatePipe
    , std_err = CreatePipe
    , std_out = CreatePipe }
  dump out
  where
    dump h = forever $ do
      ln <- hGetLine h
      putStrLn ln
Then test2:
import System.IO
main = do
  print =<< hIsTerminalDevice stderr
  print =<< hIsTerminalDevice stdout
  print =<< hIsTerminalDevice stdin
Then running the first program:
$ ./test
False
False
False
I know what's happening: What I am forking is not the program itself but a docker container! And I explicitly add -t parameter which allocates a tty for the container...
Do you have a minimal example to illustrate the problem? The following program prints "False" twice, suggesting that handles created with CreatePipe are not misidentified as terminal devices:
import System.Process
import System.IO
main = do
  (Just in_, Just out, err, h) <-
    createProcess $ (shell "cat") { std_in = CreatePipe,
                                    std_out = CreatePipe }
  print =<< hIsTerminalDevice in_
  print =<< hIsTerminalDevice out
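For the colorization use case from the question, a small sketch of how the check might be wired in. The names withGreen and hPutGreen are hypothetical, and the escape sequences assume a terminal that understands ANSI colour codes.

```haskell
import System.IO (Handle, hIsTerminalDevice, hPutStrLn)

-- | Wrap text in ANSI green when colour is enabled; otherwise return it unchanged.
withGreen :: Bool -> String -> String
withGreen useColour s
  | useColour = "\ESC[32m" ++ s ++ "\ESC[0m"
  | otherwise = s

-- | Print a line to the handle, coloured only if the handle is a terminal.
hPutGreen :: Handle -> String -> IO ()
hPutGreen h s = do
  tty <- hIsTerminalDevice h
  hPutStrLn h (withGreen tty s)
```

Keeping the decision in a pure function makes it easy to check both branches without needing a real terminal.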

How to pipe output from an IO action into a process in haskell

I want to create a process and write some text from my Haskell program into the process's stdin periodically (from an IO action).
The following works correctly in GHCi but doesn't work correctly when built and run. In GHCi everything works perfectly and the value from the IO action is fed in periodically. When built and run, however, it seems to pause for arbitrarily long periods of time when writing to the stdin of the process.
I've used createProcess (from System.Process) to create the handle and tried hPutStrLn (with buffering set to NoBuffering; LineBuffering didn't work either).
So I'm trying the process-streaming package and pipes, but can't seem to get anything to work at all.
The real question is this: how do I create a process from Haskell and write to it periodically?
Minimal example that exhibits this behavior:
import System.IO (hSetBuffering, BufferMode (..))
import System.Process
import Data.IORef
import qualified Data.Text as T -- from the text package
import qualified Data.Text.IO as TIO
import Control.Concurrent.Timer -- from the timers package
import Control.Concurrent.Suspend -- from the suspend package
main = do
  (Just hin, _,_,_) <- createProcess_ "bgProcess" $
    (System.Process.proc "grep" ["10"]) { std_in = CreatePipe }
  ref <- newIORef 0 :: IO (IORef Int)
  flip repeatedTimer (msDelay 1000) $ do
    x <- atomicModifyIORef' ref $ \x -> (x + 1, x)
    hSetBuffering hin NoBuffering
    TIO.hPutStrLn hin $ T.pack $ show x
Any help will be greatly appreciated.
This is a pipes Producer that emits a sequence of numbers with a one-second delay between them:
{-# language NumDecimals #-}
import Control.Concurrent
import Pipes
import qualified Data.ByteString.Char8 as Bytes
periodic :: Producer Bytes.ByteString IO ()
periodic = go 0
  where
    go n = do
      d <- liftIO (pure (Bytes.pack (show n ++ "\n"))) -- put your IO action here
      Pipes.yield d
      liftIO (threadDelay 1e6)
      go (succ n)
And, using process-streaming, we can feed the producer to an external process like this:
import System.Process.Streaming
main :: IO ()
main = do
  executeInteractive (shell "grep 10"){ std_in = CreatePipe } (feedProducer periodic)
I used executeInteractive, which sets std_in automatically to NoBuffering.
Also, if you pipe std_out and want to process each match immediately, be sure to pass the --line-buffered option to grep (or use the stdbuf command) to ensure that matches are immediately available at the output.
What about using threadDelay, e.g.:
import Control.Monad (forever)
import Control.Concurrent (threadDelay)
...
forever $ do
  x <- atomicModifyIORef' ref $ \x -> (x + 1, x)
  hSetBuffering hin NoBuffering
  TIO.hPutStrLn hin $ T.pack $ show x
  threadDelay 1000000 -- 1 sec
Spawn this off in another thread if you need to do other work at the same time.
You can remove the need for the IORef with:
loop h x = do
  hSetBuffering h NoBuffering
  TIO.hPutStrLn h $ T.pack $ show x
  threadDelay 1000000
  loop h (x+1)
And, of course, you only need to call hSetBuffering once, e.g. just before you enter the loop.
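Putting those pieces together, a self-contained sketch might look like the following. The names feedNumbers and grepTens are made up for illustration, plain String output is used instead of Text to keep the imports minimal, and the range of numbers is arbitrary.

```haskell
import Control.Concurrent (threadDelay)
import Control.Monad (forM_)
import System.IO
import System.Process

-- | Write each number on its own line to the handle, pausing between writes.
-- Buffering is set once, before the loop.
feedNumbers :: Int -> [Int] -> Handle -> IO ()
feedNumbers delayMicros xs h = do
  hSetBuffering h NoBuffering
  forM_ xs $ \x -> do
    hPutStrLn h (show x)
    threadDelay delayMicros

-- | Start grep with a piped stdin and feed it one number per second.
grepTens :: IO ()
grepTens = do
  (Just hin, _, _, ph) <-
    createProcess (proc "grep" ["10"]) { std_in = CreatePipe }
  feedNumbers 1000000 [0 .. 120] hin
  hClose hin                 -- signal EOF so grep can exit
  _ <- waitForProcess ph
  pure ()
```

Calling `grepTens` from main keeps the main thread alive until grep has finished; wrap it in forkIO if other work has to happen at the same time.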

In Haskell, how to flush Data.Text every line?

How can I change this program to immediately process every line of text in case of interactive input? Preferably flush buffer every newline character.
main = do
  input <- T.getContents
  mapM_ T.putStrLn $ T.lines input
Update: Something is still missing. Take a look (???? is after newline, stdout is printed out after reaching EOF on stdin) :
> cat Test.hs
import System.IO
import Data.Text as T
import Data.Text.IO as T
main = do
  hSetBuffering stdout LineBuffering
  input <- T.getContents
  mapM_ T.putStrLn $ T.lines input
> runhaskell Test.hs
a
????
a
????
> runhaskell --version
runghc 7.6.3
>
You want to use hSetBuffering from System.IO:
import System.IO
import qualified Data.Text as T
import qualified Data.Text.IO as T
main = do
  hSetBuffering stdout LineBuffering
  input <- T.getContents
  mapM_ T.putStrLn $ T.lines input
It seems like you want to use lazy input to interleave reading lines and handling them.
getContents from Data.Text.IO is not lazy, and will read everything before returning anything at all.
Import the version from Data.Text.Lazy.IO instead.
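A sketch of what that switch could look like, written against arbitrary handles so it can be tried on files as well as on stdin and stdout. The name echoLazily is hypothetical; how promptly lines appear still depends on how the input side delivers data.

```haskell
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.IO as TLIO
import System.IO

-- | Echo lines from src to dst, reading the input lazily so that each
-- line can be written out before the rest of the input has arrived.
echoLazily :: Handle -> Handle -> IO ()
echoLazily src dst = do
  hSetBuffering dst LineBuffering        -- flush the output at every newline
  input <- TLIO.hGetContents src         -- lazy: read in chunks on demand
  mapM_ (TLIO.hPutStrLn dst) (TL.lines input)
```

The interactive program from the question would then be `main = echoLazily stdin stdout`.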

Parallel IO Causes Random Text Output in Terminal

I'm using
import Control.Concurrent.ParallelIO.Global
main = parallel_ (map processI [1..(sdNumber runParameters)]) >> stopGlobalPool
where
processI :: Int -> IO ()
is some function, which reads data from file, processes it and writes it to another file. No output to terminal. The problem is when I run the program with +RTS -N8 the terminal is flooded with random text like
piptufuht teata thtsieieo ocnsno e nscsdeoe qnqvuduee ernvnstetiirioasanlil lolwynya. .s
w
a s s uY Ysosopuuue's'nvpvdeeee n dpdp rerdodoub beada
bub lel y
What is happening? Without +RTS there is no clutter. I couldn't reproduce the behavior with a simpler program (one suitable to post here).
GHC 7.0.3 if that matters
Buffering is probably preventing you from constructing a simple test case. I was able to reproduce it with this (only when run with +RTS -Nsomething):
import Control.Concurrent
import System.IO
main :: IO ()
main = do
  hSetBuffering stdout NoBuffering
  forkIO $ putStrLn "foo"
  forkIO $ putStrLn "bar"
  forkIO $ putStrLn "baz"
  threadDelay 1000 -- Allow things to print
As Thomas mentioned, you'll probably need to sequence this somehow, though I'm not sure how writing straight to files would change this. Here's a simple example of how you can sequence the writes with a Chan. I'm sure there's a better way to do it; this is just how I got the output to stop being garbled.
import Control.Concurrent
import Control.Concurrent.Chan
import System.IO
main :: IO ()
main = do
  hSetBuffering stdout NoBuffering
  ch <- newChan -- Things written here are picked up by stuffWriter
  forkIO $ stuffWriter ch -- Fire up concurrent stuffWriter
  forkIO $ writeChan ch "foo"
  forkIO $ writeChan ch "bar"
  forkIO $ writeChan ch "baz"
  threadDelay 1000 -- Allow things to print
-- | Write all the things!
stuffWriter :: Chan String -> IO ()
stuffWriter ch = do
  readChan ch >>= putStrLn -- Block, then write once I've got something
  stuffWriter ch -- loop... looking for more things to write
Your writes are now serialized (stuffWriter writes things one at a time), and you should have no more garbling.
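Another way to get the same effect, sketched here as an alternative to the Chan approach, is to guard putStrLn with an MVar used as a lock, so that only one thread can be in the middle of a write at any time. The names newAtomicPutStrLn and printConcurrently are hypothetical. This version also waits for each thread to finish rather than relying on threadDelay.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, newMVar, putMVar, takeMVar, withMVar)

-- | Build a putStrLn that only one thread can be inside at a time.
newAtomicPutStrLn :: IO (String -> IO ())
newAtomicPutStrLn = do
  lock <- newMVar ()
  pure (\s -> withMVar lock (\_ -> putStrLn s))

-- | Print each string from its own thread, then wait for all of them.
printConcurrently :: [String] -> IO ()
printConcurrently msgs = do
  say <- newAtomicPutStrLn
  dones <- mapM (spawn say) msgs
  mapM_ takeMVar dones          -- wait for completion instead of guessing a delay
  where
    spawn say msg = do
      done <- newEmptyMVar
      _ <- forkIO (say msg >> putMVar done ())
      pure done
```

For example, `printConcurrently ["foo", "bar", "baz"]` prints the three lines in some order, with each line left intact.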

Resources