As a learning exercise, I created a small shell in Haskell that supports background jobs and simple stdout redirection. However, there seems to be a race condition in my code that I don't understand. I'm using System.Posix.
-- ... imports ...
data Redirection = None
| StdOut String
deriving (Show)
data Command = Command
{ cProgramName :: String
, cProgramArgs :: [String]
, cRedirection :: Redirection
, cBackground :: Bool
} deriving (Show)
data Sync = Sync
{ sMVar :: MVar ()
, sRunningJobs :: IORef [ProcessID]
, sDeadJobs :: IORef [ProcessID]
}
withLock :: Sync -> IO () -> IO ()
withLock (Sync mvar _ _) action = do
putMVar mvar ()
action
takeMVar mvar
-- ... parser stuff ...
resolveLine :: Sync -> String -> IO Bool
resolveLine _ "quit" = return False
resolveLine sync line = do
case P.parse lineParser "" line of
Left err -> putStrLn (show err)
Right Nothing -> return ()
Right (Just (Command prog args redirect bg)) ->
case Map.lookup prog builtins of
(Just cmd) -> cmd sync args
Nothing -> do
pid <- forkProcess $ do
case redirect of
None -> return ()
StdOut target -> do
fd <- openFd target WriteOnly (Just 420) defaultFileFlags { trunc = True }
void $ dupTo fd stdOutput
closeFd fd
executeFile prog True args Nothing
when bg $
withLock sync $ modifyIORef (sRunningJobs sync) (pid:)
return True
runShell :: Sync -> IO ()
runShell sync = do
wd <- getCurrentDirectory
hd <- getHomeDirectory
line <- readline $ shortenDirectory hd wd ++ " > "
case line of
Nothing -> putStrLn "" >> return ()
Just line -> do
addHistory line
-- Reap dead children.
withLock sync $ readIORef (sDeadJobs sync) >>= mapM_
(\pid -> do
getProcessStatus True True pid
modifyIORef (sDeadJobs sync) (delete pid)
)
go <- resolveLine sync line
when go $ runShell sync
where shortenDirectory hd wd = if hd `isPrefixOf` wd
then "~" ++ drop (length hd) wd
else wd
handleSigChild :: Sync -> SignalInfo -> IO ()
handleSigChild sync si = do
withLock sync $ do
modifyIORef (sRunningJobs sync) (delete pid)
modifyIORef (sDeadJobs sync) (pid:)
where
pid = siginfoPid $ siginfoSpecific si
main :: IO ()
main = do
mvar <- newEmptyMVar
rjobs <- newIORef []
djobs <- newIORef []
void $ installHandler sigCHLD (CatchInfo $ handleSigChild $ Sync mvar rjobs djobs) Nothing
runShell $ Sync mvar rjobs djobs
Sometimes, when I run kill on a background job, the handleSigChild function is not called which results in a zombie process, because the child's pid is not added to the shared list of dead processes that will be reaped later. I'm running the kill command from within the shell.
Why is the handler executed only once? Is it because the kill process itself and the killed process both send the signal almost at the same time? How do I fix it?
Related
I've just encountered withSession :: (Session -> IO a) -> IO a of wreq package. I want to evaluate the continuation line by line, but I can't find any way for this.
import Network.Wreq.Session as S
withSession $ \sess -> do
res <- S.getWith opts sess "http://stackoverflow.com/questions"
-- print res
-- .. other things
In above snippet how can I evaluate print res in ghci? In other words, can I get Session type in ghci?
Wonderful question.
I am aware of no methods that can re-enter the GHCi REPL, so that we can use that in CPS functions. Perhaps others can suggest some way.
However, I can suggest an hack. Basically, one can exploit concurrency to turn CPS inside out, if it is based on the IO monad as in this case.
Here's the hack: use this in a GHCi session
> sess <- newEmptyMVar :: IO (MVar Session)
> stop <- newEmptyMVar :: IO (MVar ())
> forkIO $ withSession $ \s -> putMVar sess s >> takeMVar stop
> s <- takeMVar sess
> -- use s here as if you were inside withSession
> let s = () -- recommended
> putMVar stop ()
> -- we are now "outside" withSession, don't try to access s here!
A small library to automatize the hack:
data CPSControl b = CPSControl (MVar ()) (MVar b)
startDebugCps :: ((a -> IO ()) -> IO b) -> IO (a, CPSControl b)
startDebugCps cps = do
cpsVal <- newEmptyMVar
retVal <- newEmptyMVar
stop <- newEmptyMVar
_ <- forkIO $ do
x <- cps $ \c -> putMVar cpsVal c >> takeMVar stop
putMVar retVal x
s <- takeMVar cpsVal
return (s, CPSControl stop retVal)
stopDebugCps :: CPSControl b -> IO b
stopDebugCps (CPSControl stop retVal) = do
putMVar stop ()
takeMVar retVal
testCps :: (String -> IO ()) -> IO String
testCps act = do
putStrLn "testCps: begin"
act "here's some string!"
putStrLn "testCps: end"
return "some return value"
A quick test:
> (x, ctrl) <- startDebugCps testCps
testCps: begin
> x
"here's some string!"
> stopDebugCps ctrl
testCps: end
"some return value"
First, because this is about a specific case, I haven't reduced the code at all, so it will be quite long, and in 2 parts (Helper module, and the main).
SpawnThreads in ConcurHelper takes a list of actions, forks them, and gets an MVar containing the result of the action. It them combines the results, and returns the resulting list. It works fine in certain cases, but blocks indefinitely on others.
If I give it a list of putStrLn actions, it executes them fine, then returns the resulting ()s (yes, I know running print commands on different threads at the same time is bad in most cases).
If I try running multiTest in Scanner though (which takes either scanPorts or scanAddresses, the scan range, and the number of threads to use; then splits the scan range over the threads, and passes the list of actions to SpawnThreads), it will block indefinitely. The odd thing is, according to the debug prompts scattered around ConcurHelper, on each thread, ForkIO is returning before the MVar is filled. This would make sense if it wasn't in a do block, but shouldn't the actions be performed sequentially? (I don't know if this is related to the problem or not; it's just something I noticed while attempting to debug it).
I've thought it out step by step, and if it's executing in the order laid out in spawnThreads, the following should happen:
An empty MVar should be created inside forkIOReturnMVar, and passed to mVarWrapAct.
mVarWrapAct should execute the action, and put the result in the MVar (this is where the problem seems to lie. "MVar filled" is never shown, suggesting the MVar is never put into)
getResults should then take from the resulting list of MVars, and return the results
If point #2 isn't the issue, I can see where the problem would be (and if it is the issue, I can't see why putMVar never executes. Inside the scanner module, the only real function of interest for this question is multiTest. I only included the rest so it could be run).
To do a simple test, you can run the following:
spawnThreads [putStrLn "Hello", putStrLn "World"] (should return [(),()])
multiTest (scanPorts "127.0.0.1") 1 (0,5) (Creates the MVar, hangs for a sec, then crashes with the aforementioned error)
Any help in understanding whats going on here would be appreciated. I can't see what the difference between the 2 use cases are.
Thank you
(And I'm using this atrocious exception handling system because IO errors don't give codes for specific network exceptions, so I've been left with parsing messages to find out what happened)
Main:
module Scanner where
import Network
import Network.Socket
import System.IO
import Control.Exception
import Control.Concurrent
import ConcurHelper
import Data.Maybe
import Data.Char
import NetHelp
data NetException = NetNoException | NetTimeOut | NetRefused | NetHostUnreach
| NetANotAvail | NetAccessDenied | NetAddrInUse
deriving (Show, Eq)
diffExcept :: Either SomeException Handle -> Either NetException Handle
diffExcept (Right h) = Right h
diffExcept (Left (SomeException m))
| err == "WSAETIMEDOUT" = Left NetTimeOut
| err == "WSAECONNREFUSED" = Left NetRefused
| err == "WSAEHOSTUNREACH" = Left NetHostUnreach
| err == "WSAEADDRNOTAVAIL" = Left NetANotAvail
| err == "WSAEACCESS" = Left NetAccessDenied
| err == "WSAEADDRINUSE" = Left NetAddrInUse
| otherwise = error $ show m
where
err = reverse . dropWhile (== ')') . reverse . dropWhile (/='W') $ show m
extJust :: Maybe a -> a
extJust (Just a) = a
selectJusts :: IO [Maybe a] -> IO [a]
selectJusts mayActs = do
mays <- mayActs; return . map extJust $ filter isJust mays
scanAddresses :: Int -> Int -> Int -> IO [String]
scanAddresses port minAddr maxAddr =
selectJusts $ mapM (\addr -> do
let sAddr = "192.168.1." ++ show addr
print $ "Trying " ++ sAddr ++ " " ++ show port
connection <- testConn sAddr port
if isJust connection
then do hClose $ extJust connection; return $ Just sAddr
else return Nothing) [minAddr..maxAddr]
scanPorts :: String -> Int -> Int -> IO [Int]
scanPorts addr minPort maxPort =
selectJusts $ mapM (\port -> do
--print $ "Trying " ++ addr ++ " " ++ show port
connection <- testConn addr port
if isJust connection
then do hClose $ extJust connection; return $ Just port
else return Nothing) [minPort..maxPort]
main :: IO ()
main = do
withSocketsDo $ do
putStrLn "Scan Addresses or Ports? (a/p)"
choice <- getLine
if (toLower $ head choice) == 'a'
then do
putStrLn "On what port?"
sPort <- getLine
addrs <- scanAddresses (read sPort :: Int) 0 255
print addrs
else do
putStrLn "At what address?"
address <- getLine
ports <- scanPorts address 0 9999
print ports
main
testConn :: HostName -> Int -> IO (Maybe Handle)
testConn host port = do
result <- try $ timedConnect 1 host port
let result' = diffExcept result
case result' of
Left e -> do putStrLn $ "\t" ++ show e; return Nothing
Right h -> return $ Just h
setPort :: AddrInfo -> Int -> AddrInfo
setPort addInf nPort = case addrAddress addInf of
(SockAddrInet _ host) -> addInf { addrAddress = (SockAddrInet (fromIntegral nPort) host)}
getHostAddress :: HostName -> Int -> IO SockAddr
getHostAddress host port = do
addrs <- getAddrInfo Nothing (Just host) Nothing
let adInfo = head addrs
newAdInfo = setPort adInfo port
return $ addrAddress newAdInfo
timedConnect :: Int -> HostName -> Int -> IO Handle
timedConnect time host port = do
s <- socket AF_INET Stream defaultProtocol
setSocketOption s RecvTimeOut time; setSocketOption s SendTimeOut time
addr <- getHostAddress host port
connect s addr
socketToHandle s ReadWriteMode
multiTest :: (Int -> Int -> IO a) -> Int -> (Int, Int) -> IO [a]
multiTest partAction threads (mi,ma) =
spawnThreads $ recDiv [mi,perThread..ma]
where
perThread = ((ma - mi) `div` threads) + 1
recDiv [] = []
recDiv (curN:restN) =
partAction (curN + 1) (head restN) : recDiv restN
Helper:
module ConcurHelper where
import Control.Concurrent
import System.IO
spawnThreads :: [IO a] -> IO [a]
spawnThreads actions = do
ms <- mapM (\act -> do m <- forkIOReturnMVar act; return m) actions
results <- getResults ms
return results
forkIOReturnMVar :: IO a -> IO (MVar a)
forkIOReturnMVar act = do
m <- newEmptyMVar
putStrLn "Created MVar"
forkIO $ mVarWrapAct act m
putStrLn "Fork returned"
return m
mVarWrapAct :: IO a -> MVar a -> IO ()
mVarWrapAct act m = do a <- act; putMVar m a; putStrLn "MVar filled"
getResults :: [MVar a] -> IO [a]
getResults mvars = do
unpacked <- mapM (\m -> do r <- takeMVar m; return r) mvars
putStrLn "MVar taken from"
return unpacked
Your forkIOReturnMVar isn't exception safe: whenever act throws, the MVar isn't going to be filled.
Minimal example
import ConcurHelper
main = spawnThreads [badOperation]
where badOperation = do
error "You're never going to put something in the MVar"
return True
As you can see, badOperation throws, and therefore the MVar won't get filled in mVarWrapAct.
Fix
Fill the MVar with an appropriate value if you encounter an exception. Since you cannot provide a default value for all possible types a, it's better to use MVar (Maybe a) or MVar (Either b a) as you already do in your network code.
In order to catch the exceptions, use one of the operations provided in Control.Exception. For example, you could use onException:
mVarWrapAct :: IO a -> MVar (Maybe a) -> IO ()
mVarWrapAct act m = do
onException (act >>= putMVar m . Just) (putMVar m Nothing)
putStrLn "MVar filled"
However, you might want to preserve the actual exception for more information. In this case you could simply use catch together with Either SomeException a :
mVarWrapAct :: IO a -> MVar (Either SomeException a) -> IO ()
mVarWrapAct act m = do
catch (act >>= putMVar m . Right) (putMVar m . Left)
putStrLn "MVar filled"
I would like to optionally abort a getChar action.
I need the following function:
getChar' :: (Char -> IO ()) -> IO (IO ())
In case of abort <- getChar' callback , a character is read from standard input, unless abort is called before a character is available.
If a character is read, callback is called with it.
I have the following prototype implementation:
import Control.Monad
import Control.Concurrent
getChar' :: (Char -> IO ()) -> IO (IO ())
getChar' callback = do
v <- newEmptyMVar
tid <- forkIO $ do
c <- getChar
b <- tryPutMVar v ()
when b $ callback c
return $ do
b <- tryPutMVar v ()
when b $ killThread tid
The problem is that killThread may abort the thread after reading the char but before putting () into the MVar.
I have no idea how to solve this problem, is it possible at all with the base package?
If not, have you seen a similar function implemented in other packages?
I think the easiest way to achieve this is to perform your own buffering. Here's a simple prototype. It assumes that you call launchIOThread exactly once in your program. It doesn't handle EOF or other IO exceptions, but that should be easy.
import Control.Concurrent
import Control.Concurrent.STM
import Data.Maybe
import Control.Monad
type Buffer = TVar (Maybe Char)
launchIOThread :: IO Buffer
launchIOThread = do
buf <- atomically $ newTVar Nothing
_ <- forkIO $ ioThread buf
return buf
ioThread :: Buffer -> IO ()
ioThread buf = loop where
loop =
join $ atomically $ do
contents <- readTVar buf
if isJust contents -- no-one has taken the character yet
then retry -- relax
else return $ do
c <- getChar
atomically $ writeTVar buf (Just c)
loop
getChar' :: Buffer -> (Char -> IO ()) -> IO (IO ())
getChar' buf callback = do
abortFlag <- atomically $ newTVar False
_ <- forkIO $ doGetChar abortFlag
return $ atomically $ writeTVar abortFlag True
where
doGetChar abortFlag = join $ atomically $ do
mbC <- readTVar buf
abort <- readTVar abortFlag
case mbC of
Just c ->
do writeTVar buf Nothing; return $ callback c
Nothing | abort -> return $ return ()
_ -> retry
What you want to do is use exception-handling constructs such that regardless of exceptions, the MVar is always left in a safe state. In particular, you probably want withMVar.
while the following code:
postImportR = do
fi <- lookupFiles "file"
fc <- lift $ fileSource (fi !! 0) $$ consume
seems to work (at least can I "liftIO $ print fc), splitting it off to a function for iterating doesn't:
process :: [FileInfo] -> [String]
process [] = []
process (f:r) = do
fn <- fileName f
fc <- lift $ fileSource f $$ consume
([fn] : (process r))
postImportR = do
fi <- lookupFiles "file"
process fi
or even with a lambda function:
files <- L.map (\f -> (fileName f, lift $ fileSource f $$ consume)) fi
in the Handler it gives me a type error I don't understand.
Where's my fault -- liked to generate content for database import from the file's lines (and to learn some more Haskell, of course).
You have
fileName :: FileInfo -> Text
so you can't directly use fileName in a do-block like
fn <- fileName f
That would need to be a let-binding
let fn = fileName f
The next thing is what makes process :: [FileInfo] -> [String] impossible(1),
fileSource :: FileInfo -> Source (ResourceT IO) ByteString
so with
fc <- lift $ fileSource f $$ consume
you are in a do-block whose Monad is a MonadIO, and you can't get out of a Monad that can wrap arbitrary IO-actions, just like you can't get out of IO itself.
What you can have is
process :: (SomeFiendishConstraint m) => [FileInfo] -> m [Text]
process [] = return []
process (f:r) = do
let fn = fileName f
lift $ fileSource f $$ consume
fs <- process r
return (fn : fs)
or, more succinctly,
process = mapM (\f -> lift $ fileSource f $$ consume >> return fileName f)
and then
postImportR = do
fi <- lookupFiles "file"
process fi
(1) Barring unsafePerformIO.
I call an external program inside a function. Now i would like to timeout this function and not just the external program. But after the function times out, the external program is still running on my computer (i'm using debian) until it finishes its computation, after that its thread still remains in the process table as a subthread of my main program until the main program terminates.
Here are two minimal examples which illustrates what i would like to do. The first uses unsafePerformIO, the second is completely in the IO monad. I don't really depend on the unsafePerformIO but would like to keep it if possible. The described problem occures with and without it.
With unsafePerformIO
module Main where
import System.Timeout
import Criterion.Measurement
import System.IO.Unsafe
import System.Process
main = do
x <- time $ timeoutP (1 * 1000000) $ mytest 2
y <- getLine
putStrLn $ show x ++ y
timeoutP :: Int -> a -> IO (Maybe a)
timeoutP t fun = timeout t $ return $! fun
mytest :: Int -> String
mytest n =
let
x = runOnExternalProgram $ n * 1000
in
x ++ ". Indeed."
runOnExternalProgram :: Int -> String
runOnExternalProgram n = unsafePerformIO $ do
-- convert the input to a parameter of the external program
let x = show $ n + 12
-- run the external program
-- (here i use "sleep" to indicate a slow computation)
answer <- readProcess "sleep" [x] ""
-- convert the output as needed
let verboseAnswer = "External program answered: " ++ answer
return verboseAnswer
Without unsafePerformIO
module Main where
import System.Timeout
import Criterion.Measurement
import System.IO.Unsafe
import System.Process
main = do
x <- time $ timeout (1 * 1000000) $ mytest 2
y <- getLine
putStrLn $ show x ++ y
mytest :: Int -> IO String
mytest n = do
x <- runOnExternalProgram $ n * 1000
return $ x ++ ". Indeed."
runOnExternalProgram :: Int -> IO String
runOnExternalProgram n = do
-- convert the input to a parameter for the external program:
let x = show $ n + 12
-- run the external program
-- (here i use "sleep" to indicate a slow computation):
answer <- readProcess "sleep" [x] ""
-- convert the output as needed:
let verboseAnswer = "External program answered: " ++ answer
return verboseAnswer
Maybe bracket can be of help here, but i don't really know how.
Edit: I adopted John L's answer. Now i am using the following:
import Control.Concurrent
import Control.Exception
import System.Exit
import System.IO
import System.IO.Error
import System.Posix.Signals
import System.Process
import System.Process.Internals
safeCreateProcess :: String -> [String] -> StdStream -> StdStream -> StdStream
-> ( ( Maybe Handle
, Maybe Handle
, Maybe Handle
, ProcessHandle
) -> IO a )
-> IO a
safeCreateProcess prog args streamIn streamOut streamErr fun = bracket
( do
h <- createProcess (proc prog args)
{ std_in = streamIn
, std_out = streamOut
, std_err = streamErr
, create_group = True }
return h
)
-- "interruptProcessGroupOf" is in the new System.Process. Since some
-- programs return funny exit codes i implemented a "terminateProcessGroupOf".
-- (\(_, _, _, ph) -> interruptProcessGroupOf ph >> waitForProcess ph)
(\(_, _, _, ph) -> terminateProcessGroup ph >> waitForProcess ph)
fun
{-# NOINLINE safeCreateProcess #-}
safeReadProcess :: String -> [String] -> String -> IO String
safeReadProcess prog args str =
safeCreateProcess prog args CreatePipe CreatePipe Inherit
(\(Just inh, Just outh, _, ph) -> do
hPutStr inh str
hClose inh
-- fork a thread to consume output
output <- hGetContents outh
outMVar <- newEmptyMVar
forkIO $ evaluate (length output) >> putMVar outMVar ()
-- wait on output
takeMVar outMVar
hClose outh
return output
-- The following would be great, if some programs did not return funny
-- exit codes!
-- ex <- waitForProcess ph
-- case ex of
-- ExitSuccess -> return output
-- ExitFailure r ->
-- fail ("spawned process " ++ prog ++ " exit: " ++ show r)
)
terminateProcessGroup :: ProcessHandle -> IO ()
terminateProcessGroup ph = do
let (ProcessHandle pmvar) = ph
ph_ <- readMVar pmvar
case ph_ of
OpenHandle pid -> do -- pid is a POSIX pid
signalProcessGroup 15 pid
otherwise -> return ()
This solves my problem. It kills all child processes of the spawned process and that at the right time.
Kind regards.
Edit: it is possible to get the pid of the spawned process. You can do so with code like the following:
-- highly non-portable, and liable to change between versions
import System.Process.Internals
-- from the finalizer of the bracketed function
-- `ph` is a ProcessHandle as returned by createProcess
(\(_,_,_,ph) -> do
let (ProcessHandle pmvar) = ph
ph_ <- takeMVar pmvar
case ph_ of
OpenHandle pid -> do -- pid is a POSIX pid
... -- do stuff
putMVar pmvar ph_
If you kill the process, instead of putting the open ph_ into the mvar you should create an appropriate ClosedHandle and put that back instead. It's important that this code executes masked (bracket will do this for you).
Now that you have a POSIX id you can use system calls or shell out to kill as necessary. Just be careful that your Haskell executable isn't in the same process group if you go that route.
/end edit
This behavior seems sort of sensible. The documentation for timeout claims that it doesn't work at all for non-Haskell code, and indeed I don't see any way that it could generically. What's happening is that readProcess spawns a new process, but then is timed out while waiting for output from that process. It seems that readProcess doesn't terminate the spawned process when it's aborted abnormally. This could be a bug in readProcess, or it could be by design.
As a workaround, I think you'll need to implement some of this yourself. timeout works by raising an async exception in a spawned thread. If you wrap your runOnExternalProgram in an exception handler, you'll get the behavior you want.
The key function here is the new runOnExternalProgram, which is a combination of your original function and readProcess. It would be better (more modular, more reusable, more maintainable) to make a new readProcess that kills the spawned process when an exception is raised, but I'll leave that as an exercise.
module Main where
import System.Timeout
import Criterion.Measurement
import System.IO.Unsafe
import System.Process
import Control.Exception
import System.IO
import System.IO.Error
import GHC.IO.Exception
import System.Exit
import Control.Concurrent.MVar
import Control.Concurrent
main = do
x <- time $ timeoutP (1 * 1000000) $ mytest 2
y <- getLine
putStrLn $ show x ++ y
timeoutP :: Int -> IO a -> IO (Maybe a)
timeoutP t fun = timeout t $ fun
mytest :: Int -> IO String
mytest n = do
x <- runOnExternalProgram $ n * 1000
return $ x ++ ". Indeed."
runOnExternalProgram :: Int -> IO String
runOnExternalProgram n =
-- convert the input to a parameter of the external program
let x = show $ n + 12
in bracketOnError
(createProcess (proc "sleep" [x]){std_in = CreatePipe
,std_out = CreatePipe
,std_err = Inherit})
(\(Just inh, Just outh, _, pid) -> terminateProcess pid >> waitForProcess pid)
(\(Just inh, Just outh, _, pid) -> do
-- fork a thread to consume output
output <- hGetContents outh
outMVar <- newEmptyMVar
forkIO $ evaluate (length output) >> putMVar outMVar ()
-- no input in this case
hClose inh
-- wait on output
takeMVar outMVar
hClose outh
-- wait for process
ex <- waitForProcess pid
case ex of
ExitSuccess -> do
-- convert the output as needed
let verboseAnswer = "External program answered: " ++ output
return verboseAnswer
ExitFailure r ->
ioError (mkIOError OtherError ("spawned process exit: " ++ show r) Nothing Nothing) )