I need to profile a large number of Haskell executables, hopefully in parallel. I was able to get the wall-clock time with measure and measTime from the Criterion library, but couldn't get measCpuTime or any GC report to work (measCpuTime returns a time that's impossibly short). The code looks like:
buildProj :: FilePath -> IO ExitCode
buildProj projDir = system $ "cd " ++ projDir ++ "; cabal sandbox init; cabal configure; cabal build"

-- Time a project
instance NFData ExitCode where
  rnf ExitSuccess = ()
  rnf (ExitFailure _) = ()

benchmark :: FilePath -> Int64 -> IO Double
benchmark projDir runs = do
  let runProj = "./" ++ projDir ++ "/dist/build/" ++ projDir ++ "/" ++ projDir ++ "> /dev/null"
  exit <- timeout 17000000 $ system runProj -- TODO hardcode timeout
  case exit of
    Just ExitSuccess -> do
      (m, _) <- measure (nfIO $ system runProj) runs
      return $! measTime m
    Just (ExitFailure _) -> return 100
    Nothing -> return 100
In short, I'm running the executables with System.Process.system as an IO action and I've declared ExitCode as NFData in order to get nfIO to work. What have I done wrong? Are there better tools to do the task?
The file's here if you want to play with it.
I took a look at this SO question and got some ideas. First, note that criterion uses cbits to call system-dependent CPU time functions. As far as I can tell, those report the CPU time of the benchmarking process itself, so work done in a child process started with system is not counted, which would explain the impossibly short times. Let's pretend you're on Unix. The simplest thing to do is to read the cutime field (and cstime, if you also want kernel time) from /proc/PID/stat at the start and end of your runs and take the difference. Beyond that, you can take the C code provided in that question, link it in yourself as a foreign import, and call it directly from your own code.
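A minimal sketch of the start/end difference idea, assuming a POSIX system and the unix and process packages. Rather than parsing /proc by hand, it uses getProcessTimes, which wraps times(2) and exposes the same accumulated child times. The names childCpuSeconds and timeCommand are hypothetical helpers, not part of criterion.

```haskell
import System.Exit (ExitCode (..))
import System.Posix.Process (ProcessTimes (..), getProcessTimes)
import System.Posix.Unistd (SysVar (ClockTick), getSysVar)
import System.Process (system)

-- Run an action and report the CPU seconds (user + system) consumed by
-- child processes that were started and waited for while it ran.
childCpuSeconds :: IO a -> IO (a, Double)
childCpuSeconds action = do
  ticksPerSec <- getSysVar ClockTick
  before <- getProcessTimes
  result <- action
  after <- getProcessTimes
  let ticks t = realToFrac (childUserTime t + childSystemTime t) :: Double
  return (result, (ticks after - ticks before) / fromIntegral ticksPerSec)

-- Hypothetical convenience wrapper around System.Process.system.
timeCommand :: String -> IO (ExitCode, Double)
timeCommand cmd = childCpuSeconds (system cmd)
```

Child times are accumulated per parent process, so if several executables are timed concurrently from the same process, their figures will be mixed together. Running each measurement from its own process would avoid that.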
Related
I'm trying to learn how to work with IO in Haskell by writing a function that, if there is a flag, will take a list of points from a file, and if there is no flag, it asks the user to enter them.
dispatch :: [String] -> IO ()
dispatch argList = do
if "file" `elem` argList
then do
let (path : otherArgs) = argList
points <- getPointsFile path
else
print "Enter a point in the format: x;y"
input <- getLine
if (input == "exit")
then do
print "The user inputted list:"
print $ reverse xs
else (inputStrings (input:xs))
if "help" `elem` argList
then help
else return ()
dispatch [] = return ()
dispatch _ = error "Error: invalid args"
getPointsFile :: String -> IO ([(Double, Double)])
getPointsFile path = do
  handle <- openFile path ReadMode
  contents <- hGetContents handle
  let points_str = lines contents
  let points = foldl (\l d -> l ++ [tuplify2 $ splitOn ";" d]) [] points_str
  hClose handle
  return points
I get this error: "do-notation in pattern. Possibly caused by a missing 'do'?" It is reported just after the line if "file" `elem` argList.
I'm also worried about scoping. Suppose I add another flag that selects which method is used to process the points. That code needs points, but I don't know how to make points visible outside the if/then/else construct. In an imperative language I would write something like:
init points
if ... { points = a}
else points = b
some actions with points
How can I do something similar in Haskell?
Here's a fairly minimal example that I've done half a dozen times when I'm writing something quick and dirty, don't have a complicated argument structure, and so can't be bothered to do a proper job of setting up one of the usual command-line parsing libraries. It doesn't explain what went wrong with your approach -- there's an existing good answer there -- it's just an attempt to show what this kind of thing looks like when done idiomatically.
import System.Environment
import System.Exit
import System.IO

main :: IO ()
main = do
  args <- getArgs
  pts <- case args of
    ["--help"] -> usage stdout ExitSuccess
    ["--file", f] -> getPointsFile f
    [] -> getPointsNoFile
    _ -> usage stderr (ExitFailure 1)
  print (frobnicate pts)

usage :: Handle -> ExitCode -> IO a
usage h c = do
  nm <- getProgName
  hPutStrLn h $ "Usage: " ++ nm ++ " [--file FILE]"
  hPutStrLn h $ "Frobnicate the points in FILE, or from stdin if no file is supplied."
  exitWith c

getPointsFile :: FilePath -> IO [(Double, Double)]
getPointsFile = {- ... -}

getPointsNoFile :: IO [(Double, Double)]
getPointsNoFile = {- ... -}

frobnicate :: [(Double, Double)] -> Double
frobnicate = {- ... -}
if in Haskell doesn't inherently have anything to do with control flow; it just switches between expressions. In Haskell, those expressions happen to include do blocks of statements (if we want to call them that), but you still always need to make that explicit, i.e. you need to say both then do and else do if there are multiple statements in each branch.
Also, all the statements in a do block need to be indented to the same level. So in your case the two if expressions have to line up:
    if "file" `elem` argList
      ...
    if "help" `elem` argList
Alternatively, if the help check should only happen in the else branch, it needs to be indented to the level of the statements in that do block.
Independent of all that, I would recommend avoiding parsing anything in an IO context. It is usually much less hassle, and easier to test, to first parse the strings into a pure data structure, which can then easily be processed by the part of the code that does IO. There are libraries like cmdargs and optparse-applicative that help with the parsing part.
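As a rough illustration of parsing into a pure data structure first, here is a small sketch using only base. The names Source, parseArgs, parsePoint and loadPoints are made up for this example and the flag syntax is an assumption. It also shows how the result of a branching expression is bound once and then stays in scope for the rest of the do block.

```haskell
import Data.Maybe (mapMaybe)

-- Hypothetical description of where the points come from.
data Source = FromFile FilePath | FromStdin
  deriving (Eq, Show)

-- Pure argument parsing: no IO involved, so it can be tested directly.
parseArgs :: [String] -> Either String Source
parseArgs ["--file", path] = Right (FromFile path)
parseArgs []               = Right FromStdin
parseArgs other            = Left ("invalid args: " ++ unwords other)

-- Parse one line of the form "x;y" into a point; malformed lines give Nothing.
parsePoint :: String -> Maybe (Double, Double)
parsePoint s = case break (== ';') s of
  (xs, ';' : ys) -> case (reads xs, reads ys) of
    ([(x, "")], [(y, "")]) -> Just (x, y)
    _                      -> Nothing
  _ -> Nothing

-- The only part that does IO. The case expression yields a value that is
-- bound to contents and is visible to everything after it.
loadPoints :: Source -> IO [(Double, Double)]
loadPoints src = do
  contents <- case src of
    FromFile path -> readFile path
    FromStdin     -> getContents
  return (mapMaybe parsePoint (lines contents))
```

A main function would then call parseArgs on the result of getArgs, report a Left as a usage error, and pass a Right to loadPoints.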
Could you please help me with the Turtle library?
I want to write a simple program that calculates disk space usage.
Here is the code:
getFileSize :: FilePath -> IO Size
getFileSize f = do
  status <- stat f
  return $ fileSize status

main = sh $ do
  let sizes = fmap getFileSize $ find (suffix ".hs") "."
So now I have sizes bound with type Shell (IO Size). But I can't just sum it with a sum fold, because there is an IO Size in there. If it were something like [IO Size], I could pull the IO monad out by using sequence to transform it to IO [Size]. But I can't do this with the Shell monad since it is not Traversable. So I wrote something like this:
import qualified Control.Foldl as F

main = sh $ do
  let sizes = fmap getFileSize $ find (suffix ".hs") "."
  lst <- fold sizes F.list
  let cont = sequence lst
  sz <- liftIO $ cont
  liftIO $ putStrLn (show (sum sz))
First I folded Shell (IO Size) to [IO Size], and then turned that into IO [Size] so I could sum the list afterwards.
But I wonder if there is a more canonical or elegant solution to this, because here I created two lists to accomplish my task, and I thought that the Shell monad is for manipulating entities in constant space. Maybe there is some fold to make IO (Shell Size) from Shell (IO Size)?
Thanks.
You have an IO action, and you really want a Shell action. The usual way to handle that is with the liftIO method, which is available because Shell is an instance of MonadIO.
file <- find (suffix ".hs") "."
size <- liftIO $ getFileSize file
or even
size <- liftIO . getFileSize =<< find (suffix ".hs") "."
Fortunately, Turtle.Prelude itself offers some size functions that work directly in any MonadIO instance, such as Shell, so you don't need to use liftIO yourself.
Now you actually have to sum these up, but you can do that with fold and sum.
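Putting those pieces together might look roughly like the following sketch. It assumes the turtle and foldl packages, and it assumes that du from Turtle.Prelude is an acceptable stand-in for getFileSize. The name totalHsSize is made up for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Turtle
import qualified Control.Foldl as F

-- Stream every ".hs" path under the current directory, look up its size
-- inside Shell, and sum the sizes with a fold, without building a list.
totalHsSize :: IO Size
totalHsSize = fold (find (suffix ".hs") "." >>= du) F.sum
```

Because the fold consumes the stream element by element, no intermediate list of sizes is materialised.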
I would recommend that you avoid breaking open the Shell type itself. That should be reserved for adding totally new functionality to the API. That certainly isn't necessary in this case.
Actually, I've managed to get rid of IO here by using a helper transformation:
sio :: Shell (IO a) -> Shell a
sio s = Shell (\(FoldShell step begin done) ->
  let step' x a = do
        a' <- a
        step x a'
  in _foldShell s (FoldShell step' begin done))
But now I wonder whether there is a simpler solution to this task...
Using the Shake Haskell build library, how can I write a rule using a program that needs to reach a fixed point? Imagine I have a program foo that takes a file input and produces an output file, which should have foo applied repeatedly until the output file does not change. How can I write that in Shake?
The typical example of this pattern is LaTeX.
Firstly, note that calling LaTeX repeatedly does not always produce a fixed point, so make sure you have a bound on the iterations. Also, some distributions (MiKTeX) provide LaTeX versions that automatically run as many times as they need to, so if you use those instead the problem goes away.
Write your own foo_transitive command
The easiest way to solve the problem, assuming each run of foo has the same dependencies, is to solve the problem outside the build system. Just write a foo_transitive command, either as a shell script or as a Haskell function, that when supplied an input file produces an output file by running repeatedly and checking if it has reached a fixed point. The build system can now use foo_transitive and there are no issues about dependencies.
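If foo_transitive is written as a Haskell function, its core could be a bounded fixed-point loop along these lines. This is only a sketch: fixedPointM is a hypothetical helper, and in real use the step would run foo and the compared value would be the contents (or a hash) of the output file.

```haskell
-- Apply an effectful step until its result stops changing, or until the
-- iteration budget runs out. Returns Nothing if no fixed point was found
-- within the budget.
fixedPointM :: (Monad m, Eq a) => Int -> (a -> m a) -> a -> m (Maybe a)
fixedPointM budget step x
  | budget <= 0 = return Nothing
  | otherwise = do
      x' <- step x
      if x' == x
        then return (Just x')
        else fixedPointM (budget - 1) step x'
```

The explicit budget matters because, as noted above, repeated runs are not guaranteed to converge.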
Encode it in the build system
You need to write two rules, one which makes one step, and one which figures out which step is the right one to use:
let step i = "tempfile" <.> show i

"tempfile.*" *> \out -> do
    -- takeExtension keeps the leading dot, so drop it before parsing
    let i = read $ drop 1 $ takeExtension out :: Int
    if i == 0 then
        -- copyFile' is Shake's Action-level copy
        copyFile' "input" out
    else do
        let prev = step (i-1)
        need [prev]
        -- perhaps require additional dependencies, depending on prev
        system' "foo" [prev,out]

"output" *> \out -> do
    let f i = do
            old <- readFile' $ step (i-1)
            new <- readFile' $ step i
            if old == new || i > 100 then copyFile' (step i) out else f (i+1)
    f 1
The first rule generates tempfile.2 from tempfile.1 and so on, so we can need ["tempfile.100"] to get the 100th iteration. If the dependencies change in each step we can look at the previous result to calculate the new dependencies.
The second rule loops round checking each pair of values in the sequence, and stopping when they are equal. If you are implementing this in a production build system you may wish to avoid calling readFile' on each element twice (once as i-1 and once as i).
Expanding on @Neil Mitchell's answer, below is a sketch of foo_transitive. Having said that, for this particular case I'd just use latexmk, which Does The Right Thing™.
import Control.Monad.Fix (fix, mfix)
import Control.Monad.IO.Class (MonadIO(liftIO))
import Text.Printf (printf)

type SHA = Int

data TeXCompilationStage
  = Init
  | BibTeX
  | Recompile SHA
  deriving (Show, Eq)

data TeXResult
  = Stable SHA
  | Unstable
  deriving (Show, Eq)

f retry x budgetLaTeXCalls
  | budgetLaTeXCalls <= 0
  = do
      liftIO $ putStrLn "Budget for LaTeX depleted; result didn't converge"
      return Unstable
  | otherwise
  = case x of
      Init -> do
        liftIO $ do
          putStrLn "Init"
          putStrLn " # latex"
        retry BibTeX (budgetLaTeXCalls-1)
      BibTeX -> do
        liftIO $ do
          putStrLn "BibTeX"
          putStrLn " # bibtex"
        retry (Recompile 0) budgetLaTeXCalls
      Recompile previousSHA -> do
        let budgetLaTeXCalls' = budgetLaTeXCalls - 1
            calculatedSHA = 3 -- placeholder standing in for a hash of the output
        liftIO $ do
          printf "Recompile (budget: %d)\n" budgetLaTeXCalls
          printf " Previous SHA:%d\n Current SHA:%d\n" previousSHA calculatedSHA
        if calculatedSHA == previousSHA
          then do
            liftIO $ putStrLn " Stabilized"
            return $ Stable calculatedSHA
          else do
            liftIO $ putStrLn " Unstable"
            retry (Recompile (previousSHA+1)) budgetLaTeXCalls'

latex :: Int -> IO TeXResult
latex = fix f Init
I'm trying to use the interact function, but I'm having an issue with the following code:
main::IO()
main = interact test
test :: String -> String
test [] = show 0
test a = show 3
I'm using EclipseFP, and after it takes one input there seems to be an error. Trying to run main again leads to:
*** Exception: <stdin>: hGetContents: illegal operation (handle is closed)
I'm not sure why this is not working, the type of test is String -> String and show is Show a => a -> String, so it seems like it should be a valid input for interact.
EDIT/UPDATE
I've tried the following and it works fine. How does the use of unlines and lines cause interact to work as expected?
main :: IO ()
main = interact respondPalindromes

respondPalindromes :: String -> String
respondPalindromes =
  unlines .
  map (\xs -> if isPal xs then "palindrome" else "not a palindrome") .
  lines

isPal :: String -> Bool
isPal xs = xs == reverse xs
GHCi and Unsafe I/O
You can reduce this problem (the exception) to:
main = getContents >> return ()
(interact calls getContents)
The problem is that stdin (getContents is really hGetContents stdin) remains evaluated in GHCi in-between calls to main. If you look up stdin, it's implemented as:
stdin :: Handle
stdin = unsafePerformIO $ ...
To see why this is a problem, you could load this into GHCi:
import System.IO.Unsafe
f :: ()
f = unsafePerformIO $ putStrLn "Hi!"
Then, in GHCi:
*Main> f
Hi!
()
*Main> f
()
Since we've used unsafePerformIO and told the compiler that f is a pure function, it thinks it doesn't need to evaluate it a second time. In the case of stdin, all of the initialization on the handle isn't run a second time and it's still in a semi-closed state (which hGetContents puts it in), which causes the exception. So I think that GHCi is "correct" in this case and the problem lies in the definition of stdin which is a practical convenience for compiled programs that will just evaluate stdin once.
Interact and Lazy I/O
As for why interact quits after a single line of input while the unlines . lines version continues, let's try reducing that as well:
main :: IO ()
main = interact (const "response\n")
If you test the above version, interact won't even wait for input before printing response. Why? Here's the source for interact (in GHC):
interact f = do s <- getContents
                putStr (f s)
getContents is lazy I/O, and since f in this case doesn't need s, nothing is read from stdin.
If you change your test program to:
main :: IO ()
main = interact test
test :: String -> String
test [] = show 0
test a = show a
you should notice different behavior. That suggests that in your original version (test a = show 3), lazy evaluation only demands enough input to determine whether the string is empty (if it's not empty, the program doesn't need to know what a is; it just needs to print "3"). Since the input is presumably line-buffered on a terminal, it reads up until you press the return key.
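The same effect can be observed without any I/O by feeding both functions a string whose tail is undefined. This is just an illustrative sketch; firstDemo and secondDemo are names invented here, and the undefined tail stands in for input that has not been typed yet.

```haskell
test :: String -> String
test [] = show (0 :: Int)
test _  = show (3 :: Int)

respondPalindromes :: String -> String
respondPalindromes =
  unlines .
  map (\xs -> if isPal xs then "palindrome" else "not a palindrome") .
  lines

isPal :: String -> Bool
isPal xs = xs == reverse xs

-- test only inspects the first constructor of its argument, so the
-- undefined tail is never evaluated.
firstDemo :: String
firstDemo = test ('x' : undefined)

-- lines yields the first line as soon as it sees the newline, so the first
-- response is available before the rest of the input exists.
secondDemo :: String
secondDemo = take 11 (respondPalindromes ("abba\n" ++ undefined))
```

Here firstDemo evaluates to "3" and secondDemo to "palindrome\n". That mirrors what happens at the terminal: the original test can finish after the first chunk of input, while the lines-based version produces one response per line and keeps reading.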
Most of this is straight from the hint example. What I'd like to do is initialize the interpreter with modules and imports and such and keep it around somehow. Later on (user events, or whatever), I want to be able to call a function with that initialized state and interpret an expression many times. So at the --split here location in the code, I want to have the code above in init, and the code below that in a new function that takes an expression and interprets it.
module Main where

import Language.Haskell.Interpreter
import Test.SomeModule

main :: IO ()
main = do r <- runInterpreter testHint
          case r of
            Left err -> printInterpreterError err
            Right () -> putStrLn "Done."
          -- Right here I want to do something like the following
          -- but how do I do testInterpret thing so it uses the
          -- pre-initialized interpreter?
          case (testInterpret "expression one")
            Left err -> printInterpreterError err
            Right () -> putStrLn "Done."
          case (testInterpret "expression two")
            Left err -> printInterpreterError err
            Right () -> putStrLn "Done."

testHint :: Interpreter ()
testHint =
  do
    loadModules ["src/Test/SomeModule.hs"]
    setImportsQ [("Prelude", Nothing), ("Test.SomeModule", Just "SM")]
    say "loaded"
    -- Split here, so what I want is something like this though I know
    -- this doesn't make sense as is:
    -- testExpr = Interpreter () -> String -> Interpreter ()
    -- testExpr hintmonad expr = interpret expr
    let expr1 = "let p1o1 = SM.exported undefined; p1o2 = SM.exported undefined; in p1o1"
    say $ "e.g. typeOf " ++ expr1
    say =<< typeOf expr1

say :: String -> Interpreter ()
say = liftIO . putStrLn

printInterpreterError :: InterpreterError -> IO ()
printInterpreterError e = putStrLn $ "Ups... " ++ (show e)
I'm having trouble understanding your question. Also I am not very familiar with hint. But I'll give it a go.
As far as I can tell, the Interpreter monad is just a simple state wrapper around IO -- it only exists so that you can say eg. setImportsQ [...] and have subsequent computations depend on the "settings" that were modified by that function. So basically you want to share the monadic context of multiple computations. The only way to do that is by staying within the monad -- by building a single computation in Interpreter and running it once. You can't have a "global variable" that escapes and reuses runInterpreter.
Fortunately, Interpreter is an instance of MonadIO, which means you can interleave IO computations and Interpreter computations using liftIO :: IO a -> Interpreter a. Basically you are thinking inside-out (an extremely common mistake for learners of Haskell). Instead of using a function in IO that runs code in your interpreter, use a function in Interpreter that runs code in IO (namely liftIO). So eg.
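main = runInterpreter $ do
  testHint
  expr1 <- liftIO getLine
  -- interpret takes the expression first, then a witness for the result
  -- type; String is only an assumption here
  r1 <- interpret expr1 (as :: String)
  case r1 of
    ...
  expr2 <- liftIO getLine
  r2 <- interpret expr2 (as :: String)
  case r2 of
    ...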
And you can easily pull that latter code out into a function if you need to, using the beauty of referential transparency! Just pull it straight out.
runSession :: Interpreter ()
runSession = do
  expr1 <- liftIO getLine
  -- the String result type is only an assumption here
  r1 <- interpret expr1 (as :: String)
  case r1 of
    ...

main = runInterpreter $ do
  testHint
  runSession
Does that make sense? Your whole program is an Interpreter computation, and only at the last minute do you pull it out into IO.
(That does not mean that every function you write should be in the Interpreter monad. Far from it! As usual, use Interpreter around the edges of your program and keep the core purely functional. Interpreter is the new IO).
If I understand correctly, you want to initialize the compiler once, and run multiple queries, possibly interactively.
There are two main approaches:
lift IO actions into your Interpreter context (see luqui's answer).
use lazy IO to smuggle a stream of data in and out of your program.
I'll describe the second option.
By the magic of lazy IO, you can pass testHint a lazy stream of input, then loop in the body of testHint, interpreting many queries interactively:
main = do
  input <- getContents -- a stream of future input
  r <- runInterpreter (testHint (lines input))
  case r of
    Left err -> printInterpreterError err
    Right () -> putStrLn "Done."

testHint input = do
  loadModules ["src/Test/SomeModule.hs"]
  setImportsQ [("Prelude", Nothing), ("Test.SomeModule", Just "SM")]
  say "loaded"
  -- loop over the stream of input, interpreting commands
  let go [] = return ()
      go (":quit":_) = return ()
      go (e:es) = do say =<< typeOf e
                     go es
  go input
The go function has access to the closed-over environment of the initialized interpreter, so feeding it events will obviously run in the scope of that once-initialized interpreter.
An alternative method would be to extract the interpreter state from the monad, but I'm not sure that is possible in GHC (it would require GHC not to be in the IO monad fundamentally).