Command line options picked up by criterion library - haskell

I have used the libraries criterion and cmdargs.
When I compile the program completely without cmdargs and run it e.g. ./prog --help then I get some unwanted response from criterion about the possible options and the number of runs etc..
When I compile and run it as below the command line options are first picked up by my code then then read by criterion. Criterion then subsequently reports and error telling me that the option --byte is unknown. I have not seen anything in the criterion documentation how this could be switched off or worked around. Is there a way to clear out the command line options ofter I have read them? Otherwise I would need to use e.g. CPUTime instead of criterion, that is OK to me since I do to really require the loads of extra functionality and data that criterion delivers.
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE DeriveDataTypeable #-}
import System.Console.CmdArgs
data Strlen = Strlen {byte :: Int} deriving (Data, Typeable, Show)
strlen = cmdArgsMode $ Strlen {byte = def} &= summary "MessagePack benchmark v0.04"
main = do
n <- cmdArgsRun strlen
let datastring = take (byte n) $ randomRs ('a','z') (mkStdGen 3)
putStrLn "Starting..."
conn <- connect "192.168.35.62" 8081
defaultMain [bench "sendReceive" $ whnfIO (mywl conn datastring)]

Use System.Environment.withArgs. Parse the command line arguments first with cmdArgs, then pass what you haven't used to criterion:
main = do
(flags, remaining) <- parseArgsHowever
act according to flags
withArgs remaining $
defaultMain [ ... ]

Take a look at the criterion source. You should be able to write your own defaultMainWith function that handles args however you want, including ignoring them, or ignoring unknown args, or etc...

Related

Read large lines in huge file without buffering

I was wondering if there's an easy way to get lines one at a time out of a file without eventually loading the whole file in memory. I'd like to do a fold over the lines with an attoparsec parser. I tried using Data.Text.Lazy.IO with hGetLine and that blows through my memory. I read later that eventually loads the whole file.
I also tried using pipes-text with folds and view lines:
s <- Pipes.sum $
folds (\i _ -> (i+1)) 0 id (view Text.lines (Text.fromHandle handle))
print s
to just count the number of lines and it seems to be doing some wonky stuff "hGetChunk: invalid argument (invalid byte sequence)" and it takes 11 minutes where wc -l takes 1 minute. I heard that pipes-text might have some issues with gigantic lines? (Each line is about 1GB)
I'm really open to any suggestions, can't find much searching except for newbie readLine how-tos.
Thanks!
The following code uses Conduit, and will:
UTF8-decode standard input
Run the lineC combinator as long as there is more data available
For each line, simply yield the value 1 and discard the line content, without ever read the entire line into memory at once
Sum up the 1s yielded and print it
You can replace the yield 1 code with something which will do processing on the individual lines.
#!/usr/bin/env stack
-- stack --resolver lts-8.4 --install-ghc runghc --package conduit-combinators
import Conduit
main :: IO ()
main = (runConduit
$ stdinC
.| decodeUtf8C
.| peekForeverE (lineC (yield (1 :: Int)))
.| sumC) >>= print
This is probably easiest as a fold over the decoded text stream
{-#LANGUAGE BangPatterns #-}
import Pipes
import qualified Pipes.Prelude as P
import qualified Pipes.ByteString as PB
import qualified Pipes.Text.Encoding as PT
import qualified Control.Foldl as L
import qualified Control.Foldl.Text as LT
main = do
n <- L.purely P.fold (LT.count '\n') $ void $ PT.decodeUtf8 PB.stdin
print n
It takes about 14% longer than wc -l for the file I produced which was just long lines of commas and digits. IO should properly be done with Pipes.ByteString as the documentation says, the rest is conveniences of various sorts.
You can map an attoparsec parser over each line, distinguished by view lines, but keep in mind that an attoparsec parser can accumulate the whole text as it pleases and this might not be a great idea over a 1 gigabyte chunk of text. If there is a repeated figure on each line (e.g. word separated numbers) you can use Pipes.Attoparsec.parsed to stream them.

How can QuickCheck test all properties for each sample

...instead of generating 100 new random samples for each property?
My testsuite contains the TemplateHaskell hack explained here [1] to
test all functions named prop_*. Running the test program prints
=== prop_foo from tests/lala.lhs:20 ===
+++ OK, passed 100 tests.
=== prop_bar from tests/lala.lhs:28 ===
+++ OK, passed 100 tests.
and it looks like going through 100 random samples for each of the
properties.
Problemis: Generating the samples is quite expensive, checking the
properties is not. So I'd like to have a means to pass each random
sample to each of the prop_* functions instead of creating new
(#properties * 100) many samples.
Is there anything like that built in? Actually, I think I'd need a
replacement for the splice
$(forAllProperties)
in
main :: IO ()
main
= do args <- parseArgs <$> getArgs
s <- $(forAllProperties) $ quickCheckWithResult args
s ? return () $ exitFailure
where
parseArgs as
= null as ? stdArgs $ stdArgs{ maxSuccess = read $ head as }
[1] Simple haskell unit testing, and
QuickCheck exit status on failures, and cabal integration
In this post you can see how to group tests
Stackoverflow post
That user provides a very simple example of use Test.Tasty.QuickCheck
Using testProperty and testGroup you can pass each random sample to each property
In the next link you can check the hackage of this package
Test.Tasty.QuickCheck

How to show progress in Shake?

I am trying to figure out how can i take the progress info from a Progress type (in Development.Shake.Progress) to output it before executing a command. The possible desired output would be:
[1/9] Compiling src/Window/Window.cpp
[2/9] Compiling src/Window/GlfwError.cpp
[3/9] Compiling src/Window/GlfwContext.cpp
[4/9] Compiling src/Util/MemTrack.cpp
...
For now i am simulating this using some IORef that keeps the total (initially set to the sum of the source files) and a count that i increase before executing each build command, but this seems like a hackish solution to me.
On top of that this solution seems to work correctly on clean builds, but misbehaves on partial builds as the sum that displayed is still the total of all the source files.
With access to a Progress data type i will be able to calculate this fraction correctly using its countSkipped, countBuild, and countTodo members (see Progress.hs:53), but i am still not sure how i can i achieve this.
Any help is appreciated.
Values of type Progress are currently only available as an argument to the function stored in shakeProgress. You can obtain the Progress whenever you want with:
{-# LANGUAGE RecordWildCards #-}
import Development.Shake
import Data.IORef
import Data.Monoid
import Control.Monad
main = do
ref <- newIORef $ return mempty
shakeArgs shakeOptions{shakeProgress = writeIORef ref} $ do
want ["test" ++ show i | i <- [1..5]]
"test*" %> \out -> do
Progress{..} <- liftIO $ join $ readIORef ref
putNormal $
"[" ++ show (countBuilt + countSkipped + 1) ++
"/" ++ show (countBuilt + countSkipped + countTodo) ++
"] " ++ out
writeFile' out ""
Here we create an IORef to squirrel away the argument passed to shakeProgress, then retrieve it later when running the rules. Running the above code I see:
[1/5] test5
[2/5] test4
[3/5] test3
[4/5] test2
[5/5] test1
Running at a higher level of parallelism gives less precise results - initially there are only 3 items in todo (Shake increments countTodo as it finds items todo, and spawns items as soon as it knows about any of them), and there are often two rules running at the same index (there is no information about how many are in progress). Given knowledge of your specific rules, you could refine the output, e.g. storing an IORef you increment to ensure the index was monotonic.
The reason this code is somewhat convoluted is that the Progress information was intended to be used for asynchronous progress messages, although your approach seems perfectly valid. It may be worth introducing a getProgress :: Action Progress function for synchronous progress messages.

Get args Haskell

I'm having problems with an exercise, and can not understand the error.
It should be a simple exercise with args:
import System.IO
import System.Environment
main= do
args < - getArgs
nomeficheiro <- return( args !! 0)
putStrnLn ( "Name is" ++ nomeficheiro)
Then i should run it, with : $ ./comando James
The error:
<interactive>:51:1:
parse error on input ‘$’
Perhaps you intended to use TemplateHaskell
I've read other doubts about args at this fórum and I didn't find any answer that could help me
$ ./comando James isn't meant to be run on GHCi. Instead, $ at the start of the line indicates that the following line should be run in your bash/cmd/shell, not in GHCi:
# in your favourite shell, in the correct directory
./comando James
If you want to run main with arguments within GHCi, you can use :main args:
ghci> :main James
Further remarks
Your current code isn't indented correctly, so make sure that you fix this too. Also, you can use let nomeficheiro = head args instead of … <- return …. Keep in mind that this could lead to problems if one doesn't supply any argument to your program, since head [] calls error.

cmdargs value arguments without equals sign

I use cmdargs to write CLI parsers for my Haskell programs.
Let's assume this program, derived directly from their example:
{-# LANGUAGE DeriveDataTypeable #-}
import System.Console.CmdArgs.Implicit
data Sample = Sample {hello :: String} deriving (Show, Data, Typeable)
sample = Sample{hello = def &= help "World argument" &= opt "world"}
&= summary "Sample v1"
main = print =<< cmdArgs sample
If this program is called cmdtest.hs, I can execute
$ runghc cmdtest.hs --hello=world
Sample {hello = "world"}
However, if I leave off the equals sign (like in the -o option in gcc -o foo foobar.c), cmdargs tries to recognize world as a positional argument
$ runghc cmdtest.hs --hello world
Unhandled argument, none expected: world
Is there any option / annotation I can set to allow this syntax in cmdargs?
Update: Using the short option exposes the same problem:
$ runghc cmdtest.hs -h world
Unhandled argument, none expected: world
Update 2: The FlagInfo data in the cmdargs Explicit module seems to suggest it is somehow possible, at least if using Explicit instead of Implicit
Your problem is that you're using opt which makes the value of --hello optional. So --hello world is taken to mean --hello without a value, followed by world as a positional parameter. This is of course a design choice in cmdargs, but I think it's a reasonable one, and it matches the behaviour of most other programs. Take out the opt and see if it works the way you want.

Resources