How to debug diverging test using QuickCheck

I have some parsing code using Megaparsec that I've written a simple property to test (it generates a random expression tree, pretty-prints it, and then checks that the result parses back to the original tree).
Unfortunately, there seems to be a bug and if I run the tests without any limits, I see the GHC process allocating more and more memory until either I kill it or the OOM killer gets there first.
Not a problem, I thought... but I can't for the life of me figure out what's causing the divergence. The property itself looks like this: (I've ripped out the proper testing and the shrinking code to try to minimise the code that actually runs)
prop_parse_expr :: Property
prop_parse_expr =
  forAll arbitrary $
    (\ pe ->
      let str = prettyParExpr 0 pe in
      counterexample ("Rendered to: " ++ show str) $
      trace ("STARTING TEST: " ++ show str) $
      case parse (expr <* eof) "" str of
        Left _ -> trace "NOPE" $ False
        Right _ -> trace "GOOD" $ True)
If I compile with profiling (using stack test --profile), I can run the resulting binary with RTS options. Ahah, I thought, and ran with -xc, thinking that I'd get a helpful stack trace when I sent a SIGINT to the stuck job. It seems not. Running with
./long/path/to/foo-test -j1 --test-seed 1 +RTS -xc
I see this output:
STARTING TEST: "0"
GOOD
STARTING TEST: "(x [( !0 )]) "
STARTING TEST: "({ 2 {( !0 )}} ) "
STARTING TEST: "{ 2{ ( x[0? {( 0) ,( x ) } :((0 )? (x ):0) -: ( -(^( x )) ) ]), 0**( x )} } "
STARTING TEST: "| (0? (x[({ 1{ (0)? x : ( 0 ) }} ) ]) :(~&( 0) ?( x):( (x ) ^( x ) )))"
STARTING TEST: "(0 )"
STARTING TEST: "0"
^C*** Exception (reporting due to +RTS -xc): (THUNK_STATIC), stack trace:
Test.Framework.Improving.runImprovingIO,
called from Test.Framework.Providers.QuickCheck2.runProperty,
called from Test.Framework.Providers.QuickCheck2.runTest,
called from Test.Framework.Runners.Core.runSimpleTest,
called from Test.Framework.Runners.Core.runTestTree.go,
called from Test.Framework.Runners.Core.runTestTree,
called from Test.Framework.Runners.Core.runTests',
called from Test.Framework.Runners.Core.runTests,
called from Test.Framework.Runners.Console.defaultMainWithOpts,
called from Test.Framework.Runners.Console.defaultMainWithArgs,
called from Test.Framework.Runners.Console.defaultMain,
called from Main.main
<snip: 2 more identical backtraces>
*** Exception (reporting due to +RTS -xc): (THUNK_STATIC), stack trace:
Test.Framework.Runners.Console.Utilities.hideCursorDuring,
called from Test.Framework.Runners.Console.Run.showRunTestsTop,
called from Test.Framework.Improving.runImprovingIO,
called from Test.Framework.Providers.QuickCheck2.runProperty,
called from Test.Framework.Providers.QuickCheck2.runTest,
called from Test.Framework.Runners.Core.runSimpleTest,
called from Test.Framework.Runners.Core.runTestTree.go,
called from Test.Framework.Runners.Core.runTestTree,
called from Test.Framework.Runners.Core.runTests',
called from Test.Framework.Runners.Core.runTests,
called from Test.Framework.Runners.Console.defaultMainWithOpts,
called from Test.Framework.Runners.Console.defaultMainWithArgs,
called from Test.Framework.Runners.Console.defaultMain,
called from Main.main
Can anyone tell me:
Why I see multiple STARTING TEST lines without GOOD or NOPE between them, despite the -j1?
How I get an actual stack trace that shows where the test is allocating all its memory?
Thanks for any ideas!

For anyone who finds this question, the problem with my code was that my arbitrary instance for expressions didn't constrain sizes properly, so sometimes tried to make enormous trees. See the "Generating Recursive Data Types" section of the QuickCheck manual for what I should have been doing!
I found that running commands like:
./long/path/to/foo-test -o3 +RTS -xc
helped me figure out what was going on. Strangely, the backtrace still shows several threads of execution. I don't really understand why, but at least I could then see that I was spending time in my "makeAnExpr" function. The trick is to tune the timeout (3 seconds above) so that it doesn't kill a test until it's well and truly stuck, but does stop it before it starts eating all your RAM!
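For readers who want to see what a size-constrained generator looks like, here is a minimal sketch in the style of the QuickCheck manual. The Expr type below is a hypothetical stand-in (the real expression type is not shown in the question), and genExpr is an illustrative name, not the original makeAnExpr.

```haskell
import Test.QuickCheck

-- Hypothetical expression type standing in for the question's AST.
data Expr
  = Lit Int
  | Var String
  | Neg Expr
  | Add Expr Expr
  deriving (Show, Eq)

-- Generate an expression using the size parameter as a budget.
-- At size 0 only leaves are produced, so generation always terminates
-- and the nesting depth is at most n + 1.
genExpr :: Int -> Gen Expr
genExpr n
  | n <= 0 = leaf
  | otherwise =
      oneof
        [ leaf
        , Neg <$> genExpr (n - 1)
        , Add <$> sub <*> sub
        ]
  where
    leaf = oneof [Lit <$> arbitrary, pure (Var "x")]
    -- Halve the budget for each child of a binary node.
    sub = genExpr (n `div` 2)

instance Arbitrary Expr where
  -- 'sized' hands the current QuickCheck size to the generator.
  arbitrary = sized genExpr
```

Halving the budget at binary nodes keeps the total number of nodes roughly linear in the size, which is what stops the generator from building enormous trees.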

Related

Why does an exception from an FFI appear to occur, but results in an empty lefts list?

First, I don't have an easy to reproduce example on hand, as the code is calling out to a MATLAB engine, which requires a license. It may be possible to construct a similar example, just using C, though. I have the following snippet from a test:
ei1 :: Either SomeException MAnyArray <- try $ engineGetVar eng foopi
putStrLn $ assert (isRight ei1) " Can clearVar once"
clearVar eng foopi
ei2 :: Either SomeException MAnyArray <- try $ engineGetVar eng foopi
putStrLn $ assert (isLeft ei2) $
  " Can't clearVar twice: " <> (show $ lefts [ei2])
putStrLn " Finished testClearVar"
This results in the output:
Can clearVar once
Error using save
Variable 'foopi' not found.
Can't clearVar twice: []
Finished testClearVar
The confusing bit is this expression, since the assertion appears to succeed (meaning that ei2 is a Left value, but when calling lefts [ei2], no Left values are found):
putStrLn $ assert (isLeft ei2) $
  " Can't clearVar twice: " <> (show $ lefts [ei2])
If you look closely at documentation of assert you will find:
Assertions can normally be turned on or off with a compiler flag (for GHC, assertions are normally on unless optimisation is turned on with -O or the -fignore-asserts option is given). When assertions are turned off, the first argument to assert is ignored, and the second argument is returned as the result.
I assume this is a regular package you are working on and not just a file you are compiling manually with ghc. By default cabal will compile a project with -O, which means your asserts are simply ignored. What you need is either -O0 or -fno-ignore-asserts flag added. But what I would recommend is just don't rely on assert at all.
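As a rough illustration of not relying on assert, the check can be written as ordinary code that throws when the condition is false, so optimisation flags cannot remove it. This is only a sketch: check, failingCall and demo are hypothetical names, and failingCall merely stands in for an FFI call that throws.

```haskell
import Control.Exception (ErrorCall (..), SomeException, throwIO, try)
import Control.Monad (unless)

-- | Fail loudly when the condition is False. Unlike 'assert', this is
-- ordinary code, so it is not dropped by -O or -fignore-asserts.
check :: Bool -> String -> IO ()
check ok msg = do
  unless ok $ throwIO (ErrorCall ("check failed: " ++ msg))
  putStrLn ("  " ++ msg)

-- | Hypothetical stand-in for a call that is expected to throw.
failingCall :: IO Int
failingCall = throwIO (ErrorCall "Variable 'foopi' not found.")

-- | Pattern match on the result of 'try' instead of asserting on it.
demo :: IO ()
demo = do
  r <- try failingCall :: IO (Either SomeException Int)
  case r of
    Left e -> putStrLn ("  got expected failure: " ++ show e)
    Right v -> throwIO (ErrorCall ("expected a failure, got " ++ show v))
```

Matching on the Either directly also makes it obvious which branch was taken, which the assert-based version hid.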

“P6opaque, Str” vs simple “Str” types in Perl 6

This is a follow-up to my previous question.
I am finally able to reproduce the error here:
my @recentList = prompt("Get recentList: e.g. 1 2 3: ").words || (2,4,6);
say "the list is: ", @recentList;
for @recentList -> $x {
    say "one element is: ", $x;
    say "element type is: ", $x.WHAT;
    say "test (1,2,3).tail(\"2\") : ", (1,2,3).tail("2");
    say ( (10.rand.Int xx 10) xx 15 ).map: { @($_.tail($x)); };
}
And the results are ok as long as I use the default list by just hitting return at the prompt and not entering anything. But if I enter a number, it gives this error:
Get recentList: e.g. 1 2 3: 2
the list is: [2]
one element is: 2
element type is: (Str)
test (1,2,3).tail("2") : (2 3)
This type cannot unbox to a native integer: P6opaque, Str
in block at intType.p6 line 9
in block <unit> at intType.p6 line 5
If tail("2") works, why does tail($x) fail? Also, in my original code, tail($x.Int) wouldn't correct the problem, but it did here.
This is at best a nanswer. It is a thus-far failed attempt to figure out this problem. I may have just wandered off into the weeds. But I'll publish what I have. If nothing else, maybe it can serve as a reminder that the first three steps below are sensible ones; thereafter I'm gambling on my ability to work my way forward by spelunking source code when I would probably make much faster and more reliable progress by directly debugging the compiler as discussed in the third step.
OK, the first step was an MRE. What you've provided was an E that was fully R and sufficiently M. :)
Step #2 was increasing the M (golfing). I got it down to:
Any.tail('0'); # OK
Any.tail('1'); # BOOM
Note that it can be actual values:
1.tail('1'); # BOOM
(1..2).tail('1'); # BOOM
But some values work:
(1,2).tail('1'); # OK
Step #3 probably should be to follow the instructions in Playing with the code of Rakudo Perl 6 to track the compiler's execution, eg by sticking says in its source code and recompiling it.
You may also want to try out App::MoarVM::Debug. (I haven't.)
Using these approaches you'll have the power to track with absolute precision what the compiler does for any code you throw at it. I recommend you do this even though I didn't. Maybe you can figure out where I've gone wrong.
In the following I trace this problem by just directly spelunking the Rakudo compiler's source code.
A search for "method tail" in the Rakudo sources yielded 4 matches. For my golf the matching method is a match in core/AnyIterableMethods.pm6.
The tail parameter $n clearly isn't a Callable so the pertinent line that continues our spelunking is Rakudo::Iterator.LastNValues(self.iterator,$n,'tail').
A search for this leads to this method in core/Iterator.pm6.
This in turn calls this .new routine.
These three lines:
nqp::if(
n <= 0, # must be HLL comparison
Rakudo::Iterator.Empty, # negative is just nothing
explain why '0' works. The <= operator coerces its operands to numeric before doing the numeric comparison. So '0' coerces to 0, the condition is True, the result is Rakudo::Iterator.Empty, and the Any.tail('0') yields () and doesn't complain.
The code that immediately follows the above three lines is the else branch of the nqp::if. It closes with nqp::create(self)!SET-SELF(iterator,n,f).
That in turn calls the !SET-SELF routine, which has the line:
($!lastn := nqp::setelems(nqp::list, $!size = size)),
Which attempts to assign size, which in our BOOM case is '1', to $!size. But $!size is declared as:
has int $!size;
Bingo.
Or is it? I don't know if I really have correctly tracked the problem down. I'm only spelunking the code in the github repo, not actually running an instrumented version of the compiler and tracing its execution, as discussed as the sensible step #3 for trying to figure out the problem you've encountered.
Worse, when I'm running a compiler it's an old one whereas the code I'm spelunking is the master...
Why does this work?
(*,*).tail('1') # OK
The code path for this will presumably be this method. The parameter $n isn't a Callable so the code path will run thru the path that uses the $n in the lines:
nqp::unless(
nqp::istype($n,Whatever) || $n == Inf,
$iterator.skip-at-least(nqp::elems($!reified) - $n.Int)
The $n == Inf shouldn't be a problem. The == will coerce its operands to numerics and that should take care of $n being '1'.
The nqp::elems($!reified) - $n.Int shouldn't be a problem either.
The nqp ops doc shows that nqp::elems always returns an int. So this boils down to an int - Int which should work.
Hmm.
A blame of these lines shows that the .Int in the last line was only added 3 months ago.
So, clutching at straws, what happens if one tries:
(my int $foo = 1) - '1' # OK
Nope, that's not the problem.
It seems the trail has grown cold or rather I've wandered off the actual execution path.
I'll publish what I've got. Maybe someone else can pick it up from here or I'll have another go in a day or three...

Is it possible to recover from an erroneous eval in hint?

I am trying to use hint package from hackage to create a simple environment where user can issue lines of code for evaluation (like in ghci). I expect some of the input lines to be erroneous (eval would end the session with an error). How can I create a robust session that ignores erroneous input (or better: it reports an error but can accept other input) and keeps the previously consistent state?
Also, I would like to use it in do style, i.e. let a = 3 as standalone input line makes sense.
To clarify things: I have no problem with a single eval. What I would like to do, is allow continuing evaluation even after some step failed. Also I would like to incrementally extend a monadic chain (as you do in ghci I guess).
In other words: I want something like this, except that I get to evaluate 3 and don't stop at undefined with the error.
runInterpreter $ setImports [ "Prelude" ] >> eval "undefined" >> eval "3"
More specifically I would like something like this to be possible:
runInterpreter $ setImports ... >> eval' "let a = (1, 2)" -- modifying context
                                >> typeOf "b" -- error but not breaking the chain
                                >> typeOf "a" -- (Num a, Num b) => (a, b)
I don't expect it to work this straightforwardly, this is just to show the idea. I basically would like to build up some context (as you do in ghci) and every addition to the context would modify it only if there is no failure, failures could be logged or explicitly retrieved after each attempt to modify the context.
You didn't show any code so I don't know the problem. The most straight-forward way I use hint handles errors fine:
import Language.Haskell.Interpreter
let doEval s = runInterpreter $ setImports ["Prelude"] >> eval s
has resulted in fine output for me...
Prelude Language.Haskell.Interpreter> doEval "1 + 2"
Right "3"
Prelude Language.Haskell.Interpreter> doEval "1 + 'c'"
ghc: panic! (the 'impossible' happened)
(GHC version 7.10.2 for x86_64-apple-darwin):
nameModule doEval_a43r
... Except that now the impossible happens... that's a bug. Notice you are supposed to get Left someError in cases like these:
data InterpreterError
= UnknownError String
| WontCompile [GhcError]
| NotAllowed String
| GhcException String
-- Defined in ‘hint-0.4.2.3:Hint.Base’
Have you looked through the GHC bug list and/or submitted an issue?
EDIT:
And the correct functionality is back, at least as of GHC 7.10.3 x64 on OS X with hint version 0.4.2.3. In other words, it appears the bug went away from 7.10.2 to 7.10.3
The output is:
Left (WontCompile [GhcError {errMsg = ":3:3:\n No instance for (Num Char) arising from a use of \8216+\8217\n In the expression: 1 + 'c'\n In an equation for \8216e_11\8217: e_11 = 1 + 'c'\n In the first argument of \8216show_M439719814875238119360034\8217, namely\n \8216(let e_11 = 1 + 'c' in e_11)\8217"}])
Though executing the doEval line twice in GHCi does cause a panic, things seem to work once in the interpreter and properly regardless when compiled.
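One possible way to keep a session going after a bad line is to catch the error around each eval inside a single runInterpreter call. The sketch below assumes that InterpreterT has a MonadCatch instance (from the exceptions package) and that compile failures surface as InterpreterError exceptions. evalLine and session are made-up names, and this has not been checked against hint 0.4.2.3 specifically.

```haskell
import Control.Monad.Catch (try)
import Control.Monad.IO.Class (liftIO)
import Language.Haskell.Interpreter

-- | Evaluate one line; report a failure instead of aborting the session.
-- (Assumption: 'eval' throws InterpreterError, which 'try' can catch.)
evalLine :: String -> Interpreter ()
evalLine s = do
  r <- try (eval s)
  liftIO $ case r of
    Left err -> putStrLn ("error: " ++ show (err :: InterpreterError))
    Right out -> putStrLn out

-- | Run several lines in one session; the bad line in the middle
-- should not prevent the last one from being evaluated.
session :: IO (Either InterpreterError ())
session = runInterpreter $ do
  setImports ["Prelude"]
  mapM_ evalLine ["1 + 2", "1 + 'c'", "3"]
```

This only covers errors raised while interpreting a line. It does not by itself give ghci-style let bindings that persist between lines.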

How to show progress in Shake?

I am trying to figure out how I can take the progress info from a Progress value (in Development.Shake.Progress) and output it before executing a command. The desired output would be something like:
[1/9] Compiling src/Window/Window.cpp
[2/9] Compiling src/Window/GlfwError.cpp
[3/9] Compiling src/Window/GlfwContext.cpp
[4/9] Compiling src/Util/MemTrack.cpp
...
For now I am simulating this using an IORef that keeps the total (initially set to the number of source files) and a count that I increase before executing each build command, but this seems like a hackish solution to me.
On top of that, this solution works correctly on clean builds but misbehaves on partial builds, as the total displayed is still the count of all the source files.
With access to a Progress value I would be able to calculate this fraction correctly using its countSkipped, countBuilt, and countTodo fields (see Progress.hs:53), but I am still not sure how I can achieve this.
Any help is appreciated.
Values of type Progress are currently only available as an argument to the function stored in shakeProgress. You can obtain the Progress whenever you want with:
{-# LANGUAGE RecordWildCards #-}
import Development.Shake
import Data.IORef
import Data.Monoid
import Control.Monad

main = do
    ref <- newIORef $ return mempty
    shakeArgs shakeOptions{shakeProgress = writeIORef ref} $ do
        want ["test" ++ show i | i <- [1..5]]
        "test*" %> \out -> do
            Progress{..} <- liftIO $ join $ readIORef ref
            putNormal $
                "[" ++ show (countBuilt + countSkipped + 1) ++
                "/" ++ show (countBuilt + countSkipped + countTodo) ++
                "] " ++ out
            writeFile' out ""
Here we create an IORef to squirrel away the argument passed to shakeProgress, then retrieve it later when running the rules. Running the above code I see:
[1/5] test5
[2/5] test4
[3/5] test3
[4/5] test2
[5/5] test1
Running at a higher level of parallelism gives less precise results - initially there are only 3 items in todo (Shake increments countTodo as it finds items todo, and spawns items as soon as it knows about any of them), and there are often two rules running at the same index (there is no information about how many are in progress). Given knowledge of your specific rules, you could refine the output, e.g. storing an IORef you increment to ensure the index was monotonic.
The reason this code is somewhat convoluted is that the Progress information was intended to be used for asynchronous progress messages, although your approach seems perfectly valid. It may be worth introducing a getProgress :: Action Progress function for synchronous progress messages.
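The monotonic index mentioned above could be kept in a separate counter that is bumped atomically, roughly as follows. This is a sketch using only base. nextIndex and progressPrefix are hypothetical helpers, not part of Shake, and the total would still have to come from the Progress fields or from your own knowledge of the rules.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef)

-- | Return the next 1-based index. The update is atomic, so two rules
-- running in parallel never receive the same index.
nextIndex :: IORef Int -> IO Int
nextIndex ref = atomicModifyIORef' ref (\n -> (n + 1, n + 1))

-- | Format a progress prefix such as "[2/9] ".
progressPrefix :: Int -> Int -> String
progressPrefix i total = "[" ++ show i ++ "/" ++ show total ++ "] "
```

Inside a rule this would be used through liftIO, for example by creating the counter with newIORef 0 in main and calling liftIO (nextIndex counter) before putNormal.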

Order of execution within monads

I was learning how to use the State monad and I noticed some odd behavior in terms of the order of execution. Removing the distracting bits that involve using the actual state, say I have the following code:
import Control.Monad
import Control.Monad.State
import Debug.Trace
mainAction :: State Int ()
mainAction = do
    traceM "Outside for loop"
    forM [0..2] (\i -> do
        traceM $ "i is " ++ show i
        forM [0..2] (\j -> do
            traceM $ "j is " ++ show j
            someSubaction i j
            )
        )
    return ()
Running runState mainAction 1 in ghci produces the following output:
j is 2
j is 1
j is 0
i is 2
j is 2
j is 1
j is 0
i is 1
j is 2
j is 1
j is 0
i is 0
Outside for loop
which seems like the reverse order of execution of what might be expected. I thought that maybe this is a quirk of forM and tried it with sequence which specifically states that it runs its computation sequentially from left to right like so:
mainAction :: State Int ()
mainAction = do
    traceM "Outside for loop"
    sequence $ map handleI [0..2]
    return ()
  where
    handleI i = do
        traceM $ "i is " ++ show i
        sequence $ map (handleJ i) [0..2]
    handleJ i j = do
        traceM $ "j is " ++ show j
        someSubaction i j
However, the sequence version produces the same output. What is the actual logic in terms of the order of execution that is happening here?
Haskell is lazy, which means things are not executed immediately. Things are executed whenever their result is needed – but no sooner. Sometimes code isn't executed at all if its result isn't needed.
If you stick a bunch of trace calls in a pure function, you will see this laziness happening. The first thing that is needed will be executed first, so that's the trace call you see first.
When something says "the computation is run from left to right" what it means is that the result will be the same as if the computation was run from left to right. What actually happens under the hood might be very different.
This is in fact why it's a bad idea to do I/O inside pure functions. As you have discovered, you get "weird" results because the execution order can be pretty much anything that produces the correct result.
Why is this a good idea? When the language doesn't enforce a specific execution order (such as the traditional "top to bottom" order seen in imperative languages) the compiler is free to do a tonne of optimisations, such as for example not executing some code at all because its result isn't needed.
I would recommend you to not think too much about execution order in Haskell. There should be no reason to. Leave that up to the compiler. Think instead about which values you want. Does the function give the correct value? Then it works, regardless of which order it executes things in.
I thought that maybe this is a quirk of forM and tried it with sequence which specifically states that it runs its computation sequentially from left to right like so: [...]
You need to learn to make the following, tricky distinction:
The order of evaluation
The order of effects (a.k.a. "actions")
What forM, sequence and similar functions promise is that the effects will be ordered from left to right. So for example, the following is guaranteed to print characters in the same order that they occur in the string:
putStrLn :: String -> IO ()
putStrLn str = forM_ str putChar >> putChar '\n'
But that doesn't mean that expressions are evaluated in this left-to-right order. The program has to evaluate enough of the expressions to figure out what the next action is, but that often does not require evaluating everything in every expression involved in earlier actions.
Your example uses the State monad, which bottoms out to pure code, so that accentuates the order issues. The only thing that a traversal function such as forM promises in this case is that gets inside the actions mapped to the list elements will see the effect of puts for elements to their left in the list.
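That guarantee about gets and puts can be observed without any tracing. In the small sketch below (tick and labelled are illustrative names), each action records the counter value it sees and then increments it. The recorded values follow list order no matter when the individual thunks are evaluated.

```haskell
import Control.Monad (forM)
import Control.Monad.State (State, get, put, runState)

-- | Return the element together with the counter value this action saw,
-- then increment the counter for the actions to its right.
tick :: Char -> State Int (Char, Int)
tick c = do
  n <- get
  put (n + 1)
  return (c, n)

-- | Effects are threaded left to right through the list.
-- labelled == ([('a',0),('b',1),('c',2)],3)
labelled :: ([(Char, Int)], Int)
labelled = runState (forM "abc" tick) 0
```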
