catching state that is changed by execStateT

catching state that is changed by execStateT - haskell

I am very new to Haskell. Recently, I had to work with it for my project.
I have a certain code which is evaluating some state using execStateT and I want to catch each state change and return it.
I have tried to understand what execStateT and the flow of the code, but I am failing at certain places, where I couldn't understand how to get the thing I really want.
Maybe due to a somewhat RAW understanding of monads and other concepts, I am finding a need to change the whole structure of the code.
In the upcoming code, I tried to use par to create a file and write the state of a variable into that file, and so it doesn't affect the actual work of the code. But it didn't create a file and write the inputs into it.
I am facing the following code
campaign u v w ts d = let d' = fromMaybe defaultDict d in fmap (fromMaybe mempty) (view (hasLens . to knownCoverage)) >>= \c -> do
g <- view (hasLens . to seed)
let g' = mkStdGen $ fromMaybe (d' ^. defSeed) g
execStateT (evalRandT runCampaign g') (Campaign ((,Open (-1)) <$> ts) c d') where
step = runUpdate (updateTest v Nothing) >> lift u >> runCampaign
runCampaign = use (hasLens . tests . to (fmap snd)) >>= update
update c = view hasLens >>= \(CampaignConf tl q sl _ _) ->
if | any (\case Open n -> n < tl; _ -> False) c -> callseq v w q >> step
| any (\case Large n _ -> n < sl; _ -> False) c -> step
| otherwise -> lift u
What I want here is find some way to look at the changes in variable v, to do my further work. This can be done either by writing a variable into a file or returning it to the console.
Thanks for help!
[Edit 1]
Here are the imports I am making:
import Control.Lens
import Control.Monad (liftM2, replicateM, when)
import Control.Monad.Catch (MonadCatch(..), MonadThrow(..))
import Control.Monad.Random.Strict (MonadRandom, RandT, evalRandT)
import Control.Monad.Reader.Class (MonadReader)
import Control.Monad.State.Strict (MonadState(..), StateT, evalStateT, execStateT)
import Control.Monad.Trans (lift)
import Control.Monad.Trans.Random.Strict (liftCatch)
import Data.Aeson (ToJSON(..), object)
import Data.Bool (bool)
import Data.Either (lefts)
import Data.Foldable (toList)
import Data.Map (Map, mapKeys, unionWith)
import Data.Maybe (fromMaybe, isNothing, maybeToList)
import Data.Ord (comparing)
import Data.Has (Has(..))
import Data.Set (Set, union)
import Data.Text (unpack)
import EVM
import EVM.Types (W256)
import Numeric (showHex)
import System.Random (mkStdGen)

Here's one approach. Suppose you have a class for monads supporting logging of messages with a certain type (MonadLogger is one, but I don't know enough about it to use it here). I'll just use a hypothetical CanLog class. Now you can write
newtype LStateT s m a = LStateT
{runLStateT :: StateT s m a}
deriving (Functor, Applicative, Monad)
instance CanLog s m => MonadState s (LStateT s m) where
get = LStateT get
put x = LStateT $ do
lift $ -- Log the state transition
put x

Related

Efficient Haskell equivalent to NumPy's argsort

Is there a standard Haskell equivalent to NumPy's argsort function?
I'm using HMatrix and, so, would like a function compatible with Vector R which is an alias for Data.Vector.Storable.Vector Double. The argSort function below is the implementation I'm currently using:
{-# LANGUAGE NoImplicitPrelude #-}
module Main where
import qualified Data.List as L
import qualified Data.Vector as V
import qualified Data.Vector.Storable as VS
import Prelude (($), Double, IO, Int, compare, print, snd)
a :: VS.Vector Double
a = VS.fromList [40.0, 20.0, 10.0, 11.0]
argSort :: VS.Vector Double -> V.Vector Int
argSort xs = V.fromList (L.map snd $ L.sortBy (\(x0, _) (x1, _) -> compare x0 x1) (L.zip (VS.toList xs) [0..]))
main :: IO ()
main = print $ argSort a -- yields [2,3,1,0]
I'm using explicit qualified imports just to make it clear where every type and function is coming from.
This implementation is not terribly efficient since it converts the input vector to a list and the result back to a vector. Does something like this (but more efficient) exist somewhere?
Update
#leftaroundabout had a good solution. This is the solution I ended up with:
module LAUtil.Sorting
( IndexVector
, argSort
)
where
import Control.Monad
import Control.Monad.ST
import Data.Ord
import qualified Data.Vector.Algorithms.Intro as VAI
import qualified Data.Vector.Storable as VS
import qualified Data.Vector.Unboxed as VU
import qualified Data.Vector.Unboxed.Mutable as VUM
import Numeric.LinearAlgebra
type IndexVector = VU.Vector Int
argSort :: Vector R -> IndexVector
argSort xs = runST $ do
let l = VS.length xs
t0 <- VUM.new l
forM_ [0..l - 1] $
\i -> VUM.unsafeWrite t0 i (i, (VS.!) xs i)
VAI.sortBy (comparing snd) t0
t1 <- VUM.new l
forM_ [0..l - 1] $
\i -> VUM.unsafeRead t0 i >>= \(x, _) -> VUM.unsafeWrite t1 i x
VU.freeze t1
This is more directly usable with Numeric.LinearAlgebra since the data vector is a Storable. This uses an unboxed vector for the indices.

Use vector-algorithms:
import Data.Ord (comparing)
import qualified Data.Vector.Unboxed as VU
import qualified Data.Vector.Algorithms.Intro as VAlgo
argSort :: (Ord a, VU.Unbox a) => VU.Vector a -> VU.Vector Int
argSort xs = VU.map fst $ VU.create $ do
xsi <- VU.thaw $ VU.indexed xs
VAlgo.sortBy (comparing snd) xsi
return xsi
Note these are Unboxed rather than Storable vectors. The latter need to make some tradeoffs to allow impure C FFI operations and can't properly handle heterogeneous tuples. You can of course always convert to and from storable vectors.

What worked better for me is using Data.map, as it is subject to list fusion, got a speed up. Here n=Length xs.
import Data.Map as M (toList, fromList, toAscList)
out :: Int -> [Double] -> [Int]
out n !xs = let !a= (M.toAscList (M.fromList $! (zip xs [0..n])))
!res=a `seq` L.map snd a
in res
However this is only aplicable for periodic lists, as:
out 12 [1,2,3,4,1,2,3,4,1,2,3,4] == out 12 [1,2,3,4,1,3,2,4,1,2,3,4]

Haskell Hashtable Performance

I am trying to use hash tables in Haskell with the hashtables package, and finding that I cannot get anywhere near Python's performance. How can I achieve similar performance? Is it possible given current Haskell libraries and compilers? If not, what's the underlying issue?
Here is my Python code:
y = {}
for x in xrange(10000000):
y[x] = x
print y[100]
Here's my corresponding Haskell code:
import qualified Data.HashTable.IO as H
import Control.Monad
main = do
y <- H.new :: IO (H.CuckooHashTable Int Int)
forM_ [1..10000000] $ \x -> H.insert y x x
H.lookup y 100 >>= print
Here is another version using Data.Map, which is slower than both for me:
import qualified Data.Map as Map
import Data.List
import Control.Monad
main = do
let m = foldl' (\m x -> Map.insert x x m) Map.empty [1..10000000]
print $ Map.lookup 100 m
Interestingly enough, Data.HashMap performs very badly:
import qualified Data.HashMap.Strict as Map
import Data.List
main = do
let m = foldl' (\m x -> Map.insert x x m) Map.empty [1..10000000]
print $ Map.lookup 100 m
My suspicion is that Data.HashMap performs badly because unlike Data.Map, it is not spine-strict (I think), so foldl' is just a foldl, with the associated thunk buildup problems.
Note that I have used -prof and verified that the majority of the time is spend in the hashtables or Data.Map code, not on the forM or anything like that. All code is compiled with -O2 and no other parameters.

As reddit.com/u/cheecheeo suggested here, using Data.Judy, you'll get similar performance for your particular microbenchmark:
module Main where
import qualified Data.Judy as J
import Control.Monad (forM_)
main = do
h <- J.new :: IO (J.JudyL Int)
forM_ [0..10000000] $ \i -> J.insert (fromIntegral i) i h
v <- J.lookup 100 h
putStrLn $ show v
Timeing the above:
$ time ./Main
Just 100
real 0m0.958s
user 0m0.924s
sys 0m0.032s
Timing the python code of OP:
$ time ./main.py
100
real 0m1.067s
user 0m0.886s
sys 0m0.180s

The documentation for hashtables notes that "Cuckoo hashing, like the basic hash table implementation using linear probing, can suffer from long delays when the table is resized." You use new, which creates a new table of the default size. From looking at the source, it appears that the default size is 2. Inserting 10000000 items likely entails numerous resizings.
Try using newSized.

Given the times above, I thought I would throw in the Data.Map solution, which seems to be comparable to using newSized.
import qualified Data.Map as M
main = do
print $ M.lookup 100 $ M.fromList $ map (\x -> (x,x)) [1..10000000]

Join two consumers into a single consumer that returns multiple values?

I have been experimenting with the new pipes-http package and I had a thought. I have two parsers for a web page, one that returns line items and another a number from elsewhere in the page. When I grab the page, it'd be nice to string these parsers together and get their results at the same time from the same bytestring producer, rather than fetching the page twice or fetching all the html into memory and parsing it twice.
In other words, say you have two Consumers:
c1 :: Consumer a m r1
c2 :: Consumer a m r2
Is it possible to make a function like this:
combineConsumers :: Consumer a m r1 -> Consumer a m r2 -> Consumer a m (r1, r2)
combineConsumers = undefined
I have tried a few things, but I can't figure it out. I understand if it isn't possible, but it would be convenient.
Edit:
I'm sorry it turns out I was making an assumption about pipes-attoparsec, due to my experience with conduit-attoparsec that caused me to ask the wrong question. Pipes-attoparsec turns an attoparsec into a pipes Parser when I just assumed that it would return a pipes Consumer. That means that I can't actually turn two attoparsec parsers into consumers that take text and return a result, then use them with the plain old pipes ecosystem. I'm sorry but I just don't understand pipes-parse.
Even though it doesn't help me, Arthur's answer is pretty much what I envisioned when I asked the question, and I'll probably end up using his solution in the future. In the meantime I'm just going to use conduit.

It the results are "monoidal", you can use the tee function from the Pipes prelude, in combination with a WriterT.
{-# LANGUAGE OverloadedStrings #-}
import Data.Monoid
import Control.Monad
import Control.Monad.Writer
import Control.Monad.Writer.Class
import Pipes
import qualified Pipes.Prelude as P
import qualified Data.Text as T
textSource :: Producer T.Text IO ()
textSource = yield "foo" >> yield "bar" >> yield "foo" >> yield "nah"
counter :: Monoid w => T.Text
-> (T.Text -> w)
-> Consumer T.Text (WriterT w IO) ()
counter word inject = P.filter (==word) >-> P.mapM (tell . inject) >-> P.drain
main :: IO ()
main = do
result <-runWriterT $ runEffect $
hoist lift textSource >->
P.tee (counter "foo" inject1) >-> (counter "bar" inject2)
putStrLn . show $ result
where
inject1 _ = (,) (Sum 1) mempty
inject2 _ = (,) mempty (Sum 1)
Update: As mentioned in a comment, the real problem I see is that in pipes parsers aren't Consumers. And how can you run two parsers concurrently if they have different behaviours regarding leftovers? What happens if one of the parsers wants to "un-draw" some text and the other parser doesn't?
One possible solution is to run the parsers in a truly concurrent manner, in different threads. The primitives in the pipes-concurrency package let you "duplicate" a Producer by writing the same data to two different mailboxes. And then each parser can do whatever it wants with its own copy of the producer. Here's an example which also uses the pipes-parse, pipes-attoparsec and async packages:
{-# LANGUAGE OverloadedStrings #-}
import Data.Monoid
import qualified Data.Text as T
import Data.Attoparsec.Text hiding (takeWhile)
import Data.Attoparsec.Combinator
import Control.Applicative
import Control.Monad
import Control.Monad.State.Strict
import Pipes
import qualified Pipes.Prelude as P
import qualified Pipes.Attoparsec as P
import qualified Pipes.Concurrent as P
import qualified Control.Concurrent.Async as A
parseChars :: Char -> Parser [Char]
parseChars c = fmap mconcat $
many (notChar c) *> many1 (some (char c) <* many (notChar c))
textSource :: Producer T.Text IO ()
textSource = yield "foo" >> yield "bar" >> yield "foo" >> yield "nah"
parseConc :: Producer T.Text IO ()
-> Parser a
-> Parser b
-> IO (Either P.ParsingError a,Either P.ParsingError b)
parseConc producer parser1 parser2 = do
(outbox1,inbox1,seal1) <- P.spawn' P.Unbounded
(outbox2,inbox2,seal2) <- P.spawn' P.Unbounded
feeding <- A.async $ runEffect $ producer >-> P.tee (P.toOutput outbox1)
>-> P.toOutput outbox2
sealing <- A.async $ A.wait feeding >> P.atomically seal1 >> P.atomically seal2
r <- A.runConcurrently $
(,) <$> (A.Concurrently $ parseInbox parser1 inbox1)
<*> (A.Concurrently $ parseInbox parser2 inbox2)
A.wait sealing
return r
where
parseInbox parser inbox = evalStateT (P.parse parser) (P.fromInput inbox)
main :: IO ()
main = do
(Right a, Right b) <- parseConc textSource (parseChars 'o') (parseChars 'a')
putStrLn . show $ (a,b)
The result is:
("oooo","aa")
I'm not sure how much overhead this approach introduces.

I think something is wrong with the way you are going about this, for the reasons Davorak mentions in his remark. But if you really need such a function, you can define it.
import Pipes.Internal
import Pipes.Core
zipConsumers :: Monad m => Consumer a m r -> Consumer a m s -> Consumer a m (r,s)
zipConsumers p q = go (p,q) where
go (p,q) = case (p,q) of
(Pure r , Pure s) -> Pure (r,s)
(M mpr , ps) -> M (do pr <- mpr
return (go (pr, ps)))
(pr , M mps) -> M (do ps <- mps
return (go (pr, ps)))
(Request _ f, Request _ g) -> Request () (\a -> go (f a, g a))
(Request _ f, Pure s) -> Request () (\a -> do r <- f a
return (r, s))
(Pure r , Request _ g) -> Request () (\a -> do s <- g a
return (r,s))
(Respond x _, _ ) -> closed x
(_ , Respond y _) -> closed y
If you are 'zipping' consumers without using their return value, only their 'effects' you can just use tee consumer1 >-> consumer2

The idiomatic solution is to rewrite your Consumers as a Fold or FoldM from the foldl library and then combine them using Applicative style. You can then convert this combined fold to one that works on pipes.
Let's assume that you either have two Folds:
fold1 :: Fold a r1
fold2 :: Fold a r2
... or two FoldMs:
foldM1 :: Monad m => FoldM a m r1
foldM2 :: Monad m => FoldM a m r2
Then you combine these into a single Fold/FoldM using Applicative style:
import Control.Applicative
foldBoth :: Fold a (r1, r2)
foldBoth = (,) <$> fold1 <*> fold2
foldBothM :: Monad m => FoldM a m (r1, r2)
foldBothM = (,) <$> foldM1 <*> foldM2
-- or: foldBoth = liftA2 (,) fold1 fold2
-- foldMBoth = liftA2 (,) foldM1 foldM2
You can turn either fold into a Pipes.Prelude-style fold or a Parser. Here are the necessary conversion functions:
import Control.Foldl (purely, impurely)
import qualified Pipes.Prelude as Pipes
import qualified Pipes.Parse as Parse
purely Pipes.fold
:: Monad m => Fold a b -> Producer a m () -> m b
impurely Pipes.foldM
:: Monad m => FoldM m a b -> Producer a m () -> m b
purely Parse.foldAll
:: Monad m => Fold a b -> Parser a m r
impurely Parse.foldMAll
:: Monad m => FoldM a m b -> Parser a m r
The reason for the purely and impurely functions is so that foldl and pipes can interoperate without either one incurring a dependency on the other. Also, they allow libraries other than pipes (like conduit) to reuse foldl without a dependency, too (Hint hint, #MichaelSnoyman).
I apologize that this feature is not documented, mainly because it took me a while to figure out how to get pipes and foldl to interoperate in a dependency-free manner, and that was after I wrote the pipes tutorial. I will update the tutorial to point out this trick.
To learn how to use foldl, just read the documentation in the main module. It's a very small and easy-to-learn library.

For what it's worth, in the conduit world, the relevant function is zipSinks. There might be some way to adapt this function to work for pipes, but automatic termination may get in the way.

Consumer forms a Monad so
combineConsumers = liftM2 (,)
will type check. Unfortunately, the semantics might be unlike what you're expecting: the first consumer will run to completion and then the second.

Is there a way how to enumerate all functions in a module using Template Haskell?

While I can use reify to get information about most other syntactic constructs, I couldn't find anything that would give some information about a module.

Unfortunately Template Haskell currently has no such capabilities. All the solutions involve parsing of the module's source-code. However the location and loc_filename functions of TH make it easy to locate the module with the calling splice.
Here is a solution extracted from the source code of one of my projects:
{-# LANGUAGE LambdaCase, TupleSections #-}
import Language.Haskell.TH
import qualified Data.Attoparsec.Text as AP
import qualified Data.Text.IO as Text
import qualified Data.Text as Text
import qualified Data.Char as Char
import Data.Maybe
import Data.List
import Control.Applicative
import Data.Traversable
import Prelude hiding (mapM)
reifyLocalFunctions :: Q [(Name, Type)]
reifyLocalFunctions =
listTopLevelFunctionLikeNames >>=
mapM (\name -> reifyFunction name >>= mapM (return . (name, ))) >>=
return . catMaybes
where
listTopLevelFunctionLikeNames = do
loc <- location
text <- runIO $ Text.readFile $ loc_filename loc
return $ map (mkName . Text.unpack) $ nub $ parse text
where
parse text =
either (error . ("Local function name parsing failure: " ++)) id $
AP.parseOnly parser text
where
parser =
AP.sepBy (optional topLevelFunctionP <* AP.skipWhile (not . AP.isEndOfLine))
AP.endOfLine >>=
return . catMaybes
where
topLevelFunctionP = do
head <- AP.satisfy Char.isLower
tail <- many (AP.satisfy (\c -> Char.isAlphaNum c || c `elem` ['_', '\'']))
return $ Text.pack $ head : tail
reifyFunction :: Name -> Q (Maybe Type)
reifyFunction name = do
tryToReify name >>= \case
Just (VarI _ t _ _) -> return $ Just $ t
_ -> return Nothing
tryToReify :: Name -> Q (Maybe Info)
tryToReify n = recover (return Nothing) (fmap Just $ reify n)

Haskell enumerators, odd errors

I'm trying to figure out how enumerators work, and therefore testing the enumerator library. I have a snippet which compiles on my desktop computer, but complains about No instance for MonadIO. Am I way off on how to use the enumerator library or is something amiss with my laptop?
iterateetests.hs:29:17:
No instance for (MonadIO (Iteratee Int IO))
arising from a use of `enumeratorFile' at iterateetests.hs:29:17-32
Possible fix:
add an instance declaration for (MonadIO (Iteratee Int IO))
In the first argument of `(==<<)', namely `enumeratorFile h'
In the first argument of `run_', namely
`(enumeratorFile h ==<< summer)'
In the expression: run_ (enumeratorFile h ==<< summer)
And the code
import Data.Enumerator
import qualified Data.Enumerator.List as EL
import System.IO
import Control.Exception.Base
import Control.Monad.Trans
summer :: (Monad m) => Iteratee Int m Int
summer = do
m <- EL.head
case m of
Nothing -> return 0
Just i -> do
rest <- summer
return (i+rest)
enumeratorFile h (Continue k) = do
e <- liftIO (hIsEOF h)
if e
then k EOF
else do
l <- liftIO $ hGetLine h
k (Chunks [read l]) >>== enumeratorFile h
enumeratorFile _ step = returnI step
main = do
bracket
(openFile "numberlist" ReadMode)
(hClose)
(\h -> run_ (enumeratorFile h ==<< summer))

Try changing the import of:
import Control.Monad.Trans
to
import Control.Monad.IO.Class
It may be that you have an older version of mtl installed, and therefore have different MonadIO typeclasses between Control.Monad.Trans and Data.Enumerator.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

catching state that is changed by execStateT - haskell

Related

Efficient Haskell equivalent to NumPy's argsort

Haskell Hashtable Performance

Join two consumers into a single consumer that returns multiple values?

Is there a way how to enumerate all functions in a module using Template Haskell?

Haskell enumerators, odd errors

Categories

Resources