maxIndex of an MVector - haskell

Data.Vector includes a function maxIndex with type maxIndex :: (Ord a) => Vector a -> Int that returns the index of the maximum value in that Vector. I'm working with mutable Vectors, however, and MVector doesn't have maxIndex defined for it.
What's the best way of getting the data I want out of the MVector I have? My code currently is:
import qualified Data.Vector.Unboxed.Mutable as MV
import Control.Monad.ST
import Control.Monad (mapM_)
type MaxIndex = Int
step :: forall s. MV.MVector s Int -> MaxIndex -> ST s ()
step vec i = do
n <- MV.unsafeRead vec i
MV.write vec i 0
let l = MV.length vec
(k, x) = n `divMod` l
mapM_ (\j -> MV.modify vec (+k) j) [0..l-1] -- side note, this is just
-- fmap (+k) vec, but MVector is not
-- a functor. Is there a better way?
mapM_ (\j -> MV.modify vec (+1) (j `mod` l)) [i+1..i+x]
where i is the index I'm looking to derive inside step. I'm doing this because the actions here need to eventually be wrapped inside an until and repeated until a predicate is satisfied, and freezing and thawing every cycle sounds ludicrously expensive.

I see lots of talk about unsafe freezing which seems suspect since you plan to mutate this memory later, thus violating the assurance you are implicitly giving when calling unsafeFreeze.
My suggestion is to just write an imperative-style maxIndex function. The below is typed but not tested:
import qualified Data.Vector.Unboxed.Mutable as MV
import Control.Monad.ST
import Control.Monad (mapM_)
maxIndex :: (Ord a, MV.Unbox a) => MV.MVector s a -> ST s (Maybe Int)
maxIndex mv | len == 0 = pure Nothing
| otherwise = Just <$> go 0 0
where
len = MV.length mv
go n i | i >=len = pure n
| otherwise = do
nVal <- MV.unsafeRead mv n
iVal <- MV.unsafeRead mv i
if nVal < iVal then go i (i+1)
else go n (i+1)

Have you considered freezing the vector with unsafeFreeze which is supposed to be fast (i.e. Θ(1))? For example you can define maxIndex for mutable vectors like this:
maxIndex = fmap V.maxIndex . V.unsafeFreeze
This assumes that you have imported the following:
import qualified Data.Vector.Unboxed as V
unsafeFreeze doesn't actually copy any data and should be fast, but it would be interesting to run a criterion benchmark to see if this approach is actually faster compared to an explicit loop.

Related

Creating a random permutation of 1..N with Data.Vector.Unboxed.Mutable

I want to create a list containing a random permutation of the numbers 1 through N. As I understand it, it is possible to use VUM.swap in the runST, but since I need random numbers as well I figured I might do both in the IO monad.
The code below yields:
Expected type: IO (VU.Vector Int), Actual type: IO (VU.Vector
(VU.Vector a0))
for the return statement.
import qualified Data.Vector.Unboxed as VU
import qualified Data.Vector.Unboxed.Mutable as VUM
import System.Random
randVector :: Int -> IO (VU.Vector Int)
randVector n = do
vector <- VU.unsafeThaw $ VU.enumFromN 1 n
VU.forM_ (VU.fromList [2..VUM.length vector]) $ \i -> do
j <- randomRIO(0, i) :: IO Int
VUM.swap vector i j
return $ VU.unsafeFreeze vector
I'm not quite sure why the return vector is nested. Do I have to use VU.fold1M_ instead?
unsafeFreeze vector already returns IO (VU.Vector Int). Just change the last line to VU.unsafeFreeze vector.
On another note, you should iterate until VUM.length vector - 1, since both [x .. y] and randomRIO use inclusive ranges. Also, you can use plain forM_ here for iteration, since you only care about side effects.
import Control.Monad
import qualified Data.Vector.Unboxed as VU
import qualified Data.Vector.Unboxed.Mutable as VUM
import System.Random
randVector :: Int -> IO (VU.Vector Int)
randVector n = do
vector <- VU.unsafeThaw $ VU.enumFromN 1 n
forM_ [2..VUM.length vector - 1] $ \i -> do
j <- randomRIO(0, i) :: IO Int
VUM.swap vector i j
VU.unsafeFreeze vector
I looked at the generated code, and it seems that with GHC 7.10.3 forM_ compiles to an efficient loop while VU.forM_ retains the intermediate list and is surely significantly slower (which was my expected outcome for forM_, but I was unsure about VU.forM_).
I would try (note update at end):
import Control.Monad
randVector :: Int -> IO (VU.Vector Int)
randVector n = do
vector <- VU.unsafeThaw $ VU.enumFromN 1 n
forM_ [2..VUM.length vector] $ \i -> do
j <- randomRIO(0, i) :: IO Int
VUM.swap vector i j
return $ VU.unsafeFreeze vector
Edit: as #András Kovács pointed out, you don't want the return at the end so the last line should be:
VU.unsafeFreeze vector

Haskell: importing (infix) Data constructors from user added libraries

This is a simple question, but I cannot find the way to use the PSQ library.
The code below is messy, but seems to find PSQ and fromList, but fails to find Binding (Error: Not in scope: data constructor 'Data.PSQueue.Binding'). LearnYouAHaskell does not cover how to use non-standard libraries and I can't find any simple examples that just show PSQ being implemented.
import qualified Data.PSQueue (Binding, PSQ, fromList)
{-
data Binding k p
k :-> p binds the key k with the priority p.
Constructors
k :-> p
data PSQ k p
A mapping from keys k to priorites p.
-}
type VertHeap = Data.PSQueue.PSQ Int Int
main = do
--fromList :: (Ord k, Ord p) => [Binding k p] -> PSQ k p
return $ Data.PSQueue.fromList $ map (\k -> Data.PSQueue.Binding k 1000000) [2..10]
It can be easy to miss, but the data constructor for the Binding type is :->.
So this import should work:
import qualified Data.PSQueue (PSQ,Binding(..),fromList)
and later:
return $ Data.PSQueue.fromList $ map (\k -> k Data.PSQueue.:-> 1000000) [2..10]
Using Binding(..) will import all of the data constructors for the Binding data type.
Edit: :-> is just an infix operator defined by Data.PSQueue. Data.PSQueue.:-> is the fully qualified name for it.
Once I understood how to refer to Binding, I could use a more familiar pattern
import qualified Data.PSQueue as PSQ
type VertHeap = PSQ.PSQ Int Int
main = do
return $ PSQ.fromList $ map (\k -> k PSQ.:-> 1000000) [2..10]

Creating a mutable Data.Vector in Haskell

I wish to create a mutable vector using Data.Vector.Generic.Mutable.new. I have found examples that create a mutable vector by thawing a pure vector, but that's not what I wish to do.
Here is one of many failed attempts:
import Control.Monad.Primitive
import qualified Data.Vector.Generic.Mutable as GM
main = do
v <- (GM.new 10) :: (GM.MVector v a) => IO (v RealWorld a)
GM.write v 0 (3::Int)
x <- GM.read v 0
putStrLn $ show x
giving me the error
No instance for (GM.MVector v0 Int)
arising from an expression type signature
Possible fix: add an instance declaration for (GM.MVector v0 Int)
I tried variations based on the Haskell Vector tutorial with no luck.
I would also welcome suggestion on cleaner ways to construct the vector. The reference to RealWorld seems ugly to me.
The GM.MVector v a constaint is ambigous in v. In other words, from the type information you've given GHC, it still can't figure out what specific instance of GM.MVector you want it to use. For a mutable vector of Int use Data.Vector.Unboxed.Mutable.
import qualified Data.Vector.Unboxed.Mutable as M
main = do
v <- M.new 10
M.write v 0 (3 :: Int)
x <- M.read v 0
print x
I think the problem is that you have to give v a concrete type -- like this:
import Control.Monad.Primitive
import qualified Data.Vector.Mutable as V
import qualified Data.Vector.Generic.Mutable as GM
main = do
v <- GM.new 10 :: IO (V.MVector RealWorld Int)
GM.write v 0 (3::Int)
x <- GM.read v 0
putStrLn $ show x

"Persistently" Impure (IO) Vectors in Haskell, with database-like persistent interface

I have a computation that is best described as iterative mutations on a vector; the final result is the final state of the vector.
The "idiomatic" approach to making this functional, I think, is to simply pass on a new vector object along whenever it is "modified". So your iterative method would be operate_on_vector :: Vector -> Vector, which takes in a vector and outputs the modified vector, which is then fed through the method again.
This method is pretty straightforward and I had no problems implementing it, even being new to Haskell.
Alternatively, one could encapsulate all of this in a State monad and pass along a constantly re-created and modified vector as the state value.
However, I suffer a huge, huge performance cost, as these calculations are pretty intensive, the iterations many (on the order of millions) and the data vectors can get pretty large (on the order of at least thousands of primitives). Re-creating a new vector in memory at every step of the iteration seems pretty costly, data collection or not.
Then I considered how IO works -- it can be seen as basically like State, except the state value is the "World", which is constantly changing.
Maybe I could use something that is like IO to "operate" on a "world"? And the "world" would be the vector in-memory? Sort of like a database query, but everything is in memory.
For example with io you could do
do
putStrLn "enter something"
something <- getLine
putStrLine $ "you entered " ++ something
which can be seen as "performing" putStrLn and "modifying" the World object, returning a new World object and feeding it into the next function, which queryies the world object for a string that is the result of the modification, and then returns another world object after another modification.
Is there anything like that that can do this for mutable vectors?
do
putInVec 0 9 -- index 0, value 9
val <- getFromVec 0
putInVec 0 (val + 1)
, with "impure" "mutable" vectors, instead of passing along a new modified vector at each step.
I believe you can do this using mutable vector and a thin wrapper over Reader + ST (or IO) monad.
It can look like this:
type MyVector = IOVector $x -- Use your own elements type here instead of $x
newtype VectorIO a = VectorIO (ReaderT MyVector IO a) deriving (Monad, MonadReader, MonadIO)
-- You will need GeneralizedNewtypeDeriving extension here
-- Run your computation over an existing vector
runComputation :: MyVector -> VectorIO a -> IO MyVector
runComputation vector (VectorIO action) = runReaderT action vector >> return vector
-- Run your computation over a new vector of the specified length
runNewComputation :: Int -> VectorIO a -> IO MyVector
runNewComputation n action = do
vector <- new n
runComputation vector action
putInVec :: Int -> $x -> VectorIO ()
putInVec idx val = do
v <- ask
liftIO $ write v idx val
getFromVec :: Int -> VectorIO $x
getFromVec idx = do
v <- ask
liftIO $ read v idx
That's really all. You can use VectorIO monad to perform your computations, just like you wanted in your example. If you do not want IO but want pure computations, you can use ST monad; modifications to the code above will be trivial.
Update
Here is an ST-based version:
{-# LANGUAGE GeneralizedNewtypeDeriving, FlexibleInstances, MultiParamTypeClasses, Rank2Types #-}
module Main where
import Control.Monad
import Control.Monad.Trans.Class
import Control.Monad.Reader
import Control.Monad.Reader.Class
import Control.Monad.ST
import Data.Vector as V
import Data.Vector.Mutable as MV
-- Your type of the elements
type E = Int
-- Mutable vector which will be used as a context
type MyVector s = MV.STVector s E
-- Immutable vector compatible with MyVector in its type
type MyPureVector = V.Vector E
-- Simple monad stack consisting of a reader with the mutable vector as a context
-- and of an ST action
newtype VectorST s a = VectorST (ReaderT (MyVector s) (ST s) a) deriving Monad
-- Make the VectorST a reader monad
instance MonadReader (MyVector s) (VectorST s) where
ask = VectorST $ ask
local f (VectorST a) = VectorST $ local f a
reader = VectorST . reader
-- Lift an ST action to a VectorST action
liftST :: ST s a -> VectorST s a
liftST = VectorST . lift
-- Run your computation over an existing vector
runComputation :: MyVector s -> VectorST s a -> ST s (MyVector s)
runComputation vector (VectorST action) = runReaderT action vector >> return vector
-- Run your computation over a new vector of the specified length
runNewComputation :: Int -> VectorST s a -> ST s (MyVector s)
runNewComputation n action = do
vector <- MV.new n
runComputation vector action
-- Run a computation on a new mutable vector and then freeze it to an immutable one
runComputationPure :: Int -> (forall s. VectorST s a) -> MyPureVector
runComputationPure n action = runST $ do
vector <- runNewComputation n action
V.unsafeFreeze vector
-- Put an element into the current vector
putInVec :: Int -> E -> VectorST s ()
putInVec idx val = do
v <- ask
liftST $ MV.write v idx val
-- Retrieve an element from the current vector
getFromVec :: Int -> VectorST s E
getFromVec idx = do
v <- ask
liftST $ MV.read v idx

Monadic creation of vectors (or: can someone type annotate this for me?)

I came across the following piece of code as part of this Redddit sub-thread discussing an implementation of the Fisher-Yates shuffle:
randomIs g n = fill g 0
where
v = enumFromN 0 n
fill g i = when (i < n) $ do
let (x,g') = randomR (i, n-1) g
G.swap v i x
fill g' (i+1)
(I guess G refers to Data.Vector.Generic.Mutable... right?). Having never created vectors monadically before, I'm struggling to grasp this, especially with no type annotations. Doesn't v have type Data.Vector Int? How come one can pass it to G.swap then? Won't it have to be thawed first?
I might have just misunderstood Data.Vector.Generic, but if someone could clarify the above (by adding type annotations, perhaps?), I'd appreciate it.
Addendum: Here's my own attempt at adding type annotations:
import qualified Data.Vector.Unboxed as UVect
import qualified Data.Vector.Unboxed.Mutable as UMVect
import qualified System.Random as R
import Control.Monad
import Control.Monad.ST
randomPermutation :: forall a. (R.RandomGen a) => a -> Int -> UVect.Vector Int
randomPermutation g n = runST newVect
where
newVect :: ST s (UVect.Vector Int)
newVect = UVect.unsafeThaw (UVect.enumFromN 0 n) >>= \v ->
fill v 0 g >>
UVect.unsafeFreeze v
fill x i gen = when (i < n) $
let (j, gen') = R.randomR (i, n-1) gen in
UMVect.unsafeSwap x i j >>
fill x (i+1) gen'
As you can see, I'm avoiding Data.Vector.Generic to rule out the error source caused by perhaps not understanding it right. I'm also doing things in the ST monad.
In my head, the type of fill should be
UMVect.MVector (ST s (UVect.Vector Int)) Int -> Int -> a -> ST s ()
but GHC objects. Any hints? Again: It typechecks if I don't annotate fill.
Sidenote: I'd also like randomPermutation to return the updated random number generator. Thus, I'd need fill to also handle the generator's state. With my current type confusion, I don't see how to do that neatly. Any hints?
The compile error is telling us:
Expected type: ST s (UMVect.MVector (ST s (UVect.Vector Int)) Int)
Actual type: ST s (UMVect.MVector (Control.Monad.Primitive.PrimState (ST s)) Int)
So, changing the type signature of fill to UMVect.MVector (PrimState (ST s)) Int -> Int -> a -> ST s () (adding import Control.Monad.Primitive too) solves the problem!

Resources