Creating a mutable Data.Vector in Haskell

I wish to create a mutable vector using Data.Vector.Generic.Mutable.new. I have found examples that create a mutable vector by thawing a pure vector, but that's not what I wish to do.
Here is one of many failed attempts:
import Control.Monad.Primitive
import qualified Data.Vector.Generic.Mutable as GM
main = do
    v <- (GM.new 10) :: (GM.MVector v a) => IO (v RealWorld a)
    GM.write v 0 (3::Int)
    x <- GM.read v 0
    putStrLn $ show x
giving me the error
No instance for (GM.MVector v0 Int)
arising from an expression type signature
Possible fix: add an instance declaration for (GM.MVector v0 Int)
I tried variations based on the Haskell Vector tutorial with no luck.
I would also welcome suggestions on cleaner ways to construct the vector. The reference to RealWorld seems ugly to me.

The GM.MVector v a constraint is ambiguous in v. In other words, from the type information you've given GHC, it still can't figure out which specific instance of GM.MVector you want it to use. For a mutable vector of Int, use Data.Vector.Unboxed.Mutable.
import qualified Data.Vector.Unboxed.Mutable as M
main = do
    v <- M.new 10
    M.write v 0 (3 :: Int)
    x <- M.read v 0
    print x

I think the problem is that you have to give v a concrete type -- like this:
import Control.Monad.Primitive
import qualified Data.Vector.Mutable as V
import qualified Data.Vector.Generic.Mutable as GM
main = do
    v <- GM.new 10 :: IO (V.MVector RealWorld Int)
    GM.write v 0 (3::Int)
    x <- GM.read v 0
    putStrLn $ show x
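
On the question's side note about RealWorld: Data.Vector.Mutable also exports the type synonym IOVector, which is MVector RealWorld, so the annotation does not have to name RealWorld at all. A minimal sketch, assuming a boxed vector is acceptable (the helper name firstSlot is made up for illustration):

```haskell
import qualified Data.Vector.Mutable as MV

-- Hypothetical helper: allocate a vector, write one slot, read it back.
firstSlot :: IO Int
firstSlot = do
    -- IOVector is a synonym for MVector RealWorld
    v <- MV.new 10 :: IO (MV.IOVector Int)
    MV.write v 0 3
    MV.read v 0
```

The unboxed module has an IOVector synonym of its own, so the same annotation style should carry over to Data.Vector.Unboxed.Mutable.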

Related

maxIndex of an MVector

Data.Vector includes a function maxIndex with type maxIndex :: (Ord a) => Vector a -> Int that returns the index of the maximum value in that Vector. I'm working with mutable Vectors, however, and MVector doesn't have maxIndex defined for it.
What's the best way of getting the data I want out of the MVector I have? My code currently is:
import qualified Data.Vector.Unboxed.Mutable as MV
import Control.Monad.ST
import Control.Monad (mapM_)
type MaxIndex = Int
step :: forall s. MV.MVector s Int -> MaxIndex -> ST s ()
step vec i = do
    n <- MV.unsafeRead vec i
    MV.write vec i 0
    let l = MV.length vec
        (k, x) = n `divMod` l
    mapM_ (\j -> MV.modify vec (+k) j) [0..l-1] -- side note, this is just
                                                -- fmap (+k) vec, but MVector is not
                                                -- a functor. Is there a better way?
    mapM_ (\j -> MV.modify vec (+1) (j `mod` l)) [i+1..i+x]
where i is the index I'm looking to derive inside step. I'm doing this because the actions here need to eventually be wrapped inside an until and repeated until a predicate is satisfied, and freezing and thawing every cycle sounds ludicrously expensive.

I see lots of talk about unsafe freezing which seems suspect since you plan to mutate this memory later, thus violating the assurance you are implicitly giving when calling unsafeFreeze.
My suggestion is to just write an imperative-style maxIndex function. The below is typed but not tested:
import qualified Data.Vector.Unboxed.Mutable as MV
import Control.Monad.ST
import Control.Monad (mapM_)
maxIndex :: (Ord a, MV.Unbox a) => MV.MVector s a -> ST s (Maybe Int)
maxIndex mv | len == 0  = pure Nothing
            | otherwise = Just <$> go 0 0
  where
    len = MV.length mv
    go n i | i >= len  = pure n
           | otherwise = do
               nVal <- MV.unsafeRead mv n
               iVal <- MV.unsafeRead mv i
               if nVal < iVal then go i (i+1)
                              else go n (i+1)

Have you considered freezing the vector with unsafeFreeze which is supposed to be fast (i.e. Θ(1))? For example you can define maxIndex for mutable vectors like this:
maxIndex = fmap V.maxIndex . V.unsafeFreeze
This assumes that you have imported the following:
import qualified Data.Vector.Unboxed as V
unsafeFreeze doesn't actually copy any data and should be fast, but it would be interesting to run a criterion benchmark to see if this approach is actually faster compared to an explicit loop.
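
One caveat with the one-liner above, sketched here under the assumption that the code runs in strict ST: fmap leaves V.maxIndex as an unevaluated thunk, so if the vector is mutated before the index is demanded, the result may reflect the later contents. Forcing the result before returning avoids that. The names maxIndexM and demo are made up for illustration, V.maxIndex still fails on an empty vector, and the frozen view must not be used again once the mutable vector is written to.

```haskell
import Control.Monad.ST (ST, runST)
import qualified Data.Vector.Unboxed as V
import qualified Data.Vector.Unboxed.Mutable as MV

-- Hypothetical helper: index of the maximum, computed before returning.
maxIndexM :: (Ord a, V.Unbox a) => MV.MVector s a -> ST s Int
maxIndexM mv = do
    frozen <- V.unsafeFreeze mv      -- no copy; shares memory with mv
    return $! V.maxIndex frozen      -- ($!) forces the index right here

-- The index is taken before the later write, so the write cannot affect it.
demo :: Int
demo = runST $ do
    mv <- V.thaw (V.fromList [3, 9, 2 :: Int])
    i  <- maxIndexM mv
    MV.write mv 0 100
    return i
```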

Efficient Haskell equivalent to NumPy's argsort

Is there a standard Haskell equivalent to NumPy's argsort function?
I'm using HMatrix and, so, would like a function compatible with Vector R which is an alias for Data.Vector.Storable.Vector Double. The argSort function below is the implementation I'm currently using:
{-# LANGUAGE NoImplicitPrelude #-}
module Main where
import qualified Data.List as L
import qualified Data.Vector as V
import qualified Data.Vector.Storable as VS
import Prelude (($), Double, IO, Int, compare, print, snd)
a :: VS.Vector Double
a = VS.fromList [40.0, 20.0, 10.0, 11.0]
argSort :: VS.Vector Double -> V.Vector Int
argSort xs = V.fromList (L.map snd $ L.sortBy (\(x0, _) (x1, _) -> compare x0 x1) (L.zip (VS.toList xs) [0..]))
main :: IO ()
main = print $ argSort a -- yields [2,3,1,0]
I'm using explicit qualified imports just to make it clear where every type and function is coming from.
This implementation is not terribly efficient since it converts the input vector to a list and the result back to a vector. Does something like this (but more efficient) exist somewhere?
Update
@leftaroundabout had a good solution. This is the solution I ended up with:
module LAUtil.Sorting
    ( IndexVector
    , argSort
    )
    where
import Control.Monad
import Control.Monad.ST
import Data.Ord
import qualified Data.Vector.Algorithms.Intro as VAI
import qualified Data.Vector.Storable as VS
import qualified Data.Vector.Unboxed as VU
import qualified Data.Vector.Unboxed.Mutable as VUM
import Numeric.LinearAlgebra
type IndexVector = VU.Vector Int
argSort :: Vector R -> IndexVector
argSort xs = runST $ do
    let l = VS.length xs
    t0 <- VUM.new l
    forM_ [0..l - 1] $
        \i -> VUM.unsafeWrite t0 i (i, (VS.!) xs i)
    VAI.sortBy (comparing snd) t0
    t1 <- VUM.new l
    forM_ [0..l - 1] $
        \i -> VUM.unsafeRead t0 i >>= \(x, _) -> VUM.unsafeWrite t1 i x
    VU.freeze t1
This is more directly usable with Numeric.LinearAlgebra since the data vector is a Storable. This uses an unboxed vector for the indices.

Use vector-algorithms:
import Data.Ord (comparing)
import qualified Data.Vector.Unboxed as VU
import qualified Data.Vector.Algorithms.Intro as VAlgo
argSort :: (Ord a, VU.Unbox a) => VU.Vector a -> VU.Vector Int
argSort xs = VU.map fst $ VU.create $ do
    xsi <- VU.thaw $ VU.indexed xs
    VAlgo.sortBy (comparing snd) xsi
    return xsi
Note these are Unboxed rather than Storable vectors. The latter need to make some tradeoffs to allow impure C FFI operations and can't properly handle heterogeneous tuples. You can of course always convert to and from storable vectors.
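
As a sketch of the conversion mentioned above, the same approach can take a storable vector by converting it once up front with VS.convert. This assumes the vector-algorithms package is installed; the name argSortStorable is made up for illustration, and the conversion copies the input.

```haskell
import Data.Ord (comparing)
import qualified Data.Vector.Algorithms.Intro as VAlgo
import qualified Data.Vector.Storable as VS
import qualified Data.Vector.Unboxed as VU

-- Hypothetical wrapper: argsort for a storable vector of Doubles.
argSortStorable :: VS.Vector Double -> VU.Vector Int
argSortStorable xs = VU.map fst $ VU.create $ do
    -- Pair every element with its index, then sort the pairs by value.
    xsi <- VU.thaw (VU.indexed (VS.convert xs))
    VAlgo.sortBy (comparing snd) xsi
    return xsi
```

With the input from the question, VS.fromList [40.0, 20.0, 10.0, 11.0], this gives the indices 2, 3, 1, 0.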

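What worked better for me was using Data.Map, which gave me a speed-up; I believe this is because the code is subject to list fusion. Here n = length xs.
{-# LANGUAGE BangPatterns #-}
import qualified Data.List as L
import Data.Map as M (toList, fromList, toAscList)
out :: Int -> [Double] -> [Int]
out n !xs = let !a = M.toAscList (M.fromList $! zip xs [0..n])
                !res = a `seq` L.map snd a
            in res
However, this is only applicable to periodic lists. Because Map keys are unique, repeated values collapse into a single entry, so lists that differ only in where the repeats occur give the same result:
out 12 [1,2,3,4,1,2,3,4,1,2,3,4] == out 12 [1,2,3,4,1,3,2,4,1,2,3,4]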

Creating a random permutation of 1..N with Data.Vector.Unboxed.Mutable

I want to create a list containing a random permutation of the numbers 1 through N. As I understand it, it is possible to use VUM.swap in the runST, but since I need random numbers as well I figured I might do both in the IO monad.
The code below yields:
Expected type: IO (VU.Vector Int), Actual type: IO (VU.Vector
(VU.Vector a0))
for the return statement.
import qualified Data.Vector.Unboxed as VU
import qualified Data.Vector.Unboxed.Mutable as VUM
import System.Random
randVector :: Int -> IO (VU.Vector Int)
randVector n = do
    vector <- VU.unsafeThaw $ VU.enumFromN 1 n
    VU.forM_ (VU.fromList [2..VUM.length vector]) $ \i -> do
        j <- randomRIO(0, i) :: IO Int
        VUM.swap vector i j
    return $ VU.unsafeFreeze vector
I'm not quite sure why the return vector is nested. Do I have to use VU.fold1M_ instead?

unsafeFreeze vector already returns IO (VU.Vector Int). Just change the last line to VU.unsafeFreeze vector.
On another note, you should iterate until VUM.length vector - 1, since both [x .. y] and randomRIO use inclusive ranges. Also, you can use plain forM_ here for iteration, since you only care about side effects.
import Control.Monad
import qualified Data.Vector.Unboxed as VU
import qualified Data.Vector.Unboxed.Mutable as VUM
import System.Random
randVector :: Int -> IO (VU.Vector Int)
randVector n = do
    vector <- VU.unsafeThaw $ VU.enumFromN 1 n
    forM_ [2..VUM.length vector - 1] $ \i -> do
        j <- randomRIO(0, i) :: IO Int
        VUM.swap vector i j
    VU.unsafeFreeze vector
I looked at the generated code, and it seems that with GHC 7.10.3 forM_ compiles to an efficient loop while VU.forM_ retains the intermediate list and is surely significantly slower (which was my expected outcome for forM_, but I was unsure about VU.forM_).
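
For reference, here is a sketch of the same loop written as a textbook Fisher-Yates pass. Two things in it are my assumptions rather than part of the answer above: the loop starts at index 1 (starting at 2 means positions 0 and 1 are never shuffled against each other in an initial step, which I believe biases the result), and it uses VU.thaw instead of VU.unsafeThaw to stay on the safe side at the cost of one copy.

```haskell
import Control.Monad (forM_)
import qualified Data.Vector.Unboxed as VU
import qualified Data.Vector.Unboxed.Mutable as VUM
import System.Random (randomRIO)

-- Random permutation of 1..n; both ranges below are inclusive.
randVector :: Int -> IO (VU.Vector Int)
randVector n = do
    vector <- VU.thaw (VU.enumFromN 1 n)
    forM_ [1 .. VUM.length vector - 1] $ \i -> do
        j <- randomRIO (0, i)        -- j is an Int because i is
        VUM.swap vector i j
    VU.unsafeFreeze vector
```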

I would try (note update at end):
import Control.Monad
randVector :: Int -> IO (VU.Vector Int)
randVector n = do
    vector <- VU.unsafeThaw $ VU.enumFromN 1 n
    forM_ [2..VUM.length vector] $ \i -> do
        j <- randomRIO(0, i) :: IO Int
        VUM.swap vector i j
    return $ VU.unsafeFreeze vector
Edit: as @András Kovács pointed out, you don't want the return at the end so the last line should be:
    VU.unsafeFreeze vector

Haskell: importing (infix) Data constructors from user added libraries

This is a simple question, but I cannot find the way to use the PSQ library.
The code below is messy; it seems to find PSQ and fromList, but fails to find Binding (Error: Not in scope: data constructor 'Data.PSQueue.Binding'). Learn You a Haskell does not cover how to use non-standard libraries, and I can't find any simple examples that just show PSQ being used.
import qualified Data.PSQueue (Binding, PSQ, fromList)
{-
data Binding k p
k :-> p binds the key k with the priority p.
Constructors
k :-> p
data PSQ k p
A mapping from keys k to priorities p.
-}
type VertHeap = Data.PSQueue.PSQ Int Int
main = do
    --fromList :: (Ord k, Ord p) => [Binding k p] -> PSQ k p
    return $ Data.PSQueue.fromList $ map (\k -> Data.PSQueue.Binding k 1000000) [2..10]

It can be easy to miss, but the data constructor for the Binding type is :->.
So this import should work:
import qualified Data.PSQueue (PSQ,Binding(..),fromList)
and later:
return $ Data.PSQueue.fromList $ map (\k -> k Data.PSQueue.:-> 1000000) [2..10]
Using Binding(..) will import all of the data constructors for the Binding data type.
Edit: :-> is an infix data constructor defined by Data.PSQueue (constructor operators always start with a colon). Data.PSQueue.:-> is its fully qualified name.
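
To illustrate the naming rule without installing the library, here is a sketch with a locally defined stand-in for Binding. The type below is a hypothetical copy for illustration, not the PSQueue definition itself, and the names bindings and keyOf are made up as well. It shows that an infix constructor is built and pattern-matched like any other constructor.

```haskell
-- Hypothetical stand-in for Data.PSQueue's Binding type.
data Binding k p = k :-> p
    deriving (Show, Eq)

-- Build bindings with the infix constructor, as in the answer above.
bindings :: [Binding Int Int]
bindings = map (\k -> k :-> 1000000) [2 .. 10]

-- The constructor can also appear in patterns.
keyOf :: Binding k p -> k
keyOf (k :-> _) = k
```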

Once I understood how to refer to Binding, I could use a more familiar pattern
import qualified Data.PSQueue as PSQ
type VertHeap = PSQ.PSQ Int Int
main = do
    return $ PSQ.fromList $ map (\k -> k PSQ.:-> 1000000) [2..10]

Monadic creation of vectors (or: can someone type annotate this for me?)

I came across the following piece of code as part of this Reddit sub-thread discussing an implementation of the Fisher-Yates shuffle:
randomIs g n = fill g 0
  where
    v = enumFromN 0 n
    fill g i = when (i < n) $ do
        let (x,g') = randomR (i, n-1) g
        G.swap v i x
        fill g' (i+1)
(I guess G refers to Data.Vector.Generic.Mutable... right?). Having never created vectors monadically before, I'm struggling to grasp this, especially with no type annotations. Doesn't v have type Data.Vector Int? How come one can pass it to G.swap then? Won't it have to be thawed first?
I might have just misunderstood Data.Vector.Generic, but if someone could clarify the above (by adding type annotations, perhaps?), I'd appreciate it.
Addendum: Here's my own attempt at adding type annotations:
import qualified Data.Vector.Unboxed as UVect
import qualified Data.Vector.Unboxed.Mutable as UMVect
import qualified System.Random as R
import Control.Monad
import Control.Monad.ST
randomPermutation :: forall a. (R.RandomGen a) => a -> Int -> UVect.Vector Int
randomPermutation g n = runST newVect
  where
    newVect :: ST s (UVect.Vector Int)
    newVect = UVect.unsafeThaw (UVect.enumFromN 0 n) >>= \v ->
              fill v 0 g >>
              UVect.unsafeFreeze v
    fill x i gen = when (i < n) $
                   let (j, gen') = R.randomR (i, n-1) gen in
                   UMVect.unsafeSwap x i j >>
                   fill x (i+1) gen'
As you can see, I'm avoiding Data.Vector.Generic to rule out the error source caused by perhaps not understanding it right. I'm also doing things in the ST monad.
In my head, the type of fill should be
UMVect.MVector (ST s (UVect.Vector Int)) Int -> Int -> a -> ST s ()
but GHC objects. Any hints? Again: It typechecks if I don't annotate fill.
Sidenote: I'd also like randomPermutation to return the updated random number generator. Thus, I'd need fill to also handle the generator's state. With my current type confusion, I don't see how to do that neatly. Any hints?

The compile error is telling us:
Expected type: ST s (UMVect.MVector (ST s (UVect.Vector Int)) Int)
Actual type: ST s (UMVect.MVector (Control.Monad.Primitive.PrimState (ST s)) Int)
So, changing the type signature of fill to UMVect.MVector (PrimState (ST s)) Int -> Int -> a -> ST s () (adding import Control.Monad.Primitive too) solves the problem!
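
Since PrimState (ST s) reduces to s, the signature can also be written with a plain s, which avoids the extra import. Here is a sketch along those lines that also returns the updated generator, as the question's side note asks. It is one possible arrangement rather than the original author's code: it assumes ScopedTypeVariables so that the local signature can mention the outer generator type, and it uses thaw and swap in place of their unsafe variants.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Monad.ST (ST, runST)
import qualified Data.Vector.Unboxed as UVect
import qualified Data.Vector.Unboxed.Mutable as UMVect
import qualified System.Random as R

-- Returns a permutation of 0..n-1 together with the advanced generator.
randomPermutation :: forall g. R.RandomGen g => g -> Int -> (UVect.Vector Int, g)
randomPermutation gen0 n = runST $ do
    v      <- UVect.thaw (UVect.enumFromN 0 n)
    gen'   <- fill v 0 gen0
    frozen <- UVect.unsafeFreeze v
    return (frozen, gen')
  where
    -- MVector s Int is the same type as MVector (PrimState (ST s)) Int.
    fill :: forall s. UMVect.MVector s Int -> Int -> g -> ST s g
    fill v i gen
      | i >= n    = return gen
      | otherwise = do
          let (j, gen') = R.randomR (i, n - 1) gen
          UMVect.swap v i j
          fill v (i + 1) gen'
```

Threading the generator through fill's return value keeps everything inside ST without needing a mutable reference for the generator.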
