Create HashTable in Haskell - haskell

I want to create a HashTable in Haskell, insert hash values inside and look up in this HashTable.
I found this documentation but I just started Haskell and therefore I don't really know how to ue these functions.
If some of you could show me some lines of code it would be perfect.

I second Ingo's comment about starting with something simpler. However, I'll break down a few things in a bit of detail.
First of all, I assume you've installed the latest Haskell Platform. In the website for the Platform there is a page with collected documentation for the libraries included with it. Any library that's not in that page would be something you'd need to install separately.
The Platform does include Data.HashTable, so you don't need to install anything, but if you look at the latest Platform's documentation on it, you'll see that it's deprecated and going to be removed soon. So I would not use that module.
The Haskell Platform comes with the two most popular Haskell implementations of a map/dictionary data structure:
Data.Map. (Most of the documentation for this is in Data.Map.Lazy.) This implements a map as a kind of balanced search tree, which means that the keys need to be an ordered type—a type that implements the Ord class. A lot of the built-in Haskell types already implement this class, so this would probably be your easiest choice at first.
The Data.HashMap module hierarchy, with two variants; Data.HashMap.Lazy would be a good starting point. This implements maps as a kind of hash table, so the keys need to implement the Hashable class. This class is newer and not as popular as Ord, so often you might need to implement this class for your key types.
So Data.Map is the easier type to use. But to use it effectively you're going to need to understand a few things beside the most basic language constructs:
How to import a module in a source file.
How to use qualified imports—Data.Map has function names that collide with many of the built-in ones in Haskell, which requires some special syntax.
How to load a module into the ghci interpreter.
How to compile a project that uses the containers library where Data.Map lives (using the cabal tool).
Once you have that down, the easiest way to build a map is from a list of key/value pairs:
module MyModule where
import Data.Map (Map) -- This just imports the type name
import qualified Data.Map as Map -- Imports everything else, but with names
-- prefixed with "Map." (with the period).
-- Example: make a Map from a key/value pair
ages :: Map String Integer
ages = Map.fromList [("Joe", 35), ("Mary", 37), ("Irma", 16)]
A few examples on how to use maps:
-- Example: look up somebody and return a message saying what their age is.
-- 'Nothing' means that the map didn't have the key.
findAge :: String -> String
findAge name = case Map.lookup name ages of
Nothing -> "I don't know the age of " ++ name ++ "."
Just age -> name ++ " is " ++ show age ++ " years old."
-- Example: make a map with one extra entry compared to `ages` above.
moreAges :: Map String Integer
moreAges = Map.insert "Steve" 23 ages
-- Example: union of two maps.
evenMoreAges :: Map String Integer
evenMoreAges = Map.union moreAges anotherMap
where anotherMap = Map.fromList [("Metuselah", 111), ("Anuq", 3)]

As a complement to Ingo's answer, consider using the purely function Data.Map.
import qualified Data.Map as M
myMap :: M.Map Int String
myMap = M.fromList $ zip [1..10] ['a'..'j']
insertedMap :: M.Map Int String
insertedMap = M.insert 11 "fizzbuzz" oldMap
at11 :: Maybe String
at11 = M.lookup 11 insertedMap
Then you can use M.lookup, M.insert, and many other functions to modify/query the map. This datastructure is also purely functional/persistant (notice how IO is nowhere in the types). That means that we can do something like
let newMap = M.insert key val oldMap
in M.union oldMap otherMap
See how we can still use the older version of the map even after inserting something? That's "persistance", we never destroy the older versions of our data structure.

Just so to avoid someone calling the haskell community arrogant, here is a short break down of the first function you'll need:
new :: (key -> key -> Bool) -> (key -> Int32) -> IO (HashTable key val)
This tells us the following: to create a HashTable for a specific key type key you need to pass a function that checks equality on keys, and a function that computes a hash value for keys. So, if eq and hashit would be the desired functions, the following:
new eq hashit
gives you an empty HashTable in the IO-Monad.
An easier way could be to create a HashTable from a list using one of the predefined hash functions:
fromList hashInt [(42, "forty-two"), (0, "zero")]

Related

How to define an unordered collection in Haskell

Wondering how you would define an unordered group/collection in Haskell, where by "collection" I mean it can have many copies of the same element, and the items are unordered. I know of the List data type in Haskell, but this is inherently ordered. I would like to see what the definition would look like for an unordered collection/group/list.
I would define it this way
import qualified Data.Map.Lazy as Map
type MultiSet' a = Map.Map a Int
Just a mapping from a type a to an Int. In mathematics it would be something like f : S -> N. The elements you put into it must be ordable, that is because the underlying structure of the Map is defined by a binary tree. This shouldn't be a problem as you can forget about it when using the data structure. See the very extensive documentation of Data.Map for functions to deal with our MultiSet'.
Now there is already a definition together with implementation for this and it is called MultiSet. You can browse to its source code as well, there you see they defined it in an almost an identical way (they used the strict version of the map).
Alternatively you can use a hashmap, it will look like this:
import qualified Data.HashMap.Lazy as Map
type MultiSet'' a = Map.HashMap a Int
The elements you put into it do not need to be ordable, but hashable.
If you just want a structure that has no reasonable order then why not compose a Map with a hash?
type MyBag a = Map (Int,a) Int
insert x mp = Data.Map.insertWith (+) 1 (hash x, x) mp
The above is a balanced binary tree with an order that depends on the hash of the value you have inserted. The map itself is boring, along the lines of data Map k a = Bin (Map k a) a (Map k a) | Nil.
This said, I think you underspecified what you are looking for and what you hope to learn. Your searches have probably yielded hashtables and unordered-containers - why aren't those sufficiently informative?

Accessing vector element by index using lens

I'm looking for a way to reference an element of a vector using lens library...
Let me try to explain what I'm trying to achieve using a simplified example of my code.
I'm working in this monad transformer stack (where StateT is the focus, everything else is not important)
newtype MyType a = MyType (StateT MyState (ExceptT String IO) a)
MyState has a lot of fields but one of those is a vector of clients which is a data type I defined:
data MyState = MyState { ...
, _clients :: V.Vector ClientT
}
Whenever I need to access one of my clients I tend to do it like this:
import Control.Lens (use)
c <- use clients
let neededClient = c V.! someIndex
... -- calculate something, update client if needed
clients %= (V.// [(someIndex, updatedClient)])
Now, here is what I'm looking for: I would like my function to receive a "reference" to the client I'm interested in and use it (retrieve it from State, update it if needed).
In order to clear up what I mean here is a snippet (that won't compile even in pseudo code):
...
myFunction (clients.ix 0)
...
myFunction clientLens = do
c <- use clientLens -- I would like to access a client in the vector
... -- calculate stuff
clientLens .= updatedClient
Basically, I would like to pass to myFunction something from Lens library (I don't know what I'm passing here... Lens? Traversal? Getting? some other thingy?) which will allow me to point at particular element in the vector which is kept in my StateT. Is it at all possible? Currently, when using "clients.ix 0" I get an error that my ClientT is not an instance of Monoid.
It is a very dumbed down version of what I have. In order to answer the question "why I need it this way" requires a lot more explanation. I'm interested if it is possible to pass this "reference" which will point to some element in my vector which is kept in State.
clients.ix 0 is a traversal. In particular, traversals are setters, so setting and modifying should work fine:
clients.ix 0 .= updatedClient
Your problem is with use. Because a traversal doesn't necessarily contain exactly one value, when you use a traversal (or use some other getter function on it), it combines all the values assuming they are of a Monoid type.
In particular,
use (clients.ix n)
would want to return mempty if n is out of bounds.
Instead, you can use the preuse function, which discards all but the first target of a traversal (or more generally, a fold), and wraps it in a Maybe. E.g.
Just c <- preuse (clients.ix n)
Note this will give a pattern match error if n is out of bounds, since preuse returns Nothing then.

Does Haskell have pointers/references to record members?

I can create and reference relative pointers to struct members in C++ using the ::*, .*, and ->* syntax like :
char* fstab_t::*field = &fstab_t::fs_vfstype;
my_fstab.*field = ...
In Haskell, I can easily create temporary labels for record getters like :
(idxF_s,idxL_s) = swap_by_sign sgn (idxF,idxL) ;
Afaik, I cannot however then update records using these getters as labels like :
a { idxF_s = idxL_s b }
Is there an easy way to do this without coding for each record setter?
A getter and setter bundled together in a first-class value is referred to as a lens. There are quite a few packages for doing this; the most popular are data-lens and fclabels. This previous SO question is a good introduction.
Both of those libraries support deriving lenses from record definitions using Template Haskell (with data-lens, it's provided as an additional package for portability). Your example would be expressed as (using data-lens syntax):
setL idxF_s (b ^. idL_s) a
(or equivalently: idxF_s ^= (b ^. idL_s) $ a)
You can, of course, transform lenses in a generic way by transforming their getter and setter together:
-- I don't know what swap_by_sign is supposed to do.
negateLens :: (Num b) => Lens a b -> Lens a b
negateLens l = lens get set
where
get = negate . getL l
set = setL l . negate
(or equivalently: negateLens l = iso negate negate . l1)
In general, I would recommend using lenses whenever you have to deal with any kind of non-trivial record handling; not only do they vastly simplify pure transformation of records, but both packages contain convenience functions for accessing and modifying a state monad's state using lenses, which is incredibly useful. (For data-lens, you'll want to use the data-lens-fd package to use these convenience functions in any MonadState; again, they're in a separate package for portability.)
1 When using either package, you should start your modules with:
import Prelude hiding (id, (.))
import Control.Category
This is because they use generalised forms of the Prelude's id and (.) functions — id can be used as the lens from any value to itself (not all that useful, admittedly), and (.) is used to compose lenses (e.g. getL (fieldA . fieldB) a is the same as getL fieldA . getL fieldB $ a). The shorter negateLens definition uses this.
What you want here is first-class record labels, and while this does not exist in the language, there are several packages on Hackage which implement this pattern. One of these is fclabels, which can use Template Haskell to generate the required boilerplate for you. Here's an example:
{-# LANGUAGE TemplateHaskell #-}
import Control.Category
import Data.Label
import Prelude hiding ((.))
data Foo = Foo { _fieldA :: Int, _fieldB :: Int }
deriving (Show)
$(mkLabels [''Foo])
main = do
let foo = Foo 2 3
putStrLn "Pick a field, A or B"
line <- getLine
let field = (if line == "A" then fieldA else fieldB)
print $ modify field (*10) foo

haskell load module in list

Hey haskellers and haskellettes,
is it possible to load a module functions in a list.
in my concrete case i have a list of functions all checked with or
checkRules :: [Nucleotide] -> Bool
checkRules nucs = or $ map ($ nucs) [checkRule1, checkRule2]
i do import checkRule1 and checkRule2 from a seperate module - i don't know if i will need more of them in the future.
i'd like to have the same functionality look something like
-- import all functions from Rules as rules where
-- :t rules ~~> [([Nucleotide] -> Bool)]
checkRules :: [Nucleotide] -> Bool
checkRules nucs = or $ map ($ nucs) rules
the program sorts Pseudo Nucleotide Sequences in viable and nonviable squences according to given rules.
thanks in advance ε/2
Addendum:
So do i think right - i need:
genList :: File -> TypeSignature -> [TypeSignature]
chckfun :: (a->b) -> TypeSignature -> Bool
at compile time.
but i can't generate a list of all functions in the module - as they most probably will have not the same type signature and hence not all fit in one list. so i cannot filter given list with chckfun.
In order to do this i either want to check the written type signatures in the source file (?) or the inferenced types given by the compiler(?).
another problem that comes to my mind is: not every function written in the source file might get exported ?
Is this a problem a haskell beginner should try to solve after 5 months of learning - my brain is shaped like a klein's bottle after all this "compile time thinking".
There is a nice package on Hackage just for this: language-haskell-extract. In particular, the Template Haskell function functionExtractor takes a regular expression and returns a list of the matching top level bindings as (name, value) pairs. As long as they all have matching types, you're good to go.
{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.Extract
myFoo = "Hello"
myBar = "World"
allMyStuff = $(functionExtractor "^my")
main = print allMyStuff
Output:
[("myFoo", "Hello"), ("myBar", "World")]

Best way to implement ad-hoc polymorphism in Haskell?

I have a polymorphic function like:
convert :: (Show a) => a -> String
convert = " [label=" ++ (show a) ++ "]"
But sometimes I want to pass it a Data.Map and do some more fancy key value conversion. I know I can't pattern match here because Data.Map is an abstract data type (according to this similar SO question), but I have been unsuccessful using guards to this end, and I'm not sure if ViewPatterns would help here (and would rather avoid them for portability).
This is more what I want:
import qualified Data.Map as M
convert :: (Show a) => a -> String
convert a
| M.size \=0 = processMap2FancyKVString a -- Heres a Data.Map
| otherwise = " [label=" ++ (show a) ++ "]" -- Probably a string
But this doesn't work because M.size can't take anything other than a Data.Map.
Specifically, I am trying to modify the sl utility function in the Functional Graph Library in order to handle coloring and other attributes of edges in GraphViz output.
Update
I wish I could accept all three answers by TomMD, Antal S-Z, and luqui to this question as they all understood what I really was asking. I would say:
Antal S-Z gave the most 'elegant' solution as applied to the FGL but would also require the most rewriting and rethinking to implement in personal problem.
TomMD gave a great answer that lies somewhere between Antal S-Z's and luqui's in terms of applicability vs. correctness. It also is direct and to the point which I appreciate greatly and why I chose his answer.
luqui gave the best 'get it working quickly' answer which I will probably be using in practice (as I'm a grad student, and this is just some throwaway code to test some ideas). The reason I didn't accept was because TomMD's answer will probably help other people in more general situations better.
With that said, they are all excellent answers and the above classification is a gross simplification. I've also updated the question title to better represent my question (Thanks Thanks again for broadening my horizons everyone!
What you just explained is you want a function that behaves differently based on the type of the input. While you could use a data wrapper, thus closing the function for all time:
data Convertable k a = ConvMap (Map k a) | ConvOther a
convert (ConvMap m) = ...
convert (ConvOther o) = ...
A better way is to use type classes, thus leaving the convert function open and extensible while preventing users from inputting non-sensical combinations (ex: ConvOther M.empty).
class (Show a) => Convertable a where
convert :: a -> String
instance Convertable (M.Map k a) where
convert m = processMap2FancyKVString m
newtype ConvWrapper a = CW a
instance Convertable (ConvWrapper a) where
convert (CW a) = " [label=" ++ (show a) ++ "]"
In this manner you can have the instances you want used for each different data type and every time a new specialization is needed you can extend the definition of convert simply by adding another instance Convertable NewDataType where ....
Some people might frown at the newtype wrapper and suggest an instance like:
instance Convertable a where
convert ...
But this will require the strongly discouraged overlapping and undecidable instances extensions for very little programmer convenience.
You may not be asking the right thing. I'm going to assume that you either have a graph whose nodes are all Maps or you have a graph whose nodes are all something else. If you need a graph where Maps and non-maps coexist, then there is more to your problem (but this solution will still help). See the end of my answer in that case.
The cleanest answer here is simply to use different convert functions for different types, and have any type that depends on convert take it as an argument (a higher order function).
So in GraphViz (avoiding redesigning this crappy code) I would modify the graphviz function to look like:
graphvizWithLabeler :: (a -> String) -> ... -> String
graphvizWithLabeler labeler ... =
...
where sa = labeler a
And then have graphviz trivially delegate to it:
graphviz = graphvizWithLabeler sl
Then graphviz continues to work as before, and you have graphvizWithLabeler when you need the more powerful version.
So for graphs whose nodes are Maps, use graphvizWithLabeler processMap2FancyKVString, otherwise use graphviz. This decision can be postponed as long as possible by taking relevant things as higher order functions or typeclass methods.
If you need to have Maps and other things coexisting in the same graph, then you need to find a single type inhabited by everything a node could be. This is similar to TomMD's suggestion. For example:
data NodeType
= MapNode (Map.Map Foo Bar)
| IntNode Int
Parameterized to the level of genericity you need, of course. Then your labeler function should decide what to do in each of those cases.
A key point to remember is that Haskell has no downcasting. A function of type foo :: a -> a has no way of knowing anything about what was passed to it (within reason, cool your jets pedants). So the function you were trying to write is impossible to express in Haskell. But as you can see, there are other ways to get the job done, and they turn out to be more modular.
Did that tell you what you needed to know to accomplish what you wanted?
Your problem isn't actually the same as in that question. In the question you linked to, Derek Thurn had a function which he knew took a Set a, but couldn't pattern-match. In your case, you're writing a function which will take any a which has an instance of Show; you can't tell what type you're looking at at runtime, and can only rely on the functions which are available to any Showable type. If you want to have a function do different things for different data types, this is known as ad-hoc polymorphism, and is supported in Haskell with type classes like Show. (This is as opposed to parametric polymorphism, which is when you write a function like head (x:_) = x which has type head :: [a] -> a; the unconstrained universal a is what makes that parametric instead.) So to do what you want, you'll have to create your own type class, and instantiate it when you need it. However, it's a little more complicated than usual, because you want to make everything that's part of Show implicitly part of your new type class. This requires some potentially dangerous and probably unnecessarily powerful GHC extensions. Instead, why not simplify things? You can probably figure out the subset of types which you actually need to print in this manner. Once you do that, you can write the code as follows:
{-# LANGUAGE TypeSynonymInstances #-}
module GraphvizTypeclass where
import qualified Data.Map as M
import Data.Map (Map)
import Data.List (intercalate) -- For output formatting
surround :: String -> String -> String -> String
surround before after = (before ++) . (++ after)
squareBrackets :: String -> String
squareBrackets = surround "[" "]"
quoted :: String -> String
quoted = let replace '"' = "\\\""
replace c = [c]
in surround "\"" "\"" . concatMap replace
class GraphvizLabel a where
toGVItem :: a -> String
toGVLabel :: a -> String
toGVLabel = squareBrackets . ("label=" ++) . toGVItem
-- We only need to print Strings, Ints, Chars, and Maps.
instance GraphvizLabel String where
toGVItem = quoted
instance GraphvizLabel Int where
toGVItem = quoted . show
instance GraphvizLabel Char where
toGVItem = toGVItem . (: []) -- Custom behavior: no single quotes.
instance (GraphvizLabel k, GraphvizLabel v) => GraphvizLabel (Map k v) where
toGVItem = let kvfn k v = ((toGVItem k ++ "=" ++ toGVItem v) :)
in intercalate "," . M.foldWithKey kvfn []
toGVLabel = squareBrackets . toGVItem
In this setup, everything which we can output to Graphviz is an instance of GraphvizLabel; the toGVItem function quotes things, and toGVLabel puts the whole thing in square brackets for immediate use. (I might have screwed some of the formatting you want up, but that part's just an example.) You then declare what's an instance of GraphvizLabel, and how to turn it into an item. The TypeSynonymInstances flag just lets us write instance GraphvizLabel String instead of instance GraphvizLabel [Char]; it's harmless.
Now, if you really need everything with a Show instance to be an instance of GraphvizLabel as well, there is a way. If you don't really need this, then don't use this code! If you do need to do this, you have to bring to bear the scarily-named UndecidableInstances and OverlappingInstances language extensions (and the less scarily named FlexibleInstances). The reason for this is that you have to assert that everything which is Showable is a GraphvizLabel—but this is hard for the compiler to tell. For instance, if you use this code and write toGVLabel [1,2,3] at the GHCi prompt, you'll get an error, since 1 has type Num a => a, and Char might be an instance of Num! You have to explicitly specify toGVLabel ([1,2,3] :: [Int]) to get it to work. Again, this is probably unnecessarily heavy machinery to bring to bear on your problem. Instead, if you can limit the things you think will be converted to labels, which is very likely, you can just specify those things instead! But if you really want Showability to imply GraphvizLabelability, this is what you need:
{-# LANGUAGE TypeSynonymInstances, FlexibleInstances
, UndecidableInstances, OverlappingInstances #-}
-- Leave the module declaration, imports, formatting code, and class declaration
-- the same.
instance GraphvizLabel String where
toGVItem = quoted
instance Show a => GraphvizLabel a where
toGVItem = quoted . show
instance (GraphvizLabel k, GraphvizLabel v) => GraphvizLabel (Map k v) where
toGVItem = let kvfn k v = ((toGVItem k ++ "=" ++ toGVItem v) :)
in intercalate "," . M.foldWithKey kvfn []
toGVLabel = squareBrackets . toGVItem
Notice that your specific cases (GraphvizLabel String and GraphvizLabel (Map k v)) stay the same; you've just collapsed the Int and Char cases into the GraphvizLabel a case. Remember, UndecidableInstances means exactly what it says: the compiler cannot tell if instances are checkable or will instead make the typechecker loop! In this case, I am reasonably sure that everything here is in fact decidable (but if anybody notices where I'm wrong, please let me know). Nevertheless, using UndecidableInstances should always be approached with caution.

Resources