I am currently writing my bachelor's thesis in computer science in Austria.
The programming language that I use is Haskell.
Now, I am trying to solve the following problem:
I have a list of tuples, let's say [(1,2),(2,3)]. From that list of tuples, I would like to pick out each tuple and then do an operation on it:
Map.insert (1,2) XXX ftable
where
(1,2) is the first element of that list and XXX is some value and ftable is my map.
How can I "iterate" through that list and proceed with that operation, inserting the n-th element of my list into my map?
I guess I am just too used to imperative programming and cannot find a way to do this in Haskell.
It's not entirely clear what you mean here. Is it correct to assume that the tuples are meant to represent the keys in your map and XXX is some value attached to a specific key? Are all the values you want to match to a given key also provided in a list? In that case you can easily use the fromList function in Data.Map:
import qualified Data.Map

keys   = [(1,2),(2,3),(7,9)]
values = ["A","B","C"]
-- named ftable to match the question (and to avoid shadowing Prelude's map)
ftable = Data.Map.fromList $ zip keys values
Think about what your loop is doing.
if it transforms each list element, then use map (or concatMap)
if it filters out some of the list elements, then use filter
if it reduces the list to a summary value, use a fold (e.g. foldl, foldr; more specific folds are sum, and, etc)
or if it's doing some combination of the above, use a combination of the above functions
In your case, I'm not entirely certain what you want, but I think you want to end up with a single Map, so you want to fold your list. Perhaps something like
foldl (\oldMap key -> Map.insert key xxx oldMap) ftable yourListOfTuples
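For instance, here is a minimal runnable sketch of that fold, assuming a constant value xxx (all the names are illustrative):

import qualified Data.Map as Map
import Data.List (foldl')

yourListOfTuples :: [(Int, Int)]
yourListOfTuples = [(1,2),(2,3)]

xxx :: String
xxx = "someValue"

-- the strict left fold foldl' avoids building up thunks on long lists
ftable :: Map.Map (Int, Int) String
ftable = foldl' (\oldMap key -> Map.insert key xxx oldMap) Map.empty yourListOfTuples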
You have several options to iterate over lists, each to be chosen depending on the effect you're after:
folds, which are useful for many computations on lists. They may also be what you're looking for, but your specific issue is not clear from your question.
map, which applies the same (given) function to every element of the input list, returning the list of the results. There are also variants that work in monadic computations.
or, if it best suits your needs, you can always write your own tail-recursive function: Haskell will treat it just like a loop, without consuming stack space (but please look at the reference link for a good explanation of the technique); see the sketch below.
In the end, whatever function or technique you use will be based on recursion, since functional languages like Haskell do not have for-loops of the kind you are used to.
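For example, a hand-rolled tail-recursive version of the insertion loop might look like this (a sketch; insertAll is an illustrative name):

import qualified Data.Map as Map

-- tail-recursive loop: the accumulator is the map built so far, so each
-- recursive call replaces the current one instead of growing the stack
insertAll :: Ord k => v -> Map.Map k v -> [k] -> Map.Map k v
insertAll _   acc []       = acc
insertAll val acc (k : ks) = insertAll val (Map.insert k val acc) ks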
Building off of hakoja's suggestion...
It is probable that XXX at this point is either 1) constant for every key, or 2) some function based on the key, or 3) somehow defined in parallel to the key.
1) constant xxx for every key
keys = [(1,2),(2,3),(7,9)]
xxx = "A"
ftable = Data.Map.fromList $ zip keys (repeat xxx)
2) function produces value based on key
keys = [(1,2),(2,3),(7,9)]
f = ...
ftable = Data.Map.fromList $ zip keys (map f keys)
3) xxx defined in parallel: use hakoja's suggestion
I am a student doing scientific calculations. Usually I use the odeint function to solve differential equations; now I need to solve a system of differential equations with 100 variables. If I follow my previous programming style in Python, it looks like this:
def XFunction(X, t, sets):
    x1, x2, x3, x4, ..., x100 = X
    lambd = sets
    return np.array([equation1, equation2, equation3, ..., equation100])
But this method takes too long; is there a more efficient way to do this?
Yes. Using integer suffixes like that indicates that you probably want to use a sequence, like a list or array, though a mapping like a dict could also work. So instead of x1, x2, x3, ..., you write X[0], X[1], X[2], ... when you need them, without pulling them out into locals first. X might already be an array in your program.
If it's just an iterable and not a sequence, you can save it in a list first,
X = [*X]
Which lets you use the subscript operator X[i].
You don't normally "declare" variables in Python, that's implied by assignment, although you can declare without assignment by giving it a (type) annotation.
The [equation1, ...] part could perhaps be done with a list comprehension, which is like a mathematical set comprehension, but ordered.
Here's a stupid example with a single map and filter step. (You can have multiple filters or no filters, but you must use at least one loop.)
[x**2 for x in X if x % 2 == 0]
This list comprehension would generate a list of all squares of the elements of X where the element was even.
I don't know what set of formulae you need for your application, but if it can be parameterized by X, you can do it this way.
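For instance, if each equation follows a common pattern, the whole right-hand side can be built with one comprehension. This is a sketch with a made-up decay-and-coupling term; the actual expressions depend on your model:

import numpy as np
from scipy.integrate import odeint

def XFunction(X, t, lambd):
    # illustrative equations only: each variable decays and couples to its
    # neighbour; replace the expression with your real equation for X[i]
    n = len(X)
    return np.array([-lambd * X[i] + X[(i + 1) % n] for i in range(n)])

X0 = np.ones(100)
ts = np.linspace(0, 10, 101)
solution = odeint(XFunction, X0, ts, args=(0.5,))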
I have a list of tuples of integers [(2,10), [...] (4,11),(3,9)].
Tuples are added to the list as well as deleted from it regularly. It will contain up to ~5000 elements.
In my code I need to use this list sometimes sorted according to the first and sometimes to the second tuple element. Hence the ordering of the list will change drastically; resorting might take place at any time.
Python's Timsort is only fast when lists are already largely sorted, so this general approach of frequent resorting might be inefficient. A better approach would be to use two naturally sorted data structures like SortedList. But here I would need two lists (one for the first tuple element and one for the second) as well as a dictionary to maintain the mapping between the above tuples.
What is the pythonic way to solve this?
In Java I would do it like this:
TreeSet<Integer> leftTupleEntry = new TreeSet<Integer>();
TreeSet<Integer> rightTupleEntry = new TreeSet<Integer>();
HashMap<Integer, Integer> tupleMap = new HashMap<Integer, Integer>();
This gives both sorting strategies the best runtime complexity class as well as the necessary connection between the two numbers.
When I need the list sorted according to the first tuple element, I need to access the whole of it (as I need to calculate a cumulative sum, among other operations).
When I need it sorted according to the second element, I'm only interested in the smallest elements, which are then usually deleted.
Typically, after any insertion a new sort according to the first element is requested.
first_element_list = sorted([i[0] for i in list_tuple])
second_element_list = sorted([i[1] for i in list_tuple])
What I did:
I used a SortedKeyList, sorted according to the first tuple element. Inserting into this list is O(log n). Reading from it is O(log n) too.
from operator import itemgetter
from sortedcontainers import SortedKeyList
self.list = SortedKeyList(key=itemgetter(0))
self.list.add((1,4))
self.list.add((2,6))
When I need the argmin according to the second tuple element I used
np.argmin(self.list, axis=0)[1]
which is O(n). Not optimal.
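A sketch of the two-structure approach described in the question, keeping the same tuples in two SortedKeyLists (the class and method names are illustrative):

from operator import itemgetter
from sortedcontainers import SortedKeyList

class TuplePool:
    """Holds the same tuples in two orders: by first and by second element."""
    def __init__(self):
        self.by_first = SortedKeyList(key=itemgetter(0))
        self.by_second = SortedKeyList(key=itemgetter(1))

    def add(self, t):
        # O(log n) in each structure
        self.by_first.add(t)
        self.by_second.add(t)

    def pop_min_second(self):
        # smallest second element in O(log n), instead of an O(n) argmin
        t = self.by_second[0]
        self.by_second.remove(t)
        self.by_first.remove(t)
        return t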
I am kind of new to using data structures in Haskell besides lists. My goal is to choose one container among Data.Vector, Data.Sequence, Data.List, etc. My problem is the following:
I have to create a sequence (mathematically speaking). The sequence starts at 0. In each iteration two new elements are generated, but only one should be appended, based on whether the first element is already in the sequence. So in each iteration there is a call to the elem function (see the pseudo-code below).
appendNewItem :: [Integer] -> [Integer]
appendNewItem acc =
  let firstElem  = someFunc
      secondElem = someOtherFunc
      newElem    = if firstElem `elem` acc
                     then secondElem
                     else firstElem
  in acc `append` newElem

sequenceUptoN :: Int -> [Integer]
sequenceUptoN n = iterate appendNewItem [0] !! n
Where the append and iterate functions vary depending on which collection you use (I am using lists in the type signature for simplicity).
The question is: which data structure should I use? Is Data.Sequence faster for this task because of its finger-tree inner structure?
Thanks a lot!!
No, sequences are not faster for searching. A Vector is just a flat chunk of memory, which gives generally the best lookup performance. If you want to optimise searching, use Data.Vector.Unboxed. (The normal, “boxed” variant is also pretty good, but it actually contains only references to the elements in the flat memory-chunk, so it's not quite as fast for lookups.)
However, because of the flat memory layout, Vectors are not good for (pure-functional) appending: basically, whenever you add a new element, the whole array must be copied so as to not invalidate the old one (which somebody else might still be using). If you need to append, Seq is a pretty good choice, although it's not as fast as destructive appending: for maximum performance, you'll want to pre-allocate an uninitialized Data.Vector.Unboxed.Mutable.MVector of the required size, populate it using the ST monad, and freeze the result. But this is much more fiddly than the purely-functional alternatives, so unless you need to squeeze out every bit of performance, Data.Sequence is the way to go. If you only want to append, but not look up elements, then a plain old list in reverse order would also do the trick.
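A minimal sketch of that mutate-then-freeze pattern, assuming each element can be computed from its index (buildVector and f are illustrative names):

import Control.Monad.ST (runST)
import qualified Data.Vector.Unboxed as VU
import qualified Data.Vector.Unboxed.Mutable as VUM

-- pre-allocate an uninitialized mutable vector, fill it in ST, freeze it
buildVector :: Int -> (Int -> Int) -> VU.Vector Int
buildVector n f = runST $ do
  mv <- VUM.unsafeNew n
  let go i | i >= n    = pure ()
           | otherwise = VUM.write mv i (f i) >> go (i + 1)
  go 0
  VU.unsafeFreeze mv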
I suggest using Data.Sequence in conjunction with Data.Set. The Sequence to hold the sequence of values and the Set to track the collection.
Sequence, List, and Vector are all structures for working with values where the position in the structure has primary importance when it comes to indexing. In lists we can manipulate elements at the front efficiently, in sequences we can manipulate elements based on the log of the distance to the closest end, and in vectors we can access any element in constant time. Vectors, however, are not that useful if the length keeps changing, so that rules out their use here.
However, you also need to look up whether a certain value is already present, which none of these structures help with: you have to search the whole list/sequence/vector to be certain that a new value isn't there. Data.Map and Data.Set are structures that index values by their Ord instance and let you look up and insert in O(log n). So, at the cost of some extra memory you can check the presence of firstElem in your Set in O(log n) time and then add newElem to the end of the sequence in constant time. Just make sure to keep the two structures in sync when adding or taking elements; a sketch follows.
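For example, one step of the questioner's pseudo-code with this pairing might look as follows (cand1 and cand2 stand in for the question's someFunc and someOtherFunc):

import Data.Sequence (Seq, (|>))
import qualified Data.Set as Set

-- append whichever candidate is new, tracking membership in the Set
step :: Integer -> Integer -> (Seq Integer, Set.Set Integer) -> (Seq Integer, Set.Set Integer)
step cand1 cand2 (acc, seen)
  | cand1 `Set.member` seen = (acc |> cand2, Set.insert cand2 seen)
  | otherwise               = (acc |> cand1, Set.insert cand1 seen)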
I have a map where multiple keys can map to the same value. I'd like to do reverse lookups, such that given a value, I get a list of all keys that map to this value.
Note that unlike Data.Bimap, my map is not 1:1 but n:1.
Also, the reverse lookup should not take O(n), as running through all map entries would; it should rather be O(log n) or better, as with a reverse index. The map will contain many tens of thousands of entries, with a high load of add/remove/lookup operations.
Is such a data structure available in functional form (Haskell or Frege preferred)?
In C++ and other languages, add-on libraries implement a multi-index container, e.g. Boost.Multiindex. That is, a collection that stores one type of value but maintains multiple different indices over those values. These indices provide for different access methods and sorting behaviors, e.g. map, multimap, set, multiset, array, etc. Run-time complexity of the multi-index container is generally the sum of the individual indices' complexities.
Is there an equivalent for Haskell or do people compose their own? Specifically, what is the most idiomatic way to implement a collection of type T with both a set-type of index (T is an instance of Ord) as well as a map-type of index (assume that a key value of type K could be provided for each T, either explicitly or via a function T -> K)?
I just uploaded IxSet to hackage this morning,
http://hackage.haskell.org/package/ixset
ixset provides sets which have multiple indexes.
ixset has been around for a long time as happstack-ixset. This version removes the dependencies on anything happstack specific, and is the new official version of IxSet.
Another option would be kdtree:
darcs get http://darcs.monoid.at/kdtree
kdtree aims to improve on IxSet by offering greater type-safety and better time and space usage. The current version seems to do well on all three of those aspects -- but it is not yet ready for prime time. Additional contributors would be highly welcomed.
In the trivial case where every element has a unique key that's always available, you can just use a Map and extract the key to look up an element. In the slightly less trivial case where each value merely has a key available, a simple solution would be something like Map K (Set T). Looking up an element directly would then involve first extracting the key, indexing the Map to find the set of elements that share that key, and then looking up the one you want.
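A minimal sketch of that nesting (the helper names insertVal and lookupByKey are illustrative):

import qualified Data.Map as Map
import qualified Data.Set as Set

type MultiIndex k t = Map.Map k (Set.Set t)

-- insert a value under the key extracted from it
insertVal :: (Ord k, Ord t) => (t -> k) -> t -> MultiIndex k t -> MultiIndex k t
insertVal key x = Map.insertWith Set.union (key x) (Set.singleton x)

-- all values stored under a key (the reverse lookup from the question)
lookupByKey :: Ord k => k -> MultiIndex k t -> Set.Set t
lookupByKey = Map.findWithDefault Set.empty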
For the most part, if something can be done straightforwardly in the above fashion (simple transformation and nesting), it probably makes sense to do it that way. However, none of this generalizes well to, e.g., multiple independent keys or keys that may not be available, for obvious reasons.
Beyond that, I'm not aware of any widely-used standard implementations. Some examples do exist, for example IxSet from happstack seems to roughly fit the bill. I suspect one-size-kinda-fits-most solutions here are liable to have a poor benefit/complexity ratio, so people tend to just roll their own to suit specific needs.
Intuitively, this seems like a problem that might work better not as a single implementation, but rather a collection of primitives that could be composed more flexibly than Data.Map allows, to create ad-hoc specialized structures. But that's not really helpful for short-term needs.
For this specific question, you can use a Bimap. In general, though, I'm not aware of any common class for multimaps or multiply-indexed containers.
I believe that the simplest way to do this is simply with Data.Map. Although it is designed around a single index, when you insert the same value multiple times, most implementations (certainly GHC) will make the entries point to the same place, so the value is shared rather than copied. A separate implementation of a multimap wouldn't necessarily be more efficient: since you want to find elements based on their index, you cannot naively associate each element with multiple indices - say [([key], value)] - as this would be very inefficient.
However, I have not looked at the Boost implementations of Multimaps to see, definitively, if there is an optimized way of doing so.
Have I got the problem straight? Both T and K have an order. There is a function key :: T -> K but it is not order-preserving. It is desired to manage a collection of Ts, indexed (for rapid access) both by the T order and the K order. More generally, one might want a collection of T elements indexed by a bunch of orders key1 :: T -> K1, .. keyn :: T -> Kn, and it so happens that here key1 = id. Is that the picture?
I think I agree with gereeter's suggestion that the basis for a solution is just to maintain in sync a bunch of maps (Map K1 T, .. Map Kn T). Inserting a key-value pair in a map duplicates neither the key nor the value, allocating only the extra heap required to make a new entry in the right place in the index. Inserting the same value, suitably keyed, into multiple indices should not break sharing (even if one of the keys is the value itself). It is worth wrapping the structure in an API which ensures that any subsequent modifications to the value are computed once and shared, rather than recomputed for each entry in an index.
Bottom line: it should be possible to maintain multiple maps, ensuring that the values are shared, even though the key-orders are separate.
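A sketch of that wrapping for the two-index case described above, where key1 = id (the record and function names are illustrative):

import qualified Data.Map as Map

-- two indices over the same T values; both maps reference the same
-- value, so inserting into both does not duplicate it on the heap
data Indexed t k = Indexed
  { byValue :: Map.Map t t   -- key1 = id
  , byKey   :: Map.Map k t   -- key2 = the supplied key function
  }

insertT :: (Ord t, Ord k) => (t -> k) -> t -> Indexed t k -> Indexed t k
insertT key x (Indexed m1 m2) =
  Indexed (Map.insert x x m1) (Map.insert (key x) x m2)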