I've narrowed a problem down to one particular call to one of my library routines, (pop-hstack current-hstack), which pops an element from a stack structure. It causes data corruption (an inconsistency, see below) in the stack structure, but only when multiple threads are running. I've tried wrapping the call in a lock, like so: (bt:with-lock-held (*lock*) (pop-hstack current-hstack)), but current-hstack still becomes inconsistent somewhere during execution when two or more threads are active. The arguments to pop-hstack (e.g., current-hstack) are dynamically bound special variables in each thread, and so are not shared between threads. It's unclear whether the inconsistency is introduced by multi-threading (there is no inconsistency when running single-threaded) or by a latent programming bug in the structure definition or the pop-hstack function.
(defstruct hstack
  "An hstack (hash stack) is an expanded stack representation containing an
adjustable one-dimensional array of elements, plus a hash table for quickly
determining if an element is in the stack. Keyfn is applied to elements to
access the hash table. New elements are pushed at the fill-pointer, and
popped at the fill-pointer minus 1."
  (vector (make-array 0 :adjustable t :fill-pointer t) :type (array * (*)))
  (table (make-hash-table) :type hash-table)  ;can take a custom hash table
  (keyfn #'identity :type function))  ;fn to get hash table key for an element
(defun pop-hstack (hstk)
  "Pops an element from hstack's vector. Also removes the element's index from
the element's hash table entry--and the entry itself if it's the last index."
  (let* ((vec (hstack-vector hstk))
         (fptr-1 (1- (fill-pointer vec)))
         (tbl (hstack-table hstk))
         (key (funcall (hstack-keyfn hstk) (aref vec fptr-1))))
    (when (null (setf (gethash key tbl) (delete fptr-1 (gethash key tbl))))
      (remhash key tbl))
    (vector-pop vec)))
Normally, hstack's stack vector and hash table are in sync, containing the same number of entries: (length (hstack-vector x)) = (hash-table-count (hstack-table x)). The number of entries differs only when there are duplicate elements in the hstack, because then a single hash table entry contains multiple vector indices, one for each duplicate appearing in the vector. However, the inconsistency between the number of entries in the vector and the hash table still shows up when there are no duplicate elements. Typically, there are one or two extra elements in the hash table, indicating that these extra elements were not properly removed during a pop-hstack operation. The stack vector always seems to have the correct elements.
EDIT(5/2/19): Corrected a coding error in pop-hstack: Replace (delete fptr-1 (gethash key tbl)) with (setf (gethash key tbl) (delete fptr-1 (gethash key tbl))).
The form (delete fptr-1 (gethash key tbl)) might be the cause: it destructively modifies the list structure, so a concurrent access might see a corrupt list.
What's the definition of the push operation?
Does corruption also occur if all push and all pop operations are wrapped in with-lock-held (using the same lock)?
I need to retrieve the key whose value contains the string "TRY".
:CAB "NAB/TRY/FIGHT.jar"
so in this case the output should be :CAB .
I am new to Clojure. I tried a few things like .contains, but I could not work out the exact function for the above problem. It's easier in a few other languages like Python, but I don't know how to do it in Clojure.
Is there a way to retrieve the name of the key?
for can also filter with :when. E.g.
(for [[k v] {:FOO "TRY" :BAR "BAZ"}
:when (.contains v "TRY")]
k)
Using .contains is not recommended, for two reasons: first, you are using the internals of the underlying language (Java or JavaScript) without need, and second, it forces Clojure to use reflection, as it cannot be sure that the argument is a string.
It's better to use clojure.string/includes? instead.
Several working solutions have already been proposed here for extracting a key depending on the value. Here is one more, which uses the keep function:
(require '[clojure.string :as cs])
(keep (fn [[k v]] (when (cs/includes? v "TRY") k))
{:CAB "NAB/TRY/FIGHT.jar" :BLAH "NOWAY.jar"}) ; => (:CAB)
The easiest way is to use the contains method from java.lang.String. I'd use that to map valid keys, and then filter to remove all nil values:
(filter some?
(map (fn [[k v]] (when (.contains v "TRY") k))
{:CAB "NAB/TRY/FIGHT.jar" :BLAH "NOWAY.jar"}))
=> (:CAB)
If you think there is at most one such matching k/v pair in the map, then you can just call first on that to get the relevant key.
You can also use a regular expression instead of .contains, e.g.
(fn [[k v]] (when (re-find #"TRY" v) k))
You can use some on your collection. some applies a given function to each entry of your map until the function returns a non-nil value.
We're going to use the function
(fn [[key value]] (when (.contains value "TRY") key))
when returns nil unless the condition is matched, so it works perfectly for our use case. We're using destructuring in the function's arguments to get the key and value. When used by some, your map is treated as a sequence of key/value pairs, which looks like
'((:BAR "NAB/TRY/FIGHT.jar"))
If your map is named coll, the following code will do the trick
(some
(fn [[key value]] (when (.contains value "TRY") key))
coll)
We're in the process of converting our imperative brains to a mostly-functional paradigm. This function is giving me trouble. I want to construct an array that EITHER contains two pairs or three pairs, depending on a condition (whether refreshToken is null). How can I do this cleanly using a FP paradigm? Of course with imperative code and mutation, I would just conditionally .push() the extra value onto the end which looks quite clean.
Is this an example of the "local mutation is ok" FP caveat?
(We're using ReadonlyArray in TypeScript to enforce immutability, which makes this somewhat more ugly.)
const itemsToSet = [
[JWT_KEY, jwt],
[JWT_EXPIRES_KEY, tokenExpireDate.toString()],
[REFRESH_TOKEN_KEY, refreshToken /*could be null*/]]
.filter(item => item[1] != null) as ReadonlyArray<ReadonlyArray<string>>;
AsyncStorage.multiSet(itemsToSet.map(roArray => [...roArray]));
What's wrong with itemsToSet as given in the OP? It looks functional to me, but it may be because of my lack of knowledge of TypeScript.
In Haskell, there's no null, but if we use Maybe for the second element, I think that itemsToSet could be translated to this:
itemsToSet :: [(String, String)]
itemsToSet = foldr folder [] values
  where
    values = [
      (jwt_key, jwt),
      (jwt_expires_key, tokenExpireDate),
      (refresh_token_key, refreshToken)]
    folder (key, Just value) acc = (key, value) : acc
    folder _ acc = acc
Here, jwt, tokenExpireDate, and refreshToken are all of the type Maybe String.
itemsToSet performs a right fold over values, pattern-matching the Maybe String elements against Just and (implicitly) Nothing. If it's a Just value, it conses the (key, value) pair onto the accumulator acc. If not, folder just returns acc.
foldr traverses the values list from right to left, building up the accumulator as it visits each element. The initial accumulator value is the empty list [].
You don't need 'local mutation' in functional programming. In general, you can refactor from 'local mutation' to proper functional style by using recursion and introducing an accumulator value.
While foldr is a built-in function, you could implement it yourself using recursion.
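As a rough illustration of that last point, here is a minimal sketch of a hand-written right fold applied to the same filtering problem. The names myFoldr, keepPresent, sampleValues and sampleItems, and the sample strings, are made up for this example and do not come from the original code.

```haskell
-- A hand-written right fold; on finite lists it behaves like the Prelude's foldr.
myFoldr :: (a -> b -> b) -> b -> [a] -> b
myFoldr _ acc []       = acc
myFoldr f acc (x : xs) = f x (myFoldr f acc xs)

-- Keep only the pairs whose value is present (same role as `folder` above).
keepPresent :: (String, Maybe String) -> [(String, String)] -> [(String, String)]
keepPresent (key, Just value) acc = (key, value) : acc
keepPresent _                 acc = acc

-- Hypothetical sample data: the refresh token is missing.
sampleValues :: [(String, Maybe String)]
sampleValues =
  [ ("jwt", Just "abc")
  , ("jwt_expires", Just "2019-05-02")
  , ("refresh_token", Nothing)
  ]

-- The accumulator starts as [] and grows from the right end of the list.
sampleItems :: [(String, String)]
sampleItems = myFoldr keepPresent [] sampleValues
```

Evaluating sampleItems in GHCi gives [("jwt","abc"),("jwt_expires","2019-05-02")]: the pair with the missing value is dropped, and nothing is mutated along the way.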
In Haskell, I'd just create an array with three elements and, depending on the condition, pass it on either as-is or pass on just a slice of two elements. Thanks to laziness, no computation effort will be spent on the third element unless it's actually needed. In TypeScript, you probably will get the cost of computing the third element even if it's not needed, but perhaps that doesn't matter.
Alternatively, if you don't need the structure to be an actual array (for String elements, performance probably isn't that critical, and the O(n) direct-access cost isn't an issue if the length is limited to three elements), I'd use a singly-linked list instead. Create the list with two elements and, depending on the condition, append the third. This does not require any mutation: the 3-element list simply contains the unchanged 2-element list as a substructure.
Based on the description, I don't think arrays are the best solution simply because you know ahead of time that they contain either 2 values or 3 values depending on some condition. As such, I would model the problem as follows:
type alias Pair = (String, String)
type TokenState
    = WithoutRefresh (Pair, Pair)
    | WithRefresh (Pair, Pair, Pair)
itemsToTokenState : String -> Date -> Maybe String -> TokenState
itemsToTokenState jwtKey jwtExpiry maybeRefreshToken =
    case maybeRefreshToken of
        Just refreshToken ->
            WithRefresh (("JWT_KEY", jwtKey), ("JWT_EXPIRES_KEY", toString jwtExpiry), ("REFRESH_TOKEN_KEY", refreshToken))
        Nothing ->
            WithoutRefresh (("JWT_KEY", jwtKey), ("JWT_EXPIRES_KEY", toString jwtExpiry))
This way you are leveraging the type system more effectively. It could be improved further by returning something more ergonomic than tuples.
According to the theory of ADTs (Algebraic Data Types) the concatenation of two lists has to take O(n) where n is the length of the first list. You, basically, have to recursively iterate through the first list until you find the end.
From a different point of view, one can argue that the second list can simply be linked to the last element of the first. This would take constant time, if the end of the first list is known.
What am I missing here?
Operationally, a Haskell list is typically represented by a pointer to the first cell of a singly linked list (roughly). In this way, tail just returns the pointer to the next cell (it does not have to copy anything), and consing x : in front of the list allocates a new cell, makes it point to the old list, and returns the new pointer. The list accessed by the old pointer is unchanged, so there's no need to copy it.
If you instead append a value with ++ [x], then you cannot modify the original linked list by changing its last pointer unless you know that the original list will never be accessed again. More concretely, consider
x = [1..5]
n = length (x ++ [6]) + length x
If you modified x when doing x++[6], the value of n would turn out to be 12, which is wrong. The last x refers to the unchanged list, which has length 5, so the result of n must be 11.
Practically, you can't expect the compiler to optimize this, even in those cases in which x is no longer used and it could, theoretically, be updated in place (a "linear" use). What happens is that the evaluation of x++[6] must be ready for the worst-case in which x is reused afterwards, and so it must copy the whole list x.
As @Ben notes, saying "the list is copied" is imprecise. What actually happens is that the cells with the pointers are copied (the so-called "spine" of the list), but the elements are not. For instance,
x = [[1,2],[2,3]]
y = x ++ [[3,4]]
requires allocating the inner lists [1,2], [2,3], [3,4] only once. The lists of lists x and y will share pointers to the lists of integers, which do not have to be duplicated.
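The copying described in this answer follows directly from how list append is usually written. The sketch below uses a hand-written append (a made-up name, so that it does not clash with the Prelude's ++) with the same shape as the standard definition: each cell of the first list is rebuilt, while the second list is reused untouched.

```haskell
-- Same shape as the standard definition of (++): one new cons cell is
-- allocated per element of the first list; `ys` is shared, not copied.
append :: [a] -> [a] -> [a]
append []       ys = ys
append (x : xs) ys = x : append xs ys

x :: [Int]
x = [1 .. 5]

-- `x` is not changed by the append, so this is 6 + 5.
n :: Int
n = length (append x [6]) + length x
```

Here n evaluates to 11, matching the argument above. The recursion walks the whole of the first argument, which is where the O(n) cost in the length of the first list comes from.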
What you're asking for is related to a question I wrote for TCS Stackexchange some time back: the data structure that supports constant-time concatenation of functional lists is a difference list.
A way of handling such lists in a functional programming language was worked out by Yasuhiko Minamide in the 90s; I effectively rediscovered it a while back. However, the good run-time guarantees require language-level support that's not available in Haskell.
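For comparison, the encoding usually called a difference list in Haskell represents a list as a function that prepends it. The sketch below shows that common function-based encoding, which is an assumption on my part and not the language-supported technique this answer refers to; the names DList, fromList, toList, appendD and example are illustrative.

```haskell
-- A difference list: a list represented as "the function that prepends it".
newtype DList a = DList ([a] -> [a])

fromList :: [a] -> DList a
fromList xs = DList (xs ++)

-- Converting back applies the accumulated function to the empty list;
-- this is where the actual list cells get built.
toList :: DList a -> [a]
toList (DList f) = f []

-- Concatenation is just function composition, so it takes constant time
-- and traverses neither argument.
appendD :: DList a -> DList a -> DList a
appendD (DList f) (DList g) = DList (f . g)

example :: [Int]
example = toList (fromList [1, 2, 3] `appendD` fromList [4, 5] `appendD` fromList [6])
```

example evaluates to [1,2,3,4,5,6]. The concatenations themselves are O(1); the traversal cost is deferred until toList is called.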
It's because of immutable state. A list is an object + a pointer, so if we imagined a list as a Tuple it might look like this:
let tupleList = ("a", ("b", ("c", [])))
Now let's get the first item in this "list" with a "head" function. This head function takes O(1) time because we can use fst:
> fst tupleList
If we want to swap out the first item in the list with a different one we could do this:
let tupleList2 = ("x",snd tupleList)
Which can also be done in O(1). Why? Because absolutely no other element in the list stores a reference to the first entry. Because of immutable state, we now have two lists, tupleList and tupleList2. When we made tupleList2 we didn't copy the whole list. Because the original pointers are immutable we can continue to reference them but use something else at the start of our list.
Now let's try to get the last element of our 3 item list:
> fst . snd . snd $ tupleList
That took three steps, one per element, so the cost grows with the length of our list, i.e. O(n).
But couldn't we store a pointer to the last element in the list and access that in O(1)? To do that we would need an array, not a list. An array allows O(1) lookup of any element because its elements are stored contiguously, so an element's address can be computed directly from its index.
(ASIDE: If you're unsure of why we would use a Linked List instead of an Array then you should do some more reading about data structures, algorithms on data structures and Big-O time complexity of various operations like get, poll, insert, delete, sort, etc).
Now that we've established that, let's look at concatenation. Let's concat tupleList with a new list, ("e", ("f", [])). To do this we have to traverse the whole list just like getting the last element:
tupleList3 = (fst tupleList, (fst . snd $ tupleList, (fst . snd . snd $ tupleList, ("e", ("f", [])))))
The above operation is actually worse than O(n) time, because for each element in the list we have to re-read the list up to that index. But let's ignore that for a moment and focus on the key aspect: in order to get to the last element in the list, we must traverse the entire structure.
You may be asking, why don't we just store in memory what the last list item is? That way appending to the end of the list would be done in O(1). But not so fast, we can't change the last list item without changing the entire list. Why?
Let's take a stab at how that might look:
data Queue a = Queue { last :: Queue a, head :: a, next :: Queue a} | Empty
appendEnd :: a -> Queue a -> Queue a
appendEnd a2 (Queue l h n) = ????
If I modify "last", which is an immutable variable, I won't actually be modifying the pointer for the last item in the queue. I will be creating a copy of the last item. Everything else that referenced that original item will continue referencing the original item.
So in order to update the last item in the queue, I have to update everything that has a reference to it, which takes at best O(n) time.
So in our traditional list, we have our final item:
List a []
But if we want to change it, we make a copy of it. Now the second last item has a reference to an old version. So we need to update that item.
List a (List a [])
But if we update the second last item we make a copy of it. Now the third last item has an old reference. So we need to update that. Repeat until we get to the head of the list. And we come full circle. Nothing keeps a reference to the head of the list so editing that takes O(1).
This is the reason that Haskell doesn't have Doubly Linked Lists. It's also why a "Queue" (or at least a FIFO queue) can't be implemented in a traditional way. Making a Queue in Haskell involves some serious re-thinking of traditional data structures.
If you become even more curious about how all of this works, consider getting the book Purely Functional Data Structures.
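One well-known way to rethink a FIFO queue without mutation is to keep two ordinary lists, a technique that book covers. The sketch below is a minimal version under that assumption; the names Queue, emptyQueue, enqueue, dequeue and drain are illustrative and unrelated to the Queue type sketched earlier in this answer.

```haskell
-- A FIFO queue as two lists: the first is in dequeue order, the second
-- holds newly enqueued items in reverse order.
data Queue a = Queue [a] [a]

emptyQueue :: Queue a
emptyQueue = Queue [] []

-- Enqueue is a single cons onto the back list: O(1), nothing is copied.
enqueue :: a -> Queue a -> Queue a
enqueue x (Queue front back) = Queue front (x : back)

-- Dequeue takes from the front list. When the front list is empty, the back
-- list is reversed once (O(n)); this averages out to O(1) per operation as
-- long as old versions of the queue are not reused.
dequeue :: Queue a -> Maybe (a, Queue a)
dequeue (Queue [] [])            = Nothing
dequeue (Queue [] back)          = dequeue (Queue (reverse back) [])
dequeue (Queue (x : front) back) = Just (x, Queue front back)

-- Drain the queue into a list, oldest element first.
drain :: Queue a -> [a]
drain q = case dequeue q of
  Nothing      -> []
  Just (x, q') -> x : drain q'
```

For example, drain (enqueue 3 (enqueue 2 (enqueue 1 emptyQueue))) gives [1,2,3]. Every operation returns a new Queue value and leaves the old one intact, so no pointer is ever updated in place.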
EDIT: If you've ever seen this: http://visualgo.net/list.html you might notice that in the visualization "Insert Tail" happens in O(1). But in order to do that we need to modify the final entry in the list to give it a new pointer. Updating a pointer mutates state which is not allowed in a purely functional language. Hopefully that was made clear with the rest of my post.
In order to concatenate two lists (call them xs and ys), we need to modify the final node in xs in order to link it to (i.e. point at) the first node of ys.
But Haskell lists are immutable, so we have to create a copy of xs first. This operation is O(n) (where n is the length of xs).
Example:
xs
|
v
1 -> 2 -> 3
1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7
^              ^
|              |
xs ++ ys       ys
I'm learning Haskell. When I read books (in Russian translation), I often see the word "mapping", and I'm not sure I understand it correctly.
My understanding is that a mapping means obtaining a new value from some old value. So it is the result of any function with at least one parameter, or of a data constructor. The new value doesn't have to be of the same type as the old one.
For example:
-- mapping samples:
func a b = a + b
func' a = show a
func'' a = a
func''' a = Just a
Am I right?
Yes, what you have understood is correct. Mapping means getting a new value from an old value by applying some function to it. The new value may or may not be of the same type as the old value. In mathematics, the words mapping and function are actually used interchangeably.
There is another concept related to mapping: map
map is a famous higher order function which can perform mapping on a list of values.
λ> map (+ 1) [1,2,3]
[2,3,4]
In the previous example, you are using the map function to apply the function (+ 1) on each of the list values.
I am using a nested plist to create a structure of objects (CLOS type), passing the nested plists on to its parts. I want to append to the nested plist iteratively, and I want to do it efficiently in terms of time and memory.
The below example shows the delta due to one iteration:
'(:airframer "Boeing" :type "777" :wing-plist ((:side :left :winglet? nil)
                                               (:side :right :winglet? nil)))
into
'(:airframer "Boeing" :type "777" :wing-plist ((:type :main-wing :side :left)
                                               (:type :main-wing :side :right)
                                               (:type :stabilizer :size :left)))
I already read that the use of vectors instead of lists might help, as you access elements without too much penalty: Replace an item in a list in Common Lisp?. However, I would really like to bypass the use of vectors.
Furthermore, I think using a destructive function would save memory and hopefully computation time.
This is how I have solved it for now, but I have the feeling that it is neither elegant nor efficient. The function fill is used for its destructiveness.
(defun append-nested-plist (plist key sub-plist)
  (let* ((key-pos (position key plist)))
    (fill plist (append (getf plist key) (list sub-plist))
          :start (+ key-pos 1) :end (+ key-pos 2))))
I am looking forward to your answers.
How about this?
(defun append-nested-plist (plist key sub-plist)
(push-to-end sub-plist (getf plist key))
plist)
Push-to-end is a commonly defined macro that's not part of the Common Lisp standard:
(defmacro push-to-end (item place)
`(setf ,place (nconc ,place (list ,item))))