Erlang: Extracting values from a Key/Value tuple - get

I have a tuple list that looks like this:
{[{<<"id">>,1},
{<<"alerts_count">>,0},
{<<"username">>,<<"santiagopoli">>},
{<<"facebook_name">>,<<"Santiago Ignacio Poli">>},
{<<"lives">>,{[{<<"quantity">>,8},
{<<"max">>,8},
{<<"unlimited">>,true}]}}]}
I want to know how to extract properties from that tuple. For example:
get_value("id",TupleList), %% should return 1.
get_value("facebook_name",TupleList), %% should return "Santiago Ignacio Poli".
get_value("lives"), %% should return another TupleList, so i can call
get_value("quantity",get_value("lives",TupleList)).
I tried to match all the "properties" to a record called "User", but I don't know how to do it.
To be more specific: I used the Jiffy library (github.com/davisp/jiffy) to parse a JSON document, and now I want to obtain values from that JSON.
Thanks!

The first strange thing is the extra wrapping: the [{Key, Value}] list is embedded in a one-element tuple for no obvious reason. So let's refer to all that stuff you wrote as a variable called Stuff, and pull the list out:
{KVList} = Stuff
Good start. Now we are dealing with a {Key, Value} type list. With that done, we can now do:
lists:keyfind(<<"id">>, 1, KVList)
or alternately:
proplists:get_value(<<"id">>, KVList)
...and we would get the first answer you asked about. (Note the difference in what the two return when the Key isn't in KVList before you copypasta some code from here: keyfind gives false, while get_value gives undefined.)
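For the nested lookups, the same unwrap-then-look-up step simply repeats at every level. A minimal sketch (the module and function names are invented here), assuming jiffy-style {Proplist} terms all the way down:

-module(json_get).
-export([get_value/2]).

%% Unwrap jiffy's one-element tuple, then do an ordinary proplists lookup.
get_value(Key, {PropList}) ->
    get_value(Key, PropList);
get_value(Key, PropList) when is_list(PropList) ->
    proplists:get_value(Key, PropList).

%% Usage:
%%   json_get:get_value(<<"quantity">>,
%%       json_get:get_value(<<"lives">>, Stuff)).  %% => 8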
A further examination of this particular style of question gets into two distinctly different areas:
Erlang docs on the modules that operate on {Key, Value} data (hint: lists, proplists, orddict, and any other modules based on the same concept are good candidates for research, all in the standard library), including basic filter and map.
The underlying concept of data structures as semantically meaningful constructs. Honestly, I don't see a lot of deliberate thought given to this in the functional programming world outside advanced type systems (like in Haskell, or what Dialyzer tries hard to give you). The best place to learn about this is relational database concepts -- once you know what "5NF" really means, then come back to the real world and you'll have a different, more insightful perspective, and problems like this won't just be trivial, they will beg for better foundations.

You should look into the proplists module and its proplists:get_value/2 function.
You just need to think about how it should behave when the Key is not present in the list (or whether the default proplists behaviour is satisfactory).
And two notes:
since your keys are binaries, you should use <<"id">> in your function calls;
proplists works on lists, but the data you presented is a list inside a one-element tuple, so you first need to extract it from your Data:
{PropList} = Data,
Id = proplists:get_value(<<"id">>, PropList),
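If a key may be missing, proplists:get_value/3 takes an explicit default instead of returning undefined:

Count = proplists:get_value(<<"alerts_count">>, PropList, 0),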

Related

Ensure variable is a list

I often find myself in a situation where I have a variable that may or may not be a list of items that I want to iterate over. If it's not a list I'll make a list out of it so I can still iterate over it. That way I don't have to write the code inside the loop twice.
def dispatch(stuff):
    if type(stuff) is not list:
        stuff = [stuff]
    for item in stuff:
        # this could be several lines of code
        do_something_with(item)
What I don't like about this is (1) the two extra lines, (2) the type checking, which is generally discouraged, and (3) that I really should be checking whether stuff is an iterable, because it could just as well be a tuple; but then it gets even more complicated. The point is, any robust solution I can think of involves an unpleasant amount of boilerplate code.
You cannot ensure stuff is a list by writing for item in [stuff] because it will make a nested list if stuff is already a list, and not iterate over the items in stuff. And you can't do for item in list(stuff) either because the constructor of list throws an error if stuff is not an iterable.
So the question I'd like to ask: is there anything obvious I've missed to the effect of ensurelist(stuff), and if not, can you think of a reason why such functionality is not made easily accessible by the language?
Edit:
In particular, I wonder why list(x) doesn't allow x to be non-iterable, simply returning a list with x as a single item.
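(For reference, the kind of helper the question is after might look like the sketch below; ensure_list is an invented name, and the special-casing of strings hints at why no general-purpose builtin exists.)

def ensure_list(x):
    # Strings are iterable but usually meant as single values,
    # which is one reason a one-size-fits-all builtin is tricky.
    if isinstance(x, str):
        return [x]
    try:
        return list(x)
    except TypeError:
        return [x]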
Consider the example of the classes defined in the io module, which provide separate write and writelines methods for writing a single line and writing a list of lines. Provide separate functions that do different things. (One can even use the other.)
def dispatch(stuff):
    do_something_with(stuff)

def dispatch_all(stuff):
    for item in stuff:
        dispatch(item)
The caller will have an easier time deciding whether dispatch or dispatch_all is the correct function to use than your do-it-all function will have deciding whether it needs to iterate over its argument or not.
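In other words, the decision moves to the call site, where the caller already knows the shape of the data (the names below are placeholders):

dispatch(single_job)          # exactly one item
dispatch_all([job_a, job_b])  # an iterable of items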

Options for representing string input as an object

I am receiving as input a "map" represented by strings, where certain nodes of the map are significant (marked s). For example:
---s--
--s---
s---s-
s---s-
-----s
My question is: what reasonable options are there for representing this input as an object?
The only option that really comes to mind is:
(1) Each position translated to a node with up, down, left, and right pointers. The whole object contains a pointer to the top-right node.
This seems like just a graph representation specific to this problem.
Thanks for the help.
Additionally, if there are common terms for this type of input, please let me know.
Well, it depends a lot on what you need to delegate to those objects. OOP is basically about asking objects to perform things in order to solve a given problem, so it is hard to tell without knowing what you need to accomplish.
The solution you mention can be a valid one, as can having a matrix (in this case 6x5) where each cell stores an object representing the node (just as an example, I once used both approaches to model Conway's Game of Life). If you could give some more information on what you need to do with the object representation of your map, then a better design can be discussed.
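For illustration, a minimal sketch of the matrix idea in Python (all names invented here):

def parse_map(rows):
    # True marks a significant node ('s'); False marks an empty cell.
    return [[ch == 's' for ch in row] for row in rows]

grid = parse_map([
    "---s--",
    "--s---",
    "s---s-",
    "s---s-",
    "-----s",
])
# grid[row][col] says whether the node at that position is significant.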
HTH

Is there an object-identity-based, thread-safe memoization library somewhere?

I know that memoization seems to be a perennial topic here on the haskell tag on stack overflow, but I think this question has not been asked before.
I'm aware of several different 'off the shelf' memoization libraries for Haskell:
The memo-combinators and memotrie packages, which make use of a beautiful trick involving lazy infinite data structures to achieve memoization in a purely functional way. (As I understand it, the former is slightly more flexible, while the latter is easier to use in simple cases: see this SO answer for discussion.)
The uglymemo package, which uses unsafePerformIO internally but still presents a referentially transparent interface. The use of unsafePerformIO internally results in better performance than the previous two packages. (Off the shelf, its implementation uses comparison-based search data structures rather than perhaps-slightly-more-efficient hash functions; but I think that if you find and replace Cmp with Hashable and Data.Map with Data.HashMap and add the appropriate imports, you get a hash-based version.)
However, I'm not aware of any library that looks answers up based on object identity rather than object value. This can be important, because sometimes the kinds of object which are being used as keys to your memo table (that is, as input to the function being memoized) can be large---so large that fully examining the object to determine whether you've seen it before is itself a slow operation. Slow, and also unnecessary, if you will be applying the memoized function again and again to an object which is stored at a given 'location in memory' 1. (This might happen, for example, if we're memoizing a function which is being called recursively over some large data structure with a lot of structural sharing.) If we've already computed our memoized function on that exact object before, we can already know the answer, even without looking at the object itself!
Implementing such a memoization library involves several subtle issues and doing it properly requires several special pieces of support from the language. Luckily, GHC provides all the special features that we need, and there is a paper by Peyton-Jones, Marlow and Elliott which basically worries about most of these issues for you, explaining how to build a solid implementation. They don't provide all details, but they get close.
The one detail which I can see that one probably ought to worry about, but which they don't, is thread safety---their code is apparently not thread-safe at all.
My question is: does anyone know of a packaged library which does the kind of memoization discussed in the Peyton-Jones, Marlow and Elliott paper, filling in all the details (and preferably filling in proper thread-safety as well)?
Failing that, I guess I will have to code it up myself: does anyone have any ideas of other subtleties (beyond thread safety and the ones discussed in the paper) which the implementer of such a library would do well to bear in mind?
UPDATE
Following #luqui's suggestion below, here's a little more data on the exact problem I face. Let's suppose there's a type:
data Node = Node [Node] [Annotation]
This type can be used to represent a simple kind of rooted DAG in memory, where Nodes are DAG Nodes, the root is just a distinguished Node, and each node is annotated with some Annotations whose internal structure, I think, need not concern us (but if it matters, just ask and I'll be more specific.) If used in this way, note that there may well be significant structural sharing between Nodes in memory---there may be exponentially more paths which lead from the root to a node than there are nodes themselves. I am given a data structure of this form, from an external library with which I must interface; I cannot change the data type.
I have a function
myTransform :: Node -> Node
the details of which need not concern us (or at least I think so; but again I can be more specific if needed). It maps nodes to nodes, examining the annotations of the node it is given and the annotations of its immediate children, to come up with a new Node with the same children but possibly different annotations. I wish to write a function
recursiveTransform :: Node -> Node
whose output 'looks the same' as the data structure you would get by doing:
recursiveTransform (Node originalChildren annotations) =
    myTransform (Node recursivelyTransformedChildren annotations)
  where
    recursivelyTransformedChildren = map recursiveTransform originalChildren
except that it uses structural sharing in the obvious way so that it doesn't return an exponential data structure, but rather one on the order of the same size as its input.
I appreciate that this would all be easier if say, the Nodes were numbered before I got them, or I could otherwise change the definition of a Node. I can't (easily) do either of these things.
I am also interested in the general question of the existence of a library implementing the functionality I mention quite independently of the particular concrete problem I face right now: I feel like I've had to work around this kind of issue on a few occasions, and it would be nice to slay the dragon once and for all. The fact that SPJ et al felt that it was worth adding not one but three features to GHC to support the existence of libraries of this form suggests that the feature is genuinely useful and can't be worked around in all cases. (BUT I'd still also be very interested in hearing about workarounds which will help in this particular case too: the long term problem is not as urgent as the problem I face right now :-) )
1 Technically, I don't quite mean location in memory, since the garbage collector sometimes moves objects around a bit---what I really mean is 'object identity'. But we can think of this as being roughly the same as our intuitive idea of location in memory.
If you only want to memoize based on object identity, and not equality, you can just use the existing laziness mechanisms built into the language.
For example, if you have a data structure like this
data Foo = Foo { ... }
expensive :: Foo -> Bar
then you can just add the value to be memoized as an extra field and let the laziness take care of the rest for you.
data Foo = Foo { ..., memo :: Bar }
To make it easier to use, add a smart constructor to tie the knot.
makeFoo ... = let foo = Foo { ..., memo = expensive foo } in foo
Though this is somewhat less elegant than using a library, and requires modification of the data type to really be useful, it's a very simple technique and all thread-safety issues are already taken care of for you.
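For concreteness, a self-contained toy version of the pattern (all names invented here; the factorial stands in for the expensive computation):

data Foo = Foo { payload :: Int, memo :: Integer }

expensive :: Foo -> Integer
expensive f = product [1 .. toInteger (payload f)]

-- The smart constructor ties the knot: 'memo' is a lazy thunk that
-- runs 'expensive' at most once, on first use, and is shared after.
makeFoo :: Int -> Foo
makeFoo n = let foo = Foo { payload = n, memo = expensive foo } in foo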
It seems that stable-memo would be just what you needed (although I'm not sure if it can handle multiple threads):
Whereas most memo combinators memoize based on equality, stable-memo does it based on whether the exact same argument has been passed to the function before (that is, is the same argument in memory).
stable-memo only evaluates keys to WHNF.
This can be more suitable for recursive functions over graphs with cycles.
stable-memo doesn't retain the keys it has seen so far, which allows them to be garbage collected if they will no longer be used. Finalizers are put in place to remove the corresponding entries from the memo table if this happens.
Data.StableMemo.Weak provides an alternative set of combinators that also avoid retaining the results of the function, only reusing results if they have not yet been garbage collected.
There is no type class constraint on the function's argument.
stable-memo will not work for arguments which happen to have the same value but are not the same heap object. This rules out many candidates for memoization, such as the most common example, the naive Fibonacci implementation whose domain is machine Ints; it can still be made to work for some domains, though, such as the lazy naturals.
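If stable-memo does fit, the recursive transform from the question might look something like this sketch (assuming the package exports memo :: (a -> b) -> a -> b from Data.StableMemo):

import Data.StableMemo (memo)

recursiveTransform :: Node -> Node
recursiveTransform = memo go
  where
    -- Children re-enter through the memoized function, so shared
    -- subgraphs are transformed once and the sharing is preserved.
    go (Node children annotations) =
        myTransform (Node (map recursiveTransform children) annotations)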
Ekmett just uploaded a library that handles this and more (produced at HacPhi): http://hackage.haskell.org/package/intern. He assures me that it is thread safe.
Edit: Actually, strictly speaking I realize this does something rather different. But I think you can use it for your purposes. It's really more of a stringtable-atom type interning library that works over arbitrary data structures (including recursive ones). It uses WeakPtrs internally to maintain the table. However, it uses Ints to index the values to avoid structural equality checks, which means packing them into the data type, when what you want are apparently actually StableNames. So I realize this answers a related question, but requires modifying your data type, which you want to avoid...

Tuples in .NET 4.0: when should I use them?

I have come across Tuples in .NET 4.0. I have seen a few examples on MSDN; however, it's still not clear to me what their purpose is or when to use them.
Is the idea that if I want to create a collection of mixed types I should use a tuple?
Are there any clear examples out there I can relate to?
When did you last use them?
Thanks for any suggestions.
Tuples are just a convenience for the developer while coding. If you want to return two pieces of information instead of one, you can use a Tuple for fast coding, but I recommend you make yourself a type that contains both properties, with appropriate naming and documentation.
Tuples are not for mixing types as you imagine; they are for making compositions of other types. E.g. a type that holds both an int and a string can be represented by a tuple: Tuple<int,string>.
Tuples exist in many sizes, not only two.
I don't recommend using tuples in your final code, since their meaning is not clear.
Tuples can be used as multipart keys for dictionaries or grouping statements because they implement structural equality and hashing. Avoid using them to move data around, because they have poor language support in C# and named values (classes, structs) are better (simpler) than ordered values (tuples).
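For instance, a minimal sketch of the multipart-key use (names invented here):

using System;
using System.Collections.Generic;

class Example
{
    static void Main()
    {
        // Tuple<,> implements structural equality and hashing, so it
        // works as a composite dictionary key without a custom comparer.
        var prices = new Dictionary<Tuple<string, int>, decimal>();
        prices[Tuple.Create("widget", 2014)] = 9.99m;
        Console.WriteLine(prices[Tuple.Create("widget", 2014)]);  // 9.99
    }
}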

Ordering of parameters to make use of currying

I have twice recently refactored code in order to change the order of parameters because there was too much code where hacks like flip or \x -> foo bar x 42 were happening.
When designing a function signature what principles will help me to make the best use of currying?
For languages that support currying and partial-application easily, there is one compelling series of arguments, originally from Chris Okasaki:
Put the data structure as the last argument
Why? You can then compose operations on the data nicely. E.g. insert 1 $ insert 2 $ insert 3 $ s. This also helps for functions on state.
Standard libraries such as "containers" follow this convention.
Alternate arguments are sometimes given to put the data structure first, so it can be closed over, yielding functions on a static structure (e.g. lookup) that are a bit more concise. However, the broad consensus seems to be that this is less of a win, especially since it pushes you towards heavily parenthesized code.
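As a sketch of the payoff (using Data.Set from the containers library; addDefaults is an invented name):

import qualified Data.Set as Set

-- Structure-last means each partially applied insert is a
-- Set -> Set transformation, so they compose directly with (.).
addDefaults :: Set.Set Int -> Set.Set Int
addDefaults = Set.insert 1 . Set.insert 2 . Set.insert 3

main :: IO ()
main = print (addDefaults Set.empty)  -- fromList [1,2,3]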
Put the most varying argument last
For recursive functions, it is common to put the argument that varies the most (e.g. an accumulator) last, and the argument that varies the least (e.g. a function argument) first. This composes well with the data-structure-last style.
A summary of the Okasaki view is given in his Edison library (again, another data structure library):
Partial application: arguments more likely to be static usually appear before other arguments in order to facilitate partial application.
Collection appears last: in all cases where an operation queries a single collection or modifies an existing collection, the collection argument will appear last. This is something of a de facto standard for Haskell datastructure libraries and lends a degree of consistency to the API.
Most usual order: where an operation represents a well-known mathematical function on more than one datastructure, the arguments are chosen to match the most usual argument order for the function.
Place the arguments that you are most likely to reuse first. Function arguments are a great example of this. You are much more likely to want to map f over two different lists, than you are to want to map many different functions over the same list.
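For example, because map takes the function first, fixing the function and varying the list is the natural partial application (squareAll is an invented name):

squareAll :: [Int] -> [Int]
squareAll = map (^ 2)

-- squareAll [1,2,3]  => [1,4,9]
-- squareAll [4,5]    => [16,25]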
I tend to do what you did, pick some order that seems good and then refactor if it turns out that another order is better. The order depends a lot on how you are going to use the function (naturally).
