Ensure variable is a list - python-3.x

I often find myself in a situation where I have a variable that may or may not be a list of items that I want to iterate over. If it's not a list I'll make a list out of it so I can still iterate over it. That way I don't have to write the code inside the loop twice.
def dispatch(stuff):
    # Wrap a single item in a list so the loop below handles both cases
    if type(stuff) is not list:
        stuff = [stuff]
    for item in stuff:
        # this could be several lines of code
        do_something_with(item)
What I don't like about this is (1) the two extra lines, (2) the type checking, which is generally discouraged, and (3) that I really should be checking whether stuff is an iterable, because it could just as well be a tuple, but then it gets even more complicated. The point is, any robust solution I can think of involves an unpleasant amount of boilerplate code.
You cannot ensure stuff is a list by writing for item in [stuff] because it will make a nested list if stuff is already a list, and not iterate over the items in stuff. And you can't do for item in list(stuff) either because the constructor of list throws an error if stuff is not an iterable.
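To make the two pitfalls concrete, here is a small sketch. The variable names are made up for illustration, and the string case is an extra observation that the question does not mention:

```python
stuff = [1, 2, 3]

# Wrapping an existing list nests it: the loop sees one item, the whole list.
wrapped_items = [item for item in [stuff]]

# list() copies an iterable, but rejects a non-iterable such as an int.
try:
    list(5)
    list_error = None
except TypeError as exc:
    list_error = exc

# list() also splits a string into characters, which is rarely what is wanted.
split_string = list("abc")
```

Here wrapped_items holds a single element (the original list), and list_error holds the TypeError raised by list(5).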
So the question I'd like to ask: is there anything obvious I've missed to the effect of ensurelist(stuff), and if not, can you think of a reason why such functionality is not made easily accessible by the language?
Edit:
In particular, I wonder why list(x) doesn't allow x to be non-iterable, simply returning a list with x as a single item.
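For reference, a helper along the lines of ensurelist could be sketched as below. The name ensure_list and the decision to treat strings and bytes as single items are assumptions made for this example, not anything built into the language:

```python
from collections.abc import Iterable


def ensure_list(stuff):
    """Hypothetical helper: return stuff as a list, wrapping single items."""
    # Strings and bytes are iterable, but are treated here as single items.
    if isinstance(stuff, (str, bytes)):
        return [stuff]
    if isinstance(stuff, Iterable):
        return list(stuff)
    return [stuff]
```

The string special case hints at why a built-in version would be contentious: whether "abc" should count as one item or three depends on the caller.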

Consider the example of the classes defined in the io module, which provide separate write and writelines methods for writing a single line and writing a list of lines. Provide separate functions that do different things. (One can even use the other.)
def dispatch(item):
    do_something_with(item)

def dispatch_all(stuff):
    for item in stuff:
        dispatch(item)
The caller will have an easier time deciding whether dispatch or dispatch_all is the correct function to use than your do-it-all function will have deciding whether it needs to iterate over its argument or not.


Why do we need map() and apply() if we already have starmap()?

import multiprocessing as mp
pool = mp.Pool()
As I understand it (please correct me if wrong):
map works on functions that take only one input parameter, and you can pass in a list of such single parameters to make multiple function calls
apply works on functions that can take multiple parameters, but you can only pass one tuple of such parameters to make one function call
starmap can deal with multi-parameter functions and can also be passed a list of tuples of parameters
Since starmap can handle what map and apply can, why does Python provide three functions instead of just one? In other words, what are the advantages of map and/or apply over starmap?
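The three calls can be compared side by side in a small sketch. It uses multiprocessing.pool.ThreadPool, which exposes the same map, apply and starmap methods as mp.Pool, only so that the snippet runs without a main guard. The functions square and add are invented for illustration:

```python
from multiprocessing.pool import ThreadPool


def square(x):
    return x * x


def add(a, b):
    return a + b


with ThreadPool(2) as pool:
    # map: many calls, one argument each
    map_result = pool.map(square, [1, 2, 3])
    # apply: a single call, arguments passed as one tuple
    apply_result = pool.apply(add, (1, 2))
    # starmap: many calls, each with its own tuple of arguments
    starmap_result = pool.starmap(add, [(1, 2), (3, 4)])
    # starmap can imitate map if each argument is wrapped in a 1-tuple
    starmap_as_map = pool.starmap(square, [(x,) for x in [1, 2, 3]])
```

The last line shows the cost of using starmap everywhere: single-argument jobs have to be wrapped in 1-tuples.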
Update
As @coldspeed pointed out, it could just be backward compatibility. But this raises the question, and I guess this is what actually puzzles me: why did Python provide both map and apply in the first place? What's so hard about allowing multiple parameters and a list of jobs at the same time?
Hope this is not closed due to being considered "primarily opinion based". There must be some reasons that can be universally appreciated why the original python developers made two functions each having its own limit instead of one single all-powerful one.

Python, understanding list-comprehension

I'm learning data structures and I wanted to put the data in the stack into a list and I did it using this code
data_list=[Stack1.pop() for data in range(Stack1.get_top()+1)]
Now this does achieve it. But I would like to know:
Even though the variable 'data' is not used in the expression 'Stack1.pop()', the comprehension works. Please explain how it works, with an example where the variable is not used in the expression.
Is this approach good with respect to stacks and queues?
Like any list comprehension, you can modify your code into an equivalent for loop with repeated append calls:
data_list = []
for _ in range(Stack1.get_top()+1):
    data_list.append(Stack1.pop())
The code works (I assume) because get_top returns one less than the size of the stack. It does have a side effect though, of emptying out the stack, which may or may not be what you want.
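Here is a minimal sketch of a comprehension whose loop variable is never used. It substitutes a plain list for the Stack1 class, since that class's implementation is not shown in the question:

```python
stack = [1, 2, 3]  # the top of the stack is the end of the list

# The loop variable only counts iterations; pop() is called once per pass.
# range(len(stack)) is evaluated once, before any item is popped.
popped = [stack.pop() for _ in range(len(stack))]
```

After this runs, popped holds the items in the order they came off the stack, and stack itself is empty.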
A more natural way of using the items from a stack is to use a while loop:
while not some_stack.is_empty():
    item = some_stack.pop()
    do_something(item)
The advantage of the while loop is that it will still work if do_something modifies the stack (either by pushing new values or popping off additional ones).
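As a rough illustration of that advantage, the sketch below uses a plain list as the stack and a made-up do_something that pushes more work while the loop is running:

```python
stack = [3]
visited = []


def do_something(item):
    visited.append(item)
    # Pushing new work onto the stack during processing is fine here.
    if item > 1:
        stack.append(item - 1)


while stack:  # an empty list is falsy, so this plays the role of is_empty()
    item = stack.pop()
    do_something(item)
```

The loop keeps going until the stack is really empty, so the items pushed along the way are processed too.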
A final note: It's not usually necessary to use a special stack type in Python. Lists have O(1) methods to append() and pop() from the end of the list. If you want the items from a list in the order they'd be popped, you can just reverse it using the list.reverse() method (to reverse in place), the reversed builtin function (to get a reverse iterator), or an "alien smiley" slice (some_list[::-1]; to get a reversed copy of the list).

In Python, how to know if a function is getting a variable or an object?

How can you test whether your function is getting [1,2,4,3] or l?
That might be useful to decide whether you want to return, for example, an ordered list or replace it in place.
For example, if it gets [1,2,4,3] it should return [1,2,3,4]. If it gets l, it should bind the ordered list to l and not return anything.
You can't tell the difference in any reasonable way; you could do terrible things with the gc module to count references, but that's not a reasonable way to do things. There is no difference between an anonymous object and a named variable (aside from the reference count), because it will be named no matter what when received by the function; "variables" aren't really a thing, Python has "names" which reference objects, with the object utterly unconcerned with whether it has named or unnamed references.
Make a consistent API. If you need to have it operate both ways, either have it do both things (mutate in place and return the mutated copy for completeness), or make two distinct APIs (one of which can be written in terms of the other, by having the mutating version used to implement the return new version by making a local copy of the argument, passing it to the mutating version, then returning the mutated local copy).
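The "two distinct APIs" option might look like the following sketch. The function names sort_in_place and sorted_copy are invented for this example:

```python
def sort_in_place(items):
    """Mutating version: sorts the given list and returns None."""
    items.sort()


def sorted_copy(items):
    """Non-mutating version, written in terms of the mutating one."""
    local = list(items)   # local copy of the argument
    sort_in_place(local)  # mutate only the copy
    return local
```

The caller picks the behavior by picking the function, so neither function has to guess how its argument was passed. (The built-in list.sort() and sorted() follow the same split.)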

Erlang: Extracting values from a Key/Value tuple

I have a tuple list that looks like this:
{[{<<"id">>,1},
  {<<"alerts_count">>,0},
  {<<"username">>,<<"santiagopoli">>},
  {<<"facebook_name">>,<<"Santiago Ignacio Poli">>},
  {<<"lives">>,{[{<<"quantity">>,8},
                 {<<"max">>,8},
                 {<<"unlimited">>,true}]}}]}
I want to know how to extract properties from that tuple. For example:
get_value("id",TupleList), %% should return 1.
get_value("facebook_name",TupleList), %% should return "Santiago Ignacio Poli".
get_value("lives",TupleList), %% should return another TupleList, so I can call
get_value("quantity",get_value("lives",TupleList)).
I tried to match all the "properties" to a record called "User" but I don't know how to do it.
To be more specific: I used the Jiffy library (github.com/davisp/jiffy) to parse a JSON. Now I want to obtain a value from that JSON.
Thanks!
The first strange thing is that the list is wrapped in a single-element tuple: [{Key, Value}] is embedded in {} for no reason. So let's reference all that stuff you wrote as a variable called Stuff, and pull it out:
{KVList} = Stuff
Good start. Now we are dealing with a {Key, Value} type list. With that done, we can now do:
lists:keyfind(<<"id">>, 1, KVList)
or alternately:
proplists:get_value(<<"id">>, KVList)
...and we would get the first answer you asked about. (Note the difference in what the two might return if the Key isn't in the KVList before you copypasta some code from here...).
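The difference in return values can be modelled outside Erlang. The sketch below is a Python stand-in, not Erlang code: keyfind and get_value are hand-written imitations of lists:keyfind/3 and proplists:get_value/2, and Python's False and None stand in for Erlang's false and undefined:

```python
# Stand-in for the Erlang term {[{Key, Value}, ...]}:
# a 1-tuple holding a list of pairs.
stuff = ([("id", 1), ("username", "santiagopoli")],)

(kv_list,) = stuff  # mirrors {KVList} = Stuff


def keyfind(key, pairs):
    """Imitates lists:keyfind/3 on position 1: the whole pair, or False."""
    for pair in pairs:
        if pair[0] == key:
            return pair
    return False


def get_value(key, pairs):
    """Imitates proplists:get_value/2: the value only, or None (undefined)."""
    for k, v in pairs:
        if k == key:
            return v
    return None
```

So a hit gives a whole {Key, Value} pair from one function and a bare value from the other, and a miss gives false from one and undefined from the other.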
A further examination of this particular style of question gets into two distinctly different areas:
Erlang docs regarding data functions that operate on {Key, Value} lists (hint: the lists, proplists, orddict, and any other modules based on the same concept are good candidates for research, all in the standard library), including basic filter and map.
The underlying concept of data structures as semantically meaningful constructs. Honestly, I don't see a lot of deliberate thought given to this in the functional programming world outside advanced type systems (like in Haskell, or what Dialyzer tries hard to give you). The best place to learn about this is relational database concepts -- once you know what "5NF" really means, then come back to the real world and you'll have a different, more insightful perspective, and problems like this won't just be trivial, they will beg for better foundations.
You should look into the proplists module and its proplists:get_value/2 function.
You just need to think about how it should behave when Key is not present in the list (or whether the default proplists behavior is satisfactory).
And two notes:
Since your keys are binaries, you should use <<"id">> in your function.
proplists works on lists, but the data you presented is a list inside a one-element tuple. So you need to extract the list from your Data first.
{PropList} = Data,
Id = proplists:get_value(<<"id">>, PropList),

Why do some programming languages restrict you from editing the array you're looping through?

Pseudo-code:
for each x in someArray {
// possibly add an element to someArray
}
I forget the name of the exception this throws in some languages.
I'm curious to know why some languages prohibit this use case, whereas other languages allow it. Are the allowing languages unsafe -- open to some pitfall? Or are the prohibiting languages simply being overly cautious, or perhaps lazy (they could have implemented the language to gracefully handle this case, but simply didn't bother).
Thanks!
What would you want the behavior to be?
list = [1,2,3,4]
foreach x in list:
    print x
    if x == 2: list.remove(1)
Possible behaviors:
list is some linked-list type, where deletions don't affect your current iterator:
[1,2,3,4]
list is some array, where your iterator iterates via pointer increment:
[1,2,4]
Same as before, only the system tries to cache the iteration count:
[1,2,4,<segfault>]
The problem is that different collections implementing this enumerable/sequence interface that allows for foreach-looping have different behaviors.
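For what it's worth, Python's built-in list happens to show the second behavior. This is a runnable version of the pseudo-code, recording the values instead of printing them:

```python
items = [1, 2, 3, 4]
seen = []
for x in items:
    seen.append(x)
    if x == 2:
        items.remove(1)  # shifts the remaining elements one slot left
```

The loop visits 1, 2 and 4. The element 3 slides into the slot that has already been visited, so it is silently skipped and no error is raised.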
Depending on the language (or platform, such as .NET), iteration may be implemented differently.
Typically a foreach creates an Iterator or Enumerator object on the array, which internally keeps its state about the iteration details. If you modify the array (by adding or deleting an element), the iterator state would be inconsistent in regard to the new state of the array.
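Python's dict is one example of a collection whose iterator detects this kind of inconsistency and refuses to continue. A small sketch, with made-up keys:

```python
counts = {"a": 1, "b": 2}
error = None
try:
    for key in counts:
        # Adding a key changes the dict's size while it is being iterated.
        counts[key + "_copy"] = counts[key]
except RuntimeError as exc:
    error = exc
```

The iterator notices on its next step that the dict has changed size, and raises RuntimeError instead of guessing what to do.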
Platforms such as .Net allow you to define your own enumerators which may not be susceptible to adding/removing elements of the underlying array.
A generic solution to the problem of adding/removing elements while iterating is to collect the elements in a new list/collection/array, and add/remove the collected elements after the enumeration has completed.
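That collect-then-apply approach could look like this in Python. The rule used here (drop even numbers, add ten times each odd number) is arbitrary and only serves the illustration:

```python
items = [1, 2, 3, 4]
to_remove = []
to_add = []

# First pass: only read from items and record the intended changes.
for x in items:
    if x % 2 == 0:
        to_remove.append(x)
    else:
        to_add.append(x * 10)

# Second pass: apply the changes once the enumeration is over.
for x in to_remove:
    items.remove(x)
items.extend(to_add)
```

Because the collection is never touched during the first loop, the iterator's state stays consistent whatever the collection type.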
Suppose your array has 10 elements. You get to the 7th element, and decide there that you need to add a new element earlier in the array. Uh-oh! That element doesn't get iterated on! for each has the semantics, to me at least, of operating on each and every element of the array, once and only once.
Your pseudo example code would lead to an infinite loop. For each element you look at, you add one to the collection, hence if you have at least 1 element to start with, you will have i (iterative counter) + 1 elements.
Arrays are typically fixed in their number of elements. You get flexible sizing through wrapper objects (such as List). I suspect that a language may run into issues if the mechanism it used created a whole new array to allow for the edit.
Many compiled languages implement "for" loops with the assumption that the number of iterations will be calculated once at loop startup (or better yet, compile time). This means that if you change the value of the "to" variable inside the "for i = 1 to x" loop, it won't change the number of iterations. Doing this allows a legion of loop optimizations, which are very important in speeding up number-crunching applications.
If you don't like those semantics, the idea is that you should use the language's "while" construct instead.
Note that in this view of the world, C and C++ don't have proper "for" loops, just fancy "while" loops.
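Python is not a compiled language in the sense meant here, but its for loop over range shows the same "bound computed once" semantics, and its while loop shows the contrast. A small sketch with made-up variable names:

```python
x = 3
iterations = 0
for i in range(x):  # range(x) is built once, before the first iteration
    x = 10          # changing x here does not extend the loop
    iterations += 1

# A while loop re-checks its condition on every pass.
limit = 3
i = 0
while_iterations = 0
while i < limit:
    if limit < 5:
        limit += 1  # the new bound is seen at the next check
    i += 1
    while_iterations += 1
```

The for loop runs three times regardless of the assignment to x, whereas the while loop keeps running until the moving bound stops moving.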
Implementing lists and enumerators that handle this would mean a lot of overhead. This overhead would always be there, and it would only be useful in a small minority of cases.
Also, whatever implementation was chosen would not always make sense. Take the simple case of inserting an item into the list while enumerating it: would the new item always be included in the enumeration, always excluded, or should that depend on where in the list the item was added? If I insert the item at the current position, would that change the value of the Current property of the enumerator, and should it skip the currently current item, which is then the next item?
This only happens within foreach blocks. Use a for loop with an index value and you'll be allowed to. Just make sure to iterate backwards so that you can delete items without causing issues.
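A sketch of the backwards index loop, written in Python with an arbitrary rule (delete the even numbers):

```python
items = [1, 2, 3, 4, 5, 6]

# Walk the indexes from the last one down to 0.
for i in range(len(items) - 1, -1, -1):
    if items[i] % 2 == 0:
        # Deleting shifts only the elements after index i,
        # and those have already been visited.
        del items[i]
```

Going forwards instead would skip the element that slides into the slot just deleted.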
Off the top of my head, there could be two scenarios for implementing iteration on a collection:
the iterator iterates over the collection for which it was created
the iterator iterates over a copy of the collection for which it was created
When changes are made to the collection on the fly, the first option should either update its iteration sequence (which could be very hard or even impossible to do reliably) or just deny the possibility (throw an exception). The latter is obviously the safe option.
In the second option, changes can be made to the original collection without disturbing the iteration sequence. But those changes will not be seen in the iteration, which might be confusing for users (a leaky abstraction).
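In Python the second option is something the caller can opt into by iterating over an explicit snapshot. A minimal sketch:

```python
items = [1, 2, 3]
seen = []
for x in list(items):     # list(items) is a copy taken before the loop starts
    seen.append(x)
    items.append(x * 10)  # safe: the snapshot is unaffected
```

The loop sees only the three original elements, and the appended values show up in items but never in the iteration, which is exactly the trade-off described above.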
I could imagine languages/libraries implementing any of these possibilities with equal merit.
