list a map object in Python3 - python-3.x

Example
def f(x):
    return x**2
list(map(f,[x for x in range(3)]))
Q1:
How does list automatically iterate over the map object? What's actually going on here?
Q2:
Since list is a class and a map object is an iterator, does this mean that whenever a class is called on an iterator, it automatically iterates over that iterator?
Can anyone help me out here? Thanks a lot!

The map function applies a function to every item of an iterable, much like a for loop does.
In CPython, map is implemented in C, so the looping itself avoids some interpreter overhead; whether that makes it faster than a for loop or a comprehension depends on the function being called. It works with any function (first param) and iterable (second param).
list(map(f,[x for x in range(3)])) # [0, 1, 4]
gives the same result as
result = []
for x in range(3):
    result.append(f(x))
You can use a lambda expression as well. This produces the same result as your function, without declaring the function:
list(map(lambda x: x ** 2, [x for x in range(3)])) # [0, 1, 4]

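Q1: The list initializer delegates to list.extend() if it is passed an iterable, and extend() pulls items from the map object one at a time until it is exhausted.
Q2: No. A class iterates over its argument only if its constructor is written to do so, as the constructors of built-in collection types such as list, tuple, and set are.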
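A rough sketch of what list(...) does with a map object, assuming nothing beyond the iterator protocol. The helper to_list and the class Wrapper are made-up names for illustration; the real list constructor is implemented in C and is not written like this.

```python
def f(x):
    return x ** 2

def to_list(iterable):
    """Hypothetical pure-Python stand-in for list(iterable)."""
    result = []
    it = iter(iterable)          # ask the object for an iterator
    while True:
        try:
            item = next(it)      # pull one value; for a map object this calls f
        except StopIteration:    # the iterator signals that it is exhausted
            break
        result.append(item)
    return result

class Wrapper:
    """Hypothetical class that stores its argument without iterating over it."""
    def __init__(self, iterable):
        self.source = iterable

squares = to_list(map(f, range(3)))
print(squares)                   # [0, 1, 4]

w = Wrapper(map(f, range(3)))
remaining = list(w.source)       # Wrapper left the map object untouched
print(remaining)                 # [0, 1, 4]
```

Wrapper shows the answer to Q2: calling a class on an iterator consumes it only if the class's own code asks for the items.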

Related

what is the use of * in print(*a) where 'a' is a list in python

I am a Python newbie. I saw some code that had * inside a print call, print(*a), where 'a' was a list. I know * is the multiplication operator in Python, but I don't know what it does in front of a list.
(If you don't know about functions that take a variable number of arguments, leave this topic and come back to it after learning that.)
Unpacking the elements of a list
Consider new_list = [1, 2, 3]. Now assume you have a function called addNum(*arguments) that is called with a different number of arguments at different times.
case 1:
Consider calling our function with one parameter in the list. How will you call it? Will you do it by addNum(new_list[0])?
Cool! No problem.
case 2: Now consider calling our function with two parameters in the list. How will you call it? Will you do it by addNum(new_list[0], new_list[1])?
Seems tricky!!
case 3: Now consider calling our function with all three parameters in the list. Will you call it by addNum(new_list[0], new_list[1], new_list[2])? What if, instead, you could split the values out with an operator?
Yes! addNum(new_list[0], new_list[1], new_list[2]) <=> addNum(*new_list)
Similarly, addNum(new_list[0], new_list[1]) <=> addNum(*new_list[:2])
Also, addNum(new_list[0]) <=> addNum(*new_list[:1])
The * operator is what makes this possible!!
print(*a) prints all items without the need to loop over the list. The * operator used here unpacks all items from the list.
a = [1,2,3]
print(a)
# [1, 2, 3]
print(*a)
# 1 2 3
print(*a,sep=",")
# 1,2,3
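The three cases above can be sketched as runnable code. The answer never defines addNum, so the body below (summing the arguments) and the name add_num are assumptions made only for illustration.

```python
def add_num(*arguments):
    """Hypothetical variadic function: sums whatever it receives."""
    return sum(arguments)

new_list = [1, 2, 3]

one = add_num(*new_list[:1])     # same as add_num(1)
two = add_num(*new_list[:2])     # same as add_num(1, 2)
three = add_num(*new_list)       # same as add_num(1, 2, 3)
print(one, two, three)           # 1 3 6
```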

Why do I need to define variables in this example while sorting multiple lists at once in Python 3?

Simple problem.
I've got these lists:
alpha = [5,10,1,2]
beta = [1,5,2]
gamma = [5,2,87,100,1]
I thought I could sort them all with:
map(lambda x: list.sort(x),[alpha,beta,gamma])
Which doesn't work.
What is working though is:
a,b,c = map(lambda x: list.sort(x),[alpha,beta,gamma])
Can someone explain why I need to define a,b and c for this code to work?
Because map() is lazy (since Python 3). It returns a generator-like object that is only evaluated when its contents are asked for, for instance because you want to assign its individual elements to variables.
E.g., this also forces it to evaluate:
>>> list(map(lambda x: list.sort(x),[alpha,beta,gamma]))
But using map is a bit archaic; list comprehensions and generator expressions exist and are almost always more idiomatic. Also, list.sort(x) is an odd way to write x.sort() that only works when x really is a list, so avoid it.
[x.sort() for x in [alpha, beta, gamma]]
works as you expected.
But if you aren't interested in the result, building a list isn't relevant. What you really want is a simple for loop:
for x in [alpha, beta, gamma]:
    x.sort()
Which is perfectly Pythonic, except that I maybe like this one even better in this fixed, simple case:
alpha.sort()
beta.sort()
gamma.sort()
can't get more explicit than that.
That is because list.sort() returns None; it sorts the list in place.
If you want sorted copies returned instead, do this:
a,b,c = (sorted(l) for l in [alpha, beta, gamma])
In [10]: alpha.sort()
In [11]: sorted(alpha)
Out[11]: [1, 2, 5, 10]
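A small sketch that puts both answers together, using two of the lists from the question: the map object does nothing until it is consumed, and once it is consumed the values it yields are the None results of sort(). The variable names lazy, before, and results are made up for this example.

```python
alpha = [5, 10, 1, 2]
beta = [1, 5, 2]

lazy = map(lambda x: x.sort(), [alpha, beta])
before = list(alpha)             # copy taken before the map is consumed
results = list(lazy)             # consuming the map is what runs the sorts
print(before)                    # [5, 10, 1, 2]
print(results)                   # [None, None]
print(alpha, beta)               # [1, 2, 5, 10] [1, 2, 5]
```

This is also why a,b,c = map(...) "works" in the question: the unpacking consumes the map, and a, b, and c all end up as None.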

Which objects exactly support the asterisk (or double asterisk) for positional arguments (or keyword arguments) in Python? Is there a criterion?

I have used * or ** for passing arguments to a function in Python 2 without any problems, usually with a list, set, or dictionary.
def func(a, b, c):
    pass
l = ['a', 'b', 'c']
func(*l)
d = {'a': 'a', 'b': 'b', 'c': 'c'}
func(**d)
However, in Python 3, new objects have appeared in places where a list used to be returned, for example dict_keys, dict_values, range, map and so on.
While migrating my Python 2 code to Python 3, I need to decide whether these new objects support the operations that the corresponding Python 2 objects did. If not, I should change the code, for instance by casting back to the original type with list(d.keys()).
d = {'a': 'a', 'b': 'b'}
print(list(d.keys())[0]) # cast to list to allow indexing
For iteration I could figure it out in the following way.
import collections.abc
isinstance(d.keys(), collections.abc.Iterable)
isinstance(d.values(), collections.abc.Iterable)
isinstance(map(str, d), collections.abc.Iterable)
isinstance(range(3), collections.abc.Iterable)
It is easy to tell whether a new object is iterable, but, as in the title of the question, what about the asterisk operation for positional/keyword arguments?
So far, every object that replaced a list has supported the asterisk operation in my tests, but I need a clear criterion rather than testing by hand.
I have tried a few approaches, but found no common criterion.
Are they all Iterable?
No, a generator is Iterable but doesn't support it.
Are they all Iterators?
No, a generator is an Iterator but doesn't support it.
Are they all Containers?
No, map is not a Container.
Do they all have a common superclass?
No, there is no common superclass (tested with Class.mro()).
How can I know whether an object supports the asterisk (*, **) operation for positional/keyword arguments?
Every iterable "supports" a starred expression; even generators and maps do. However, "an object supports *" is a misleading phrase, because the star means "unpack my iterable and pass each element in order to the parameters of an interface". Hence, it is the * operator that supports iterables.
And this is maybe where your problem comes in: the iterables you use with * have to have as many elements as the interface has parameters. See for example the following snippets:
# a function that takes three args
def fun(a, b, c):
    return a, b, c
# a dict of some arbitrary vals:
d = {'x':10, 'y':20, 'z':30} # 3 elements
d2 = {'x':10, 'y':20, 'z':30, 't':0} # 4 elements
You can pass d to fun in many ways:
fun(*d) # valid
fun(*d.keys()) # valid
fun(*d.values()) # valid
You cannot, however, pass d2 to fun, since it has more elements than fun takes arguments:
fun(*d2) # invalid
You can also pass maps to fun using a starred expression. But remember, the map has to produce as many elements as fun takes arguments.
def sq(x):
    return x**2
sq_vals = map(sq, d.values())
fun(*sq_vals) # Result (100, 400, 900)
The same holds for a generator if it produces as many elements as your function takes arguments:
def genx():
    for i in range(3):
        yield i
fun(*genx()) # Result: (0, 1, 2)
In order to check whether you can unpack an iterable into a function's interface using starred expression, you need to check if your iterable has the same number of elements as your function takes arguments.
If you want to make your function safe against a varying number of arguments, you could, for example, redefine your function the following way:
# this version of fun takes min. 3 arguments:
def fun2(a, b, c, *args):
    return a, b, c
fun2(*range(10)) # valid
fun(*range(10)) # TypeError
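If you would rather check the element count up front than catch the TypeError at call time, one possible approach is inspect.signature(...).bind from the standard library. The helper name can_unpack_into is made up for this sketch, and it assumes the iterable can safely be consumed once for the check.

```python
import inspect

def fun(a, b, c):
    return a, b, c

def can_unpack_into(func, iterable):
    """Hypothetical helper: True if func(*iterable) would bind successfully."""
    args = list(iterable)        # note: this consumes one-shot iterators
    try:
        inspect.signature(func).bind(*args)
    except TypeError:            # raised for too few or too many arguments
        return False
    return True

ok = can_unpack_into(fun, range(3))
too_many = can_unpack_into(fun, range(10))
print(ok, too_many)              # True False
```

Because the check consumes iterators such as map objects and generators, it is only practical for re-iterable inputs or when the materialized list is what gets passed on afterwards.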
The single asterisk form (*args) is used to pass a non-keyworded, variable-length argument list, and the double asterisk form is used to pass a keyworded, variable-length argument list.
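As a sketch of the criterion on the calling side: * accepts any object that can be iterated, and ** accepts a mapping with string keys. The classes Countdown and Settings are hypothetical; the second one relies on the observation that, in CPython, an object providing keys() and __getitem__ is enough for **, which is an assumption about the implementation rather than something stated in the answers above.

```python
def func(a, b, c):
    return a, b, c

class Countdown:
    """Hypothetical iterable: defines only __iter__."""
    def __iter__(self):
        return iter([3, 2, 1])

class Settings:
    """Hypothetical mapping-like class: defines only keys() and __getitem__."""
    _data = {"a": 1, "b": 2, "c": 3}

    def keys(self):
        return self._data.keys()

    def __getitem__(self, key):
        return self._data[key]

positional = func(*Countdown())  # elements are passed in iteration order
keyword = func(**Settings())     # each key becomes a keyword argument name
print(positional)                # (3, 2, 1)
print(keyword)                   # (1, 2, 3)
```

Neither class inherits from list or dict, which is why looking for a common superclass does not answer the question.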

How to correctly use enumerate with two inputs and three expected outputs in python spark

I've been trying to replicate the code in http://www.data-intuitive.com/2015/01/transposing-a-spark-rdd/ to transpose an RDD in PySpark. I am able to load my RDD correctly and apply the zipWithIndex method to it as follows:
m1.rdd.zipWithIndex().collect()
[(Row(c1_1=1, c1_2=2, c1_3=3), 0),
(Row(c1_1=4, c1_2=5, c1_3=6), 1),
(Row(c1_1=7, c1_2=8, c1_3=9), 2)]
But when I try to apply a flatMap with a lambda that enumerates that array, either the syntax is invalid:
m1.rdd.zipWithIndex().flatMap(lambda (x,i): [(i,j,e) for (j,e) in enumerate(x)]).take(1)
Or the positional argument i is reported as missing:
m1.rdd.zipWithIndex().flatMap(lambda x,i: [(i,j,e) for (j,e) in enumerate(x)]).take(1)
When I run the lambda in plain Python, it needs an extra index parameter to call the function.
aa = m1.rdd.zipWithIndex().collect()
g = lambda x,i: [(i,j,e) for (j,e) in enumerate(x)]
g(aa,3) #extra parameter
That seems unnecessary to me, as the index has already been calculated.
I'm quite an amateur in Python and Spark, and I would like to know what the issue with the indexes is and why neither Spark nor Python picks them up. Thank you.
First let's take a look at the signature of RDD.flatMap (preservesPartitioning parameter removed for clarity):
flatMap(self: RDD[T], f: Callable[[T], Iterable[U]]) -> RDD[U]: ...
As you can see, flatMap expects a unary function.
Going back to your code:
lambda x, i: ... is a binary function, so clearly it won't work.
lambda (x, i): ... used to be the syntax for a unary function with tuple argument unpacking. It used structural matching to destructure (unpack in Python nomenclature) a single input argument (here Tuple[Any, Any]). This syntax was brittle and has been removed in Python 3. A correct way to achieve the same result in Python 3 is indexing:
lambda xi: ((xi[1], j, e) for j, e in enumerate(xi[0]))
If you prefer structural matching, just use a standard function:
def flatten(xsi):
    xs, i = xsi
    for j, x in enumerate(xs):
        yield i, j, x
rdd.flatMap(flatten)
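To try flatten without a Spark cluster, here is a plain-Python sketch. It assumes the rows can be represented as ordinary tuples instead of Row objects, and it uses itertools.chain.from_iterable as a local stand-in for flatMap; the name indexed_rows is made up for this example.

```python
from itertools import chain

def flatten(xsi):
    xs, i = xsi                  # unpack the (row, index) pair inside the body
    for j, x in enumerate(xs):
        yield i, j, x            # (row index, column index, element)

# Stand-in for m1.rdd.zipWithIndex().collect(), with tuples instead of Row
indexed_rows = [((1, 2, 3), 0), ((4, 5, 6), 1), ((7, 8, 9), 2)]

# chain.from_iterable plays the role of flatMap here
triples = list(chain.from_iterable(map(flatten, indexed_rows)))
print(triples[:3])               # [(0, 0, 1), (0, 1, 2), (0, 2, 3)]
```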

Why is map not executing a function in Python 3

I have written the small Python program below
def abc(x):
    print(x)
and then called
map(abc, [1,2,3])
but the map call just displayed
<map object at 0x0000000001F0BC88>
instead of printing the x values.
I know map returns an iterator in Python 3, but it should still have printed the x values, right? Does it mean that abc(x) is not called when we use map?
The map iterator lazily computes the values, so you will not see the output until you iterate through them. Here's an explicit way you could make the values print out:
def abc(x):
    print(x)
it = map(abc, [1,2,3])
next(it)
next(it)
next(it)
The next function calls it.__next__ to step to the next value. This is what is used under the hood when you use for i in it: # do something or when you construct a list from an iterator, list(it), so doing either of these things would also print out the values.
So, why laziness? It comes in handy when working with very large or infinite sequences. Imagine if instead of passing [1,2,3] to map, you passed itertools.count() to it. The laziness allows you to still iterate over the resulting map without trying to generate all (and there are infinitely many) values up front.
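A minimal sketch of the itertools.count() case. To make the laziness visible without relying on printed output, this version of abc records each call in a list and returns a value; that change, and the names calls, lazy, and first_three, are assumptions for illustration only.

```python
from itertools import count, islice

calls = []

def abc(x):
    calls.append(x)              # record each call instead of printing
    return x * 2

lazy = map(abc, count())         # infinite input, nothing is computed yet
calls_before = list(calls)       # snapshot taken before any item is requested
first_three = list(islice(lazy, 3))
print(calls_before)              # []
print(first_three)               # [0, 2, 4]
print(calls)                     # [0, 1, 2]
```

Calling list(lazy) directly here would never finish, since count() has no end; islice limits how many values are pulled.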
Lazy evaluation
map (or range, etc.) in Python 3 is lazily evaluated: it computes values only when you need them.
If you want the results of map, you can use:
list(map(abc, [1,2,3]))
