Context manager inside list comprehension inside class [duplicate] - python-3.x

How do you access other class variables from a list comprehension within the class definition? The following works in Python 2 but fails in Python 3:
class Foo:
    x = 5
    y = [x for i in range(1)]
Python 3.2 gives the error:
NameError: global name 'x' is not defined
Trying Foo.x doesn't work either. Any ideas on how to do this in Python 3?
A slightly more complicated motivating example:
from collections import namedtuple
class StateDatabase:
    State = namedtuple('State', ['name', 'capital'])
    db = [State(*args) for args in [
        ['Alabama', 'Montgomery'],
        ['Alaska', 'Juneau'],
        # ...
    ]]
In this example, apply() would have been a decent workaround, but it is sadly removed from Python 3.

Class scope and list, set or dictionary comprehensions, as well as generator expressions do not mix.
The why; or, the official word on this
In Python 3, list comprehensions were given a proper scope (local namespace) of their own, to prevent their local variables bleeding over into the surrounding scope (see List comprehension rebinds names even after scope of comprehension. Is this right?). That's great when using such a list comprehension in a module or in a function, but in classes, scoping is a little, uhm, strange.
This is documented in PEP 227:
Names in class scope are not accessible. Names are resolved in
the innermost enclosing function scope. If a class definition
occurs in a chain of nested scopes, the resolution process skips
class definitions.
and in the class compound statement documentation:
The class’s suite is then executed in a new execution frame (see section Naming and binding), using a newly created local namespace and the original global namespace. (Usually, the suite contains only function definitions.) When the class’s suite finishes execution, its execution frame is discarded but its local namespace is saved. [4] A class object is then created using the inheritance list for the base classes and the saved local namespace for the attribute dictionary.
Emphasis mine; the execution frame is the temporary scope.
Because the scope is repurposed as the attributes on a class object, allowing it to be used as a nonlocal scope as well leads to undefined behaviour; what would happen if a class method referred to x as a nested scope variable, then manipulates Foo.x as well, for example? More importantly, what would that mean for subclasses of Foo? Python has to treat a class scope differently as it is very different from a function scope.
Last, but definitely not least, the linked Naming and binding section in the Execution model documentation mentions class scopes explicitly:
The scope of names defined in a class block is limited to the class block; it does not extend to the code blocks of methods – this includes comprehensions and generator expressions since they are implemented using a function scope. This means that the following will fail:
class A:
    a = 42
    b = list(a + i for i in range(10))
So, to summarize: you cannot access the class scope from functions, list comprehensions or generator expressions enclosed in that scope; they act as if that scope does not exist. In Python 2, list comprehensions were implemented using a shortcut, but in Python 3 they got their own function scope (as they should have had all along) and thus your example breaks. Other comprehension types have their own scope regardless of Python version, so a similar example with a set or dict comprehension would break in Python 2.
# Same error, in Python 2 or 3
y = {x: x for i in range(1)}
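The same rule applies to ordinary methods, which may be the more familiar case. The following is a minimal sketch, not part of the original answer; the class name `Settings` and the attribute `retry_limit` are made up for illustration. It assumes no global named `retry_limit` exists.

```python
class Settings:
    retry_limit = 3  # class-level name; hypothetical, for illustration only

    def bare_lookup(self):
        # The method body never sees the class scope, so this name is
        # looked up as a global and fails when the method is called.
        return retry_limit

    def attribute_lookup(self):
        # Going through the instance (or the class) works.
        return self.retry_limit


settings = Settings()
try:
    settings.bare_lookup()
    bare_error = None
except NameError as exc:
    bare_error = exc

attribute_value = settings.attribute_lookup()
```

A comprehension body in a class behaves like `bare_lookup` here, except that it runs immediately while the class body executes, so the NameError appears at class definition time.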
The (small) exception; or, why one part may still work
There's one part of a comprehension or generator expression that executes in the surrounding scope, regardless of Python version. That would be the expression for the outermost iterable. In your example, it's the range(1):
y = [x for i in range(1)]
#               ^^^^^^^^
Thus, using x in that expression would not throw an error:
# Runs fine
y = [i for i in range(x)]
This only applies to the outermost iterable; if a comprehension has multiple for clauses, the iterables for inner for clauses are evaluated in the comprehension's scope:
# NameError
y = [i for i in range(1) for j in range(x)]
#      ^^^^^^^^^^^^^^^^^ -----------------
#      outer loop        inner, nested loop
This design decision was made in order to throw an error at genexp creation time instead of iteration time when creating the outermost iterable of a generator expression throws an error, or when the outermost iterable turns out not to be iterable. Comprehensions share this behavior for consistency.
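Both cases can be checked side by side. This is a small sketch under the assumption that no globals named `row_count` or `col_count` exist; all names in it are hypothetical. The failing class is wrapped in a function only so that the NameError can be caught.

```python
class OuterIterable:
    row_count = 3
    # range(row_count) is the outermost iterable: evaluated in the class body.
    squares = [i * i for i in range(row_count)]


def define_inner_iterable_class():
    class InnerIterable:
        col_count = 3
        # range(col_count) belongs to the second for clause, so it is
        # evaluated inside the comprehension's own scope.
        pairs = [(i, j) for i in range(1) for j in range(col_count)]
    return InnerIterable


try:
    define_inner_iterable_class()
    inner_error = None
except NameError as exc:
    inner_error = exc
```

`OuterIterable.squares` is built without trouble, while defining `InnerIterable` raises NameError for `col_count`.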
Looking under the hood; or, way more detail than you ever wanted
You can see this all in action using the dis module. I'm using Python 3.3 in the following examples, because it adds qualified names that neatly identify the code objects we want to inspect. The bytecode produced is otherwise functionally identical to Python 3.2.
To create a class, Python essentially takes the whole suite that makes up the class body (so everything indented one level deeper than the class <name>: line), and executes that as if it were a function:
>>> import dis
>>> def foo():
...     class Foo:
...         x = 5
...         y = [x for i in range(1)]
...     return Foo
...
>>> dis.dis(foo)
2 0 LOAD_BUILD_CLASS
1 LOAD_CONST 1 (<code object Foo at 0x10a436030, file "<stdin>", line 2>)
4 LOAD_CONST 2 ('Foo')
7 MAKE_FUNCTION 0
10 LOAD_CONST 2 ('Foo')
13 CALL_FUNCTION 2 (2 positional, 0 keyword pair)
16 STORE_FAST 0 (Foo)
5 19 LOAD_FAST 0 (Foo)
22 RETURN_VALUE
The first LOAD_CONST there loads a code object for the Foo class body, then makes that into a function, and calls it. The result of that call is then used to create the namespace of the class, its __dict__. So far so good.
The thing to note here is that the bytecode contains a nested code object; in Python, class definitions, functions, comprehensions and generators all are represented as code objects that contain not only bytecode, but also structures that represent local variables, constants, variables taken from globals, and variables taken from the nested scope. The compiled bytecode refers to those structures and the python interpreter knows how to access those given the bytecodes presented.
The important thing to remember here is that Python creates these structures at compile time; the class suite is a code object (<code object Foo at 0x10a436030, file "<stdin>", line 2>) that is already compiled.
Let's inspect that code object that creates the class body itself; code objects have a co_consts structure:
>>> foo.__code__.co_consts
(None, <code object Foo at 0x10a436030, file "<stdin>", line 2>, 'Foo')
>>> dis.dis(foo.__code__.co_consts[1])
2 0 LOAD_FAST 0 (__locals__)
3 STORE_LOCALS
4 LOAD_NAME 0 (__name__)
7 STORE_NAME 1 (__module__)
10 LOAD_CONST 0 ('foo.<locals>.Foo')
13 STORE_NAME 2 (__qualname__)
3 16 LOAD_CONST 1 (5)
19 STORE_NAME 3 (x)
4 22 LOAD_CONST 2 (<code object <listcomp> at 0x10a385420, file "<stdin>", line 4>)
25 LOAD_CONST 3 ('foo.<locals>.Foo.<listcomp>')
28 MAKE_FUNCTION 0
31 LOAD_NAME 4 (range)
34 LOAD_CONST 4 (1)
37 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
40 GET_ITER
41 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
44 STORE_NAME 5 (y)
47 LOAD_CONST 5 (None)
50 RETURN_VALUE
The above bytecode creates the class body. The function is executed and the resulting locals() namespace, containing x and y is used to create the class (except that it doesn't work because x isn't defined as a global). Note that after storing 5 in x, it loads another code object; that's the list comprehension; it is wrapped in a function object just like the class body was; the created function takes a positional argument, the range(1) iterable to use for its looping code, cast to an iterator. As shown in the bytecode, range(1) is evaluated in the class scope.
From this you can see that the only difference between a code object for a function or a generator, and a code object for a comprehension is that the latter is executed immediately when the parent code object is executed; the bytecode simply creates a function on the fly and executes it in a few small steps.
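As a rough Python-level picture of what that bytecode does, the comprehension can be written out by hand as a helper function that is called straight away. This is only an illustrative sketch, not what the compiler literally emits; `_listcomp` and `shared_value` are hypothetical names, and it assumes no global called `shared_value` exists.

```python
def define_class_by_hand():
    class Foo:
        shared_value = 5  # hypothetical name standing in for `x`

        # Hand-written stand-in for the temporary <listcomp> function.
        def _listcomp(iterator):
            result = []
            for i in iterator:
                # Not a local and not in an enclosing function: compiled as
                # a global lookup, so the class-level name is never consulted.
                result.append(shared_value)
            return result

        # The outermost iterable is evaluated here, in the class body.
        y = _listcomp(iter(range(1)))
    return Foo


try:
    define_class_by_hand()
    by_hand_error = None
except NameError as exc:
    by_hand_error = exc
```

The hand-written version fails the same way as the comprehension, and for the same reason: the helper's body cannot see names bound in the class suite.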
Python 2.x uses inline bytecode there instead, here is output from Python 2.7:
2 0 LOAD_NAME 0 (__name__)
3 STORE_NAME 1 (__module__)
3 6 LOAD_CONST 0 (5)
9 STORE_NAME 2 (x)
4 12 BUILD_LIST 0
15 LOAD_NAME 3 (range)
18 LOAD_CONST 1 (1)
21 CALL_FUNCTION 1
24 GET_ITER
>> 25 FOR_ITER 12 (to 40)
28 STORE_NAME 4 (i)
31 LOAD_NAME 2 (x)
34 LIST_APPEND 2
37 JUMP_ABSOLUTE 25
>> 40 STORE_NAME 5 (y)
43 LOAD_LOCALS
44 RETURN_VALUE
No code object is loaded; instead, a FOR_ITER loop is run inline. So in Python 3.x, the list comprehension was given a proper code object of its own, which means it has its own scope.
However, the comprehension was compiled together with the rest of the python source code when the module or script was first loaded by the interpreter, and the compiler does not consider a class suite a valid scope. Any referenced variables in a list comprehension must look in the scope surrounding the class definition, recursively. If the variable wasn't found by the compiler, it marks it as a global. Disassembly of the list comprehension code object shows that x is indeed loaded as a global:
>>> foo.__code__.co_consts[1].co_consts
('foo.<locals>.Foo', 5, <code object <listcomp> at 0x10a385420, file "<stdin>", line 4>, 'foo.<locals>.Foo.<listcomp>', 1, None)
>>> dis.dis(foo.__code__.co_consts[1].co_consts[2])
4 0 BUILD_LIST 0
3 LOAD_FAST 0 (.0)
>> 6 FOR_ITER 12 (to 21)
9 STORE_FAST 1 (i)
12 LOAD_GLOBAL 0 (x)
15 LIST_APPEND 2
18 JUMP_ABSOLUTE 6
>> 21 RETURN_VALUE
This chunk of bytecode loads the first argument passed in (the range(1) iterator), and just like the Python 2.x version uses FOR_ITER to loop over it and create its output.
Had we defined x in the foo function instead, x would be a cell variable (cells refer to nested scopes):
>>> def foo():
...     x = 2
...     class Foo:
...         x = 5
...         y = [x for i in range(1)]
...     return Foo
...
>>> dis.dis(foo.__code__.co_consts[2].co_consts[2])
5 0 BUILD_LIST 0
3 LOAD_FAST 0 (.0)
>> 6 FOR_ITER 12 (to 21)
9 STORE_FAST 1 (i)
12 LOAD_DEREF 0 (x)
15 LIST_APPEND 2
18 JUMP_ABSOLUTE 6
>> 21 RETURN_VALUE
The LOAD_DEREF will indirectly load x from the code object cell objects:
>>> foo.__code__.co_cellvars # foo function `x`
('x',)
>>> foo.__code__.co_consts[2].co_cellvars # Foo class, no cell variables
()
>>> foo.__code__.co_consts[2].co_consts[2].co_freevars # Refers to `x` in foo
('x',)
>>> foo().y
[2]
The actual referencing looks the value up from the current frame data structures, which were initialized from a function object's .__closure__ attribute. Since the function created for the comprehension code object is discarded again, we do not get to inspect that function's closure. To see a closure in action, we'd have to inspect a nested function instead:
>>> def spam(x):
...     def eggs():
...         return x
...     return eggs
...
>>> spam(1).__code__.co_freevars
('x',)
>>> spam(1)()
1
>>> spam(1).__closure__
>>> spam(1).__closure__[0].cell_contents
1
>>> spam(5).__closure__[0].cell_contents
5
So, to summarize:
List comprehensions get their own code objects in Python 3, and there is no difference between code objects for functions, generators or comprehensions; comprehension code objects are wrapped in a temporary function object and called immediately.
Code objects are created at compile time, and any non-local variables are marked as either global or as free variables, based on the nested scopes of the code. The class body is not considered a scope for looking up those variables.
When executing the code, Python has only to look into the globals, or the closure of the currently executing object. Since the compiler didn't include the class body as a scope, the temporary function namespace is not considered.
A workaround; or, what to do about it
If you were to create an explicit scope for the x variable, like in a function, you can use class-scope variables for a list comprehension:
>>> class Foo:
...     x = 5
...     def y(x):
...         return [x for i in range(1)]
...     y = y(x)
...
>>> Foo.y
[5]
The 'temporary' y function can be called directly; when we do, we replace it with its return value. Its scope is considered when resolving x:
>>> foo.__code__.co_consts[1].co_consts[2]
<code object y at 0x10a5df5d0, file "<stdin>", line 4>
>>> foo.__code__.co_consts[1].co_consts[2].co_cellvars
('x',)
Of course, people reading your code will scratch their heads over this a little; you may want to put a big fat comment in there explaining why you are doing this.
The best work-around is to just use __init__ to create an instance variable instead:
class Foo:
    x = 5
    def __init__(self):
        self.y = [self.x for i in range(1)]
and avoid all the head-scratching, and questions to explain yourself. For your own concrete example, I would not even store the namedtuple on the class; either use the output directly (don't store the generated class at all), or use a global:
from collections import namedtuple
State = namedtuple('State', ['name', 'capital'])
class StateDatabase:
    db = [State(*args) for args in [
        ('Alabama', 'Montgomery'),
        ('Alaska', 'Juneau'),
        # ...
    ]]

In my opinion it is a flaw in Python 3. I hope they change it.
Old Way (works in 2.7, throws NameError: name 'x' is not defined in 3+):
class A:
    x = 4
    y = [x+i for i in range(1)]
NOTE: simply scoping it with A.x would not solve it
New Way (works in 3+):
class A:
    x = 4
    y = (lambda x=x: [x+i for i in range(1)])()
Because the syntax is so ugly, I typically just initialize all my class variables in the constructor.

The accepted answer provides excellent information, but there appear to be a few other wrinkles here -- differences between list comprehension and generator expressions. A demo that I played around with:
class Foo:
    # A class-level variable.
    X = 10
    # I can use that variable to define another class-level variable.
    Y = sum((X, X))
    # Works in Python 2, but not 3.
    # In Python 3, list comprehensions were given their own scope.
    try:
        Z1 = sum([X for _ in range(3)])
    except NameError:
        Z1 = None
    # Fails in both.
    # Apparently, generator expressions (that's what the entire argument
    # to sum() is) did have their own scope even in Python 2.
    try:
        Z2 = sum(X for _ in range(3))
    except NameError:
        Z2 = None
    # Workaround: put the computation in lambda or def.
    compute_z3 = lambda val: sum(val for _ in range(3))
    # Then use that function.
    Z3 = compute_z3(X)
    # Also worth noting: here I can refer to XS in the for-part of the
    # generator expression (Z4 works), but I cannot refer to XS in the
    # inner-part of the generator expression (Z5 fails).
    XS = [15, 15, 15, 15]
    Z4 = sum(val for val in XS)
    try:
        Z5 = sum(XS[i] for i in range(len(XS)))
    except NameError:
        Z5 = None
print(Foo.Z1, Foo.Z2, Foo.Z3, Foo.Z4, Foo.Z5)

Since the outermost iterator is evaluated in the surrounding scope we can use zip together with itertools.repeat to carry the dependencies over to the comprehension's scope:
import itertools as it
class Foo:
    x = 5
    y = [j for i, j in zip(range(3), it.repeat(x))]
One can also use nested for loops in the comprehension and include the dependencies in the outermost iterable:
class Foo:
    x = 5
    y = [j for j in (x,) for i in range(3)]
For the specific example of the OP:
from collections import namedtuple
import itertools as it
class StateDatabase:
    State = namedtuple('State', ['name', 'capital'])
    db = [State(*args) for State, args in zip(it.repeat(State), [
        ['Alabama', 'Montgomery'],
        ['Alaska', 'Juneau'],
        # ...
    ])]

This is a bug in Python. Comprehensions are advertised as being equivalent to for loops, but this is not true in classes. At least up to Python 3.6.6, in a comprehension used in a class, only one variable from outside the comprehension is accessible inside the comprehension, and it must be used as the outermost iterator. In a function, this scope limitation does not apply.
To illustrate why this is a bug, let's return to the original example. This fails:
class Foo:
    x = 5
    y = [x for i in range(1)]
But this works:
def Foo():
    x = 5
    y = [x for i in range(1)]
The limitation is stated at the end of this section in the reference guide.

This may be by design, but IMHO, it's a bad design. I know I'm not an expert here, and I've tried reading the rationale behind this, but it just goes over my head, as I think it would for any average Python programmer.
To me, a comprehension doesn't seem that much different than a regular mathematical expression. For example, if 'foo' is a local function variable, I can easily do something like:
(foo + 5) + 7
But I can't do:
[foo + x for x in [1,2,3]]
To me, the fact that one expression exists in the current scope and the other creates a scope of its own is very surprising and, no pun intended, 'incomprehensible'.

I spent quite some time to understand why this is a feature, not a bug.
Consider the simple code:
a = 5
def myfunc():
    print(a)
Since there is no "a" defined in myfunc(), the lookup expands to the enclosing scope and the code executes.
Now consider the same code in a class. It cannot work, because it would completely mess up access to the data in class instances: you would never know whether you are accessing a variable in the base class or in the instance.
The list comprehension is just a sub-case of the same effect.

One can use a for loop:
class A:
    x=5
    ##Won't work:
    ## y=[i for i in range(101) if i%x==0]
    y=[]
    for i in range(101):
        if i%x==0:
            y.append(i)
Please correct me if I'm wrong...

Related

In Python, what is the purpose of __length_hint__ of the set_iterator class? [duplicate]

class Foo:
    def __getitem__(self, item):
        print('getitem', item)
        if item == 6:
            raise IndexError
        return item**2
    def __len__(self):
        print('len')
        return 3
class Bar:
    def __iter__(self):
        print('iter')
        return iter([3, 5, 42, 69])
    def __len__(self):
        print('len')
        return 3
Demo:
>>> list(Foo())
len
getitem 0
getitem 1
getitem 2
getitem 3
getitem 4
getitem 5
getitem 6
[0, 1, 4, 9, 16, 25]
>>> list(Bar())
iter
len
[3, 5, 42, 69]
Why does list call __len__? It doesn't seem to use the result for anything obvious. A for loop doesn't do it. This isn't mentioned anywhere in the iterator protocol, which just talks about __iter__ and __next__.
Is this Python reserving space for the list in advance, or something clever like that?
(CPython 3.6.0 on Linux)
See the Rationale section from PEP 424 that introduced __length_hint__ and offers insight on the motivation:
Being able to pre-allocate lists based on the expected size, as estimated by __length_hint__ , can be a significant optimization. CPython has been observed to run some code faster than PyPy, purely because of this optimization being present.
In addition to that, the documentation for object.__length_hint__ verifies the fact that this is purely an optimization feature:
Called to implement operator.length_hint(). Should return an estimated length for the object (which may be greater or less than the actual length). The length must be an integer >= 0. This method is purely an optimization and is never required for correctness.
So __length_hint__ is here because it can result in some nice optimizations.
PyObject_LengthHint first tries to get a value from object.__len__ (if it is defined) and then checks whether object.__length_hint__ is available. If neither is there, it returns the supplied default value, which is 8 for lists.
listextend, which is called from list_init as Eli stated in his answer, was modified according to this PEP to offer this optimization for anything that defines either a __len__ or a __length_hint__.
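The same lookup order can be observed from Python through operator.length_hint(), which the documentation quoted above says __length_hint__ implements. A minimal sketch; the three class names are hypothetical and exist only for this illustration:

```python
import operator


class WithLen:
    def __len__(self):
        return 3


class WithHint:
    def __length_hint__(self):
        return 5


class Neither:
    pass


hint_from_len = operator.length_hint(WithLen())        # uses __len__
hint_from_hint = operator.length_hint(WithHint())      # uses __length_hint__
hint_default = operator.length_hint(Neither(), 8)      # falls back to the default
hint_iterator = operator.length_hint(iter([10, 20, 30]))  # list iterators provide a hint
```

Passing 8 as the default mirrors the value the list constructor supplies; operator.length_hint() itself defaults to 0 when no default is given.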
list isn't the only one that benefits from this, of course, bytes objects do:
>>> bytes(Foo())
len
getitem 0
...
b'\x00\x01\x04\t\x10\x19'
so do bytearray objects but, only when you extend them:
>>> bytearray().extend(Foo())
len
getitem 0
...
and tuple objects which create an intermediary sequence to populate themselves:
>>> tuple(Foo())
len
getitem 0
...
(0, 1, 4, 9, 16, 25)
If anybody is wondering why exactly 'iter' is printed before 'len' in class Bar and not after as happens with class Foo:
This is because if the object in hand defines an __iter__ Python will first call it to get the iterator, thereby running the print('iter') too. The same doesn't happen if it falls back to using __getitem__.
list is a list object constructor that will allocate an initial slice of memory for its contents. The list constructor attempts to figure out a good size for that initial slice of memory by checking the length hint or the length of any object passed into the constructor. See the call to PyObject_LengthHint in the Python source here. This place is called from the list constructor, list_init.
If your object has no __len__ or __length_hint__, that's OK -- a default value of 8 is used; it just may be less efficient due to reallocations.
Note: I prepared the answer for [SO]: Why __len__ is called and the result is not used when iterating with __getitem__?, which was marked as a dupe (as it's exactly this question) while I was writing it, so it was no longer possible to post it there, and since I already had it, I decided to post it here (with small adjustments).
Here's a modified version of your code that makes things a bit clearer.
code00.py:
#!/usr/bin/env python3
import sys
class Foo:
    def __getitem__(self, item):
        print("{0:s}.{1:s}: {2:d}".format(self.__class__.__name__, "getitem", item))
        if item == 6:
            raise IndexError
        return item ** 2
class Bar:
    def __iter__(self):
        print("{0:s}.{1:s}".format(self.__class__.__name__, "iter"))
        return iter([3, 5, 42, 69])
    def __len__(self):
        result = 3
        print("{0:s}.{1:s}: {2:d}".format(self.__class__.__name__, "len", result))
        return result
def main():
    print("Start ...\n")
    for class_obj in [Foo, Bar]:
        inst_obj = class_obj()
        print("Created {0:s} instance".format(class_obj.__name__))
        list_obj = list(inst_obj)
        print("Converted instance to list")
        print("{0:s}: {1:}\n".format(class_obj.__name__, list_obj))
if __name__ == "__main__":
    print("Python {0:s} {1:d}bit on {2:s}\n".format(" ".join(item.strip() for item in sys.version.split("\n")), 64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    main()
    print("\nDone.")
Output:
[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q041474829]> "e:\Work\Dev\VEnvs\py_064_03.07.03_test0\Scripts\python.exe" code00.py
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)] 64bit on win32
Start ...
Created Foo instance
Foo.getitem: 0
Foo.getitem: 1
Foo.getitem: 2
Foo.getitem: 3
Foo.getitem: 4
Foo.getitem: 5
Foo.getitem: 6
Converted instance to list
Foo: [0, 1, 4, 9, 16, 25]
Created Bar instance
Bar.iter
Bar.len: 3
Converted instance to list
Bar: [3, 5, 42, 69]
Done.
As seen, __len__ is called when the list is constructed. Browsing [GitHub]: python/cpython - (master) cpython/Objects/listobject.c:
list___init__ (which is the initializer: __init__ (tp_init member in PyList_Type)) calls list___init___impl
list___init___impl calls list_extend
list_extend calls PyObject_LengthHint (n = PyObject_LengthHint(iterable, 8);)
PyObject_LengthHint (in abstract.c), does the check:
Py_ssize_t
PyObject_LengthHint(PyObject *o, Py_ssize_t defaultvalue)
    // ...
    if (_PyObject_HasLen(o)) {
        res = PyObject_Length(o);
    // ...
So, it's an optimization feature that works for iterables that define __len__.
This is particularly handy when the iterable has a large number of elements: the storage is allocated at once, skipping the list growth mechanism, which (at one point; I didn't check whether it still applies) worked as "Space increases by ~12.5% when full" (according to David M. Beazley). It is very useful when lists are constructed out of (other) lists or tuples. For example, constructing a list from a 1000-element iterable that doesn't define __len__ would, instead of a single allocation, require ~41 (log base 1.125 of 1000 / 8) growth operations (allocation, data shifting, deallocation) as the new list fills up with elements from the source iterable.
Needless to say that for "modern" iterables, the improvement no longer applies.

How to elegantly define a generator producing an empty sequence [duplicate]

A generator function can be defined by putting the yield keyword in the function’s body:
def gen():
    for i in range(10):
        yield i
How to define an empty generator function?
The following code doesn’t work, since Python cannot know that it is supposed to be a generator function instead of a normal function:
def empty():
    pass
I could do something like this:
def empty():
    if False:
        yield
But that would be very ugly. Is there a nicer way?
You can use return once in a generator; it stops iteration without yielding anything, and thus provides an explicit alternative to letting the function run out of scope. So use yield to turn the function into a generator, but precede it with return to terminate the generator before yielding anything.
>>> def f():
...     return
...     yield
...
>>> list(f())
[]
I'm not sure it's that much better than what you have -- it just replaces a no-op if statement with a no-op yield statement. But it is more idiomatic. Note that just using yield doesn't work.
>>> def f():
...     yield
...
>>> list(f())
[None]
Why not just use iter(())?
This question asks specifically about an empty generator function. For that reason, I take it to be a question about the internal consistency of Python's syntax, rather than a question about the best way to create an empty iterator in general.
If the question is actually about the best way to create an empty iterator, then you might agree with Zectbumo about using iter(()) instead. However, it's important to observe that iter(()) doesn't return a function! It directly returns an empty iterator. Suppose you're working with an API that expects a callable that returns an iterable each time it's called, just like an ordinary generator function. You'll have to do something like this:
def empty():
    return iter(())
(Credit should go to Unutbu for giving the first correct version of this answer.)
Now, you may find the above clearer, but I can imagine situations in which it would be less clear. Consider this example of a long list of (contrived) generator function definitions:
def zeros():
    while True:
        yield 0
def ones():
    while True:
        yield 1
...
At the end of that long list, I'd rather see something with a yield in it, like this:
def empty():
    return
    yield
or, in Python 3.3 and above (as suggested by DSM), this:
def empty():
    yield from ()
The presence of the yield keyword makes it clear at the briefest glance that this is just another generator function, exactly like all the others. It takes a bit more time to see that the iter(()) version is doing the same thing.
It's a subtle difference, but I honestly think the yield-based functions are more readable and maintainable.
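The distinction between "a generator function" and "a function that returns an empty iterator" can be checked with the standard inspect module. A small sketch; the names empty_return and empty_iter are just labels for the two variants discussed above:

```python
import inspect


def empty_return():
    # The unreachable yield is enough to make this a generator function.
    return
    yield


def empty_iter():
    # An ordinary function that happens to return an empty iterator.
    return iter(())


is_gen_func_return = inspect.isgeneratorfunction(empty_return)
is_gen_func_iter = inspect.isgeneratorfunction(empty_iter)
items_return = list(empty_return())
items_iter = list(empty_iter())
```

Both produce no items when iterated, but only empty_return is reported as a generator function, which matters for tooling or APIs that make that check.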
See also this great answer from user3840170 that uses dis to show another reason why this approach is preferable: it emits the fewest instructions when compiled.
iter(())
You don't require a generator. C'mon guys!
Python 3.3 (because I'm on a yield from kick, and because @senderle stole my first thought):
>>> def f():
...     yield from ()
...
>>> list(f())
[]
But I have to admit, I'm having a hard time coming up with a use case for this for which iter([]) or (x)range(0) wouldn't work equally well.
Another option is:
(_ for _ in ())
Like @senderle said, use this:
def empty():
    return
    yield
I’m writing this answer mostly to share another justification for it.
One reason for choosing this solution above the others is that it is optimal as far as the interpreter is concerned.
>>> import dis
>>> def empty_yield_from():
...     yield from ()
...
>>> def empty_iter():
...     return iter(())
...
>>> def empty_return():
...     return
...     yield
...
>>> def noop():
...     pass
...
>>> dis.dis(empty_yield_from)
2 0 LOAD_CONST 1 (())
2 GET_YIELD_FROM_ITER
4 LOAD_CONST 0 (None)
6 YIELD_FROM
8 POP_TOP
10 LOAD_CONST 0 (None)
12 RETURN_VALUE
>>> dis.dis(empty_iter)
2 0 LOAD_GLOBAL 0 (iter)
2 LOAD_CONST 1 (())
4 CALL_FUNCTION 1
6 RETURN_VALUE
>>> dis.dis(empty_return)
2 0 LOAD_CONST 0 (None)
2 RETURN_VALUE
>>> dis.dis(noop)
2 0 LOAD_CONST 0 (None)
2 RETURN_VALUE
As we can see, the empty_return has exactly the same bytecode as a regular empty function; the rest perform a number of other operations that don’t change the behaviour anyway. The only difference between empty_return and noop is that the former has the generator flag set:
>>> dis.show_code(noop)
Name: noop
Filename: <stdin>
Argument count: 0
Positional-only arguments: 0
Kw-only arguments: 0
Number of locals: 0
Stack size: 1
Flags: OPTIMIZED, NEWLOCALS, NOFREE
Constants:
0: None
>>> dis.show_code(empty_return)
Name: empty_return
Filename: <stdin>
Argument count: 0
Positional-only arguments: 0
Kw-only arguments: 0
Number of locals: 0
Stack size: 1
Flags: OPTIMIZED, NEWLOCALS, GENERATOR, NOFREE
Constants:
0: None
The above disassembly is outdated as of CPython 3.11, but empty_return still comes out on top, with only two more opcodes (four bytes) than a no-op function:
>>> dis.dis(empty_yield_from)
1 0 RETURN_GENERATOR
2 POP_TOP
4 RESUME 0
2 6 LOAD_CONST 1 (())
8 GET_YIELD_FROM_ITER
10 LOAD_CONST 0 (None)
>> 12 SEND 3 (to 20)
14 YIELD_VALUE
16 RESUME 2
18 JUMP_BACKWARD_NO_INTERRUPT 4 (to 12)
>> 20 POP_TOP
22 LOAD_CONST 0 (None)
24 RETURN_VALUE
>>> dis.dis(empty_iter)
1 0 RESUME 0
2 2 LOAD_GLOBAL 1 (NULL + iter)
14 LOAD_CONST 1 (())
16 PRECALL 1
20 CALL 1
30 RETURN_VALUE
>>> dis.dis(empty_return)
1 0 RETURN_GENERATOR
2 POP_TOP
4 RESUME 0
2 6 LOAD_CONST 0 (None)
8 RETURN_VALUE
>>> dis.dis(noop)
1 0 RESUME 0
2 2 LOAD_CONST 0 (None)
4 RETURN_VALUE
Of course, the strength of this argument is very dependent on the particular implementation of Python in use; a sufficiently smart alternative interpreter may notice that the other operations amount to nothing useful and optimise them out. However, even if such optimisations are present, they require the interpreter to spend time performing them and to safeguard against optimisation assumptions being broken, like the iter identifier at global scope being rebound to something else (even though that would most likely indicate a bug if it actually happened). In the case of empty_return there is simply nothing to optimise, as bytecode generation stops after a return statement, so even the relatively naïve CPython will not waste time on any spurious operations.
Must it be a generator function? If not, how about
def f():
    return iter(())
The "standard" way to make an empty iterator appears to be iter([]).
I suggested to make [] the default argument to iter(); this was rejected with good arguments, see http://bugs.python.org/issue25215
- Jurjen
I want to give a class based example since we haven't had any suggested yet. This is a callable iterator that generates no items. I believe this is a straightforward and descriptive way to solve the issue.
class EmptyGenerator:
    def __iter__(self):
        return self
    def __next__(self):
        raise StopIteration
>>> list(EmptyGenerator())
[]
generator = (item for item in [])
Nobody has mentioned it yet, but calling the built-in function zip with no arguments returns an empty iterator:
>>> it = zip()
>>> next(it)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration

python3 list creation from class makes a global list rather than a series iterated ones

So here is the problem I am having. I am trying to iterate the makeAThing class, and then create a list for the iteration using the makeAList class. Instead of making separate lists for each iteration of makeAThing, it is making one big global list and adding the different values to it. Is there something I am missing/don't know yet, or is this just how Python behaves?
class ListMaker(object):
    def __init__(self,bigList = []):
        self.bigList = bigList
class makeAThing(object):
    def __init__(self,name = 0, aList = []):
        self.name = name
        self.aList = aList
    def makeAList(self):
        self.aList = ListMaker()
k = []
x = 0
while x < 3:
    k.append(makeAThing())
    k[x].name = x
    k[x].makeAList()
    k[x].aList.bigList.append(x)
    x += 1
for e in k:
    print(e.name, e.aList.bigList)
output:
0 [0, 1, 2]
1 [0, 1, 2]
2 [0, 1, 2]
the output I am trying to achieve:
0 [0]
1 [1]
2 [2]
After which I want to be able to edit the individual lists and keep them assigned to their iterations
Your __init__ methods are using mutable default arguments.
From the Python documentation:
Default parameter values are evaluated from left to right when the
function definition is executed. This means that the expression is
evaluated once, when the function is defined, and that the same
“pre-computed” value is used for each call. This is especially
important to understand when a default parameter is a mutable object,
such as a list or a dictionary: if the function modifies the object
(e.g. by appending an item to a list), the default value is in effect
modified. This is generally not what was intended. A way around this
is to use None as the default, and explicitly test for it in the body
of the function, e.g.:
def whats_on_the_telly(penguin=None):
    if penguin is None:
        penguin = []
    penguin.append("property of the zoo")
    return penguin
In your code, the default argument bigList = [] is evaluated once, when the function is defined, so the empty list is created only once. Every time the function is called without that argument, the same list is used, even though it is no longer empty.
The default argument aList = [] has the same problem, but you immediately overwrite self.aList with a call to makeAList, so it doesn't cause any problems.
To verify this with your code, try the following after your code executes:
print(k[0].aList.bigList is k[1].aList.bigList)
This prints True: the two names refer to the same list object.
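The same effect can be reproduced in a few lines, independent of the classes in the question. In this sketch the function name append_to is hypothetical; __defaults__ is the standard function attribute that holds the default values.

```python
def append_to(item, target=[]):
    # The default list is created once, when the def statement runs.
    target.append(item)
    return target

first = append_to(1)
second = append_to(2)

print(first is second)         # True: both calls returned the same list
print(second)                  # [1, 2]
print(append_to.__defaults__)  # ([1, 2],): the stored default has been mutated
```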
There are instances where this behavior can be useful (memoization comes to mind, although there are other and better ways of doing that). Generally, avoid mutable default arguments. The empty string is fine (and frequently used) because strings are immutable. For lists, dictionaries and the like, you'll have to add a bit of logic inside the function.
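Applied to the code in the question, the None pattern could look like the sketch below. The class and attribute names are kept from the question; build_things is a hypothetical helper added only to keep the example self-contained.

```python
class ListMaker(object):
    def __init__(self, bigList=None):
        # Create a fresh list per instance unless the caller supplies one.
        self.bigList = [] if bigList is None else bigList


class makeAThing(object):
    def __init__(self, name=0, aList=None):
        self.name = name
        self.aList = [] if aList is None else aList

    def makeAList(self):
        self.aList = ListMaker()


def build_things(count):
    # Hypothetical helper: builds `count` things, each with its own list.
    things = []
    for x in range(count):
        thing = makeAThing(name=x)
        thing.makeAList()
        thing.aList.bigList.append(x)
        things.append(thing)
    return things


k = build_things(3)
for e in k:
    print(e.name, e.aList.bigList)
# 0 [0]
# 1 [1]
# 2 [2]
```

Because every ListMaker now owns its own list, later edits to one instance's bigList stay with that instance.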

Variable scope python3

I have
def func1(var):
    if var == 0:
        return
    else:
        var = var - 1
        func1(var)
PROPOSAL = 1
def func2():
    func1(PROPOSAL)
    print(PROPOSAL)
In the recursive calls in func1, will the variable PROPOSAL be decremented, meaning the print statement will print 0?
Edit: I should've asked, why doesn't it do this?
No, the PROPOSAL global variable will not be decremented by your code. This isn't really because of scope, but because of how Python passes arguments.
When you call a function that takes an argument, the value of the argument you pass is bound to a parameter name, just like an assignment to a variable. If the value is mutable, an in-place modification through one name will be visible through the other name, but if the value is immutable (as ints are in Python), you'll never see a change to one variable affect another.
Here's an example that shows functions and regular assignment working the same way:
x = 1
y = x # binds y to the same value as x
y += 1 # modify y (which will rebind it, since integers are immutable)
print(x, y) # prints "1 2"
def func(z): # z is a local variable in func
    z += 1
    print(x, z)
func(x) # also prints "1 2", for exactly the same reasons as the code above
X = [1]
Y = X # again, binds Y to the same list as X
Y.append(2) # this time, we modify the list in place (without rebinding)
print(X, Y) # prints "[1, 2] [1, 2]", since both names still refer to the same list
def FUNC(Z):
    Z.append(3)
    print(X, Z)
FUNC(X) # prints "[1, 2, 3] [1, 2, 3]"
Of course, rebinding a variable that referred to a mutable value will also cause the change not to be reflected in other references to the original value. For instance, if you replaced the append() calls in the second part of the code with Y = Y + [2] and Z = Z + [3], the original X list would not be changed, since those assignment statements rebind Y and Z rather than modifying the original values in place.
The "augmented assignment" operators like +=, -=, and *= are a bit tricky. They will first try to do an in-place modification if the value on the left side supports it (and many mutable types do). If the value doesn't support in-place modification of that kind (for instance, because it's an immutable object, or because the specific operator is not allowed), they fall back on the corresponding regular operator (+, -, or *) to create a new value and rebind the name to it (if that fails too, an exception is raised).
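The difference between the in-place path and the fallback path can be observed with a list and a tuple. This is a small sketch; the variable names are arbitrary.

```python
a = [1, 2]
alias = a
a += [3]             # list implements __iadd__, so the list is extended in place
print(alias)         # [1, 2, 3]
print(a is alias)    # True

t = (1, 2)
t_alias = t
t += (3,)            # tuple has no __iadd__, so a new tuple is built and t is rebound
print(t_alias)       # (1, 2)
print(t is t_alias)  # False

b = [1, 2]
b_alias = b
b = b + [3]          # plain + always creates a new list, even for mutable types
print(b_alias)       # [1, 2]
```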
func1(PROPOSAL) returns None, and it won't affect the global PROPOSAL variable: the decrement only rebinds the local name var inside func1, and nothing is ever assigned back to PROPOSAL.
func2() just calls func1() and then prints PROPOSAL, which was never changed in that scope; only the local var inside func1 was.
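If the goal is for PROPOSAL to end up decremented, the function has to hand the new value back and the caller has to rebind the global. A minimal sketch, assuming that is what was intended; count_down is a hypothetical name, not from the question.

```python
def count_down(var):
    # Recurse until var reaches 0 and return the final value to the caller.
    if var == 0:
        return var
    return count_down(var - 1)


PROPOSAL = 1


def func2():
    global PROPOSAL  # required because func2 assigns to the module-level name
    PROPOSAL = count_down(PROPOSAL)
    print(PROPOSAL)


func2()  # prints 0
```

Returning the value and rebinding it explicitly keeps the data flow visible; the global statement is only needed because func2 assigns to PROPOSAL rather than just reading it.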

When is map() necessary?

Given the following:
(I) a = map(function, sequence)
(II) a = [function(x) for x in sequence]
When would I need to use (I)? Why choose a map object over a list when the latter is subscriptable and IMO more readable?
Also, could someone explain line 6 of the following code (Python 3):
>>> import math
>>> a = map(int,str(math.factorial(100)))
>>> sum(a)
648
>>> sum(a)
0
Why is the sum of the map object changing?
When would I need to use (I)? Why choose a map object over a list when the latter is subscriptable and IMO more readable?
map was introduced in Python 1.0, while list comprehensions were not introduced until Python 2.0.
From Python 2 onward, you never strictly need one rather than the other; either can express the same thing.
Reasons for still using map could include:
preference. You prefer list comprehension, not everyone agrees.
familiarity. map is very common across languages. If Python's not your native language, "map" is the function you'll look up.
brevity. map is often shorter. Compare map and lambda f,l: [f(x) for x in l].
(I) is an iterator: it produces a stream of values that are gone once consumed. (II) is a list: it persists and has lots of features, like len(mylist) and mylist[-3:].
The sum changes because the iterator is exhausted by the first sum(a); the second call finds no remaining items and returns 0.
Use lists and list comprehensions. If you process tons of data, then iterators (and generators, and generator expressions) are awesome, but they can be confusing.
Or, use an iterator and convert into a list for further processing:
a = list( map(int,str(math.factorial(100))) )
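The difference between the one-shot iterator and the reusable list can be seen directly. A short sketch using the same factorial example; the variable names lazy and eager are illustrative only.

```python
import math

digits = str(math.factorial(100))

lazy = map(int, digits)           # iterator: values are produced on demand
first_pass = sum(lazy)
second_pass = sum(lazy)           # nothing left, and the sum of no items is 0
print(first_pass, second_pass)    # 648 0

eager = [int(d) for d in digits]  # list: can be traversed repeatedly
print(sum(eager), sum(eager))     # 648 648
print(len(eager), eager[-3:])     # lists also support len() and slicing

# A generator expression avoids building the list when one pass is enough.
print(sum(int(d) for d in digits))  # 648
```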
From the Python 2 docs (in Python 3, map returns an iterator instead of a list):
Apply function to every item of iterable and return a list of the results. If additional iterable arguments are passed, function must take that many arguments and is applied to the items from all iterables in parallel...
The sum changes to 0 because the iterator has already been consumed, so nothing is left to iterate over. The same thing happens with .read() on a file (try x = open('myfile.txt'), and then call print(x.read()) twice).
To keep the values around for reuse, wrap the map in list():
>>> import math
>>> a = map(int,str(math.factorial(100)))
>>> sum(a)
648
>>> sum(a)
0
>>> a = list(map(int,str(math.factorial(100))))
>>> sum(a)
648
>>> sum(a)
648
Example from the docs (Python 2, where map returns a list):
>>> seq = range(8)
>>> def add(x, y): return x+y
...
>>> map(add, seq, seq)
[0, 2, 4, 6, 8, 10, 12, 14]
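In Python 3 the same call returns a map object rather than a list, so the results have to be materialized explicitly. A sketch of the Python 3 equivalent, reusing the add and seq names from the example:

```python
def add(x, y):
    return x + y

seq = range(8)

result = map(add, seq, seq)
print(type(result).__name__)  # map
print(list(result))           # [0, 2, 4, 6, 8, 10, 12, 14]

# With iterables of different lengths, Python 3's map stops at the shortest one.
print(list(map(add, range(3), range(10))))  # [0, 2, 4]
```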

Resources