What exactly are the Python scoping rules?
If I have some code:
code1
class Foo:
    code2
    def spam.....
        code3
        for code4..:
            code5
            x()
Where is x found? Some possible choices include the list below:
In the enclosing source file
In the class namespace
In the function definition
In the for loop index variable
Inside the for loop
Also there is the context during execution, when the function spam is passed somewhere else. And maybe lambda functions pass a bit differently?
There must be a simple reference or algorithm somewhere. It's a confusing world for intermediate Python programmers.
Actually, a concise rule for Python scope resolution, from Learning Python, 3rd Ed. (These rules are specific to variable names, not attributes. If you reference a name without a period, these rules apply.)
LEGB Rule
Local — Names assigned in any way within a function (def or lambda), and not declared global in that function
Enclosing-function — Names assigned in the local scope of any and all statically enclosing functions (def or lambda), from inner to outer
Global (module) — Names assigned at the top-level of a module file, or by executing a global statement in a def within the file
Built-in (Python) — Names preassigned in the built-in names module: open, range, SyntaxError, etc
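As a rough illustration of the LEGB order, the sketch below gives the same name a binding at several levels and shows which one each function sees. The function names (outer, inner_local, inner_enclosing, read_global, read_builtin) are made up for this example.

```python
import builtins

x = "global x"                 # G: module-level binding

def outer():
    x = "enclosing x"          # E for the nested functions below

    def inner_local():
        x = "local x"          # L: assigned inside this function
        return x

    def inner_enclosing():
        return x               # no local x, so outer()'s x is used

    return inner_local(), inner_enclosing()

def read_global():
    return x                   # no local or enclosing x: module global

def read_builtin():
    return len                 # not defined in this module: found in builtins

print(outer())                          # ('local x', 'enclosing x')
print(read_global())                    # global x
print(read_builtin() is builtins.len)   # True
```

Each lookup stops at the first scope that has a binding for the name, so the inner bindings hide the outer ones without changing them.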
So, in the case of
code1
class Foo:
    code2
    def spam():
        code3
        for code4:
            code5
            x()
The for loop does not have its own namespace. In LEGB order, the scopes would be
L: Local in def spam (in code3, code4, and code5)
E: Any enclosing functions (if the whole example were in another def)
G: Were there any x declared globally in the module (in code1)?
B: Any builtin x in Python.
x will never be found in code2, even in cases where you might expect it to be (see Antti's answer).
Essentially, the only thing in Python that introduces a new scope is a function definition. Classes are a bit of a special case in that anything defined directly in the body is placed in the class's namespace, but they are not directly accessible from within the methods (or nested classes) they contain.
In your example there are only 3 scopes where x will be searched in:
spam's scope - containing everything defined in code3 and code5 (as well as code4, your loop variable)
The global scope - containing everything defined in code1, as well as Foo (and whatever changes after it)
The builtins namespace. A bit of a special case - this contains the various Python builtin functions and types such as len() and str(). Generally this shouldn't be modified by any user code, so expect it to contain the standard functions and nothing else.
More scopes only appear when you introduce a nested function (or lambda) into the picture.
These behave pretty much as you'd expect, however. The nested function can access everything in its own local scope, as well as anything in the enclosing function's scope, e.g.:
def foo():
    x = 4
    def bar():
        print(x)  # Accesses x from foo's scope
    bar()  # Prints 4
    x = 5
    bar()  # Prints 5
Restrictions:
Variables in scopes other than the local function's can be accessed, but can't be rebound to new objects without further syntax. Instead, assignment creates a new local variable rather than affecting the variable in the parent scope. For example:
global_var1 = []
global_var2 = 1
def func():
    # This is OK: It's just accessing, not rebinding
    global_var1.append(4)
    # This won't affect global_var2. Instead it creates a new variable
    global_var2 = 2
    local1 = 4
    def embedded_func():
        # Again, this doesn't affect func's local1 variable. It creates a
        # new local variable also called local1 instead.
        local1 = 5
        print(local1)
    embedded_func()  # Prints 5
    print(local1)    # Prints 4
In order to actually modify the bindings of global variables from within a function scope, you need to specify that the variable is global with the global keyword, e.g.:
global_var = 4
def change_global():
    global global_var
    global_var = global_var + 1
In Python 2 there is no way to do the same for variables in enclosing function scopes, but Python 3 introduces a new keyword, nonlocal, which acts in a similar way to global, but for nested function scopes.
There was no thorough answer covering Python 3, so I wrote one here. Most of what is described here is detailed in section 4.2.2, Resolution of names, of the Python 3 documentation.
As provided in other answers, there are 4 basic scopes, the LEGB, for Local, Enclosing, Global and Builtin. In addition to those, there is a special scope, the class body, which does not comprise an enclosing scope for methods defined within the class; any assignments within the class body make the variable from there on be bound in the class body.
In particular, no block statement besides def and class creates a variable scope. In Python 2 a list comprehension does not create a variable scope; in Python 3, however, the loop variable of a list comprehension lives in a new scope.
To demonstrate the peculiarities of the class body:
x = 0
class X(object):
    y = x
    x = x + 1  # x is now a variable
    z = x
    def method(self):
        print(self.x)  # -> 1
        print(x)       # -> 0, the global x
        print(y)       # -> NameError: global name 'y' is not defined
inst = X()
print(inst.x, inst.y, inst.z, x)  # -> (1, 0, 1, 0)
Thus, unlike in a function body, you can reassign the variable to the same name in a class body to get a class variable with the same name; further lookups on this name resolve to the class variable instead.
One of the greater surprises to many newcomers to Python is that a for loop does not create a variable scope. In Python 2, list comprehensions do not create a scope either (while generator expressions and dict comprehensions do!). Instead they leak the loop variable into the function or global scope:
>>> [ i for i in range(5) ]
[0, 1, 2, 3, 4]
>>> i
4
The comprehensions can be used as a cunning (or awful if you will) way to make modifiable variables within lambda expressions in Python 2 - a lambda expression does create a variable scope, like the def statement would, but within lambda no statements are allowed. Assignment being a statement in Python means that no variable assignments in lambda are allowed, but a list comprehension is an expression...
This behaviour has been fixed in Python 3 - no comprehension expressions or generators leak variables.
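A small check of that difference, assuming Python 3: the for loop target stays bound in the function after the loop, while the comprehension's target does not. The function name leak_check is made up for this sketch.

```python
def leak_check():
    for i in range(3):                    # a for loop has no scope of its own
        pass
    squares = [j * j for j in range(3)]   # j lives in the comprehension's scope
    try:
        j                                 # not a local of leak_check: global lookup
        leaked = True
    except NameError:
        leaked = False
    return i, squares, leaked

print(leak_check())   # (2, [0, 1, 4], False) on Python 3
```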
The global scope really means the module scope. The main Python module is __main__, and all imported modules are accessible through sys.modules. To get access to __main__ you can use sys.modules['__main__'] or import __main__. It is perfectly acceptable to access and assign attributes there; they will show up as variables in the global scope of the main module.
If a name is ever assigned to in the current scope (except in the class scope), it is considered to belong to that scope; otherwise it is considered to belong to the nearest enclosing scope that assigns to the variable (it might not be assigned yet, or not at all), or finally to the global scope. If the variable is considered local, but it is not set yet or has been deleted, reading its value results in UnboundLocalError, which is a subclass of NameError.
x = 5
def foobar():
    print(x)  # causes UnboundLocalError!
    x += 1    # because assignment here makes x a local variable within the function
# call the function
foobar()
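The "has been deleted" case and the subclass relationship can be checked the same way. This is only a sketch; read_after_del is a made-up name.

```python
def read_after_del():
    value = 1
    del value                    # value is still a local name, but now unbound
    try:
        return value
    except UnboundLocalError as exc:
        # UnboundLocalError is a subclass of NameError
        return isinstance(exc, NameError)

print(read_after_del())                           # True
print(issubclass(UnboundLocalError, NameError))   # True
```

Because of the subclass relationship, an except NameError clause would catch this error as well.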
The scope can declare that it explicitly wants to modify the global (module scope) variable, with the global keyword:
x = 5
def foobar():
    global x
    print(x)
    x += 1
foobar()  # -> 5
print(x)  # -> 6
This also is possible even if it was shadowed in enclosing scope:
x = 5
y = 13
def make_closure():
    x = 42
    y = 911
    def func():
        global x  # sees the global value
        print(x, y)
        x += 1
    return func
func = make_closure()
func()       # -> 5 911
print(x, y)  # -> 6 13
In Python 2 there is no easy way to modify the value in the enclosing scope; usually this is simulated by using a mutable value, such as a list of length 1:
def make_closure():
    value = [0]
    def get_next_value():
        value[0] += 1
        return value[0]
    return get_next_value
get_next = make_closure()
print(get_next())  # -> 1
print(get_next())  # -> 2
However, in Python 3, nonlocal comes to the rescue:
def make_closure():
    value = 0
    def get_next_value():
        nonlocal value
        value += 1
        return value
    return get_next_value
get_next = make_closure()  # identical behavior to the previous example.
The nonlocal documentation says that
Names listed in a nonlocal statement, unlike those listed in a global statement, must refer to pre-existing bindings in an enclosing scope (the scope in which a new binding should be created cannot be determined unambiguously).
i.e. nonlocal always refers to the innermost outer non-global scope where the name has been bound (i.e. assigned to, including used as the for target variable, in the with clause, or as a function parameter).
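To see the "innermost outer scope" rule in action, here is a sketch with three nesting levels (the names outer, middle and inner are just for illustration). Only the nearest binding is changed.

```python
def outer():
    value = "outer"
    def middle():
        value = "middle"
        def inner():
            nonlocal value       # binds to middle()'s value, the nearest one
            value = "changed by inner"
        inner()
        return value
    return middle(), value       # outer()'s own value is untouched

print(outer())   # ('changed by inner', 'outer')
```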
Any variable that is not deemed to be local to the current scope, or to any enclosing scope, is a global variable. A global name is looked up in the module's global dictionary; if it is not found there, it is looked up in the builtins module. The name of that module changed between Python 2 and Python 3: in Python 2 it was __builtin__, and in Python 3 it is called builtins. If you assign to an attribute of the builtins module, it will be visible thereafter to any module as a readable global variable, unless that module shadows it with its own global variable of the same name.
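A minimal Python 3 sketch of that last point. The attribute name shared_answer is invented for the example, and it is removed again at the end, since modifying builtins is rarely a good idea in real code.

```python
import builtins

builtins.shared_answer = 42      # hypothetical name, for illustration only

def read_shared():
    return shared_answer         # not local, not a module global: found in builtins

result = read_shared()
del builtins.shared_answer       # clean up after the demonstration
print(result)                    # 42
```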
Reading the builtin module can also be useful; suppose that you want the python 3 style print function in some parts of file, but other parts of file still use the print statement. In Python 2.6-2.7 you can get hold of the Python 3 print function with:
import __builtin__
print3 = __builtin__.__dict__['print']
The from __future__ import print_function statement does not actually import the print function anywhere in Python 2. Instead, it just disables the parsing rules for the print statement in the current module, handling print like any other identifier and thus allowing the print function to be looked up in the builtins.
A slightly more complete example of scope:
from __future__ import print_function  # for python 2 support
x = 100
print("1. Global x:", x)
class Test(object):
    y = x
    print("2. Enclosed y:", y)
    x = x + 1
    print("3. Enclosed x:", x)
    def method(self):
        print("4. Enclosed self.x", self.x)
        print("5. Global x", x)
        try:
            print(y)
        except NameError as e:
            print("6.", e)
    def method_local_ref(self):
        try:
            print(x)
        except UnboundLocalError as e:
            print("7.", e)
        x = 200  # causing 7 because has same name
        print("8. Local x", x)
inst = Test()
inst.method()
inst.method_local_ref()
output:
1. Global x: 100
2. Enclosed y: 100
3. Enclosed x: 101
4. Enclosed self.x 101
5. Global x 100
6. global name 'y' is not defined
7. local variable 'x' referenced before assignment
8. Local x 200
The scoping rules for Python 2.x have been outlined already in other answers. The only thing I would add is that in Python 3.0, there is also the concept of a non-local scope (indicated by the 'nonlocal' keyword). This allows you to access outer scopes directly, and opens up the ability to do some neat tricks, including lexical closures (without ugly hacks involving mutable objects).
EDIT: Here's the PEP with more information on this.
Python resolves your variables with -- generally -- three namespaces available.
At any time during execution, there are at least three nested scopes whose namespaces are directly accessible: the innermost scope, which is searched first, contains the local names; the namespaces of any enclosing functions, which are searched starting with the nearest enclosing scope; the middle scope, searched next, contains the current module's global names; and the outermost scope (searched last) is the namespace containing built-in names.
There are two functions, globals and locals, which show you the contents of two of these namespaces.
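For instance, a quick sketch of what the two functions report when called inside a function (module_level, show and inside are made-up names):

```python
module_level = 1

def show():
    inside = 2
    # locals() reports the function's names; globals() reports the module's
    return sorted(locals()), "module_level" in globals(), "inside" in globals()

print(show())   # (['inside'], True, False)
```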
Namespaces are created by packages, modules, classes, object construction and functions. There aren't any other flavors of namespaces.
In this case, the call to a function named x has to be resolved in the local name space or the global namespace.
Local in this case, is the body of the method function Foo.spam.
Global is -- well -- global.
The rule is to search the nested local spaces created by method functions (and nested function definitions), then search global. That's it.
There are no other scopes. The for statement (and other compound statements like if and try) doesn't create a new nested scope. Only definitions (packages, modules, functions, classes and object instances) do.
Inside a class definition, the names are part of the class namespace. code2, for instance, must be qualified by the class name. Generally Foo.code2. However, self.code2 will also work because Python objects look at the containing class as a fall-back.
An object (an instance of a class) has instance variables. These names are in the object's namespace. They must be qualified by the object. (variable.instance.)
From within a class method, you have locals and globals. You say self.variable to pick the instance as the namespace. You'll note that self is an argument to every class member function, making it part of the local namespace.
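Here is a small sketch of those qualification rules. The attribute name shared and the method shadow are invented for the example; Foo and spam follow the question.

```python
class Foo:
    shared = "class attribute"          # lives in Foo's namespace

    def spam(self):
        # A bare `shared` is not looked up in the class; qualify it instead.
        return self.shared, Foo.shared

    def shadow(self):
        self.shared = "instance attribute"   # stored on the instance only
        return self.shared, Foo.shared

foo = Foo()
print(foo.spam())     # ('class attribute', 'class attribute')
print(foo.shadow())   # ('instance attribute', 'class attribute')
```

The second call shows the fall-back order: once the instance has its own attribute, self.shared finds that first, while Foo.shared still names the class attribute.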
See Python Scope Rules, Python Scope, Variable Scope.
Where is x found?
x is not found as you haven't defined it. :-) It could be found in code1 (global) or code3 (local) if you put it there.
code2 (class members) aren't visible to code inside methods of the same class — you would usually access them using self. code4/code5 (loops) live in the same scope as code3, so if you wrote to x in there you would be changing the x instance defined in code3, not making a new x.
Python is statically scoped, so if you pass ‘spam’ to another function spam will still have access to globals in the module it came from (defined in code1), and any other containing scopes (see below). code2 members would again be accessed through self.
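To illustrate the static scoping point, this sketch passes a nested function to another function that has its own local x. The names make_reader, reader and call_elsewhere are hypothetical.

```python
def make_reader():
    x = "defined where reader was written"
    def reader():
        return x                 # resolved lexically, in make_reader()'s scope
    return reader

def call_elsewhere(func):
    x = "caller's local x"       # never consulted by func
    return func()

print(call_elsewhere(make_reader()))   # defined where reader was written
```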
lambda is no different to def. If you have a lambda used inside a function, it's the same as defining a nested function. In Python 2.2 onwards, nested scopes are available. In this case you can bind x at any level of function nesting and Python will pick up the innermost instance:
x = 0
def fun1():
    x = 1
    def fun2():
        x = 2
        def fun3():
            return x
        return fun3()
    return fun2()
print(fun1(), x)
2 0
fun3 sees the instance x from the nearest containing scope, which is the function scope associated with fun2. But the other x instances, defined in fun1 and globally, are not affected.
Before nested_scopes — in Python pre-2.1, and in 2.1 unless you specifically ask for the feature using a from-future-import — fun1 and fun2's scopes are not visible to fun3, so S.Lott's answer holds and you would get the global x:
0 0
The Python name resolution only knows the following kinds of scope:
builtins scope which provides the Builtin Functions, such as print, int, or zip,
module global scope which is always the top-level of the current module,
three user-defined scopes that can be nested into each other, namely
function closure scope, from any enclosing def block, lambda expression or comprehension.
function local scope, inside a def block, lambda expression or comprehension,
class scope, inside a class block.
Notably, other constructs such as if, for, or with statements do not have their own scope.
The scoping TLDR: The lookup of a name begins at the scope in which the name is used, then any enclosing scopes (excluding class scopes), to the module globals, and finally the builtins – the first match in this search order is used.
The assignment to a scope is by default to the current scope – the special forms nonlocal and global must be used to assign to a name from an outer scope.
Finally, comprehensions and generator expressions as well as := assignment expressions have one special rule when combined.
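As a quick check of the lookup order summarized above, in particular that class scopes are skipped by enclosed functions, consider this sketch (outer, C and method are illustrative names):

```python
def outer():
    x = "function"
    class C:
        x = "class"              # only reachable as C.x or self.x
        def method(self):
            return x             # skips the class scope, finds outer()'s x
    return C().method(), C.x

print(outer())   # ('function', 'class')
```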
Nested Scopes and Name Resolution
These different scopes build a hierarchy, with builtins then global always forming the base, and closures, locals and class scope being nested as lexically defined. That is, only the nesting in the source code matters, not for example the call stack.
print("builtins are available without definition")
some_global = "1"  # global variables are at module scope
def outer_function():
    some_closure = "3.1"  # locals and closure are defined the same, at function scope
    some_local = "3.2"    # a variable becomes a closure if a nested scope uses it
    class InnerClass:
        some_classvar = "3.3"  # class variables exist *only* at class scope
        def inner_function(self):
            some_local = "3.2"   # locals can replace outer names
            print(some_closure)  # closures are always readable
    return InnerClass
Even though class creates a scope and may have nested classes, functions and comprehensions, the names of the class scope are not visible to enclosed scopes. This creates the following hierarchy:
┎ builtins           [print, ...]
┗━┱ globals            [some_global]
  ┗━┱ outer_function     [some_local, some_closure]
    ┣━╾ InnerClass         [some_classvar]
    ┗━╾ inner_function     [some_local]
Name resolution always starts at the current scope in which a name is accessed, then goes up the hierarchy until a match is found. For example, looking up some_local inside outer_function and inner_function starts at the respective function - and immediately finds the some_local defined in outer_function and inner_function, respectively. When a name is not local, it is fetched from the nearest enclosing scope that defines it – looking up some_closure and print inside inner_function searches until outer_function and builtins, respectively.
Scope Declarations and Name Binding
By default, a name belongs to any scope in which it is bound to a value. Binding the same name again in an inner scope creates a new variable with the same name - for example, some_local exists separately in both outer_function and inner_function. As far as scoping is concerned, binding includes any statement that sets the value of a name – assignment statements, but also the iteration variable of a for loop, or the name of a with context manager. Notably, del also counts as name binding.
When a name must refer to an outer variable and be bound in an inner scope, the name must be declared as not local. Separate declarations exist for the different kinds of enclosing scopes: nonlocal always refers to the nearest closure, and global always refers to a global name. Notably, nonlocal never refers to a global name and global ignores all closures of the same name. There is no declaration to refer to the builtin scope.
some_global = "1"
def outer_function():
    some_closure = "3.2"
    some_global = "this is ignored by a nested global declaration"
    def inner_function():
        global some_global     # declare variable from global scope
        nonlocal some_closure  # declare variable from enclosing scope
        message = " bound by an inner scope"
        some_global = some_global + message
        some_closure = some_closure + message
    return inner_function
Of note is that function local and nonlocal are resolved at compile time. A nonlocal name must exist in some outer scope. In contrast, a global name can be defined dynamically and may be added or removed from the global scope at any time.
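The difference can be observed directly. In this sketch, assuming CPython 3, a nonlocal declaration without a matching outer binding is rejected by compile() before any code runs, while a global name only has to exist by the time it is looked up. The names bad_source, namespace, read_late and late_name are made up for the example.

```python
# nonlocal is checked when the code is compiled, before anything runs
bad_source = "def f():\n    nonlocal missing\n"
try:
    compile(bad_source, "<sketch>", "exec")
    compile_error = None
except SyntaxError as exc:
    compile_error = exc.msg

# a global name only has to exist by the time it is looked up
namespace = {}
exec("def read_late():\n    return late_name\n", namespace)
namespace["late_name"] = "added after the function was defined"
late_value = namespace["read_late"]()

print(compile_error)   # a message about no binding for nonlocal 'missing'
print(late_value)      # added after the function was defined
```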
Comprehensions and Assignment Expressions
The scoping rules of list, set and dict comprehensions and generator expressions are almost the same as for functions. Likewise, the scoping rules for assignment expressions are almost the same as for regular name binding.
The scope of comprehensions and generator expressions is of the same kind as function scope. All names bound in the scope, namely the iteration variables, are locals or closures to the comprehensions/generator and nested scopes. All names, including iterables, are resolved using name resolution as applicable inside functions.
some_global = "global"
def outer_function():
    some_closure = "closure"
    return [            # new function-like scope started by comprehension
        comp_local      # names resolved using regular name resolution
        for comp_local  # iteration targets are local
        in "iterable"
        if comp_local in some_global and comp_local in some_global
    ]
An := assignment expression works on the nearest function, class or global scope. Notably, if the target of an assignment expression has been declared nonlocal or global in the nearest scope, the assignment expression honors this like a regular assignment.
print(some_global := "global")
def outer_function():
    print(some_closure := "closure")
However, an assignment expression inside a comprehension/generator works on the nearest enclosing scope of the comprehension/generator, not the scope of the comprehension/generator itself. When several comprehensions/generators are nested, the nearest function or global scope is used. Since the comprehension/generator scope can read closures and global variables, the assignment variable is readable in the comprehension as well. Assigning from a comprehension to a class scope is not valid.
print(some_global := "global")
def outer_function():
    print(some_closure := "closure")
    steps = [
        # v write to variable in containing scope
        (some_closure := some_closure + comp_local)
        #                ^ read from variable in containing scope
        for comp_local in some_global
    ]
    return some_closure, steps
While the iteration variable is local to the comprehension in which it is bound, the target of the assignment expression does not create a local variable and is read from the outer scope:
┎ builtins      [print, ...]
┗━┱ globals       [some_global]
  ┗━┱ outer_function  [some_closure]
    ┗━╾ <listcomp>         [comp_local]
In Python, any variable that is assigned a value is local to the block in which the assignment appears.
If a variable can't be found in the current scope, please refer to the LEGB order.
How can I pass an integer by reference in Python?
I want to modify the value of a variable that I am passing to the function. I have read that everything in Python is pass by value, but there has to be an easy trick. For example, in Java you could pass the reference types of Integer, Long, etc.
How can I pass an integer into a function by reference?
What are the best practices?
It doesn't quite work that way in Python. Python passes references to objects. Inside your function you have an object -- You're free to mutate that object (if possible). However, integers are immutable. One workaround is to pass the integer in a container which can be mutated:
def change(x):
    x[0] = 3
x = [1]
change(x)
print(x)
This is ugly/clumsy at best, but you're not going to do any better in Python. The reason is because in Python, assignment (=) takes whatever object is the result of the right hand side and binds it to whatever is on the left hand side *(or passes it to the appropriate function).
Understanding this, we can see why there is no way to change the value of an immutable object inside a function -- you can't change any of its attributes because it's immutable, and you can't just assign the "variable" a new value because then you're actually creating a new object (which is distinct from the old one) and giving it the name that the old object had in the local namespace.
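A short sketch of the two cases side by side: rebinding a parameter versus mutating the object it refers to. The function names rebind and mutate are just for illustration.

```python
def rebind(n):
    n = n + 1        # binds the local name n to a new int object
    return n

def mutate(items):
    items.append(4)  # changes the object the caller also refers to

value = 10
returned = rebind(value)
numbers = [1, 2, 3]
mutate(numbers)

print(value, returned)   # 10 11
print(numbers)           # [1, 2, 3, 4]
```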
Usually the workaround is to simply return the object that you want:
def multiply_by_2(x):
    return 2*x
x = 1
x = multiply_by_2(x)
*In the first example case above, 3 actually gets passed to x.__setitem__.
Most cases where you would need to pass by reference are where you need to return more than one value back to the caller. A "best practice" is to use multiple return values, which is much easier to do in Python than in languages like Java.
Here's a simple example:
import math
def RectToPolar(x, y):
    r = (x ** 2 + y ** 2) ** 0.5
    theta = math.atan2(y, x)
    return r, theta  # return 2 things at once
r, theta = RectToPolar(3, 4)  # assign 2 things at once
Not exactly passing a value directly, but using it as if it was passed.
def outer():
    x = 7
    def my_method():
        nonlocal x  # rebinds x in outer()'s scope
        x += 1
    my_method()
    print(x)  # 8
outer()
Caveats:
nonlocal was introduced in python 3
If the enclosing scope is the global one, use global instead of nonlocal.
Maybe it's not the Pythonic way, but you can do this:
import ctypes
def incr(a):
    a[0] += 1  # write through the pointer
x = ctypes.c_int(1)      # create a C int
incr(ctypes.pointer(x))  # pass a pointer to it
print(x.value)           # 2
Really, the best practice is to step back and ask whether you really need to do this. Why do you want to modify the value of a variable that you're passing in to the function?
If you need to do it for a quick hack, the quickest way is to pass a list holding the integer, and stick a [0] around every use of it, as mgilson's answer demonstrates.
If you need to do it for something more significant, write a class that has an int as an attribute, so you can just set it. Of course this forces you to come up with a good name for the class, and for the attribute—if you can't think of anything, go back and read the sentence again a few times, and then use the list.
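For example, a minimal sketch of such a class. RetryBudget, remaining and consume are hypothetical names, picked only to show that a descriptive name documents what the integer means.

```python
class RetryBudget:                      # hypothetical name for illustration
    def __init__(self, remaining):
        self.remaining = remaining      # the int we want callers to update

def consume(budget):
    budget.remaining -= 1               # rebinds the attribute on the shared object

budget = RetryBudget(3)
consume(budget)
print(budget.remaining)                 # 2
```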
More generally, if you're trying to port some Java idiom directly to Python, you're doing it wrong. Even when there is something directly corresponding (as with static/@staticmethod), you still don't want to use it in most Python programs just because you'd use it in Java.
Maybe slightly more self-documenting than the list-of-length-1 trick is the old empty type trick:
def inc_i(v):
    v.i += 1
x = type('', (), {})()
x.i = 7
inc_i(x)
print(x.i)
A numpy single-element array is mutable and yet for most purposes, it can be evaluated as if it was a numerical python variable. Therefore, it's a more convenient by-reference number container than a single-element list.
import numpy as np
def triple_var_by_ref(x):
    x[0] = x[0] * 3
a = np.array([2])
triple_var_by_ref(a)
print(a + 1)
output:
[7]
The correct answer is to use a class and put the value inside it; this lets you pass by reference exactly as you desire.
class Thing:
    def __init__(self, a):
        self.a = a
def dosomething(ref):
    ref.a += 1
t = Thing(3)
dosomething(t)
print("T is now", t.a)
In Python, every value is a reference (a pointer to an object), just like non-primitives in Java. Also, like Java, Python only has pass by value. So, semantically, they are pretty much the same.
Since you mention Java in your question, I would like to see how you achieve what you want in Java. If you can show it in Java, I can show you how to do it exactly equivalently in Python.
class PassByReference:
    def Change(self, var):
        self.a = var
        print(self.a)
s = PassByReference()
s.Change(5)
class Obj:
    def __init__(self, a):
        self.value = a
    def sum(self, a):
        self.value += a
a = Obj(1)
b = a
a.sum(1)
print(a.value, b.value)  # 2 2
In Python, everything is passed by value, but if you want to modify some state, you can change the value of an integer inside a list or object that's passed to a method.
Integers are immutable in Python: once created, their value cannot be changed. Using the assignment operator on a variable makes it point to a different object at another address, not the previous one.
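A small sketch of that behaviour, keeping a second name on the original object so the rebinding is visible (the variable names are illustrative):

```python
a = 1000
original = a         # second name for the same int object
a += 1               # creates a new int and rebinds a; the old object is untouched

print(a, original)       # 1001 1000
print(a is original)     # False
```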
In Python a function can return multiple values, and we can make use of that:
def swap(a, b):
    return b, a
a, b = 22, 55
a, b = swap(a, b)
print(a, b)
To change what a variable refers to, we can wrap immutable data types (int, long, float, complex, str, bytes, tuple, frozenset) inside mutable data types (bytearray, list, set, dict).
# var is an instance of the dictionary type
def change(var, key, new_value):
    var[key] = new_value
var = dict()
var['a'] = 33
change(var, 'a', 2625)
print(var['a'])
Are parameters passed by reference or by value? How do I pass by reference so that the code below outputs 'Changed' instead of 'Original'?
class PassByReference:
    def __init__(self):
        self.variable = 'Original'
        self.change(self.variable)
        print(self.variable)
    def change(self, var):
        var = 'Changed'
See also: Why can a function modify some arguments as perceived by the caller, but not others?
Arguments are passed by assignment. The rationale behind this is twofold:
the parameter passed in is actually a reference to an object (but the reference is passed by value)
some data types are mutable, but others aren't
So:
If you pass a mutable object into a method, the method gets a reference to that same object and you can mutate it to your heart's delight, but if you rebind the reference in the method, the outer scope will know nothing about it, and after you're done, the outer reference will still point at the original object.
If you pass an immutable object to a method, you still can't rebind the outer reference, and you can't even mutate the object.
To make it even more clear, let's have some examples.
List - a mutable type
Let's try to modify the list that was passed to a method:
def try_to_change_list_contents(the_list):
    print('got', the_list)
    the_list.append('four')
    print('changed to', the_list)
outer_list = ['one', 'two', 'three']
print('before, outer_list =', outer_list)
try_to_change_list_contents(outer_list)
print('after, outer_list =', outer_list)
Output:
before, outer_list = ['one', 'two', 'three']
got ['one', 'two', 'three']
changed to ['one', 'two', 'three', 'four']
after, outer_list = ['one', 'two', 'three', 'four']
Since the parameter passed in is a reference to outer_list, not a copy of it, we can use the mutating list methods to change it and have the changes reflected in the outer scope.
Now let's see what happens when we try to change the reference that was passed in as a parameter:
def try_to_change_list_reference(the_list):
    print('got', the_list)
    the_list = ['and', 'we', 'can', 'not', 'lie']
    print('set to', the_list)
outer_list = ['we', 'like', 'proper', 'English']
print('before, outer_list =', outer_list)
try_to_change_list_reference(outer_list)
print('after, outer_list =', outer_list)
Output:
before, outer_list = ['we', 'like', 'proper', 'English']
got ['we', 'like', 'proper', 'English']
set to ['and', 'we', 'can', 'not', 'lie']
after, outer_list = ['we', 'like', 'proper', 'English']
Since the the_list parameter was passed by value, assigning a new list to it had no effect that the code outside the method could see. The the_list was a copy of the outer_list reference, and we had the_list point to a new list, but there was no way to change where outer_list pointed.
String - an immutable type
It's immutable, so there's nothing we can do to change the contents of the string.
Now, let's try to change the reference:
def try_to_change_string_reference(the_string):
    print('got', the_string)
    the_string = 'In a kingdom by the sea'
    print('set to', the_string)
outer_string = 'It was many and many a year ago'
print('before, outer_string =', outer_string)
try_to_change_string_reference(outer_string)
print('after, outer_string =', outer_string)
Output:
before, outer_string = It was many and many a year ago
got It was many and many a year ago
set to In a kingdom by the sea
after, outer_string = It was many and many a year ago
Again, since the the_string parameter was passed by value, assigning a new string to it had no effect that the code outside the method could see. The the_string was a copy of the outer_string reference, and we had the_string point to a new string, but there was no way to change where outer_string pointed.
I hope this clears things up a little.
EDIT: It's been noted that this doesn't answer the question that @David originally asked, "Is there something I can do to pass the variable by actual reference?". Let's work on that.
How do we get around this?
As @Andrea's answer shows, you could return the new value. This doesn't change the way things are passed in, but does let you get the information you want back out:
def return_a_whole_new_string(the_string):
    new_string = something_to_do_with_the_old_string(the_string)
    return new_string
# then you could call it like
my_string = return_a_whole_new_string(my_string)
If you really wanted to avoid using a return value, you could create a class to hold your value and pass it into the function or use an existing class, like a list:
def use_a_wrapper_to_simulate_pass_by_reference(stuff_to_change):
    new_string = something_to_do_with_the_old_string(stuff_to_change[0])
    stuff_to_change[0] = new_string
# then you could call it like
wrapper = [my_string]
use_a_wrapper_to_simulate_pass_by_reference(wrapper)
do_something_with(wrapper[0])
Although this seems a little cumbersome.
The problem comes from a misunderstanding of what variables are in Python. If you're used to most traditional languages, you have a mental model of what happens in the following sequence:
a = 1
a = 2
You believe that a is a memory location that stores the value 1, then is updated to store the value 2. That's not how things work in Python. Rather, a starts as a reference to an object with the value 1, then gets reassigned as a reference to an object with the value 2. Those two objects may continue to coexist even though a doesn't refer to the first one anymore; in fact they may be shared by any number of other references within the program.
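A minimal sketch of that rebinding, assuming a second name (first, which is not part of the question) is introduced just to keep hold of the original object:

```python
a = 1
first = a   # a second name bound to the same object that a refers to
a = 2       # rebinds a to another object; the object 1 is not modified

print(a, first)  # a now refers to 2, while first still refers to 1
```

Rebinding a has no effect on first, because assignment changes what a name refers to, not the object it used to refer to.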
When you call a function with a parameter, a new reference is created that refers to the object passed in. This is separate from the reference that was used in the function call, so there's no way to update that reference and make it refer to a new object. In your example:
def __init__(self):
    self.variable = 'Original'
    self.Change(self.variable)

def Change(self, var):
    var = 'Changed'
self.variable is a reference to the string object 'Original'. When you call Change you create a second reference var to the object. Inside the function you reassign the reference var to a different string object 'Changed', but the reference self.variable is separate and does not change.
The only way around this is to pass a mutable object. Because both references refer to the same object, any changes to the object are reflected in both places.
def __init__(self):
    self.variable = ['Original']
    self.Change(self.variable)

def Change(self, var):
    var[0] = 'Changed'
I found the other answers rather long and complicated, so I created this simple diagram to explain the way Python treats variables and parameters.
It is neither pass-by-value nor pass-by-reference; it is call-by-object. See this, by Fredrik Lundh:
Call By Object
Here is a significant quote:
"...variables [names] are not objects; they cannot be denoted by other variables or referred to by objects."
In your example, when the Change method is called--a namespace is created for it; and var becomes a name, within that namespace, for the string object 'Original'. That object then has a name in two namespaces. Next, var = 'Changed' binds var to a new string object, and thus the method's namespace forgets about 'Original'. Finally, that namespace is forgotten, and the string 'Changed' along with it.
Think of stuff being passed by assignment instead of by reference/by value. That way, it is always clear what is happening, as long as you understand what happens during a normal assignment.
So, when passing a list to a function/method, the list is assigned to the parameter name. Appending to the list will result in the list being modified. Reassigning the list inside the function will not change the original list, since:
a = [1, 2, 3]
b = a
b.append(4)
b = ['a', 'b']
print(a, b)  # prints [1, 2, 3, 4] ['a', 'b']
Since immutable types cannot be modified, they seem to be passed by value: passing an int into a function means assigning the int to the function's parameter. You can only ever reassign that parameter, and doing so won't change the original variable's value.
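To make the immutable case concrete, here is a small sketch. The names increment, count and result are made up for illustration:

```python
def increment(n):
    # n is a new name for the caller's int; += rebinds n to a new int object
    n += 1
    return n

count = 10
result = increment(count)

print(count, result)  # count is unchanged; only the returned value differs
```

The only way to get the new value out is to return it (or to put it into some mutable object the caller can see).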
There are no variables in Python
The key to understanding parameter passing is to stop thinking about "variables". There are names and objects in Python and together they appear like variables, but it is useful to always distinguish the three.
Python has names and objects.
Assignment binds a name to an object.
Passing an argument into a function also binds a name (the parameter name of the function) to an object.
That is all there is to it. Mutability is irrelevant to this question.
Example:
a = 1
This binds the name a to an object of type integer that holds the value 1.
b = x
This binds the name b to the same object that the name x is currently bound to.
Afterward, the name b has nothing to do with the name x anymore.
See sections 3.1 and 4.2 in the Python 3 language reference.
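A short sketch of the b = x case, assuming x starts out bound to a list (the values are arbitrary):

```python
x = [1, 2]
b = x                    # b is bound to the same object as x
same_before = b is x     # True: two names, one object

x = [3, 4]               # rebinds x only; b is not affected
same_after = b is x      # False: the names now refer to different objects

print(b, x, same_before, same_after)
```

After the second assignment to x, the name b still refers to the first list, which shows that b was tied to the object and never to the name x.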
How to read the example in the question
In the code shown in the question, the statement self.Change(self.variable) binds the name var (in the scope of function Change) to the object that holds the value 'Original' and the assignment var = 'Changed' (in the body of function Change) assigns that same name again: to some other object (that happens to hold a string as well but could have been something else entirely).
How to pass by reference
So if the thing you want to change is a mutable object, there is no problem, as everything is effectively passed by reference.
If it is an immutable object (e.g. a bool, number, string), the way to go is to wrap it in a mutable object.
The quick-and-dirty solution for this is a one-element list (instead of self.variable, pass [self.variable] and in the function modify var[0]).
The more pythonic approach would be to introduce a trivial, one-attribute class. The function receives an instance of the class and manipulates the attribute.
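As a rough sketch of the one-attribute class idea (the names Ref, value and change are hypothetical, not from the question):

```python
class Ref:
    """Trivial mutable holder with a single attribute."""
    def __init__(self, value):
        self.value = value

def change(ref):
    # Mutates the holder object; the caller's name still refers to it
    ref.value = 'Changed'

holder = Ref('Original')
change(holder)
print(holder.value)
```

The function never rebinds the caller's name. It only changes an attribute of the object both sides share.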
Effbot (aka Fredrik Lundh) has described Python's variable passing style as call-by-object: http://effbot.org/zone/call-by-object.htm
Objects are allocated on the heap and pointers to them can be passed around anywhere.
When you make an assignment such as x = 1000, a dictionary entry is created that maps the string "x" in the current namespace to a pointer to the integer object containing one thousand.
When you update "x" with x = 2000, a new integer object is created and the dictionary is updated to point at the new object. The old one thousand object is unchanged (and may or may not be alive depending on whether anything else refers to the object).
When you do a new assignment such as y = x, a new dictionary entry "y" is created that points to the same object as the entry for "x".
Objects like strings and integers are immutable. This simply means that there are no methods that can change the object after it has been created. For example, once the integer object one-thousand is created, it will never change. Math is done by creating new integer objects.
Objects like lists are mutable. This means that the contents of the object can be changed by anything pointing to the object. For example, x = []; y = x; x.append(10); print y will print [10]. The empty list was created. Both "x" and "y" point to the same list. The append method mutates (updates) the list object (like adding a record to a database) and the result is visible to both "x" and "y" (just as a database update would be visible to every connection to that database).
Hope that clarifies the issue for you.
Technically, Python always uses pass by reference values. I am going to repeat my other answer to support my statement.
Python always uses pass-by-reference values. There isn't any exception. Any variable assignment means copying the reference value. No exception. Any variable is the name bound to the reference value. Always.
You can think about a reference value as the address of the target object. The address is automatically dereferenced when used. This way, working with the reference value, it seems you work directly with the target object. But there always is a reference in between, one step more to jump to the target.
Here is the example that proves that Python uses passing by reference:
If the argument was passed by value, the outer lst could not be modified. The green are the target objects (the black is the value stored inside, the red is the object type), the yellow is the memory with the reference value inside -- drawn as the arrow. The blue solid arrow is the reference value that was passed to the function (via the dashed blue arrow path). The ugly dark yellow is the internal dictionary. (It actually could be drawn also as a green ellipse. The colour and the shape only says it is internal.)
You can use the id() built-in function to learn what the reference value is (that is, the address of the target object).
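For instance, a small sketch comparing id() inside and outside a call. The helper name identity_inside is made up, and note that id() being a memory address is a CPython implementation detail:

```python
def identity_inside(param):
    # Return the identity of the object as seen through the parameter name
    return id(param)

lst = [1, 2, 3]
same_object = identity_inside(lst) == id(lst)
print(same_object)
```

Both names report the same identity, so no copy of the list was made for the call.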
In compiled languages, a variable is a memory space that is able to capture the value of the type. In Python, a variable is a name (captured internally as a string) bound to the reference variable that holds the reference value to the target object. The name of the variable is the key in the internal dictionary, the value part of that dictionary item stores the reference value to the target.
Reference values are hidden in Python. There isn't any explicit user type for storing the reference value. However, you can use a list element (or element in any other suitable container type) as the reference variable, because all containers do store the elements also as references to the target objects. In other words, elements are actually not contained inside the container -- only the references to elements are.
A simple trick I normally use is to just wrap it in a list:
def Change(self, var):
    var[0] = 'Changed'

variable = ['Original']
self.Change(variable)
print(variable[0])
(Yeah I know this can be inconvenient, but sometimes it is simple enough to do this.)
(edit - Blair has updated his enormously popular answer so that it is now accurate)
I think it is important to note that the current post with the most votes (by Blair Conrad), while being correct with respect to its result, is misleading and is borderline incorrect based on its definitions. While there are many languages (like C) that allow the user to either pass by reference or pass by value, Python is not one of them.
David Cournapeau's answer points to the real answer and explains why the behavior in Blair Conrad's post seems to be correct while the definitions are not.
To the extent that Python is pass by value, all languages are pass by value since some piece of data (be it a "value" or a "reference") must be sent. However, that does not mean that Python is pass by value in the sense that a C programmer would think of it.
If you want the behavior, Blair Conrad's answer is fine. But if you want to know the nuts and bolts of why Python is neither pass by value nor pass by reference, read David Cournapeau's answer.
You got some really good answers here.
x = [2, 4, 4, 5, 5]
print(x)  # [2, 4, 4, 5, 5]

def go(li):
    li = [5, 6, 7, 8]  # re-assigning what li POINTS TO does not
                       # change the value of the ORIGINAL variable x

go(x)
print(x)  # [2, 4, 4, 5, 5] [ STILL! ]
input('press Enter to continue')
Python’s pass-by-assignment scheme isn’t quite the same as C++’s reference parameters option, but it turns out to be very similar to the argument-passing model of the C language (and others) in practice:
Immutable arguments are effectively passed “by value.” Objects such as integers and strings are passed by object reference instead of by copying, but because you can’t change immutable objects in place anyhow, the effect is much like making a copy.
Mutable arguments are effectively passed “by pointer.” Objects such as lists and dictionaries are also passed by object reference, which is similar to the way C passes arrays as pointers—mutable objects can be changed in place in the function, much like C arrays.
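The two cases in that quote can be sketched side by side. The function and variable names here are made up for illustration:

```python
def modify(text, items):
    text += ' world'   # str is immutable: this rebinds the local name only
    items.append(4)    # list is mutable: this changes the shared object in place

greeting = 'hello'
numbers = [1, 2, 3]
modify(greeting, numbers)

print(greeting, numbers)
```

The caller's string looks as if it had been copied, while the caller's list shows the change made inside the function.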
In this case the variable titled var in the method Change is assigned a reference to self.variable, and you immediately assign a string to var. It's no longer pointing to self.variable. The following code snippet shows what would happen if you modify the data structure pointed to by var and self.variable, in this case a list:
>>> class PassByReference:
...     def __init__(self):
...         self.variable = ['Original']
...         self.change(self.variable)
...         print(self.variable)
...
...     def change(self, var):
...         var.append('Changed')
...
>>> q = PassByReference()
['Original', 'Changed']
>>>
I'm sure someone else could clarify this further.
There are a lot of insights in the answers here, but I think one additional point is not explicitly mentioned. Quoting from the Python documentation, "What are the rules for local and global variables in Python?":
In Python, variables that are only referenced inside a function are implicitly global. If a variable is assigned a new value anywhere within the function’s body, it’s assumed to be a local. If a variable is ever assigned a new value inside the function, the variable is implicitly local, and you need to explicitly declare it as ‘global’.
Though a bit surprising at first, a moment’s consideration explains this. On one hand, requiring global for assigned variables provides a bar against unintended side-effects. On the other hand, if global was required for all global references, you’d be using global all the time. You’d have to declare as global every reference to a built-in function or to a component of an imported module. This clutter would defeat the usefulness of the global declaration for identifying side-effects.
Even when passing a mutable object to a function this still applies. And to me it clearly explains the reason for the difference in behavior between assigning to the object and operating on the object in the function.
def test(l):
    print("Received", l, id(l))
    l = [0, 0, 0]
    print("Changed to", l, id(l))  # New local object created, breaking link to global l

l = [1, 2, 3]
print("Original", l, id(l))
test(l)
print("After", l, id(l))
gives:
Original [1, 2, 3] 4454645632
Received [1, 2, 3] 4454645632
Changed to [0, 0, 0] 4474591928
After [1, 2, 3] 4454645632
An assignment to a global variable that is not declared global therefore creates a new local object and breaks the link to the original object.
As stated, you need a mutable object for this, but I suggest you also look into global variables, as they can help with or even solve this kind of issue!
http://docs.python.org/3/faq/programming.html#what-are-the-rules-for-local-and-global-variables-in-python
example:
>>> def x(y):
...     global z
...     z = y
...
>>> x
<function x at 0x00000000020E1730>
>>> y
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'y' is not defined
>>> z
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'z' is not defined
>>> x(2)
>>> x
<function x at 0x00000000020E1730>
>>> y
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'y' is not defined
>>> z
2
Here is the simple (I hope) explanation of the concept pass by object used in Python.
Whenever you pass an object to the function, the object itself is passed (object in Python is actually what you'd call a value in other programming languages) not the reference to this object. In other words, when you call:
def change_me(list):
    list = [1, 2, 3]

my_list = [0, 1]
change_me(my_list)
The actual object - [0, 1] (which would be called a value in other programming languages) is being passed. So in fact the function change_me will try to do something like:
[0, 1] = [1, 2, 3]
which obviously will not change the object passed to the function. If the function looked like this:
def change_me(list):
    list.append(2)
Then the call would result in:
[0, 1].append(2)
which obviously will change the object. This answer explains it well.
Aside from all the great explanations of how this stuff works in Python, I don't see a simple suggestion for the problem. As you seem to create objects and instances, the Pythonic way of handling instance variables and changing them is the following:
class PassByReference:
    def __init__(self):
        self.variable = 'Original'
        self.Change()
        print(self.variable)

    def Change(self):
        self.variable = 'Changed'
In instance methods, you normally refer to self to access instance attributes. It is normal to set instance attributes in __init__ and read or change them in instance methods. That is also why you pass self as the first argument to def Change.
Another solution would be to create a static method like this:
class PassByReference:
    def __init__(self):
        self.variable = 'Original'
        self.variable = PassByReference.Change(self.variable)
        print(self.variable)

    @staticmethod
    def Change(var):
        var = 'Changed'
        return var
I used the following method to quickly convert some Fortran code to Python. True, it's not pass by reference as the original question was posed, but it is a simple workaround in some cases.
a = 0
b = 0
c = 0
def myfunc(a, b, c):
    a = 1
    b = 2
    c = 3
    return a, b, c

a, b, c = myfunc(a, b, c)
print(a, b, c)
There is a little trick to pass an object by reference, even though the language doesn't make it possible. It works in Java too; it's the list with one item. ;-)
class PassByReference:
    def __init__(self, name):
        self.name = name

def changeRef(ref):
    ref[0] = PassByReference('Michael')

obj = PassByReference('Peter')
print(obj.name)
p = [obj]  # A pointer to obj! ;-)
changeRef(p)
print(p[0].name)  # p->name
It's an ugly hack, but it works. ;-P
Since it does not seem to be mentioned anywhere: one approach to simulating references as known from, e.g., C++ is to use an "update" function and pass that instead of the actual variable (or rather, "name"):
def need_to_modify(update):
    update(42)  # set new value 42
    # other code

def call_it():
    value = 21
    def update_value(new_value):
        nonlocal value
        value = new_value
    need_to_modify(update_value)
    print(value)  # prints 42
This is mostly useful for "out-only references" or in a situation with multiple threads / processes (by making the update function thread / multiprocessing safe).
Obviously the above does not allow reading the value, only updating it.
Given the way Python handles values and references to them, the only way you can reference an arbitrary instance attribute is by name:
class PassByReferenceIsh:
    def __init__(self):
        self.variable = 'Original'
        self.change('variable')
        print(self.variable)

    def change(self, var):
        self.__dict__[var] = 'Changed'
In real code you would, of course, add error checking on the dict lookup.
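One possible shape for that error checking, as a sketch only: it uses hasattr/setattr instead of touching __dict__ directly, and the extra value parameter and the error message are my own additions.

```python
class PassByReferenceIsh:
    def __init__(self):
        self.variable = 'Original'

    def change(self, name, value):
        # Refuse to create new attributes by accident (e.g. from a typo)
        if not hasattr(self, name):
            raise AttributeError(f"no attribute named {name!r}")
        setattr(self, name, value)

obj = PassByReferenceIsh()
obj.change('variable', 'Changed')
print(obj.variable)
```

Calling obj.change('varaible', 'x') with a misspelled name raises AttributeError instead of silently adding a new attribute.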
Since your example happens to be object-oriented, you could make the following change to achieve a similar result:
class PassByReference:
    def __init__(self):
        self.variable = 'Original'
        self.change('variable')
        print(self.variable)

    def change(self, var):
        setattr(self, var, 'Changed')

# o.variable will equal 'Changed'
o = PassByReference()
assert o.variable == 'Changed'
Since dictionaries are passed by reference, you can use a dict variable to store any referenced values inside it.
# returns the result of adding numbers `a` and `b`
def AddNumbers(a, b, ref):  # using a dict for reference
    result = a + b
    ref['multi'] = a * b  # reference the multi. ref['multi'] is a number
    ref['msg'] = "The result: " + str(result) + " was nice!"
    return result

number1 = 5
number2 = 10
ref = {}  # init a dict to hold all the referenced values; a dict is mutable, so changes made in the function are visible here, unlike rebinding a string or number

total = AddNumbers(number1, number2, ref)
print("sum: ", total)  # the returned value
print("multi: ", ref['multi'])  # a referenced value
print("msg: ", ref['msg'])  # a referenced value
You can simply use an instance of an empty class to store reference objects, because internally object attributes are stored in an instance dictionary. See the example.
class RefsObj(object):
    "A class which helps to create references to variables."
    pass

...

# an example of usage
def change_ref_var(ref_obj):
    ref_obj.val = 24

ref_obj = RefsObj()
ref_obj.val = 1
print(ref_obj.val)  # or print ref_obj.val for python2
change_ref_var(ref_obj)
print(ref_obj.val)
While pass by reference is nothing that fits well into Python and should be rarely used, there are some workarounds that actually can work to get the object currently assigned to a local variable or even reassign a local variable from inside of a called function.
The basic idea is to have a function that can do that access and can be passed as object into other functions or stored in a class.
One way is to use global (for global variables) or nonlocal (for local variables in a function) in a wrapper function.
def change(wrapper):
    wrapper(7)

x = 5

def setter(val):
    global x
    x = val

change(setter)  # pass the setter so that change can rebind x
print(x)  # prints 7
The same idea works for reading and deleting a variable.
For just reading, there is even a shorter way of just using lambda: x which returns a callable that when called returns the current value of x. This is somewhat like "call by name" used in languages in the distant past.
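A quick sketch of the read-only variant (read_x, before and after are illustrative names):

```python
x = 5
read_x = lambda: x   # looks up x each time it is called, not when it is defined

before = read_x()
x = 9                # rebind x after the lambda was created
after = read_x()

print(before, after)
```

The callable sees the rebinding because it captures the name x, not the object x referred to when the lambda was created.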
Passing 3 wrappers to access a variable is a bit unwieldy so those can be wrapped into a class that has a proxy attribute:
class ByRef:
    def __init__(self, r, w, d):
        self._read = r
        self._write = w
        self._delete = d

    def set(self, val):
        self._write(val)

    def get(self):
        return self._read()

    def remove(self):
        self._delete()

    wrapped = property(get, set, remove)

# Left as an exercise for the reader: define set, get, remove as local functions using global / nonlocal
r = ByRef(get, set, remove)
r.wrapped = 15
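One possible way to fill in that exercise, as a sketch only. The helper names demo, read_x, write_x and delete_x are my own, and the ByRef class is repeated so the snippet runs on its own:

```python
class ByRef:
    def __init__(self, r, w, d):
        self._read = r
        self._write = w
        self._delete = d

    def set(self, val):
        self._write(val)

    def get(self):
        return self._read()

    def remove(self):
        self._delete()

    wrapped = property(get, set, remove)


def demo():
    x = 5

    def read_x():
        return x

    def write_x(val):
        nonlocal x   # rebind the x of demo, not a new local
        x = val

    def delete_x():
        nonlocal x
        del x

    r = ByRef(read_x, write_x, delete_x)
    r.wrapped = 15           # goes through write_x and rebinds demo's x
    return x, r.wrapped      # both reflect the new value

result = demo()
print(result)
```

For a module-level variable the three helpers would use global instead of nonlocal.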
Python's "reflection" support makes it possible to get an object that is capable of reassigning a name/variable in a given scope without defining functions explicitly in that scope:
class ByRef:
    def __init__(self, locs, name):
        self._locs = locs
        self._name = name

    def set(self, val):
        self._locs[self._name] = val

    def get(self):
        return self._locs[self._name]

    def remove(self):
        del self._locs[self._name]

    wrapped = property(get, set, remove)

def change(x):
    x.wrapped = 7

def test_me():
    x = 6
    print(x)
    change(ByRef(locals(), "x"))
    print(x)
Here the ByRef class wraps a dictionary access, so attribute access to wrapped is translated to an item access in the passed dictionary. By passing the result of the builtin locals and the name of a local variable, this ends up accessing a local variable. The Python documentation as of 3.5 advises that changing the dictionary might not work, and although it seemed to work for me, do not rely on it: inside a function, CPython may not write such changes back to the actual local variable.
Pass-by-reference in Python is quite different from the concept of pass by reference in C++/Java.
Java and C#: primitive types (including string) pass by value (copy). A reference type is passed by reference (address copy), so all changes made in the parameter in the called function are visible to the caller.
C++: Both pass-by-reference or pass-by-value are allowed. If a parameter is passed by reference, you can either modify it or not depending upon whether the parameter was passed as const or not. However, const or not, the parameter maintains the reference to the object and reference cannot be assigned to point to a different object within the called function.
Python:
Python is “pass-by-object-reference”, of which it is often said: “Object references are passed by value.” (read here). Both the caller and the function refer to the same object, but the parameter in the function is a new variable which just holds a copy of the reference to the object in the caller. Like C++, a parameter can be either modified or not in the function. This depends upon the type of object passed. For example, an immutable object type cannot be modified in the called function, whereas a mutable object can be either updated or re-initialized.
A crucial difference between updating and reassigning/re-initializing a mutable variable is that an updated value is reflected back in the calling function, whereas a reinitialized value is not. Any assignment of a new object to a parameter is local to the function in Python. The examples provided by @blair-conrad are great for understanding this.
I am new to Python, started yesterday (though I have been programming for 45 years).
I came here because I was writing a function where I wanted to have two so-called out-parameters. If it would have been only one out-parameter, I wouldn't get hung up right now on checking how reference/value works in Python. I would just have used the return value of the function instead. But since I needed two such out-parameters I felt I needed to sort it out.
In this post I am going to show how I solved my situation. Perhaps others coming here can find it valuable, even though it is not exactly an answer to the topic question. Experienced Python programmers of course already know about the solution I used, but it was new to me.
From the answers here I could quickly see that Python works a bit like JavaScript in this regard, and that you need to use workarounds if you want the reference functionality.
But then I found something neat in Python that I don't think I have seen in other languages before, namely that you can return more than one value from a function, in a simple comma-separated way, like this:
def somefunction(p):
    a = p + 1
    b = p + 2
    c = -p
    return a, b, c
and that you can handle that on the calling side similarly, like this
x, y, z = somefunction(w)
That was good enough for me and I was satisfied. There isn't any need to use some workaround.
In other languages you can of course also return many values, but then usually in the form of an object, and you need to adjust the calling side accordingly.
The Python way of doing it was nice and simple.
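What actually comes back from such a function is a single tuple, which the calling side unpacks. A small sketch (min_max and the variable names are made up):

```python
def min_max(values):
    # The comma builds one tuple; no special multi-return mechanism is involved
    return min(values), max(values)

result = min_max([3, 1, 4])   # keep the tuple as a whole
low, high = result            # or unpack it into separate names

print(type(result).__name__, low, high)
```

So "returning several values" is ordinary tuple packing and unpacking, which is why it needs no extra syntax on either side.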
If you want to mimic by reference even more, you could do as follows:
def somefunction(a, b, c):
    a = a * 2
    b = b + a
    c = a * b * c
    return a, b, c

x = 3
y = 5
z = 10
print(f"Before : {x}, {y}, {z}")

x, y, z = somefunction(x, y, z)

print(f"After : {x}, {y}, {z}")
which gives this result
Before : 3, 5, 10
After : 6, 11, 660
Alternatively, you could use ctypes which would look something like this:
import ctypes
def f(a):
    a.value = 2398  # Reassign the value inside the function

a = ctypes.c_int(0)
print("pre f", a)
f(a)
print("post f", a)
This works because a is a C int rather than a Python integer, and it is apparently passed by reference. However, you have to be careful, as strange things could happen, so this approach is not advised.
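What happens here is that ctypes.c_int is itself a mutable wrapper object, so this is the same "mutate a shared holder" idea as the one-element list. A sketch that reads the wrapped number through .value (the names before and after are illustrative):

```python
import ctypes

def f(a):
    a.value = 2398   # mutates the c_int object the caller also refers to

a = ctypes.c_int(0)
before = a.value     # plain Python int read out of the wrapper
f(a)
after = a.value

print(before, after)
```

Printing a itself shows the wrapper (something like c_int(2398)); a.value gives the plain integer.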
Use dataclasses. Also, it allows you to apply type restrictions (aka "type hints").
from dataclasses import dataclass

@dataclass
class Holder:
    obj: your_type  # Need any type? Use "obj: object" then.

def foo(ref: Holder):
    ref.obj = do_something()
I agree with folks that in most cases you'd better consider not using it.
And yet, when we're talking about contexts, it's worth knowing this approach.
You can design an explicit context class though. When prototyping, I prefer dataclasses, just because it's easy to serialize them back and forth.
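A runnable version of the dataclass holder might look like this. It is a sketch that assumes obj: object as the field type and uses a fixed string in place of the do_something() placeholder:

```python
from dataclasses import dataclass

@dataclass
class Holder:
    obj: object   # any type is accepted; narrow this to get useful type hints

def foo(ref: Holder) -> None:
    # Stand-in for do_something(): replace the held value
    ref.obj = 'Changed'

holder = Holder('Original')
foo(holder)
print(holder)
```

Because a dataclass generates __repr__ and __eq__, the holder is also easy to print and compare while prototyping.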
There are already many great answers (or let's say opinions) about this and I've read them, but I want to mention a missing one. The one from Python's documentation in the FAQ section. I don't know the date of publishing this page, but this should be our true reference:
Remember that arguments are passed by assignment in Python. Since assignment just creates references to objects, there’s no alias between an argument name in the caller and callee, and so no call-by-reference per se.
If you have:
a = SOMETHING
def fn(arg):
    pass
and you call it like fn(a), you're doing exactly what you do in assignment. So this happens:
arg = a
An additional reference to SOMETHING is created. Variables are just symbols/names/references. They don't "hold" anything.