Why/how to determine when a function overwrites a local variable in Julia? - scope

I am relatively new to Julia, and working on porting over some C functions to check the speed difference. One thing I'm struggling with is the scope of variables. Specifically, sometimes a function call in Julia overwrites a local variable, and other times not. For example, here's a function to calculate a minimum spanning tree:
function mst(my_X::Array{Float64})
    n = size(my_X)[1]
    N = zeros(Int16,n,n)
    tree = []
    lv = maximum(my_X)+1
    my_X[diagind(my_X)] .=lv
    indexi = 1
    for ijk in 1:(n-1)
        tree = vcat(tree, indexi)
        m = minimum(my_X[:,tree],dims = 1)
        a = zeros(Int64, length(tree))
        print(tree)
        for k in 1:length(tree)
            a[k] = sortperm(my_X[:,tree[k]])[1,]
        end
        b = sortperm(vec(m))[1]
        indexj = tree[b]
        indexi = a[b]
        N[indexi,indexj] = 1
        N[indexj,indexi] = 1
        for j in tree
            my_X[indexi,j] = lv
            my_X[j,indexi] = lv
        end
    end
    return N
end
Now we can apply this to a distance matrix X:
julia> X
5×5 Array{Float64,2}:
0.0 0.54 1.08 1.12 0.95
0.54 0.0 0.84 0.67 1.05
1.08 0.84 0.0 0.86 1.14
1.12 0.67 0.86 0.0 1.2
0.95 1.05 1.14 1.2 0.0
But when I do so, it overwrites all of the entries of X
julia> M = mst(X)
julia> M
5×5 Array{Int16,2}:
0 1 0 0 1
1 0 1 1 0
0 1 0 0 0
0 1 0 0 0
1 0 0 0 0
julia> X
5×5 Array{Float64,2}:
2.2 2.2 2.2 2.2 2.2
2.2 2.2 2.2 2.2 2.2
2.2 2.2 2.2 2.2 2.2
2.2 2.2 2.2 2.2 2.2
2.2 2.2 2.2 2.2 2.2
Of course I can override this if I explicitly put something like this in the function:
function mst(my_Z::Array{Float64})
my_X = copy(my_Z)
.
.
.
But it seems like the issue is deeper than this. For example, if I try to replicate this in a simple example I can't recreate the issue:
function add_one(my_X::Int64)
    my_X = my_X + 1
    return my_X
end
julia> Z = 1
julia> W = add_one(Z)
julia> W
2
julia> Z
1
What is going on here?? I've read and re-read the julia help docs on variable scopes and I cannot figure out what the distinction is.

There are the following inter-related issues here:
Values in Julia can be either mutable or immutable.
A variable in Julia is bound to a value (which can be either immutable or mutable).
Some operations can modify a mutable value.
So the first point is about mutability vs immutability of values. The discussion in the Julia manual is given here. You can check if a value is mutable or not using the isimmutable function.
Typical cases are the following:
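numbers, strings, Tuple, NamedTuple, structs are immutable (note that isimmutable reports false for a String for technical reasons, even though a string cannot be mutated in place):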
julia> isimmutable(1)
true
julia> isimmutable("sdaf")
false
julia> isimmutable((1,2,3))
true
Arrays, dicts, mutable structs etc. (in general container types other than Tuple, NamedTuple and structs) are mutable:
julia> isimmutable([1,2,3])
false
julia> isimmutable(Dict(1=>2))
false
The key difference between immutable and mutable values is that mutable values can have their contents modified. Here is a simple example:
julia> x = [1,2,3]
3-element Array{Int64,1}:
1
2
3
julia> x[1] = 10
10
julia> x
3-element Array{Int64,1}:
10
2
3
Now let us dissect what we have seen here:
the assignment statement x = [1, 2, 3] binds the value (in this case a vector) to a variable x
the statement x[1] = 10 mutates the value (a vector) in place
Note that the same would fail for a Tuple as it is immutable:
julia> x = (1,2,3)
(1, 2, 3)
julia> x[1] = 10
ERROR: MethodError: no method matching setindex!(::Tuple{Int64,Int64,Int64}, ::Int64, ::Int64)
Now we come to the second point: binding a value to a variable name. This is typically done using the = operator when its left-hand side is a variable name, as above with x = [1,2,3] or x = (1,2,3).
Note that in particular also += (and similar) are doing rebinding, e.g.:
julia> x = [1, 2, 3]
3-element Array{Int64,1}:
1
2
3
julia> y = x
3-element Array{Int64,1}:
1
2
3
julia> x += [1,2,3]
3-element Array{Int64,1}:
2
4
6
julia> x
3-element Array{Int64,1}:
2
4
6
julia> y
3-element Array{Int64,1}:
1
2
3
as in this case it is just a shorthand of x = x + [1, 2, 3], and we know that = rebinds.
In particular (as @pszufe noted in the comment) if you pass a value to a function nothing is copied. What happens here is that a variable which is in the function signature is bound to the passed value (this kind of behavior is sometimes called pass by sharing). So you have:
julia> x = [1,2,3]
3-element Array{Int64,1}:
1
2
3
julia> f(y) = y
f (generic function with 1 method)
julia> f(x) === x
true
Essentially what happens is "as if" you had written y = x. The difference is that the function creates a variable y in a new scope (the scope of the function), while y = x would bind the value that x is bound to to the variable y in the scope where the statement y = x appears.
Now on the other hand things like x[1] = 10 (which is essentially a setindex! function application) or x .= [1,2,3] are in-place operations (they do not rebind a value but try to mutate the container). So this works in-place (note that in the example I combine broadcasting with += to make it in place):
julia> x = [1,2,3]
3-element Array{Int64,1}:
1
2
3
julia> y = x
3-element Array{Int64,1}:
1
2
3
julia> x .+= [1,2,3]
3-element Array{Int64,1}:
2
4
6
julia> y
3-element Array{Int64,1}:
2
4
6
but if we tried to do the same with e.g. an integer, which is immutable, the operation would fail:
julia> x = 10
10
julia> x .+= 1
ERROR: MethodError: no method matching copyto!(::Int64, ::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0},Tuple{},typeof(+),Tuple{Int64,Int64}})
The same with setting index for an immutable value:
julia> x = 10
10
julia> x[] = 1
ERROR: MethodError: no method matching setindex!(::Int64, ::Int64)
Finally the third thing is which operations try to mutate the value in-place. We have noted already some of them (like setindex!: x[10] = 10 and broadcasting assignment x .= [1,2,3]). In general it is not always easy to decide if calling f(x) will mutate x if f is some general function (it may or it may not mutate x if x is mutable). Therefore in Julia there is a convention to add ! at the end of names of functions that may mutate their arguments to visually signal this (it should be stressed that this is a convention only - in particular just adding ! at the end of the name of the function has no direct influence on how it works). We have already seen this with setindex! (for which a shorthand is x[1] = 10 as discussed), but here is a different example:
julia> x = [1, 2, 3]
3-element Array{Int64,1}:
1
2
3
julia> filter(==(1), x) # no ! so a new vector is created
1-element Array{Int64,1}:
1
julia> x
3-element Array{Int64,1}:
1
2
3
julia> filter!(==(1), x) # ! so x is mutated in place
1-element Array{Int64,1}:
1
julia> x
1-element Array{Int64,1}:
1
If you use a function (like setindex!) that mutates its argument and want to avoid mutation use copy when passing an argument to it (or deepcopy if your structure is multiply nested and potentially mutation can happen on a deeper level - but this is rare).
So in our example:
julia> x = [1,2,3]
3-element Array{Int64,1}:
1
2
3
julia> y = filter!(==(1), copy(x))
1-element Array{Int64,1}:
1
julia> y
1-element Array{Int64,1}:
1
julia> x
3-element Array{Int64,1}:
1
2
3

Related

How to understand this simultaneous assignment evaluation in python 3? [duplicate]

This question already has answers here:
Multiple assignment and evaluation order in Python
(11 answers)
Closed 3 years ago.
I have recently started learning python. I was playing around with simultaneous assignments today and came across some results that were produced by my code that I cannot understand.
x, y = 3, 5
x, y = y, (x+y)
print(y)
The output is 8.
I am not understanding why y = 8 instead of y = 10 despite x = y = 5 being evaluated first. Since y = 8, this tells us that x = 3 when y = x + y is being evaluated. How is that possible if x = y is evaluated first, given that evaluation goes left to right?
I first debugged the code (which produced the same result), which I cannot understand. I also have tried looking at the python documentation, where it states "simultaneous overlaps within the collection of assigned-to variables occur left-to-right, sometimes resulting in confusion."
The example that it presents follows my logic:
x = [0, 1]
i = 0
i, x[i] = 1, 2
print(x)
It outputs [0, 2].
(Evaluated left to right.) Since i is updated to 1 first, this updated i is then used, and hence x[1] = 2, not x[0] = 2.
I really would appreciate some help.
I am not understanding why y = 8 instead of y = 10 despite x = y = 5 being evaluated first
The right side of an assignment is evaluated first.
In your assignment with x = 3 and y = 5
x, y = y, (x+y)
the right side is evaluated to a tuple (5, 8) first and then assigned to the names on the left side. Therefore y is 8.
You could also think of it as
x, y = 3, 5
temp = y, x + y
x, y = temp
To see what really happens internally, you can disassemble your code:
>>> import dis
>>> def f(x, y):
...     x, y = y, x + y
...
>>> dis.dis(f)
Outputs
  2           0 LOAD_FAST                1 (y)
              2 LOAD_FAST                0 (x)
              4 LOAD_FAST                1 (y)
              6 BINARY_ADD
              8 ROT_TWO
             10 STORE_FAST               0 (x)
             12 STORE_FAST               1 (y)
             14 LOAD_CONST               0 (None)
             16 RETURN_VALUE
As you can see, the addition is performed before the assignment.
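The two cases from the question can be checked side by side. The following is a minimal sketch; the function names `simultaneous` and `left_to_right_targets` are made up for illustration and are not part of the original question or answer.

```python
def simultaneous(x, y):
    # The right-hand side is packed into a tuple using the OLD x and y.
    temp = (y, x + y)
    # Only afterwards are the targets on the left assigned, left to right.
    x, y = temp
    return x, y


def left_to_right_targets():
    x = [0, 1]
    i = 0
    # The tuple (1, 2) is built first; then i = 1 is assigned,
    # and x[i] = 2 already sees the updated i.
    i, x[i] = 1, 2
    return i, x


print(simultaneous(3, 5))       # (5, 8)
print(left_to_right_targets())  # (1, [0, 2])
```

In the first function nothing on the left is rebound until the whole right-hand side has been computed, which is why y ends up as 8. In the second, the overlap is between the targets themselves, so the left-to-right order of the assignments becomes visible.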
Python evaluates the right-hand side first, then assigns to the left-hand side from left to right. You can imagine that Python pushes its operations onto a stack-like data structure.
Looking at the first assignment: x, y = 3, 5
Python will first push the right-hand side value onto the stack as a tuple.
It then runs an unpacking sequence for n values from the stack, and puts the values back onto the stack right-to-left: "push 5 to the stack, then 3". Current stack = [3, 5]
Finished with the right-hand side, Python assigns values to the left-hand side left-to-right, removing the top of the stack each time. So it will first take 3 and store it in variable x, then 5 and store it in variable y.
You can inspect the operations python does in byte code using the dis module.
The following assignments:
x, y = 3, 5
x, y = y, (x + y)
Produce the following operations:
You can inspect the bytecode operations here: http://pyspanishdoc.sourceforge.net/lib/bytecodes.html

Changing multiple variables using other variables Python

My question is about Python variables. I have a bunch of variables such as p1, p2 and p3. If I wanted to make a loop that let me change all of them at once, how would I do that? Here is what I got so far.
p1 = 0
p2 = 0
p3 = 0
p4 = 0
p5 = 0
p6 = 0
p7 = 0
p8 = 0
p9 = 0
p10 = 0
x = 10
while(x < 0):
    p+str(x) = p+str(x) + 1
    x - 1
This code should change 10 variables called p1, p2, p3 (etc.) by 1 each.
You cannot build a variable name by adding a string to it like that; instead you might want to use an object, a list, or a dictionary. What you are trying to do can be done very easily with a dictionary. The code below shows how you can implement your code with a dictionary.
dictionary = {"p" + str(i): 0 for i in range(1, 11)}  # holds p1, p2, ..., p10, each starting at 0
x = 10
while x > 0:
    dictionary["p" + str(x)] = dictionary["p" + str(x)] + 1
    x -= 1
You may want to research more about this subject as this is just a quick example of how to use a dictionary in python
>>> p1=0
>>> p2=0
>>> p3=0
>>> x=3
>>> while(x>0):
...     exec("%s%d += 1"%("p",x))
...     x = x -1
...
>>> p1
1
>>> p2
1
>>> p3
1
Although this does the job, don't follow this approach (it is a really bad way). Instead, look for an alternative such as lists or dicts.
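As a minimal sketch of the list alternative mentioned above: the ten separate names are replaced by one list, so "change all of them at once" becomes an ordinary loop or comprehension. The helper name `increment_all` is hypothetical and only serves the example.

```python
def increment_all(counters):
    """Return a new list with every counter increased by 1."""
    return [c + 1 for c in counters]


p = [0] * 10          # stands in for p1 .. p10; p[0] plays the role of p1
p = increment_all(p)
print(p)              # [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
```

Individual values remain reachable by index (p[0] for the former p1), and the number of counters can change without touching the loop.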

Round a number to a given set of values [duplicate]

This question already has answers here:
From list of integers, get number closest to a given value
(10 answers)
Closed 5 years ago.
Talking Python 3 here.
I'm looking to round a number to a given set of values which can vary
Assume value_set = [x, y, z] and for the sake of the example x, y, z = 1, 3.12, 4 I'm looking for a function that will round a given float to the closest number
custom_round(0) --> 1
custom_round(2.7) --> 3.12
Notice that it should be generic enough that value_set length will vary also
You can use the min function in order to find the minimum in your list when the key is the absolute value of x-n (x is each item in the list).
value_set = [1, 3.12, 4]
def return_closest(n):
    return min(value_set, key=lambda x:abs(x-n))
number_to_check = 3
print (return_closest(number_to_check))
>>> 3.12
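Since the question asks for something generic over a value set whose length can vary, the same min-with-key idea can take the set as a parameter instead of reading a global. This is a small sketch; the name `closest` and its argument order are assumptions made for the example.

```python
def closest(n, value_set):
    """Return the element of value_set nearest to n.

    On a tie, min() keeps the first candidate in iteration order.
    """
    return min(value_set, key=lambda v: abs(v - n))


print(closest(0, [1, 3.12, 4]))    # 1
print(closest(2.7, [1, 3.12, 4]))  # 3.12
```

Each call scans the whole set, so the cost is linear in the number of candidate values.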
You can do this by first sorting the list, and then using binary search:
from bisect import bisect_left
class CustomRound:
    def __init__(self,iterable):
        self.data = sorted(iterable)
    def __call__(self,x):
        data = self.data
        ndata = len(data)
        idx = bisect_left(data,x)
        if idx <= 0:
            return data[0]
        elif idx >= ndata:
            return data[ndata-1]
        x0 = data[idx-1]
        x1 = data[idx]
        if abs(x-x0) < abs(x-x1):
            return x0
        return x1
You can then construct your CustomRound like:
values = [1,3.12,4]
custom_round = CustomRound(values)
and simply call it:
>>> custom_round(0)
1
>>> custom_round(0.5)
1
>>> custom_round(1.5)
1
>>> custom_round(2.5)
3.12
>>> custom_round(3.12)
3.12
>>> custom_round(3.9)
4
>>> custom_round(4.1)
4
>>> custom_round(4.99)
4
This approach will work in O(log n) for rounding and O(n log n) for construction. So you will invest some additional time to construct the custom_round, but if you call it often, it will eventually pay off in rounding individual numbers.

Get length of range in Python List Comprehension

I wonder if it is possible to get the length of the range in a list comprehension in Python 3 in order to set up a conditional like the one below. This code doesn't work:
b = [x**2 for x in range(10) if x % 2 == 0 and x > len/2]
>>> n = 10
>>> b = [x**2 for x in range(n) if x % 2 == 0 and x > n/2]
>>> b
[36, 64]
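If the length should come from the range itself rather than from a separate n, the range can be bound to a name first, because range objects in Python 3 support len(). A minimal sketch, with the function name `squares_of_upper_evens` invented for illustration:

```python
def squares_of_upper_evens(n):
    r = range(n)  # a range object knows its own length
    return [x**2 for x in r if x % 2 == 0 and x > len(r) / 2]


print(squares_of_upper_evens(10))  # [36, 64]
```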

String concatenation queries

I have a list of characters, say x in number, denoted by b[1], b[2], b[3] ... b[x]. After x,
b[x+1] is the concatenation of b[1],b[2].... b[x] in that order. Similarly,
b[x+2] is the concatenation of b[2],b[3]....b[x],b[x+1].
So, basically, b[n] will be the concatenation of the last x terms of b[i], taken from left to right.
Given parameters as p and q as queries, how can I find out which character among b[1], b[2], b[3]..... b[x] does the qth character of b[p] corresponds to?
Note: x and b[1], b[2], b[3]..... b[x] is fixed for all queries.
I tried brute-forcing, but the string length increases exponentially for large x (x <= 100).
Example:
When x=3,
b[] = a, b, c, a b c, b c abc, c abc bcabc, abc bcabc cabcbcabc, //....
//Spaces for clarity, only commas separate array elements
So for a query where p=7, q=5, answer returned would be 3(corresponding to character 'c').
I am just having difficulty figuring out the maths behind it. Language is no issue
I wrote this answer as I figured it out, so please bear with me.
As you mentioned, it is much easier to find out where the character at b[p][q] comes from among the original x characters than to generate b[p] for large p. To do so, we will use a loop to find where the current b[p][q] came from, thereby reducing p until it is between 1 and x, and q until it is 1.
Let's look at an example for x=3 to see if we can get a formula:
p  N(p)  b[p]
-  ----  ----
1     1  a
2     1  b
3     1  c
4     3  a b c
5     5  b c abc
6     9  c abc bcabc
7    17  abc bcabc cabcbcabc
8    31  bcabc cabcbcabc abcbcabccabcbcabc
9    57  cabcbcabc abcbcabccabcbcabc bcabccabcbcabcabcbcabccabcbcabc
The sequence is clear: for x=3, N(p) = N(p-1) + N(p-2) + N(p-3), where N(p) is the number of characters in the pth element of b. In general, N(p) is the sum of the previous x values. Given p and x, you can just brute-force compute all the N for the range [1, p]. This will allow you to figure out which prior element of b the character b[p][q] came from.
To illustrate, say x=3, p=9 and q=45.
The chart above gives N(6)=9, N(7)=17 and N(8)=31. Since 45>9+17, you know that b[9][45] comes from b[8][45-(9+17)] = b[8][19].
Continuing iteratively/recursively, 19>9+5, so b[8][19] = b[7][19-(9+5)] = b[7][5].
Now 5>N(4) but 5<N(4)+N(5), so b[7][5] = b[5][5-3] = b[5][2].
b[5][2] = b[3][2-1] = b[3][1]
Since 3 <= x, we have our termination condition, and b[9][45] is c from b[3].
Something like this can very easily be computed either recursively or iteratively given starting p, q, x and b up to x. My method requires p array elements to compute N(p) for the entire sequence. This can be allocated in an array or on the stack if working recursively.
Here is a reference implementation in vanilla Python (no external imports, although numpy would probably help streamline this):
def so38509640(b, p, q):
    """
    p, q are integers. b is a char sequence of length x.
    list, string, or tuple are all valid choices for b.
    """
    x = len(b)
    # Trivial case
    if p <= x:
        if q != 1:
            raise ValueError('q={} out of bounds for p={}'.format(q, p))
        return p, b[p - 1]
    # Construct list of counts
    N = [1] * p
    for i in range(x, p):
        N[i] = sum(N[i - x:i])
    print('N =', N)
    # Error check
    if q > N[-1]:
        raise ValueError('q={} out of bounds for p={}'.format(q, p))
    print('b[{}][{}]'.format(p, q), end='')
    # Reduce p, q until p <= x
    while p > x:
        # Find which previous element character q comes from
        offset = 0
        for i in range(p - x - 1, p):
            if i == p - 1:
                raise ValueError('q={} out of bounds for p={}'.format(q, p))
            if offset + N[i] >= q:
                q -= offset
                p = i + 1
                print(' = b[{}][{}]'.format(p, q), end='')
                break
            offset += N[i]
    print()
    return p, b[p - 1]
Calling so38509640('abc', 9, 45) produces
N = [1, 1, 1, 3, 5, 9, 17, 31, 57]
b[9][45] = b[8][19] = b[7][5] = b[5][2] = b[3][1]
(3, 'c') # <-- Final answer
Similarly, for the example in the question, so38509640('abc', 7, 5) produces the expected result:
N = [1, 1, 1, 3, 5, 9, 17]
b[7][5] = b[5][2] = b[3][1]
(3, 'c') # <-- Final answer
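As a sanity check on the two results above, a brute-force version can build every element explicitly, storing origin indices instead of characters. This is only a sketch for small p, since the element lengths grow exponentially; the function name `brute_force_origin` is made up for illustration.

```python
def brute_force_origin(b, p, q):
    """Return (origin index, character) for the q-th character of b[p].

    Both p and q are 1-based, matching the question's notation.
    """
    x = len(b)
    # Each element is a list of 1-based indices into the original x characters.
    elems = [[i + 1] for i in range(x)]
    while len(elems) < p:
        nxt = []
        for e in elems[-x:]:   # concatenate the last x elements, left to right
            nxt.extend(e)
        elems.append(nxt)
    origin = elems[p - 1][q - 1]
    return origin, b[origin - 1]


print(brute_force_origin('abc', 7, 5))   # (3, 'c')
print(brute_force_origin('abc', 9, 45))  # (3, 'c')
```

Both calls agree with the output of the index-reduction approach, which is the one to use when p is large.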
Sorry I couldn't come up with a better function name :) This is simple enough code that it should work equally well in Py2 and 3, despite differences in the range function/class.
I would be very curious to see if there is a non-iterative solution for this problem. Perhaps there is a way of doing this using modular arithmetic or something...

Resources