Empty list in Python class constructor causes error - python-3.x

I'm creating a simple tree where each node has any number of children in Python, and I created a Node class to help me.
Each node holds a reference to its parent node (int), and any children nodes (list).
However, using an empty list as a default argument in the Node constructor gave me strange results, and I'd love an explanation of why the behaviour differs depending on whether the empty list appears in the constructor's arguments or is created in its body:
Implementation #1:
class Node:
    def __init__(self, value, parent, children=[]):
        self.parent = parent
        self.value = value
        self.children = children
Implementation #2:
class Node:
    def __init__(self, value, parent):
        self.parent = parent
        self.value = value
        self.children = []
To populate the 'nodes' array:
parents = [4, -1, 4, 1, 1]
n = len(parents)
nodes = [None] * n
for i in range(n):
    nodes[i] = Node(i, parents[i])
To store the parent attribute of each node:
tree = Tree()
for i, node in enumerate(nodes):
    parent_id = node.parent
    if parent_id == -1:
        tree.root = nodes[i]
    else:
        nodes[parent_id].children.append(node.value)
print([(node.value, node.children) for node in nodes])
With Implementation #1 I get:
[(0, [0, 2, 3, 4]), (1, [0, 2, 3, 4]), (2, [0, 2, 3, 4]), (3, [0, 2, 3, 4]), (4, [0, 2, 3, 4])]
but with Implementation #2 I (correctly) get:
[(0, []), (1, [3, 4]), (2, []), (3, []), (4, [0, 2])]
Why the difference? I don't understand why the list is fully populated for each node even with the if and else statements.
All help appreciated, including if you think there are better ways to do this.

Default argument values are evaluated once, when the function is defined, so in your first implementation every Node created without an explicit children argument shares the same list object.
The body of the function, by contrast, runs on every call, so self.children = [] creates a new list for each instance.
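The sharing described above can be observed directly. The following is a minimal sketch; SharedNode is a hypothetical name that mirrors Implementation #1 from the question.

```python
class SharedNode:
    # Hypothetical class mirroring Implementation #1: a mutable default argument.
    def __init__(self, value, parent, children=[]):
        self.parent = parent
        self.value = value
        self.children = children


a = SharedNode(0, -1)
b = SharedNode(1, 0)

# Both instances hold the very same list object, created when the def ran.
same_object = a.children is b.children
a.children.append(99)
seen_from_b = list(b.children)

# The default is stored on the function object and has been mutated as well.
stored_default = SharedNode.__init__.__defaults__[0]

print(same_object)     # True
print(seen_from_b)     # [99]
print(stored_default)  # [99]
```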
A better approach, if you want to allow an optional children argument, is to default to None:
class Node:
    def __init__(self, value, parent, children=None):
        self.parent = parent
        self.value = value
        self.children = children or []
This uses None as the default value. The or operator selects children if the argument is truthy, and a new empty list if it is falsy.
See the Python documentation on default argument values.
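One subtlety of the or idiom is that an empty list passed in by the caller is also falsy, so it is silently replaced by a fresh list. If the caller's list should always be kept, an explicit is None test may be preferable. The sketch below compares the two; NodeOr and NodeIsNone are hypothetical names used only for this comparison.

```python
class NodeOr:
    # Hypothetical variant using the `or` idiom from the answer.
    def __init__(self, value, parent, children=None):
        self.value = value
        self.parent = parent
        self.children = children or []


class NodeIsNone:
    # Hypothetical variant that only replaces a missing (None) argument.
    def __init__(self, value, parent, children=None):
        self.value = value
        self.parent = parent
        self.children = [] if children is None else children


shared = []                      # an empty list supplied by the caller
n1 = NodeOr(0, -1, shared)
n2 = NodeIsNone(0, -1, shared)
shared.append(7)

print(n1.children)  # []   (the empty argument was falsy, so a new list was used)
print(n2.children)  # [7]  (the caller's list was kept)
```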

Related

How do I define a setter for a list with an index or slicing?

With the property and setter decorator I can define getter and setter functions. This is fine for primitives but how do I index a collection or a numpy array?
Setting values seems to work with an index, but the setter function doesn't get called. Otherwise the print function in the minimal example would be executed.
class Data:
    def __init__(self):
        self._arr = [0, 1, 2]
    @property
    def arr(self):
        return self._arr
    @arr.setter
    def arr(self, value):
        print("new value set")  # I want this to be executed
        self._arr = value
data = Data()
print(data.arr)  # prints [0, 1, 2]
data.arr[2] = 5
print(data.arr)  # prints [0, 1, 5]
If you only need this for one list on your class instance, you can define the __setitem__ and __getitem__ dunder methods on the class itself:
class Data:
    def __init__(self):
        self._arr = [0, 1, 2]
    @property
    def arr(self):
        return self._arr
    @arr.setter
    def arr(self, value):
        print("new inner list set")
        self._arr = value
    def __setitem__(self, key, value):
        print("new value set")
        self._arr[key] = value
    def __getitem__(self, key):
        return self._arr[key]
data = Data()
print(data.arr)
data[2] = 5
print(data.arr)
data.arr = [42, 43]
print(data.arr)
Output:
[0, 1, 2]
new value set # by data[2] = 5 using __setitem__
[0, 1, 5]
new inner list set # by data.arr = [42, 43] using @arr.setter
[42, 43]
This only works for one list member, though, because __setitem__ and __getitem__ operate on the class instance itself, not on the list that is a member of the instance.
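If the original data.arr[2] = 5 syntax has to trigger the hook, one option is to make the stored object itself responsible for notifying. The following is only a sketch under that assumption: NotifyingList, on_change and events are hypothetical names, and only item and slice assignment are intercepted (not append, extend and other mutators). A numpy array would need a different wrapper.

```python
class NotifyingList(list):
    """Hypothetical list subclass that calls a hook on item assignment."""

    def __init__(self, iterable=(), on_change=None):
        super().__init__(iterable)
        self._on_change = on_change

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        if self._on_change is not None:
            self._on_change(key, value)


class Data:
    def __init__(self):
        self.events = []  # records changes instead of printing them
        self._arr = NotifyingList([0, 1, 2], on_change=self._changed)

    def _changed(self, key, value):
        self.events.append((key, value))

    @property
    def arr(self):
        return self._arr

    @arr.setter
    def arr(self, value):
        # Re-wrap the new list so later item assignments keep notifying.
        self._arr = NotifyingList(value, on_change=self._changed)
        self.events.append(("replaced", list(value)))


data = Data()
data.arr[2] = 5        # goes through NotifyingList.__setitem__
data.arr = [42, 43]    # goes through the property setter
data.arr[0] = 1
print(data.arr)        # [1, 43]
print(data.events)     # [(2, 5), ('replaced', [42, 43]), (0, 1)]
```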

BFS traversal of sklearn decision tree

How do I do a breadth-first search (BFS) traversal of an sklearn decision tree?
In my code I have used the tree_ attribute of the classifier, with arrays such as tree_.feature and tree_.threshold, to understand the structure of the tree. But these arrays are laid out in depth-first (DFS) order. If I want BFS order, how should I do it?
Suppose
clf1 = DecisionTreeClassifier( max_depth = 2 )
clf1 = clf1.fit(x_train, y_train)
This is my classifier. (The figure of the resulting decision tree is missing here.)
Then I have traversed the tree using following function
import numpy as np
def encoding(clf, features):
    l1 = list()
    l2 = list()
    for i in range(len(clf.tree_.feature)):
        if clf.tree_.feature[i] >= 0:
            # Internal node: record the feature name and its threshold.
            l1.append(features[clf.tree_.feature[i]])
            l2.append(clf.tree_.threshold[i])
        else:
            # Leaf node: no feature, record the predicted class instead.
            l1.append(None)
            print(np.max(clf.tree_.value))
            l2.append(np.argmax(clf.tree_.value[i]))
    l = [l1, l2]
    return np.array(l)
and the output produced is
array([['address', 'age', None, None, 'age', None, None],
[0.5, 17.5, 2, 1, 15.5, 1, 1]], dtype=object)
Here the first array holds the feature of each node (None for a leaf), and the second holds the threshold of each feature node or the class of each leaf. But this is a DFS traversal of the tree. I want a BFS traversal; what should I do?
The above part has been answered.
What I now want to know is whether the tree can be stored in an array so that it looks like a complete binary tree, with the children of the i-th node stored at indices 2i + 1 and 2i + 2.
For the tree above, the output produced is
array([['address', 'age', None, None], [0.5, 15.5, 1, 1]], dtype=object)
but the desired output is
array([['address', None, 'age', None, None, None, None], [0.5, -1, 15.5, -1, -1, 1 , 1]], dtype=object)
If the value is None in the first array and -1 in the second, the node does not exist. So here age, the right child of address, is found at index 2 * 0 + 2 = 2, and similarly the left and right children of age are found at indices 2 * 2 + 1 = 5 and 2 * 2 + 2 = 6 respectively.
Something like this?
import numpy as np
def reformat_tree(clf):
    tree = clf.tree_
    # A complete binary tree of depth d has 2 ** (d + 1) - 1 slots.
    size = 2 ** (tree.max_depth + 1) - 1
    feature_out = np.full(size, -1, dtype=tree.feature.dtype)
    threshold_out = np.zeros(size, dtype=tree.threshold.dtype)
    # Each entry is (index in the sklearn arrays, index in the output arrays).
    stack = [(0, 0)]
    while stack:
        current_node, new_node = stack.pop()
        feature_out[new_node] = tree.feature[current_node]
        threshold_out[new_node] = tree.threshold[current_node]
        left_child = tree.children_left[current_node]
        if left_child >= 0:
            # Children are placed relative to the parent's output position.
            stack.append((left_child, 2 * new_node + 1))
        right_child = tree.children_right[current_node]
        if right_child >= 0:
            stack.append((right_child, 2 * new_node + 2))
    return feature_out, threshold_out
I can't test it on your tree since you still haven't given a way to reproduce it, but it should work.
The function returns features and thresholds in the desired format. The feature value is -1 if the node doesn't exist, and -2 if the node is a leaf.
This works by traversing the tree while keeping track of each node's position in the output array.
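The position bookkeeping can be illustrated without sklearn. In the sketch below, the children_left and children_right lists are hand-written stand-ins for the tree_ arrays of a depth-2 tree stored in DFS order (an assumption made for illustration), and to_heap_layout is a hypothetical helper name. Each node's label is simply its index in the DFS arrays.

```python
def to_heap_layout(children_left, children_right, labels, max_depth, missing=None):
    """Place each node at slot i so its children land at 2i + 1 and 2i + 2."""
    size = 2 ** (max_depth + 1) - 1          # slots in a complete binary tree
    out = [missing] * size
    stack = [(0, 0)]                         # (index in source arrays, output slot)
    while stack:
        node, slot = stack.pop()
        out[slot] = labels[node]
        if children_left[node] >= 0:         # -1 marks "no child"
            stack.append((children_left[node], 2 * slot + 1))
        if children_right[node] >= 0:
            stack.append((children_right[node], 2 * slot + 2))
    return out


# Hand-written stand-in for a 7-node, depth-2 tree stored in DFS (preorder) order.
children_left = [1, 2, -1, -1, 5, -1, -1]
children_right = [4, 3, -1, -1, 6, -1, -1]
labels = list(range(7))                      # label each node with its DFS index

layout = to_heap_layout(children_left, children_right, labels, max_depth=2)
print(layout)  # [0, 1, 4, 2, 3, 5, 6]
```

Reading the resulting array from left to right visits the nodes level by level, which is the BFS order asked about in the first part of the question.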

Multiple Assignment output is not correct

I am trying to write a program where all the odd-positioned linked list nodes are moved to the start of the linked list, followed by the even-positioned ones. What I have is below:
def oddEvenList(self, head: ListNode) -> ListNode:
    sentinel = node1 = head
    node2 = second = head.next
    while node1.next:
        node1.next, node1 = node1.next.next, node1.next
    while node2.next:
        node2.next, node2 = node2.next.next, node2.next
    node1.next = second
    return head
The definition for the linked list class is as below:
# Definition for singly-linked list.
# class ListNode:
#     def __init__(self, x):
#         self.val = x
#         self.next = None
My input is:
[1, 2, 3, 4, 5]
and my output is:
[1, 3, 5, 2]
What the output should be is:
[1, 3, 5, 2, 4]
Even though I assign second as the head of the second list, I still do not seem to add anything but the very head of the second list to the first list. I am doing the assignment here:
node2 = second = head.next
What am I doing wrong?
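Tracing the posted loops by hand on [1, 2, 3, 4, 5] suggests an explanation: the first while loop walks every node, not just the odd ones, and re-links each one to skip its successor, so when it finishes the list is already split into 1 -> 3 -> 5 and 2 -> 4. The second loop then applies the same re-linking to 2 -> 4 and cuts it down to just 2, which matches the observed output. Below is a minimal sketch of the usual two-pointer alternative. The helpers build and to_list are hypothetical and exist only to make the example runnable.

```python
class ListNode:
    def __init__(self, x):
        self.val = x
        self.next = None


def odd_even_list(head):
    """Group odd-positioned nodes first, then even-positioned ones, in place."""
    if head is None:
        return None
    odd = head
    even = even_head = head.next
    # Advance both chains together, one link at a time.
    while even is not None and even.next is not None:
        odd.next = even.next      # next odd-positioned node
        odd = odd.next
        even.next = odd.next      # next even-positioned node (may be None)
        even = even.next
    odd.next = even_head          # append the even chain after the odd chain
    return head


def build(values):
    # Hypothetical helper: build a linked list from a Python list.
    dummy = ListNode(0)
    tail = dummy
    for v in values:
        tail.next = ListNode(v)
        tail = tail.next
    return dummy.next


def to_list(head):
    # Hypothetical helper: collect node values into a Python list.
    out = []
    while head is not None:
        out.append(head.val)
        head = head.next
    return out


print(to_list(odd_even_list(build([1, 2, 3, 4, 5]))))  # [1, 3, 5, 2, 4]
```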

How to Get a List of Class Attribute

I have a list of instances of the same class, and I want to extract a certain attribute from every instance and build a new list:
class Test:
    def __init__(self, x):
        self.x = x
l = [Test(1), Test(2), Test(3), Test(4)]
Given something like that, I want to get a list whose value is [1, 2, 3, 4].
The best way to do it would probably be like this:
class Test:
    def __init__(self, x):
        self.x = x
l = [Test(1), Test(2), Test(3), Test(4)]
res = [inst.x for inst in l]  # [1, 2, 3, 4]
or just do it from the start:
l = [Test(1).x, Test(2).x, Test(3).x, Test(4).x]
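When the attribute name is only known as a string, or a map-style pipeline is preferred, operator.attrgetter from the standard library offers an equivalent. This is a small sketch; the variable names items and xs are illustrative only.

```python
from operator import attrgetter


class Test:
    def __init__(self, x):
        self.x = x


items = [Test(1), Test(2), Test(3), Test(4)]

# attrgetter("x") returns a callable that fetches the .x attribute of its argument.
xs = list(map(attrgetter("x"), items))
print(xs)  # [1, 2, 3, 4]
```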

get self. values from class B to variables in function in class A, and then change the variable values without changing self. values in class B

I want to take the values stored in class B (self.array_B) and assign them to a variable (array_A) inside the "step" function of class A. However, after I set the values of array_A to zero in class A, the values of self.array_B are also changed to zero, which is unexpected. self.array_B should remain the same after I change the variable in class A. Is there any way to solve this?
import numpy as np
class A:
    def __init__(self):
        self.B = B()
    def step(self):
        print("array_B:", self.B.array_B)
        array_A = np.zeros((2, 2))
        array_A = self.B.array_B
        print("array_A:", array_A)
        for i in range(2):
            for j in range(2):
                array_A[i][j] = 0
        print("------------------")
        print("after changing variable value:array_B:", self.B.array_B)
        print("after changing variable value:array_A:", array_A)
        return "done"
class B:
    def __init__(self):
        self.array_B = [[1, 2], [3, 4]]
def test_main():
    env = A()
    s = env.step()
    print(s)
if __name__ == "__main__":
    test_main()
output:
array_B: [[1, 2], [3, 4]]
array_A: [[1, 2], [3, 4]]
------------------
after changing variable value:array_B: [[0, 0], [0, 0]]
after changing variable value:array_A: [[0, 0], [0, 0]]
done
When assigning the list here:
array_A = self.B.array_B
you are only copying a reference to the original list. So array_A and self.B.array_B actually refer to the same list, and any change to the list is visible through both names.
You can copy the list itself instead by using:
array_A = self.B.array_B.copy()
Now array_A and self.B.array_B refer to different outer lists, which can be changed independently.
If the list itself contains mutable objects (as it does here, where each element is an inner list), a simple copy() is not enough. Both lists will still contain references to the same mutable objects inside. In this case, a deepcopy() is needed, which also makes a copy of all elements inside the list:
import copy
array_A = copy.deepcopy(self.B.array_B)
This is quite an expensive operation and should only be used when needed.
Just assign a copy of array_B to array_A:
array_A = self.B.array_B.copy()
This is needed because assigning array_B to array_A binds the name array_A to the same object rather than to a copy of its value. Calling copy() creates a new list before the assignment. Note that copy() is shallow, so for a nested list such as [[1, 2], [3, 4]] the inner lists are still shared.
The solution here would be to use deepcopy.
Why does this happen?
Lists are mutable objects, and plain assignment does not copy them, which means that array_A still points to the same object in memory as array_B.
If you had worked with a list of immutable values (like integers), a simple
array_A = list(self.B.array_B)
or even
array_A = self.B.array_B[:]
would do the trick, because it would force Python to instantiate a new list.
But here the items of array_B are themselves lists, so if you did this:
array_A = list(self.B.array_B)
array_A[0][0] = 3
print(self.B.array_B[0][0] == 3)  # prints True
the original would still change, because both outer lists share the same inner lists.
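The difference between an alias, a shallow copy and a deep copy can be checked side by side. This is a minimal sketch using a plain nested list like the one in the question; the variable names are illustrative.

```python
import copy

original = [[1, 2], [3, 4]]

alias = original                  # same object, no copy at all
shallow = original.copy()         # new outer list, shared inner lists
deep = copy.deepcopy(original)    # new outer list and new inner lists

deep[0][0] = 0                    # affects only the deep copy
print(original)                   # [[1, 2], [3, 4]]

shallow[1][1] = 0                 # inner list is shared, so original changes
print(original)                   # [[1, 2], [3, 0]]

shallow.append([5, 6])            # outer list is separate, original keeps 2 rows
print(len(original))              # 2

print(alias is original)          # True
```

For a numeric NumPy array, as opposed to a list of lists, ndarray.copy() already copies the underlying data, so deepcopy is not needed in that case.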
