Find all possible modifications to a graph - python-3.x

I am now using lists to represent the graph, much as in my previous question. I found that the dict approach would be very long and complex, so I decided to go with the list approach. But I am still facing a few roadblocks.
So for example, the graph:
is now represented as:
nodes = ["1", "2", "3", "4", "5"]
edges = [
[0, 2, 1, 2, 0],
[1, 0, 1, 0, 0],
[0, 2, 0, 0, 0],
[1, 0, 1, 0, 2],
[1, 2, 0, 0, 0],
]
Here, edge weights can only be 1 or 2, and 0 means there is no edge from one node to the other. The edges are directed: each row of the matrix holds the edges coming into that node, so edges[i][j] is the weight of the edge from node j to node i.
Similar to the last question, I want all possible two-edge modifications on the graph. So, for example, if we add an edge from node "4" to "5" with weight of 1, and remove the edge with weight 1 coming from node "1" to "4", the new graph will look like:
edges = [
[0, 2, 1, 2, 0],
[1, 0, 1, 0, 0],
[0, 2, 0, 0, 0],
[0, 0, 1, 0, 2],
[1, 2, 0, 1, 0],
]
and this is one of the possible modifications.
I want to build a generator that can create all such modifications sequentially and pass it to me so that I can use them to test.
My code so far is like this:
def all_modification_generation(graph: list[list], iter_count: int = 0):
    possible_weights = {-1, 0, 1}
    node_len = len(graph)
    for i in range(node_len**2):
        ix_x = i // node_len
        ix_y = i % node_len
        if i == ix_y:
            continue
        for possible_pertubs in possible_weights - {graph[ix_x][ix_y]}:
            graph[ix_x][ix_y] = possible_pertubs
            if iter_count == 0:
                all_modification_generation(graph=graph, iter_count=iter_count + 1)
            else:
                yield all_modification_generation(graph=graph)
My logic is, once I do one change, I can then loop over all other elements that come after it in the matrix. So this problem could be solved recursively. And once a node is explored, we do not need to take it into consideration for next loops, because it will just give us a duplicate result that we have already found. And because I need to check for 2 modifications, I am increasing iter_count after first iteration and then yielding the next time. I am skipping ix_x == ix_y cases because a self-looping edge does not make any sense in this context, so that change is not required to be recorded.
But even then, this does not output any result. What am I doing wrong? Any help is appreciated, thanks!
Edit: I think I have figured out a way to do the double modification without repetitive generation of modified matrices. Now the only problem is that there is quite a bit of code repetition and a 4-level nested for-loop.
I'm not sure how to call a generator recursively, but I feel that should be the way to go! Thanks J_H for pointing me to the right direction.
The working code is:
from copy import deepcopy

def all_modification_generation(graph: list[list]):
    possible_weights = {-1, 0, 1}
    node_len = len(graph)
    for i in range(node_len**2):
        ix_x1 = i // node_len
        ix_y1 = i % node_len
        if ix_x1 == ix_y1:
            continue
        for possible_pertubs in possible_weights - {graph[ix_x1][ix_y1]}:
            cc1_graph = deepcopy(graph)
            cc1_graph[ix_x1][ix_y1] = possible_pertubs
            # Only visit cells after i, so each pair of cells is tried once.
            for j in range(i + 1, node_len**2):
                ix_x2 = j // node_len
                ix_y2 = j % node_len
                if ix_x2 == ix_y2:
                    continue
                for possible_perturbs2 in possible_weights - {cc1_graph[ix_x2][ix_y2]}:
                    cc2_graph = deepcopy(cc1_graph)
                    cc2_graph[ix_x2][ix_y2] = possible_perturbs2
                    yield cc2_graph

The quadratic looping is an interesting technique.
We do wind up with quite a few repeated
division results, from // node_len, but that's fine.
I had a "base + edits" datastructure in mind for this problem.
Converting array to list-of-lists would be straightforward.
After overhead, a 5-node graph consumes 25 bytes -- pretty compact.
SciPy (scipy.sparse) offers good support for several styles of sparse
matrices, should that become of interest.
from typing import Generator, Optional

import numpy as np


class GraphEdit:
    """A digraph with many base edge weights plus a handful of edited weights."""

    def __init__(self, edge: np.ndarray, edit: Optional[dict] = None):
        a, b = edge.shape
        assert a == b, f"Expected square matrix, got {a}x{b}"
        self.edge = edge  # We treat these as immutable weights.
        self.edit = edit or {}

    @property
    def num_nodes(self):
        return len(self.edge)

    def __getitem__(self, item):
        return self.edit.get(item, self.edge[item])

    def __setitem__(self, item, value):
        self.edit[item] = value


def as_array(g: GraphEdit) -> np.ndarray:
    return np.array([[g[i, j] for j in range(g.num_nodes)] for i in range(g.num_nodes)])


def all_single_mods(g: GraphEdit) -> Generator[GraphEdit, None, None]:
    """Generates all possible single-edge modifications to the graph."""
    orig_edit = g.edit.copy()
    for i in range(g.num_nodes):
        for j in range(g.num_nodes):
            if i == j:  # not an edge -- we don't support self-loops
                continue
            valid_weights = {0, 1, 2} - {g[i, j]}
            for w in sorted(valid_weights):
                yield GraphEdit(g.edge, {**orig_edit, (i, j): w})


def all_mods(g: GraphEdit, depth: int) -> Generator[GraphEdit, None, None]:
    assert depth >= 1
    if depth == 1:
        yield from all_single_mods(g)
    else:
        for gm in all_single_mods(g):
            yield from all_mods(gm, depth - 1)


def all_double_mods(g: GraphEdit) -> Generator[GraphEdit, None, None]:
    """Generates all possible double-edge modifications to the graph."""
    yield from all_mods(g, 2)
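The recursion in all_mods depends on yield from. A bare call to a generator function only builds a generator object and runs none of its body, which appears to be why the first attempt in the question produced nothing. A minimal sketch of the difference, using two hypothetical toy generators that are not part of the code above:

```python
def countdown_broken(n):
    """Recursive call is made but never iterated, so its values are lost."""
    if n == 0:
        return
    yield n
    countdown_broken(n - 1)  # creates a generator object and discards it


def countdown(n):
    """Same recursion, but the inner generator's values are re-yielded."""
    if n == 0:
        return
    yield n
    yield from countdown(n - 1)


print(list(countdown_broken(3)))  # [3]
print(list(countdown(3)))         # [3, 2, 1]
```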
Here's the associated test suite.
import unittest

from numpy.testing import assert_array_equal
import numpy as np

from .graph_edit import GraphEdit, all_double_mods, all_mods, all_single_mods, as_array


class GraphEditTest(unittest.TestCase):
    def setUp(self):
        self.g = GraphEdit(
            np.array(
                [
                    [0, 2, 1, 2, 0],
                    [1, 0, 1, 0, 0],
                    [0, 2, 0, 0, 0],
                    [1, 0, 1, 0, 2],
                    [1, 2, 0, 0, 0],
                ],
                dtype=np.uint8,
            )
        )

    def test_graph_edit(self):
        g = self.g
        self.assertEqual(5, self.g.num_nodes)
        self.assertEqual(2, g[0, 1])
        g[0, 1] = 3
        self.assertEqual(3, g[0, 1])
        del g.edit[(0, 1)]
        self.assertEqual(2, g[0, 1])

    def test_non_square(self):
        with self.assertRaises(AssertionError):
            GraphEdit(np.array([[0, 0], [1, 1], [2, 2]]))

    def test_all_single_mods(self):
        g = GraphEdit(np.array([[0, 0], [1, 0]]))
        self.assertEqual(4, len(list(all_single_mods(g))))
        expected = [
            np.array([[0, 1], [1, 0]]),
            np.array([[0, 2], [1, 0]]),
            np.array([[0, 0], [0, 0]]),
            np.array([[0, 0], [2, 0]]),
        ]
        for ex, actual in zip(
            expected,
            map(as_array, all_single_mods(g)),
        ):
            assert_array_equal(ex, actual)
        # Now verify that original graph is untouched.
        assert_array_equal(
            np.array([[0, 0], [1, 0]]),
            as_array(g),
        )

    def test_all_double_mods(self):
        g = GraphEdit(np.array([[0, 0], [1, 0]]))
        self.assertEqual(16, len(list(all_double_mods(g))))
        expected = [
            np.array([[0, 0], [1, 0]]),
            np.array([[0, 2], [1, 0]]),
            np.array([[0, 1], [0, 0]]),
            np.array([[0, 1], [2, 0]]),
            np.array([[0, 0], [1, 0]]),  # note the duplicate
            np.array([[0, 1], [1, 0]]),
            np.array([[0, 2], [0, 0]]),  # and it continues on in this vein
        ]
        for ex, actual in zip(
            expected,
            map(as_array, all_double_mods(g)),
        ):
            assert_array_equal(ex, actual)

    def test_many_mods(self):
        self.assertEqual(40, len(list(all_single_mods(self.g))))
        self.assertEqual(1_600, len(list(all_double_mods(self.g))))
        self.assertEqual(1_600, len(list(all_mods(self.g, 2))))
        self.assertEqual(64_000, len(list(all_mods(self.g, 3))))
        self.assertEqual(2_560_000, len(list(all_mods(self.g, 4))))
One could quibble about the fact that
it produces duplicates, since inner and outer loops
know nothing of one another.
It feels like this algorithm wants to use an
itertools.combinations
approach, generating all modifications in lexicographic order.
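One way the itertools.combinations idea could look is sketched below. It works on plain lists of lists rather than GraphEdit, and the function name all_k_mods and its parameters are made up for this illustration. Each set of k distinct off-diagonal cells is chosen once, so no graph is produced twice and none equals the original.

```python
from itertools import combinations, product


def all_k_mods(graph, k=2, weights=(0, 1, 2)):
    """Yield every graph that differs from `graph` in exactly k off-diagonal cells."""
    n = len(graph)
    cells = [(i, j) for i in range(n) for j in range(n) if i != j]
    for chosen in combinations(cells, k):
        # For each chosen cell, the weights it may change to (anything but its current one).
        alternatives = [[w for w in weights if w != graph[i][j]] for i, j in chosen]
        for new_weights in product(*alternatives):
            modified = [row[:] for row in graph]  # fresh copy per result
            for (i, j), w in zip(chosen, new_weights):
                modified[i][j] = w
            yield modified


edges = [
    [0, 2, 1, 2, 0],
    [1, 0, 1, 0, 0],
    [0, 2, 0, 0, 0],
    [1, 0, 1, 0, 2],
    [1, 2, 0, 0, 0],
]
mods = list(all_k_mods(edges, k=2))
print(len(mods))  # 760
```

With 20 off-diagonal cells there are 190 pairs of cells and 2 x 2 alternative weights per pair, hence 760 distinct graphs instead of the 1,600 (with repeats) counted in the test suite.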

Related

Simple Identity matrix function

Expected Output:
identity_matrix(3)
[[1, 0, 0], [0, 1, 0], [0, 0, 1]]
Actual Output with Error:
identity_matrix(3)
[[1, 1, 1], [1, 1, 1], [1, 1, 1]]
def identity_matrix(n):
    list_template = [[]]
    list_n = list_template*n
    for sub_l in list_n:
        sub_l.append(0)
    for val in range(n):
        # I have the feeling that the problem lies somewhere around here.
        list_n[val][val]=1
    return(list_n)
list_template*n does not create n independent copies; instead, all n entries reference the same inner list. For example:
a = [[0,0,0]]*2
# Now, lets change first element of the first sublist in `a`.
a[0][0] = 1
print (a)
# but since both the 2 sublists refer to same, both of them will be changed.
Output:
[[1, 0, 0], [1, 0, 0]]
Fix for your code
def identity_matrix(n):
    list_n = [[0]*n for i in range(n)]
    for val in range(n):
        list_n[val][val]=1
    return list_n
print (identity_matrix(5))
Output:
[[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[0, 0, 1, 0, 0],
[0, 0, 0, 1, 0],
[0, 0, 0, 0, 1]]
No, the problem lies here:
list_template = [[]]
list_n = list_template*n
After this, try doing:
list_n[0].append(1) # let's change the first element
The result:
[[1], [1], [1], [1], [1]]
is probably not what you expect.
Briefly, the problem is that after its construction, your list consists of multiple references to the same list. A detailed explanation is at the link given by @saint-jaeger: List of lists changes reflected across sublists unexpectedly
Finally, the numpy library is your friend for creating identity matrices and other N-dimensional arrays.
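As a small illustration of that last point, np.eye builds the identity matrix directly. The dtype=int argument is only there so the result matches the integer lists above, and tolist() converts back to nested lists if those are required.

```python
import numpy as np

# 3x3 identity matrix with integer entries.
eye = np.eye(3, dtype=int)
print(eye.tolist())  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```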

Flipping bits in nested lists

I have a project wherein I have to use bit-flip mutation of genetic algorithm.
The code I have so far looks like this:
def mutation(pop, mr):
    for i in range(len(pop)):
        if (random.random() < mr):
            if (pop[i] == 1):
                pop[i] = 0
            else:
                pop[i] = 1
        else:
            pop[i] = pop[i]
    return pop
mut = mutation(populations, 0.3)
print(mut)
For example, I have the following (depending on my project, populations can look like populations_1 or populations_2):
populations_1 = [[1, 0], [1, 1], [0, 1], [1, 0]]
populations_2 = [[1], [1], [0], [1]]
What I am doing is assigning random generated numbers to elements in populations and check if it is less than mutation rate. If it is, then bit-flip mutation will happen, if not, it will remain as it is. For the case of populations_1, if populations_1 index 2 is less than mutation rate, then it should become [1, 0]. For populations_2 index 3, it should become [0] if it is less than mutation rate. This is the objective of the mutation function.
Can anyone help me with turning the code I have so far to adapt situations like in populations_1? I think the code I have so far only works for populations_2.
Any help/suggestion/readings would be very much appreciated! Thanks!
You can use list comprehensions to do what you want. The values in pop are updated only if r<mr. To update them, you can iterate over each element (a) in list pop[i], and if a == 0 it becomes 1, otherwise 0. See the code below:
import random

def mutation(pop, mr):
    for i in range(len(pop)):
        r = random.random()
        print(r) # you can remove this line, it is only for testing
        if r < mr:
            # flip every bit of this individual
            pop[i] = [1 if a == 0 else 0 for a in pop[i]]
    return pop
Test 1:
populations_1 = [[1, 0], [1, 1], [0, 1], [1, 0], [0,0]]
mut = mutation(populations_1, 0.3)
print(mut)
#random number for each iteration
0.3952226177233832
0.11290933711515283
0.08131952363738537
0.8489702326753509
0.9598842135077205
#output:
[[1, 0], [0, 0], [1, 0], [1, 0], [0, 0]]
Test 2:
populations_2 = [[1], [1], [0], [1]]
mut = mutation(populations_2, 0.3)
print(mut)
0.3846024893833684
0.7680389523799874
0.19371896835988422
0.008814288533701364
[[1], [1], [1], [0]]
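If mutating the input list in place is not wanted, a variant along the following lines may be easier to test. This is only a sketch: the name mutate and the optional rng parameter are assumptions added here, not part of the answer above. Passing a seeded random.Random makes runs repeatable.

```python
import random


def mutate(pop, mr, rng=None):
    """Return a new population; each individual is bit-flipped with probability mr."""
    if rng is None:
        rng = random.Random()
    return [
        [1 - bit for bit in individual] if rng.random() < mr else list(individual)
        for individual in pop
    ]


population = [[1, 0], [1, 1], [0, 1], [1, 0]]
print(mutate(population, 1.0))  # every individual flipped: [[0, 1], [0, 0], [1, 0], [0, 1]]
print(mutate(population, 0.0))  # nothing flipped: [[1, 0], [1, 1], [0, 1], [1, 0]]
print(population)               # the input is left untouched
```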

Using numba to randomly sample possible combinations of categories

I am trying to speed up a function that randomly samples a number of records with the possible combinations of a number of categories for a number of records and ensures they are unique (i.e. let's assume there's 3 records, any of them can be either 0 or 1 and I want 10 random samples of unique possible combinations of records).
If I did not use numba, I would do something like this:
import numpy as np
def myfunc(categories, NumberOfRecords, maxsamples):
    return np.unique( np.random.choice(np.arange(categories), size=(maxsamples*10, NumberOfRecords), replace=True), axis=0 )[0:maxsamples]
Annoyingly, numba does not support axis in np.unique, so I can do something like this, but some of the records may turn out to be non-unique.
from numba import njit, int64
import numpy as np
@njit(int64[:,:](int64, int64, int64), cache=True)
def myfunc(categories, NumberOfRecords, maxsamples):
    return np.random.choice(np.arange(categories), size=(maxsamples, NumberOfRecords), replace=True)
myfunc(categories=2, NumberOfRecords=3, maxsamples=10)
E.g. in one call (obviously there's some randomness here), I got the below (for which the indices 1 and 6, and 3 and 4, and 7 and 9 are identical rows):
array([[0, 1, 1],
[1, 1, 0],
[0, 1, 0],
[1, 0, 1],
[1, 0, 1],
[1, 1, 1],
[1, 1, 0],
[1, 0, 0],
[0, 0, 0],
[1, 0, 0]])
My questions are:
Is this something where I would even expect a speed up from numba?
If so, how can I get unique rows (this seems rather difficult with numba, but presumably there's a way)?
Perhaps there's a way to get at this more efficiently (perhaps without creating more random samples than I need in the end)?
In the following, I don't use numba, but all the operations use vectorized numpy functions.
Each row of the result that you generate can be interpreted as an integer expressed in base N, where N is the number of categories. With that interpretation, what you want is to sample without replacement from the integers [0, 1, ... N**R-1], where R is the number of "records". You can use the choice function for that, with the argument replace=False. Once you have that, you need to convert the chosen integers to base N. For that, I use the function int2base, which is a pared down version of a function that I wrote in a different answer.
Here's the code:
import numpy as np
def int2base(x, base, ndigits):
    # x = np.asarray(x) # Uncomment this line for general purpose use.
    powers = base ** np.arange(ndigits)
    digits = (x.reshape(x.shape + (1,)) // powers) % base
    return digits

def makesample(ncategories, nrecords, nsamples, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    n = ncategories ** nrecords
    choices = rng.choice(n, replace=False, size=nsamples)
    return int2base(choices, ncategories, nrecords)
In makesample, I included the optional argument rng. It allows you to specify the object that holds the choice function. If not provided, it uses np.random.default_rng().
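To make the integer-to-row mapping concrete, here is a small check with int2base copied from above so the snippet runs on its own. The digits come out least-significant first, and multiplying back by the powers recovers the original integers, which is why distinct integers always give distinct rows.

```python
import numpy as np


def int2base(x, base, ndigits):
    # Copied from the answer above so this snippet is self-contained.
    powers = base ** np.arange(ndigits)
    return (x.reshape(x.shape + (1,)) // powers) % base


values = np.array([0, 1, 6, 7])
digits = int2base(values, 2, 3)
print(digits.tolist())  # [[0, 0, 0], [1, 0, 0], [0, 1, 1], [1, 1, 1]]

# Round trip: digits weighted by the powers of the base give back the integers.
recovered = digits @ (2 ** np.arange(3))
print(recovered.tolist())  # [0, 1, 6, 7]
```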
Example:
In [118]: makesample(2, 3, 6)
Out[118]:
array([[0, 1, 1],
[0, 0, 1],
[1, 0, 1],
[0, 0, 0],
[1, 1, 0],
[1, 1, 1]])
In [119]: makesample(5, 4, 12)
Out[119]:
array([[3, 4, 0, 1],
[2, 0, 2, 0],
[4, 2, 4, 3],
[0, 1, 0, 4],
[0, 2, 0, 1],
[1, 2, 0, 1],
[0, 3, 0, 4],
[3, 3, 0, 3],
[3, 4, 1, 4],
[2, 4, 1, 1],
[3, 4, 1, 0],
[1, 1, 4, 4]])
makesample will raise an exception if you ask for too many samples:
In [120]: makesample(2, 3, 10)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-120-80044e78a60a> in <module>
----> 1 makesample(2, 3, 10)
~/code_snippets/python/numpy/random_samples_for_so_question.py in makesample(ncategories, nrecords, nsamples, rng)
17 rng = np.random.default_rng()
18 n = ncategories ** nrecords
---> 19 choices = rng.choice(n, replace=False, size=nsamples)
20 return int2base(choices, ncategories, nrecords)
_generator.pyx in numpy.random._generator.Generator.choice()
ValueError: Cannot take a larger sample than population when 'replace=False'

What is the pythonic solution to enumerate and update items from a matrix?

I wrote a for loop using enumerate over the values in a matrix, trying to assign a value to the items that are different from 0 while appending the positions of the items equal to 0 to a list. The problem is that the original matrix doesn't get updated.
Sample code:
matrix = [[0, 0, 0], [0, 1, 0], [1, 1, 1]]
current = []
for x, i in enumerate(matrix):
    for y, j in enumerate(i):
        if j == 0:
            current.append((x, y))
        else:
            #matrix[x][y] = -1 # This works
            j = -1 # This doesn't
Since this doesn't work, there is no utility in using enumerate for that case. So I changed the code to:
matrix = [[0, 0, 0], [0, 1, 0], [1, 1, 1]]
current = []
for x in range(len(matrix)):
    for y in range(len(matrix[0])):
        if matrix[x][y] == 0:
            current.append((x, y))
        else:
            matrix[x][y] = -1
The code above is, IMO, much less readable, and pylint also advises against it with:
C0200: Consider using enumerate instead of iterating with range and len (consider-using-enumerate)
You can't update a 2D list in place by assigning to the local variable j (j = -1): that only rebinds the name j, which is bound to the next element again on each iteration of for y, j in enumerate(i).
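A minimal sketch of that difference on a single row (the variable names are arbitrary): assigning to the loop variable leaves the list alone, while assigning through the index changes it.

```python
row = [0, 1, 1]

for j in row:
    j = -1  # rebinds the local name j only; the list element is untouched
print(row)  # [0, 1, 1]

for y, j in enumerate(row):
    if j != 0:
        row[y] = -1  # assignment through the index mutates the list
print(row)  # [0, -1, -1]
```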
In your simple case you can update your matrix with the following simple traversal:
matrix = [[0, 0, 0], [0, 1, 0], [1, 1, 1]]
for i, row in enumerate(matrix):
    for j, val in enumerate(row):
        if val != 0:
            matrix[i][j] = -1
print(matrix) # [[0, 0, 0], [0, -1, 0], [-1, -1, -1]]
Though Numpy provides a more powerful way for updating matrices:
import numpy as np
matrix = np.array([[0, 0, 0], [0, 1, 0], [1, 1, 1]])
matrix = np.where(matrix == 0, matrix, -1)
print(matrix)
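The NumPy version above does not collect the positions of the zeros that the question stored in current. One possible way to get both, sketched here with np.argwhere and a boolean mask (the conversion to plain int tuples is only there to match the original list of tuples):

```python
import numpy as np

matrix = np.array([[0, 0, 0], [0, 1, 0], [1, 1, 1]])

# Coordinates of the zero entries, in row-major order, as plain (x, y) tuples.
current = [tuple(int(v) for v in idx) for idx in np.argwhere(matrix == 0)]

# Overwrite every non-zero entry in place.
matrix[matrix != 0] = -1

print(current)          # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 2)]
print(matrix.tolist())  # [[0, 0, 0], [0, -1, 0], [-1, -1, -1]]
```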

Elements in a list are overwritten

I tried to write a function which creates the linear span of a list of independent vectors, but it seems that the last calculated vector overwrites all other elements. It'd be nice if someone could help me fix it.
def span_generator(liste,n):
    """function to generate the span of a list of linear independent
    vectors(in liste) in the n-dimensional vectorspace of a finite
    field with characteristic 2, returns a list of all elements which
    lie inside the span"""
    results=[]
    blank=[]
    for i in range(n):
        blank.append(0)
    a=blank
    if len(liste)>1:
        listenwert=liste[-1]
        liste.pop(-1)
        values=span_generator(liste,n)
        for i in range(2):
            for j in range(len(values)):
                for k in range(n):
                    a[k]=(i*listenwert[k]+values[j][k])%2
                results.append(a)
    else:
        for i in range(2):
            for j in range(n):
                a[j]=(i*liste[0][j])
            results.append(a)
    print(results)
    return results
print(span_generator([[1,0],[0,1]],2)) gives following results
[[1, 0], [1, 0]]
[[1, 1], [1, 1], [1, 1], [1, 1]]
[[1, 1], [1, 1], [1, 1], [1, 1]]
instead of the expected: [[0,0],[1,0],[0,1],[1,1]]
Edit: I tried to simplify the program with itertools.product, but it didn't solve the problem.
def span_generator(liste):
    n=len(liste[0])
    results=[]
    coeff=list(itertools.product(range(2), repeat=n))
    blank=[]
    for i in range(n):
        blank.append(0)
    for i in range(len(coeff)):
        a=blank
        for j in range(len(coeff[0])):
            for k in range(n):
                a[k]=(a[k]+coeff[i][j]*liste[j][k])%2
        results.append(a)
    return results
Output: span_generator([[0,1],[1,0]])
[[0, 0], [0, 0], [0, 0], [0, 0]]
But it should give [[0,0],[0,1],[1,0],[1,1]]
Another example: span_generator([[0,1,1],[1,1,0]]) should give [[0,0,0],[0,1,1],[1,1,0],[1,0,1]] (2=0 since i'm calculating modulo 2)
Coefficients
You can use itertools.product to generate the coefficients:
n = len(liste[0])
coefficients = itertools.product(range(2), repeat=len(liste))
yields an iterator with this content:
[(0, 0), (0, 1), (1, 0), (1, 1)]
Linear combinations
You can then selectively multiply the results with the transpose of your liste (list(zip(*liste)))
for coeff in coefficients:
    yield [sum((a * c) for a, c in zip(transpose[i], coeff)) for i in range(n)]
which takes, for each dimension (for i in range(n)), the sum of the products
import itertools

def span_generator3(liste):
    n = len(liste[0])
    transpose = list(zip(*liste))
    coefficients = itertools.product(range(2), repeat=len(liste))
    for coeff in coefficients:
        yield [sum((a * c) for a, c in zip(transpose[i], coeff)) for i in range(n)]
This produces a generator. If you want the result as a list, just call list() on it.
Result
list(span_generator3([[1,2],[4,8]]))
output:
[[0, 0], [4, 8], [1, 2], [5, 10]]
Higher dimensions
list(sorted(span_generator3([[1,2, 4],[8, 16, 32], [64, 128, 256]])))
output:
[[0, 0, 0],
[1, 2, 4],
[8, 16, 32],
[9, 18, 36],
[64, 128, 256],
[65, 130, 260],
[72, 144, 288],
[73, 146, 292]]
Modulo 2
If you want the result modulo 2, that's just adding 2 characters in the right place
def span_generator3_mod2(liste):
    n = len(liste[0])
    transpose = list(zip(*liste))
    coefficients = itertools.product(range(2), repeat=len(liste))
    # print(list(itertools.product(range(2), repeat=len(liste))))
    for coeff in coefficients:
        yield [sum((a * c) for a, c in zip(transpose[i], coeff)) % 2 for i in range(n)]
list(span_generator3_mod2([[0,1,1],[1,1,0]])) gives
[[0, 0, 0], [1, 1, 0], [0, 1, 1], [1, 0, 1]]
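As for the overwriting seen in the question: both versions there append the same list object (a, an alias of blank) on every pass, so every entry of results ends up showing the last value written. A minimal sketch of the list-based version with that aliasing removed follows. The name span_generator_fixed is made up here, and the coefficients use repeat=len(liste) (one coefficient per vector) as in the answer above.

```python
import itertools


def span_generator_fixed(liste):
    """Span over GF(2) of the vectors in `liste`, building a fresh list per element."""
    n = len(liste[0])
    results = []
    for coeff in itertools.product(range(2), repeat=len(liste)):
        a = [0] * n  # new list each time, so earlier results are not overwritten
        for c, vec in zip(coeff, liste):
            for k in range(n):
                a[k] = (a[k] + c * vec[k]) % 2
        results.append(a)
    return results


print(span_generator_fixed([[0, 1], [1, 0]]))
# [[0, 0], [1, 0], [0, 1], [1, 1]]
print(span_generator_fixed([[0, 1, 1], [1, 1, 0]]))
# [[0, 0, 0], [1, 1, 0], [0, 1, 1], [1, 0, 1]]
```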
