I am now using lists to represent the graph, similar to my previous question. The dict approach turned out to be very long and complex, so I decided to go with lists. But I am still facing a few roadblocks.
So for example, the graph:
is now represented as:
nodes = ["1", "2", "3", "4", "5"]
edges = [
    [0, 2, 1, 2, 0],
    [1, 0, 1, 0, 0],
    [0, 2, 0, 0, 0],
    [1, 0, 1, 0, 2],
    [1, 2, 0, 0, 0],
]
Here, edge weights can only be 1 or 2, and 0 means there is no edge from one node to the other. The edges are directed: each row of the matrix holds the edges coming into that node, so edges[i][j] is the weight of the edge from node j to node i (0-based).
Similar to the last question, I want all possible two-edge modifications on the graph. So, for example, if we add an edge from node "4" to "5" with weight of 1, and remove the edge with weight 1 coming from node "1" to "4", the new graph will look like:
edges = [
    [0, 2, 1, 2, 0],
    [1, 0, 1, 0, 0],
    [0, 2, 0, 0, 0],
    [0, 0, 1, 0, 2],
    [1, 2, 0, 1, 0],
]
and this is one of the possible modifications.
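To make the row/column convention concrete, here is a minimal sketch in plain Python that applies the two changes from the example to a copy of the matrix. The helper name `apply_edits` is made up for this illustration, and it assumes rows are target nodes and columns are source nodes, as the example matrices suggest.

```python
from copy import deepcopy

edges = [
    [0, 2, 1, 2, 0],
    [1, 0, 1, 0, 0],
    [0, 2, 0, 0, 0],
    [1, 0, 1, 0, 2],
    [1, 2, 0, 0, 0],
]


def apply_edits(matrix, edits):
    """Return a copy of matrix with each (row, col) -> weight edit applied.

    Hypothetical helper for illustration: rows are target nodes and
    columns are source nodes, both 0-based.
    """
    result = deepcopy(matrix)
    for (row, col), weight in edits.items():
        result[row][col] = weight
    return result


# Add the edge "4" -> "5" with weight 1 and remove the edge "1" -> "4".
modified = apply_edits(edges, {(4, 3): 1, (3, 0): 0})
print(modified[3])  # incoming edges of node "4"
print(modified[4])  # incoming edges of node "5"
```

The original `edges` list is left untouched because the helper works on a deep copy.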
I want to build a generator that can create all such modifications sequentially and pass it to me so that I can use them to test.
My code so far is like this:
def all_modification_generation(graph: list[list], iter_count: int = 0):
    possible_weights = {-1, 0, 1}
    node_len = len(graph)
    for i in range(node_len**2):
        ix_x = i // node_len
        ix_y = i % node_len
        if i == ix_y:
            continue
        for possible_pertubs in possible_weights - {graph[ix_x][ix_y]}:
            graph[ix_x][ix_y] = possible_pertubs
            if iter_count == 0:
                all_modification_generation(graph=graph, iter_count=iter_count + 1)
            else:
                yield all_modification_generation(graph=graph)
My logic is, once I do one change, I can then loop over all other elements that come after it in the matrix. So this problem could be solved recursively. And once a node is explored, we do not need to take it into consideration for next loops, because it will just give us a duplicate result that we have already found. And because I need to check for 2 modifications, I am increasing iter_count after first iteration and then yielding the next time. I am skipping ix_x == ix_y cases because a self-looping edge does not make any sense in this context, so that change is not required to be recorded.
But even then, this does not output any result. What am I doing wrong? Any help is appreciated, thanks!
Edit: I think I have figured out a way to do the double modification without repetitive generation of modified matrices. Now the only problem is that there is quite a bit of code repetition and a 4-level nested for-loop.
I'm not sure how to call a generator recursively, but I feel that should be the way to go! Thanks J_H for pointing me to the right direction.
The working code is:
from copy import deepcopy

def all_modification_generation(graph: list[list]):
    possible_weights = {-1, 0, 1}
    node_len = len(graph)
    for i in range(node_len**2):
        ix_x1 = i // node_len
        ix_y1 = i % node_len
        if ix_x1 == ix_y1:
            continue
        for possible_pertubs in possible_weights - {graph[ix_x1][ix_y1]}:
            cc1_graph = deepcopy(graph)
            cc1_graph[ix_x1][ix_y1] = possible_pertubs
            # The second change only visits cells after the first one.
            for j in range(i + 1, node_len**2):
                ix_x2 = j // node_len
                ix_y2 = j % node_len
                if ix_x2 == ix_y2:
                    continue
                for possible_perturbs2 in possible_weights - {cc1_graph[ix_x2][ix_y2]}:
                    cc2_graph = deepcopy(cc1_graph)
                    cc2_graph[ix_x2][ix_y2] = possible_perturbs2
                    yield cc2_graph
The quadratic looping is an interesting technique.
We do wind up with quite a few repeated
division results, from // node_len, but that's fine.
I had a "base + edits" datastructure in mind for this problem.
Converting array to list-of-lists would be straightforward.
After overhead, a 5-node graph consumes 25 bytes -- pretty compact.
SciPy's scipy.sparse module offers good support for several styles of sparse
matrices, should that become of interest.
from typing import Generator, Optional

import numpy as np


class GraphEdit:
    """A digraph with many base edge weights plus a handful of edited weights."""

    def __init__(self, edge: np.ndarray, edit: Optional[dict] = None):
        a, b = edge.shape
        assert a == b, f"Expected square matrix, got {a}x{b}"
        self.edge = edge  # We treat these as immutable weights.
        self.edit = edit or {}

    @property
    def num_nodes(self):
        return len(self.edge)

    def __getitem__(self, item):
        return self.edit.get(item, self.edge[item])

    def __setitem__(self, item, value):
        self.edit[item] = value


def as_array(g: GraphEdit) -> np.ndarray:
    return np.array([[g[i, j] for j in range(g.num_nodes)] for i in range(g.num_nodes)])


def all_single_mods(g: GraphEdit) -> Generator[GraphEdit, None, None]:
    """Generates all possible single-edge modifications to the graph."""
    orig_edit = g.edit.copy()
    for i in range(g.num_nodes):
        for j in range(g.num_nodes):
            if i == j:  # not an edge -- we don't support self-loops
                continue
            valid_weights = {0, 1, 2} - {g[i, j]}
            for w in sorted(valid_weights):
                yield GraphEdit(g.edge, {**orig_edit, (i, j): w})


def all_mods(g: GraphEdit, depth: int) -> Generator[GraphEdit, None, None]:
    assert depth >= 1
    if depth == 1:
        yield from all_single_mods(g)
    else:
        for gm in all_single_mods(g):
            yield from all_mods(gm, depth - 1)


def all_double_mods(g: GraphEdit) -> Generator[GraphEdit, None, None]:
    """Generates all possible double-edge modifications to the graph."""
    yield from all_mods(g, 2)
Here's the associated test suite.
import unittest

from numpy.testing import assert_array_equal
import numpy as np

from .graph_edit import GraphEdit, all_double_mods, all_mods, all_single_mods, as_array


class GraphEditTest(unittest.TestCase):
    def setUp(self):
        self.g = GraphEdit(
            np.array(
                [
                    [0, 2, 1, 2, 0],
                    [1, 0, 1, 0, 0],
                    [0, 2, 0, 0, 0],
                    [1, 0, 1, 0, 2],
                    [1, 2, 0, 0, 0],
                ],
                dtype=np.uint8,
            )
        )

    def test_graph_edit(self):
        g = self.g
        self.assertEqual(5, self.g.num_nodes)
        self.assertEqual(2, g[0, 1])
        g[0, 1] = 3
        self.assertEqual(3, g[0, 1])
        del g.edit[(0, 1)]
        self.assertEqual(2, g[0, 1])

    def test_non_square(self):
        with self.assertRaises(AssertionError):
            GraphEdit(np.array([[0, 0], [1, 1], [2, 2]]))

    def test_all_single_mods(self):
        g = GraphEdit(np.array([[0, 0], [1, 0]]))
        self.assertEqual(4, len(list(all_single_mods(g))))
        expected = [
            np.array([[0, 1], [1, 0]]),
            np.array([[0, 2], [1, 0]]),
            np.array([[0, 0], [0, 0]]),
            np.array([[0, 0], [2, 0]]),
        ]
        for ex, actual in zip(
            expected,
            map(as_array, all_single_mods(g)),
        ):
            assert_array_equal(ex, actual)
        # Now verify that original graph is untouched.
        assert_array_equal(
            np.array([[0, 0], [1, 0]]),
            as_array(g),
        )

    def test_all_double_mods(self):
        g = GraphEdit(np.array([[0, 0], [1, 0]]))
        self.assertEqual(16, len(list(all_double_mods(g))))
        expected = [
            np.array([[0, 0], [1, 0]]),
            np.array([[0, 2], [1, 0]]),
            np.array([[0, 1], [0, 0]]),
            np.array([[0, 1], [2, 0]]),
            np.array([[0, 0], [1, 0]]),  # note the duplicate
            np.array([[0, 1], [1, 0]]),
            np.array([[0, 2], [0, 0]]),  # and it continues on in this vein
        ]
        for ex, actual in zip(
            expected,
            map(as_array, all_double_mods(g)),
        ):
            assert_array_equal(ex, actual)

    def test_many_mods(self):
        self.assertEqual(40, len(list(all_single_mods(self.g))))
        self.assertEqual(1_600, len(list(all_double_mods(self.g))))
        self.assertEqual(1_600, len(list(all_mods(self.g, 2))))
        self.assertEqual(64_000, len(list(all_mods(self.g, 3))))
        self.assertEqual(2_560_000, len(list(all_mods(self.g, 4))))
One could quibble about the fact that
it produces duplicates, since inner and outer loops
know nothing of one another.
It feels like this algorithm wants to use an
itertools.combinations
approach, generating all modifications in lexicographic order.
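As a rough sketch of that idea (separate from the tested code above): picking the two cells with itertools.combinations and the two new weights with itertools.product visits every unordered pair of distinct cells exactly once, so no duplicates arise and the original graph never reappears. It works on a plain list-of-lists; the function name `all_double_mods_unique` and the `weights` parameter are invented for this illustration.

```python
from copy import deepcopy
from itertools import combinations, product


def all_double_mods_unique(graph, weights=(0, 1, 2)):
    """Yield each graph that differs from `graph` in exactly two cells."""
    n = len(graph)
    # Off-diagonal cells in row-major (lexicographic) order; no self-loops.
    cells = [(i, j) for i in range(n) for j in range(n) if i != j]
    for a, b in combinations(cells, 2):
        # Only weights that differ from the current one count as a change.
        new_a = [w for w in weights if w != graph[a[0]][a[1]]]
        new_b = [w for w in weights if w != graph[b[0]][b[1]]]
        for wa, wb in product(new_a, new_b):
            modified = deepcopy(graph)
            modified[a[0]][a[1]] = wa
            modified[b[0]][b[1]] = wb
            yield modified


small = [[0, 0], [1, 0]]
mods = list(all_double_mods_unique(small))
print(len(mods))
```

For the 5-node graph this gives 190 cell pairs times 4 weight choices, i.e. 760 graphs, compared with the 1,600 (including duplicates) produced by the nested approach.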
I have a tensor A of shape 10x1000 and an index tensor B of the same shape. B holds values in the range 0-999 and is used to gather values from A (B[0,:] gathers from A[0,:], B[1,:] from A[1,:], etc.).
However, if I use tf.gather(A, B) I get an array of shape (10, 1000, 1000) when I'm expecting a 10x1000 tensor back. Any ideas how I could fix this?
EDIT
Let's say A= [[1, 2, 3],[4,5,6]] and B = [[0, 1, 1],[2,1,0]] What I want is to be able to sample A using the corresponding B. This should result in C = [[1, 2, 2],[6,5,4]].
Dimensions of tensors are known in advance.
First we 'unstack' both the parameters and indices (A and B respectively) along the first dimension. Then we apply tf.gather() such that rows of A correspond to the rows of B. Finally, we stack together the result.
import tensorflow as tf
import numpy as np

def custom_gather(a, b):
    unstacked_a = tf.unstack(a, axis=0)
    unstacked_b = tf.unstack(b, axis=0)
    gathered = [tf.gather(x, y) for x, y in zip(unstacked_a, unstacked_b)]
    return tf.stack(gathered, axis=0)

a = tf.convert_to_tensor(np.array([[1, 2, 3], [4, 5, 6]]), tf.float32)
b = tf.convert_to_tensor(np.array([[0, 1, 1], [2, 1, 0]]), dtype=tf.int32)
gathered = custom_gather(a, b)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(gathered))
    # [[1. 2. 2.]
    #  [6. 5. 4.]]
For your initial case with shape 10x1000 we get:
a = tf.convert_to_tensor(np.random.normal(size=(10, 1000)), tf.float32)
# high is exclusive, so high=1000 draws indices from 0 to 999
b = tf.convert_to_tensor(np.random.randint(low=0, high=1000, size=(10, 1000)), dtype=tf.int32)
gathered = custom_gather(a, b)
print(gathered.get_shape().as_list()) # [10, 1000]
Update
The first dimension is unknown (i.e. None)
The previous solution works only if the first dimension is known in advance. If the dimension is unknown we solve it as follows:
We stack together two tensors such that the rows of both tensors are stacked together:
# A = [[1, 2, 3], [4, 5, 6]]          [[[1 2 3]
#                             --->      [0 1 1]]
#                                      [[4 5 6]
# B = [[0, 1, 1], [2, 1, 0]]            [2 1 0]]]
We iterate over the elements of this stacked tensor (which consists of stacked together rows of A and B) and using tf.map_fn() function we apply tf.gather().
We stack back the elements we get with tf.stack()
import tensorflow as tf
import numpy as np

def custom_gather_v2(a, b):
    def apply_gather(x):
        # x[0] is a row of a, x[1] the matching row of b (cast back to int)
        return tf.gather(x[0], tf.cast(x[1], tf.int32))
    # Both tensors need the same dtype before they can be stacked.
    a = tf.cast(a, dtype=tf.float32)
    b = tf.cast(b, dtype=tf.float32)
    stacked = tf.stack([a, b], axis=1)
    gathered = tf.map_fn(apply_gather, stacked)
    return tf.stack(gathered, axis=0)

a = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32)
b = np.array([[0, 1, 1], [2, 1, 0]], dtype=np.int32)

x = tf.placeholder(tf.float32, shape=(None, 3))
y = tf.placeholder(tf.int32, shape=(None, 3))

gathered = custom_gather_v2(x, y)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(gathered, feed_dict={x:a, y:b}))
    # [[1. 2. 2.]
    #  [6. 5. 4.]]
Use tf.gather with batch_dims=-1:
import numpy as np
import tensorflow as tf
rois = np.array([[1, 2, 3],[3, 2, 1]])
ind = np.array([[0, 2, 1, 1, 2, 0, 0, 1, 1, 2],
[0, 1, 2, 0, 2, 0, 1, 2, 2, 2]])
tf.gather(rois, ind, batch_dims=-1)
# output:
# <tf.Tensor: shape=(2, 10), dtype=int64, numpy=
# array([[1, 3, 2, 2, 3, 1, 1, 2, 2, 3],
# [3, 2, 1, 3, 1, 3, 2, 1, 1, 1]])>
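For checking expected values outside of TensorFlow, the same row-wise lookup can be reproduced in NumPy. The sketch below uses np.take_along_axis on the small A and B from the question; it is meant as a reference for the result, not as a replacement for the TensorFlow ops above.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])
B = np.array([[0, 1, 1], [2, 1, 0]])

# For every row r and position c: C[r, c] = A[r, B[r, c]].
C = np.take_along_axis(A, B, axis=1)
print(C)
```

The output has the shape of B, which is the behaviour the question asks for.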
I have been using this model with binary data to predict the likelihood of play, following this guide.
import tensorflow as tf
from tensorflow import keras
import numpy as np
import pandas as pd
model = keras.Sequential()
input_layer = keras.layers.Dense(3, input_shape=[3], activation='tanh')
model.add(input_layer)
output_layer = keras.layers.Dense(1, activation='sigmoid')
model.add(output_layer)
gd = tf.train.GradientDescentOptimizer(0.01)
model.compile(optimizer=gd, loss='mse')
sess = tf.Session() #NEW LINE
training_x = np.array([[1, 1, 0], [1, 1, 1], [0, 1, 0], [-1, 1, 0], [-1, 0, 0], [-1, 0, 1],[0, 0, 1], [1, 1, 0], [1, 0, 0], [-1, 0, 0], [1, 0, 1], [0, 1, 1], [0, 0, 0], [-1, 1, 1]])
training_y = np.array([[0], [0], [1], [1], [1], [0], [1],[0], [1], [1], [1], [1], [1], [0]])
init_op = tf.initializers.global_variables()
sess.run(init_op) #NEW LINE
model.fit(training_x, training_y, epochs=1000, steps_per_epoch = 10)
text_x = np.array([[1, 0, 0]])
test_y = model.predict(text_x, verbose=0, steps=1)
print(test_y)
All the current data is binary and the model works with binary input. Is there any model, or any way to convert non-binary data, so that I can predict the likelihood of product_sold in the data set below?
dataset:
number_infants  cost_of_infants  estimated_cost_infants  product_sold
5               1000             2000                    0
6               8919             1222                    1
7               10000            891                     1
product_sold
1 = yes
0 = no
edit:
lst = array of the first three columns of the df
[[5,1000,2000],[6,8919,1222]]
lst_1 = array of only the 4th column
[[0,1,1]]
training_x = np.array(lst)
training_y = np.array(lst_1)
I encountered the error 'Tensor' object has no attribute 'assign_add' when I tried to use the assign_add or assign_sub function.
The code is shown below:
I defined two tensors, t1 and t2, with the same shape and the same data type.
>>> t1 = tf.Variable(tf.ones([2,3,4],tf.int32))
>>> t2 = tf.Variable(tf.zeros([2,3,4],tf.int32))
>>> t1
<tf.Variable 'Variable_4:0' shape=(2, 3, 4) dtype=int32_ref>
>>> t2
<tf.Variable 'Variable_5:0' shape=(2, 3, 4) dtype=int32_ref>
then I use the assign_add on t1 and t2 to create t3
>>> t3 = tf.assign_add(t1,t2)
>>> t3
<tf.Tensor 'AssignAdd_4:0' shape=(2, 3, 4) dtype=int32_ref>
then I try to create a new tensor t4 using t1[1] and t2[1], which are tensors with same shape and same data type.
>>> t1[1]
<tf.Tensor 'strided_slice_23:0' shape=(3, 4) dtype=int32>
>>> t2[1]
<tf.Tensor 'strided_slice_24:0' shape=(3, 4) dtype=int32>
>>> t4 = tf.assign_add(t1[1],t2[1])
but got error,
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/admin/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/state_ops.py", line 245, in assign_add
return ref.assign_add(value)
AttributeError: 'Tensor' object has no attribute 'assign_add'
same error when using assign_sub
>>> t4 = tf.assign_sub(t1[1],t2[1])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/admin/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/state_ops.py", line 217, in assign_sub
return ref.assign_sub(value)
AttributeError: 'Tensor' object has no attribute 'assign_sub'
Any idea what is wrong?
Thanks.
The error occurs because t1 is a tf.Variable object, while t1[1] is a tf.Tensor (you can see this in the output of your print statements). Ditto for t2 and t2[1].
As it happens, a tf.Tensor can't be mutated (it's read-only), whereas a tf.Variable can be (read as well as write);
see here.
Since tf.assign_add does an in-place addition, it doesn't work with t1[1] and t2[1] as inputs, while there's no such problem with t1 and t2 as inputs.
What you are trying to do here is a little bit confusing. I don't think you can update slices and create a new tensor at the same time/line.
If you want to update slices before creating t4, use tf.scatter_add() (or tf.scatter_sub() or tf.scatter_update() accordingly) as suggested here. For example:
sa = tf.scatter_add(t1, [1], t2[1:2])
Then if you want to get a new tensor t4 using new t1[1] and t2[1], you can do:
with tf.control_dependencies([sa]):
    t4 = tf.add(t1[1],t2[1])
Here are some examples of using tf.scatter_add and tf.scatter_sub:
>>> sess = tf.InteractiveSession()
>>> t1 = tf.Variable(tf.ones([2,3,4],tf.int32))
>>> t2 = tf.Variable(tf.zeros([2,3,4],tf.int32))
>>> init = tf.global_variables_initializer()
>>> sess.run(init)
>>> t1.eval()
array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],
       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]], dtype=int32)
>>> t2.eval()
array([[[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]],
       [[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]], dtype=int32)
>>> t3 = tf.scatter_add(t1,[0],[[[2,2,2,2],[2,2,2,2],[2,2,2,2]]])
>>> sess.run(t3)
array([[[3, 3, 3, 3],
        [3, 3, 3, 3],
        [3, 3, 3, 3]],
       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]], dtype=int32)
>>> t4 = tf.scatter_sub(t1,[0,0,0],[t1[1],t1[1],t1[1]])
Following is another example, which can be found at https://blog.csdn.net/efforever/article/details/77073103
Because few examples illustrating scatter_xxx can be found on the web, I paste it below for reference.
import tensorflow as tf
import numpy as np

with tf.Session() as sess1:
    c = tf.Variable([[1,2,0],[2,3,4]], dtype=tf.float32, name='biases')
    cc = tf.Variable([[1,2,0],[2,3,4]], dtype=tf.float32, name='biases1')
    ccc = tf.Variable([0,1], dtype=tf.int32, name='biases2')
    # subtract the diff from the centers of the corresponding labels
    centers = tf.scatter_sub(c,ccc,cc)
    #centers = tf.scatter_sub(c,[0,1],cc)
    #centers = tf.scatter_sub(c,[0,1],[[1,2,0],[2,3,4]])
    #centers = tf.scatter_sub(c,[0,0,0],[[1,2,0],[2,3,4],[1,1,1]])
    # i.e. c[0]-[1,2,0], c[0]-[2,3,4], c[0]-[1,1,1]: every update is subtracted;
    # indices and updates must have the same number of elements
    a = tf.Variable(initial_value=[[0, 0, 0, 0],[0, 0, 0, 0]])
    b = tf.scatter_update(a, [0, 1], [[1, 1, 0, 0], [1, 0, 4, 0]])
    #b = tf.scatter_update(a, [0, 1,0], [[1, 1, 0, 0], [1, 0, 4, 0],[1, 1, 0, 1]])
    init = tf.global_variables_initializer()
    sess1.run(init)
    print(sess1.run(centers))
    print(sess1.run(b))
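Output of the code as written:
[[ 0.  0.  0.]
 [ 0.  0.  0.]]
[[1 1 0 0]
 [1 0 4 0]]
Output when the last commented-out scatter_sub and scatter_update variants (with the repeated index 0) are used instead:
[[-3. -4. -5.]
 [ 2.  3.  4.]]
[[1 1 0 1]
 [1 0 4 0]]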
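The accumulation over repeated indices can be checked without a TensorFlow session. The sketch below uses NumPy's np.subtract.at as a stand-in for tf.scatter_sub; this analogy is chosen for illustration and does not come from the linked post. It assumes the same c and the three update rows from the commented-out variant above.

```python
import numpy as np

c = np.array([[1, 2, 0], [2, 3, 4]], dtype=np.float32)
updates = np.array([[1, 2, 0], [2, 3, 4], [1, 1, 1]], dtype=np.float32)

# Unbuffered in-place subtraction: every occurrence of index 0 is applied,
# so row 0 becomes c[0] - [1,2,0] - [2,3,4] - [1,1,1]; row 1 is untouched.
np.subtract.at(c, [0, 0, 0], updates)
print(c)
```

A plain `c[[0, 0, 0]] -= updates` would not accumulate like this, because buffered fancy-index assignment keeps only one write per index.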
You can also use tf.assign() as a workaround, since sliced assignment was implemented for it, unlike for tf.assign_add() or tf.assign_sub(), as of TensorFlow version 1.8. Note that you can only do one slicing operation (slicing into a slice is not going to work). Also, this is not atomic: if multiple threads read and write the same variable, you don't know which operation will be the last one to write unless you explicitly code for it, whereas tf.assign_add() and tf.assign_sub() are guaranteed to be thread safe. Still, this is better than nothing. Consider this code (tested):
import tensorflow as tf

t1 = tf.Variable(tf.zeros([2,3,4],tf.int32))
t2 = tf.Variable(tf.ones([2,3,4],tf.int32))
assign_op = tf.assign( t1[ 1 ], t1[ 1 ] + t2[ 1 ] )
init_op = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run( init_op )
    res = sess.run( assign_op )
    print( res )
will output:
[[[0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]]
 [[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]]
as desired.
I tried to program a function which creates the linear span of a list of independent vectors, but it seems that the last calculated vector overwrites all other elements. It would be nice if someone could help me fix it.
def span_generator(liste,n):
    """function to generate the span of a list of linear independent
    vectors(in liste) in the n-dimensional vectorspace of a finite
    field with characteristic 2, returns a list of all elements which
    lie inside the span"""
    results=[]
    blank=[]
    for i in range(n):
        blank.append(0)
    a=blank
    if len(liste)>1:
        listenwert=liste[-1]
        liste.pop(-1)
        values=span_generator(liste,n)
        for i in range(2):
            for j in range(len(values)):
                for k in range(n):
                    a[k]=(i*listenwert[k]+values[j][k])%2
                results.append(a)
    else:
        for i in range(2):
            for j in range(n):
                a[j]=(i*liste[0][j])
            results.append(a)
    print(results)
    return results
print(span_generator([[1,0],[0,1]],2)) gives following results
[[1, 0], [1, 0]]
[[1, 1], [1, 1], [1, 1], [1, 1]]
[[1, 1], [1, 1], [1, 1], [1, 1]]
instead of the expected: [[0,0],[1,0],[0,1],[1,1]]
Edit: I tried to simplify the program with itertools.product, but it didn't solve the problem.
import itertools

def span_generator(liste):
    n=len(liste[0])
    results=[]
    coeff=list(itertools.product(range(2), repeat=n))
    blank=[]
    for i in range(n):
        blank.append(0)
    for i in range(len(coeff)):
        a=blank
        for j in range(len(coeff[0])):
            for k in range(n):
                a[k]=(a[k]+coeff[i][j]*liste[j][k])%2
        results.append(a)
    return results
Output: span_generator([[0,1],[1,0]])
[[0, 0], [0, 0], [0, 0], [0, 0]]
But it should give [[0,0],[0,1],[1,0],[1,1]]
Another example: span_generator([[0,1,1],[1,1,0]]) should give [[0,0,0],[0,1,1],[1,1,0],[1,0,1]] (2=0 since i'm calculating modulo 2)
Coefficients
You can use itertools.product to generate the coefficients:
n = len(liste[0])
coefficients = itertools.product(range(2), repeat=len(liste))
yields an iterator with this content:
[(0, 0), (0, 1), (1, 0), (1, 1)]
Linear combinations
You can then multiply the coefficients with the transpose of your liste (transpose = list(zip(*liste))):
for coeff in coefficients:
    yield [sum((a * c) for a, c in zip(transpose[i], coeff)) for i in range(n)]
which takes, for each dimension (for i in range(n)), the sum of the products.
import itertools

def span_generator3(liste):
    n = len(liste[0])
    transpose = list(zip(*liste))
    coefficients = itertools.product(range(2), repeat=len(liste))
    for coeff in coefficients:
        # no modulo here: plain integer linear combinations
        yield [sum((a * c) for a, c in zip(transpose[i], coeff)) for i in range(n)]
This produces an iterator. If you want the result as a list, just call list() on the iterator.
Result
list(span_generator3([[1,2],[4,8]]))
output:
[[0, 0], [4, 8], [1, 2], [5, 10]]
Higher dimensions
list(sorted(span_generator3([[1,2, 4],[8, 16, 32], [64, 128, 256]])))
output:
[[0, 0, 0],
[1, 2, 4],
[8, 16, 32],
[9, 18, 36],
[64, 128, 256],
[65, 130, 260],
[72, 144, 288],
[73, 146, 292]]
Modulo 2
If you want the result modulo 2, that's just a matter of adding % 2 in the right place:
def span_generator3_mod2(liste):
    n = len(liste[0])
    transpose = list(zip(*liste))
    coefficients = itertools.product(range(2), repeat=len(liste))
    # print(list(itertools.product(range(2), repeat=len(liste))))
    for coeff in coefficients:
        yield [sum((a * c) for a, c in zip(transpose[i], coeff)) % 2 for i in range(n)]
list(span_generator3_mod2([[0,1,1],[1,1,0]])) gives
[[0, 0, 0], [1, 1, 0], [0, 1, 1], [1, 0, 1]]
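On the symptom in the question (every entry of results ends up equal to the last vector): `a = blank` does not copy the list, so each `results.append(a)` stores another reference to the same object. The generator above sidesteps this by building a new list for every combination. A minimal sketch of the effect, with names made up for the illustration:

```python
blank = [0, 0]

# Aliasing: `a` is the same list object every time, so all entries change together.
aliased = []
a = blank
for value in (1, 2, 3):
    a[0] = value
    aliased.append(a)
print(aliased)  # three references to one list

# Appending a fresh copy keeps each vector independent.
independent = []
a = [0, 0]
for value in (1, 2, 3):
    a[0] = value
    independent.append(list(a))
print(independent)
```

In the original functions, appending `list(a)` (and, in the second version, starting each iteration from `a = list(blank)`) would avoid the shared reference.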