Pruned Dynamic Programming - dynamic-programming

I'm currently working on a string alignment comparison, and I'm confused about how to optimize DP by pruning.
DP can be represented as a matrix/table whose start point is (0, 0). Suppose the element at (3, 4) is pruned and its value is marked as -1 or null. When I compute locations (4, 4), (3, 5) and (4, 5), I still need an if-statement to check whether the value at (3, 4) is invalid (pruned) or valid (not pruned). Can this implementation actually save time, given that the pruning check itself adds running time?
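One common way to make pruning pay off is to never visit the pruned cells at all, instead of visiting every cell and testing a flag. The sketch below is only an illustration under assumptions: it uses plain edit distance as a stand-in for the alignment score (the actual scoring is not given here), and the function name `banded_edit_distance` and the parameter `band` are hypothetical. Cells with |i - j| > band are simply outside the loop range, so the only remaining check is a cheap lookup at the band boundary.

```python
def banded_edit_distance(a, b, band):
    """Edit distance computed only on cells with |i - j| <= band.

    Returns the distance if it is <= band, otherwise None
    (the true distance is then larger than the band allows).
    """
    INF = float("inf")
    n, m = len(a), len(b)
    if abs(n - m) > band:
        return None  # the final cell lies outside the band

    # Row 0: only columns inside the band are stored.
    prev = {j: j for j in range(0, min(m, band) + 1)}
    for i in range(1, n + 1):
        cur = {}
        lo, hi = max(0, i - band), min(m, i + band)
        for j in range(lo, hi + 1):  # pruned cells are never visited
            if j == 0:
                cur[j] = i
                continue
            best = prev.get(j - 1, INF) + (a[i - 1] != b[j - 1])  # match/substitute
            best = min(best, prev.get(j, INF) + 1)                # delete from a
            best = min(best, cur.get(j - 1, INF) + 1)             # insert into a
            cur[j] = best
        prev = cur

    result = prev.get(m, INF)
    return result if result <= band else None


print(banded_edit_distance("kitten", "sitting", 3))  # 3
print(banded_edit_distance("kitten", "sitting", 1))  # None
```

Whether this is faster in practice depends on how many cells the band excludes compared with the cost of the boundary lookups, so it is worth measuring on realistic inputs.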

Related

Formulating Constraints for an optimization Problem

Good day
I would be very glad for some help.
I am currently writing my master thesis. I have the following mixed integer linear optimization:
Optimization Problem
Sets and parameters
Finally, I want to minimize the start time of the last activity (dummy variable, end). Each activity has different modes. The activities in the different modes can take different time.
Now the task is to find a start solution with simple constraints, which does not have to be optimal yet. The parameters and variables are already defined and approved.
My idea would be for now the following constraints:
for each activity the mode with the shortest time duration is used.
activity i must be finished before activity j starts
PROBLEM 1 - 1st constraint:
I created a list of the activities, each with its minimum duration over all modes, sorted by that duration:
p_im_min = {i: np.min([p[i,m] for m in M_i[i]]) for i in V}
p_im_min[0] = 0
p_im_min[n+1] = 0
p_sort = list(sorted(p_im_min.items(), key = lambda kv: kv[1]))
p_sort = [(3, 1), (4, 1), (5, 1), (7, 1), (13, 1), (14, 1), (15, 1), (19, 1), (1, 2), (2, 2), (8, 2), (16, 2), (17, 2), (18, 2), (20, 2), (6, 3), (10, 3), (9, 4), (12, 4), (11, 5)]
with (i,m) -> i for the activity and m for the mode.
The variable x is already defined in my code as:
x = mdl.addVars([(i, m)
                 for i in V_ext
                 for m in M_i[i]],
                vtype=grb.GRB.BINARY)
so x[i,m] = 1 if activity i is executed in mode m, and 0 otherwise.
Then I tried to add the constraint:
mdl.addConstrs(x[i,m] == 1
for (i,m) in p_sort)
But in doing so, I get the error message "Variable not in model". But I defined x, didn't I?
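One thing the printed p_sort suggests: because p_im_min maps each activity to a duration, the pairs in p_sort are (activity, shortest duration) rather than (activity, mode). Whether that alone explains the error cannot be determined from the snippet, but a minimal sketch of picking the mode itself might look like the following. It uses plain Python and small made-up values for `p` and `M_i` (hypothetical data, not the thesis instance), and the names `fastest_mode` and `fixed_pairs` are introduced only for illustration.

```python
# Hypothetical toy data: M_i[i] lists the modes of activity i,
# p[i, m] is the duration of activity i in mode m.
M_i = {1: [1, 2], 2: [1, 2, 3], 3: [1]}
p = {(1, 1): 4, (1, 2): 2,
     (2, 1): 3, (2, 2): 5, (2, 3): 3,
     (3, 1): 1}

# For each activity, pick the mode with the shortest duration
# (on ties, min() keeps the first mode listed).
fastest_mode = {i: min(M_i[i], key=lambda m: p[i, m]) for i in M_i}

# (activity, mode) pairs: these match the index set of a variable x[i, m].
fixed_pairs = sorted(fastest_mode.items())
print(fixed_pairs)  # [(1, 2), (2, 1), (3, 1)]
```

Pairs of this form could then be used wherever the constraint expects an (i, m) key.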
PROBLEM 2 - 2nd constraint:
The variable y is already defined in my code as:
y = mdl.addVars([(i, j)
                 for i in V_ext
                 for j in V_ext
                 if i != j],
                vtype=grb.GRB.BINARY)
so y[i,j] = 1 if activity i must be completed before the start of activity j, and 0 otherwise.
Admittedly a bit unimaginative I created the list, for the activities (couldn't figure out a better way):
order_act = list[(3,4),
(4,5),
(5,7),
(7, 13),
(13,14),
(14, 15),
(15, 19),
(19, 1),
(1,2),
(2,8),
(8,16),
(16,17),
(17,18),
(18,20),
(20,6),
(6, 10),
(10,9),
(9,12),
(12,11)]
So, for example, (3,4) -> 3 (i) must be finished before 4 (j) starts.
My idea for the constraint would have been the following:
mdl.addConstrs(y[i,j] == 1 for (i,j) in order_act)
Again, I get an error message: "'types.GenericAlias' object is not iterable". Why is the object not iterable?
Can anyone help me with the problems? Where are my thinking errors? Probably it is totally easy for all of you, but unfortunately I am still a Python beginner, so I'd be really thankful for some help.
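On the second error: in Python 3.9 and later, writing square brackets directly after `list` subscripts the type (as in the type hint `list[int]`) and produces a `types.GenericAlias` object; it does not build a list. A short sketch with a shortened version of the pairs above:

```python
import types

# Brackets directly after `list` subscript the type; no list is created.
alias = list[(3, 4), (4, 5)]
print(type(alias))  # <class 'types.GenericAlias'>

# A list literal gives the intended list of (i, j) pairs.
order_act = [(3, 4), (4, 5), (5, 7)]
for i, j in order_act:
    print(i, "must finish before", j, "starts")
```

Dropping the word `list` in front of the opening bracket (or calling `list(...)` on an iterable) is enough to get an ordinary list that can be looped over.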
I'm not familiar with the package you are using, but let me suggest PuLP, another MILP package that is, in my opinion, easier to use. PuLP can also call specific solvers (Gurobi, CPLEX and others). With it, the variable definition would look like this:
pairs = [(i, m) for i in V_ext for m in M_i[i]]
x = pulp.LpVariable.dicts('x', pairs, cat='Binary')
Trying another package may also give you additional clues, such as whether a linking constraint or something similar is needed.

Condense list of nested tuples

I have an assignment that I have successfully solved using defaultdict(list).
In a nutshell, take two pairs of points (Ax, Ay) and (Bx, By) and compute the slope.
Then combine all points that have the same slope together.
Using defaultdict(list) I did this:
dic = defaultdict(list)
for elem in result:
    x1 = elem[0][0]
    y1 = elem[0][1]
    x2 = elem[1][0]
    y2 = elem[1][1]
    si = slope_intercept(x1, y1, x2, y2)
    temp = defaultdict(list)
    temp[si].append(elem)
    FullMergeDict(dic, temp)
    temp.clear()
Works perfectly. (Yes, there's a lot more to the whole program not shown.)
However, I am being told to discard defaultdict(list) and that I must use a nested tuple based structure.
I have a list of tuples where the structure looks like: (((1, 2), 3), (2, 5))
(1, 2) is the first coordinate point
3 is the computed slope
(2, 5) is the second coordinate point
NOTE: These are just made-up values to illustrate the structure. The points almost certainly will not generate the shown slopes.
If I start with this:
start = [(((1, 2), 3), (2, 5)), (((4, 5), 2), (3, 7)), (((2, 4), 1), (8, 9)), (((1, 2), 3), (4, 8))]
I need to end up with this:
end = [((1, 2), (2, 5), (1, 2), (4, 8)), ((4, 5), (3, 7)), ((2, 4), (8, 9))]
For every unique slope, I need a tuple of all the coordinates that share that same slope.
In the above example, the first and last tuples share the same slope, 3, so all pairs of coordinates with slope 3 are combined into one tuple. Yes, I realize that (1, 2) is represented twice in my example. If there were another set of coordinates with slope 3, then the first tuple would contain those additional coordinates, including duplicates. Note that the embedded slope from 'start' is discarded.
defaultdict(list) made this quite straightforward. I made the key the slope and then merged the values (coordinates).
I can't seem to work through how to transform 'start' into 'end' using this required structure.
I'm not sure what you mean by "I must use the structure detailed above". You have start, you want end, so at some point there is a change to the structure. Do you mean that you are not allowed to use a dictionary or a list at all? How does your instructor expect that you go from start to end without using anything else? Here's an approach that uses only tuples (and the start and end lists).
end will be a list of tuples. We'll keep track of the slopes in a separate list. Expect end and lookup to look like so:
lookup = [slope_1, slope_2, ...]
end = [((p1_x, p1_y), (p2_x, p2_y), ...), ((p10_x, p10_y), (p11_x, p11_y)), ...]
start = [(((1, 2), 3), (2, 5)), (((4, 5), 2), (3, 7)), (((2, 4), 1), (8, 9)), (((1, 2), 3), (4, 8))]
end = []
lookup = []
def find_tuple_index_with_slope(needle_slope):
    # Linear search through lookup for the position of this slope
    for index, item in enumerate(lookup):
        if item == needle_slope:
            return index
    return None

for item in start:
    p1 = item[0][0]
    slope = item[0][1]
    p2 = item[1]
    # Check if end already contains this slope
    slope_index = find_tuple_index_with_slope(slope)
    if slope_index is None:
        # If it doesn't exist, add an item to end
        end.append((p1, p2))
        # And add the slope to lookup
        lookup.append(slope)
    else:
        # If it exists, append the new points to the existing value and
        # reassign it to the correct index of end
        end[slope_index] = (*end[slope_index], p1, p2)
Now, we have end looking like so:
[((1, 2), (2, 5), (1, 2), (4, 8)), ((4, 5), (3, 7)), ((2, 4), (8, 9))]
The reason this approach isn't great is that the function find_tuple_index_with_slope() needs to iterate over the elements of lookup to find the index of the tuple to append to. This increases the time complexity of the code, whereas a dictionary lookup would be much faster, especially if you have lots of points and lots of distinct slope values.
A better way: replace the lookup function with a new dictionary, where the keys are the values of slope, and the values are the indices in end where the corresponding tuple is stored.
lookup = dict()
end = []
for item in start:
    p1 = item[0][0]
    slope = item[0][1]
    p2 = item[1]
    # Find the index of the tuple for `slope` using the lookup
    slope_index = lookup.get(slope, None)
    if slope_index is None:
        # If it doesn't exist, add an item to end
        end.append((p1, p2))
        # And add that index to lookup
        lookup[slope] = len(end) - 1
    else:
        end[slope_index] = (*end[slope_index], p1, p2)
The code looks almost the same as before, but looking up using a dictionary instead of a list is what saves you time.
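A variation on the same idea, as a sketch only: each `(*end[slope_index], p1, p2)` builds a brand-new tuple, so when many points share a slope it may be cheaper to collect the points in lists keyed by slope and convert to tuples once at the end. This assumes a dictionary is allowed as an intermediate helper, and the function name `group_points_by_slope` is made up for illustration.

```python
def group_points_by_slope(records):
    """Group points from (((x1, y1), slope), (x2, y2)) records by slope."""
    groups = {}  # slope -> list of points; dicts keep first-seen order
    for (p1, slope), p2 in records:
        groups.setdefault(slope, []).extend((p1, p2))
    # Convert each group to a tuple once, discarding the slope keys.
    return [tuple(points) for points in groups.values()]


start = [(((1, 2), 3), (2, 5)), (((4, 5), 2), (3, 7)),
         (((2, 4), 1), (8, 9)), (((1, 2), 3), (4, 8))]
end = group_points_by_slope(start)
print(end)
# [((1, 2), (2, 5), (1, 2), (4, 8)), ((4, 5), (3, 7)), ((2, 4), (8, 9))]
```

If the slopes are floating-point values, two slopes that are mathematically equal may differ slightly after computation and land in different groups, so some rounding or an exact representation may be needed.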

How to pass list of tuples through a object method in python

I'm having a frustrating issue: I want to pass the tuples in the following list through a method on a list of instances of a class that I have created.
list_1=[(0, 20), (10, 1), (0, 1), (0, 10), (5, 5), (10, 50)]
instances=[instance[0], instance[1],...instance[n]]
results=[]
pos_list=[]
for i in range(len(list_1)):
    a,b=list_1[i]
    result=sum(instance.method(a,b) for instance in instances)
    results.append(result)
    if result>=0:
        pos_list.append((a,b))
print(results)
print(pos_list)
The issue is that all instances are taking the same tuple, whereas I want the method on the first instance to take the first tuple, and so on.
I ultimately want it to append to the new list (pos_list) if the sum is >0.
Does anyone know how I can iterate this properly?
EDIT
It will make it clearer if I print the result of the sum also.
Basically I want the sum to perform as follows:
result = instance[0].method(0,20), instance[1].method(10,1), instance[2].method(0,1), instance[3].method(0,10), instance[4].method(5,5), instance[5].method(10,50)
For info the method is just the +/- product of the two values depending on the attributes of the instance.
So results for above would be:
result = [0*20 - 10*1 - 0*1 + 0*10 - 5*5 + 10*50] = [465]
pos_list=[(0, 20), (10, 1), (0, 1), (0, 10), (5, 5), (10, 50)]
except what is actually doing is using the same tuple for all instances like this:
result = instance[0].method(0,20), instance[1].method(0,20), instance[2].method(0,20), instance[3].method(0,20), instance[4].method(0,20), instance[5].method(0,20)
result = [0*20 - 0*20 - 0*20 + 0*20 - 0*20 + 0*20] = [0]
pos_list=[]
and so on for (10,1) etc.
How do I make it work like the first example?
You can compute your sum using zip to pair each instance with its corresponding tuple.
result=sum(instance.payout(*t) for instance, t in zip(instances, List_1))
zip stops as soon as it reaches the end of the shorter of the two iterables. So if you have 10 instances and 100 tuples, zip will produce only 10 pairs, using the first 10 elements of both lists.
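As a rough, self-contained illustration of the pairing: the class `Position`, its `sign` attribute and the `signs` list below are hypothetical stand-ins, with the signs inferred from the arithmetic shown in the question, since the real class is not shown.

```python
class Position:
    """Hypothetical stand-in: returns + or - the product of its two inputs."""

    def __init__(self, sign):
        self.sign = sign

    def method(self, a, b):
        return self.sign * a * b


list_1 = [(0, 20), (10, 1), (0, 1), (0, 10), (5, 5), (10, 50)]
signs = [+1, -1, -1, +1, -1, +1]  # assumed from the sums in the question
instances = [Position(s) for s in signs]

# zip hands the first tuple to the first instance, the second to the second, ...
total = sum(inst.method(a, b) for inst, (a, b) in zip(instances, list_1))
print(total)  # 465

# With inputs of unequal length, zip silently stops at the shorter one.
print(len(list(zip(instances[:2], list_1))))  # 2
```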
The problem I see in your code is that you compute this sum once for each element of List_1, so if payout always produces the same result for the same inputs (i.e., it has no memory or randomness), result will be the same at each iteration. In the end, results will contain the same value repeated len(List_1) times, while pos_list will contain either all of the input tuples (if the sum is non-negative) or none of them (if it is negative).
Instead, it would make sense if items of List_1 were lists or tuples themselves:
List_1 = [
[(0, 1), (2, 3), (4, 5)],
[(6, 7), (8, 9), (10, 11)],
[(12, 13), (14, 15), (16, 17)],
]
So, in this case, supposing that your class for instances is something like this:
class Goofy:
    def __init__(self, positive_sum=True):
        self.positive_sum = positive_sum
    def payout(self, *args):
        if self.positive_sum:
            return sum(args)
        else:
            return -1 * sum(args)
instances = [Goofy(i) for i in [True, True, False]]
you can rewrite your code in this way:
results=[]
pos_list=[]
for el in List_1:
    result = sum(g.payout(*t) for g, t in zip(instances, el))
    results.append(result)
    if result >= 0:
        pos_list.append(el)
Running the previous code, results will be:
[-3, 9, 21]
while pos_list will be:
[[(6, 7), (8, 9), (10, 11)], [(12, 13), (14, 15), (16, 17)]]
If you are interested only in pos_list, you can compact your code into a single line:
pos_list = list(filter(lambda el: sum(g.payout(*t) for g, t in zip(instances, el)) >= 0, List_1))
Many thanks for the above! I have it working now.
I wasn't able to use *args, given that my method had a bit more to it, but the use of zip is what made it click.
import random
rand=random.choices(list_1, k=len(instances))
results=[]
pos_list=[]
for r in rand:
    x,y=r
    result=sum(instance.method(x,y) for instance,(x,y) in zip(instances, rand))
    results.append(result)
    if result>=0:
        pos_list.append(rand)
print(results)
print(pos_list)
For a list such as
rand=[(20, 5), (0, 2), (0, 100), (2, 50), (5, 10), (50, 100)]
this returns the following
results=[147]
pos_list=[(20, 5), (0, 2), (0, 100), (2, 50), (5, 10), (50, 100)]
so exactly what I wanted. Thanks again!

Selecting sublists of a list of lists to define a relation

If I happen to have the following list of lists:
L=[[(1,3)],[(1,3),(2,4)],[(1,3),(1,4)],[(1,2)],[(1,2),(1,3)],[(1,3),(2,4),(1,2)]]
and what I wish to do, is to create a relation between lists in the following way:
I wish to say that
[(1,3)] and [(1,3),(1,4)]
are related, because the first is a sublist of the second, but then I would like to add this relation into a list as:
Relations=[([(1,3)],[(1,3),(1,4)])]
but, we can also see that:
[(1,3)] and [(1,3),(2,4)]
are related, because the first is a sublist of the second, so I would want this to also be a relation added into my Relations list:
Relations=[([(1,3)],[(1,3),(1,4)]),([(1,3)],[(1,3),(2,4)])]
The only thing I wish to be careful with, is that I am considering for a list to be a sublist of another if they only differ by ONE element. So in other words, we cannot have:
([(1,3)],[(1,3),(2,4),(1,2)])
as an element of my Relations list, but we SHOULD have:
([(1,3),(2,4)],[(1,3),(2,4),(1,2)])
as an element in my Relations list.
I hope there is an optimal way to do this, since in the original context I have to deal with a much bigger list of lists.
Any help given is much appreciated.
You really haven't provided enough information, so can't tell if you need itertools.combinations() or itertools.permutations(). Your examples work with itertools.combinations so will use that.
If x and y are two elements of the list, then you want all occurrences where set(x).issubset(y) and the set difference has at most one element, i.e. len(set(y) - set(x)) <= 1 (use == 1 if identical lists must not count as related), e.g.:
In []:
import itertools as it
[[x, y] for x, y in it.combinations(L, r=2) if set(x).issubset(y) and len(set(y)-set(x)) <= 1]
Out[]:
[[[(1, 3)], [(1, 3), (2, 4)]],
[[(1, 3)], [(1, 3), (1, 4)]],
[[(1, 3)], [(1, 2), (1, 3)]],
[[(1, 3), (2, 4)], [(1, 3), (2, 4), (1, 2)]],
[[(1, 2)], [(1, 2), (1, 3)]],
[[(1, 2), (1, 3)], [(1, 3), (2, 4), (1, 2)]]]
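Since the question mentions a much bigger list, here is a possible alternative that avoids comparing every pair, offered as a sketch rather than a tuned solution. It indexes each list by its set of elements, then for every list removes one element at a time and looks up the remainder. The function name `one_element_relations` is hypothetical, the result uses the (smaller, larger) tuple form from the question, and it requires a difference of exactly one element regardless of the order of the lists in L.

```python
from collections import defaultdict


def one_element_relations(lists):
    """Return (smaller, larger) pairs where larger has exactly one extra element."""
    by_set = defaultdict(list)  # frozenset of elements -> original lists
    for lst in lists:
        by_set[frozenset(lst)].append(lst)

    relations = []
    for larger in lists:
        larger_set = frozenset(larger)
        for element in larger_set:
            # Any list equal to larger minus this one element is related to it.
            for smaller in by_set.get(larger_set - {element}, []):
                relations.append((smaller, larger))
    return relations


L = [[(1, 3)], [(1, 3), (2, 4)], [(1, 3), (1, 4)],
     [(1, 2)], [(1, 2), (1, 3)], [(1, 3), (2, 4), (1, 2)]]
Relations = one_element_relations(L)
print(len(Relations))  # 6
```

The work grows with the total number of elements rather than with the number of pairs of lists, though the order of the pairs in the result is not the same as with the combinations approach.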

how to get max value in spark rdd and remove it?

There is an RDD object:
//have some data in RDD[(Int, Int)] object
(1, 2)
(3, 2)
(2, 3)
(5, 4)
(2, 7)
(5, 2)
(5, 7)
I want to get the max key and remove it. The max key is 5, so the result I want is:
//a new RDD object,RDD[(Int, Int)]
(1, 2)
(3, 2)
(2, 3)
(2, 7)
Could you help me? Thank you!
You can use RDD.max() on the keys to get the highest key (sorting first is not required), and then use filter to keep only the pairs whose key differs from it, for example rdd.filter(_._1 != maxKey) with maxKey = rdd.keys.max().
or
You can also register this as DataFrame and execute simple SQL query to get the results.
