How to sum over a Set of strings in Pyomo

I have a constraint that I called Prod_H2, and it depends on (i,s). This equation has a sum of some variables, FIJ (i,j) and FIK (i,k).
i=['U4241', 'U241', 'U241A']
HN_model.i=Set(initialize=[(len(i))])
j=['U4283', 'U283', 'U283A', 'U3283', 'U2280', 'U1280']
HN_model.j=Set(initialize=[(len(j))])
k=['PSA4241', 'PSA241', 'PSA241A', 'PSA3241']
HN_model.k=Set(initialize=[(len(k))])
s=[1]
HN_model.s=Set(initialize=range(len(s)))
HN_model.FIJ=Var(HN_model.i, HN_model.j, HN_model.s,domain=PositiveReals)
HN_model.FIW=Var(HN_model.i, HN_model.s, within=PositiveReals)
HN_model.FIK=Var(HN_model.i, HN_model.k, HN_model.s,within=PositiveReals)
HN_model.Prod_H2=Constraint(HN_model.i, HN_model.s, expr=sum(HN_model.FIJ[i,j] for j in [len(HN_model.j)]) + sum(HN_model.FIK for k in [len(HN_model.k)]) + HN_model.FIW)
I think the error occurs because the sum runs over j, while the constraint should stay indexed by i.
DeveloperError: Internal Pyomo implementation error: 'Unknown problem encountered when trying to retrieve index for component FIJ' Please report this to the Pyomo Developers.

As @AirSquid said in a comment, you are not constructing the model correctly. First of all, Pyomo allows an index to be of many datatypes, such as int, float, str or even datetime. So there's no problem using the actual values of your lists i, j, k and s. You do not need len to construct the sets or the constraint index.
Besides that, there are some other mistakes in the model. I'll try to point them out:
The constraint has no equality or inequality sign. In Pyomo you need either to use a relational operator (==, <= or >=) or to build the relation with a helper such as inequality (inequality(2, model.x) is equivalent to 2 <= model.x).
The variables are declared with more indices than you use when calling them in the constraint. For example:
HN_model.FIJ is declared with indices i,j,s, but the constraint uses only i,j.
Finally, I encourage you, as AirSquid suggested, to check some examples of how to model optimization problems correctly with Pyomo. If you need any help, StackOverflow is always there.
Here is an example of how to model the problem you posted:
from pyomo.environ import *
#set the model
HN_model = ConcreteModel()
i=['U4241', 'U241', 'U241A']
HN_model.i=Set(initialize=i)
j=['U4283', 'U283', 'U283A', 'U3283', 'U2280', 'U1280']
HN_model.j=Set(initialize=j)
k=['PSA4241', 'PSA241', 'PSA241A', 'PSA3241']
HN_model.k=Set(initialize=k)
s=[1]
HN_model.s=Set(initialize=s)
HN_model.FIJ=Var(HN_model.i, HN_model.j, HN_model.s,domain=PositiveReals)
HN_model.FIW=Var(HN_model.i, HN_model.s, within=PositiveReals)
HN_model.FIK=Var(HN_model.i, HN_model.k, HN_model.s,within=PositiveReals)
def constraint(HN_model, i, s):
    '''Constraint modeling. I assume you are using the >='''
    return sum(HN_model.FIJ[i,j,s] for j in HN_model.j) + sum(HN_model.FIK[i,k,s] for k in HN_model.k) + HN_model.FIW[i,s] >= 0
HN_model.Prod_H2=Constraint(HN_model.i, HN_model.s, rule=constraint)

Related

How to use conditions while operating on dataframes in julia

I am trying to find the mean of the values in one DataFrame column over the rows where either of two conditions is true. For example:
Using Statistics
df = DataFrame(value, xi, xj)
resulted_mean = []
for i in range(ncol(df))
push!(resulted_mean, mean(df[:value], (:xi == i | :xj == i)))
Here, I am checking when either xi or xj is equal to i then find the mean of the all the corresponding values stored in [:value] column. This mean will later be pushed to the array -> resulted_mean
However, this code is not producing the desired output.
Please suggest the optimal approach to fix this code snippet.
Thanks in advance.
I agree with Bogumił's comment, you should really consult the Julia documentation to get a basic understanding of the language, and then run through the DataFrames tutorials. I will however annotate your code to point out some of the issues so you might be able to target your learning a bit better:
Using Statistics
Julia (like most other languages) is case sensitive, so writing Using is not the same as the reserved keyword using, which is used to bring package definitions into your namespace. The relevant docs entry is here
Note also that you are using the DataFrames package, so to make your code reproducible you would have had to do using DataFrames, Statistics.
df = DataFrame(value, xi, xj)
It's unclear what this line is supposed to do as the arguments passed to the constructor are undefined, but assuming value, xi and xj are vectors of numbers, this isn't a correct way to construct a DataFrame:
julia> value = rand(10); xi = repeat(1:2, 5); xj = rand(1:2, 10);
julia> df = DataFrame(value, xi, xj)
ERROR: MethodError: no method matching DataFrame(::Vector{Float64}, ::Vector{Int64}, ::Vector{Int64})
You can read about constructors in the docs here, the most common approach for a DataFrame with only few columns like here would probably be:
julia> df = DataFrame(value = value, xi = xi, xj = xj)
10×3 DataFrame
Row │ value xi xj
│ Float64 Int64 Int64
─────┼────────────────────────
1 │ 0.539533 1 2
2 │ 0.652752 2 1
3 │ 0.481461 1 2
...
Then you have
resulted_mean = []
I would say in this case the overall approach of preallocating a vector and pushing to it in a loop isn't ideal as it adds a lot of verbosity for no reason (see below), but as a general remark you should avoid untyped arrays in Julia:
julia> resulted_mean = []
Any[]
Here the Any means that the array can hold values of any type (floating point numbers, integers, strings, probability distributions...), which means the compiler cannot anticipate what the actual content will be from looking at the code, leading to suboptimal machine code being generated. In doing so, you negate the main advantage that Julia has over e.g. base Python: the rich type system combined with a lot of compiler optimizations allow generation of highly efficient machine code while keeping the language dynamic. In this case, you know that you want to push the results of the mean function to the results vector, which will be a floating point number, so you should use:
julia> resulted_mean = Float64[]
Float64[]
That said, I wouldn't recommend pushing in a loop here at all (see below).
Your loop is:
for i in range(ncol(df))
...
A few issues with this:
Loops in Julia require an end, unlike in Python where their end is determined based on code indentation
range is a different function in Julia than in Python:
julia> range(5)
ERROR: ArgumentError: At least one of `length` or `stop` must be specified
You can learn about functions using the REPL help mode (type ? at the REPL prompt to access it):
help?> range
search: range LinRange UnitRange StepRange StepRangeLen trailing_zeros AbstractRange trailing_ones OrdinalRange AbstractUnitRange AbstractString
range(start[, stop]; length, stop, step=1)
Given a starting value, construct a range either by length or from start to stop, optionally with a given step (defaults to 1, a UnitRange). One of length or stop is required. If length, stop, and step are all specified, they must
agree.
...
So you'd need to do something like
julia> range(1, 5, step = 1)
1:1:5
That said, for simple ranges like this you can use the colon operator: 1:5 is the same as range(1, 5, step = 1).
You then iterate over integers from 1 to ncol(df) - you might want to check whether this is what you're actually after, as it seems unusual to me that the values in the xi and xj columns (on which you filter in the loop) would be related to the number of columns in your DataFrame (which is 3).
In the loop, you do
push!(resulted_mean, mean(df[:value], (:xi == i | :xj == i)))
which again has a few problems: first of all you are passing the subsetting condition for your DataFrame to the mean function, which doesn't work:
julia> mean(rand(10), rand(Bool, 10))
ERROR: MethodError: objects of type Vector{Float64} are not callable
The subsetting condition itself has two issues as well: when you write :xi, there is no way for Julia to know that you are referring to the DataFrame column xi, so all you're doing is comparing the Symbol :xi to the value of i, which will always return false:
julia> :xi == 2
false
Furthermore, note that | has a higher precedence than ==, so if you want to combine two equality checks with or you need brackets:
julia> 1 == 1 | 2 == 2
false
julia> (1 == 1) | (2 == 2)
true
More things could be said about your code snippet, but I hope this gives you an idea of where your gaps in understanding are and how you might go about closing them.
For completeness, here's how I would approach your problem - I'm interpreting your code to mean "calculate the mean of the value column, grouped by each value of xi and xj, but only where xi equals xj":
julia> combine(groupby(df[df.xi .== df.xj, :], [:xi, :xj], sort = true), :value => mean => :resulted_mean)
2×3 DataFrame
Row │ xi xj resulted_mean
│ Int64 Int64 Float64
─────┼─────────────────────────────
1 │ 1 1 0.356811
2 │ 2 2 0.977041
This is probably the most common analysis pattern for DataFrames, and is explained in the tutorial that Bogumił mentioned as well as in the DataFrames docs here.
As I said up front, if you want to use Julia productively, I recommend that you spend some time reading the documentation both for the language itself as well as for any of the key packages you're using. While Julia has some similarities to Python, and some bits in the DataFrames package have an API that resemble things you might have seen in R, it is a language in its own right that is fundamentally different from both Python and R (or any other language for that matter), and there's no way around familiarizing yourself with how it actually works.

On a dataset made up of dictionaries, how do I multiply the elements of each dictionary with Python?

I started coding in Python 4 days ago, so I'm a complete newbie. I have a dataset that comprises an undefined number of dictionaries. Each dictionary is the x and y of a point in the coordinates.
I'm trying to compute the sum of the products x*y by nesting the loop that multiplies x and y within the loop that sums the products.
However I haven't been able to figure out how to multiply the values for the two keys in each dictionary (so far I only got to multiply all the x*y)
So far I've got this:
If my data set were to be d= [{'x':0, 'y':0}, {'x':1, 'y':1}, {'x':2, 'y':3}]
I've got the code for the function that calculates the product of each pair of x and y:
def product_xy (product_x_per_y):
    prod_xy =[]
    n = 0
    for i in range (len(d)):
        result = d[n]['x']*d[n]['y']
        prod_xy.append(result)
        n+1
    return prod_xy
I also have the function to add up the elements of a list (like prod_xy):
def total_xy_prod (sum_prod):
    all = 0
    for s in sum_prod:
        all+= s
    return all
I've been trying to find a way to nest these two functions so that I can iterate through the multiplication of each x*y and then add up all the products.
Make sure your code works as expected
First, your functions have a few mistakes. For example, in product_xy, you assign n=0, and later do n + 1; you probably meant to do n += 1 instead of n + 1. But n is also completely unnecessary; you can simply use the i from the range iteration to replace n like so: result = d[i]['x']*d[i]['y']
Nesting these two functions: part 1
To answer your question, it's fairly straightforward to get the sum of the products of the elements from your current code:
coord_sum = total_xy_prod(product_xy(d))
Nesting these two functions: part 2
However, there is a much shorter and more efficient way to tackle this problem. For one, Python provides the built-in function sum() to sum the elements of a list (and other iterables), so there's no need to create total_xy_prod. Our code could at this point read as follows:
coord_sum = sum(product_xy(d))
But product_xy is also unnecessarily long and inefficient, and we could also replace it entirely with a shorter expression. In this case, the shortening comes from generator expressions, which are basically compact for-loops. The Python docs give some of the basic details of how the syntax works at list comprehensions, which are distinct, but closely related to generator expressions. For the purposes of answering this question, I will simply present the final, most simplified form of your desired result:
coord_sum = sum(e['x'] * e['y'] for e in d)
Here, the generator expression iterates through every element in d (using for e in d), multiplies the numbers stored in the dictionary keys 'x' and 'y' of each element (using e['x'] * e['y']), and then sums each of those products from the entire sequence.
There is also some documentation on generator expressions, but it's a bit technical, so it's probably not approachable for the Python beginner.
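As a quick check, the approaches above can be compared side by side. This is a minimal sketch using the sample data d from the question; the helper name product_xy_fixed and the variable names coord_sum_loop and coord_sum_gen are made up for illustration.

```python
# Sample data from the question.
d = [{'x': 0, 'y': 0}, {'x': 1, 'y': 1}, {'x': 2, 'y': 3}]

def product_xy_fixed(points):
    """Return the list of x*y products, one per dictionary (hypothetical helper)."""
    products = []
    for point in points:
        products.append(point['x'] * point['y'])
    return products

# Explicit loop plus the built-in sum().
coord_sum_loop = sum(product_xy_fixed(d))

# Generator expression: same result without building an intermediate list.
coord_sum_gen = sum(e['x'] * e['y'] for e in d)

print(product_xy_fixed(d))            # [0, 1, 6]
print(coord_sum_loop, coord_sum_gen)  # 7 7
```

Both versions give the same total; the generator expression simply avoids the intermediate list and the extra function.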

Why is my merge sort algorithm not working?

I am implementing the merge sort algorithm in Python. Previously, I have implemented the same algorithm in C, it works fine there, but when I implement in Python, it outputs an unsorted array.
I've already rechecked the algorithm and code, but to my knowledge the code seems to be correct.
I think the issue is related to the scope of variables in Python, but I don't have any clue for how to solve it.
from random import shuffle
# Function to merge the arrays
def merge(a,beg,mid,end):
    i = beg
    j = mid+1
    temp = []
    while(i<=mid and j<=end):
        if(a[i]<a[j]):
            temp.append(a[i])
            i += 1
        else:
            temp.append(a[j])
            j += 1
    if(i>mid):
        while(j<=end):
            temp.append(a[j])
            j += 1
    elif(j>end):
        while(i<=mid):
            temp.append(a[i])
            i += 1
    return temp
# Function to divide the arrays recursively
def merge_sort(a,beg,end):
    if(beg<end):
        mid = int((beg+end)/2)
        merge_sort(a,beg,mid)
        merge_sort(a,mid+1,end)
        a = merge(a,beg,mid,end)
    return a
a = [i for i in range(10)]
shuffle(a)
n = len(a)
a = merge_sort(a, 0, n-1)
print(a)
To make it work you need to change merge_sort declaration slightly:
def merge_sort(a,beg,end):
    if(beg<end):
        mid = int((beg+end)/2)
        merge_sort(a,beg,mid)
        merge_sort(a,mid+1,end)
        a[beg:end+1] = merge(a,beg,mid,end) # < this line changed
    return a
Why:
temp is constructed with exactly end-beg+1 elements, but a is the full initial array; if you managed to replace all of a with temp, it would break quickly. Therefore we take a "slice" of a and replace only the values in that slice.
Why not:
Luckily, your a was not getting replaced, because of Python's inner workings. That is a bit tricky to explain, but I'll try.
Every variable in Python is a reference. a is a reference to a list of variables a[i], which are in turn references to constant data in memory.
When you pass a to a function, it makes a new local variable a that points to the same list of variables. That means that when you reassign it as a = <something>, it only changes where the local a points. You can only pass changes outside either via "slices" or via a return statement.
Why "slices" work:
Slices are tricky. As I said, a points to an array of other variables (basically a[i]), which in turn are references to constant data in memory. When you reassign a slice, it goes through the slice element by element and changes where those individual variables are pointing, but as a inside and outside the function still point to the same list, the changes go through.
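The difference between rebinding a name and assigning to a slice can be seen in a small sketch. The function names rebind and assign_slice are made up for illustration and are not part of the code in the question.

```python
def rebind(a):
    # Rebinds the local name only; the caller's list is untouched.
    a = sorted(a)

def assign_slice(a):
    # Replaces the elements of the existing list object in place.
    a[:] = sorted(a)

data = [3, 1, 2]

rebind(data)
after_rebind = list(data)

assign_slice(data)
after_slice = list(data)

print(after_rebind)  # [3, 1, 2]
print(after_slice)   # [1, 2, 3]
```

This is the same mechanism that makes a[beg:end+1] = merge(a,beg,mid,end) visible to the caller, while a = merge(a,beg,mid,end) is not.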
Hope it makes sense.
You don't use the results of the recursive merges, so you essentially report the result of the merge of the two unsorted halves.

Increment variable array elements in Minizinc

I would like to perform a simple increment operation on specific array elements:
Minimal Not-Working Example:
array[1..2] of var 0..1: a = [0, 0];
constraint forall (i in 1..2) (
a[i] = a[i] + 1
);
output ["\(a)"];
solve satisfy;
This produces the minizinc output
WARNING: model inconsistency detected
stack.mzn:3:
in call 'forall'
in array comprehension expression
with i = 1
stack.mzn:4:
in binary '=' operator expression
=====UNSATISFIABLE=====
% stack.fzn:1: warning: model inconsistency detected before search.
Why is this an inconsistency in the model -- why can't I reference the old value of the current array element? Is there some other way to increase the current array element by 1?
I'm new to constraint solving, so I hope this is not a terribly stupid question.
It is important to know that MiniZinc is a declarative language. In a constraint you're not stating an instruction, but you're stating the "truth" as known to the solver.
That means that an instruction like a = a + 1 will not work, because you are stating that we're looking for a value for a that is equal to its own value + 1. Since no such value exists, we call the model inconsistent: no solutions can be found.
The idea of the constraint items is to express relations between different variables and parameters. You could for example write: constraint forall(i in N) (a[i] = a[i-1] + 1). This will mean we will look for a value a[i] which is 1 more than a[i-1] for all i in N. (Note that we should probably add an if-statement to make sure i-1 stays within the given bounds)
As a general rule: if a variable appears on one side of an equals sign, using that same variable on the other side will usually create an inconsistent model.
If you still wanted to create a MiniZinc model that increases the values of a given array by one, you could use the following model:
set of int: N = 1..2;
array[N] of int: a = [0,1];
array[N] of var int: b;
constraint forall(i in N) (
  b[i] = a[i] + 1
);
solve satisfy;
Since the variables b are now expressed in terms of the parameters a, and no variable is defined in terms of itself, this doesn't violate our rule.
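To see why a[i] = a[i] + 1 has no solution while b[i] = a[i] + 1 does, the declarative reading can be imitated with a brute-force search. This is only an illustrative Python sketch: it is not how MiniZinc or its solvers work internally, the helper name solutions is made up, and the domain 0..3 for b is an assumption chosen to keep the search small.

```python
from itertools import product

def solutions(domain, n, constraint):
    """Enumerate all assignments of n variables over domain that satisfy constraint."""
    return [vals for vals in product(domain, repeat=n) if constraint(vals)]

# "a[i] = a[i] + 1" read as a relation: no value equals itself plus one.
self_ref = solutions(range(0, 2), 2,
                     lambda v: all(v[i] == v[i] + 1 for i in range(2)))

# "b[i] = a[i] + 1" with fixed parameters a = [0, 1]: exactly one assignment of b fits.
a = [0, 1]
fresh = solutions(range(0, 4), 2,
                  lambda b: all(b[i] == a[i] + 1 for i in range(2)))

print(self_ref)  # []
print(fresh)     # [(1, 2)]
```

The empty first result corresponds to the UNSATISFIABLE output in the question; the second result corresponds to the model with the separate array b.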

Sympy solver bug in a for loop?

So I'm playing with Sympy in an effort to build a generic solver/generator of physics problems. One component is that I'm going for a function that will take kwargs and, according to what it got, rearrange the equation and substitute values in it. Thanks to SO, I managed to find the things I need for that.
However..... I've tried putting sympy.solve in a for loop to generate all those expressions and I've run into.... something.
import sympy
R, U, I, eq = sympy.symbols('R U I eq')
eq = R - U/I
for x in 'RUI':
    print(x)
    print(sympy.solve(eq, x))
The output?
R
[U/I]
U
[I*R]
I
[]
However, whenever I do sympy.solve(eq, I) it works and returns [U/R].
Now, I'm guessing the issue is with sympy using I for imaginary unit and with variable hiding in blocks, but even when I transfer the symbol declaration inside the for loop (and equation as well), I still get the same problem.
I'm not sure I'll need this badly in the end, but this is interesting to say the least.
It's more like an undocumented feature than a bug. The loop for x in 'RUI' is equivalent to for x in ['R', 'U', 'I'], meaning that x runs over one-character strings, not sympy symbols. Insert print(type(x)) in the loop to see this. And note that sympy.solve(eq, 'I') returns [].
The loop for x in [R, U, I] solves correctly for each variable. This is the right way to write this loop.
The surprising thing is that you get anything at all when passing a string as the second argument of solve. Sympy documentation does not list strings among acceptable arguments. Apparently, it tries to coerce the string to a sympy object and does not always guess your meaning correctly: works with sympy.solve(eq, 'R') but not with sympy.solve(eq, 'I')
The issue is that some sympy functions "accidentally" work with strings as input because they call sympify on their input. But sympify('I') gives the imaginary unit (sqrt(-1)), not Symbol('I').
You should always define your symbols explicitly like
R, U, I = symbols("R U I")
and use those instead of strings.
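A short sketch of the difference, assuming SymPy is installed. The variable names parsed and solutions are made up for illustration; the equation is the one from the question.

```python
import sympy

R, U, I = sympy.symbols("R U I")
eq = R - U / I

# Parsing the string 'I' yields the imaginary unit, not the symbol defined above.
parsed = sympy.sympify("I")
print(parsed == sympy.I)  # True
print(parsed == I)        # False

# Looping over the Symbol objects solves for each variable in turn.
solutions = {sym: sympy.solve(eq, sym) for sym in (R, U, I)}
print(solutions)
```

Using import sympy rather than from sympy import * also keeps the name I free for your own symbol, so it never collides with sympy.I.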
See https://github.com/sympy/sympy/wiki/Idioms-and-Antipatterns#strings-as-input for more information on why you should avoid using strings with SymPy.
