How to use scipy's fminbound function when function returns a dictionary - python-3.x

I have simple function for which I want to calculate the minimum value
from scipy import optimize
def f(x):
return x**2
optimize.fminbound(f, -1, 2)
This works fine. Now I modify above function which now returns a dictionary
def f1(x):
y = x ** 2
return_list = {}
return_list['x'] = x
return_list['y'] = y
return return_list
While it is returning multiple objects x and y, I want to apply fminbound only on y, the other object x is just for informational purpose for other use of this function.
How can I use fminbound for this setup?

You need a simple wrapper function that extracts y:
def f1(x):
y = x ** 2
return_list = {}
return_list['x'] = x
return_list['y'] = y
return return_list
optimize.fminbound(lambda x: f1(x)['y'], -1, 2)

Related

Numpy Vectorization for Nested 'for' loop

I was trying to write a program which plots level set for any given function.
rmin = -5.0
rmax = 5.0
c = 4.0
x = np.arange(rmin,rmax,0.1)
y = np.arange(rmin,rmax,0.1)
x,y = np.meshgrid(x,y)
f = lambda x,y: y**2.0 - 4*x
realplots = []
for i in range(x.shape[0]):
for j in range(x.shape[1]):
if abs(f(x[i,j],y[i,j])-c)< 1e-4:
realplots.append([x[i,j],y[i,j]])`
But it being a nested for loop, is taking lot of time. Any help in vectorizing the above code/new method of plotting level set is highly appreciated.(Note: The function 'f' will be changed at the time of running.So, the vectorization must be done without considering the function's properties)
I tried vectorizing through
ans = np.where(abs(f(x,y)-c)<1e-4,np.array([x,y]),[0,0])
but it was giving me operands could not be broadcast together with shapes (100,100) (2,100,100) (2,)
I was adding [0,0] as an escape from else condition in np.where which is indeed wrong.
Since you get the values rather than the indexes, you don't really need np.where.
You can directly use the mask to index x and y, look at the "Boolean array indexing" section of the documentation.
It is straightforward:
def vectorized(x, y, c, f, threshold):
mask = np.abs(f(x, y) - c) < threshold
x, y = x[mask], y[mask]
return np.stack([x, y], axis=-1)
Your function for reference:
def op(x, y, c, f, threshold):
res = []
for i in range(x.shape[0]):
for j in range(x.shape[1]):
if abs(f(x[i, j], y[i, j]) - c) < threshold:
res.append([x[i, j], y[i, j]])
return res
Tests:
rmin, rmax = -5.0, +5.0
c = 4.0
threshold = 1e-4
x = np.arange(rmin, rmax, 0.1)
y = np.arange(rmin, rmax, 0.1)
x, y = np.meshgrid(x, y)
f = lambda x, y: y**2 - 4 * x
res_op = op(x, y, c, f, threshold)
res_vec = vectorized(x, y, c, f, threshold)
assert np.allclose(res_op, res_vec)

Composing arbitrary functions in Python

I am following category theory for programmers and trying to implement a compose function in Python. What I have currently is
def compose(f, g):
def result(*args):
return f(g(*args))
return result
def square(x):
return x * x
def add(x, y):
return x + y
def f(x, y):
return min(x, y)
def g(x, y):
return x + y, x * y
result = compose(square, add)(1, 2)
print(result)
result = compose(f, g)(1, 2)
print(result)
But this will not work because I don't know whether g(*args) in compose will return an object which is an iterator or just a single return value. Is there a way to write compose in Python without using isinstance??
One option is to always return a list; then you can unpack g without checking:
def compose(f, g):
def result(*args):
return f(*g(*args)) # we can unpack g here if we know every function returns a list
return result
def square(x):
return [x * x]
def add(x, y):
return [x + y]
def f(x, y):
return [min(x, y)]
def g(x, y):
return x + y, x * y
result = compose(square, add)(1, 2)
print(result)
result = compose(f, g)(1, 2)
print(result)

Product feature optimization with constraints

I have trained a Lightgbm model on learning to rank dataset. The model predicts relevance score of a sample. So higher the prediction the better it is. Now that the model has learned I would like to find the best values of some features that gives me the highest prediction score.
So, lets say I have features u,v,w,x,y,z and the features I would like to optimize over are x,y,z.
maximize f(u,v,w,x,y,z) w.r.t features x,y,z where f is a lightgbm model
subject to constraints :
y = Ax + b
z = 4 if y < thresh_a else 4-0.5 if y >= thresh_b else 4-0.3
thresh_m < x <= thresh_n
The numbers are randomly made up but constraints are linear.
Objective function with respect to x looks like the following :
So the function is very spiky, non-smooth. I also don't have the gradient information as f is a lightgbm model.
Using Nathan's answer I wrote down the following class :
class ProductOptimization:
def __init__(self, estimator, features_to_change, row_fixed_values,
bnds=None):
self.estimator = estimator
self.features_to_change = features_to_change
self.row_fixed_values = row_fixed_values
self.bounds = bnds
def get_sample(self, x):
new_values = {k:v for k,v in zip(self.features_to_change, x)}
return self.row_fixed_values.replace({k:{self.row_fixed_values[k].iloc[0]:v}
for k,v in new_values.items()})
def _call_model(self, x):
pred = self.estimator.predict(self.get_sample(x))
return pred[0]
def constraint1(self, vector):
x = vector[0]
y = vector[2]
return # some float value
def constraint2(self, vector):
x = vector[0]
y = vector[3]
return #some float value
def optimize_slsqp(self, initial_values):
con1 = {'type': 'eq', 'fun': self.constraint1}
con2 = {'type': 'eq', 'fun': self.constraint2}
cons = ([con1,con2])
result = minimize(fun=self._call_model,
x0=np.array(initial_values),
method='SLSQP',
bounds=self.bounds,
constraints=cons)
return result
The results that I get are always around the initial guess. And I think its because of non-smoothness of the function and absence of any gradient information which is important for the SLSQP optimizer. Any advices how should I deal with this kind of problem ?
It's been a good minute since I last wrote some serious code, so I appologize if it's not entirely clear what everything does, please feel free to ask for more explanations
The imports:
from sklearn.ensemble import GradientBoostingRegressor
import numpy as np
from scipy.optimize import minimize
from copy import copy
First I define a new class that allows me to easily redefine values. This class has 5 inputs:
value: this is the 'base' value. In your equation y=Ax + b it's the b part
minimum: this is the minimum value this type will evaluate as
maximum: this is the maximum value this type will evaluate as
multipliers: the first tricky one. It's a list of other InputType objects. The first is the input type and the second the multiplier. In your example y=Ax +b you would have [[x, A]], if the equation was y=Ax + Bz + Cd it would be [[x, A], [z, B], [d, C]]
relations: the most tricky one. It's also a list of other InputType objects, it has four items: the first is the input type, the second defines if it's an upper boundary you use min, if it's a lower boundary you use max. The third item in the list is the value of the boundary, and the fourth the output value connected to it
Watch out if you define your input values too strangely I'm sure there's weird behaviour.
class InputType:
def __init__(self, value=0, minimum=-1e99, maximum=1e99, multipliers=[], relations=[]):
"""
:param float value: base value
:param float minimum: value can never be lower than x
:param float maximum: value can never be higher than y
:param multipliers: [[InputType, multiplier], [InputType, multiplier]]
:param relations: [[InputType, min, threshold, output_value], [InputType, max, threshold, output_value]]
"""
self.val = value
self.min = minimum
self.max = maximum
self.multipliers = multipliers
self.relations = relations
def reset_val(self, value):
self.val = value
def evaluate(self):
"""
- relations to other variables are done first if there are none then the rest is evaluated
- at most self.max
- at least self.min
- self.val + i_x * w_x
i_x is input i, w_x is multiplier (weight) of i
"""
for term, min_max, value, output_value in self.relations:
# check for each term if it falls outside of the expected terms
if min_max(term.evaluate(), value) != term.evaluate():
return self.return_value(output_value)
output_value = self.val + sum([i[0].evaluate() * i[1] for i in self.multipliers])
return self.return_value(output_value)
def return_value(self, output_value):
return min(self.max, max(self.min, output_value))
Using this, you can fix the input types sent from the optimizer, as shown in _call_model:
class Example:
def __init__(self, lst_args):
self.lst_args = lst_args
self.X = np.random.random((10000, len(lst_args)))
self.y = self.get_y()
self.clf = GradientBoostingRegressor()
self.fit()
def get_y(self):
# sum of squares, is minimum at x = [0, 0, 0, 0, 0 ... ]
return np.array([[self._func(i)] for i in self.X])
def _func(self, i):
return sum(i * i)
def fit(self):
self.clf.fit(self.X, self.y)
def optimize(self):
x0 = [0.5 for i in self.lst_args]
initial_simplex = self._get_simplex(x0, 0.1)
result = minimize(fun=self._call_model,
x0=np.array(x0),
method='Nelder-Mead',
options={'xatol': 0.1,
'initial_simplex': np.array(initial_simplex)})
return result
def _get_simplex(self, x0, step):
simplex = []
for i in range(len(x0)):
point = copy(x0)
point[i] -= step
simplex.append(point)
point2 = copy(x0)
point2[-1] += step
simplex.append(point2)
return simplex
def _call_model(self, x):
print(x, type(x))
for i, value in enumerate(x):
self.lst_args[i].reset_val(value)
input_x = np.array([i.evaluate() for i in self.lst_args])
prediction = self.clf.predict([input_x])
return prediction[0]
I can define your problem as shown below (be sure to define the inputs in the same order as the final list, otherwise not all the values will get updated correctly in the optimizer!):
A = 5
b = 2
thresh_a = 5
thresh_b = 10
thresh_c = 10.1
thresh_m = 4
thresh_n = 6
u = InputType()
v = InputType()
w = InputType()
x = InputType(minimum=thresh_m, maximum=thresh_n)
y = InputType(value = b, multipliers=([[x, A]]))
z = InputType(relations=[[y, max, thresh_a, 4], [y, min, thresh_b, 3.5], [y, max, thresh_c, 3.7]])
example = Example([u, v, w, x, y, z])
Calling the results:
result = example.optimize()
for i, value in enumerate(result.x):
example.lst_args[i].reset_val(value)
print(f"final values are at: {[i.evaluate() for i in example.lst_args]}: {result.fun)}")

How do I return a value from a higher-order function?

guys how can I make it so that calling make_repeater(square, 0)(5) return 5 instead of 25? I'm guessing I would need to change the line "function_successor = h" because then I'm just getting square(5) but not sure what I need to change it to...
square = lambda x: x * x
def compose1(h, g):
"""Return a function f, such that f(x) = h(g(x))."""
def f(x):
return h(g(x))
return f
def make_repeater(h, n):
iterations = 1
function_successor = h
while iterations < n:
function_successor = compose1(h, function_successor)
iterations += 1
return function_successor
it needs to satisfy a bunch of other requirements like:
make_repeater(square, 2)(5) = square(square(5)) = 625
make_repeater(square, 4)(5) = square(square(square(square(5)))) = 152587890625
To achieve that, you have to use the identity function (f(x) = x) as the initial value for function_successor:
def compose1(h, g):
"""Return a function f, such that f(x) = h(g(x))."""
def f(x):
return h(g(x))
return f
IDENTITY_FUNCTION = lambda x: x
def make_repeater(function, n):
function_successor = IDENTITY_FUNCTION
# simplified loop
for i in range(n):
function_successor = compose1(function, function_successor)
return function_successor
if __name__ == "__main__":
square = lambda x: x * x
print(make_repeater(square, 0)(5))
print(make_repeater(square, 2)(5))
print(make_repeater(square, 4)(5))
and the output is
5
625
152587890625
This isn't most optimal for performance though since the identity function (which doesn't do anything useful) is always part of the composed function, so an optimized version would look like this:
def make_repeater(function, n):
if n <= 0:
return IDENTITY_FUNCTION
function_successor = function
for i in range(n - 1):
function_successor = compose1(function, function_successor)
return function_successor

Get output from functions which are run in parallel

I have two functions which I am running asynchronously using python multiprocessing. I need to store values from each function in an array, to use it later.
I have tried to store the output in a numpy array in each of the functions and return them. However, I am not getting any output. The print statements are working alright.
x = np.array([])
y = np.array([])
def func1():
for _ in range(1,150):
print((rstr.xeger(random.choice(sList))))
np.append(x,rstr.xeger(random.choice(sList)))
return x
def func2():
for _ in range(1,150):
print(fake.name())
np.append(y,fake.name())
return y
if __name__ == '__main__':
p1 = multiprocessing.Pool(5)
p2 = multiprocessing.Pool(5)
print("Started")
start = time.time()
x = p1.apply_async(func1)
y = p2.apply_async(func2)
print(time.time() - start)
print("Ended")
I need the arrays x and y which are being returned from the two functions.

Resources