For a project I am working on, I am creating a class of polynomials that I can operate on. The polynomial class can do addition, subtraction, multiplication, synthetic division, and more. It also has a proper string representation.
For the project, we are required to create a class for Newton's Method. I was able to create a callable function class for f, such that
> f = polynomial(2, 3, 4)
> f
2+3x+4x^2
> f(3)
47
I have a derivative function, polynomial.derivative(f), which outputs 3+8x.
I want to define a function labeled Df so that in my Newton's Method code I can say Df(x). It would work so that if x=2:
> Df(2)
19
The derivative of a polynomial is still a polynomial. Thus, instead of returning the string 3+8x, your polynomial.derivative function should return a new polynomial.
class polynomial:
    def __init__(self, c, b, a):
        self.coefs = [c, b, a]

    [...]

    def derivative(self):
        return polynomial(*[i*c for i, c in enumerate(self.coefs) if i > 0], 0)
Hence you can use it as follows:
> f = polynomial(2, 3, 4)
> Df = f.derivative()
> f
2+3x+4x^2
> Df
3+8x+0x^2
> f(3)
47
> Df(2)
19
Edit
Of course, it is enumerate and not enumerates, and __init__ was missing its self argument; both are corrected above. I coded this directly on SO without any syntax check.
Of course you can write this in a .py file. Here is a complete working example:
class Polynomial:
    def __init__(self, c, b, a):
        self.coefs = [c, b, a]
        self._derivative = None

    @property
    def derivative(self):
        if self._derivative is None:
            self._derivative = Polynomial(*[i*c for i, c in enumerate(self.coefs) if i > 0], 0)
        return self._derivative

    def __str__(self):
        return "+".join([
            str(c) + ("x" if i > 0 else "") + (f"^{i}" if i > 1 else "")
            for i, c in enumerate(self.coefs)
            if c != 0
        ])

    def __call__(self, x):
        return sum([c * (x**i) for i, c in enumerate(self.coefs)])
if __name__ == '__main__':
    f = Polynomial(2, 3, 4)
    print(f"f: y={f}")
    print(f"f(3) = {f(3)}")
    print(f"f': y={f.derivative}")
    print(f"f'(2) = {f.derivative(2)}")
f: y=2+3x+4x^2
f(3) = 47
f': y=3+8x
f'(2) = 19
You can rename the property with the name you prefer: derivative, Df, prime, etc.
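For completeness, here is a minimal sketch of how the polynomial and its derivative property could plug into a Newton's Method loop. The function name newtons_method, the tolerance, and the example polynomial are my own choices for illustration, not part of the original question:

def newtons_method(f, x0, tol=1e-10, max_iter=100):
    """Find a root of f near x0 via x_new = x - f(x)/f'(x)."""
    Df = f.derivative  # the property returns a callable Polynomial
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            break
        x = x - fx / Df(x)
    return x

root = newtons_method(Polynomial(-2, 0, 1), x0=1.0)  # f(x) = -2 + x^2
print(root)  # ~1.4142135623730951

Note that f itself (2+3x+4x^2) has no real roots, so the sketch above uses x^2 - 2 instead.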
I have trained a LightGBM model on a learning-to-rank dataset. The model predicts the relevance score of a sample, so the higher the prediction, the better. Now that the model has learned, I would like to find the best values of some features that give me the highest prediction score.
So, let's say I have features u, v, w, x, y, z and the features I would like to optimize over are x, y, z.
maximize f(u,v,w,x,y,z) w.r.t features x,y,z where f is a lightgbm model
subject to constraints :
y = Ax + b
z = 4 if y < thresh_a else 4-0.5 if y >= thresh_b else 4-0.3
thresh_m < x <= thresh_n
The numbers are randomly made up, but the constraints are linear.
The objective function with respect to x looks like the following (the plot is omitted here): the function is very spiky and non-smooth. I also don't have gradient information, as f is a LightGBM model.
Using Nathan's answer I wrote the following class:
class ProductOptimization:

    def __init__(self, estimator, features_to_change, row_fixed_values,
                 bnds=None):
        self.estimator = estimator
        self.features_to_change = features_to_change
        self.row_fixed_values = row_fixed_values
        self.bounds = bnds

    def get_sample(self, x):
        new_values = {k: v for k, v in zip(self.features_to_change, x)}
        return self.row_fixed_values.replace({k: {self.row_fixed_values[k].iloc[0]: v}
                                              for k, v in new_values.items()})

    def _call_model(self, x):
        pred = self.estimator.predict(self.get_sample(x))
        return pred[0]

    def constraint1(self, vector):
        x = vector[0]
        y = vector[2]
        return  # some float value

    def constraint2(self, vector):
        x = vector[0]
        y = vector[3]
        return  # some float value

    def optimize_slsqp(self, initial_values):
        con1 = {'type': 'eq', 'fun': self.constraint1}
        con2 = {'type': 'eq', 'fun': self.constraint2}
        cons = [con1, con2]

        result = minimize(fun=self._call_model,
                          x0=np.array(initial_values),
                          method='SLSQP',
                          bounds=self.bounds,
                          constraints=cons)
        return result
The results that I get are always around the initial guess, and I think it's because of the non-smoothness of the function and the absence of any gradient information, which is important for the SLSQP optimizer. Any advice on how I should deal with this kind of problem?
It's been a good minute since I last wrote some serious code, so I apologize if it's not entirely clear what everything does. Please feel free to ask for more explanations.
The imports:
from sklearn.ensemble import GradientBoostingRegressor
import numpy as np
from scipy.optimize import minimize
from copy import copy
First I define a new class that allows me to easily redefine values. This class has 5 inputs:
value: this is the 'base' value. In your equation y = Ax + b it's the b part.
minimum: the minimum value this type will evaluate as.
maximum: the maximum value this type will evaluate as.
multipliers: the first tricky one. It's a list of [InputType, multiplier] pairs. In your example y = Ax + b you would have [[x, A]]; if the equation were y = Ax + Bz + Cd it would be [[x, A], [z, B], [d, C]].
relations: the most tricky one. It's a list of four-item entries: the first is the InputType the relation depends on; the second defines the boundary type (use min for an upper boundary, max for a lower boundary); the third is the threshold value of the boundary; and the fourth is the output value returned when the relation triggers.
Watch out: if you define your input values too strangely, I'm sure there will be weird behaviour.
class InputType:

    def __init__(self, value=0, minimum=-1e99, maximum=1e99, multipliers=[], relations=[]):
        """
        :param float value: base value
        :param float minimum: value can never be lower than x
        :param float maximum: value can never be higher than y
        :param multipliers: [[InputType, multiplier], [InputType, multiplier]]
        :param relations: [[InputType, min, threshold, output_value], [InputType, max, threshold, output_value]]
        """
        self.val = value
        self.min = minimum
        self.max = maximum
        self.multipliers = multipliers
        self.relations = relations

    def reset_val(self, value):
        self.val = value

    def evaluate(self):
        """
        - relations to other variables are done first; if there are none, the rest is evaluated
        - at most self.max
        - at least self.min
        - self.val + i_x * w_x
          i_x is input i, w_x is multiplier (weight) of i
        """
        for term, min_max, value, output_value in self.relations:
            # check for each term if it falls outside of the expected bounds
            if min_max(term.evaluate(), value) != term.evaluate():
                return self.return_value(output_value)

        output_value = self.val + sum([i[0].evaluate() * i[1] for i in self.multipliers])
        return self.return_value(output_value)

    def return_value(self, output_value):
        return min(self.max, max(self.min, output_value))
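A quick sanity check of the class, assuming the definitions above (the constants here are made up for illustration):

A, b = 5, 2
x = InputType(value=1)
y = InputType(value=b, multipliers=[[x, A]])

print(y.evaluate())  # 2 + 5*1 = 7
x.reset_val(2)
print(y.evaluate())  # 2 + 5*2 = 12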
Using this, you can fix the input types sent from the optimizer, as shown in _call_model:
class Example:

    def __init__(self, lst_args):
        self.lst_args = lst_args

        self.X = np.random.random((10000, len(lst_args)))
        self.y = self.get_y()
        self.clf = GradientBoostingRegressor()
        self.fit()

    def get_y(self):
        # sum of squares, is minimum at x = [0, 0, 0, 0, 0 ... ]
        return np.array([[self._func(i)] for i in self.X])

    def _func(self, i):
        return sum(i * i)

    def fit(self):
        self.clf.fit(self.X, self.y)

    def optimize(self):
        x0 = [0.5 for i in self.lst_args]
        initial_simplex = self._get_simplex(x0, 0.1)
        result = minimize(fun=self._call_model,
                          x0=np.array(x0),
                          method='Nelder-Mead',
                          options={'xatol': 0.1,
                                   'initial_simplex': np.array(initial_simplex)})
        return result

    def _get_simplex(self, x0, step):
        simplex = []
        for i in range(len(x0)):
            point = copy(x0)
            point[i] -= step
            simplex.append(point)

        point2 = copy(x0)
        point2[-1] += step
        simplex.append(point2)
        return simplex

    def _call_model(self, x):
        print(x, type(x))
        for i, value in enumerate(x):
            self.lst_args[i].reset_val(value)
        input_x = np.array([i.evaluate() for i in self.lst_args])
        prediction = self.clf.predict([input_x])
        return prediction[0]
I can define your problem as shown below (be sure to define the inputs in the same order as the final list, otherwise not all the values will get updated correctly in the optimizer!):
A = 5
b = 2
thresh_a = 5
thresh_b = 10
thresh_c = 10.1
thresh_m = 4
thresh_n = 6
u = InputType()
v = InputType()
w = InputType()
x = InputType(minimum=thresh_m, maximum=thresh_n)
y = InputType(value=b, multipliers=[[x, A]])
z = InputType(relations=[[y, max, thresh_a, 4], [y, min, thresh_b, 3.5], [y, max, thresh_c, 3.7]])
example = Example([u, v, w, x, y, z])
Calling the results:
result = example.optimize()
for i, value in enumerate(result.x):
    example.lst_args[i].reset_val(value)
print(f"final values are at: {[i.evaluate() for i in example.lst_args]}: {result.fun}")
I want to use the ray task method rather than the ray actor method to parallelise a method within a class, since the latter seems to require changing how a class is instantiated (as shown here). A toy code example is below, as well as the error:
import numpy as np
import ray

class MyClass(object):
    def __init__(self):
        ray.init(num_cpus=4)

    @ray.remote
    def func(self, x, y):
        return x * y

    def my_func(self):
        a = [1, 2, 3]
        b = np.random.normal(0, 1, 10000)
        result = []

        # we wish to parallelise over the array `a`
        for sub_array in np.array_split(a, 3):
            result.append(self.func.remote(sub_array, b))

        return result
mc = MyClass()
mc.my_func()
>>> TypeError: missing a required argument: 'y'
The error arises because ray does not seem to be "aware" of the class, and so it expects an argument self.
The code works fine if we do not use classes:
@ray.remote
def func(x, y):
    return x * y

def my_func():
    a = [1, 2, 3, 4]
    b = np.random.normal(0, 1, 10000)
    result = []

    # we wish to parallelise over the list `a`
    # split `a` and send each chunk to a different processor
    for sub_array in np.array_split(a, 4):
        result.append(func.remote(sub_array, b))

    return result

res = my_func()
ray.get(res)
>>> [array([-0.41929678, -0.83227786, -2.69814232, ..., -0.67379119,
        -0.79057845, -0.06862196]),
 array([-0.83859356, -1.66455572, -5.39628463, ..., -1.34758239,
        -1.5811569 , -0.13724391]),
 array([-1.25789034, -2.49683358, -8.09442695, ..., -2.02137358,
        -2.37173535, -0.20586587]),
 array([ -1.67718712,  -3.32911144, -10.79256927, ...,  -2.69516478,
         -3.1623138 ,  -0.27448782])]
We see the output is a list of 4 arrays, as expected. How can I get MyClass to work with parallelism using ray?
A few tips:
It's generally recommended that you only use the ray.remote decorator on functions or classes in Python (not bound methods).
You should be very, very careful about calling ray.init inside the constructor of a class, since ray.init is not idempotent (which means your program will fail if you instantiate multiple instances of MyClass). Instead, you should make sure ray.init is run only once in your program, e.g. with the guard sketched below.
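One common way to enforce that, as a sketch (ray.is_initialized is part of the public Ray API):

import ray

# initialize Ray at most once per process
if not ray.is_initialized():
    ray.init(num_cpus=4)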
I think there are two ways of achieving the results you're going for with Ray here.
You could move func out of the class, so it becomes a function instead of a bound method. Note that in this approach MyClass will be serialized, which means that changes that func makes to MyClass will not be reflected anywhere outside the function. In your simplified example, this doesn't appear to be a problem. This approach makes it easiest to achieve the most parallelism.
@ray.remote
def func(obj, x, y):
    return x * y

class MyClass(object):
    def my_func(self):
        ...
        # we wish to parallelise over the array `a`
        for sub_array in np.array_split(a, 3):
            result.append(func.remote(self, sub_array, b))
        return result
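Assuming the elided parts of my_func are filled in as in the question (a, b, result), driving this version might look roughly like:

mc = MyClass()
refs = mc.my_func()      # list of ObjectRefs returned by func.remote
results = ray.get(refs)  # blocks until all remote tasks finish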
The other approach you could consider is to use async actors. In this approach, the ray actor will handle concurrency via asyncio, but this comes with the limitations of asyncio.
@ray.remote(num_cpus=4)
class MyClass(object):
    async def func(self, x, y):
        return x * y

    def my_func(self):
        a = [1, 2, 3]
        b = np.random.normal(0, 1, 10000)
        result = []

        # we wish to parallelise over the array `a`
        for sub_array in np.array_split(a, 3):
            result.append(self.func.remote(sub_array, b))

        return result
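A hedged usage sketch for the async-actor approach, calling func directly on the actor handle from the driver (the variable names here are mine, not from the original answer):

import numpy as np
import ray

ray.init(num_cpus=4)

mc = MyClass.remote()  # create the actor
b = np.random.normal(0, 1, 10000)
# each call returns an ObjectRef; the actor interleaves the calls via asyncio
refs = [mc.func.remote(a_i, b) for a_i in [1, 2, 3]]
print(ray.get(refs))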
Please see this code:
import time
import ray

@ray.remote
class Prime:
    # Constructor
    def __init__(self, number):
        self.num = number

    def SumPrime(self, num):
        tot = 0
        for i in range(3, num):
            c = 0
            for j in range(2, int(i**0.5) + 1):
                if i % j == 0:
                    c = c + 1
            if c == 0:
                tot = tot + i
        return tot

num = 1000000

start = time.time()
# make an object of the Prime class
prime = Prime.remote(num)
# note: print's arguments are evaluated left to right, so `duration` measures
# only the (asynchronous) actor creation; ray.get then blocks on the result
print("duration =", time.time() - start, "\nsum_prime = ", ray.get(prime.SumPrime.remote(num)))
Guys, how can I make it so that calling make_repeater(square, 0)(5) returns 5 instead of 25? I'm guessing I would need to change the line function_successor = h, because then I'm just getting square(5), but I'm not sure what I need to change it to...
square = lambda x: x * x

def compose1(h, g):
    """Return a function f, such that f(x) = h(g(x))."""
    def f(x):
        return h(g(x))
    return f

def make_repeater(h, n):
    iterations = 1
    function_successor = h
    while iterations < n:
        function_successor = compose1(h, function_successor)
        iterations += 1
    return function_successor
It needs to satisfy a bunch of other requirements, like:
make_repeater(square, 2)(5) = square(square(5)) = 625
make_repeater(square, 4)(5) = square(square(square(square(5)))) = 152587890625
To achieve that, you have to use the identity function (f(x) = x) as the initial value for function_successor:
def compose1(h, g):
    """Return a function f, such that f(x) = h(g(x))."""
    def f(x):
        return h(g(x))
    return f

IDENTITY_FUNCTION = lambda x: x

def make_repeater(function, n):
    function_successor = IDENTITY_FUNCTION
    # simplified loop
    for i in range(n):
        function_successor = compose1(function, function_successor)
    return function_successor

if __name__ == "__main__":
    square = lambda x: x * x
    print(make_repeater(square, 0)(5))
    print(make_repeater(square, 2)(5))
    print(make_repeater(square, 4)(5))
and the output is
5
625
152587890625
This isn't optimal for performance, though, since the identity function (which doesn't do anything useful) is always part of the composed function. An optimized version would look like this:
def make_repeater(function, n):
    if n <= 0:
        return IDENTITY_FUNCTION

    function_successor = function
    for i in range(n - 1):
        function_successor = compose1(function, function_successor)
    return function_successor
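As a side note, the same composition can be written with functools.reduce, which handles the n == 0 case through its initializer; a sketch equivalent to the loop above:

from functools import reduce

def make_repeater(function, n):
    # fold compose1 over n steps, starting from the identity function
    return reduce(lambda acc, _: compose1(function, acc),
                  range(n), IDENTITY_FUNCTION)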
This is an algorithmic question. I want to have a higher-order function that can repeatedly receive arguments, like func(3, 5)(4, 9)(1, 2)(...). I know all I need is to define a function that returns a function (an inner function), maybe like this (I'm not sure this is the correct code, though):
def func(a, b=0):
    def inner_func(c, d):
        nonlocal b
        b += c + d
        print(b)
        ...
        return inner_func
    return inner_func(a, b)
So when we run:
test0 = func(1, 2)
test1 = test0(2, 3)
we should get at least these outputs:
5
10
But if we run:
test0 = func(1, 2)
test1 = test0(2, 3)
test_add = test0(2, 3)
the output will be:
5
10
15
However, I want test_add = test0(2, 3) to return exactly the same thing that test1 = test0(2, 3) returns, which is 10.
Expected outputs:
5
10
10
I should find a way to make the current function test0(2, 3) stick only to inputs of the previous function func(1, 2), which are 1 and 2. More examples:
test0 = func(1, 2)
test1 = test0(2, 3)
Expected outputs:
5
10
test_add = test0(3, 5)
Expected outputs: 13
But got: 18
So, how should I modify the code for this purpose?
To chain the outputs, your result must be a callable. But in order to make the call repeatable, your result also has to store an inner state.
At this point, using a function as the result type is the incorrect choice. Combining functionality and inner state is what classes are meant for.
Just write a quick helper class that stores the state:
class func_result:
    def __init__(self, prev):
        self.prev = prev

    def __call__(self, a, b):
        val = a + b + self.prev
        print(val)
        return func_result(val)

def func(a, b):
    return func_result(b)(a, b)
test0 = func(1, 2)
test1 = test0(2, 3)
test_add1 = test1(2, 3)
test_add2 = test0(2, 3)
test_add3 = test0(3, 5)
5
10
15
10
13
If you don't like the fact that the func_result class is now exposed, you can nest it in the function:
def func(a, b):
    class func_result:
        def __init__(self, prev):
            self.prev = prev

        def __call__(self, a, b):
            val = a + b + self.prev
            print(val)
            return func_result(val)

    return func_result(b)(a, b)
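If you prefer to avoid classes entirely, the same behaviour can be had with closures: instead of mutating shared state with nonlocal, each call returns a fresh inner function bound to the new running total. A sketch equivalent to the class version above:

def func(a, b):
    def make(prev):
        def call(c, d):
            val = c + d + prev
            print(val)
            return make(val)  # new closure; `prev` is never mutated
        return call
    return make(b)(a, b)

Because no state is ever mutated, test0(2, 3) prints 10 no matter how often or in what order it is called.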
I am fairly new to the concepts of caching & memoization. I've read some other discussions & resources here, here, and here, but haven't been able to follow them all that well.
Say that I have two member functions within a class. (Simplified example below.) Pretend that the first function, total, is computationally expensive. The second function, subtotal, is computationally simple, except that it uses the return value of the first function and so becomes computationally expensive too, because it currently needs to re-call total to get its result.
I want to cache the results of the first function and use this as the input to the second, if the input y to subtotal shares the input x to a recent call of total. That is:
If calling subtotal() where y is equal to the value of x in a previous call of total, then use that cached result instead of re-calling total.
Otherwise, just call total() using x = y.
Example:
class MyObject(object):
    def __init__(self, a, b):
        self.a, self.b = a, b

    def total(self, x):
        return (self.a + self.b) * x  # some time-expensive calculation

    def subtotal(self, y, z):
        # Don't want to have to re-run total() here if y == x from a
        # recent call of total(); otherwise, call total().
        return self.total(x=y) + z
With Python 3.2 or newer, you could use functools.lru_cache.
If you were to decorate total with functools.lru_cache directly, then the lru_cache would cache the return values of total based on the value of both arguments, self and x. Since lru_cache's internal dict stores a reference to self, applying @lru_cache directly to a class method keeps a strong reference to each instance, which prevents instances of the class from ever being garbage-collected (hence a memory leak).
Here is a workaround which allows you to use lru_cache with class methods -- it caches results based on all arguments other than the first one, self, and uses a weakref to avoid the circular reference problem:
import functools
import weakref

def memoized_method(*lru_args, **lru_kwargs):
    """
    https://stackoverflow.com/a/33672499/190597 (orly)
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapped_func(self, *args, **kwargs):
            # We're storing the wrapped method inside the instance. If we had
            # a strong reference to self the instance would never die.
            self_weak = weakref.ref(self)

            @functools.wraps(func)
            @functools.lru_cache(*lru_args, **lru_kwargs)
            def cached_method(*args, **kwargs):
                return func(self_weak(), *args, **kwargs)

            setattr(self, func.__name__, cached_method)
            return cached_method(*args, **kwargs)
        return wrapped_func
    return decorator
class MyObject(object):
    def __init__(self, a, b):
        self.a, self.b = a, b

    @memoized_method()
    def total(self, x):
        print('Calling total (x={})'.format(x))
        return (self.a + self.b) * x

    def subtotal(self, y, z):
        return self.total(x=y) + z

mobj = MyObject(1, 2)
mobj.subtotal(10, 20)
mobj.subtotal(10, 30)
prints
Calling total (x=10)
only once.
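Since the wrapper installs the lru_cache-decorated function on the instance itself, you can also inspect the cache after use; a small sketch assuming the example above (the exact numbers depend on the calls made):

print(mobj.total.cache_info())
# e.g. CacheInfo(hits=1, misses=1, maxsize=128, currsize=1)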
Alternatively, this is how you could roll your own cache using a dict:
class MyObject(object):
    def __init__(self, a, b):
        self.a, self.b = a, b
        self._total = dict()

    def total(self, x):
        print('Calling total (x={})'.format(x))
        self._total[x] = t = (self.a + self.b) * x
        return t

    def subtotal(self, y, z):
        t = self._total[y] if y in self._total else self.total(y)
        return t + z

mobj = MyObject(1, 2)
mobj.subtotal(10, 20)
mobj.subtotal(10, 30)
One advantage of lru_cache over this dict-based cache is that the lru_cache is thread-safe. The lru_cache also has a maxsize parameter which can help protect against memory usage growing without bound (for example, due to a long-running process calling total many times with different values of x).
Thank you all for the responses; it was helpful just to read them and see what's going on under the hood. As @Tadhg McDonald-Jensen said, it seems like I didn't need anything more here than @functools.lru_cache. (I'm on Python 3.5.) Regarding @unutbu's comment, I'm not getting an error from decorating total() with @lru_cache. Let me correct my own example; I'll keep this up here for other beginners:
from functools import lru_cache
from datetime import datetime as dt

import numpy as np

class MyObject(object):
    def __init__(self, a, b):
        self.a, self.b = a, b

    @lru_cache(maxsize=None)
    def total(self, x):
        lst = []
        for i in range(int(1e7)):
            val = self.a + self.b + x  # time-expensive loop
            lst.append(val)
        return np.array(lst)

    def subtotal(self, y, z):
        # if y == x from a previous call of total(), use the cached result
        return self.total(x=y) + z

myobj = MyObject(1, 2)

# Call total() with x=20
a = dt.now()
myobj.total(x=20)
b = dt.now()
c = (b - a).total_seconds()

# Call subtotal() with y=21
a2 = dt.now()
myobj.subtotal(y=21, z=1)
b2 = dt.now()
c2 = (b2 - a2).total_seconds()

# Call subtotal() with y=20 - should take substantially less time
# with x=20 used in previous call of total().
a3 = dt.now()
myobj.subtotal(y=20, z=1)
b3 = dt.now()
c3 = (b3 - a3).total_seconds()

print('c: {}, c2: {}, c3: {}'.format(c, c2, c3))
c: 2.469753, c2: 2.355764, c3: 0.016998
In this case I would do something simple; maybe it's not the most elegant way, but it works for the problem:
class MyObject(object):
    param_values = {}

    def __init__(self, a, b):
        self.a, self.b = a, b

    def total(self, x):
        if x not in MyObject.param_values:
            MyObject.param_values[x] = (self.a + self.b) * x
            print(str(x) + " was never called before")
        return MyObject.param_values[x]

    def subtotal(self, y, z):
        if y in MyObject.param_values:
            return MyObject.param_values[y] + z
        else:
            return self.total(y) + z
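One caveat with this version: param_values is a class attribute, so the cache is shared by every MyObject instance, even ones constructed with different a and b. If that matters for your use case, a per-instance cache is a small change; a sketch:

class MyObject(object):
    def __init__(self, a, b):
        self.a, self.b = a, b
        self.param_values = {}  # per-instance cache, keyed by x

    def total(self, x):
        if x not in self.param_values:
            self.param_values[x] = (self.a + self.b) * x
        return self.param_values[x]

    def subtotal(self, y, z):
        return self.total(y) + z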