I have a for loop that performs a series of operations, and I discovered that one method in particular takes about 44% of the time of each iteration. What should I do to speed up this loop? Use asyncio, multiprocessing? My idea is to have the next iteration of the loop begin as soon as huge_method is called, so that both iterations run at the same time. Any suggestions on how I can do this?
Example of what I mean:
for i in range(len(some_list)):
    x = some_list[i]['model']
    y = some_other_list[i]['prediction']
    result = huge_method(x, y) # This is what's taking up most of the time in this loop
    # some more code...
Note: nothing calculated during an individual iteration depends on anything calculated in a previous iteration. Also, each iteration appends a result to a list. Maintaining the order of each iteration's results is important, but I can make it work otherwise. What I'm referring to as huge_method here is a method from a 3rd-party library and not code I would like to modify.
Edit: For clarity, here is the actual code I'm working with:
for ii_day in range(len(prediction_indices)):
    model_idx = prediction_indices[ii_day]["model_idx"]
    prediction_idx = prediction_indices[ii_day]["prediction_idx"]
    # Fit model
    regression_period_signal = historical_signal_levels[model_idx, :]
    regression_period_price_change = historical_price_moves[model_idx]
    # This is the huge_method that takes half the time of an iteration
    rolling_regression_model = LinearRegression().fit(regression_period_signal, regression_period_price_change)
    # Calculate model error
    predictions = rolling_regression_model.predict(regression_period_signal)
    forecast_horizon_model_error = np.sqrt(
        mean_squared_error(regression_period_price_change, predictions))
    # Predictions
    forecast_distance = 1
    current_research = historical_signal_levels[prediction_idx, :]
    forecast_price_change = rolling_regression_model.predict(current_research)
    # Calculate drift and volatility
    volatility = ((1 + forecast_horizon_model_error) * (forecast_distance ** -0.5)) - 1
    # Kelly recommended optimum
    if volatility < 0:
        raise ZeroDivisionError("Volatility needs to be positive value.")
    if volatility == 0:
        volatility = 0.01
    kelly_recommended_optimum = forecast_price_change / volatility ** 2
    rule_recommended_allocation = self.kelly_fraction * kelly_recommended_optimum
    rule_recommended_allocation = np.zeros(len(prediction_idx))
    # Apply the calculated allocation to the dataframe.
    price_research_series.loc[prediction_idx, position_key] = rule_recommended_allocation
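Since the iterations are independent and the order of the results matters, one option is a worker pool whose map call returns results in input order. The following is only a minimal sketch: huge_method here is a hypothetical stand-in for the expensive third-party call, and process_item is an illustrative helper name, not something from the code above.

```python
from concurrent.futures import ThreadPoolExecutor


def huge_method(x, y):
    # Hypothetical stand-in for the expensive third-party call.
    return x * y


def process_item(pair):
    # One loop iteration, rewritten as a function of its inputs only.
    model_entry, prediction_entry = pair
    x = model_entry['model']
    y = prediction_entry['prediction']
    return huge_method(x, y)


some_list = [{'model': i} for i in range(5)]
some_other_list = [{'prediction': i + 1} for i in range(5)]

# executor.map yields results in the order of its inputs,
# regardless of which worker finishes first.
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(process_item, zip(some_list, some_other_list)))

print(results)
```

Threads only help if the expensive call releases the GIL while it works (many NumPy-backed routines do). If it does not, concurrent.futures.ProcessPoolExecutor offers the same map interface, at the cost of pickling the inputs and results.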


Deciding if all intervals are overlapping

I'm working on a problem in which n people are standing on a line, and each person knows their own position and speed. I'm asked to find the minimal time needed for all people to reach a common spot.
Basically, I binary-search for the minimal time and, for each candidate time, represent the furthest each person can travel in that time as an interval. If all intervals overlap, there is a spot that everyone can reach.
I have a solution to this question, but it exceeds the time limit because of the slow way I check the intervals. I'm hoping to find a better approach.
my code:
people = int(input())
peoplel = [list(map(int, input().split())) for _ in range(people)] # first item in peoplel[i] is the position of each person, the second item is the speed of each person
def good(time):
    return checkoverlap([[i[0] - time * i[1], i[0] + time * i[1]] for i in peoplel])
# first item, second item = the range of distance a person can go to
def checkoverlap(l):
    for i in range(len(l) - 1):
        seg1 = l[i]
        for i1 in range(i + 1, len(l)):
            seg2 = l[i1]
            if seg2[0] <= seg1[0] <= seg2[1] or seg1[0] <= seg2[0] <= seg1[1]:
                continue
            elif seg2[0] <= seg1[1] <= seg2[1] or seg1[0] <= seg2[1] <= seg1[1]:
                continue
            return False
    return True
(this is my first time asking a question so please inform me about anything that is wrong)
One does simply go linear
A while after I finished the answer I found a simplification that removes the need for sorting and thus allows us to further reduce the complexity of finding if all the intervals are overlapping to O(N).
If we look at the steps that are being done after the initial sort we can see that we are basically checking
if max(lower_bounds) <= min(upper_bounds):
    return True
return False
And since both min and max are linear without the need for sorting, we can simplify the algorithm by:
Creating an array of lower bounds - one pass.
Creating an array of upper bounds - one pass.
Doing the comparison I mentioned above - two passes over the new arrays.
All of this could be done in a single pass to optimize further (and to avoid some unnecessary memory allocation); however, the version above is clearer for the purpose of explanation.
Since the reasoning about the correctness and timing was done in the previous iteration, I will skip it this time and keep the section below since it nicely shows the thought process behind the optimization.
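A minimal sketch of the linear check, combined with the binary search over time described in the question. The names all_overlap and minimal_time and the tolerance tol are illustrative assumptions, speeds are assumed to be positive, and the comparison uses <= so that intervals that merely touch count as overlapping, as in the question's pairwise checks.

```python
def all_overlap(people, time):
    # Each person (position, speed) can reach [position - speed*time, position + speed*time].
    lowest_upper = min(pos + speed * time for pos, speed in people)
    highest_lower = max(pos - speed * time for pos, speed in people)
    # All intervals share a point exactly when the largest lower bound
    # does not exceed the smallest upper bound.
    return highest_lower <= lowest_upper


def minimal_time(people, tol=1e-7):
    # Grow the upper limit until a meeting point exists (requires positive speeds).
    low, high = 0.0, 1.0
    while not all_overlap(people, high):
        high *= 2
    # Standard bisection on the monotone predicate all_overlap.
    while high - low > tol:
        mid = (low + high) / 2
        if all_overlap(people, mid):
            high = mid
        else:
            low = mid
    return high


people = [(1, 2), (3, 1), (8, 1)]
print(minimal_time(people))
```

Each call to all_overlap is a constant number of passes over the list, so the whole search costs O(N) per bisection step.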
One sort to rule them all
Disclaimer: This section was obsoleted time-wise by the one above. However since it in fact allowed me to figure out the linear solution, I'm keeping it here.
As the title says, sorting is a rather straightforward way of going about this. It will require a little different data structure - instead of holding every interval as (min, max) I opted for holding every interval as (min, index), (max, index).
This allows me to sort these by the min and max values. What follows is a single linear pass over the sorted array. We also create a helper array of False values. These represent the fact that at the beginning all the intervals are closed.
Now comes the pass over the array:
Since the array is sorted, we first encounter the min of each interval. In that case, we increase the openCount counter and set the flag of the interval itself to True. The interval is now open: until we close it, the person can arrive at the party, because we are within their range.
We go along the array. As long as we are opening the intervals, everything is ok and if we manage to open all the intervals, we have our party destination where all the social distancing collapses. If this happens, we return True.
If we close any of the intervals, we have found our party breaker who can't make it anymore. (Or we can discuss that the party breakers are those who didn't bother to arrive yet when someone has to go already). We return False.
The resulting complexity is O(N log N), caused by the initial sort, since the pass itself is linear in nature. This is quite a bit better than the original O(N^2) caused by the "check all intervals pairwise" approach.
The code:
import numpy as np
import cProfile, pstats, io
#random data for a speed test. Not that useful for checking the correctness though.
testSize = 10000
x = np.random.randint(0, 10000, testSize)
y = np.random.randint(1, 100, testSize)
peopleTest = [x for x in zip(x, y)]
#Just a basic example to help with the reasoning about the correctness
peoplel = [(1, 2), (3, 1), (8, 1)]
# first item in people[i] is the position of each person, the second item is the speed of each person
def checkIntervals(people, time):
    # Lower and upper bound of each person's range, tagged with the person's index
    a = [(x[0] - x[1] * time, idx) for idx, x in enumerate(people)]
    b = [(x[0] + x[1] * time, idx) for idx, x in enumerate(people)]
    checks = [False for x in range(len(people))]
    openCount = 0
    intervals = [x for x in sorted(a + b, key=lambda x: x[0])]
    for i in intervals:
        if not checks[i[1]]:
            # First time we see this index: the interval opens
            checks[i[1]] = True
            openCount += 1
            if openCount == len(people):
                return True
        else:
            # An interval closes before all of them have opened
            return False
def good(time, people):
    return checkoverlap([[i[0] - time * i[1], i[0] + time * i[1]] for i in people])
# first item, second item = the range of distance a person can go to
def checkoverlap(l):
    for i in range(len(l) - 1):
        seg1 = l[i]
        for i1 in range(i + 1, len(l)):
            seg2 = l[i1]
            if seg2[0] <= seg1[0] <= seg2[1] or seg1[0] <= seg2[0] <= seg1[1]:
                continue
            elif seg2[0] <= seg1[1] <= seg2[1] or seg1[0] <= seg2[1] <= seg1[1]:
                continue
            return False
    return True
pr = cProfile.Profile()
pr.enable()
print(checkIntervals(peopleTest, 10000))
print(good(10000, peopleTest))
pr.disable()
s = io.StringIO()
sortby = "cumulative"
ps = pstats.Stats(pr, stream=s).sort_stats(sortby)
ps.print_stats()
print(s.getvalue())
The profiling stats for the pass over test array with 10K random values:
ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
     1    0.001    0.001    8.933    8.933  (good)
     1    8.925    8.925    8.926    8.926  (checkoverlap)
     1    0.003    0.003    0.023    0.023  (checkIntervals)
     1    0.008    0.008    0.010    0.010  {built-in method builtins.sorted}

Why does the time taken by an iterative algorithm to find the sum of a list not increase uniformly with size?

I wanted to see how drastic the difference in running time is between the iterative and recursive approaches to summing a list. So I plotted a 'time' versus 'size of the list' graph for list sizes from 0 to 994. What I got was pretty much what I expected, except that something unexpected caught my eye.
The graph can be seen here 1
What's confusing me here are those bumps that the green line suddenly takes only for certain values and then comes back down. Why does that happen?
Here is the code I had written:
import matplotlib.pyplot as plt
from time import time
def sum_rec(lst): # Sums recursively
    if len(lst) == 0:
        return 0
    return lst[0]+sum_rec(lst[1:])
def sum_iter(lst): # Sums iteratively
    Sum = 0
    for i in range(len(lst)):
        Sum += lst[i]
    return Sum
def check_time(lst): # Returns the time taken for both algorithms
    start = time()
    Sum = sum_iter(lst)
    end = time()
    t_iter = end - start
    start = time()
    Sum = sum_rec(lst)
    end = time()
    t_rec = end - start
    return t_iter, t_rec
N = [n for n in range(995)]
T1 = [] # for iterative function
T2 = [] # for recursive function
for n in N: # values on the x-axis
    lst = [i for i in range(n)]
    t_iter, t_rec = check_time(lst)
    T1.append(t_iter)
    T2.append(t_rec)
plt.plot(N,T1)
plt.plot(N,T2) # Both plotted on graph
plt.show()
I'd say both algorithms take one step per element, but the recursive one has a much higher cost per step, which causes the steeper slope. (Strictly speaking, the slice lst[1:] copies the list on every call, so the recursive version as written is quadratic rather than linear.)
Other than that:
(1) You're mixing up the two plots.
The iterative one stays grounded while the recursive one increases.
One possible explanation is that recursive calls create stack entries and require more computational time than iterative steps.
(2) You need to increase the size of the array, as small sizes are more likely to cause spikes due to locality of reference.
(3) You need to repeat the experiment over multiple runs so that random spikes caused by some other process hogging resources are averaged out.
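Points (2) and (3) can be sketched with the standard timeit module, which runs a call many times and lets you keep the best of several repeats. This is only an illustration: best_time, the list sizes, and the repeat counts are arbitrary assumptions, not values from the question.

```python
import timeit


def sum_iter(lst):
    # Plain iterative sum over the elements.
    total = 0
    for value in lst:
        total += value
    return total


def best_time(func, lst, repeats=5, number=100):
    # Run func(lst) `number` times per repeat and keep the fastest repeat;
    # the minimum is the measurement least disturbed by other processes.
    timings = timeit.repeat(lambda: func(lst), repeat=repeats, number=number)
    return min(timings) / number


sizes = [1000, 2000, 4000]
timings = [best_time(sum_iter, list(range(n))) for n in sizes]
for n, t in zip(sizes, timings):
    print(n, t)
```

Taking the minimum rather than the mean is a common choice here, because background activity can only make a run slower, never faster.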

Scipy Optimize Basin Hopping fails

I am working on a cost-minimizing function to help with allocation/weights in a portfolio of stocks. I have the following code for the objective function. It works fine with a smaller sample of 15 variables (stocks), but it fails when I try it with 55 stocks.
The num_assets variable below is the number of stocks in the portfolio.
def get_metrics(weights):
weights = np.array(weights)
returnsR =, weights )
volatilityR = np.sqrt(,, weights)))
sharpeR = returnsR / volatilityR
drawdownR = np.multiply(weights, dailyDD).sum(axis=1, skipna =
drawdownR = f(drawdownR)
calmarR = returnsR / drawdownR
results = (sharpeR * 0.3) + (calmarR * 0.7)
return np.array([returnsR, volatilityR, sharpeR, drawdownR, calmarR,
def objective(weights):
# the number 5 is the index from the get_metrics array
return get_metrics(weights)[5] * -1
def check_sum(weights):
    #return 0 if sum of the weights is 1
    return np.sum(weights)-1
bound = (0.0,1.0)
bnds = tuple(bound for x in range (num_assets))
bx = list(bnds)
""" Custom step-function """
class RandomDisplacementBounds(object):
    """random displacement with bounds: see:
    Modified! (dropped acceptance-rejection sampling for a more specialized approach)
    """
    def __init__(self, xmin, xmax, stepsize=0.5):
        self.xmin = xmin
        self.xmax = xmax
        self.stepsize = stepsize
    def __call__(self, x):
        """take a random step but ensure the new position is within the bounds """
        min_step = np.maximum(self.xmin - x, -self.stepsize)
        max_step = np.minimum(self.xmax - x, self.stepsize)
        random_step = np.random.uniform(low=min_step, high=max_step, size=x.shape)
        xnew = x + random_step
        return xnew
bounded_step = RandomDisplacementBounds(np.array([b[0] for b in bx]), np.array([b[1] for b in bx]))
minimizer_kwargs = {"method":"L-BFGS-B", "bounds": bnds}
globmin = sco.basinhopping(objective,
The output should be an array of numbers that add up to 1 or 100%. However, this is not happening.
This function is a failure on my end as well. It failed to choose values which were lower, i.e., regardless of the output from the optimization function (negative or positive), it persisted until the parameter I was optimizing was as bad as it could possibly be. I suspect that, since the function violates function encapsulation and relies on "function attributes" to adjust stepsize, the developer may not have respected encapsulated function scope elsewhere, and surprising behavior is happening as a result.
Regardless, in terms of theory, anything else is just a (dubious) estimated numerical partial second derivative (numerical Hessian, or "estimated curvature" for us mere mortals) based "performance" "gain", which reduces to a randomly-biased annealer in discrete, chaotic (phase space, continuous) or mixed (continuous and discrete) search spaces with volatile curvatures or planar areas (due to numerical underflow and loss of precision).
Anyway, import the following instead:
dual annealing (scipy.optimize.dual_annealing)
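On the question's specific symptom (weights that do not add up to 1): in the snippet shown, minimizer_kwargs passes only the bounds, and L-BFGS-B handles bounds but not equality constraints, so check_sum may never be enforced. One common workaround, sketched here under that assumption, is to rescale the weights inside the objective so that every candidate is evaluated as a portfolio summing to 1. The names normalize, make_normalized_objective, and toy_objective are hypothetical, and the toy objective only stands in for the question's objective.

```python
def normalize(weights):
    """Rescale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total == 0:
        # Degenerate case: fall back to equal weights.
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


def make_normalized_objective(objective):
    """Wrap an objective so it is always evaluated on weights summing to 1."""
    def wrapped(weights):
        return objective(normalize(weights))
    return wrapped


def toy_objective(weights):
    # Hypothetical stand-in for the question's objective: negative expected return.
    expected_returns = [0.05, 0.10, 0.02]
    return -sum(w * r for w, r in zip(weights, expected_returns))


wrapped = make_normalized_objective(toy_objective)
raw = [0.2, 0.6, 0.2]
scaled = [0.1, 0.3, 0.1]  # same proportions, different total
print(wrapped(raw), wrapped(scaled))
```

With this wrapper the optimizer only needs the (0, 1) bounds; the final answer is normalize(result.x) rather than result.x itself.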

How to set LpVariable and Objective Function in pulp for LPP as per the formula?

I want to calculate the maximised value for a particular user based on Interest, Popularity, or both Interest and Popularity, using the following Linear Programming Problem (LPP) equation
with the pulp package in Python 3.7.
I have 4 lists
INTEREST = [5,10,15,20,25]
POPULARITY = [4,8,12,16,20]
USER = [1,2,3,4,5]
cost = [2,4,6,8,10]
and 2 variable values as
e = 0.5 ; e may take the value 0, 1 or 0.5
i = 0 to n ; n is the length of the list
meaning that the summation runs over all list values.
Here, e == 0 means the Interest term is 0; e == 1 means the Popularity term is 0; e == 0.5 means both Interest and Popularity are considered for the max value.
Also, xi takes the value 0 or 1: if xi == 1 the user is considered, and if xi == 0 the user is not considered.
and my pulp code as below
from pulp import *
INTEREST = [5,10,15,20,25]
POPULARITY = [4,8,12,16,20]
USER = [1,2,3,4,5]
cost = [2,4,6,8,10]
prob = LpProblem("MaxValue", LpMaximize)
int_vars = LpVariable.dicts("Interest", INTEREST,0,4,LpContinuous)
pop_vars = LpVariable.dicts("Popularity",
user_vars = LpVariable.dicts("User",
prob += lpSum(USER(i)((INTEREST[i]*e for i in INTEREST) +
(POPULARITY[i]*(1-e) for i in POPULARITY)))
prob += USER(i)cost(i) <= budget
print("Status : ",LpStatus[prob.status])
print("The Max Value = ",value(prob.objective))
Now I am getting 2 errors as
1) line 714, in addInPlace for e in other:
2) line 23, in
prob += lpSum(INTEREST[i]*e for i in INTEREST) +
lpSum(POPULARITY[i]*(1-e) for i in POPULARITY)
IndexError: list index out of range
What did I do wrong in my code? Please guide me to resolve this problem. Thanks in advance.
I think I finally understand what you are trying to achieve. I think the problem with your description is to do with terminology. In a linear program we reserve the term variable for those variables which we want to be selected or chosen as part of the optimisation.
If I understand your needs correctly your python variables e and budget would be considered parameters or constants of the linear program.
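Written out, the model as I read it from your description would look roughly like this, where I_i, P_i and c_i are the entries of the interest, popularity and cost lists, B is the budget, and x_i are the only decision variables (the symbols are my own shorthand, not taken from your formula):

```latex
\max_{x} \; \sum_{i=1}^{n} x_i \bigl( e\, I_i + (1 - e)\, P_i \bigr)
\quad \text{subject to} \quad
\sum_{i=1}^{n} x_i\, c_i \le B, \qquad x_i \in \{0, 1\}
```

Here e and B are fixed numbers chosen before solving, which is why they are parameters rather than variables.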
I believe this does what you want:
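from pulp import *
import numpy as np
INTEREST = [5,10,15,20,25]
POPULARITY = [4,8,12,16,20]
COST = [2,4,6,8,10]
N = len(COST)
set_user = range(N)
# Parameters (constants) of the linear program
e = 0.5  # weighting from the question
budget = 10  # assumed value, consistent with the output shown below
prob = LpProblem("MaxValue", LpMaximize)
# Decision variables: x[i] == 1 if user i is selected
x = LpVariable.dicts("user_selected", set_user, 0, 1, LpBinary)
# Objective: weighted interest and popularity of the selected users
prob += lpSum([x[i]*(INTEREST[i]*e + POPULARITY[i]*(1-e)) for i in set_user])
# Constraint: total cost of the selected users must stay within the budget
prob += lpSum([x[i]*COST[i] for i in set_user]) <= budget
prob.solve()
print("Status : ",LpStatus[prob.status])
print("The Max Value = ",value(prob.objective))
# Show which users selected
x_soln = np.array([x[i].varValue for i in set_user])
print("user_vars: ")
print(x_soln)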
Which should return the following, i.e. with these particular parameters only the last user is selected for inclusion - but this decision will change - for example if you increase the budget to 100 all users will be selected.
Status : Optimal
The Max Value = 22.5
[0. 0. 0. 0. 1.]

Gini Coefficient in Julia: Efficient and Accurate Code

I'm trying to implement the following formula in Julia for calculating the Gini coefficient of a wage distribution:
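Judging from the code and its comments in this thread (y_i is the wage value, f(y_i) its probability, and S the running sums), the formula is presumably the discrete-distribution form of the Gini coefficient:

```latex
G = 1 - \frac{\sum_{i=1}^{n} f(y_i)\,\bigl(S_{i-1} + S_i\bigr)}{S_n},
\qquad
S_i = \sum_{j=1}^{i} f(y_j)\, y_j, \quad S_0 = 0
```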
Here's a simplified version of the code I'm using for this:
# Takes a array where first column is value of wages
# (y_i in formula), and second column is probability
# of wage value (f(y_i) in formula).
function gini(wagedistarray)
# First calculate S values in formula
for i in 1:length(wagedistarray[:,1])
for j in 1:i
# Now calculate value to subtract from 1 in gini formula
Gwages = Swages[1]*wagedistarray[1,2]
for i in 2:length(Swages)
Gwages += wagedistarray[i,2]*(Swages[i]+Swages[i-1])
# Final step of gini calculation
return giniwages=1-(Gwages/Swages[length(Swages)])
for i in 1:length(wagedistarray[:,1])
@time result=gini(wagedistarray)
It gives a value of near zero, which is what you expect for a completely equal wage distribution. However, it takes quite a long time: 6.796 secs.
Any ideas for improvement?
Try this:
function gini(wagedistarray)
    nrows = size(wagedistarray,1)
    Swages = zeros(nrows)
    for i in 1:nrows
        for j in 1:i
            Swages[i] += wagedistarray[j,2]*wagedistarray[j,1]
        end
    end
    Gwages = Swages[1]*wagedistarray[1,2]
    for i in 2:nrows
        Gwages += wagedistarray[i,2]*(Swages[i]+Swages[i-1])
    end
    return 1-(Gwages/Swages[length(Swages)])
end
for i in 1:size(wagedistarray,1)
@time result=gini(wagedistarray)
Time before: 5.913907256 seconds (4000481676 bytes allocated, 25.37% gc time)
Time after: 0.134799301 seconds (507260 bytes allocated)
Time after (second run): elapsed time: 0.123665107 seconds (80112 bytes allocated)
The primary problems are that Swages was a global variable (wasn't living in the function) which is not a good coding practice, but more importantly is a performance killer. The other thing I noticed was length(wagedistarray[:,1]), which makes a copy of that column and then asks its length - that was generating some extra "garbage". The second run is faster because there is some compilation time the very first time the function is run.
You can crank performance even higher using @inbounds, i.e.
function gini(wagedistarray)
    nrows = size(wagedistarray,1)
    Swages = zeros(nrows)
    @inbounds for i in 1:nrows
        for j in 1:i
            Swages[i] += wagedistarray[j,2]*wagedistarray[j,1]
        end
    end
    Gwages = Swages[1]*wagedistarray[1,2]
    @inbounds for i in 2:nrows
        Gwages += wagedistarray[i,2]*(Swages[i]+Swages[i-1])
    end
    return 1-(Gwages/Swages[length(Swages)])
end
which gives me elapsed time: 0.042070662 seconds (80112 bytes allocated)
Finally, check out this version, which is actually faster than all and is also the most accurate I think:
function gini2(wagedistarray)
    Swages = cumsum(wagedistarray[:,1].*wagedistarray[:,2])
    Gwages = Swages[1]*wagedistarray[1,2] +
                sum(wagedistarray[2:end,2] .*
                        (Swages[2:end] + Swages[1:end-1]))
    return 1 - Gwages/Swages[end]
end
Which has elapsed time: 0.00041119 seconds (721664 bytes allocated). The main benefit was changing from a O(n^2) double for loop to the O(n) cumsum.
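For readers who want to check the cumulative-sum idea outside Julia, here is a rough Python sketch of the same computation. It takes the wages and their probabilities as two separate lists (an assumption made for brevity) and uses itertools.accumulate in the role of cumsum; the function name gini is reused only for illustration.

```python
from itertools import accumulate


def gini(wages, freqs):
    # Running sums S_i = sum over j <= i of freqs[j] * wages[j], built in one pass.
    s = list(accumulate(w * f for w, f in zip(wages, freqs)))
    # First term uses S_0 = 0; the remaining terms pair each S_i with S_{i-1}.
    g = s[0] * freqs[0] + sum(
        f * (cur + prev) for f, cur, prev in zip(freqs[1:], s[1:], s[:-1])
    )
    return 1 - g / s[-1]


n = 1000
equal = gini([1.0] * n, [1.0 / n] * n)   # everyone earns the same
split = gini([0.0, 1.0], [0.5, 0.5])     # half earn nothing, half earn everything
print(equal, split)
```

The work is O(n) because each running sum is derived from the previous one instead of being recomputed from scratch.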
IainDunning has already provided a good answer with code that is fast enough for practical purposes (the function gini2). If one enjoys performance tweaking, one can get an additional speed increase of a factor 20 by avoiding temporary arrays (gini3). See the following code that compares the performance of the two implementations:
using TimeIt
for i in 1:size(wagedistarray,1)
wages = wagedistarray[:,1]
wagefrequencies = wagedistarray[:,2];
# original code
function gini2(wagedistarray)
    Swages = cumsum(wagedistarray[:,1].*wagedistarray[:,2])
    Gwages = Swages[1]*wagedistarray[1,2] +
                sum(wagedistarray[2:end,2] .*
                        (Swages[2:end] + Swages[1:end-1]))
    return 1 - Gwages/Swages[end]
end
# new code
function gini3(wages, wagefrequencies)
    Swages_previous = wages[1]*wagefrequencies[1]
    Gwages = Swages_previous*wagefrequencies[1]
    @inbounds for i = 2:length(wages)
        freq = wagefrequencies[i]
        Swages_current = Swages_previous + wages[i]*freq
        Gwages += freq * (Swages_current+Swages_previous)
        Swages_previous = Swages_current
    end
    return 1.0 - Gwages/Swages_previous
end
result=gini2(wagedistarray) # warming up JIT
println("result with gini2: $result, time:")
@timeit result=gini2(wagedistarray)
result=gini3(wages, wagefrequencies) # warming up JIT
println("result with gini3: $result, time:")
@timeit result=gini3(wages, wagefrequencies)
The output is:
result with gini2: 0.0, time:
1000 loops, best of 3: 321.57 µs per loop
result with gini3: -1.4210854715202004e-14, time:
10000 loops, best of 3: 16.24 µs per loop
gini3 is somewhat less accurate than gini2 due to sequential summation; one would have to use a variant of pairwise summation to increase accuracy.
