I've been working on code that takes 33 3D points, calculates the distances between them, and finds the shortest route through them. The initial code took in all 33 points, paired them consecutively, calculated the distances between the pairs using math.sqrt, and summed them all up to get a final distance.
My problem is that with the sheer number of permutations of a list of 33 points (33 factorial!), the code is going to need to be at its absolute best to find the answer within a human lifetime (assuming I can use as many CPUs as I can get my hands on to increase the computational power).
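For scale, here is a rough back-of-the-envelope calculation, using the ~50 million routes per 11 seconds figure I measure further down:
import math

total_routes = math.factorial(33)        # about 8.7e36 possible orderings
routes_per_second = 50_000_000 / 11      # roughly what one core manages in my test below
seconds = total_routes / routes_per_second
print(seconds / (3600 * 24 * 365))       # roughly 6e22 core-years for an exhaustive sweep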
I've designed a simple web server that hands out an integer to be converted to a list (the starting permutation); the code then performs a set number of lexicographic permutations from that point and sends back the resulting shortest distance for that block. This part is fine, but I have concerns over the code that does the distance calculations.
I've put together a test version of my code so I could change things and see whether they made the execution time faster or slower. This code starts at the beginning of the permutation list (0 to 32, in order) and performs 50 million lexicographic iterations on it, checking the distance of the points at every iteration. The code is detailed below.
import json
import datetime
import math
def next_lexicographic_permutation(x):
    # Find the rightmost index i with x[i] < x[i+1]
    i = len(x) - 2
    while i >= 0:
        if x[i] < x[i + 1]:
            break
        else:
            i -= 1
    if i < 0:
        return False
    # Find the rightmost j > i with x[j] > x[i]
    j = len(x) - 1
    while j > i:
        if x[j] > x[i]:
            break
        else:
            j -= 1
    # Swap the pivot and reverse the suffix
    x[i], x[j] = x[j], x[i]
    reverse(x, i + 1)
    return x

def reverse(arr, i):
    # Reverse arr[i:] in place
    if i > len(arr) - 1:
        return
    j = len(arr) - 1
    while i < j:
        arr[i], arr[j] = arr[j], arr[i]
        i += 1
        j -= 1
# ip for initial permutation
ip = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32]
lookup = '{"0":{"name":"van Maanen\'s Star","x":-6.3125,"y":-11.6875,"z":-4.125},\
"1":{"name":"Wolf 124","x":-7.25,"y":-27.1562,"z":-19.0938},\
"2":{"name":"Midgcut","x":-14.625,"y":10.3438,"z":13.1562},\
"3":{"name":"PSPF-LF 2","x":-4.40625,"y":-17.1562,"z":-15.3438},\
"4":{"name":"Wolf 629","x":-4.0625,"y":7.6875,"z":20.0938},\
"5":{"name":"LHS 3531","x":1.4375,"y":-11.1875,"z":16.7812},\
"6":{"name":"Stein 2051","x":-9.46875,"y":2.4375,"z":-15.375},\
"7":{"name":"Wolf 25","x":-11.0625,"y":-20.4688,"z":-7.125},\
"8":{"name":"Wolf 1481","x":5.1875,"y":13.375,"z":13.5625},\
"9":{"name":"Wolf 562","x":1.46875,"y":12.8438,"z":15.5625},\
"10":{"name":"LP 532-81","x":-1.5625,"y":-27.375,"z":-32.3125},\
"11":{"name":"LP 525-39","x":-19.7188,"y":-31.125,"z":-9.09375},\
"12":{"name":"LP 804-27","x":3.3125,"y":17.8438,"z":43.2812},\
"13":{"name":"Ross 671","x":-17.5312,"y":-13.8438,"z":0.625},\
"14":{"name":"LHS 340","x":20.4688,"y":8.25,"z":12.5},\
"15":{"name":"Haghole","x":-5.875,"y":0.90625,"z":23.8438},\
"16":{"name":"Trepin","x":26.375,"y":10.5625,"z":9.78125},\
"17":{"name":"Kokary","x":3.5,"y":-10.3125,"z":-11.4375},\
"18":{"name":"Akkadia","x":-1.75,"y":-33.9062,"z":-32.9688},\
"19":{"name":"Hill Pa Hsi","x":29.4688,"y":-1.6875,"z":25.375},\
"20":{"name":"Luyten 145-141","x":13.4375,"y":-0.8125,"z":6.65625},\
"21":{"name":"WISE 0855-0714","x":6.53125,"y":-2.15625,"z":2.03125},\
"22":{"name":"Alpha Centauri","x":3.03125,"y":-0.09375,"z":3.15625},\
"23":{"name":"LHS 450","x":-12.4062,"y":7.8125,"z":-1.875},\
"24":{"name":"LP 245-10","x":-18.9688,"y":-13.875,"z":-24.2812},\
"25":{"name":"Epsilon Indi","x":3.125,"y":-8.875,"z":7.125},\
"26":{"name":"Barnard\'s Star","x":-3.03125,"y":1.375,"z":4.9375},\
"27":{"name":"Epsilon Eridani","x":1.9375,"y":-7.75,"z":-6.84375},\
"28":{"name":"Narenses","x":-1.15625,"y":-11.0312,"z":21.875},\
"29":{"name":"Wolf 359","x":3.875,"y":6.46875,"z":-1.90625},\
"30":{"name":"LAWD 26","x":20.9062,"y":-7.5,"z":3.75},\
"31":{"name":"Avik","x":13.9688,"y":-4.59375,"z":-6.0},\
"32":{"name":"George Pantazis","x":-12.0938,"y":-16.0,"z":-14.2188}}'
lookup = json.loads(lookup)
lowest_total = 9999
# create a 2D array for the distances; call it b to keep the code short
b = [[0 for i in range(33)] for j in range(33)]
for x in range(33):
    for y in range(33):
        if x == y:
            continue
        b[x][y] = math.sqrt(((lookup[str(x)]["x"] - lookup[str(y)]["x"]) ** 2) +
                            ((lookup[str(x)]["y"] - lookup[str(y)]["y"]) ** 2) +
                            ((lookup[str(x)]["z"] - lookup[str(y)]["z"]) ** 2))
# begin timer
start_date = datetime.datetime.now().strftime("%Y-%m-%dT%H:%M:%SZ")
start = datetime.datetime.now()
print("[{}] Start".format(start_date))
# main iteration loop
for x in range(50_000_000):
    distance = b[ip[0]][ip[1]] + b[ip[1]][ip[2]] + b[ip[2]][ip[3]] + \
               b[ip[3]][ip[4]] + b[ip[4]][ip[5]] + b[ip[5]][ip[6]] + \
               b[ip[6]][ip[7]] + b[ip[7]][ip[8]] + b[ip[8]][ip[9]] + \
               b[ip[9]][ip[10]] + b[ip[10]][ip[11]] + b[ip[11]][ip[12]] + \
               b[ip[12]][ip[13]] + b[ip[13]][ip[14]] + b[ip[14]][ip[15]] + \
               b[ip[15]][ip[16]] + b[ip[16]][ip[17]] + b[ip[17]][ip[18]] + \
               b[ip[18]][ip[19]] + b[ip[19]][ip[20]] + b[ip[20]][ip[21]] + \
               b[ip[21]][ip[22]] + b[ip[22]][ip[23]] + b[ip[23]][ip[24]] + \
               b[ip[24]][ip[25]] + b[ip[25]][ip[26]] + b[ip[26]][ip[27]] + \
               b[ip[27]][ip[28]] + b[ip[28]][ip[29]] + b[ip[29]][ip[30]] + \
               b[ip[30]][ip[31]] + b[ip[31]][ip[32]]
    if distance < lowest_total:
        lowest_total = distance
    ip = next_lexicographic_permutation(ip)
# end timer
finish_date = datetime.datetime.now().strftime("%Y-%m-%dT%H:%M:%SZ")
finish = datetime.datetime.now()
print("[{}] Finish".format(finish_date))
diff = finish - start
print("Time taken => {}".format(diff))
print("Lowest distance => {}".format(lowest_total))
This is the result of a lot of work to make things faster. I was initially using string look-ups to find the distance, with a dict having keys like "1-2", but quickly found that it was very slow. I then moved on to hashed versions of the "1-2" key and the speed increased, but the fastest way I have found so far is using a 2D array and looking the values up from there.
I have also found that manually constructing the distance calculation saved time over having a for x in range(32): loop adding the distances up and incrementing a variable to get the total.
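For comparison, the loop-based version was along these lines (roughly; the same table lookups, just with an accumulator and an index loop instead of one unrolled expression):
distance = 0.0
for k in range(32):
    distance += b[ip[k]][ip[k + 1]]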
Another great speed-up was using pypy3 instead of python3 to execute it.
The full test usually takes 11 seconds to complete using pypy3.
Running the distance calculation 50 million times on its own takes 5.2 seconds.
Running the next_lexicographic_permutation function 50 million times on its own takes 6 seconds.
I can't think of any way to make this faster, though I believe there may be optimizations to be made in the next_lexicographic_permutation function. From what I've read, the main bottleneck seems to be the swapping of positions in the array:
x[i], x[j] = x[j], x[i]
Edit: added clarification that "lifetime" means a human lifetime.
The brute-force approach of calculating all the distances is going to be slower than a partitioning approach. Here is a similar question for the 3D case.
I have a question about how to improve my simple Python script so that it does not exceed the time limit. My code should run in less than 2 seconds, but it takes far longer. I would be glad of any advice. The code receives an integer n from the user, then handles n lines of tasks. If a line is "Add", I have to add the given number and keep the numbers arranged from smallest to largest. If a line is "Ask", I have to return the number at the asked (1-based) index of the added numbers.
This is an example of the inputs and outputs:
Input:
7
Add 10
Add 2
Ask 1
Ask 2
Add 5
Ask 2
Ask 3
Output:
2
10
5
10
I believe the code works correctly for examples like this; the only problem is the time.
n = int(input())

def arrange(x):
    for j in range(len(x)):
        for i in range(len(x) - 1):
            if x[i] > x[i + 1]:
                x[i], x[i + 1] = x[i + 1], x[i]

tasks = []
for i in range(n):
    tasks.append(list(input().split()))

ref = []
for i in range(n):
    if tasks[i][0] == 'Add':
        ref.append(int(tasks[i][1]))
        arrange(ref)
    elif tasks[i][0] == 'Ask':
        print(ref[int(tasks[i][1]) - 1])
For the given example, I get a "Time Limit Exceeded" Error.
First off: reimplementing list.sort will always be slower than just using it directly. If nothing else, getting rid of the arrange function and replacing the call to it with ref.sort() would improve performance (especially because Python's sorting algorithm is roughly O(n) when the input is largely sorted already, so you'd be reducing the work from the O(n**2) of your bubble-sorting arrange to roughly O(n), not just the O(n log n) of an optimized general-purpose sort).
If that's not enough, note that list.sort is still theoretically O(n log n); if the list is getting large enough, that may cost more than it should. If so, take a look at the bisect module, to let you do the insertions with O(log n) lookup time (plus O(n) insertion time, but with very low constant factors) which might improve performance further.
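If you go that route, a minimal sketch with bisect.insort might look like this (it processes each line as it arrives rather than collecting the tasks first):
import bisect

n = int(input())
ref = []
for _ in range(n):
    action, value = input().split()
    if action == 'Add':
        bisect.insort(ref, int(value))  # O(log n) search plus an O(n) shift; no full re-sort
    elif action == 'Ask':
        print(ref[int(value) - 1])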
Alternatively, if Ask operations are going to be infrequent, you might not sort at all when Adding, and only sort on demand when an Ask occurs (possibly using a flag to indicate whether the list is already sorted, so you don't call sort unnecessarily). That could make a meaningful difference, especially if the inputs don't typically interleave Adds and Asks.
Lastly, in the realm of micro-optimizations, you're wasting time on list copying and indexing you don't need, so stop doing it:
tasks = []
for i in range(n):
    tasks.append(input().split())  # Removed list() call; str.split already returns a list

ref = []
for action, value in tasks:  # Don't iterate by index; iterate the raw list and unpack to
                             # useful names. It's meaningfully faster.
    if action == 'Add':
        ref.append(int(value))
        ref.sort()
    elif action == 'Ask':
        print(ref[int(value) - 1])
For me it runs in less than 0.005 seconds. Are you sure that you are measuring the right thing and not counting, for example, the time spent entering the input?
python3 timer.py
Input:
7
Add 10
Add 2
Ask 1
Ask 2
Add 5
Ask 2
Ask 3
Output:
2
10
5
10
Elapsed time: 0.0033 seconds
My code:
import time

n = int(input('Input:\n'))

def arrange(x):
    for j in range(len(x)):
        for i in range(len(x) - 1):
            if x[i] > x[i + 1]:
                x[i], x[i + 1] = x[i + 1], x[i]

tasks = []
for i in range(n):
    tasks.append(list(input().split()))

tic = time.perf_counter()
ref = []
print('Output:')
for i in range(n):
    if tasks[i][0] == 'Add':
        ref.append(int(tasks[i][1]))
        arrange(ref)
    elif tasks[i][0] == 'Ask':
        print(ref[int(tasks[i][1]) - 1])
toc = time.perf_counter()
print(f"Elapsed time: {toc - tic:0.4f} seconds")
I have a for loop that performs a series of operations, and I discovered that one method in particular accounts for about 44% of each iteration's execution time. What should I do to speed this loop up: use asyncio? multiprocessing? My idea for increasing the execution speed is to have the next iteration of the loop begin as soon as huge_method is called, so that both iterations run at the same time. Any suggestions on how I can do this?
Example of what I mean:
for i in range(len(some_list)):
    x = some_list[i]['model']
    y = some_other_list[i]['prediction']
    result = huge_method(x, y)  # This is what's taking up most of the time in this loop
    # some more code...
    list_of_results.append(result)
Note: nothing calculated during an individual iteration depends on anything calculated in a previous iteration. Also, each iteration appends a result to a list. Maintaining the order of each iteration's results is important, but I can make it work otherwise. What I'm referring to as huge_method here is a method from a 3rd-party library, not code I would like to modify.
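To illustrate the idea with the simplified loop above, here is a sketch only, assuming huge_method and its inputs can be pickled and that the call is CPU-bound:
from concurrent.futures import ProcessPoolExecutor

xs = [item['model'] for item in some_list]
ys = [item['prediction'] for item in some_other_list]

# executor.map keeps results in input order, so list_of_results stays
# aligned with some_list even though the calls run in parallel.
with ProcessPoolExecutor() as executor:
    list_of_results = list(executor.map(huge_method, xs, ys))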
Edit: For clarity, here is the actual code I'm working with:
for ii_day in range(len(prediction_indices)):
    model_idx = prediction_indices[ii_day]["model_idx"]
    prediction_idx = prediction_indices[ii_day]["prediction_idx"]
    # Fit model
    regression_period_signal = historical_signal_levels[model_idx, :]
    regression_period_price_change = historical_price_moves[model_idx]
    try:
        # This is the huge_method that takes half the time of an iteration
        rolling_regression_model = LinearRegression().fit(regression_period_signal, regression_period_price_change)
        self.coef_list.append(rolling_regression_model.coef_)
        # Calculate model error
        predictions = rolling_regression_model.predict(regression_period_signal)
        forecast_horizon_model_error = np.sqrt(
            mean_squared_error(regression_period_price_change, predictions))
        # Predictions
        forecast_distance = 1
        current_research = historical_signal_levels[prediction_idx, :]
        forecast_price_change = rolling_regression_model.predict(current_research)
        # Calculate drift and volatility
        volatility = ((1 + forecast_horizon_model_error) * (forecast_distance ** -0.5)) - 1
        # Kelly recommended optimum
        if volatility < 0:
            raise ZeroDivisionError("Volatility needs to be positive value.")
        if volatility == 0:
            volatility = 0.01
        kelly_recommended_optimum = forecast_price_change / volatility ** 2
        rule_recommended_allocation = self.kelly_fraction * kelly_recommended_optimum
    except:
        rule_recommended_allocation = np.zeros(len(prediction_idx))
    # Apply the calculated allocation to the dataframe.
    price_research_series.loc[prediction_idx, position_key] = rule_recommended_allocation
I wanted to see how drastic the difference in running time is between the iterative and recursive approaches to summing an array, so I plotted time versus size of the list for a pretty decent range of sizes (up to 995). What I got was pretty much what I wanted, except that something unexpected caught my eye.
The graph can be seen here.
What's confusing me are those bumps that the green line suddenly takes at certain values before coming back down. Why does that happen?
Here is the code I had written:
import matplotlib.pyplot as plt
from time import time

def sum_rec(lst):  # Sums recursively
    if len(lst) == 0:
        return 0
    return lst[0] + sum_rec(lst[1:])

def sum_iter(lst):  # Sums iteratively
    Sum = 0
    for i in range(len(lst)):
        Sum += lst[i]
    return Sum

def check_time(lst):  # Returns the time taken for both algorithms
    start = time()
    Sum = sum_iter(lst)
    end = time()
    t_iter = end - start

    start = time()
    Sum = sum_rec(lst)
    end = time()
    t_rec = end - start
    return t_iter, t_rec

N = [n for n in range(995)]
T1 = []  # for iterative function
T2 = []  # for recursive function
for n in N:  # values on the x-axis
    lst = [i for i in range(n)]
    t_iter, t_rec = check_time(lst)
    T1.append(t_iter)
    T2.append(t_rec)

plt.plot(N, T1)
plt.plot(N, T2)  # Both plotted on graph
plt.show()
I'd say both curves look roughly linear over this range, but the recursive one has a much higher cost per element, which causes the steeper slope.
Other than that:
(1) You're mixing up the two plots: the one that stays near zero is the iterative version, while the one that climbs is the recursive version.
One likely explanation is that each recursive call creates a new stack frame and, in this implementation, also copies the rest of the list with lst[1:], which costs far more than the iterative loop's single pass.
(2) You should increase the size of the array, as small sizes are more likely to cause spikes due to locality of reference.
(3) You should repeat the experiment several times and combine the results (for example by averaging them, or by taking the minimum), so that random spikes caused by some other process hogging the machine are evened out.
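A quick sketch of that last point, using timeit.repeat with the question's own functions (keeping the minimum of several runs filters out one-off spikes; the mean works too):
from timeit import repeat

# Time each function several times and keep the best run, so a one-off
# spike from another process doesn't show up as a bump in the plot.
t_iter = min(repeat(lambda: sum_iter(lst), number=1, repeat=5))
t_rec = min(repeat(lambda: sum_rec(lst), number=1, repeat=5))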
I've made a simple terrain generator, but it takes an excessive amount of time to generate anything bigger than 50x50. Is there anything I can do to optimise the code so that I can generate larger maps? I know that things such as pygame or numpy might be better suited for this, but at my school they won't install those, so this is what I have to work with.
Here's the relevant code:
def InitMap(self):
    aliveCells = []
    for x in range(self.width):
        for y in range(self.height):
            if random.random() < self.aliveChance:
                aliveCells.append(self.FindInGrid(x, y))
    return aliveCells

def GenerateMap(self):
    aliveCells = self.InitMap()
    shallowCells = []
    self.count = 1
    for i in range(self.steps):
        aliveCells = self.DoGenStep(aliveCells)
    for i in aliveCells:
        self.canvas.itemconfig(i, fill="green")
    for i in aliveCells:
        for j in self.FindNeighbours(i):
            if j not in aliveCells:
                self.canvas.itemconfig(i, fill="#0000FF")

def DoGenStep(self, oldAliveCells):
    newAliveCells = []
    for allCells in self.pos:
        for cell in allCells:
            self.root.title(str(round((self.count / (self.height * self.width) * 100) / self.steps)) + "%")
            self.count += 1
            aliveNeighbours = 0
            for i in self.FindNeighbours(cell):
                if i in oldAliveCells:
                    aliveNeighbours += 1
            if cell in oldAliveCells:
                if aliveNeighbours < self.deathLimit:
                    pass
                else:
                    newAliveCells.append(cell)
            else:
                if aliveNeighbours > self.birthLimit:
                    newAliveCells.append(cell)
    return newAliveCells

def FindNeighbours(self, cell):
    cellCoords = self.GetCoords(cell)
    neighbours = []
    for xMod in [-1, 0, 1]:
        x = xMod + cellCoords[0]
        for yMod in [-1, 0, 1]:
            y = yMod + cellCoords[1]
            if x < 0 or x >= self.width:
                pass
            elif y < 0 or y >= self.height:
                pass
            elif xMod == 0 and yMod == 0:
                pass
            else:
                neighbours.append(self.FindInGrid(x, y))
    return neighbours
NB: You didn't add the method "FindInGrid", so I'm making some assumptions. Please correct me if I'm wrong.
One thing which would help immensely for larger maps, and also at high densities, is not to store just the alive cells but the entire grid. By storing only the alive cells in a list, you make the running time of your program on the order of O((x*y)^2), since every "in oldAliveCells" check has to scan the whole list of alive cells for every cell. If you store the entire grid, that is not necessary, and the calculation can be performed in time linear in the surface of your grid rather than quadratic.
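A sketch of that idea, assuming the grid is kept as a 2D list of booleans indexed by (x, y) and reusing the same birth/death limits; the method name do_gen_step and the grid layout are my assumptions, not taken from the posted code:
def do_gen_step(self, old_grid):
    # old_grid[x][y] is True for alive cells; membership checks become
    # O(1) index lookups instead of O(alive) list scans.
    new_grid = [[False] * self.height for _ in range(self.width)]
    for x in range(self.width):
        for y in range(self.height):
            alive_neighbours = 0
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    if dx == 0 and dy == 0:
                        continue
                    nx, ny = x + dx, y + dy
                    if 0 <= nx < self.width and 0 <= ny < self.height and old_grid[nx][ny]:
                        alive_neighbours += 1
            if old_grid[x][y]:
                new_grid[x][y] = alive_neighbours >= self.deathLimit
            else:
                new_grid[x][y] = alive_neighbours > self.birthLimit
    return new_grid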
Additional point:
self.root.title(str(round((self.count/(self.height*self.width)*100)/self.steps))+"%")
That's a string operation, which makes it relatively expensive. Are you sure you need to do this after each and every update of a single cell?
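For example, it could be moved out to the per-step loop in GenerateMap, so the title is refreshed once per generation rather than once per cell (a sketch):
for i in range(self.steps):
    aliveCells = self.DoGenStep(aliveCells)
    self.root.title("{}%".format(round((i + 1) / self.steps * 100)))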