Faster way to simulate the crunch command behaviour on Python3.8 - python-3.x

I'm trying to simulate what the crunch command does in Linux, with the difference of yield the words instead of writing them into a file and i came up with something like this:
def wordlist(chars, min, max = None):
if max is None: # Means that the user want only a singular length
max = min
length = len(chars)
for n in range(min, max + 1):
indexes = [0] * n
for _ in range(length ** n): # The length of all the chars to the power of the places to fill return the number of words in the wordlist
for m in range(1, len(indexes) + 1): # This is the reporting system, like if indexes instead of a list is a number
if indexes[-m] == length:
indexes[-m] = 0
indexes[-m - 1] += 1
yield ''.join(chars[i] for i in indexes)
indexes[-1] += 1
It's a bit rude and not too much readable, maybe neither too much performing. Without using any external module like itertools, has someone got a better idea?
EDIT:
After a bit of struggling I have improved the math behind coming up to something like this:
def wordlist(chars, min, max = None):
if max is None:
max = min
if min <= 0 or max <= 0:
return
base = len(chars)
for n in range(min, max + 1):
for m in range(base ** n):
yield ''.join(chars[m // base ** (n - v - 1) % base] for v in range(n))
Anyway I measured the time taken by each of the two function and, while this new one is much more readable and pretty, the first one still faster. I still waiting for better ideas from you

Related

I'm Looking for ways to make this lexicographical code faster

I've been working on code to calculate the distance between 33 3D points and calculate the shortest route is between them. The initial code took in all 33 points and paired them consecutively and calculated the distances between the pairs using math.sqrt and sum them all up to get a final distance.
My problem is that with the sheer number of permutations of a list with 33 points (33 factorial!) the code is going to need to be at its absolute best to find the answer within a human lifetime (assuming I can use as many CPUs as I can get my hands on to increase the sheer computational power).
I've designed a simple web server to hand out an integer and convert it to a list and have the code perform a set number of lexicographical permutations from that point and send back the resulting shortest distance of that block. This part is fine but I have concerns over the code that does the distance calculations
I've put together a test version of my code so I could change things and see if it made the execution time faster or slower. This code starts at the beginning of the permutation list (0 to 32) in order and performs 50 million lexicographical iterations on it, checking the distance of the points at every iteration. the code is detailed below.
import json
import datetime
import math
def next_lexicographic_permutation(x):
i = len(x) - 2
while i >= 0:
if x[i] < x[i+1]:
break
else:
i -= 1
if i < 0:
return False
j = len(x) - 1
while j > i:
if x[j] > x[i]:
break
else:
j-= 1
x[i], x[j] = x[j], x[i]
reverse(x, i + 1)
return x
def reverse(arr, i):
if i > len(arr) - 1:
return
j = len(arr) - 1
while i < j:
arr[i], arr[j] = arr[j], arr[i]
i += 1
j -= 1
# ip for initial permutation
ip = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32]
lookup = '{"0":{"name":"van Maanen\'s Star","x":-6.3125,"y":-11.6875,"z":-4.125},\
"1":{"name":"Wolf 124","x":-7.25,"y":-27.1562,"z":-19.0938},\
"2":{"name":"Midgcut","x":-14.625,"y":10.3438,"z":13.1562},\
"3":{"name":"PSPF-LF 2","x":-4.40625,"y":-17.1562,"z":-15.3438},\
"4":{"name":"Wolf 629","x":-4.0625,"y":7.6875,"z":20.0938},\
"5":{"name":"LHS 3531","x":1.4375,"y":-11.1875,"z":16.7812},\
"6":{"name":"Stein 2051","x":-9.46875,"y":2.4375,"z":-15.375},\
"7":{"name":"Wolf 25","x":-11.0625,"y":-20.4688,"z":-7.125},\
"8":{"name":"Wolf 1481","x":5.1875,"y":13.375,"z":13.5625},\
"9":{"name":"Wolf 562","x":1.46875,"y":12.8438,"z":15.5625},\
"10":{"name":"LP 532-81","x":-1.5625,"y":-27.375,"z":-32.3125},\
"11":{"name":"LP 525-39","x":-19.7188,"y":-31.125,"z":-9.09375},\
"12":{"name":"LP 804-27","x":3.3125,"y":17.8438,"z":43.2812},\
"13":{"name":"Ross 671","x":-17.5312,"y":-13.8438,"z":0.625},\
"14":{"name":"LHS 340","x":20.4688,"y":8.25,"z":12.5},\
"15":{"name":"Haghole","x":-5.875,"y":0.90625,"z":23.8438},\
"16":{"name":"Trepin","x":26.375,"y":10.5625,"z":9.78125},\
"17":{"name":"Kokary","x":3.5,"y":-10.3125,"z":-11.4375},\
"18":{"name":"Akkadia","x":-1.75,"y":-33.9062,"z":-32.9688},\
"19":{"name":"Hill Pa Hsi","x":29.4688,"y":-1.6875,"z":25.375},\
"20":{"name":"Luyten 145-141","x":13.4375,"y":-0.8125,"z":6.65625},\
"21":{"name":"WISE 0855-0714","x":6.53125,"y":-2.15625,"z":2.03125},\
"22":{"name":"Alpha Centauri","x":3.03125,"y":-0.09375,"z":3.15625},\
"23":{"name":"LHS 450","x":-12.4062,"y":7.8125,"z":-1.875},\
"24":{"name":"LP 245-10","x":-18.9688,"y":-13.875,"z":-24.2812},\
"25":{"name":"Epsilon Indi","x":3.125,"y":-8.875,"z":7.125},\
"26":{"name":"Barnard\'s Star","x":-3.03125,"y":1.375,"z":4.9375},\
"27":{"name":"Epsilon Eridani","x":1.9375,"y":-7.75,"z":-6.84375},\
"28":{"name":"Narenses","x":-1.15625,"y":-11.0312,"z":21.875},\
"29":{"name":"Wolf 359","x":3.875,"y":6.46875,"z":-1.90625},\
"30":{"name":"LAWD 26","x":20.9062,"y":-7.5,"z":3.75},\
"31":{"name":"Avik","x":13.9688,"y":-4.59375,"z":-6.0},\
"32":{"name":"George Pantazis","x":-12.0938,"y":-16.0,"z":-14.2188}}'
lookup = json.loads(lookup)
lowest_total = 9999
# create 2D array for the distances and called it b to keep code looking clean.
b = [[0 for i in range(33)] for j in range(33)]
for x in range(33):
for y in range(33):
if x == y:
continue
else:
b[x][y] = math.sqrt(((lookup[str(x)]["x"] - lookup[str(y)]['x']) ** 2) + ((lookup[str(x)]['y'] - lookup[str(y)]['y']) ** 2) + ((lookup[str(x)]['z'] - lookup[str(y)]['z']) ** 2))
# begin timer
start_date = datetime.datetime.now().strftime("%Y-%m-%dT%H:%M:%SZ")
start = datetime.datetime.now()
print("[{}] Start".format(start_date))
# main iteration loop
for x in range(50_000_000):
distance = b[ip[0]][ip[1]] + b[ip[1]][ip[2]] + b[ip[2]][ip[3]] +\
b[ip[3]][ip[4]] + b[ip[4]][ip[5]] + b[ip[5]][ip[6]] +\
b[ip[6]][ip[7]] + b[ip[7]][ip[8]] + b[ip[8]][ip[9]] +\
b[ip[9]][ip[10]] + b[ip[10]][ip[11]] + b[ip[11]][ip[12]] +\
b[ip[12]][ip[13]] + b[ip[13]][ip[14]] + b[ip[14]][ip[15]] +\
b[ip[15]][ip[16]] + b[ip[16]][ip[17]] + b[ip[17]][ip[18]] +\
b[ip[18]][ip[19]] + b[ip[19]][ip[20]] + b[ip[20]][ip[21]] +\
b[ip[21]][ip[22]] + b[ip[22]][ip[23]] + b[ip[23]][ip[24]] +\
b[ip[24]][ip[25]] + b[ip[25]][ip[26]] + b[ip[26]][ip[27]] +\
b[ip[27]][ip[28]] + b[ip[28]][ip[29]] + b[ip[29]][ip[30]] +\
b[ip[30]][ip[31]] + b[ip[31]][ip[32]]
if distance < lowest_total:
lowest_total = distance
ip = next_lexicographic_permutation(ip)
# end timer
finish_date = datetime.datetime.now().strftime("%Y-%m-%dT%H:%M:%SZ")
finish = datetime.datetime.now()
print("[{}] Finish".format(finish_date))
diff = finish - start
print("Time taken => {}".format(diff))
print("Lowest distance => {}".format(lowest_total))
This is the result of a lot of work to make things faster. I was initially using string look-ups to find the distance to be calculated with a dict having keys like "1-2", but very quickly found out that it was very slow, I then moved onto hashed versions of the "1-2" key and the speed increased but the fastest way I have found so far is using a 2D array and looking up the values from there.
I have also found that manually constructing the distance calculation saved time over having a for x in ranges(32): loop adding the distances up and incrementing a variable to get the total.
Another great speed up was using pypy3 instead of python3 to execute it.
This usually takes 11 seconds to complete using pypy3
running 50 million of the distance calculation on its own takes 5.2 seconds
running 50 million of the next_lexicographic_permutation function on its own takes 6 seconds
I can't think of any way to make this faster and I believe there may be optimizations to be made in the next_lexicographic_permutation function. From what I've read about this the main bottleneck seems to be the switching of positions in the array:
x[i], x[j] = x[j], x[i]
Edit : added clarification of lifetime to represent human lifetime
The brute-force approach of calculating all the distances is going to be slower than a partitioning approach. Here is a similar question for the 3D case.

"Time Limit Exceeded" error for python file

I have a question about how to improve my simple Python file so that it does not exceed the time limit. My code should run in less than 2 seconds, but it takes a long time. I will be glad to know any advice about it. Code receives (n) as an integer from the user, then in n lines, I have to do the tasks. If the input is "Add" I have to add the given number and then arrange them from smallest to largest. If the input is "Ask", I have to return the asked index of added numbers.
This is
an example for inputs and outputs.
I guess the code works well for other examples, but the only problem is time ...
n = int(input())
def arrange(x):
for j in range(len(x)):
for i in range(len(x) - 1):
if x[i] > x[i + 1]:
x[i], x[i + 1] = x[i + 1], x[i]
tasks=[]
for i in range(n):
tasks.append(list(input().split()))
ref = []
for i in range(n):
if tasks[i][0] == 'Add':
ref.append(int(tasks[i][1]))
arrange(ref)
elif tasks[i][0] == 'Ask':
print(ref[int(tasks[i][1]) - 1])
For the given example, I get a "Time Limit Exceeded" Error.
First-off: Reimplementing list.sort will always be slower than just using it directly. If nothing else, getting rid of the arrange function and replacing the call to it with ref.sort() would improve performance (especially because Python's sorting algorithm is roughly O(n) when the input is largely sorted already, so you'll be reducing the work from the O(n**2) of your bubble-sorting arrange to roughly O(n), not just the O(n log n) of an optimized general purpose sort).
If that's not enough, note that list.sort is still theoretically O(n log n); if the list is getting large enough, that may cost more than it should. If so, take a look at the bisect module, to let you do the insertions with O(log n) lookup time (plus O(n) insertion time, but with very low constant factors) which might improve performance further.
Alternatively, if Ask operations are going to be infrequent, you might not sort at all when Adding, and only sort on demand when Ask occurs (possibly using a flag to indicate if it's already sorted so you don't call sort unnecessarily). That could make a meaningfully difference, especially if the inputs typically don't interleave Adds and Asks.
Lastly, in the realm of microoptimizations, you're needlessly wasting time on list copying and indexing you don't need to do, so stop doing it:
tasks=[]
for i in range(n):
tasks.append(input().split()) # Removed list() call; str.split already returns a list
ref = []
for action, value in tasks: # Don't iterate by index, iterate the raw list and unpack to useful
# names; it's meaningfully faster
if action == 'Add':
ref.append(int(value))
ref.sort()
elif action == 'Ask':
print(ref[int(value) - 1])
For me it runs in less than 0,005 seconds. Are you sure that you are measuring the right thing and you don't count in the time of giving the input for example?
python3 timer.py
Input:
7
Add 10
Add 2
Ask 1
Ask 2
Add 5
Ask 2
Ask 3
Output:
2
10
5
10
Elapsed time: 0.0033 seconds
My code:
import time
n = int(input('Input:\n'))
def arrange(x):
for j in range(len(x)):
for i in range(len(x) - 1):
if x[i] > x[i + 1]:
x[i], x[i + 1] = x[i + 1], x[i]
tasks=[]
for i in range(n):
tasks.append(list(input().split()))
tic = time.perf_counter()
ref = []
print('Output:')
for i in range(n):
if tasks[i][0] == 'Add':
ref.append(int(tasks[i][1]))
arrange(ref)
elif tasks[i][0] == 'Ask':
print(ref[int(tasks[i][1]) - 1])
toc = time.perf_counter()
print(f"Elapsed time: {toc - tic:0.4f} seconds")

How can I solve finding consecutive factors problem in an optimal way?

I need to develop a function which finds consecutive factors of the given number and then the function will return the smallest of these consecutive numbers.
I tried to solve a Codility question. (I submitted my solution)
I need to develop the solution function.
def solution(N):
# write your code in Python 3.6
pass
An example:
If N is 6, the function will return 2 (because of 6 = 2 * 3)
If N is 20, the function will return 4 (because of 20 = 4 * 5)
If N is 29, the function will return 0
I developed the solution function (by checking all the numbers from 1 up to N, brute force search) and it works.
However, when the argument of the solution function is too big, the execution of the function takes too much time. Codility Python engine is running the function for a while and then it is throwing TIMEOUT ERROR.
What may be an optimal solution for this problem?
Thank you
I developed the function but it is not optimized.
def solution(N):
for i in range(1,N+1):
if i * (i+1) == N:
return i
return 0
When N is too big like 12,567,543, the function execution takes too much time.
After my comment, I thought a little bit about the question.
If you have an integer, N, and two consecutive factors, m and m+1, then it MUST be true that m < sqrt(N) and m + 1 > sqrt(N)
Therefore, all you have to do is check if the floor of the square root times the ceiling of the square root is equal to your original number..
import math
def solution(N):
n1 = math.floor(math.sqrt(N))
n2 = n1 + 1 # or n2 = math.ceil(math.sqrt(N))
if n1*n2 == N:
return n1
return 0
This has a run time of O(1).
import math
import math
def mysol(n):
s = math.sqrt(n)
if math.floor(s) * math.ceil(s) == n:
return math.floor(s)
else:
return 0

How can I reduce the time complexity of the given python code?

I have this python program which computes the "Square Free Numbers" of a given number. I'm facing problem regarding the time complexity that is I'm getting the error as "Time Limit Exceeded" in an online compiler.
number = int(input())
factors = []
perfectSquares = []
count = 0
total_len = 0
# Find All the Factors of the given number
for i in range(1, number):
if number%i == 0:
factors.append(i)
# Find total number of factors
total_len = len(factors)
for items in factors:
for i in range(1,total_len):
# Eleminate perfect square numbers
if items == i * i:
if items == 1:
factors.remove(items)
count += 1
else:
perfectSquares.append(items)
factors.remove(items)
count += 1
# Eleminate factors that are divisible by the perfect squares
for i in factors:
for j in perfectSquares:
if i%j == 0:
count +=1
# Print Total Square Free numbers
total_len -= count
print(total_len)
How can I reduce the time complexity of this program? That is how can I reduce the for loops so the program gets executed with a smaller time complexity?
Algorithmic Techniques for Reducing Time Complexity(TC) of a python code.
In order to reduce time complexity of a code, it's very much necessary to reduce the usage of loops whenever and wherever possible.
I'll divide your code's logic part into 5 sections and suggest optimization in each one of them.
Section 1 - Declaration of Variables and taking input
number = int(input())
factors = []
perfectSquares = []
count = 0
total_len = 0
You can easily omit declaration of perfect squares, count and total_length, as they aren't needed, as explained further. This will reduce both Time and Space complexities of your code.
Also, you can use Fast IO, in order to speed up INPUTS and OUTPUTS
This is done by using 'stdin.readline', and 'stdout.write'.
Section 2 - Finding All factors
for i in range(1, number):
if number%i == 0:
factors.append(i)
Here, you can use List comprehension technique to create the factor list, due to the fact that List comprehension is faster than looping statements.
Also, you can just iterate till square root of the Number, instead of looping till number itself, thereby reducing time complexity exponentially.
Above code section guns down to...
After applying '1' hack
factors = [for i in range(1, number) if number%i == 0]
After applying '2' hack - Use from_iterable to store more than 1 value in each iteration in list comprehension
factors = list( chain.from_iterable(
(i, int(number/i)) for i in range(2, int(number**0.5)+1)
if number%i == 0
))
Section 3 - Eliminating Perfect Squares
# Find total number of factors
total_len = len(factors)
for items in factors:
for i in range(1,total_len):
# Eleminate perfect square numbers
if items == i * i:
if items == 1:
factors.remove(items)
count += 1
else:
perfectSquares.append(items)
factors.remove(items)
count += 1
Actually you can completely omit this part, and just add additional condition to the Section 2, namely ... type(i**0.5) != int, to eliminate those numbers which have integer square roots, hence being perfect squares themselves.
Implement as follows....
factors = list( chain.from_iterable(
(i, int(number/i)) for i in range(2, int(number**0.5)+1)
if number%i == 0 and type(i**0.5) != int
))
Section 4 - I think this Section isn't needed because Square Free Numbers doesn't have such Restriction
Section 5 - Finalizing Count, Printing Count
There's absolutely no need of counter, you can just compute length of factors list, and use it as Count.
OPTIMISED CODES
Way 1 - Little Faster
number = int(input())
# Find Factors of the given number
factors = []
for i in range(2, int(number**0.5)+1):
if number%i == 0 and type(i**0.5) != int:
factors.extend([i, int(number/i)])
print([1] + factors)
Way 2 - Optimal Programming - Very Fast
from itertools import chain
from sys import stdin, stdout
number = int(stdin.readline())
factors = list( chain.from_iterable(
(i, int(number/i)) for i in range(2, int(number**0.5)+1)
if number%i == 0 and type(i**0.5) != int
))
stdout.write(', '.join(map(str, [1] + factors)))
First of all, you only need to check for i in range(1, number/2):, since number/2 + 1 and greater cannot be factors.
Second, you can compute the number of perfect squares that could be factors in sublinear time:
squares = []
for i in range(1, math.floor(math.sqrt(number/2))):
squares.append(i**2)
Third, you can search for factors and when you find one, check that it is not divisible by a square, and only then add it to the list of factors.
This approach will save you all the time of your for items in factors nested loop block, as well as the next block. I'm not sure if it will definitely be faster, but it is less wasteful.
I used the code provided in the answer above but it didn't give me the correct answer. This actually computes the square free list of factors of a number.
number = int(input())
factors = [
i for i in range(2, int(number/2)+1)
if number%i == 0 and int(int(math.sqrt(i))**2)!=i
]
print([1] + factors)

make change in python (maximum recursion depth exceeded in comparison)

So I have a recursive solution to the make change problem that works sometimes. It is:
def change(n, c):
if (n == 0):
return 1
if (n < 0):
return 0;
if (c + 1 <= 0 and n >= 1):
return 0
return change(n, c - 1) + change(n - coins[c - 1], c);
where coins is my array of coins. For example [1,5,10,25]. n is the amount of coins, for example 1000, and c is the length of the coins array - 1. This solution works in some situations. But when I need it to run in under two seconds and I use:
coins: [1,5,10,25]
n: 1000
I get a:
RecursionError: maximum recursion depth exceeded in comparison
So my question is, what would be the best way to optimize this. Using some sort of flow control? I don't want to do something like.
# Set recursion limit
sys.setrecursionlimit(10000000000)
UPDATE:
I now have something like
def coinss(n, c):
if n == 0:
return 1
if n < 0:
return 0
nCombos = 0
for c in range(c, -1, -1):
nCombos += coinss(n - coins[c - 1], c)
return nCombos
but it takes forever. it'd be ideal to have this run under a second.
As suggested in the answers above you could use DP for a more optimal solution.
Also your conditional check -
if (c + 1 <= 0 and n >= 1)
should be
if (c <= 1 ):
as n will always be >=1 and c <= 1 will prevent any calculations if the number of coins is lesser than or equal to 1.
While using recursion you will always run into this. If you set the recursion limit higher, you may be able to use your algorithm on a bigger number, but you will always be limited. The recursion limit is there to keep you from getting a stack overflow.
The best way to solved for bigger change amounts would be to swap to an iterative approach. There are algorithms out there, wikipedia:
https://en.wikipedia.org/wiki/Change-making_problem
Note that you have a bug here:
if (c + 1 <= 0 and n >= 1):
is like
if (c <= -1 and n >= 1):
So c can be 0 and pass to the next step where you pass c-1 to the index, which works because python doesn't mind negative indexes but still false (coins[-1] yields 25), so your solution sometimes prints 1 combination too much.
I've rewritten your algorithm with recursive and stack approaches:
Recursive (fixed, no need for c at init thanks to an internal recursive method, but still overflows the stack):
coins = [1,5,10,25]
def change(n):
def change_recurse(n, c):
if n == 0:
return 1
if n < 0:
return 0;
if c <= 0:
return 0
return change_recurse(n, c - 1) + change_recurse(n - coins[c - 1], c)
return change_recurse(n,len(coins))
iterative/stack approach (not dynamic programming), doesn't recurse, just uses a "stack" to store the computations to perform:
def change2(n):
def change_iter(stack):
result = 0
# continue while the stack isn't empty
while stack:
# process one computation
n,c = stack.pop()
if n == 0:
# one solution found, increase counter
result += 1
if n > 0 and c > 0:
# not found, request 2 more computations
stack.append((n, c - 1))
stack.append((n - coins[c - 1], c))
return result
return change_iter([(n,len(coins))])
Both methods return the same values for low values of n.
for i in range(1,200):
a,b = change(i),change2(i)
if a != b:
print("error",i,a,b)
the code above runs without any error prints.
Now print(change2(1000)) takes a few seconds but prints 142511 without blowing the stack.

Resources