Create a dictionary of subcubes from a larger cube in Python

I am examining every contiguous 8 x 8 x 8 subcube within a 50 x 50 x 50 cube. I am trying to create a collection (in this case a dictionary) that maps each subcube sum to a count of how many subcubes share that same sum. So in essence, the result would look something like this:
{key = sum, value = number of cubes that have the same sum}
{256 : 3, 119 : 2, ...}
So in this example, there are 3 cubes that sum to 256 and 2 cubes that sum to 119, etc. Here is the code I have thus far, but it only sums (at least I think it does):
import numpy as np

an_array = np.array([i for i in range(512)])  # 512 = 8 * 8 * 8
cube = np.reshape(an_array, (8, 8, 8))
cs = 8  # cube size
sum = 0
idx = None
for i in range(cube.shape[0] - cs + 1):
    for j in range(cube.shape[1] - cs + 1):
        for k in range(cube.shape[2] - cs + 1):
            cube_sum = np.sum(cube[i:i + cs, j:j + cs, k:k + cs])
            new_list = {cube_sum : ?}
What I am trying to make this do is iterate over the subcubes within the cube, sum each subcube, then count the subcubes that share the same sum. Any ideas would be appreciated.

import numpy as np
from collections import defaultdict

an_array = np.array([i for i in range(512)])  # 512 = 8 * 8 * 8
cube = np.reshape(an_array, (8, 8, 8))
cs = 8  # cube size
result = defaultdict(int)
for i in range(cube.shape[0] - cs + 1):
    for j in range(cube.shape[1] - cs + 1):
        for k in range(cube.shape[2] - cs + 1):
            cube_sum = np.sum(cube[i:i + cs, j:j + cs, k:k + cs])
            result[cube_sum] += 1
Explanation
A defaultdict(int) behaves like result.get(key, 0): a key that doesn't exist yet is initialized to 0. So the line result[cube_sum] += 1 either starts a new sum's count at 1 or adds 1 to the current count for cube_sum.
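For the full 50 x 50 x 50 case in the question, a vectorized sketch of the same tally (assuming NumPy >= 1.20 for sliding_window_view; Counter here plays the role of the defaultdict):
import numpy as np
from collections import Counter
from numpy.lib.stride_tricks import sliding_window_view

cube = np.arange(50 ** 3).reshape(50, 50, 50)
windows = sliding_window_view(cube, (8, 8, 8))  # shape (43, 43, 43, 8, 8, 8)
sums = windows.sum(axis=(3, 4, 5))              # sum of every 8 x 8 x 8 subcube
result = Counter(sums.ravel().tolist())         # {sum: number of subcubes}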

Related

Numpy tensor implementation slower than loop

I have two functions that compute the same metric. One ends up using a list comprehension to cycle through the calculation, the other uses only numpy tensor operations. The functions take in an (N, 3) array, where N is the number of points in 3D space. When N < ~3000 the tensor function is faster; when N > ~3000 the list comprehension is faster. Both seem to have linear time complexity in N, i.e. the two time-vs-N lines cross at N ≈ 3000.
def approximate_area_loop(section, num_area_divisions):
    n_a_d = num_area_divisions
    interp_vectors = get_section_interp_(section)
    a1 = section[:-1]
    b1 = section[1:]
    a2 = interp_vectors[:-1]
    b2 = interp_vectors[1:]
    c = lambda u: (1 - u) * a1 + u * a2
    d = lambda u: (1 - u) * b1 + u * b2
    x = lambda u, v: (1 - v) * c(u) + v * d(u)
    area = np.sum([np.linalg.norm(np.cross(x((i + 1) / n_a_d, j / n_a_d) - x(i / n_a_d, j / n_a_d),
                                           x(i / n_a_d, (j + 1) / n_a_d) - x(i / n_a_d, j / n_a_d)), axis=1)
                   for i in range(n_a_d) for j in range(n_a_d)])
    Dt = section[-1, 0] - section[0, 0]
    return area, Dt
def approximate_area_tensor(section, num_area_divisions):
    divisors = np.linspace(0, 1, num_area_divisions + 1)
    interp_vectors = get_section_interp_(section)
    a1 = section[:-1]
    b1 = section[1:]
    a2 = interp_vectors[:-1]
    b2 = interp_vectors[1:]
    c = np.multiply.outer(a1, (1 - divisors)) + np.multiply.outer(a2, divisors)  # c_areas_vecs_divs
    d = np.multiply.outer(b1, (1 - divisors)) + np.multiply.outer(b2, divisors)  # d_areas_vecs_divs
    x = np.multiply.outer(c, (1 - divisors)) + np.multiply.outer(d, divisors)    # x_areas_vecs_Divs_divs
    u = x[:, :, 1:, :-1] - x[:, :, :-1, :-1]  # u_areas_vecs_Divs_divs
    v = x[:, :, :-1, 1:] - x[:, :, :-1, :-1]  # v_areas_vecs_Divs_divs
    sub_area_norm_vecs = np.cross(u, v, axis=1)             # areas_crosses_Divs_divs
    sub_areas = np.linalg.norm(sub_area_norm_vecs, axis=1)  # areas_Divs_divs (values are now sub areas)
    area = np.sum(sub_areas)
    Dt = section[-1, 0] - section[0, 0]
    return area, Dt
Why is the list comprehension version faster at large N? Surely the tensor version should be faster? I'm wondering if it's something to do with the size of the intermediate arrays being too big to fit in cache. Please ask if I haven't included enough information; I'd really like to get to the bottom of this.
The bottleneck in the fully vectorized function was indeed in np.linalg.norm, as @hpaulj's comment suggested.
norm was used only to get the magnitude of all the vectors along axis 1. A much simpler and faster method is to just use:
sub_areas = np.sqrt((sub_area_norm_vecs * sub_area_norm_vecs).sum(axis=1))
This gives exactly the same results and made the code up to 25 times faster than the loop implementation (even though the loop doesn't use linalg.norm either).
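A minimal sketch of how one might verify this kind of bottleneck in isolation (the array shape is illustrative; timings will vary by machine):
import timeit
import numpy as np

v = np.random.default_rng(0).standard_normal((100_000, 3))

print(timeit.timeit(lambda: np.linalg.norm(v, axis=1), number=100))
print(timeit.timeit(lambda: np.sqrt((v * v).sum(axis=1)), number=100))

# Both expressions compute the same row magnitudes:
assert np.allclose(np.linalg.norm(v, axis=1), np.sqrt((v * v).sum(axis=1)))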

Find the minimum moves between 2 lists using Python?

Given a = [123, 45] and b = [232, 64], we need to determine the moves from a to b.
a[0] to b[0] (123 to 232), digit by digit, leads to:
increment by 1 (1 to 2),
increment by 1 (2 to 3),
decrement by 1 (3 to 2),
so 3 moves in total (1 + 1 + 1).
a[1] to b[1] (45 to 64) leads to:
increment by 2 (4 to 6),
decrement by 1 (5 to 4),
3 moves in total (2 + 1).
Min moves = 3 + 3 = 6.
So, given two lists, we need to find the total number of moves to turn one list into the other.
My program, which is wrong, is below:
def sub(a, b):
    s = 0
    for x, y in zip(a, b):
        s += x - y
    return s

sub([123, 45], [232, 64])
# -128
This should work:
def sub(a, b):
    s = 0
    for x, y in zip(a, b):
        s += sum(abs(int(n) - int(m)) for n, m in zip(str(x), str(y)))
    return s

print(sub([123, 45], [232, 64]))
Output:
6
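To make the digit pairing concrete, here is a small trace (my own illustration, not part of the original answer; note that zip silently truncates if the paired numbers have different digit counts):
for x, y in zip([123, 45], [232, 64]):
    pairs = list(zip(str(x), str(y)))
    moves = sum(abs(int(n) - int(m)) for n, m in pairs)
    print(x, y, pairs, moves)
# 123 232 [('1', '2'), ('2', '3'), ('3', '2')] 3
# 45 64 [('4', '6'), ('5', '4')] 3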

Optimization of CodeWars Python 3.6 code: Integers: Recreation One

I need help optimizing my python 3.6 code for the CodeWars Integers: Recreation One Kata.
We are given a range of numbers, and we have to return each number whose sum of squared divisors is itself a perfect square, together with that sum.
"Divisors of 42 are : 1, 2, 3, 6, 7, 14, 21, 42. These divisors squared are: 1, 4, 9, 36, 49, 196, 441, 1764. The sum of the squared divisors is 2500 which is 50 * 50, a square!
Given two integers m, n (1 <= m <= n) we want to find all integers between m and n whose sum of squared divisors is itself a square. 42 is such a number."
My code works for individual tests, but it times out when submitting:
def list_squared(m, n):
    sqsq = []
    for i in range(m, n):
        divisors = [j**2 for j in range(1, i + 1) if i % j == 0]
        sq_divs = sum(divisors)
        sq = sq_divs ** (1/2)
        if int(sq) ** 2 == sq_divs:
            sqsq.append([i, sq_divs])
    return sqsq
You can reduce the complexity of the loop in the list comprehension from O(N) to O(√N) by setting the range stop to sqrt(num) + 1 instead of num.
By looping from 1 to sqrt(num) + 1, we use the fact that if i (the current item in the loop) is a factor of num, then num divided by i must be another one.
E.g.: 2 is a factor of 10, and so is 5 (10 / 2).
The following code passes all the tests:
import math

def list_squared(m, n):
    result = []
    for num in range(m, n + 1):
        divisors = set()
        for i in range(1, int(math.sqrt(num)) + 1):
            if num % i == 0:
                divisors.add(i**2)
                divisors.add((num // i)**2)
        total = sum(divisors)
        sr = math.sqrt(total)
        if sr - math.floor(sr) == 0:
            result.append([num, total])
    return result
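As a quick sanity check (if I recall the kata's sample correctly):
print(list_squared(1, 250))
# [[1, 1], [42, 2500], [246, 84100]]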
It's more of a math issue. The two largest divisors of i are i itself and, at most, i / 2, so you can speed the code up twice over just by using i // 2 + 1 as the range stop instead of i + 1. Just don't forget to add i ** 2 to sq_divs afterwards.
You may get some tiny further performance improvements by dropping the sq variable and the sq_divs ** (1/2) computation.
BTW, you should use n + 1 as the stop in the first range.
def list_squared(m, n):
    sqsq = []
    for i in range(m, n + 1):
        divisors = [j * j for j in range(1, i // 2 + 1)  # speed up twice
                    if i % j == 0]
        sq_divs = sum(divisors)
        sq_divs += i * i  # add i itself as a divisor
        if (sq_divs ** 0.5) % 1 == 0:  # tiny speed up here
            sqsq.append([i, sq_divs])
    return sqsq
UPD: I've tried the kata and it still times out, so we need even more math! If i is divisible by j, then it is also divisible by i / j, so we can use int(math.sqrt(i)) + 1 as the range stop: if i % j == 0, append j * j to the divisors list, AND if i // j != j, also append (i // j) ** 2.
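The update describes this final version without showing it; a minimal sketch of what it could look like, following the description above:
import math

def list_squared(m, n):
    sqsq = []
    for i in range(m, n + 1):
        sq_divs = 0
        for j in range(1, int(math.sqrt(i)) + 1):
            if i % j == 0:
                sq_divs += j * j              # j is a divisor...
                if i // j != j:
                    sq_divs += (i // j) ** 2  # ...and so is i // j
        if (sq_divs ** 0.5) % 1 == 0:
            sqsq.append([i, sq_divs])
    return sqsq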

Time and memory limit exceeded - Python3 - Number Theory

I am trying to find the sum of the multiples of 3 or 5 among all the numbers up to N.
This is a practice question on HackerEarth. I was able to pass all the test cases except one: I get a time and memory limit exceeded error. I looked up the documentation and learnt that int can handle arbitrarily large numbers (the separate long type was removed in Python 3).
I am still learning python and would appreciate any constructive feedback.
Could you please point me in the right direction so I can optimise the code myself?
test_cases = int(input())
for i in range(test_cases):
    user_input = int(input())
    sum = 0
    for j in range(0, user_input):
        if j % 3 == 0:
            sum = sum + j
        elif j % 5 == 0:
            sum = sum + j
    print(sum)
In such problems, try to use some math to find a direct solution rather than brute-forcing it.
You can calculate the number of multiples of k less than n, and calculate the sum of the multiples.
For example, with k=3 and n=13, you have 13 // 3 = 4 multiples.
The sum of these 4 multiples of 3 is 3*1 + 3*2 + 3*3 + 3*4 = 3 * (1+2+3+4)
Then, use the relation: 1 + 2 + ... + n = n*(n+1)/2
To sum the multiples of 3 and 5, you can sum the multiples of 3, add the sum of the multiples of 5, and subtract the ones you counted twice: the multiples of 15.
So, you could do it like this:
def sum_of_multiples_of(k, n):
    """
    Returns the sum of the multiples of k up to and including n
    """
    # number of multiples of k between 1 and n
    m = n // k
    return k * m * (m + 1) // 2

def sum_under(n):
    return (sum_of_multiples_of(3, n)
            + sum_of_multiples_of(5, n)
            - sum_of_multiples_of(15, n))
print(sum_under(10))  # 3+5+6+9+10 = 33
print(sum_under(19))  # 3+5+6+9+10+12+15+18 = 78
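As an added sanity check (my own, not part of the original answer), the closed form can be compared against a brute-force reference for small n:
def brute_force(n):
    # O(n) reference: sum every multiple of 3 or 5 up to and including n
    return sum(j for j in range(n + 1) if j % 3 == 0 or j % 5 == 0)

assert all(brute_force(n) == sum_under(n) for n in range(1, 1000))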

Dynamic programming table - Finding the minimal cost to break a string

A certain string-processing language offers a primitive operation
which splits a string into two pieces. Since this operation involves
copying the original string, it takes n units of time for a string of
length n, regardless of the location of the cut. Suppose, now, that
you want to break a string into many pieces.
The order in which the breaks are made can affect the total running
time. For example, suppose we wish to break a 20-character string (for
example "abcdefghijklmnopqrst") after characters at indices 3, 8, and
10 to obtain four substrings: "abcd", "efghi", "jk" and "lmnopqrst". If
the breaks are made in left-right order, then the first break costs 20
units of time, the second break costs 16 units of time and the third
break costs 11 units of time, for a total of 47 steps. If the breaks
are made in right-left order, the first break costs 20 units of time,
the second break costs 11 units of time, and the third break costs 9
units of time, for a total of only 40 steps. However, the optimal
solution is 38 (and the order of the cuts is 10, 3, 8).
The input is the length of the string and an ascending-sorted array with the cut indexes. I need to design a dynamic programming table to find the minimal cost to break the string and the order in which the cuts should be performed.
I can't figure out how the table structure should look (certain cells should be the answer to certain sub-problems and should be computable from other entries etc.). Instead, I've written a recursive function to find the minimum cost to break the string: b0, b1, ..., bK are the indexes for the cuts that have to be made to the (sub)string between i and j.
totalCost(i, j, {b0, b1, ..., bK}) = j - i + 1 + min {
    totalCost(b0 + 1, j, {b1, b2, ..., bK}),
    totalCost(i, b1, {b0}) + totalCost(b1 + 1, j, {b2, b3, ..., bK}),
    totalCost(i, b2, {b0, b1}) + totalCost(b2 + 1, j, {b3, b4, ..., bK}),
    ...
    totalCost(i, bK, {b0, b1, ..., b(K-1)})
} if K + 1 (the number of cuts) > 1,
j - i + 1 otherwise.
Please help me figure out the structure of the table, thanks!
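As a reference point, the recurrence above can be checked directly with memoization before committing to a table layout; this sketch is my own translation, with total_cost and solve as hypothetical names:
from functools import lru_cache

def total_cost(n, cuts):
    @lru_cache(maxsize=None)
    def solve(i, j, bs):
        # bs: cut indexes strictly inside [i, j]; a segment with no cuts costs nothing
        if not bs:
            return 0
        # cutting the segment costs its length, plus the best choice of first cut b
        return j - i + 1 + min(
            solve(i, b, tuple(c for c in bs if c < b))
            + solve(b + 1, j, tuple(c for c in bs if c > b))
            for b in bs)
    return solve(0, n - 1, tuple(cuts))

print(total_cost(20, (3, 8, 10)))  # 38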
For example, we have a string of length n = 20 and we need to break it at positions cuts = [3, 8, 10]. First of all, let's add two fake cuts to our array, -1 and n - 1, to avoid edge cases; now we have cuts = [-1, 3, 8, 10, 19]. Let's fill a table M, where M[i, j] is the minimum number of time units needed to make all breaks between the i-th and j-th cuts. We can fill it by the rule M[i, j] = (cuts[j] - cuts[i]) + min(M[i, k] + M[k, j]), where i < k < j. The minimum time to make all cuts will be in the cell M[0, len(cuts) - 1]. Full code in Python:
# input
n = 20
cuts = [3, 8, 10]

# add fake cuts
cuts = [-1] + cuts + [n - 1]
cuts_num = len(cuts)

# init table with zeros
table = [[0] * cuts_num for _ in range(cuts_num)]

# fill table
for diff in range(2, cuts_num):
    for start in range(0, cuts_num - diff):
        end = start + diff
        table[start][end] = 1e9
        for mid in range(start + 1, end):
            table[start][end] = min(table[start][end],
                                    table[start][mid] + table[mid][end])
        table[start][end] += cuts[end] - cuts[start]

# print result: 38
print(table[0][cuts_num - 1])
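The question also asks for the order in which to make the cuts; the answer above only returns the cost. A sketch of one way to recover the order (my addition: track the mid that achieves each minimum, then walk the choices top-down):
def min_cut_order(n, cuts):
    cuts = [-1] + cuts + [n - 1]
    m = len(cuts)
    cost = [[0] * m for _ in range(m)]
    best = [[None] * m for _ in range(m)]  # best[i][j]: first cut made inside (i, j)
    for diff in range(2, m):
        for i in range(m - diff):
            j = i + diff
            cost[i][j] = float('inf')
            for k in range(i + 1, j):
                if cost[i][k] + cost[k][j] < cost[i][j]:
                    cost[i][j] = cost[i][k] + cost[k][j]
                    best[i][j] = k
            cost[i][j] += cuts[j] - cuts[i]

    def order(i, j):
        k = best[i][j]
        if k is None:
            return []
        return [cuts[k]] + order(i, k) + order(k, j)

    return cost[0][m - 1], order(0, m - 1)

print(min_cut_order(20, [3, 8, 10]))  # (38, [10, 3, 8])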
In case you find it easier to follow when everything is 1-based (as in problem 6.9 of the DPV Dasgupta algorithms book, and in the Udacity graduate algorithms course from GaTech), the following Python code does the same thing as the previous code by Jemshit and Aleksei. It follows the chain-multiplication (binary tree) pattern as taught in the video lectures.
import numpy as np

# n is the string length; P is of size m, where P[i] is the split position that
# splits the string into [1, i] and [i + 1, n] (1-based)
def spliting_cost(P, n):
    P = [0] + P + [n]  # make sure the position list contains both ends of the string
    m = len(P)
    P = [0] + P        # both C and P are 1-based indexed for easy reading
    C = np.full((m + 1, m + 1), np.inf)
    for i in range(1, m + 1):
        C[i, i:i + 2] = 0  # any segment of <= 2 positions needs no split, so zero cost
    for s in range(2, m):  # s is the split span length
        for i in range(1, m - s + 1):
            j = i + s
            for k in range(i, j + 1):
                C[i, j] = min(C[i, j], P[j] - P[i] + C[i, k] + C[k, j])
    return C[1, m]

spliting_cost([3, 5, 10, 14, 16, 19], 20)
The output answer is 55, same as that with split points [2, 4, 9, 13, 15, 18] in the previous algorithm.
