Iterate over list in random sequential multiples of n Python - python-3.x

I have the following example code:
from random import randint

l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def batch_gen(data, batch_size):
    for i in range(0, len(data), batch_size):
        yield data[i:i+batch_size]

batch_size = randint(1,4)
n = 0
print("batch_size is {}".format(batch_size))
for i in batch_gen(l, batch_size):
    for x in i:
        print(x)
    print("loop {}".format(n))
    n += 1
    batch_size = randint(1,4)
Which gives me the output of:
batch_size is 3
1
2
3
loop 0
4
5
6
loop 1
7
8
9
loop 2
10
loop 3
This is the output I'm looking for; however, the batch_size that is actually used is always the one set by batch_size = randint(1,4) outside of the for loop.
I'd like a random batch_size for each iteration of the loop, with each iteration picking up where the last one left off, until a list of arbitrary length is exhausted.
Any help would be greatly appreciated!
Example code taken from Iterate over a python sequence in multiples of n?
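One possible way to get a fresh random size on every iteration is to draw the size inside the generator itself and advance a position counter by that amount. This is only a minimal sketch: the name random_batch_gen and its min_size / max_size parameters are assumptions made for illustration, not part of the original code.

```python
import random

def random_batch_gen(data, min_size=1, max_size=4):
    """Yield consecutive slices of data, drawing a new random length for each one."""
    start = 0
    while start < len(data):
        size = random.randint(min_size, max_size)  # new size on every iteration
        yield data[start:start + size]             # the last slice may be shorter
        start += size                              # pick up where this slice ended

l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for n, batch in enumerate(random_batch_gen(l)):
    print("loop {}: {}".format(n, batch))
```

Because the slice is taken from a running start index rather than from range(0, len(data), batch_size), the step no longer has to be fixed before the loop begins.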

Related

Why is the function returning value of stack as None

I can't figure out why the path variable on the last line of the code is printed as None. The second-to-last line calls the DFS function to find the path between two nodes in a tree (I'm giving the tree as input). I also print the stack inside DFS just before returning it, and there it is not None. But I don't understand why it is None once it has been returned and stored in the path variable. I gave this as input:
1
6 1
4 2 1 3 5 2
1 2
2 3
2 4
1 5
5 6
And the output was:
{0: [1, 4], 1: [0, 2, 3], 2: [1], 3: [1], 4: [0, 5], 5: [4]}
[0, 1, 3]
None
Here is the code for reference
def DFS(adj,x, y,stack,vis):
    stack.append(x)
    if (x == y):
        print(stack)
        return stack
    vis[x] = 1
    if (len(adj[x])>0):
        for j in adj[x]:
            if (vis[j]==0):
                DFS(adj,j,y,stack,vis)
                del stack[-1]

T = int(input())
for a in range(T):
    N,Q = input().split()
    N = int(N)
    Q = int(Q)
    wt = [int(num) for num in input().split(" ")]
    adj = {}
    for i in range(N):
        adj[i] = []
    for b in range(N-1):
        u,v = input().split()
        u = int(u) - 1
        v = int(v) - 1
        adj[u].append(v)
        adj[v].append(u)
    print(adj)
    vis = [0]*N
    stack = []
    path = DFS(adj,0,3,stack,vis)
    print(path)
Simple equivalent of your code:
def recursive_func(x):
    if x > 0:
        return x
    else:
        x += 1
        recursive_func(x)

x = 5
x = recursive_func(x)
print(x)
x = 0
x = recursive_func(x)
print(x)
Output:
5
None
What's happening here?
x, with a value of 5, is sent to recursive_func.
x is greater than 0, so 5 is returned. This is seen in the output.
x, with a value of 0, is sent to recursive_func.
x is not greater than 0, so 1 is added to x.
x, now with a value of 1, is then sent to a second, nested call of recursive_func.
This nested call returns 1 because 1 > 0.
That 1 is passed back to the first call, where the expression recursive_func(x) evaluates to 1, but we don't do anything with it.
The first recursive_func call reaches the end of its code without returning a value. By default, None is returned to the main body.
x = recursive_func(x) has become x = None.
None is output.
Given this information, why does the following code perform differently?
Simple modification of your code:
def recursive_func_v2(x):
    if x > 0:
        return x
    else:
        x += 1
        return recursive_func_v2(x)

x = 5
x = recursive_func_v2(x)
print(x)
x = 0
x = recursive_func_v2(x)
print(x)
Output:
5
1
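Applied to the DFS function from the question, the same idea might look like the sketch below. The name dfs_path and the explicit return None are illustrative choices rather than the original code; the adjacency list is the one printed in the question.

```python
def dfs_path(adj, x, y, stack, vis):
    """Return the list of nodes on the path from x to y, or None if y is unreachable."""
    stack.append(x)
    if x == y:
        return stack
    vis[x] = 1
    for j in adj[x]:
        if vis[j] == 0:
            found = dfs_path(adj, j, y, stack, vis)
            if found is not None:
                return found      # pass the result up through every caller
            del stack[-1]         # dead end: remove j, which the call appended
    return None                   # made explicit: nothing was found from x

adj = {0: [1, 4], 1: [0, 2, 3], 2: [1], 3: [1], 4: [0, 5], 5: [4]}
path = dfs_path(adj, 0, 3, [], [0] * len(adj))
print(path)
```

The important change is that the result of the recursive call is returned instead of being discarded, so the value found at the deepest level reaches the outermost caller.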

How is this possible? Maximum subarray sum

If arr = [1, 2, 3], then len(arr) is 3, right?
for i in range(0, len(arr)+1):
    print(arr[i])
It is no secret that you cannot do that; you simply get IndexError: list index out of range.
So how is this possible?
def max_sequence(arr):
    if arr:
        li = []
        x = {sum(arr[i:j]): arr[i:j] for i in range(0, len(arr))
             for j in range(1, len(arr)+1)}
        li.append(max(x.items()))
        for ii in li:
            print(ii)
        return li[0][0]
    else:
        return 0

print(max_sequence([26, 5, 3, 30, -15, -7, 10, 20, 22, 4]))
I simply had to find the maximum sum of a contiguous subsequence in a list of integers.
If I write this part:
x = {sum(arr[i:j]): arr[i:j] for i in range(0, len(arr))
     for j in range(1, len(arr))}
it reports a maximum sum of 94, which is incorrect.
If I write this:
x = {sum(arr[i:j]): arr[i:j] for i in range(0, len(arr))
     for j in range(1, len(arr)+1)}
the maximum sum is 98, which is correct. But why is that? If I write "for j in range(1, len(arr)+1)", why is there no IndexError?
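The range() function generates a sequence of numbers. range(10) generates the numbers from 0 to 9 (10 numbers).
We can also define the start, stop and step size as range(start, stop, step_size); the stop value itself is excluded.
In your example, "for j in range(1, len(arr)+1)",
len(arr) is 10.
So range will generate the numbers from 1 to 10.
There is no IndexError because j is never used as an index: it is only used as the end of the slice arr[i:j]. The end of a slice is exclusive, so arr[i:10] stops at arr[9], the last element, and slice bounds past the end of a list are clipped rather than rejected. With range(1, len(arr)), j stops at 9, so no slice ever contains the last element, which is why that version reports 94 instead of 98.
Also, li starts out as an empty list and grows with append. It ends up holding a single (sum, sub-list) pair, the one with the largest sum.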
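The difference between indexing and slicing can be checked directly. The snippet below is a small illustration using the three-element list from the question; the variable names are chosen only for this example.

```python
arr = [1, 2, 3]

# Slicing: the end bound is exclusive, so len(arr) is a valid end.
full = arr[0:len(arr)]

# Slicing: a bound past the end is clipped instead of raising an error.
clipped = arr[1:100]

# Indexing: len(arr) is one past the last valid index, so this raises.
try:
    arr[len(arr)]
    index_failed = False
except IndexError:
    index_failed = True

print(full, clipped, index_failed)
```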

Printing fibonacci series using lists in python

I am relatively new to programming and am trying to learn Python. I was trying to store the first 10 or so numbers of the Fibonacci series in a list.
fibo= [0,1]
for k in range(11):
    i= fibo[-1]
    j = fibo[-2]
    k= fibo[i]+fibo[j]
    fibo.append(k)
    k=+1
print(fibo)
Not sure what I did wrong? Any help is really appreciated!
Output:
[0, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
You can use this code to print the first N numbers of the Fibonacci series.
N = int(input()) # Length of fibonacci series
fibo = [0, 1]
a = fibo[0]
b = fibo[1]
for each in range(2, N):
    c = a + b
    a, b = b, c
    fibo.append(c)
print(fibo[:N])
OUTPUT
N = 10
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
I can see a few issues here:
fibo = [0,1]
# Your loop should start at index 2, otherwise
# you will try to access fibo[-2] and fibo[-1] with k=0
# and fibo[-1] and fibo[0] with k=1
# Also, if you only want 10 values, you should change 11
# to 10 since fibo already has two values.
# for k in range(11):
for k in range(2, 10):
    # You don't want the content of fibo, but the indexes!
    # i = fibo[-1]
    # j = fibo[-2]
    i = k - 1
    j = k - 2
    # You are already using k, so you need to use another variable
    # k = fibo[i] + fibo[j]
    v = fibo[i] + fibo[j]
    fibo.append(v)
    # You don't need k+=1 because it will be incremented
    # according to range(2, 10). The loop will start with
    # k = 2 and it will stop when k = 9
    # k += 1
print(fibo)
Your code did not crash because you technically can access fibo[-1] and fibo[-2]. It will respectively return the last value of your array, and the value before the last one.
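Since fibo[-1] and fibo[-2] always refer to the last two values, they can also be added directly, without tracking indexes at all. The following is a minimal sketch of that variant; the function name fibonacci_list and its count parameter are assumptions made for this example.

```python
def fibonacci_list(count):
    """Return the first `count` Fibonacci numbers as a list."""
    fibo = [0, 1]
    while len(fibo) < count:
        # fibo[-1] is the last value, fibo[-2] the one before it
        fibo.append(fibo[-1] + fibo[-2])
    return fibo[:count]  # the slice also handles count < 2

print(fibonacci_list(10))
```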

Optimization of CodeWars Python 3.6 code: Integers: Recreation One

I need help optimizing my python 3.6 code for the CodeWars Integers: Recreation One Kata.
We are given a range of numbers and have to return every number whose sum of squared divisors is itself a square, together with that sum.
"Divisors of 42 are : 1, 2, 3, 6, 7, 14, 21, 42. These divisors squared are: 1, 4, 9, 36, 49, 196, 441, 1764. The sum of the squared divisors is 2500 which is 50 * 50, a square!
Given two integers m, n (1 <= m <= n) we want to find all integers between m and n whose sum of squared divisors is itself a square. 42 is such a number."
My code works for individual tests, but it times out when submitting:
def list_squared(m, n):
    sqsq = []
    for i in range(m, n):
        divisors = [j**2 for j in range(1, i+1) if i % j == 0]
        sq_divs = sum(divisors)
        sq = sq_divs ** (1/2)
        if int(sq) ** 2 == sq_divs:
            sqsq.append([i, sq_divs])
    return sqsq
You can reduce the complexity of the loop in the list comprehension from O(N) to O(sqrt(N)) by setting the upper bound of the range to sqrt(num)+1 instead of num.
When looping from 1 to sqrt(num), if i (the current item in the loop) is a factor of num, then num divided by i must be another one.
E.g. 2 is a factor of 10, and so is 5 (10/2).
The following code passes all the tests:
import math

def list_squared(m, n):
    result = []
    for num in range(m, n + 1):
        # a set avoids counting sqrt(num) twice when num is a perfect square
        divisors = set()
        for i in range(1, int(math.sqrt(num)+1)):
            if num % i == 0:
                divisors.add(i**2)
                # integer division keeps the paired divisor an exact int
                divisors.add((num // i)**2)
        total = sum(divisors)
        sr = math.sqrt(total)
        if sr - math.floor(sr) == 0:
            result.append([num, total])
    return result
This is more of a math issue. The largest divisor of i is i itself, and the next largest is at most i/2. So you can roughly double the speed of the code just by using i // 2 + 1 as the range stop instead of i + 1. Just don't forget to add i ** 2 to sq_divs.
You may also get a tiny performance improvement by dropping the sq variable and testing (sq_divs ** 0.5) % 1 == 0 directly.
BTW, you should use n+1 as the stop of the first range, so that n itself is included.
def list_squared(m, n):
    sqsq = []
    for i in range(m, n+1):
        # speed up twice: only test candidates up to i // 2
        divisors = [j * j for j in range(1, i // 2 + 1) if i % j == 0]
        sq_divs = sum(divisors)
        sq_divs += i * i  # add i as divisor
        if ((sq_divs) ** 0.5) % 1 == 0:  # tiny speed up here
            sqsq.append([i, sq_divs])
    return sqsq
UPD: I've tried the Kata and it still times out. So we need even more math! If i is divisible by j, then it is also divisible by i / j, so we can use int(math.sqrt(i)) + 1 as the range stop. If i % j == 0, append j * j to the divisors list, and if i / j != j, also append (i / j) ** 2.
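The update above could be sketched as follows. This is an illustration rather than a tested Kata submission: the name list_squared_sqrt is made up for this example, and math.isqrt is an assumption that requires Python 3.8 or newer (the question targets 3.6, where int(math.sqrt(i)) would be the closest substitute).

```python
import math

def list_squared_sqrt(m, n):
    """Return [number, sum of squared divisors] for each number in [m, n]
    whose sum of squared divisors is a perfect square."""
    result = []
    for i in range(m, n + 1):
        total = 0
        # Divisors come in pairs (j, i // j), so j only needs to reach sqrt(i).
        for j in range(1, math.isqrt(i) + 1):
            if i % j == 0:
                total += j * j
                k = i // j
                if k != j:          # avoid counting sqrt(i) twice
                    total += k * k
        root = math.isqrt(total)    # exact integer square root, no float rounding
        if root * root == total:
            result.append([i, total])
    return result

print(list_squared_sqrt(1, 250))
```

Checking root * root == total on integers sidesteps the floating-point comparison used in the versions above, which may matter once the sums become large.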

Keras aggregated objective function

How can I add an aggregated error to a Keras model?
Given this table:
   g  x  y
0  1  1  1
1  1  2  2
2  1  3  3
3  2  1  2
4  2  2  1
I would like to minimize the sum((y - y_pred) ** 2) error along with
sum((sum(y) - sum(y_pred)) ** 2) per group.
I'm fine with larger individual sample errors, but it is crucial for me to get the totals right.
SciPy example:
import pandas as pd
from scipy.optimize import differential_evolution

df = pd.DataFrame({'g': [1, 1, 1, 2, 2], 'x': [1, 2, 3, 1, 2], 'y': [1, 2, 3, 2, 1]})
g = df.groupby('g')

def linear(pars, fit=False):
    a, b = pars
    df['y_pred'] = a + b * df['x']
    if fit:
        sample_errors = sum((df['y'] - df['y_pred']) ** 2)
        group_errors = sum((g['y'].sum() - g['y_pred'].sum()) ** 2)
        total_error = sum(df['y'] - df['y_pred']) ** 2
        return sample_errors + group_errors + total_error
    else:
        return df['y_pred']

# args is passed positionally after pars, so (True,) sets fit=True
pars = differential_evolution(linear, [[0, 10]] * 2, args=(True,))['x']
print('SAMPLES:\n', df, '\nGROUPS:\n', g.sum(), '\nTOTALS:\n', df.sum())
Output:
SAMPLES:
   g  x  y  y_pred
0  1  1  1   1.232
1  1  2  2   1.947
2  1  3  3   2.662
3  2  1  2   1.232
4  2  2  1   1.947
GROUPS:
   x  y  y_pred
g
1  6  6   5.841
2  3  3   3.179
TOTALS:
g         7.000
x         9.000
y         9.000
y_pred    9.020
For grouping, as long as you keep the same groups throughout training, your loss function will not run into differentiability problems.
As a naive form of grouping, you can simply make each group its own batch.
I suggest a generator for that.
#suppose you have these three numpy arrays:
gTrain
xTrain
yTrain

#create this generator
def grouper(g,x,y):
    while True:
        for gr in range(1,g.max()+1):
            indices = g == gr
            yield (x[indices],y[indices])
For the loss function, you can make your own:
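import keras.backend as K

def customLoss(yTrue,yPred):
    # per-sample squared error plus the squared difference of the batch (group) totals
    return K.sum(K.square(yTrue-yPred)) + K.square(K.sum(yTrue) - K.sum(yPred))

model.compile(loss=customLoss, ....)
Just be careful with the second term: it is squared here, as in the question's objective, because without the square a negative difference between the totals would lower the loss instead of raising it.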
Now you train using the method fit_generator:
model.fit_generator(grouper(gTrain,xTrain, yTrain), steps_per_epoch=gTrain.max(), epochs=...)
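To see what the combined objective from the question computes, independently of Keras, here is a plain-Python sketch. The function name grouped_loss and the sample predictions are made up for illustration; the groups and y values are those of the table in the question.

```python
from collections import defaultdict

def grouped_loss(groups, y_true, y_pred):
    """Sum of squared sample errors plus sum of squared per-group total errors."""
    sample_error = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))

    # sum(y) - sum(y_pred) per group equals the sum of (y - y_pred) in that group
    group_diff = defaultdict(float)
    for g, t, p in zip(groups, y_true, y_pred):
        group_diff[g] += t - p
    group_error = sum(d ** 2 for d in group_diff.values())

    return sample_error + group_error

groups = [1, 1, 1, 2, 2]
y_true = [1, 2, 3, 2, 1]

# One sample is off by 1: both the sample term and the group term are penalised.
print(grouped_loss(groups, y_true, [2, 2, 3, 2, 1]))
# Two errors cancel inside group 1: only the sample term is penalised.
print(grouped_loss(groups, y_true, [2, 1, 3, 2, 1]))
```

The second call shows the behaviour the question asks for: individual errors are tolerated more readily when the group total is still right.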
