How to determine which nested generator produces StopIteration exception? - python-3.x

I bumped into a situation where I need to determine in my try/except code which nested generator is raising a StopIteration exception. How do I do it? The following is a dummy example:
def genOne(iMax, jMax):
    i = 0
    g2 = genTwo(jMax)
    while i <= iMax:
        print('genOne: ' + str(i))
        next(g2)
        yield
        i = i + 1
def genTwo(jMax):
    j = 0
    while j <= jMax:
        print('genTwo: ' + str(j))
        yield
        j = j + 1
g1 = genOne(6, 3) # The inputs are arbitrary numbers
try:
    while True:
        next(g1)
except:
    # Do some processing depending on who generates the StopIteration exception
    pass
Thanks!

This can be generalized to the problem of finding the origin of an arbitrary exception.
Use the traceback module to inspect the stacktrace of your exception object.
Here is a previous answer on a similar subject.
Some example code:
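import sys
import traceback
g1 = genOne(6, 3) # The inputs are arbitrary numbers
try:
    while True:
        next(g1)
except Exception:
    # Inspect the traceback of the exception currently being handled
    exc_type, exc_value, exc_traceback = sys.exc_info()
    print(traceback.extract_tb(exc_traceback)[-1])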
Shell output:
> ./test.py
genOne: 0
genTwo: 0
genOne: 1
genTwo: 1
genOne: 2
genTwo: 2
genOne: 3
genTwo: 3
genOne: 4
('./test.py', 12, 'genOne', 'next(g2)')
Note that the [-1] index on the extract_tb() result selects only the last entry of the traceback, i.e. the innermost frame. From the printed output you can see which element of that entry you would need to check (the function name, genOne, is at index 2). In your particular example you would probably want to check whether the name of the lowest-level generator, genTwo, appears in any of the entries of the list returned by traceback.extract_tb(exc_traceback).
Hardcoded checks like these, which rely on internal code details, are frowned upon, especially since in your particular example you do not have control over their implementation.
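If you are on Python 3.7 or later, there is a way to tell the two cases apart without parsing the traceback. Since PEP 479, a StopIteration that escapes from inside a generator body is converted into a RuntimeError whose __cause__ is the original StopIteration, while a generator that simply finishes still raises a plain StopIteration to its caller. The sketch below relies on that behaviour; gen_one, gen_two and who_stopped are hypothetical names standing in for the generators in the question, not an established API.

```python
def gen_two(j_max):
    # Inner generator: yields j_max + 1 values, then finishes.
    for j in range(j_max + 1):
        yield j


def gen_one(i_max, j_max):
    # Outer generator: advances the inner one on every step.
    g2 = gen_two(j_max)
    for i in range(i_max + 1):
        next(g2)  # raises StopIteration inside gen_one once gen_two is exhausted
        yield i


def who_stopped(gen):
    """Drain gen and report whether it finished itself or a nested next() failed."""
    try:
        while True:
            next(gen)
    except StopIteration:
        # The outermost generator ran to completion.
        return "outer"
    except RuntimeError as exc:
        # Python 3.7+: a StopIteration leaking out of a generator body
        # arrives here as RuntimeError, with the StopIteration as __cause__.
        if isinstance(exc.__cause__, StopIteration):
            return "nested"
        raise


print(who_stopped(gen_one(6, 3)))  # inner generator runs out first
print(who_stopped(gen_one(2, 5)))  # outer generator runs out first
```

This only distinguishes "the generator I was iterating finished" from "something it called ran out"; to find out which nested generator it was, you would still have to look at the traceback of exc.__cause__.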

Related

Print statement does not get executed when MPI Barrier introduced

I am using python and mpi4py, and have encountered a scenario I do not understand. The below code is a minimal working example mwe.py.
import sys
import numpy as np
from mpi4py import MPI
import time
import itertools
N=8
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
N_sub = comm.Get_size()-1
get_sub = itertools.cycle(range(1, N_sub+1))
if rank == 0:
    print("I am rank {}".format(rank))
    data = []
    for i in range(N):
        nums = np.random.normal(size=6)
        data.append(nums)
    sleep_short = 0.0001
    sleep_long = 0.1
    sleep_dict = {}
    for r in range(1, N_sub+1):
        sleep_dict[str(r)] = 1
    data_idx = 0
    while len(data) > data_idx:
        r = next(get_sub)
        if comm.iprobe(r):
            useless_info = comm.recv(source=r)
            print(useless_info)
            comm.send(data[data_idx], dest=r)
            data_idx += 1
            print("data_idx {}".format(data_idx))
            sleep_dict[str(r)] = 1
        else:
            sleep_dict[str(r)] = 0
        if all(value == 0 for value in sleep_dict.values()):
            time.sleep(sleep_long)
        else:
            time.sleep(sleep_short)
    for r in range(1, N_sub+1):
        comm.send('Done', dest=r)
else:
    print("rank {}".format(rank))
    ######################
    # vvv This is the statement in question
    ######################
    comm.Barrier()
    while True:
        comm.send("I am done", dest=0)
        model = comm.recv(source=0)
        if type(model).__name__ == 'str':
            break
MPI.Finalize()
sys.exit(0)
When run with mpirun -np 4 python mwe.py, this code generates an array containing lists of random numbers, and then distributes these lists to the "sub" ranks until all arrays have been sent. Understandably, if I insert a comm.Barrier() call near the bottom (where I have indicated in the code), the code no longer completes execution, as the sub ranks (not equal to 0) never get to the statement where they are to receive what is being sent from rank 0. Rank 0 keeps trying to find a rank to pass the array to, but never does since the other ranks are held up, and the code hangs.
This makes sense to me. What doesn't make sense to me is that with the comm.Barrier() statement included, the preceding print statement also does not execute. Based on my understanding, the sub ranks should proceed normally until they hit the barrier statement, and then wait there until all of the ranks 'catch up', which in this case never happens because rank 0 is in its own loop. If this is the case, the preceding print statement should be executed, as it comes before those ranks have gotten to the barrier line. So why does the statement not get printed? Can anyone explain where my understanding fails?

Why is my if-else block not running more than twice when it should?

I am trying to find the multiplicative digital root MDR and multiplicative persistence MR of a given input value. For this, I have to use three functions
MDR- to find the multiplicative digital root
MPersistence - to find multiplicative persistence
prodDigits - to find the product of digits of the input number.
I have used an if-else block to call the prodDigits function again, but it only runs twice.
For example, if I give the input 86, it should run as 86 -> 48 -> 32 -> 6 (MDR 6, Mpersistence 3).
But my code only gives output up to 32 and a persistence of 1.
num = input("Enter the number")
def MDR(num):
    c = 0
    def prodDigits(num):
        individual_ele = map(int, str(num))
        productOfDigits = 1
        for i in individual_ele:
            productOfDigits *= i
        print(productOfDigits)
        return productOfDigits
    res = prodDigits(num)
    c += 1
    if res>=10:
        res1 = str(res)
        prodDigits(res1)
    else:
        print(res)
    print("Mpersistence is",c)
def Mpersistence():
    MDR(num)
Mpersistence()
The output is
Enter the number86
48
32
Mpersistence = 1
How can I fix this?
An if/else block shouldn't run "more than once". An if/else block runs just once.
What you are looking for is called a loop and can be achieved with the while keyword.
Compare these two functions:
def run_once(num):
    if (num > 9):
        num = prodDigits(num)
    return num
def run_morethanonce(num):
    while (num > 9):
        num = prodDigits(num)
    return num
x = run_once(86)
# = 48
y = run_morethanonce(86)
# = 6
Note that "if" and "while" are both English words. Read the code of these two functions in English. The first one calls prodDigits if some condition is satisfied. The second one keeps calling prodDigits as long as some condition is satisfied.
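Putting the while loop together with a digit-product helper, a minimal sketch of the whole computation could look like this. The names prod_digits and mdr_and_persistence are my own, and the counter simply counts how many times the loop body runs, which is the multiplicative persistence.

```python
def prod_digits(num):
    """Return the product of the decimal digits of a non-negative integer."""
    product = 1
    for digit in str(num):
        product *= int(digit)
    return product


def mdr_and_persistence(num):
    """Return (multiplicative digital root, multiplicative persistence)."""
    steps = 0
    # Keep multiplying the digits until a single digit remains.
    while num > 9:
        num = prod_digits(num)
        steps += 1
    return num, steps


root, persistence = mdr_and_persistence(86)
print(root, persistence)  # 86 -> 48 -> 32 -> 6, so this prints: 6 3
```

If the number comes from input(), convert it with int() first so that the num > 9 comparison works on an integer.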

How can I avoid an “out of index” error with 2D arrays in Python?

The task is to iterate over all elements within a two-dimensional list and do some specific calculations on each element and its nearest neighbors:
count = 0
for i in range(0,len(arr)):
    for j in range(0,len(arr)):
        if arr[i][j] == 7 and is_perfect_cube(arr[i-1][j] + arr[i+1][j] + arr[i][j-1] + arr[i][j+1]):
            count += 1
Unfortunately, I keep getting an index out of range error. Based on what I’ve managed to debug so far, the error occurs for the first and last elements of the collection. I know I could use float[int], but I’m not sure how to apply it to my particular implementation. I couldn’t find any similar questions.
Your code explicitly tries to use elements that are out of array bounds. In this call,
is_perfect_cube(arr[i-1][j]+arr[i+1][j]+arr[i][j-1]+arr[i][j+1])
you are asking for arr[i-1][j], whose row index i-1 equals -1 for i=0. A negative index does not raise an error; it fetches the last row, which may be desirable in some cases. Then you also ask for arr[i+1][j], which for i equal to the last index, i.e. len(arr)-1, is clearly out of bounds and causes an error. The same holds for arr[i][j+1], just in terms of columns rather than rows.
To handle this, you need to either ignore the end points (loop from 1 to size-2 in both dimensions) or modify your algorithm for the end points. The choice depends on the problem you are trying to solve. Established algorithms define how end points are handled, so check what the usual solution is for yours.
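count = 0
for i in range(0,len(arr)):
    for j in range(0,len(arr)):
        if arr[i][j] == 7:
            try:
                # note: for i == 0 this does not raise; index -1 wraps to the last row
                up = arr[i-1][j]
            except IndexError:
                up = 0
            try:
                down = arr[i+1][j]
            except IndexError:
                down = 0
            try:
                # note: for j == 0 this does not raise; index -1 wraps to the last column
                left = arr[i][j-1]
            except IndexError:
                left = 0
            try:
                right = arr[i][j+1]
            except IndexError:
                right = 0
            if is_perfect_cube(up+down+left+right):
                count += 1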
This is my solution to my question. I don't know if this is the correct way to solve out-of-bounds problems. I am still open to advice.
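One thing to watch with try/except here is that negative indices never raise in Python, so arr[-1][j] silently reads the last row. A sketch that checks the bounds explicitly, and treats missing neighbours as 0, might look like the following. is_perfect_cube is not shown in the question, so the version here is only a hypothetical stand-in, and neighbor_sum and count_sevens are names I made up.

```python
def is_perfect_cube(n):
    """Hypothetical stand-in for the helper that the question does not show."""
    root = round(abs(n) ** (1 / 3))
    return root ** 3 == abs(n)


def neighbor_sum(arr, i, j):
    """Sum the up/down/left/right neighbours of arr[i][j] that actually exist."""
    total = 0
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ni, nj = i + di, j + dj
        # Explicit bounds check: rejects -1 as well as indices past the end.
        if 0 <= ni < len(arr) and 0 <= nj < len(arr[ni]):
            total += arr[ni][nj]
    return total


def count_sevens(arr):
    """Count cells equal to 7 whose neighbour sum is a perfect cube."""
    return sum(
        1
        for i, row in enumerate(arr)
        for j, value in enumerate(row)
        if value == 7 and is_perfect_cube(neighbor_sum(arr, i, j))
    )


print(count_sevens([[7, 4], [4, 0]]))  # the 7 has neighbours 4 and 4, sum 8
```

Whether out-of-range neighbours should count as 0, be skipped, or wrap around depends on the problem statement, so treat the choice of 0 here as an assumption.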

find the first occurrence of a number greater than k in a sorted array

For the given sorted list, the program should return the index of the first number in the list that is greater than the number given as input.
Now when I run the code to check whether it works, I get two outputs: one is the value and the other is None.
Say I give an input of 3 for the code below. The expected output is the index of 20, i.e. 1; instead I get 1 followed by None.
If I give a value that is greater than every number in the list, I get the correct output, i.e. "The entered number is greater than the numbers in the list".
num_to_find = int(input("Enter the number to be found"))
a=[2,20,30]
def occur1(a,num_to_find):
    j = i = 0
    while j==0:
        if a[len(a)-1] > num_to_find:
            if num_to_find < a[i]:
                j=1
                print(i)
                break
            else:
                i = i + 1
        else:
            ret_state = "The entered number is greater than the numbers in the list"
            return ret_state
print(occur1(a,num_to_find))
This code is difficult to reason about due to extra variables, poor variable names (j is typically used as an index, not a bool flag), usage of break, nested conditionals and side effects. It's also inefficient because it needs to visit each element in the list in the worst case scenario and fails to take advantage of the sorted nature of the list to the fullest. However, it appears to work.
Your first misunderstanding is likely that print(i) prints the index of the next largest element rather than the element itself. In your example call of occur1([2, 20, 30], 3), 1 is where 20 lives in the list.
Secondly, once the found element is printed, the function returns None after it breaks from the loop, and print dutifully prints None. Hopefully this explains your output. You can use return a[i] in place of break (or return i, if the index is what you want) to fix your immediate problem and meet your expectations.
Having said that, Python has a builtin module for this: bisect. Here's an example:
from bisect import bisect_right
a = [1, 2, 5, 6, 8, 9, 15]
index_of_next_largest = bisect_right(a, 6)
print(a[index_of_next_largest]) # => 8
If the next number greater than k is out of bounds, you can try/except that or use a conditional to report the failure as you see fit. This function takes advantage of the fact that the list is sorted using a binary search algorithm, which cuts the search space in half on every step. The time complexity is O(log(n)), which is very fast.
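As a sketch of the conditional approach, the wrapper below returns the index rather than the value and reports the out-of-bounds case as None. The function name index_of_first_greater is my own; bisect_right itself is the standard library function used above.

```python
from bisect import bisect_right


def index_of_first_greater(a, k):
    """Return the index of the first element of sorted list a that is > k, or None."""
    i = bisect_right(a, k)
    # bisect_right returns len(a) when no element is greater than k.
    return i if i < len(a) else None


a = [2, 20, 30]
print(index_of_first_greater(a, 3))   # 1, the position of 20
print(index_of_first_greater(a, 30))  # None, nothing in the list is greater
```

Returning None keeps the "not found" case distinguishable from a valid index; you could just as well raise an exception or return a message string there.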
If you do wish to stick with a linear algorithm similar to your solution, you can simplify your logic to:
def occur1(a, num_to_find):
    for n in a:
        if n > num_to_find:
            return n
# test it...
a = [2, 5, 10]
for i in range(11):
    print(i, " -> ", occur1(a, i))
Output:
0 -> 2
1 -> 2
2 -> 5
3 -> 5
4 -> 5
5 -> 10
6 -> 10
7 -> 10
8 -> 10
9 -> 10
10 -> None
Or, if you want the index of the next largest number:
def occur1(a, num_to_find):
    for i, n in enumerate(a):
        if n > num_to_find:
            return i
But I want to stress that the binary search is far superior to the linear search for this task. For a list of a billion elements, the binary search will make about 30 comparisons in the worst case (2^30 is roughly a billion), where the linear version will make a billion comparisons. The only reason not to use it is if the list can't be guaranteed to be pre-sorted, which isn't the case here.
To make this more concrete, you can play with this program (but use the builtin module in practice):
import random
def bisect_right(a, target, lo=0, hi=None, cmps=0):
    if hi is None:
        hi = len(a)
    mid = (hi - lo) // 2 + lo
    cmps += 1
    if lo <= hi and mid < len(a):
        if a[mid] < target:
            return bisect_right(a, target, mid + 1, hi, cmps)
        elif a[mid] > target:
            return bisect_right(a, target, lo, mid - 1, cmps)
        else:
            return cmps, mid + 1
    return cmps, mid + 1
def linear_search(a, target, cmps=0):
    for i, n in enumerate(a):
        cmps += 1
        if n > target:
            return cmps, i
    return cmps, i
if __name__ == "__main__":
    random.seed(42)
    trials = 10**3
    list_size = 10**4
    binary_search_cmps = 0
    linear_search_cmps = 0
    for n in range(trials):
        test_list = sorted([random.randint(0, list_size) for _ in range(list_size)])
        test_target = random.randint(0, list_size)
        res = bisect_right(test_list, test_target)[0]
        binary_search_cmps += res
        linear_search_cmps += linear_search(test_list, test_target)[0]
    binary_search_avg = binary_search_cmps / trials
    linear_search_avg = linear_search_cmps / trials
    s = "%s search made %d comparisons across \n%d searches on random lists of %d elements\n(found the element in an average of %d comparisons\nper search)\n"
    print(s % ("binary", binary_search_cmps, trials, list_size, binary_search_avg))
    print(s % ("linear", linear_search_cmps, trials, list_size, linear_search_avg))
Output:
binary search made 12820 comparisons across
1000 searches on random lists of 10000 elements
(found the element in an average of 12 comparisons
per search)
linear search made 5013525 comparisons across
1000 searches on random lists of 10000 elements
(found the element in an average of 5013 comparisons
per search)
The more elements you add, the worse the situation looks for the linear search.
I would do something along the lines of:
num_to_find = int(input("Enter the number to be found"))
a=[2,20,30]
def occur1(a, num_to_find):
    for i in a:
        if not i <= num_to_find:
            return a.index(i)
    return "The entered number is greater than the numbers in the list"
print(occur1(a, num_to_find))
Which gives the output of 1 (when inputting 3).
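The reason yours gives you 2 outputs is that you have 2 print calls in your code: print(i) inside the function, and print(occur1(a,num_to_find)) outside it, which prints the function's return value (None after the break).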

Python 3 index is len(l) conditional evaluation error

I have the following merge sort code. When the line if ib is len(b) or ... is changed to use double equal ==: if ib == len(b) or ..., the code does not raise an IndexError exception.
This is very unexpected because:
len(b) is evaluated to a number and is is equivalent to == for integers. You can test it out: a python expression
(1 is len([0]) )
is evaluated to be True.
the input to the function is range(1500, -1, -1), and range objects are handled differently in python3. I was suspecting that since the input was handled as a range instance, the length evaluation might have been an instance instead of a integer primitive. This is again strange because
1 is len(range(1))
also gives you True as the result.
Is this a bug with the conditional evaluation in Python3?
Tom Caswell supplied the following useful expression in our discussion; I'm copying it here for reference:
tt = [j is int(str(j)) for j in range(15000)]
only the first 256 items are True. The rest are False hahahaha.
The original script:
def merge_sort(arr):
    if len(arr) >= 2:
        s = int(len(arr)/2)
        a = merge_sort(arr[:s])
        b = merge_sort(arr[s:])
        ia = 0
        ib = 0
        new_arr = []
        while len(new_arr) < len(arr):
            try:
                if ib is len(b) or a[ia] <= b[ib]:
                    new_arr.append(a[ia])
                    ia += 1
                else:
                    new_arr.append(b[ib])
                    ib += 1
            except IndexError:
                print(len(a), len(b), ia, ib)
                raise IndexError
        return new_arr
    else:
        return arr
print(merge_sort(range(1500, -1, -1)))
Python does not guarantee that two integer instances with equal value are the same object. In the example below, the comparisons for 0 through 256 return True because CPython caches the small integers from -5 to 256.
This behavior is described here: https://docs.python.org/3/c-api/long.html#c.PyLong_FromLong
example:
import matplotlib.pyplot as plt
tt = [j is int(str(j)) for j in range(500)]
plt.plot(tt)
IIRC, the fact that any of them pass the is test is an implementation-specific optimization detail.
is checks whether two arguments refer to the same object, while == checks whether two arguments have the same value. You cannot assume they mean the same thing: they have different uses, and you will get incorrect results, such as the unexpected IndexError here, if you use them interchangeably.
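For completeness, here is a sketch of the merge step that avoids identity comparison altogether by only comparing indices with < and ==-style value checks. It is a rewrite under my own assumptions rather than the asker's exact algorithm: it converts the input to a list first (so a range works) and copies the leftover tail of whichever half is not exhausted.

```python
def merge_sort(arr):
    """Return a new sorted list; accepts any iterable, including range objects."""
    arr = list(arr)
    if len(arr) < 2:
        return arr
    mid = len(arr) // 2
    a = merge_sort(arr[:mid])
    b = merge_sort(arr[mid:])
    ia = ib = 0
    merged = []
    # Compare by value while both halves still have elements left.
    while ia < len(a) and ib < len(b):
        if a[ia] <= b[ib]:
            merged.append(a[ia])
            ia += 1
        else:
            merged.append(b[ib])
            ib += 1
    # At most one of these tails is non-empty.
    merged.extend(a[ia:])
    merged.extend(b[ib:])
    return merged


n = 1000
same_value = int(str(n))
print(same_value == n)  # True: equal values, regardless of object identity
print(merge_sort(range(1500, -1, -1))[:5])  # [0, 1, 2, 3, 4]
```

Because no branch depends on is, the result does not change with the interpreter's integer caching.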
