k way merge sort divide and conquer - python-3.x

from math import ceil
def merge(all_lst):
    sorted_lst = []
    while all_lst:
        min_value,index = all_lst[0][0],0
        for lst in all_lst:
            if lst[0]<min_value:
                min_value = lst[0]
                index = all_lst.index(lst)
        sorted_lst.append(min_value)
        all_lst[index].pop(0)
        if not all_lst[index]:
            all_lst.remove(all_lst[index])
    return sorted_lst
def merge_sort(lst, k):
    def split(lst):
        split_lst = []
        j = ceil(len(lst)/k) if len(lst)>=k else 1
        for i in range(0,len(lst),j):
            split_lst.append(lst[i:i+j])
        return split_lst
    lst=split(lst)
    if len(lst[0])==1:
        return lst
    else:
        for i in range(len(lst)):
            lst[i]=merge(merge_sort(lst[i],k))
        return merge(lst)
Above is my code for k-way merge sort. It splits the list into k smaller lists by calling the split function repeatedly until each sublist is a single element. Then the sublists are merged back into one single list.
My code works fine when splitting is done twice (e.g. [3,6,8,5,2,1,4,7] -> [3,6,8],[5,2,1],[4,7] -> [3],[6],[8],[5],[2],[1],[4],[7]). But when the splitting is done more than twice (e.g. [3,6,8,5,2,1,4,7] -> [3,6,8,5],[2,1,4,7] -> [3,6],[8,5],[2,1],[4,7] -> [3],[6],[8],[5],[2],[1],[4],[7]), the code fails. Can anyone help me find out what is wrong with my code? Thanks in advance.

I believe the problem you're having is that merge_sort sometimes returns a flattened list and other times returns a list of lists. You should probably return a flat list in all cases. There's some other cruft: You don't need split to be its own function, since you only call it the one time.
Here's a greatly simplified version of your code:
def merge_sort(lst, k):
    if len(lst) <= 1: # simpler base case (also covers the empty list)
        return lst
    j = ceil(len(lst)/k) # no need to check for k < len(lst) (ceil handles it)
    #split and recursively sort in one step
    lst = [merge_sort(lst[i:i+j], k) for i in range(0, len(lst), j)]
    return merge(lst) # always return a merged list (never a list of lists)
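For comparison, here is a minimal self-contained sketch of the same idea that delegates the merge step to the standard library's heapq.merge instead of the hand-written merge above. The function name k_way_merge_sort and the k < 2 check are my own additions, not part of the original code.

```python
import heapq
from math import ceil

def k_way_merge_sort(lst, k):
    """Sort lst by splitting it into at most k chunks, sorting each chunk
    recursively, and merging the sorted chunks."""
    if k < 2:
        # With k == 1 the chunk would be the whole list and recursion never ends.
        raise ValueError("k must be at least 2")
    if len(lst) <= 1:
        return list(lst)  # always a flat list, never a list of lists
    size = ceil(len(lst) / k)
    chunks = [k_way_merge_sort(lst[i:i + size], k)
              for i in range(0, len(lst), size)]
    # heapq.merge lazily merges already-sorted inputs into one sorted stream.
    return list(heapq.merge(*chunks))

print(k_way_merge_sort([3, 6, 8, 5, 2, 1, 4, 7], 2))
print(k_way_merge_sort([3, 6, 8, 5, 2, 1, 4, 7], 3))
```

Both calls should print the same sorted list, whether the input is split two or three levels deep.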


Number of sub sequences of length K having total sum S, given 2d array

I wish to find the number of subsequences of length K having total sum S, given an array.
Sample Input:
a=[1,1,1,2,2] & K=2 & S=2
Sample Output:
3 {because a[0],a[1]; a[1],a[2]; a[0],a[2] are the only three possible pairs in this case}
I have tried to write a recursive function in Python for a start, but it isn't giving the expected output. Can you please help me find the flaw I might be missing?
def rec(k, sum1, arr, i=0):
    #print('k: '+str(k)+' '+'sum1: '+str(sum1)) #(1) BaseCase:
    if(sum1==0 and k!=0): # Both sum(sum1) required and
        return 0 # numbers from which sum is required(k)
    if(k==0 and sum1 !=0): # should be simultaneously zero
        return 0 # Then required subsequences are 1
    if(k==0 and sum1==0 ): #
        return 1 #
    base_check = sum1!=0 or k!=0 #(2) if iterator i reaches final element
    if(i==len(arr) and base_check): # in array we should return 0 if both k
        return 0 # and sum1 aren't zero
    # func rec for getting sum1 from k elements
    if(sum1<arr[0]): # takes either first element or rejects it
        ans=rec(k-1,sum1,arr[i+1:len(arr)],i+1) # so 2 cases in else loop
        print(ans) # i is taken in as iterator to provide array
    else: # input to rec func from 2nd element of array
        ans=rec(k-1, sum1-arr[0], arr[i+1:len(arr)],i+1)+rec(k, sum1, arr[i+1:len(arr)],i+1)
    #print('i: '+str(i)+' ans: '+str(ans))
    return(ans)
a=[1,1,1,2,2]
print(rec(2,2,a))
I am still unable to work out what to change. Once this plain recursive code works, I might move on to a DP approach accordingly.
Using itertools.combinations
The function itertools.combinations returns all the subsequences of a given length. We then filter to keep only the subsequences that sum up to the desired value.
import itertools
def countsubsum(a, k, s):
    return sum(1 for c in itertools.combinations(a,k) if sum(c)==s)
Fixing your code
Your code looks pretty good, but there are two things that appear wrong about it.
What is this if for?
At first I was a bit confused about if(sum1<arr[0]):. I think you can (and should) always go to the else branch. After thinking about it some more, I understand you are trying to get rid of one of the two recursive calls if arr[0] is too large to be taken, which is smart, but this makes the assumption that all elements in the array are nonnegative. If the array is allowed to contain negative numbers, then you can include a large a[0] in the subsequence, and hope for a negative element to compensate. So if the array can contain negative numbers, you should get rid of this if/else and always execute the two recursive calls from the else branch.
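To make the point about negative numbers concrete, here is a small sketch (the function names count_pruned and count_brute are mine, purely for illustration). It compares a recursion that skips any element larger than the remaining sum against a brute-force count over all combinations.

```python
from itertools import combinations

def count_pruned(a, k, s, start=0):
    """Recursive count that refuses to take an element larger than the
    remaining sum. This shortcut is only valid for nonnegative inputs."""
    if k == 0:
        return 1 if s == 0 else 0
    if start == len(a):
        return 0
    skip = count_pruned(a, k, s, start + 1)
    if a[start] > s:  # the pruning step under discussion
        return skip
    return count_pruned(a, k - 1, s - a[start], start + 1) + skip

def count_brute(a, k, s):
    """Reference answer: test every k-element subsequence."""
    return sum(1 for c in combinations(a, k) if sum(c) == s)

data = [3, -1, 1]
print(count_brute(data, 2, 2))   # 1, from the pair (3, -1)
print(count_pruned(data, 2, 2))  # 0, because 3 is discarded too early
```

On an all-nonnegative array such as [1, 1, 1, 2, 2] the two functions agree, so the pruning is a safe optimisation only under that assumption.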
You are slicing wrong
You maintain a variable i to remember where to start in the array; but you also slice the array. Pretty soon your indices become wrong. You should use slices, or use an index i, but not both.
# WRONG
ans=rec(k-1, sum1-arr[0], arr[i+1:len(arr)],i+1)+rec(k, sum1, arr[i+1:len(arr)],i+1)
# CORRECT
ans = rec(k-1, sum1-arr[i], arr, i+1) + rec(k, sum1, arr, i+1)
# CORRECT
ans = rec(k-1, sum1-arr[0], arr[1:]) + rec(k, sum1, arr[1:])
To understand why using both slicing and an index gives wrong results, run the following code:
def iter_array_wrong(a, i=0):
    if (a):
        print(i, a)
        iter_array_wrong(a[i:], i+1)
def iter_array_index(a, i=0):
    if i < len(a):
        print(i, a)
        iter_array_index(a, i+1)
def iter_array_slice(a):
    if a:
        print(a)
        iter_array_slice(a[1:])
print('WRONG')
iter_array_wrong(list(range(10)))
print()
print('INDEX')
iter_array_index(list(range(10)))
print()
print('SLICE')
iter_array_slice(list(range(10)))
Also note that a[i:len(a)] is exactly equivalent to a[i:] and a[0:j] is equivalent to a[:j].
Clean version of the recursion
Recursively count the subsequences that use the first element of the array and the subsequences that don't use it, then add the two counts. To avoid explicitly slicing the array repeatedly, which is an expensive operation, we keep a variable start to remember that we are only working on the subarray a[start:].
def countsubsum(a, k, s, start=0):
    if k == 0:
        return (1 if s == 0 else 0)
    elif start == len(a):
        return 0
    else:
        using_first_element = countsubsum(a, k-1, s-a[start], start+1)
        notusing_first_elem = countsubsum(a, k, s, start+1)
        return using_first_element + notusing_first_elem
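Since the question mentions moving to a DP approach afterwards, one possible next step is to memoize the same recursion. The sketch below is only an illustration; the names count_subsequences and count are mine, and it assumes the array elements are hashable numbers. It caches results on the triple (start, remaining, target) with functools.lru_cache.

```python
from functools import lru_cache

def count_subsequences(a, k, s):
    """Count the k-element subsequences of a whose elements sum to s."""
    a = tuple(a)  # freeze the input so the cached helper can safely close over it

    @lru_cache(maxsize=None)
    def count(start, remaining, target):
        if remaining == 0:
            return 1 if target == 0 else 0
        if start == len(a):
            return 0
        # Either take a[start] or skip it, exactly as in the plain recursion.
        take = count(start + 1, remaining - 1, target - a[start])
        skip = count(start + 1, remaining, target)
        return take + skip

    return count(0, k, s)

print(count_subsequences([1, 1, 1, 2, 2], 2, 2))  # 3
```

The logic is unchanged from the plain recursion; the cache just prevents the same (start, remaining, target) state from being recomputed.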

generating a list but only the first index showed

I want to generate a list in which only the odd numbers have the factorial applied. However, only the first number gets processed. Can you help me? Thanks.
def factorial(x):
    if x<=0:
        return 1
    else:
        return x*factorial(x-1)
def odd(x):
    if x%2 ==0:
        return x
    else:
        return factorial(x)
def apply_if(factorial,odd,xs):
    #xs is a list
    i=0
    mlst=[]
    for x in xs:
        if i<len(xs):
            return odd(xs[i])
        i+=1
        mlst=mlst.append(odd(x))
    return mlst
You should change the apply_if function.
def apply_if(factorial,odd,xs):
    mlst=[]
    for x in xs:
        mlst.append(odd(x))
    return mlst
In your version, the return inside the loop runs on the very first iteration, because i (still 0) is always smaller than the length of the xs list when it is not empty, so the function exits after handling only the first element. Also, the append method returns None, so you shouldn't write a = a.append(...); it simply appends the element to the given list in place.
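As a side note, the factorial and odd parameters of apply_if are never actually used as parameters; the function body just calls the global odd. A possible alternative, sketched below under my own assumptions (the helper is_odd and the parameter names func and pred are hypothetical, and the inputs are assumed to be nonnegative integers), passes a predicate and a function and builds the list with a comprehension.

```python
from math import factorial

def is_odd(x):
    """Predicate: True for odd integers."""
    return x % 2 == 1

def apply_if(func, pred, xs):
    """Return a new list where func is applied only to items satisfying pred."""
    return [func(x) if pred(x) else x for x in xs]

print(apply_if(factorial, is_odd, [1, 2, 3, 4, 5]))  # [1, 2, 6, 4, 120]
```

Written this way, the same apply_if can be reused with any other predicate and function pair.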

saving the result of the recursion iterations

This is a standard permutation function. I'm trying to return the list of all the permutations, each as a list.
Could you help me with storing the results of the recursive calls? For example, this code returns nonsense. It would be perfect if there was no global variable and the resulting list lived inside the function.
Thanks!
z=[]
def func(N,M=-1,pref=None):
    global z
    if M == -1:
        M = N
    pref = pref or []
    if M==0:
        z.append(pref)
        print(pref)
    for i in range(N):
        if i not in pref:
            pref.append(i)
            func(N,M-1,pref)
            pref.pop()
func(3)
print(z)
You are passing a reference to the same list (the pref variable in the for loop) into every recursive call, and then popping items from it. Every entry appended to z is therefore that one list object, which is empty again by the time you print z.
Create a new list, or copy the list before passing it to the function, to avoid this situation.
z = []
def func(N, M=-1, pref=None):
    global z
    if M == -1:
        M = N
    pref = pref or []
    if M == 0:
        z.append(pref)
        print(pref)
        return  # permutation complete, nothing more to extend
    for i in range(N):
        if i not in pref:
            pref.append(i)
            func(N, M - 1, pref[:])  # pass a copy, not the shared list
            pref.pop()
func(3)
print(z)
For a better understanding, please read the question "List changes unexpectedly after assignment. How do I clone or copy it to prevent this?"
If you want some kind of accumulator instead of a global, you must pass it along to the recursive function. Beware that this can get a little messy.
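Here is one way the accumulator idea could look without a global, as a rough sketch. The function name permutations_of_range and the results parameter are my own choices, not from the original code. The result list is created on the first call and threaded through the recursion, and each finished permutation is copied before it is stored.

```python
def permutations_of_range(n, prefix=None, results=None):
    """Return all permutations of range(n) as a list of lists."""
    if prefix is None:
        prefix = []
    if results is None:
        results = []  # created once, on the outermost call
    if len(prefix) == n:
        results.append(prefix[:])  # store a copy, since prefix keeps changing
        return results
    for i in range(n):
        if i not in prefix:
            prefix.append(i)
            permutations_of_range(n, prefix, results)
            prefix.pop()
    return results

print(permutations_of_range(3))
```

If you only need the permutations and not the exercise, itertools.permutations(range(n)) from the standard library yields the same sequences as tuples.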

Python generator that returns group of items

I am trying to make a generator that can return a number of consecutive items in a list which "moves" only by one index. Something similar to a moving average filter in DSP. For instance if I have list:
l = [1,2,3,4,5,6,7,8,9]
I would expect this output:
[(1,2,3),(2,3,4),(3,4,5),(4,5,6),(5,6,7),(6,7,8),(7,8,9)]
I have made code but it does not work with filters and generators etc. I am afraid it will also break due to memory if I need to provide a large list of words.
Function gen:
def gen(enumobj, n):
    for idx,val in enumerate(enumobj):
        try:
            yield tuple(enumobj[i] for i in range(idx, idx + n))
        except:
            break
and the example code:
words = ['aaa','bb','c','dddddd','eeee','ff','g','h','iiiii','jjj','kk','lll','m','m','ooo']
w = filter(lambda x: len(x) > 1, words)
# It's working with list
print('\nList:')
g = gen(words, 4)
for i in g: print(i)
# It's not working with filters / generators etc.
print('\nFilter:')
g = gen(w, 4)
for i in g: print(i)
The filter case does not produce anything, because it is not possible to index a filter object. Of course, one answer is to force a list: list(w). However, I am looking for better code for the function. How can I change it so that the function also accepts filters and other iterators? I am worried about memory when the list holds a huge amount of data.
Thanks
With iterators you need to keep track of the values that have already been read. An n-sized list does the trick: after each yield, discard the oldest item and append the next value to the list.
import itertools
def gen(enumobj, n):
    # we need an iterator for the `next` call below. this creates
    # an iterator from an iterable such as a list, but leaves
    # iterators alone.
    enumobj = iter(enumobj)
    # cache the first n objects (fewer if iterator is exhausted)
    cache = list(itertools.islice(enumobj, n))
    # while we still have something in the cache...
    while cache:
        # yield a tuple copy so later changes to the cache don't alter it
        yield tuple(cache)
        # drop stale item
        cache.pop(0)
        # try to get one new item, stopping when iterator is done
        try:
            cache.append(next(enumobj))
        except StopIteration:
            # pass to emit progressively smaller units
            #pass
            # break to stop when fewer than `n` items remain
            break
words = ['aaa','bb','c','dddddd','eeee','ff','g','h','iiiii','jjj','kk','lll','m','m','ooo']
w = filter(lambda x: len(x) > 1, words)
# It's working with list
print('\nList:')
g = gen(words, 4)
for i in g: print(i)
# now it works with iterators
print('\nFilter:')
g = gen(w, 4)
for i in g: print(i)
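A variation on the same caching idea, sketched here with collections.deque (the name sliding_window and the n < 1 check are my own): a deque created with maxlen=n drops its oldest item automatically on append, so there is no explicit pop(0), and at most n items are held in memory at any time.

```python
from collections import deque
from itertools import islice

def sliding_window(iterable, n):
    """Yield tuples of n consecutive items, advancing one item at a time."""
    if n < 1:
        raise ValueError("n must be at least 1")
    it = iter(iterable)
    # Prime the window with the first n items (fewer if the input is short).
    window = deque(islice(it, n), maxlen=n)
    if len(window) == n:
        yield tuple(window)
    for item in it:
        window.append(item)  # maxlen makes the deque discard the oldest item
        yield tuple(window)

print(list(sliding_window([1, 2, 3, 4, 5, 6, 7, 8, 9], 3)))
words = ['aaa', 'bb', 'c', 'dddddd', 'eeee', 'ff']
print(list(sliding_window(filter(lambda x: len(x) > 1, words), 2)))
```

Unlike the list-based version, this one yields nothing at all when the input has fewer than n items.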

A Pythonic approach to a list comparison/generation script

Consider I've got a list of 2-tuples named listuple and another simple list named list0. I want to generate a list of 1s and -1s based on comparing my two given lists.
def Vectomparison (listuple, list0):
    result = []
    for EachElement in listuple:
        if EachElement [0] in list0:
            result.append (1)
        else:
            result.append (-1)
    return result
But I really think that this is not a Pythonic approach. Any ideas for making this more compact and Pythonic?
A direct rewrite using a list comprehension and the ternary operator:
def Vectomparison (listuple, list0):
    return [1 if item[0] in list0 else -1 for item in listuple]
Note that item[0] in list0 uses a linear search if list0 is a list. That makes the algorithm's time complexity O(N*M), where N = len(listuple) and M = len(list0).
You can make it faster by building a set once, since a set membership test takes constant time on average:
def Vectomparison (listuple, list0):
    set0 = set(list0)
    return [1 if item[0] in set0 else -1 for item in listuple]
This version has a time complexity of O(N+M).
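A quick usage sketch with made-up sample data follows. The name compare_to_list and the inputs are hypothetical. One assumption worth stating: the set-based version requires the elements of list0 and the first items of the tuples to be hashable (strings, numbers, tuples), otherwise set() or the membership test raises TypeError.

```python
def compare_to_list(pairs, values):
    """Return 1 for each pair whose first item occurs in values, else -1."""
    lookup = set(values)  # assumes hashable elements
    return [1 if first in lookup else -1 for first, _ in pairs]

pairs = [("a", 1), ("b", 2), ("c", 3)]
print(compare_to_list(pairs, ["a", "c"]))  # [1, -1, 1]
```

Unpacking each 2-tuple as first, _ in the comprehension is a small readability alternative to indexing with item[0].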
