set from the union of elements contained in two lists - programming-languages

This is for a pre-interview questionnaire. I believe I have the answer and just want confirmation that I'm right.
Part 1 - Tell me what this code does, and its big-O performance
Part 2 - Re-write it yourself and tell me the big-O performance of your solution
def foo(a, b):
    """ a and b are both lists """
    c = []
    for i in a:
        if is_bar(b, i):
            c.append(i)
    return unique(c)

def is_bar(a, b):
    for i in a:
        if i == b:
            return True
    return False

def unique(arr):
    b = {}
    for i in arr:
        b[i] = 1
    return b.keys()
ANSWERS:
It creates a set from the union of elements contained in two lists. Its big-O performance is O(n^2).
My solution, which I believe achieves O(n):
Set A = getSetA();
Set B = getSetB();
Set UnionAB = new Set(A);
UnionAB.addAll(B);
for (Object inA : a)
    if(B.contains(inA))
        UnionAB.remove(inA);

It seems like the original code is doing an intersection, not a union. It traverses every element of the first list (a) and checks whether it exists in the second list (b); if so, it appends it to list c. It then returns the unique elements of c. Performance of O(n^2) seems right, since is_bar scans b linearly for each element of a.
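As a rough sketch of the rewrite asked for in Part 2 (the name common_elements is my own, and it assumes the list elements are hashable), Python's built-in sets give the same result in O(len(a) + len(b)) time on average:

```python
def common_elements(a, b):
    """Return the distinct values that appear in both lists."""
    # Building each set is linear in the list length; the intersection
    # then relies on average O(1) hash lookups.
    return set(a) & set(b)


print(common_elements([1, 2, 2, 3], [2, 3, 4]))
```

Unlike the original, this returns a set rather than a view of dict keys, so the order of the result is not guaranteed.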

Related

Is there an efficient way to lower time complexity of this problem ? current T(n) = O(N^3)

I need to choose one value from each of the lists a, b and c such that the value of the expression abs(a - b) + abs(b - c) + abs(c - a) is minimized.
def minimumValue(n: int, a: List[int], b: List[int], c: List[int]) -> int :
    # Write your code here.
    ans=10000000000000
    for i in range (n):
        for j in range (n):
            for k in range (n):
                ans = min(ans, abs(a[i] - b[j]) + abs(b[j] - c[k]) + abs(c[k] - a[i]))
    return ans
Here is an O(n log n) solution. You can sort the three lists, and then do this:
- Get the first value from each of the three lists.
- Repeat while we have three values:
  - Sort these three values (and keep track of where they came from).
  - Calculate the target expression with those three, and check if it improves on the best result so far.
  - Replace the least of these three values with the next value from the same list that value came from. If there is no next value in that list, return the result (quit).
Note also that the formula of the expression to evaluate is the same as doing (max(x,y,z)-min(x,y,z))*2, and this is easy to do when the values x, y and z are sorted, as then it becomes (z-x)*2. To find the minimum that this expression can take, we can leave out the *2 and only do that multiplication at the very end.
Here is the code for implementing that idea:
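from typing import List

def minimumValue(n: int, a: List[int], b: List[int], c: List[int]) -> int:
    # One iterator per sorted list
    queues = map(iter, map(sorted, (a, b, c)))
    # Each entry is [current value, iterator it came from]
    three = [[next(q), q] for q in queues]
    least = float("inf")
    while True:
        # Sort by value only: on a tie, comparing the iterators would raise TypeError
        three.sort(key=lambda pair: pair[0])
        least = min(least, three[2][0] - three[0][0])
        try:
            # Advance the list that supplied the smallest value
            three[0][0] = next(three[0][1])
        except StopIteration:
            return least*2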
The time complexity for initially sorting the three input lists is O(n log n). The loop iterates at most 3n-2 times, which is O(n). Each of the actions in one loop iteration executes in constant time.
So the overall complexity is determined by the initial sorting: O(nlogn)
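The same sweep can also be sketched with explicit indices instead of iterators, which some may find easier to follow. This is only an illustrative variant under the same idea (always advance the list holding the smallest value); the name minimum_value_pointers is made up, and it assumes all three lists are non-empty.

```python
def minimum_value_pointers(a, b, c):
    """Minimise |x-y| + |y-z| + |z-x| with x from a, y from b, z from c."""
    a, b, c = sorted(a), sorted(b), sorted(c)
    i = j = k = 0
    best = float("inf")
    while i < len(a) and j < len(b) and k < len(c):
        x, y, z = a[i], b[j], c[k]
        lowest = min(x, y, z)
        # The expression equals 2 * (max - min); track only max - min here.
        best = min(best, max(x, y, z) - lowest)
        # Advance whichever list supplied the smallest value.
        if lowest == x:
            i += 1
        elif lowest == y:
            j += 1
        else:
            k += 1
    return 2 * best


print(minimum_value_pointers([1, 4, 10], [2, 15, 20], [10, 12]))  # 10
```

Sorting dominates the cost, so this sketch is also O(n log n) overall.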
Without any further knowledge/assumption on the content of the 3 lists, and if you need to obtain the true minimum value (and not an approximate value), then there's no other choice than using brute force. Some optimisations are possible, but still with a N^3 complexity (and without any speed-up in the worst case).
for i in range (n):
    for j in range (n):
        v = abs(a[i] - b[j])
        if v < ans:
            for k in range (n):
                ans = min(ans, v + abs(b[j] - c[k]) + abs(c[k] - a[i]))

Mark Element in List

I have an exercise about prime numbers that requires me to write a function which takes a list of elements and a number p, and marks as False the elements at indices 2p, 3p, ... up to N.
First I create a list of True and False:
true_value = [False, False] + [True for x in range(n-1)] # Let's assume that n = 16
Then I write a function that finds the even numbers in this list (with p = 2):
def mark_false(bool_list, p):
    range_new = [x for x in range(len(bool_list))]
    for i in range(2, len(range_new)):
        for j in range(p, len(range_new), p):
            if (i*p == range_new[j]) & (i*p <= len(range_new)):
                bool_list[j] = False
    return bool_list
This function helps me find the positions of the even numbers (> 2) and set them to False.
Example: a = list_true(16)
a = [False,False,True,True,True,True,True,True,True,True,True,True,True,True,True,True,True]
b = mark_false(a, 2)
b = [False,False,True,True,False,True,False,True,False,True,False,True,False,True,False,True]
The function mark_false does work, but the problem is that every time I run it I have to create the list range_new, which takes a lot of time. How do I rewrite this function so it runs faster without creating new lists?
You seem to be doing things the long way around, searching for the j value that matches the multiple of p you want to set to False. But since you already know that value, there's no need to search for it; just set it directly.
I'd do:
def mark_false(bool_list, p):
    for i in range(2*p, len(bool_list), p): # 2*p, 3*p, ... (p itself stays True)
        bool_list[i] = False # do the assignment unconditionally
You probably shouldn't need a return statement, since you're modifying the list you are passed in-place. Returning the list could make the API misleading, as it might suggest that the returned list is a new one (e.g. a modified copy).
If you did want to return a new list, you could create one with a list comprehension, rather than modifying the existing list:
def mark_false_copy(bool_list, p):
    return [False if i >= 2*p and i % p == 0 else x for i, x in enumerate(bool_list)]
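For context, here is a minimal sketch of how such a marking function might be used in a full Sieve of Eratosthenes. I'm assuming that is what the exercise is building towards; the names mark_multiples and primes_up_to are made up for illustration.

```python
def mark_multiples(is_prime, p):
    """Mark 2p, 3p, ... as not prime, modifying the list in place."""
    for multiple in range(2 * p, len(is_prime), p):
        is_prime[multiple] = False


def primes_up_to(n):
    """Return all primes <= n (assumes n is a non-negative integer)."""
    # Indices 0..n; 0 and 1 are not prime.
    is_prime = [False, False] + [True] * (n - 1)
    p = 2
    while p * p <= n:
        if is_prime[p]:  # only sieve with numbers still marked prime
            mark_multiples(is_prime, p)
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]


print(primes_up_to(16))  # [2, 3, 5, 7, 11, 13]
```

No helper list of indices is needed, since range already produces the multiples directly.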

How can i optimise my code and make it readable?

The task is:
The user enters a number. You take one digit from the left and one from the right and sum them. Then you take the rest of the number and sum every digit in it. That gives you two results. You have to sort them from largest to smallest and join them into a single number. I solved it, but I don't like how it looks. The task is pretty simple, but my code looks like trash. Maybe I should use more built-in functions and libraries. If so, could you please recommend some? Thank you.
a = int(input())
b = [int(i) for i in str(a)]
closesum = 0
d = []
e = ""
farsum = b[0] + b[-1]
print(farsum)
b.pop(0)
b.pop(-1)
print(b)
for i in b:
    closesum += i
print(closesum)
d.append(int(closesum))
d.append(int(farsum))
print(d)
for i in sorted(d, reverse = True):
    e += str(i)
print(int(e))
input()
You can use reduce
from functools import reduce
a = [0,1,2,3,4,5,6,7,8,9]
print(reduce(lambda x, y: x + y, a))
# 45
You can also just pass in a shortened list instead of popping elements: b[1:-1]
The first two lines:
str_input = input() # input will always read strings
num_list = [int(i) for i in str_input]
The for loop at the end is unnecessary, and there is no need to sort only 2 elements. You can just use a simple if..else condition to print what you want.
You don't need a loop to sum a slice of a list. You can also use join to concatenate a list of strings without looping. Sort the numbers first and convert them to strings afterwards with map(str, ...): strings compare lexicographically, so sorting them as text would put "9" ahead of "10".
farsum = b[0] + b[-1]
closesum = sum(b[1:-1])
"".join(map(str, sorted((farsum, closesum), reverse=True)))

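Putting those suggestions together, a compact version might look like the sketch below. The function name combine_sums is hypothetical, and it assumes the input is a string of at least two decimal digits.

```python
def combine_sums(number_text):
    """Join the outer-digit sum and inner-digit sum, largest first."""
    digits = [int(ch) for ch in number_text]
    far_sum = digits[0] + digits[-1]   # first digit plus last digit
    close_sum = sum(digits[1:-1])      # everything in between
    # Sort numerically, then convert to text for joining.
    ordered = sorted((far_sum, close_sum), reverse=True)
    return int("".join(map(str, ordered)))


print(combine_sums("12345"))  # 96
```

Keeping the logic in a function separates it from the input() and print() calls, which makes it easier to check against a few known values.
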
Python losing track of index location in for loop when my list has duplicate values

I'm trying to iterate over pairs of integers in a list. I'd like to return pairs where the sum equals some variable value.
This seems to be working just fine when the list of integers doesn't have repeat numbers. However, once I add repeat numbers to the list the loop seems to be getting confused about where it is. I'm guessing this based on my statements:
print(list.index(item))
print(list.index(item2))
Here is my code:
working_list = [1,2,3,4,5]
broken_list = [1,3,3,4,5]
def find_pairs(list, k):
    pairs_list = []
    for item in list:
        for item2 in list:
            print(list.index(item))
            print(list.index(item2))
            if list.index(item) < list.index(item2):
                sum = item + item2;
                if sum == k:
                    pair = (item, item2)
                    pairs_list.append(pair)
    return pairs_list
### First parameter is the list to check.
### Second parameter is the integer you're looking for each pair to sum to.
find_pairs(broken_list, 6)
working_list is fine. When I run broken_list looking for pairs which sum to 6, I'm getting back (1,5) but I should also get back (3,3) and I'm not.
You are trying to use list.index(item) < list.index(item2) to ensure that you do not double count the pairs. However, broken_list.index(3) returns 1 for both the first and second 3 in the list. I.e. the return value is not the actual index you want (unless the list only contains unique elements, like working_list). To get the actual index, use enumerate. The simplest implementation would be
def find_pairs(list, k):
    pairs_list = []
    for i, item in enumerate(list):
        for j, item2 in enumerate(list):
            if i < j:
                sum = item + item2
                if sum == k:
                    pair = (item, item2)
                    pairs_list.append(pair)
    return pairs_list
For small lists this is fine, but we could be more efficient by only looping over the elements we want using slicing, hence eliminating the if statement:
def find_pairs(list, k):
    pairs_list = []
    for i, item in enumerate(list):
        for item2 in list[i+1:]:
            sum = item + item2
            if sum == k:
                pair = (item, item2)
                pairs_list.append(pair)
    return pairs_list
Note on variable names
Finally, I have to comment on your choice of variable names: list and sum are already defined by Python, and so it's bad style to use these as variable names. Furthermore, 'items' are commonly used to refer to a key-value pair of objects, and so I would refrain from using this name for a single value as well (I guess something like 'element' is more suitable).
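As one possible way to apply both points (position-based pairing and names that don't shadow built-ins), here is a short sketch using itertools.combinations, which yields each pair of positions i < j exactly once. The parameter names values and target are my own choice.

```python
from itertools import combinations


def find_pairs(values, target):
    """Return every pair (by position) whose elements sum to target."""
    # combinations() pairs elements by position, so duplicates are handled.
    return [(x, y) for x, y in combinations(values, 2) if x + y == target]


print(find_pairs([1, 3, 3, 4, 5], 6))  # [(1, 5), (3, 3)]
```

This is still O(n^2) in the length of the list, like the loops above.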

comparing two arrays and get the values which are not common

I am doing this problem a friend gave me where you are given 2 arrays say (a[1,2,3,4] and b[8,7,9,2,1]) and you have to find not common elements.
Expected output is [3,4,8,7,9]. Code below.
def disjoint(e,f):
    c = e[:]
    d = f[:]
    for i in range(len(e)):
        for j in range(len(f)):
            if e[i] == f[j]:
                c.remove(e[i])
                d.remove(d[j])
    final = c + d
    print(final)
print(disjoint(a,b))
I tried with nested loops and creating copies of given arrays to modify them then add them but...
def disjoint(e,f):
    c = e[:] # list copies
    d = f[:]
    for i in range(len(e)):
        for j in range(len(f)):
            if e[i] == f[j]:
                c.remove(c[i]) # edited this line
                d.remove(d[j])
    final = c + d
    print(final)
print(disjoint(a,b))
When I try removing the common element from the list copies, I get a different output: [2,4,8,7,9]. Why?
This is my first question on this website. I'll be thankful if anyone can clear up my doubts.
Using sets you can do:
a = [1,2,3,4]
b = [8,7,9,2,1]
diff = (set(a) | set(b)) - (set(a) & set(b))
(set(a) | set(b)) is the union, set(a) & set(b) is the intersection and finally you do the difference between the two sets using -.
Your bug comes when you remove the elements in the lines c.remove(c[i]) and d.remove(d[j]). Indeed, the common elements are e[i] and f[j], while c and d are the lists you are updating, so their indices no longer line up with e and f once something has been removed.
To fix your bug you only need to change these lines to c.remove(e[i]) and d.remove(f[j]).
Note also that your method to delete items in both lists will not work if a list may contain duplicates.
Consider for instance the case a = [1,1,2,3,4] and b = [8,7,9,2,1].
You can simplify your code to make it work:
def disjoint(e,f):
    c = e.copy() # [:] works also, but I think this is clearer
    d = f.copy()
    for i in e: # no need for an index, just walk each item in the array
        for j in f:
            if i == j: # if there is a match, remove the match.
                c.remove(i)
                d.remove(j)
    return c + d
print(disjoint([1,2,3,4],[8,7,9,2,1]))
There are a lot of more efficient ways to achieve this. Check this Stack Overflow question to discover them: Get difference between two lists. My favorite way is to use set (like in #newbie's answer). What is a set? Let's check the documentation:
A set object is an unordered collection of distinct hashable objects. Common uses include membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference. (For other containers see the built-in dict, list, and tuple classes, and the collections module.)
emphasis mine
Symmetric difference is perfect for our need!
Returns a new set with elements in either the set or the specified iterable but not both.
OK, here is how to use it in your case:
def disjoint(e,f):
    return list(set(e).symmetric_difference(set(f)))
print(disjoint([1,2,3,4],[8,7,9,2,1]))
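One caveat: a set is unordered, so the result above is not guaranteed to come back as [3,4,8,7,9]. If the order of the expected output matters, a sketch along these lines keeps the original order while still using sets for fast membership tests. The name disjoint_ordered is made up, and it assumes the elements are hashable.

```python
def disjoint_ordered(first, second):
    """Elements found in only one of the lists, in their original order."""
    first_set, second_set = set(first), set(second)
    only_first = [x for x in first if x not in second_set]
    only_second = [x for x in second if x not in first_set]
    return only_first + only_second


print(disjoint_ordered([1, 2, 3, 4], [8, 7, 9, 2, 1]))  # [3, 4, 8, 7, 9]
```

It also copes with duplicates in the inputs, since nothing is removed from a list while it is being scanned.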