Select numbers from a list with given probability p - python-3.x

Let's say I have a list of numbers [1, 2, 3, ..., 100]. Now I want to select numbers from the list where each number is either accepted or rejected with a given probability 0 < p < 1. The accepted numbers are then stored in a separate list. How can I do that?
The main problem is choosing a number with probability p. Is there a built-in function for that?
The value of p is given by the user.

You can use random.random() and a list comprehension:
import random
l = [1,2,3,4,5,6,7,8,9]
k = [x for x in l if random.random() > 0.23] # supply user input value here as 0.23
print(l)
print(k)
Output:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[2, 3, 4, 5, 6, 7]
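This checks each element independently: an element stays in the list when random.random() returns a value above 0.23, which happens with probability 0.77. To accept each element with probability p instead, use the condition random.random() < p.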
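A minimal sketch of the same idea wrapped in a function, assuming p is the acceptance probability. The names select_with_probability and rng are made up for illustration; the optional rng argument only exists so that a seeded random.Random instance can be passed in for repeatable runs.

```python
import random

def select_with_probability(numbers, p, rng=random):
    """Keep each number independently with probability p (hypothetical helper)."""
    # random() returns a float in [0.0, 1.0), so `< p` is true with probability p
    return [x for x in numbers if rng.random() < p]

numbers = list(range(1, 101))
accepted = select_with_probability(numbers, 0.23, rng=random.Random(42))
print(len(accepted), accepted[:5])
```

On average about p * len(numbers) elements are accepted, but the exact count varies from run to run.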
Sidenote:
random.choices() has the ability to accept weights:
random.choices(population, weights=None, *, cum_weights=None, k=1)
but those weights only change the probability of drawing one particular element from the given list (absolute or relative weights are possible). That does not work for "independent" per-element probabilities, though.

Related

Pair wise permutation of two lists in Python

I have a list with 10 numerical values. I want to return all possible combination of this list such that each element can take value +/- element value.
The approach I had in mind was to take a binary variable which takes in value from 0 to 1023. 1 in this variable corresponds to positive d[i] and 0 to negative d[i].
e.g. for bin(8) = 0000001000 implies that d7 will take value -d7 and rest will be positive. Repeat this for all 0 to 1023 to get all combinations.
For example, if D = [d1,d2,...d10], we will have 1024 (2^10) combinations such that:
D1 = [-d1,d2,d3,....d10]
D2 = [-d1,-d2,d3,....d10]
D3 = [d1,-d2,d3,....d10] ...
D1024 = [-d1,-d2,-d3,....-d10]
Thank You!
You can just use the built-in itertools.product to make all combinations of positive and negative values.
from itertools import product
inputs = list(range(10)) # [0,1,2,3,4,5,6,7,8,9]
outputs = []
choices = [(x,-x) for x in inputs]
for item in product(*choices):
    outputs.append(item)
print(outputs[:3])
print(len(outputs))
# [(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, -9), (0, 1, 2, 3, 4, 5, 6, 7, -8, 9)]
# 1024
in a compressed form:
outputs = [item for item in product(*[(x,-x) for x in inputs])]
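The bitmask idea described in the question also works. Here is a rough sketch, assuming bit i of the counter decides the sign of element i (1 means positive, 0 means negative). The function name sign_combinations is made up for illustration.

```python
def sign_combinations(values):
    """Return every +/- sign assignment of `values`, one per bitmask (hypothetical helper)."""
    n = len(values)
    combos = []
    for mask in range(2 ** n):
        # bit i set -> keep values[i] positive, bit i clear -> negate it
        combo = [v if (mask >> i) & 1 else -v for i, v in enumerate(values)]
        combos.append(combo)
    return combos

combos = sign_combinations([1, 2, 3])
print(len(combos))  # 8
print(combos[0])    # [-1, -2, -3]
print(combos[-1])   # [1, 2, 3]
```

For a list of 10 values this yields 2**10 = 1024 combinations, the same count as the product approach.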

Count sets of increasing trend in a list

I have to count the number of subsets where there is an increasing trend based on user input of the list and the length of the subset.
List:[1,2,3,4,5,6,7,8,9]
k: The length of the trend.
for example...if k=3 and the data points increase for three consecutive numbers, then it is counted as one.
Here the input is the list and the length of the list and k
Example:
Input: List:{1,2,3,4,5,6}
k=3
Output: 4. {(1,2,3),(2,3,4),(3,4,5),(4,5,6)}
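You can use the following, which splits the list into consecutive, non-overlapping chunks of length k (it does not count overlapping windows):
def group_by_trend(data, k):
    return [data[i*k :min((i+1)*k, len(data))]
            for i in range(round((len(data)+1)/k))]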
# test
k = 3
List = [1,2,3,4,5,6,7,8,9]
print(group_by_trend(List, k))
output:
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
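To get the count the question asks for, a sliding window can be used instead of chunking. A minimal sketch, assuming "increasing trend" means strictly increasing over k consecutive elements; the name count_increasing_windows is made up for illustration.

```python
def count_increasing_windows(data, k):
    """Count windows of k consecutive elements that are strictly increasing (hypothetical helper)."""
    count = 0
    for start in range(len(data) - k + 1):
        window = data[start:start + k]
        # compare each element with its right-hand neighbour
        if all(x < y for x, y in zip(window, window[1:])):
            count += 1
    return count

print(count_increasing_windows([1, 2, 3, 4, 5, 6], 3))  # 4
```

This reproduces the expected output of 4 for the example in the question.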

LIS on two arrays

I feel lost on how to approach this question,
Given two integer arrays of sizes n and m, I want to merge these two arrays into one such that the order of elements in each array doesn't change and the size of their Longest Increasing Subsequence becomes maximum.
Once we choose an element of A or B, we cannot choose an earlier element of that sequence
My goal is to find maximum possible length of longest increasing subsequence.
This is what I have so far:
def sequences(a, b, start_index=0, min_val=None):
    limits = a[start_index], b[start_index]
    lower = min(limits)
    higher = max(limits)
    if min_val is not None and min_val > lower:
        lower = min_val
    options = range(lower, higher + 1)
    is_last = start_index == len(a) - 1
    for val in options:
        if is_last:
            yield [val]
        else:
            for seq in sequences(a, b, start_index+1, min_val=val+1):
                yield [val, *seq]
for seq in sequences([1,3,1,6], [6,5,4,4]):
    print(seq)
However, this results in: [1, 3, 4, 5], [1, 3, 4, 6], [2, 3, 4, 5], [2, 3, 4, 6].
The expected output should be:
array1: [1,3,1,6]
array2: [6,5,4,4]
We take 1(from array1), 3(from array1), 4(from array2), 6(from array1)
Giving us LIS: [1,3,4,6].
We got this by not choosing an earlier element from a sequence once we are at a certain value.
How do I stop it from unwanted recursion?
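One possible approach, sketched here under the assumption that only the maximum length is needed: any strictly increasing sequence built by picking elements from the two arrays, never going backwards in either one, can be realised by some merge. So it is enough to search over "next pick" choices with memoization. The function name max_merged_lis is made up for illustration, and this is a plain memoized search, not an optimised algorithm.

```python
from functools import lru_cache

def max_merged_lis(a, b):
    """Longest strictly increasing sequence picking forward from a and b (hypothetical helper)."""
    @lru_cache(maxsize=None)
    def best(i, j, last):
        # i, j: first index still available in a and b; last: value picked most recently
        result = 0
        for x in range(i, len(a)):
            if last is None or a[x] > last:
                result = max(result, 1 + best(x + 1, j, a[x]))
        for y in range(j, len(b)):
            if last is None or b[y] > last:
                result = max(result, 1 + best(i, y + 1, b[y]))
        return result

    return best(0, 0, None)

print(max_merged_lis([1, 3, 1, 6], [6, 5, 4, 4]))  # 4
```

For the arrays in the question this gives 4, matching the sequence [1, 3, 4, 6].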

understanding the working principle of sorted function python [duplicate]

I have the following Python dict:
[(2, [3, 4, 5]), (3, [1, 0, 0, 0, 1]), (4, [-1]), (10, [1, 2, 3])]
Now I want to sort them by the sum of each key's list of values; for the first key, the sum is 3+4+5=12.
I have written the following code that does the job:
def myComparator(a,b):
    print "Values(a,b): ",(a,b)
    sum_a=sum(a[1])
    sum_b=sum(b[1])
    print sum_a,sum_b
    print "Comparision Returns:",cmp(sum_a,sum_b)
    return cmp(sum_a,sum_b)
items.sort(myComparator)
print items
This is the output I get after running the code above:
Values(a,b): ((3, [1, 0, 0, 0, 1]), (2, [3, 4, 5]))
2 12
Comparision Returns: -1
Values(a,b): ((4, [-1]), (3, [1, 0, 0, 0, 1]))
-1 2
Comparision Returns: -1
Values(a,b): ((10, [1, 2, 3]), (4, [-1]))
6 -1
Comparision Returns: 1
Values(a,b): ((10, [1, 2, 3]), (3, [1, 0, 0, 0, 1]))
6 2
Comparision Returns: 1
Values(a,b): ((10, [1, 2, 3]), (2, [3, 4, 5]))
6 12
Comparision Returns: -1
[(4, [-1]), (3, [1, 0, 0, 0, 1]), (10, [1, 2, 3]), (2, [3, 4, 5])]
Now I am unable to understand as to how the comparator is working, which two values are being passed and how many such comparisons would happen? Is it creating a sorted list of keys internally where it keeps track of each comparison made? Also the behavior seems to be very random. I am confused, any help would be appreciated.
The number of comparisons, and which ones are made, is not documented and can differ between implementations. The only guarantee is that if the comparison function makes sense, the method will sort the list.
CPython uses the Timsort algorithm to sort lists, so what you see is the order in which that algorithm performs the comparisons (if I'm not mistaken, for very short lists Timsort just uses insertion sort).
Python is not keeping track of "keys". It just calls your comparison function every time a comparison is made. So your function can be called many more than len(items) times.
If you want to use keys you should use the key argument. In fact you could do:
items.sort(key=lambda x: sum(x[1]))
This will create the keys and then sort using the usual comparison operator on the keys. This is guaranteed to call the function passed by key only len(items) times.
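For readers on Python 3, where cmp() and the comparator argument to sort() no longer exist, here is a small sketch of both styles. It assumes the list from the question; the names sum_key, compare_sums and key_calls are made up for illustration. functools.cmp_to_key wraps an old-style comparator so it can be passed as key.

```python
from functools import cmp_to_key

items = [(2, [3, 4, 5]), (3, [1, 0, 0, 0, 1]), (4, [-1]), (10, [1, 2, 3])]

key_calls = []  # records every element the key function is called with

def sum_key(pair):
    key_calls.append(pair)
    return sum(pair[1])

def compare_sums(a, b):
    # old-style comparator: negative, zero, or positive
    sum_a, sum_b = sum(a[1]), sum(b[1])
    return (sum_a > sum_b) - (sum_a < sum_b)

by_key = sorted(items, key=sum_key)
by_cmp = sorted(items, key=cmp_to_key(compare_sums))

print(by_key)          # [(4, [-1]), (3, [1, 0, 0, 0, 1]), (10, [1, 2, 3]), (2, [3, 4, 5])]
print(len(key_calls))  # 4, one call per element
print(by_key == by_cmp)  # True
```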
Given that your list is:
[a,b,c,d]
The sequence of comparisons you are seeing is:
b < a # -1 true --> [b, a, c, d]
c < b # -1 true --> [c, b, a, d]
d < c # 1 false
d < b # 1 false
d < a # -1 true --> [c, b, d, a]
how the comparator is working
This is well documented:
Compare the two objects x and y and return an integer according to the outcome. The return value is negative if x < y, zero if x == y and strictly positive if x > y.
Instead of calling the cmp function you could have written:
sum_a=sum(a[1])
sum_b=sum(b[1])
if sum_a < sum_b:
    return -1
elif sum_a == sum_b:
    return 0
else:
    return 1
which two values are being passed
From your print statements you can see the two values that are passed. Let's look at the first iteration:
((3, [1, 0, 0, 0, 1]), (2, [3, 4, 5]))
What you are printing here is a tuple (a, b), so the actual values passed into your comparison functions are
a = (3, [1, 0, 0, 0, 1])
b = (2, [3, 4, 5])
By means of your function, you then compare the sum of the two lists in each tuple, which you denote sum_a and sum_b in your code.
and how many such comparisons would happen?
I guess what you are really asking: How does the sort work, by just calling a single function?
The short answer is: it uses the Timsort algorithm, and it calls the comparison function O(n * log n) times (note that the actual number of calls is c * n * log n, where c > 0).
To understand what is happening, picture yourself sorting a list of values, say v = [4,2,6,3]. If you go about this systematically, you might do this:
1. Start at the first value, at index i = 0.
2. Compare v[i] with v[i+1].
3. If v[i+1] < v[i], swap them.
4. Increase i and repeat from step 2 until i == len(v) - 2.
5. Start again at step 1 until no further swaps occur.
So you get, i =
0: 2 < 4 => [2, 4, 6, 3] (swap)
1: 6 < 4 => [2, 4, 6, 3] (no swap)
2: 3 < 6 => [2, 4, 3, 6] (swap)
Start again:
0: 4 < 2 => [2, 4, 3, 6] (no swap)
1: 3 < 4 => [2, 3, 4, 6] (swap)
2: 6 < 4 => [2, 3, 4, 6] (no swap)
Start again - there will be no further swaps, so stop. Your list is sorted. In this example we have run through the list 3 times, and there were 3 * 3 = 9 comparisons.
Obviously this is not very efficient -- the sort() method only calls your comparator function 5 times. The reason is that it employs a more efficient sort algorithm than the simple one explained above.
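The simple procedure walked through above can be sketched in code with a comparison counter. The function name bubble_sort_counting is made up for illustration; it only mirrors the manual steps and is not how sort() works internally.

```python
def bubble_sort_counting(values):
    """Sort a copy of `values` with repeated adjacent swaps and count comparisons (hypothetical helper)."""
    v = list(values)
    comparisons = 0
    swapped = True
    while swapped:  # start another pass until a full pass makes no swap
        swapped = False
        for i in range(len(v) - 1):
            comparisons += 1
            if v[i + 1] < v[i]:
                v[i], v[i + 1] = v[i + 1], v[i]
                swapped = True
    return v, comparisons

result, count = bubble_sort_counting([4, 2, 6, 3])
print(result, count)  # [2, 3, 4, 6] 9
```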
Also the behavior seems to be very random.
Note that the sequence of values passed to your comparator function is not, in general, defined. However, the sort function does all the necessary comparisons between any two values of the iterable it receives.
Is it creating a sorted list of keys internally where it keeps track of each comparison made?
No, it is not keeping a list of keys internally. Rather the sorting algorithm essentially iterates over the list you give it. In fact it builds subsets of lists to avoid doing too many comparisons - there is a nice visualization of how the sorting algorithm works at Visualising Sorting Algorithms: Python's timsort by Aldo Cortesi
Basically, for the simple list such as [2, 4, 6, 3, 1] and the complex list you provided, the sorting algorithms are the same.
The only differences are the complexity of the elements in the list and the scheme used to compare any two elements (e.g. the myComparator you provided).
There is a good description for Python Sorting: https://wiki.python.org/moin/HowTo/Sorting
First, the cmp() function:
cmp(...)
cmp(x, y) -> integer
Return negative if x<y, zero if x==y, positive if x>y.
You are using this line: items.sort(myComparator). This passes the function itself to sort(), which calls it for pairs of elements and uses the returned negative, zero, or positive value to decide their order.
Since you want to sort based on the sum of each tuples list, you could do this:
mylist = [(2, [3, 4, 5]), (3, [1, 0, 0, 0, 1]), (4, [-1]), (10, [1, 2, 3])]
sorted(mylist, key=lambda pair: sum(pair[1]))
This does, I think, exactly what you wanted: it sorts mylist based on the sum() of each tuple's list.

Building a list of random multiplication examples

I have two 100-element lists filled with random numbers between 1 and 10.
I want to make a list of multiplications of randomly selected numbers that proceeds until a product greater than 50 is generated.
How can I obtain such a list where each element is a product and its two factors?
Here is the code I tried. I think it has a lot of problems.
import random
list1 = []
for i in range(0,1000):
    x = random.randint(1,10)
    list1.append(x)
list2 = []
for i in range(0,1000):
    y = random.randint(1,10)
    list2.append(y)
m=random.sample(list1,1)
n=random.sample(list2,1)
list3=[]
while list3[-1][-1]<50:
    c=[m*n]
    list3.append(m)
    list3.append(n)
    list3.append(c)
print(list3)
The output I want
[[5.12154, 4.94359, 25.3188], [1.96322, 3.46708, 6.80663], [9.40574, 2.28941, 21.5336], [4.61705, 9.40964, 43.4448], [9.84915, 3.0071, 29.6174], [8.44413, 9.50134, 80.2305]]
To be more descriptive:
[[randomfactor, randomfactor, product], ..., [randomfactor, randomfactor, greater than 50]]
Prepping two lists with 1000 numbers each with numbers from 1 to 10 in them is wasted memory. If that is just a simplification and you want to draw from lists you got otherwise, simply replace
a,b = random.choices(range(1,11),k=2)
by
a,b = random.choice(list1), random.choice(list2)
import random
result = []
while True:
    a,b = random.choices(range(1,11),k=2) # replace this one as described above
    c = a*b
    result.append( [a,b,c] )
    if c > 50:
        break
print(result)
Output:
[[9, 3, 27], [3, 5, 15], [8, 5, 40], [5, 9, 45], [9, 3, 27], [8, 5, 40], [8, 8, 64]]
If you need 1000 random ints between 1 and 10, do:
random_nums = random.choices(range(1,11),k=1000)
This is much faster than looping and appending single integers 1000 times.
Documentation:
random.choices(population, weights=None, *, cum_weights=None, k=1)
random.choice(seq)
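If the factors really have to come from two prepared lists, the same loop can be wrapped in a function. This is only a sketch: the name multiplication_examples and the limit and rng parameters are made up for illustration, and it assumes at least one pair of factors can exceed the limit, otherwise the loop never ends.

```python
import random

def multiplication_examples(list1, list2, limit=50, rng=random):
    """Draw factor pairs until a product exceeds `limit` (hypothetical helper)."""
    result = []
    while True:
        a, b = rng.choice(list1), rng.choice(list2)
        result.append([a, b, a * b])
        if a * b > limit:
            return result

rng = random.Random(0)  # seeded only to make the run repeatable
list1 = rng.choices(range(1, 11), k=100)
list2 = rng.choices(range(1, 11), k=100)
examples = multiplication_examples(list1, list2, rng=rng)
print(examples)
```

Every entry is [factor, factor, product], and only the last product is greater than the limit.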
