How to randomly select from list of lists - python-3.x

I have a list of lists in python as follows:
a = [[1,1,2], [2,3,4], [5,5,5], [7,6,5], [1,5,6]]
For example, how would I randomly select 3 of the 5 lists?
I tried numpy's random.choice but it does not work for lists.
Any suggestion?

numpy's random.choice doesn't work on a 2-d array, so one alternative is to use the length of the array to draw random indices and then fetch the elements at those indices. See the example below.
import numpy as np
random_count = 3 # number of random elements to find
a = [[1,1,2], [2,3,4], [5,5,5], [7,6,5], [1,5,6]] # 2-d data list
alist = np.array(a) # convert 2-d data list to numpy array
random_numbers = np.random.choice(len(alist), random_count) # draw random row indices (with replacement by default)
for item in random_numbers: # iterate over the random indices
    print(alist[item]) # print the random element at that index

You can use the random library like this:
a = [[1,1,2], [2,3,4], [5,5,5], [7,6,5], [1,5,6]]
import random
random.choices(a, k=3)
>>> [[1, 5, 6], [2, 3, 4], [7, 6, 5]]
You can read more about the random library at this official page https://docs.python.org/3/library/random.html.
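Note that random.choices draws with replacement, so the same inner list can come back more than once. If the three picks should be distinct, random.sample is one option. A minimal sketch, assuming a seeded generator (the seed 0 is arbitrary and only there to make runs repeatable):

```python
import random

a = [[1, 1, 2], [2, 3, 4], [5, 5, 5], [7, 6, 5], [1, 5, 6]]

rng = random.Random(0)  # assumption: seeded only so the run is repeatable

# sample() picks k distinct positions, so no inner list is returned twice
picked = rng.sample(a, k=3)
print(picked)
```

Drop the seed (or call random.sample directly) to get a different selection on every run.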

Related

Choose n random elements from every row of list of list

I have a list of lists L:
[
[1,2,3,4,5,6],
[10,20,30,40,50,60],
[11,12,113,4,15,6],
]
The inner lists are all the same size.
I want to choose n random elements from every row of L and output the result as a list of lists.
I tried the following code:
import random
import math
len_f=len(L)
index=[i for i in range(len_f)]
RANDOM_INDEX=random.sample(index, 5)
I am stuck at this point: how can I use the random indices to get the output from L?
The output for "2" random elements would be:
[
[1,6],
[10,60],
[11,6],
]
This is the result if the random function chose the 1st and 6th positions (indices 0 and 5).
random.sample could be leveraged. Adapt sample size k according to your needs.
In: import random
In: [random.sample(ls, k=3) for ls in L]
Out: [[1, 2, 6], [60, 10, 30], [4, 12, 15]]
It assumes the order of the picked elements doesn't matter.
Doc for random.sample for convenience: https://docs.python.org/3/library/random.html#random.sample
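The list comprehension above samples each row independently. If every row should use the same randomly chosen positions, as in the expected output in the question, one possible approach is to sample the column indices once and reuse them. A minimal sketch; the names n, cols and result are illustrative and not from the original posts:

```python
import random

L = [
    [1, 2, 3, 4, 5, 6],
    [10, 20, 30, 40, 50, 60],
    [11, 12, 113, 4, 15, 6],
]
n = 2  # how many elements to take from each row

# Pick n distinct column indices once; sorting keeps the original column order
cols = sorted(random.sample(range(len(L[0])), n))

# Apply the same indices to every row
result = [[row[i] for i in cols] for row in L]
print(cols, result)
```

This relies on the inner lists having the same size, which the question states.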

How to filter this type of data?

If I have some numpy arrays like
a = np.array([1,2,3,4,5])
b = np.array([4,5,7,8])
c = np.array([4,5])
I need to combine these arrays without repeating a number. My expected output is [1,2,3,4,5,7,8].
How do I combine them? Which function should I use?
Another approach you can try is reduce from functools together with union1d from numpy.
For example:
import numpy as np
from functools import reduce
reduce(np.union1d, (a, b, c))
Output -
array([1,2,3,4,5,7,8])
You can use numpy.concatenate with numpy.unique:
d = np.unique(np.concatenate((a,b,c)))
print(d)
Output:
[1 2 3 4 5 7 8]
Python has a datatype called set:
A set is an unordered collection with no duplicate elements
The easiest way to create a set out of your array would be unpacking your arrays into the set:
>>> import numpy as np
>>> a=np.array([1,2,3,4,5])
>>> b=np.array([4,5,7,8])
>>> c=np.array([4,5])
>>> {*a, *b, *c}
{1, 2, 3, 4, 5, 7, 8}
Please note that the set is unordered. This is not the right answer for you if the order of the elements in your array is important.
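If first-seen order does matter, one option is to deduplicate through dict keys, which keep insertion order in Python 3.7+. A minimal sketch; the names values and merged are illustrative:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([4, 5, 7, 8])
c = np.array([4, 5])

# tolist() converts the NumPy scalars to plain Python ints
values = np.concatenate((a, b, c)).tolist()

# dict keys are unique and remember the order in which they were first inserted
merged = list(dict.fromkeys(values))
print(merged)  # [1, 2, 3, 4, 5, 7, 8]
```

Unlike np.unique, this does not sort the result; it keeps each value at the position where it first appeared.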

How can I fill an array with sets

Say I have an image of 2x2 pixels named image_array, each pixel color is identified by a tuple of 3 entries (RGB), so the shape of image_array is 2x2x3.
I want to create an np.array c which has the shape 2x2x1 and whose last coordinate is an empty set.
I tried this:
import numpy as np
image = (((1,2,3), (1,0,0)), ((1,1,1), (2,1,2)))
image_array = np.array(image)
c = np.empty(image_array.shape[:2], dtype=set)
c.fill(set())
c[0][1].add(124)
print(c)
I get:
[[{124} {124}]
[{124} {124}]]
And instead I would like the return:
[[{} {124}]
[{} {}]]
Any idea ?
The object array has to be filled with separate set() objects. That means creating them individually, as I do with a list comprehension:
In [279]: arr = np.array([set() for _ in range(4)]).reshape(2,2)
In [280]: arr
Out[280]:
array([[set(), set()],
[set(), set()]], dtype=object)
That construction should highlight the fact that this array is closely related to a list, or list of lists.
Now we can do a set operation on one of those elements:
In [281]: arr[0,1].add(124) # more idiomatic than arr[0][1]
In [282]: arr
Out[282]:
array([[set(), {124}],
[set(), set()]], dtype=object)
Note that we cannot operate on more than one set at a time. The object array offers few advantages compared to a list.
This is a 2d array; the sets don't form a dimension. Contrast that with
In [283]: image = (((1,2,3), (1,0,0)), ((1,1,1), (2,1,2)))
...: image_array = np.array(image)
...:
In [284]: image_array
Out[284]:
array([[[1, 2, 3],
[1, 0, 0]],
[[1, 1, 1],
[2, 1, 2]]])
While it started with tuples, it made a 3d array of integers.
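The same idea can be applied to the shape taken from the image array: allocate an object array and assign a fresh set() to each cell. A minimal sketch, assuming a per-cell loop is acceptable for the array size involved:

```python
import numpy as np

image = (((1, 2, 3), (1, 0, 0)), ((1, 1, 1), (2, 1, 2)))
image_array = np.array(image)

# Object array with the 2x2 pixel grid shape; cells start out as None
c = np.empty(image_array.shape[:2], dtype=object)

# Give every cell its own, independent set
for idx in np.ndindex(c.shape):
    c[idx] = set()

c[0, 1].add(124)  # only this one cell changes
print(c)
```

Because each cell holds a different set object, adding to one no longer affects the others.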
Try this:
import numpy as np
x = np.empty((2, 2), dtype=object)
x[0, 0] = {1, 2, 3}
print(x)
[[{1, 2, 3} None]
 [None None]]
For non-numeric types in numpy you should use dtype=object (the np.object alias was removed in NumPy 1.24).
Whenever you do fill(set()), the array is filled with the very same set, because every entry refers to one object. To fix this, create a set if there isn't one yet at the moment you need to add to it:
c = np.empty(image_array.shape[:2], dtype=set)
if not c[0][1]:
    c[0,1] = set([124])
else:
    c[0,1].add(124)
print (c)
# [[None {124}]
# [None None]]
Try changing your line c[0][1].add to this.
c[0][1] = 124
print(c)

python3: find most k nearest vectors from a list?

Say I have a vector v1 and a list of vectors l1. I want to find the k vectors from l1 that are closest (most similar) to v1, in descending order of similarity.
I have a function sim_score(v1,v2) that will return a similarity score between 0 and 1 for any two input vectors.
Indeed, a naive way is to write a for loop over l1, calculate distance and store them into another list, then sort the output list. But is there a Pythonic way to do the task?
Thanks
import numpy as np
# assumes v1 and the entries of l1 are NumPy arrays; gives the 3 smallest distances
np.sort([np.sqrt(np.sum((l - v1) * (l - v1))) for l in l1])[:3]
Consider using scipy.spatial.distance module for distance computations. It supports the most common metrics.
import numpy as np
from scipy.spatial import distance
v1 = [[1, 2, 3]]
l1 = [[11, 3, 5],
[ 2, 1, 9],
[.1, 3, 2]]
# compute distances
dists = distance.cdist(v1, l1, metric='euclidean')
# sorted distances
sd = np.sort(dists)
Note that each parameter to cdist must be two-dimensional. Hence, v1 must be a nested list, or a 2d numpy array.
You may also use your homegrown metric like:
def my_metric(a, b, **kwargs):
    # some logic
dists = distance.cdist(v1, l1, metric=my_metric)
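To get the k vectors themselves rather than sorted distances, heapq.nlargest with a key function is one possible approach. A minimal sketch: the sim_score below is only a hypothetical stand-in for the asker's function (it maps Euclidean distance into (0, 1]), and k_most_similar is an illustrative name.

```python
import heapq
import math


def sim_score(v1, v2):
    """Hypothetical stand-in for the asker's similarity function."""
    return 1.0 / (1.0 + math.dist(v1, v2))


def k_most_similar(v1, l1, k):
    """Return the k vectors of l1 with the highest similarity to v1, best first."""
    return heapq.nlargest(k, l1, key=lambda v: sim_score(v1, v))


v1 = [1, 2, 3]
l1 = [[11, 3, 5], [2, 1, 9], [0.1, 3, 2]]
top2 = k_most_similar(v1, l1, k=2)
print(top2)  # [[0.1, 3, 2], [2, 1, 9]]
```

heapq.nlargest returns its results in descending order of the key, so no separate sort is needed.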

copy in numpy(python 3)

I've just learnt about copy, shallow copy, and deep copy in Python. I created a list b, then made c equal to b. I understand why the same elements share an identical id. I expected a similar result in numpy when following nearly the same steps; however, the same element shows a different id. I can't figure out how that happens in numpy.
You don't need a duplicate reference to produce the result.
import numpy as np
a = np.array([[10, 10], [2, 3], [4, 5]])
for x, y in zip(a, a):
    print(id(x), ',', id(y))
# 52949424 , 52949464
# 52949624 , 52951424
# 52949464 , 52949424
My guess is that when zip iterates over an array, numpy indexes it and builds a new array object for each row, so the two rows get different ids even though they refer to the same data. Remember that [] in numpy does not behave like [] on a list.
https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.indexing.html
You may try this to see why messing with id is not a good idea for numpy.
a[0] is a[0] # False
a[0] is a[[0]] # False
