Convert map object into numpy ndarray in python3 - python-3.x

The following code works well in Python 2, but after migrating to Python 3 it no longer works.
How do I change this code for Python 3?
for i, idx in enumerate(indices):
    user_id, item_id = idx
    feature_seq = np.array(map(lambda x: user_id, item_id))
    X[i, :len(item_id), :] = feature_seq  # ---- error here ----
error:
TypeError: int() argument must be a string, a bytes-like object or a number, not 'map'
Thank you.

In Python 3, map returns an iterator, not a list.
You can also try numpy.fromiter to build an array from a map object; note that this works only for 1-D data.
Example:
a = map(lambda x: x, range(10))
b = np.fromiter(a, dtype=int)  # np.int was removed in NumPy 1.24; the builtin int works
b
Output:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
For multidimensional arrays, refer to Reconcile np.fromiter and multidimensional arrays in Python
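For 2-D data, one common workaround (a sketch, not taken from the linked question) is to fill a flat array through the iterator and reshape afterwards:
import numpy as np

data = [[1, 2, 3], [4, 5, 6]]
# fromiter only builds 1-D arrays, so flatten first, then reshape
flat = np.fromiter((x for row in data for x in row), dtype=int)
arr = flat.reshape(len(data), -1)
# array([[1, 2, 3],
#        [4, 5, 6]])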

In PY3, map is like a generator. You need to wrap it in list() to produce a list that np.array can use, e.g.
np.array(list(map(...)))
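For example, applied to the question's snippet (the sample user_id and item_id here are made up for illustration, since the question does not show how indices is built):
import numpy as np

user_id, item_id = 7, [10, 11, 12]
# list() materializes the map object so np.array receives concrete values
feature_seq = np.array(list(map(lambda x: user_id, item_id)))
print(feature_seq)  # [7 7 7]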

Related

numpy array shows same id after changing values

I have created 2 numpy arrays:
Original
Cloned
Cloned is a copy of the original array. I changed an element in the cloned array, and I am comparing the ids of elements from the original and cloned arrays using the is keyword. It returns False. But when I print the id of both elements, it is the same.
I know about CPython's small-integer caching, where numbers from -5 to 256 are stored at the same address. But here I have changed the value to 400 (> 256), and it still shows the same id. WHY?
Please correct me if I am wrong; I am new to numpy arrays.
import numpy as np

original = np.array([
    [1, 2, 3, 4, 5],
    [6, 7, 9, 10, 11]
])

# Copying array "original" to "cloned"
cloned = original.copy()

# Changing an element of the cloned array
cloned[0, 1] = 400

print(id(cloned[0, 1]))
print(id(original[0, 1]))
print(id(cloned[0, 1]) is id(original[0, 1]))
Output:
140132171232408
140132171232408
False
The id is the same in both prints, yet is returns False.
The is keyword tests whether two variables refer to the same object, not whether they are equal. Use == instead:
print(id(cloned[0, 1]) == id(original[0, 1]))
# Returns True
Here id() happens to return the same number for both calls even though the objects are different: each indexing expression such as cloned[0, 1] creates a temporary scalar object, which is freed as soon as id() returns, so the next temporary can reuse the same memory address. And since id(...) is id(...) compares two large integers, which are distinct objects even when numerically equal, is returns False.
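A small sketch of the memory-reuse effect (the exact ids will differ per run):
import numpy as np

original = np.array([[1, 2, 3, 4, 5], [6, 7, 9, 10, 11]])
cloned = original.copy()
cloned[0, 1] = 400

# Keeping references to both scalars keeps them alive at the same time,
# so they can no longer share an address:
a = cloned[0, 1]
b = original[0, 1]
print(id(a) == id(b))  # False once both objects are alive simultaneously
print(a is b)          # False: two separate scalar objects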

Can we initialise a numpy array of numpy arrays with different shapes using some constructor?

I want an array that looks like this:
array([array([[1, 1], [2, 2]]), array([3, 3])], dtype=object)
I can make an empty array and then assign the elements one by one, like this:
z = [np.array([[1, 1], [2, 2]]), np.array([3, 3])]
x = np.empty(shape=2, dtype=object)
x[0], x[1] = z
I thought that if this is possible, then so should be x = np.array(z, dtype=object), but that gets me the error: ValueError: could not broadcast input array from shape (2,2) into shape (2).
So is the way given above the only way to make a ragged numpy array? Or is there a nice one-line constructor/function we can call to make the array x from above?
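For what it's worth, the two-step construction from the question can be wrapped in a small helper; the name ragged_object_array is just for illustration:
import numpy as np

def ragged_object_array(items):
    # Create the object array first, then fill it element by element,
    # which sidesteps np.array's attempt to broadcast the inner arrays.
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item
    return out

x = ragged_object_array([np.array([[1, 1], [2, 2]]), np.array([3, 3])])
# x is now the desired ragged object array of length 2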

How can I fill an array with sets

Say I have an image of 2x2 pixels named image_array; each pixel's color is identified by a tuple of 3 entries (RGB), so the shape of image_array is 2x2x3.
I want to create an np.array c which has the shape 2x2 and whose entries are empty sets.
I tried this:
import numpy as np

image = (((1, 2, 3), (1, 0, 0)), ((1, 1, 1), (2, 1, 2)))
image_array = np.array(image)

c = np.empty(image_array.shape[:2], dtype=set)
c.fill(set())
c[0][1].add(124)
print(c)
I get:
[[{124} {124}]
 [{124} {124}]]
And instead I would like the return:
[[{} {124}]
 [{} {}]]
Any idea?
The object array has to be filled with separate set() objects. That means creating them individually, as I do here with a list comprehension:
In [279]: arr = np.array([set() for _ in range(4)]).reshape(2, 2)
In [280]: arr
Out[280]:
array([[set(), set()],
       [set(), set()]], dtype=object)
That construction should highlight the fact that this array is closely related to a list, or a list of lists.
Now we can do a set operation on one of those elements:
In [281]: arr[0,1].add(124) # more idiomatic than arr[0][1]
In [282]: arr
Out[282]:
array([[set(), {124}],
       [set(), set()]], dtype=object)
Note that we cannot operate on more than one set at a time. The object array offers few advantages compared to a list.
This is a 2d array; the sets don't form a dimension. Contrast that with
In [283]: image = (((1,2,3), (1,0,0)), ((1,1,1), (2,1,2)))
     ...: image_array = np.array(image)
     ...:
In [284]: image_array
Out[284]:
array([[[1, 2, 3],
        [1, 0, 0]],

       [[1, 1, 1],
        [2, 1, 2]]])
While it started with tuples, it made a 3d array of integers.
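A related idiom, not from the answer above but worth noting, is np.frompyfunc, which calls a Python function once per element and therefore also produces an independent set in every cell:
import numpy as np

# The lambda ignores its input and returns a fresh set for every element
fill_sets = np.frompyfunc(lambda _: set(), 1, 1)
c = fill_sets(np.empty((2, 2), dtype=object))
c[0, 1].add(124)
print(c)
# [[set() {124}]
#  [set() set()]]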
Try this:
import numpy as np

x = np.empty((2, 2), dtype=object)
x[0, 0] = {1, 2, 3}  # note: set(1, 2, 3) is a TypeError; set() takes a single iterable
print(x)
[[{1, 2, 3} None]
 [None None]]
For non-number types in numpy you should use dtype=object (the np.object alias is deprecated and was removed in NumPy 1.24).
Whenever you do fill(set()), it fills the array with exactly the same set, since every cell refers to the one set object. To fix this, just make a set if there isn't one already, each time you need to add to the set:
c = np.empty(image_array.shape[:2], dtype=set)
if not c[0][1]:
    c[0, 1] = set([124])
else:
    c[0, 1].add(124)
print(c)
# [[None {124}]
#  [None None]]
Try changing your line c[0][1].add(124) to this:
c[0][1] = 124
print(c)

Trying to understand the following generator in python

I am trying to understand the difference between the following two code snippets. The second one just prints the generator object, while the first expands it and iterates over the generator. Why does this happen?
Is it because the square brackets expand any iterable object?
# Code snippet 1
li = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
for col in range(0, 3):
    print([row[col] for row in li])
Output:
[1, 4, 7]
[2, 5, 8]
[3, 6, 9]
# Code snippet 2
li = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
for col in range(0, 3):
    print(row[col] for row in li)
Output:
<generator object <genexpr> at 0x7f1e0aef55c8>
<generator object <genexpr> at 0x7f1e0aef55c8>
<generator object <genexpr> at 0x7f1e0aef55c8>
Why is the output of the two snippets above different?
The print function outputs the return value of the __str__ method of each of its arguments. For lists, __str__ returns a nicely formatted string of comma-delimited item values enclosed in square brackets; for generator objects, __str__ returns only generic object information, so as to avoid altering the state of the generator.
By putting the expression in square brackets you are using a list comprehension, which explicitly builds a list by iterating through the output of the generator expression. Since the items have already been produced, the list's __str__ method has no problem returning their values.
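For example, wrapping the same generator expression in list() makes the values visible:
li = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
for col in range(3):
    # Materializing the generator lets print show the values:
    print(list(row[col] for row in li))
# [1, 4, 7]
# [2, 5, 8]
# [3, 6, 9]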

How to make a tuple including a numpy array hashable?

One way to make a numpy array hashable is setting it to read-only. This has worked for me in the past. But when I use such a numpy array in a tuple, the whole tuple is no longer hashable, which I do not understand. Here is the sample code I put together to illustrate the problem:
import numpy as np

npArray = np.ones((1, 1))
npArray.flags.writeable = False
print(npArray.flags.writeable)

keySet = (0, npArray)
print(keySet[1].flags.writeable)

myDict = {keySet: 1}
First I create a simple numpy array and set it to read-only. Then I add it to a tuple and check if it is still read-only (which it is).
When I want to use the tuple as a key in a dictionary, I get the error TypeError: unhashable type: 'numpy.ndarray'.
Here is the output of my sample code:
False
False
Traceback (most recent call last):
  File "test.py", line 10, in <module>
    myDict = {keySet : 1}
TypeError: unhashable type: 'numpy.ndarray'
What can I do to make my tuple hashable and why does Python show this behavior in the first place?
You claim that
One way to make a numpy array hashable is setting it to read-only
but that's not actually true. Setting an array to read-only just makes it read-only. It doesn't make the array hashable, for multiple reasons.
The first reason is that an array with the writeable flag set to False is still mutable. First, you can always set writeable=True again and resume writing to it, or do more exotic things like reassign its shape even while writeable is False. Second, even without touching the array itself, you could mutate its data through another view that has writeable=True.
>>> x = numpy.arange(5)
>>> y = x[:]
>>> x.flags.writeable = False
>>> x
array([0, 1, 2, 3, 4])
>>> y[0] = 5
>>> x
array([5, 1, 2, 3, 4])
Second, for hashability to be meaningful, objects must first be equatable - == must return a boolean, and must be an equivalence relation. NumPy arrays don't do that: == on arrays returns an array of element-wise comparisons, not a boolean. The purpose of hash values is to quickly locate equal objects, but when your objects don't even have a built-in notion of equality, there's not much point to providing hashes.
You're not going to get hashable tuples with arrays inside. You're not even going to get hashable arrays. The closest you can get is to put some other representation of the array's data in the tuple.
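A sketch of that idea, assuming the raw bytes of the array are an acceptable stand-in for the array itself (shape and dtype are included so arrays with equal bytes but different layouts don't collide):
import numpy as np

npArray = np.ones((1, 1))
# Tuples of ints, bytes, and strings are hashable, so this works as a dict key
key = (0, npArray.tobytes(), npArray.shape, str(npArray.dtype))
myDict = {key: 1}
print(myDict[key])  # 1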
The fastest way to hash a numpy array is likely via its raw bytes, e.g. tobytes() (the older tostring() is a deprecated alias for it):
In [11]: %timeit hash(y.tobytes())
What you could do is, rather than use a tuple, define a class:
class KeySet(object):
    def __init__(self, i, arr):
        self.i = i
        self.arr = arr

    def __hash__(self):
        # Hash the array through its raw bytes; note that without a matching
        # __eq__, dict lookups will only match this exact KeySet instance.
        return hash((self.i, self.arr.tobytes()))
Now you can use it in a dict:
In [21]: ks = KeySet(0, npArray)
In [22]: myDict = {ks: 1}
In [23]: myDict[ks]
Out[23]: 1
