Plotting distplots on one panel for different features with seaborn

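I have a dataframe with 12 different features, and I would like to plot a histogram for each of them in one go on a single panel (3 rows of 4 plots).
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
test = pd.DataFrame({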
'a': [10, 5, -2],
'b': [2, 3, 1],
'c': [10, 5, -2],
'd': [-10, -5, 2],
'aa': [10, 5, -2],
'bb': [2, 3, 1],
'cc': [10, 5, -2],
'dd': [-10, -5, 2],
'aaa': [10, 5, -2],
'bbb': [2, 3, 1],
'ccc': [10, 5, -2],
'ddd': [-10, -5, 2]
})
I can do it by writing something like the code below:
# plot
f, axes = plt.subplots(3, 4, figsize=(20, 10), sharex=True)
sns.distplot( test["a"] , color="skyblue", ax=axes[0, 0])
sns.distplot( test["b"] , color="olive", ax=axes[0, 1])
sns.distplot( test["c"] , color="teal", ax=axes[0, 2])
sns.distplot( test["d"] , color="grey", ax=axes[0, 3])
...
How can I loop and iterate through features in an elegant way instead? I'd like to assign the same four colors for each row.

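You can include everything in a for loop:
colors = ["skyblue", "olive", "teal", "grey"]
f, axes = plt.subplots(3, 4, figsize=(20, 10), sharex=True)
# axes.flatten() walks the 3x4 grid row by row, so i % 4 is the column index
# note: distplot is deprecated since seaborn 0.11; histplot(..., kde=True) is the closest replacement
for i, ax in enumerate(axes.flatten()):
    sns.distplot(test.iloc[:, i], color=colors[i % 4], ax=ax)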
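To see why `i % 4` gives every row the same four colors, the index arithmetic can be checked without drawing anything. The sketch below assumes the 3x4 grid from the answer and relies on flattening a 2D array of axes in row-major order; `grid_position` is a hypothetical helper introduced only for illustration.

```python
# Illustrative only: mimic the order in which axes.flatten() visits a 3x4 grid.
N_ROWS, N_COLS = 3, 4
colors = ["skyblue", "olive", "teal", "grey"]
features = ["a", "b", "c", "d", "aa", "bb", "cc", "dd", "aaa", "bbb", "ccc", "ddd"]


def grid_position(i, n_cols=N_COLS):
    """Return the (row, col) of the i-th axis in row-major order (hypothetical helper)."""
    return divmod(i, n_cols)


layout = []
for i, name in enumerate(features):
    row, col = grid_position(i)
    # The color depends only on the column, so it repeats on every row.
    layout.append((name, row, col, colors[i % N_COLS]))

for entry in layout:
    print(entry)
```

Every feature in the same column ends up with the same color, which is the "same four colors for each row" behavior asked for.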

Seaborn provides a FacetGrid for such purposes.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
test = pd.DataFrame({
'a': [10, 5, -2],
'b': [2, 3, 1],
'c': [10, 5, -2],
'd': [-10, -5, 2],
'aa': [10, 5, -2],
'bb': [2, 3, 1],
'cc': [10, 5, -2],
'dd': [-10, -5, 2],
'aaa': [10, 5, -2],
'bbb': [2, 3, 1],
'ccc': [10, 5, -2],
'ddd': [-10, -5, 2]
})
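# reshape to long format: one row per (feature, value) pair
data = pd.melt(test)
# features that share a first letter share a hue, so each grid row repeats the same four colors
data["hue"] = data["variable"].apply(lambda x: x[:1])
# one facet per feature, wrapped after every 4th column
g = sns.FacetGrid(data, col="variable", col_wrap=4, hue="hue")
g.map(sns.distplot, "value")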
plt.show()
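What `pd.melt` hands to FacetGrid may be easier to follow on a smaller frame. This is a minimal sketch, assuming a two-row subset of the columns above; `small` and `long_df` are names introduced here for illustration.

```python
import pandas as pd

# Smaller stand-in for the `test` frame (assumption: same column naming scheme).
small = pd.DataFrame({
    "a": [10, 5],
    "b": [2, 3],
    "aa": [10, 5],
    "bb": [2, 3],
})

# melt stacks all columns into two: "variable" (old column name) and "value".
long_df = pd.melt(small)

# Columns sharing a first letter get the same hue, hence the same color.
long_df["hue"] = long_df["variable"].str[0]

print(long_df)
```

Each original column becomes a block of rows in the long frame, and FacetGrid then draws one facet per distinct value of "variable".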

Related

How to pass a GPU tensor to ProcessPoolExecutor?

Here is my code to demonstrate:
import torch
import concurrent.futures
if __name__ == '__main__':
    x = torch.tensor([[1, 2, 3], [4, 5, 6]])
    args = [(i, x) for i in range(3)]
    with concurrent.futures.ProcessPoolExecutor(3) as executor:
        executor.map(print, args)
    x = x.to(torch.device('cuda'))
    args = [(i, x) for i in range(3)]
    print(args)
    with concurrent.futures.ProcessPoolExecutor(3) as executor:
        executor.map(print, args)
and the result:
(0, tensor([[1, 2, 3],
        [4, 5, 6]]))
(1, tensor([[1, 2, 3],
        [4, 5, 6]]))
(2, tensor([[1, 2, 3],
        [4, 5, 6]]))
[(0, tensor([[1, 2, 3],
        [4, 5, 6]], device='cuda:0')), (1, tensor([[1, 2, 3],
        [4, 5, 6]], device='cuda:0')), (2, tensor([[1, 2, 3],
        [4, 5, 6]], device='cuda:0'))]
(0, tensor([[0, 0, 0],
        [0, 0, 0]], device='cuda:0'))
(1, tensor([[0, 0, 0],
        [0, 0, 0]], device='cuda:0'))
(2, tensor([[0, 0, 0],
        [0, 0, 0]], device='cuda:0'))
As we can see, the args were passed as expected when the tensor was on the CPU, but not when it was on the GPU: in the child processes all the values came through as zeros.
What is the right way to do this? Or do I have to move the tensors to the GPU inside the child processes?

Merge and concatenate 2 dataframes at same time

I'm trying to concatenate two dataframes, keeping only those rows whose values in two columns are present in both dataframes. For example:
tp1 = pd.DataFrame(
{
'A': [1, 2, 3, 4],
'B': [5, 4, 2, 7],
'C': [2, 4, 9, 1],
'D': [1, 9, 7, 0]
})
tp2 = pd.DataFrame(
{
'A': [8, 2, 3, 9],
'B': [6, 4, 2, 4],
'C': [2, 9, 9, 1],
'D': [1, 9, 7, 0]
})
tpOUT = pd.DataFrame(
{
'A': [2, 2, 3],
'B': [4, 4, 2],
'C': [4, 9, 9],
'D': [9, 9, 7]
})
If tp1 and tp2 are the two dataframes, then tpOUT is the corresponding output. I googled and found that this can be done with pd.merge and pd.concat, but I can't get it to work.
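One way to read the requirement is: stack both frames, then keep only the rows whose (A, B) pair occurs in both. The sketch below follows that reading. That A and B are the two key columns is an assumption inferred from the sample data, and `key_cols`, `keys` and `out` are names introduced here.

```python
import pandas as pd

tp1 = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 4, 2, 7],
                    "C": [2, 4, 9, 1], "D": [1, 9, 7, 0]})
tp2 = pd.DataFrame({"A": [8, 2, 3, 9], "B": [6, 4, 2, 4],
                    "C": [2, 9, 9, 1], "D": [1, 9, 7, 0]})

key_cols = ["A", "B"]  # assumption: the two columns that must match

# (A, B) pairs present in both frames (inner merge on the key columns only).
keys = tp1[key_cols].merge(tp2[key_cols], on=key_cols).drop_duplicates()

# Stack both frames, keep rows with a shared key, drop exact duplicate rows.
out = (
    pd.concat([tp1, tp2], ignore_index=True)
    .merge(keys, on=key_cols)
    .drop_duplicates()
    .sort_values(["A", "B", "C", "D"])
    .reset_index(drop=True)
)

print(out)
```

The `drop_duplicates` call is what collapses the row (3, 2, 9, 7), which appears in both inputs, to a single row as in tpOUT.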

PyTorch unfold vs as_strided

It seems that PyTorch's unfold and as_strided do the same thing, except that with the former you cannot control the output tensor size directly.
import torch
import torch.nn as nn
x = torch.arange(0, 10)
x1 = x.unfold(0, 3, 1)
x2 = x.as_strided((8,3), (1,1))
print(f'x1 = {x1}')
print(f'x2 = {x2}')
output:
x1 = tensor([[0, 1, 2],
        [1, 2, 3],
        [2, 3, 4],
        [3, 4, 5],
        [4, 5, 6],
        [5, 6, 7],
        [6, 7, 8],
        [7, 8, 9]])
x2 = tensor([[0, 1, 2],
        [1, 2, 3],
        [2, 3, 4],
        [3, 4, 5],
        [4, 5, 6],
        [5, 6, 7],
        [6, 7, 8],
        [7, 8, 9]])
Is there any situation in which you should use unfold rather than as_strided, or vice versa?

How to add a column to the front of a NumPy array?

I want to add a column x0 of ones to the front of an existing NumPy array X of shape (10, 3), so that the resulting array X_new has shape (10, 4). I created x0 with shape (1, 10):
X = np.array([[1500,1,2],[1700,3,3],[2000,2,2],[2400,2,3],[2700,3,3],[3000,3,4],[3100,2,3],[3300,3,4],[3500,4,5],[3600,3,4]])
x0 = np.ones((1,np.shape(X)[0]))
Desired output:
X_new = np.array([[1,1500,1,2],[1,1700,3,3],[1,2000,2,2],[1,2400,2,3],[1,2700,3,3],[1,3000,3,4],[1,3100,2,3],[1,3300,3,4],[1,3500,4,5],[1,3600,3,4]])
I have tried np.concatenate and np.hstack, but I am not able to get the desired array.
Please help.
Thank you.

You are using the wrong shape for x0: it has to be a column of shape (10, 1), not a row of shape (1, 10). Once you fix that, you can use np.hstack:
import numpy as np
X = np.array([[1500,1,2],[1700,3,3],[2000,2,2],[2400,2,3],[2700,3,3],[3000,3,4],[3100,2,3],[3300,3,4],[3500,4,5],[3600,3,4]])
# match X's dtype so the result stays an integer array
x0 = np.ones((X.shape[0], 1), dtype=X.dtype)
x_new = np.hstack([x0, X])
x_new
array([[   1, 1500,    1,    2],
       [   1, 1700,    3,    3],
       [   1, 2000,    2,    2],
       [   1, 2400,    2,    3],
       [   1, 2700,    3,    3],
       [   1, 3000,    3,    4],
       [   1, 3100,    2,    3],
       [   1, 3300,    3,    4],
       [   1, 3500,    4,    5],
       [   1, 3600,    3,    4]])
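If the only goal is a leading column of ones, `np.insert` along axis 1 may be a shorter route, since it needs no separately shaped x0. A minimal sketch, using a three-row subset of X for brevity (the subset is a simplification made here, not part of the original data):

```python
import numpy as np

X = np.array([[1500, 1, 2], [1700, 3, 3], [2000, 2, 2]])

# Insert the scalar 1 before column 0 of every row; the dtype of X is kept.
X_new = np.insert(X, 0, 1, axis=1)

print(X_new.shape)
print(X_new)
```

Because the inserted value is cast to the dtype of X, the result stays an integer array without any explicit dtype argument.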

numpy assignment doesn't work

Suppose I have the following numpy.array:
In[]: x
Out[]:
array([[1, 2, 3, 4, 5],
       [5, 2, 4, 1, 5],
       [6, 7, 2, 5, 1]], dtype=int16)
In[]: y
Out[]:
array([[-3, -4],
       [-4, -1]], dtype=int16)
I want to replace a sub array of x by y and tried the following:
In[]: x[[0,2]][:,[1,3]]= y
Ideally, I wanted this to happen:
In[]: x
Out[]:
array([[ 1, -3,  3, -4,  5],
       [ 5,  2,  4,  1,  5],
       [ 6, -4,  2, -1,  1]], dtype=int16)
The assignment line doesn't give me any error, but when I check the output of x
In[]: x
I find that x hasn't changed, i.e. the assignment didn't happen.
How can I make that assignment, and why didn't it happen?

The "fancy indexing" x[[0,2]] returns a copy of the data, whereas indexing with slices returns a view. The assignment does happen, but x[[0,2]][:,[1,3]] = y writes into that temporary copy, which is then discarded, rather than into x.
Here we see that the indexing returns a copy:
>>> x[[0,2]]
array([[1, 2, 3, 4, 5],
       [6, 7, 2, 5, 1]], dtype=int16)
>>> x[[0,2]].base is x
False
>>> x[[0,2]][:, [1, 3]].base is x
False
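The same point can be checked without inspecting `.base`. The sketch below, using the x and y from the question, splits the chained expression into its two steps and shows that the write lands in a temporary array that shares no memory with x; `tmp`, `before` and `shares` are names introduced here.

```python
import numpy as np

x = np.array([[1, 2, 3, 4, 5],
              [5, 2, 4, 1, 5],
              [6, 7, 2, 5, 1]], dtype=np.int16)
y = np.array([[-3, -4],
              [-4, -1]], dtype=np.int16)
before = x.copy()

# Step 1 of the chained expression: fancy indexing allocates a new array.
tmp = x[[0, 2]]
shares = np.shares_memory(tmp, x)

# Step 2: the assignment lands in that new array, not in x.
tmp[:, [1, 3]] = y

print(shares)                     # whether tmp aliases x
print(np.array_equal(x, before))  # whether x is unchanged
print(tmp)
```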
Now you can use fancy indexing to set array values, but not when you nest the indexing.
You can use np.ix_ to generate the indices and perform the assignment:
>>> x[np.ix_([0, 2], [1, 3])]
array([[2, 4],
       [7, 5]], dtype=int16)
>>> np.ix_([0, 2], [1, 3])
(array([[0],
       [2]]), array([[1, 3]]))
>>> x[np.ix_([0, 2], [1, 3])] = y
>>> x
array([[ 1, -3,  3, -4,  5],
       [ 5,  2,  4,  1,  5],
       [ 6, -4,  2, -1,  1]], dtype=int16)
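You can also make it work with broadcasted fancy indexing (if that's even the term), but it's not pretty. Put the extra axis on the row indices so that they broadcast against the column indices in the same orientation as y; putting it on the column indices instead would write y transposed:
>>> x[np.array([0, 2])[:, None], [1, 3]] = y
>>> x
array([[ 1, -3,  3, -4,  5],
       [ 5,  2,  4,  1,  5],
       [ 6, -4,  2, -1,  1]], dtype=int16)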
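With broadcasting, the index array that carries the extra trailing axis decides whether the selection comes out in the same orientation as np.ix_ or transposed. Because the y in the question is symmetric, it cannot tell the two apart. The sketch below uses a non-symmetric `y2` (a hypothetical value introduced here) to make the difference visible.

```python
import numpy as np

x = np.array([[1, 2, 3, 4, 5],
              [5, 2, 4, 1, 5],
              [6, 7, 2, 5, 1]], dtype=np.int16)
y2 = np.array([[10, 20],
               [30, 40]], dtype=np.int16)  # deliberately not symmetric

rows = np.array([0, 2])
cols = np.array([1, 3])

via_ix = x.copy()
via_ix[np.ix_(rows, cols)] = y2

# Rows as a column vector, columns as a row vector: same layout as np.ix_.
rows_expanded = x.copy()
rows_expanded[rows[:, None], cols] = y2

# Extra axis on the columns instead: y2 is written transposed.
cols_expanded = x.copy()
cols_expanded[rows, cols[:, None]] = y2

print(np.array_equal(via_ix, rows_expanded))
print(np.array_equal(via_ix, cols_expanded))
```

For a general y, `np.ix_` or the rows-as-column form is therefore the safer choice.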
By the way, there is some interesting discussion at the moment on the NumPy Discussion mailing list on better support for "orthogonal" indexing so this may become easier in the future.
