Seaborn and mplcursors - python-3.x

I have some data that I want to plot on a scatter chart, and display the associated label for each point. The data looks like
xlist=[1,2,3,4]
ylist=[2,3,4,5]
labels=['a', 'b', 'c', 'd']
I can plot the points with Seaborn, and I tried mplcursors, but the displayed labels are the x and y coordinates instead of the entries in labels.
sns.scatterplot(x=xlist, y=ylist)
mplcursors.cursor(hover=True)
How can I make it display the labels, instead of (x, y)?

The mplcursors documentation has an example for exactly this: connect a callback to the cursor's "add" event and set the annotation text there. Adapted to your data:
import matplotlib.pyplot as plt
import seaborn as sns
import mplcursors
xlist=[1,2,3,4]
ylist=[2,3,4,5]
labels=['a', 'b', 'c', 'd']
sns.scatterplot(x=xlist, y=ylist)
cursor = mplcursors.cursor(hover=True)
# sel.index is the position of the hovered point (older mplcursors releases: sel.target.index)
cursor.connect(
    "add", lambda sel: sel.annotation.set_text(labels[sel.index]))
plt.show()
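If the annotation should show more than the bare label, the text can be built in a small helper and the callback kept as a named function. This is only a sketch: annotation_text and on_add are hypothetical names, and it assumes sel.index is an integer position, which is the case for scatter plots in recent mplcursors releases.

```python
xlist = [1, 2, 3, 4]
ylist = [2, 3, 4, 5]
labels = ['a', 'b', 'c', 'd']


def annotation_text(index):
    """Return the text to display for the point at position `index`."""
    return f"{labels[index]} (x={xlist[index]}, y={ylist[index]})"


def on_add(sel):
    # Assumption: sel.index is the integer position of the hovered point.
    sel.annotation.set_text(annotation_text(sel.index))


print(annotation_text(0))
```

To use it, pass the function instead of the lambda: cursor.connect("add", on_add).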

Related

Pandas Series of dates to vlines kwarg in mplfinance plot

import numpy as np
import pandas as pd
df = pd.DataFrame({'dt': ['2021-2-13', '2022-2-15'],
'w': [5, 7],
'n': [11, 8]})
df.reset_index()
print(list(df.loc[:,'dt'].values))
gives: ['2021-2-13', '2022-2-15']
NEEDED: [('2021-2-13'), ('2022-2-15')]
Important (in reply to a question in the comments): the "NEEDED" form is what mplfinance accepts as the vlines argument of plot (I checked). I need to draw vertical lines at the specified dates on the x-axis of the chart.
import mplfinance as mpf
RES['Date'] = RES['Date'].dt.strftime('%Y-%m-%d')
my_vlines=RES.loc[:,'Date'].values # does NOT work
fig, axlist = mpf.plot( ohlc_df, type="candle", vlines= my_vlines, xrotation=30, returnfig=True, figsize=(6,4))
This only works with an explicit list such as my_vlines= [('2022-01-18'), ('2022-02-25')]
SOLVED: it turns out to be simple after all:
my_vlines=list(RES.loc[:,'Date'].values)
Your question asks for a list of NumPy arrays, but your desired output looks like tuples. Note that ('2021-2-13') is just a string in parentheses: it is the comma that makes a tuple, not the parentheses. If you need tuples, you would do something like this:
desired_format = [(x,) for x in list(df.loc[:,'dt'].values)]
If you want NumPy arrays, you could do this:
desired_format = [np.array(x) for x in list(df.loc[:,'dt'].values)]
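The point about the comma can be checked directly in plain Python. The sketch below uses the dates from the question and shows that the "NEEDED" form is the same object as an ordinary list of strings.

```python
parenthesized = ('2021-2-13')   # parentheses only: still a str
one_tuple = ('2021-2-13',)      # trailing comma: a one-element tuple

plain_list = ['2021-2-13', '2022-2-15']
needed = [('2021-2-13'), ('2022-2-15')]

print(type(parenthesized).__name__)  # str
print(type(one_tuple).__name__)      # tuple
print(needed == plain_list)          # True
```

So converting the column to a plain list of strings already produces the "NEEDED" form.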
I think I understand your problem. Please see the example code below and let me know if this resolves your problem. I expanded on your dataframe to meet mplfinance plot criteria.
import pandas as pd
import numpy as np
import mplfinance as mpf
df = pd.DataFrame({'dt': ['2021-2-13', '2022-2-15'],'Open': [5,7],'Close': [11, 8],'High': [21,30],'Low': [7, 3]})
df['dt']=pd.to_datetime(df['dt'])
df.set_index('dt', inplace = True)
mpf.plot(df, vlines = dict(vlines = df.index.tolist()))
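The failing and the working call in the question differ only in the container type that is passed as vlines. A minimal sketch of that difference, using pandas and NumPy only; it does not call mplfinance, and whether every mplfinance version rejects arrays is not something it checks.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'dt': ['2021-2-13', '2022-2-15']})

as_array = df['dt'].to_numpy()   # NumPy array, like the failing .values call
as_list = df['dt'].tolist()      # plain Python list of str

# Parsed dates, as in the answer above that passes Timestamps to vlines
as_timestamps = pd.to_datetime(df['dt']).tolist()

print(type(as_array).__name__, type(as_list).__name__)
print(as_list)
print(as_timestamps)
```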

Group Box Plots for different numerical variables in one figure

I have a data frame with several numerical variables, and I would like to create a box plot for each variable and group them in one figure. Each variable should have its own box plot, and all of these box plots should be in one figure. How can I do that in Seaborn or Matplotlib?
Thank you very much!
Yes, you can do this with seaborn:
import numpy as np
import pandas as pd
import seaborn as sns
df = pd.DataFrame(np.random.rand(100,4), columns=list('ABCD'))
num_col_list = ['A','B','C','D']
sns.boxplot(data=df.melt(value_vars=num_col_list),
x='variable', y='value')
Or with just pandas/matplotlib:
df.boxplot(column=num_col_list)
If you use a pandas data frame you can use the boxplot function:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.randn(10, 4),columns=['Col1', 'Col2', 'Col3', 'Col4'])
df.boxplot(column=['Col1', 'Col2', 'Col3'])
plt.show()
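The seaborn version relies on df.melt to reshape the wide frame into long form, with one row per (variable, value) pair, so that 'variable' can go on the x axis. A small sketch of what that intermediate frame looks like; the seeded generator and the name long_df are choices made here, not part of the answers.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((100, 4)), columns=list('ABCD'))

# Wide (100 rows x 4 columns) to long (400 rows x 2 columns)
long_df = df.melt(value_vars=['A', 'B', 'C', 'D'])

print(long_df.columns.tolist())               # ['variable', 'value']
print(long_df.shape)                          # (400, 2)
print(long_df['variable'].value_counts().sort_index())
```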

Unable to customize labels and legend in Seaborn python

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(style="darkgrid")
df = pd.read_csv('Leap_Static_trials.csv')
Length = sns.swarmplot(x='name', y= 'length', data= df, color = 'green')
Width = sns.swarmplot(x='name', y= 'width', data= df, color = 'red')
plt.legend(labels=['Length','Width'])
plt.show()
From my dataset df I am plotting the length and width of the fingers taken from Leap Motion Controller. I am unable to change the legend to include the second color (red) which signifies the width.
Please find the attached figure as well. Your help is much appreciated. :)
Adding the parameter label= to a plot command usually creates the legend handles and labels automatically. In this case, seaborn creates handles for each column (so 5 of each). A trick is to create the legend with only the first and the last of the handles and the labels.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(style="darkgrid")
N = 100
# df = pd.read_csv('Leap_Static_trials.csv')
names = list('abcde')
ax = plt.gca()
df = pd.DataFrame({'name': np.random.choice(names, N),
'length': np.random.normal(50, 0.7, N),
'width': np.random.normal(20, 0.5, N)})
Length = sns.swarmplot(x='name', y='length', data=df, color='green', label='Length', order=names, ax=ax)
Width = sns.swarmplot(x='name', y='width', data=df, color='red', label='Width', order=names, ax=ax)
handles, labels = ax.get_legend_handles_labels()
plt.legend([handles[0], handles[-1]], [labels[0], labels[-1]])
plt.show()
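An alternative that does not depend on how many handles seaborn creates is to build the two legend entries by hand from proxy artists. This is a different technique from the handle-picking trick above, shown as a sketch with matplotlib only; the swarmplots themselves would be drawn on ax without label=.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to view the figure
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

fig, ax = plt.subplots()
# ... draw the two swarmplots on `ax` here, without label= ...

# Proxy artists: marker-only lines that exist just to populate the legend
legend_handles = [
    Line2D([], [], marker='o', linestyle='', color='green', label='Length'),
    Line2D([], [], marker='o', linestyle='', color='red', label='Width'),
]
legend = ax.legend(handles=legend_handles)

legend_texts = [text.get_text() for text in legend.get_texts()]
print(legend_texts)  # ['Length', 'Width']
```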

How do I make my plot look like this with matplotlib?

So right now I'm trying to simulate a Poisson process for an assignment, here's the code so far:
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
y = np.arange(0,21,1)
x = np.cumsum(np.random.exponential(2,21))
print(y)
print(x)
sns.set()
plt.plot(x,y)
plt.show()
The problem arises when I try plotting it. The code above, as expected, produces a normal matplotlib plot that looks like this:
However I need it to look like this:
Is there an easy way of doing it? I tried messing with bar plots but was unable to produce something that looks good.
The graph you want is called a step plot in matplotlib. To draw it, replace plt.plot(x,y) with plt.step(x,y).
So, your code becomes:
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
y = np.arange(0,21,1)
x = np.cumsum(np.random.exponential(2,21))
print(y)
print(x)
sns.set()
plt.step(x,y)
plt.show()
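One detail worth checking for a counting process is where the steps change. By default plt.step uses where='pre', while a count that jumps at each arrival and then stays flat until the next one corresponds to where='post'. The sketch below makes that choice explicit and also starts the process at time 0; both are assumptions about what the assignment wants, and the seeded generator and variable names are chosen here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to view the figure
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
n_events = 20

# Arrival times: start at t = 0, then cumulative sums of exponential gaps (mean 2)
arrival_times = np.concatenate(([0.0], np.cumsum(rng.exponential(2, n_events))))
counts = np.arange(n_events + 1)  # 0 events at t = 0, then 1, 2, ...

fig, ax = plt.subplots()
# where="post": the count jumps at each arrival and holds until the next one
(line,) = ax.step(arrival_times, counts, where="post")
ax.set_xlabel("time")
ax.set_ylabel("number of events")
```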

Matplotlib warning using pandas.DataFrame.plot.scatter()

On Windows 10, with these versions:
Python 3.5.2, pandas 0.23.4, matplotlib 3.0.0, numpy 1.15.2,
the following code gives me a warning that I would like to resolve:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.cm as cm
# a 5x4 random pandas DataFrame
pf = pd.DataFrame(np.random.random((5,4)), columns=['a', 'b', 'c', 'd'])
# colors:
colors = cm.rainbow(np.linspace(0, 1, 4))
# plot.scatter returns a matplotlib Axes, which is reused for the remaining columns
fig1 = pf.plot.scatter('a', 'b', color='k')
for i, j in enumerate(['b', 'c', 'd']):
    pf.plot.scatter('a', j, color=colors[i+1], ax = fig1)
And I get a warning:
'c' argument looks like a single numeric RGB or RGBA sequence, which
should be avoided as value-mapping will have precedence in case its
length matches with 'x' & 'y'. Please use a 2-D array with a single
row if you really want to specify the same RGB or RGBA value for all
points.
Could you tell me how to address that warning?
I can't reproduce the warning with matplotlib 3.0 and pandas 0.23.4, but what it says is essentially that you should not use a single RGB tuple to specify a color.
So instead of color=colors[i+1] use
color=[colors[i+1]]
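The reason the wrapping helps is the shape of the value: a single RGBA colour is a 1-D sequence of four numbers, which could be mistaken for four values to be colour-mapped when there are also four points, whereas a 2-D array with one row is unambiguous. A small sketch of the two shapes, using matplotlib's scatter directly rather than the pandas wrapper; the data points are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to view the figure
import matplotlib.pyplot as plt
import numpy as np

colors = plt.get_cmap("rainbow")(np.linspace(0, 1, 4))  # four RGBA rows

single = colors[1]                # 1-D: one RGBA colour, four numbers
wrapped = np.atleast_2d(single)   # 2-D with a single row, same as [colors[1]]

print(single.shape)   # (4,)
print(wrapped.shape)  # (1, 4)

# Four points, so a bare 4-element colour would be ambiguous here
x = np.array([0.1, 0.2, 0.3, 0.4])
y = np.array([0.4, 0.1, 0.3, 0.2])

fig, ax = plt.subplots()
points = ax.scatter(x, y, c=wrapped)  # one colour for all four points
```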
