Generating different marker shapes in plotly/cufflinks - python-3.x

This post is similar to this one (Change Marker Shapes in Plotly .js), but I can't seem to get anything to work in Python. I am trying to make a multi-line graph (which I have done in both plt and plotly; code below). Being colorblind, I often can't tell what I am looking at in plotly because the markers are always circles. The label is included, but it sometimes gets cut off when it is too long, and then I can't figure out what I'm looking at. The plotly/cufflinks graphs are much better in terms of being interactive, and since I do a lot of data presentations, this will be my preferred method going forward if I can figure out how to change the markers for each line.
I am using Jupyter Notebook (version: 5.4.0) and Python (version 3.6.4)
[Screenshot of the dummy_data file]
In matplotlib, I did the following to get the output attached (note the different shape markers):
import matplotlib.pyplot as plt
import matplotlib as mpl ##(version: 2.1.2)
import pandas as pd ##(version: 0.22.0)
import numpy as np ##(version: 1.14.0)
import plotly.graph_objs as go
from plotly.offline import download_plotlyjs,init_notebook_mode,plot,iplot
import cufflinks as cf ##(version: 0.12.1)
init_notebook_mode(connected=True)
cf.go_offline()
%matplotlib notebook
df = pd.read_csv(r"desktop\dummy_data.csv")
fx = df.groupby(['studyarm', 'visit'])\
    ['totdiffic_chg'].mean().unstack('studyarm').drop(['02_UNSCH','ZEOS'])
# Keep every named marker except 'nothing', ticks, and carets
valid_markers = ([item[0] for item in
                  mpl.markers.MarkerStyle.markers.items() if
                  item[1] != 'nothing' and not item[1].startswith('tick')
                  and not item[1].startswith('caret')])
markers = np.random.choice(valid_markers, df.shape[1], replace=False)
ax = fx.plot(kind = 'line', linestyle='-')
for i, line in enumerate(ax.get_lines()):
    line.set_marker(markers[i])
ax.legend(loc='best')
ax.set_xticklabels(df.index, rotation=45)
plt.title('Some Made Up Data')
plt.ylabel('Score', fontsize=14)
plt.autoscale(enable=True, axis='x', tight=True)
plt.tight_layout()
[Image: matplotlib output for the dummy data, with a different marker shape per line]
I used the code below and it created the graph via plotly/cufflinks:
fx.iplot(kind='line', yTitle='Score', title='Some Made Up Data',
         mode='lines+markers', filename='cufflinks/simple-line')
[Image: plotly/cufflinks output for the dummy data, with circle markers on every line]
I have searched the web for the last few days and I can see many options to change the marker color, opacity, etc., etc., but I can't seem to figure out a way to automatically and randomly change the shape of the markers OR to manually change each individual line to a separate marker shape.
I am sure this is a simple fix, but I can't figure it out. Any help (or nudge in the right direction) would be very much appreciated!

You can specify the shape for scatter plots using the symbol property, like below:
Scatter(x = ..., y = ..., mode = 'lines+markers',
        marker = dict(size = 10, symbol = 1, ...))
For example:
0 gives circles
1 gives squares
3 gives '+' signs
5 gives triangles, etc.
Have a look at the 'symbol' entry in Plotly's doc here: https://plot.ly/python/reference/#box-marker-symbol
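As a rough sketch of how the symbol property could be applied per line, the snippet below builds one scatter trace per series as a plain dict and cycles through a few symbol names. The series data and the build_traces helper are made up for illustration; the symbol names are the named equivalents of symbol numbers 0 to 5.

```python
from itertools import cycle

# Named equivalents of plotly symbol numbers 0-5.
SYMBOLS = ["circle", "square", "diamond", "cross", "x", "triangle-up"]


def build_traces(series, x):
    """Return one scatter-trace dict per series, each with its own marker symbol.

    `series` maps a line name to its y values (hypothetical data layout).
    """
    traces = []
    for (name, y), symbol in zip(series.items(), cycle(SYMBOLS)):
        traces.append(dict(
            type="scatter",
            x=list(x),
            y=list(y),
            name=name,
            mode="lines+markers",
            marker=dict(size=10, symbol=symbol),
        ))
    return traces


# Made-up data standing in for the grouped means per study arm.
visits = ["V1", "V2", "V3"]
series = {"arm_A": [0.0, -1.2, -2.0], "arm_B": [0.0, -0.4, -0.9]}
traces = build_traces(series, visits)
```

In a notebook, the resulting list could then be drawn with something like iplot(dict(data=traces, layout=dict(title='Some Made Up Data'))). Different shapes combined with different colours should keep the lines distinguishable even when the legend text is cut off.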

Related

Problem filling the colour between two index values

I have time series data in timeseries.txt. First I select an index value (here 50) and put a red line mark at that index. I then want to highlight the portion before (idx-20) and after (idx+20) the red line on the time series.
I wrote the code below. I am able to put the red line mark on the time series, but fill_betweenx does not work. I hope someone can help me overcome this problem. Thanks.
import matplotlib.pyplot as plt
import numpy as np
input_data=np.loadtxt("timeseries.txt")
time=np.arange(len(input_data))
plt.plot(time,input_data)
idx = [50]
mark = [time[i] for i in idx]
plt.plot(idx,[input_data[i] for i in mark], marker="|",color='red',markerfacecolor='none',mew=0.4,ms=30,alpha=2.0)
plt.fill_betweenx(idx-20,idx+20 alpha=0.25,color='lightsteelblue')
plt.show()
If you are looking for just a semi-transparent rectangle, you can use patches.Rectangle to draw one; see the matplotlib.patches.Rectangle documentation. I have updated your code to add a rectangle. See if this meets your requirement. I have used a sine wave as I didn't have your data.
import matplotlib.pyplot as plt
import numpy as np
## Create sine wave
x = np.arange(100)
input_data=np.sin(2*np.pi*3*x/100)
time=np.arange(len(input_data))
plt.plot(time,input_data)
idx = [50]
mark = [time[i] for i in idx]
# alpha must lie between 0 and 1; newer matplotlib versions reject the original alpha=2.0
plt.plot(idx,[input_data[i] for i in mark], marker="|", color='red', markerfacecolor='none', mew=0.4,ms=30,alpha=1.0)
# Create a Rectangle patch covering idx-20 to idx+20
from matplotlib.patches import Rectangle
plt.gca().add_patch(Rectangle((idx[0]-20, -0.15), 40, .3, facecolor = 'lightsteelblue',fill=True,alpha=0.25, lw=0))
plt.show()
EDIT
Please refer to the Rectangle documentation mentioned earlier in this answer. You will need to adjust the start coordinates (x, y) and the width and height to make the Rectangle as big or small as you need. For example, changing the rectangle code like this...
plt.gca().add_patch(Rectangle((idx[0]-10, -0.40), 20, 0.8, facecolor = 'lightsteelblue',fill=True,alpha=0.25, lw=0))
will give you this plot.
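If the highlight should cover the full height of the plot rather than a fixed rectangle, axvspan is another option. This is a different technique from the Rectangle patch above, sketched here on the same kind of sine wave; the half_width name is just an illustrative variable.

```python
import matplotlib.pyplot as plt
import numpy as np

# Sine wave standing in for the real time series
time = np.arange(100)
input_data = np.sin(2 * np.pi * 3 * time / 100)

idx = 50         # selected index
half_width = 20  # how far the highlight extends on each side

fig, ax = plt.subplots()
ax.plot(time, input_data)
ax.axvline(idx, color="red", lw=0.8)  # red line at the selected index
# Shade idx-20 .. idx+20 over the full height of the axes
span = ax.axvspan(idx - half_width, idx + half_width,
                  alpha=0.25, color="lightsteelblue")
```

Call plt.show() (or fig.savefig with a file name of your choice) afterwards to view the result. Because the span is defined in data coordinates along x only, it does not need adjusting when the y range of the data changes.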

How to get error bars on barchart PowerBI?

I want a bar chart like this:
The error bar on each column should show the dispersion (I have it calculated in one of the columns), and the lines on top show whether there is a significant difference. So far I have only achieved this graph:
I am using the simple clustered bar chart in Power BI Desktop. Is there another visual for this, or another program that could do it? Maybe Python somehow?
As mentioned here, you can do that with matplotlib in Python. For example:
import numpy as np
import matplotlib.pyplot as plt
data = np.array(np.random.rand(1000))
y,binEdges = np.histogram(data,bins=10)
bincenters = 0.5*(binEdges[1:]+binEdges[:-1])
menStd = np.sqrt(y)
width = 0.05
plt.bar(bincenters, y, width=width, color='r', yerr=menStd)
plt.show()
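Since the question already has the dispersion calculated in a column, a sketch closer to that situation might pass the precomputed values straight to yerr. The category names and numbers below are invented, and the significance lines from the target chart are not covered here.

```python
import matplotlib.pyplot as plt

# Hypothetical values: one mean and one dispersion (e.g. standard deviation) per category
categories = ["A", "B", "C"]
means = [4.2, 5.1, 3.8]
dispersion = [0.4, 0.6, 0.3]

fig, ax = plt.subplots()
# yerr draws a symmetric error bar of +/- dispersion on top of each bar
bars = ax.bar(categories, means, yerr=dispersion, capsize=5, color="steelblue")
ax.set_ylabel("Mean value")
```

Add plt.show() to display the chart. With real data, the three lists would come from the corresponding columns of the table.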

Python: Pickle.load function returns the correct 3d-scatter plot, but is not interactive anymore

This is my first question here, so let me know if I should make any improvements regarding, e.g., the formulation of the question, the code, and so on.
I am creating several 3-D scatter plots in Python and want to save them for later reuse and comparison. I am using Qt5 as the graphics backend in Spyder, which perfectly displays my interactive 3-D scatter plot (I can rotate around the axes and flip the plot) using the original code.
Now I am able to successfully save the created plot and also load it in a new script, which opens the plot in Qt5 as well. But somehow the interactivity is gone, meaning I am not able to rotate around the axes and flip the plot anymore.
I was unable to find any guidance on this issue or anyone with a similar problem. Do you have any idea? I'll put the relevant part of my sample code below:
""" First script """
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import pandas as pd
import pickle
testdf = pd.DataFrame({"X" : x, "Y" : y, "Z" : z}) #x and y are the criteria, z the values, stored as lists
# Create 3d scatter plot
fig = plt.figure(figsize=(12, 12))
ax = fig.add_subplot(111, projection="3d")
ax.scatter(x, y, z, c=z, marker="o")
ax.set_xlabel("Initial Notional Cluster")
ax.set_ylabel("Laufzeit in Month Cluster")
ax.set_zlabel("Vol. Weighted Margin")
plt.show()
# Save the figure object as binary file
file = open(r"Init_Lfz_VolWeightedMargin.pkl", "wb")
pickle.dump(fig, file)
file.close()
""" Second script """
import pickle
import matplotlib.pyplot as plt
with open(r"Init_Lfz_VolWeightedMargin.pkl", "rb") as file:
    figx = pickle.load(file)
plt.show()
Any idea why the interactivity is gone? According to the pickle documentation and other users' reports, this should not happen.
Many thanks.

Can't get the legend to show correctly on the chart

My legend is showing top right, but rather than stating AAPL and IBM it shows a single letter. I can't figure out what's wrong.
import quandl
import pandas as pd
import matplotlib.pyplot as plt
def get_mean_volume(symbol):
    df = quandl.get("YAHOO/"+str(symbol))[::-1]
    return df[['High', 'Adjusted Close']]
stock = ['AAPL', 'IBM']
for s in stock:
    plt.plot(get_mean_volume(s))
    plt.legend(s)
plt.ylabel('Price')
plt.xlabel('Date')
This is from the matplotlib.legend() documentation.
To make a legend for lines which already exist on the axes (via plot
for instance), simply call this function with an iterable of strings,
one for each legend item. For example:
plt.plot([1, 2, 3])
plt.legend(['A simple line'])
You should probably also add a plt.show().
Since you don't use any labels, I think you should use:
plt.legend([s])
You see only one letter because legend iterates over its input (s="AAPL") and takes the first item (s[0], which is 'A') as the label text for line 1.
In the second iteration of the loop the same happens with 'I' (because s[0]='I' in this case, s[1]='B', and so on).
legend() is quite customizable; check the matplotlib docs.
So this is the result for me:
import matplotlib.pyplot as plt
stock = ['AAPL']
for s in stock:
    plt.plot([1,2,3])
    plt.legend([s])
plt.ylabel('Price')
plt.xlabel('Date')
plt.show()
Results in a plot whose legend shows the full label, AAPL.
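When several stocks are plotted, an alternative worth considering is to attach a label to each line at plot time and call legend() once after the loop. The sketch below uses made-up prices in place of the downloaded data, so the prices dict is purely illustrative.

```python
import matplotlib.pyplot as plt

stock = ['AAPL', 'IBM']
# Made-up prices standing in for the downloaded data
prices = {'AAPL': [1, 2, 3], 'IBM': [2, 2, 4]}

fig, ax = plt.subplots()
for s in stock:
    ax.plot(prices[s], label=s)  # attach the full symbol as the line's label
ax.set_ylabel('Price')
ax.set_xlabel('Date')
legend = ax.legend()  # called once, after all lines exist
```

Because legend() is given no argument here, it reads the labels from the lines themselves, so a string is never iterated letter by letter.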

Matplotlib: personalize imshow axis

I have the results of a (H,ranges) = numpy.histogram2d() computation and I'm trying to plot it.
Given H I can easily put it into plt.imshow(H) to get the corresponding image. (see http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.imshow )
My problem is that the axes of the produced image show the cell indices of H and are completely unrelated to the values of ranges.
I know I can use the keyword extent (as pointed out in: Change values on matplotlib imshow() graph axis). But this solution does not work for me: the values in ranges do not grow linearly (they actually grow exponentially).
My question is: how can I put the values of ranges into plt.imshow()? Or, at least, can I manually set the tick label values of the object returned by plt.imshow?
Editing the extent is not a good solution.
You can just change the tick labels to something more appropriate for your data.
For example, here we'll set every 5th pixel to an exponential function:
import numpy as np
import matplotlib.pyplot as plt
im = np.random.rand(21,21)
fig,(ax1,ax2) = plt.subplots(1,2)
ax1.imshow(im)
ax2.imshow(im)
# Where we want the ticks, in pixel locations
ticks = np.linspace(0,20,5)
# What those pixel locations correspond to in data coordinates.
# Also set the float format here
ticklabels = ["{:6.2f}".format(i) for i in np.exp(ticks/5)]
ax2.set_xticks(ticks)
ax2.set_xticklabels(ticklabels)
ax2.set_yticks(ticks)
ax2.set_yticklabels(ticklabels)
plt.show()
Expanding a bit on @thomas's answer:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mi
im = np.random.rand(20, 20)
ticks = np.exp(np.linspace(0, 10, 20))
fig, ax = plt.subplots()
ax.pcolor(ticks, ticks, im, cmap='viridis')
ax.set_yscale('log')
ax.set_xscale('log')
ax.set_xlim([1, np.exp(10)])
ax.set_ylim([1, np.exp(10)])
By letting mpl take care of the non-linear mapping you can now accurately over-plot other artists. There is a performance hit for this (as pcolor is more expensive to draw than AxesImage), but getting accurate ticks is worth it.
imshow is for displaying images, so it does not support x and y bins.
You could either use pcolor instead,
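# x, y: the sample arrays being binned
H,xedges,yedges = np.histogram2d(x, y)
# histogram2d puts x along the first axis of H, so transpose it for pcolor
plt.pcolor(xedges,yedges,H.T)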
or use plt.hist2d which directly plots your histogram.
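For the exponentially growing ranges described in the question, a fuller sketch might look like the following. It uses pcolormesh (a faster relative of pcolor) together with log-scaled axes; the random samples and the bin edges are invented for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Made-up samples spanning several orders of magnitude
x = np.exp(rng.uniform(0, 5, 1000))
y = np.exp(rng.uniform(0, 5, 1000))

# Exponentially growing bin edges, like the ranges described in the question
edges = np.exp(np.linspace(0, 5, 11))
H, xedges, yedges = np.histogram2d(x, y, bins=[edges, edges])

fig, ax = plt.subplots()
# histogram2d puts x along the first axis of H, so transpose for plotting
mesh = ax.pcolormesh(xedges, yedges, H.T)
ax.set_xscale('log')
ax.set_yscale('log')
fig.colorbar(mesh, ax=ax)
```

Here the cell boundaries sit at the actual bin edges, so the tick values reflect the data ranges directly and no manual tick labelling is needed.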
