Plotting asymmetric error bars using matplotlib

I am trying to plot asymmetric error bars that represent a 95% confidence interval. The output is not what I expect, and I am not sure which part of the code is responsible.
import numpy as np
import matplotlib.pyplot as plt
x = (18,20,22,24,26,28,30,32,34)
apo_average = (1933.877,1954.596,2058.192,2244.664,2265.383,2265.383,2306.821,2534.731,2576.169)
std_apo=(35.88652754,0,179.4326365,35.88652754,0,0,35.88652754,35.88652696,0)
error = np.array(apo_average)
lower_error_apo=error-((4.303*(np.array(std_apo)))/np.sqrt(3))
higher_error_apo=error+((4.303*(np.array(std_apo)))/np.sqrt(3))
asymmetric_error_apo=[lower_error_apo, higher_error_apo]
fig = plt.figure()
ax = fig.add_subplot(111)
plt.scatter(x,apo_average,marker='o',label="0 Cu", color='none', edgecolor='blue', linewidth='1')
ax.errorbar(x,apo_average,yerr=asymmetric_error_apo, markerfacecolor='blue',markeredgecolor='blue')
The outcome is 
This is quite unexpected. For instance, I intended the lower end of the first error bar to be at 1844.723, which doesn't agree with what's shown in the picture. The same is true of every error bar.

Most of the time it helps to read the documentation which states
xerr/yerr : scalar or array-like, shape(N,) or shape(2,N), optional
If a scalar number, len(N) array-like object, or a N-element array-like object, errorbars are drawn at +/-value relative to the data. Default is None.
If a sequence of shape 2xN, errorbars are drawn at -row1 and +row2 relative to the data.
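In other words, yerr holds distances from each data point, not the absolute positions of the bar ends. You therefore need to pass the values calculated from the standard deviation directly, instead of subtracting them from or adding them to the mean.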
lower_error_apo=(4.303*(np.array(std_apo)))/np.sqrt(3)
higher_error_apo=(4.303*(np.array(std_apo)))/np.sqrt(3)
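For reference, here is a minimal end-to-end sketch of the corrected plot, reusing the data from the question. The names half_width, lower_bound and upper_bound are introduced here for illustration only, and the Agg backend line is an assumption so the sketch runs without a display. The absolute bounds are computed purely as a check and are not passed to errorbar.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); remove to get a window
import matplotlib.pyplot as plt

x = np.array([18, 20, 22, 24, 26, 28, 30, 32, 34])
apo_average = np.array([1933.877, 1954.596, 2058.192, 2244.664, 2265.383,
                        2265.383, 2306.821, 2534.731, 2576.169])
std_apo = np.array([35.88652754, 0, 179.4326365, 35.88652754, 0,
                    0, 35.88652754, 35.88652696, 0])

# Half-width of the interval, using the same factors as the question (4.303, sqrt(3))
half_width = 4.303 * std_apo / np.sqrt(3)

# Shape (2, N): row 0 is the distance below each point, row 1 the distance above
yerr = np.vstack([half_width, half_width])

fig, ax = plt.subplots()
ax.errorbar(x, apo_average, yerr=yerr, fmt="o", color="blue", capsize=3)

# Absolute ends of the bars, for checking only
lower_bound = apo_average - yerr[0]
upper_bound = apo_average + yerr[1]
# call plt.show() here when using an interactive backend
```

With these inputs the first lower bound comes out at about 1844.723, the value the question expected. Since both rows are identical here, passing yerr=half_width alone would draw the same bars; the two-row form only matters when the lower and upper distances differ.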

Related

The plot method plots the list shifted back by one, while scatter is ok

Hi, the following code plots the cubes of the first 10 integers.
The scatter method works fine, but the plot method shifts everything one to the left.
The axis looks correct to me.
I tried to figure it out but I don't know where I'm going wrong.
Thank you .
import matplotlib.pyplot as plt
n_values = range(1,11,1)
n_cubes = [n**3 for n in n_values]
fig, ax = plt.subplots()
ax.plot(n_cubes)
ax.scatter(n_values, n_cubes, c=n_cubes, cmap=plt.cm.Reds, s=20)
ax.axis([1, 12, 0, 1100])
print(n_cubes, n_values)
plt.style.use('seaborn')
plt.show()
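If you call ax.plot() with only one argument, it treats that argument as the y values and generates its own x values, which start at zero (0, 1, ..., N-1). Your scatter call uses n_values, which start at 1, so the line ends up shifted one to the left.
So, you need to call the function like this: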
ax.plot(n_values, n_cubes)
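The difference can be checked by inspecting the x data that each call actually stores. This is a small sketch; the variable names implicit_line and explicit_line are made up for illustration, and the Agg backend is an assumption so that no window is needed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); remove to get a window
import matplotlib.pyplot as plt

n_values = range(1, 11)
n_cubes = [n**3 for n in n_values]

fig, ax = plt.subplots()
(implicit_line,) = ax.plot(n_cubes)            # x values generated by matplotlib
(explicit_line,) = ax.plot(n_values, n_cubes)  # x values supplied by the caller

# Read back the x coordinates each line was drawn with
implicit_x = [int(v) for v in implicit_line.get_xdata(orig=False)]
explicit_x = [int(v) for v in explicit_line.get_xdata(orig=False)]
print(implicit_x)
print(explicit_x)
```

The first list starts at 0 and the second at 1, which is the one-step shift seen in the plot.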

Scipy.detrend: Function changes range of values

I am trying to detrend this one dimensional array:
array([13.64352283, 13.48914862, 13.00767009, 13.35416524, 13.60143818,
13.40895156, 13.48349417, 13.65703125, 13.4959721 , 13.28891263,
12.97999066, 13.01112397, 12.79519705, 13.32030445, 13.19949068,
12.88691975, 13.32079707])
The function runs without errors but changes the range of values from ~[12,14] to ~[-0.4,0.4].
I believe it is due to the small std dev of the values that this happens.
Any ideas how to fix this, so I can plot the array with trend and the detrended one into one plot?
Normalization is not an option.
Please help.
Well, that is exactly what detrend does: it subtracts the least-squares linear fit from the input, so the result is centered around zero.
Here is a plot to illustrate what happens:
from scipy import signal
import numpy as np
import matplotlib.pyplot as plt
y = np.array([13.64352283, 13.48914862, 13.00767009, 13.35416524, 13.60143818,
13.40895156, 13.48349417, 13.65703125, 13.4959721, 13.28891263,
12.97999066, 13.01112397, 12.79519705, 13.32030445, 13.19949068,
12.88691975, 13.32079707])
plt.plot(y, color='dodgerblue')
plt.plot(signal.detrend(y), color='limegreen')
plt.plot(y - signal.detrend(y), color='crimson')
plt.show()
The red line in the plot is the linear approximation that got subtracted from the original data to obtain detrend(y).
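If the goal is to show both curves on a comparable scale, one option is to add the mean of the original data back to the detrended values. That is an assumption about what is acceptable for the plot, not something scipy does for you. The sketch below also cross-checks the subtracted trend against np.polyfit; the names trend, fitted_line and detrended_shifted are illustrative.

```python
import numpy as np
from scipy import signal

y = np.array([13.64352283, 13.48914862, 13.00767009, 13.35416524, 13.60143818,
              13.40895156, 13.48349417, 13.65703125, 13.4959721, 13.28891263,
              12.97999066, 13.01112397, 12.79519705, 13.32030445, 13.19949068,
              12.88691975, 13.32079707])

detrended = signal.detrend(y)   # default type is 'linear'
trend = y - detrended           # the straight line that was removed

# Cross-check: an ordinary least-squares line through the same points
t = np.arange(len(y))
slope, intercept = np.polyfit(t, y, 1)
fitted_line = slope * t + intercept

# Shift the detrended data back to the original level (assumption: this is
# acceptable for display; it removes the slope but keeps the mean)
detrended_shifted = detrended + y.mean()
```

Plotting y and detrended_shifted together then keeps both curves in the ~[12, 14] range.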

Python visualization - histograms

The following two questions concern a histogram I am trying to build.
1) I want the bins to be as follows:
[0-10,10-20,...,580-590, 590-600]. I tried the following code:
bins_range=[]
for i in range(0,610,10):
    bins_range.append(i)
plt.hist(df['something'], bins=bins_range, rwidth=0.95)
I expected to see bins as above with their corresponding amount of samples for each bin, but instead I got only 10 bins (as the default parameter).
2) How can I change the y-axis as follows: say my max bin contains 40 samples, so instead of 40 on the y-axis I want it to be 100%, and the others correspondingly. I.e., 30 will be 75%, 20 will be 50% and so on.
Your code seems to be working OK. You can even pass the range command directly to the bins parameter of hist.
To get the y-axis as percentages, I think you need two passes: first calculate the histogram to find out how many samples the highest bin contains. Then, do the plotting using 1/highest as weights. NumPy's np.histogram does all the calculations without plotting.
Use PercentFormatter() to display the axis in percentages. Its first parameter is the value that corresponds to 100%. Use PercentFormatter(max(hist)) to get the highest bin as 100%. If you just want the total as 100%, pass PercentFormatter(len(x)), without the need to calculate the histogram twice. As the y-axis is internally still in counts, the default ticks don't land on round percentages. You can use plt.yticks(np.linspace(0, max(hist), 11)) to have a tick every 10%.
To get nicer separations between the bars, you can set an explicit edge color. This works best without rwidth=0.95.
Example code:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter
x = np.random.rayleigh(200, 50000)
hist, bins = np.histogram(x, bins=range(0, 610, 10))
plt.hist(x, bins=bins, ec='white', fc='darkorange')
plt.gca().yaxis.set_major_formatter(PercentFormatter(max(hist)))
plt.yticks(np.linspace(0, max(hist), 11))
plt.show()
PS: To use matplotlib's standard yticks and have the y-axis internally in percentages as well, you can use the weights parameter of hist. This can be handy when you want to interactively resize or zoom the plot, or need horizontal lines at specific percentages.
plt.hist(x, bins=bins, ec='white', fc='dodgerblue', weights=np.ones_like(x)/max(hist))
plt.gca().yaxis.set_major_formatter(PercentFormatter(1))
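The rescaling itself can be checked numerically without drawing anything. The sketch below assumes seeded random data as a stand-in for df['something'], and the names percent_of_tallest and weighted_counts are illustrative. It confirms that weighting every sample by 100/highest gives the same bar heights as dividing the counts by the tallest bin.

```python
import numpy as np

rng = np.random.default_rng(0)    # seeded stand-in data (assumption)
x = rng.rayleigh(200, 50000)

# 61 edges (0, 10, ..., 600) define the 60 bins asked for in the question
counts, edges = np.histogram(x, bins=range(0, 610, 10))

# Bar heights as a percentage of the tallest bin
percent_of_tallest = 100 * counts / counts.max()

# Same result obtained through per-sample weights, as plt.hist(..., weights=...) would use
weights = np.full(x.shape, 100 / counts.max())
weighted_counts, _ = np.histogram(x, bins=edges, weights=weights)
```

Here the weights are scaled so the tallest bar is 100, which would pair with PercentFormatter(100); the PS above scales to 1 and uses PercentFormatter(1) instead.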

Using RGB values control individual data points matplotlib

I'm trying to be able to control the colour of an individual data point using a corresponding rgb tuple. I've tried looping through the data set and plotting individual data points however I get the same effect as the code I have below; all that happens is it refuses to produce a graph.
This is an example of the data type I'm working with
Any tips?
import matplotlib.pyplot as plt
y=[(0.200,0.1100,0.520)]
for i in range(4):
    y.append(y)
plt.plot([1,2,3,4], [3,4,5,2],c=y)
plt.show()
One problem is that you are appending the list to itself (y.append(y)). Instead, append the tuple to the list. Moreover, you need to use scatter rather than plot to pass a color argument that contains an RGB tuple for each point. However, in your case, all the scatter points end up with the same color.
tup=(0.200,0.1100,0.520)
y = []
for i in range(4):
    y.append(tup)
plt.scatter([1,2,3,4], [3,4,5,2], c=y)
A shorter version of your code uses a list comprehension:
tup=(0.200,0.1100,0.520)
y = [tup for _ in range(4)]
plt.scatter([1,2,3,4], [3,4,5,2], c=y)
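To actually give each point its own color, the list passed to c needs one RGB tuple per point. A small sketch follows; only the first tuple comes from the question, the other three are made-up values for illustration, and the Agg backend is an assumption so the code runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); remove to get a window
import matplotlib.pyplot as plt

# One RGB tuple per data point; the last three are illustrative values
colors = [(0.200, 0.1100, 0.520), (0.9, 0.1, 0.1), (0.1, 0.6, 0.2), (0.1, 0.3, 0.9)]

fig, ax = plt.subplots()
points = ax.scatter([1, 2, 3, 4], [3, 4, 5, 2], c=colors)

# Each row is the RGBA color matplotlib stored for one point
face_colors = points.get_facecolor()
```

The stored colors form a 4x4 array whose first three columns match the tuples that were passed in.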

Creating a structured grid of subplots with Seaborn FacetGrid

My attempt to use FacetGrid in Seaborn does not produce the expected results.
Moreover, I would like to control the white space in the grid.
My data and code are the following:
toy.to_json()
'{"has_cus_id_but_not_acc_id":{"0":0,"1":0,"2":0,"3":0,"4":0,"5":0,"6":0,"7":0,"8":0,"9":0,"10":0,"11":0,"12":0,"13":0,"14":0,"15":0,"16":0,"17":0,"18":1,"19":0,"20":0,"21":0,"22":1,"23":0,"24":0,"25":1,"26":0,"27":1,"28":0,"29":1,"30":0,"31":1,"32":0,"33":1,"34":0,"35":1,"36":0,"37":1,"38":0,"39":0,"40":1,"41":1,"42":0,"43":1,"44":0,"45":1,"46":0,"47":1,"48":0,"49":1,"50":0,"51":1,"52":0,"53":1,"54":0,"55":1,"56":0,"57":1,"58":0,"59":1,"60":0,"61":1,"62":0,"63":1,"64":0,"65":1,"66":0,"67":1,"68":0,"69":1,"70":0,"71":1,"72":0,"73":1,"74":0,"75":1,"76":0,"77":0,"78":1,"79":0,"80":1,"81":0,"82":0,"83":1,"84":0,"85":1},"reg_year":{"0":2014.0,"1":2014.0,"2":2014.0,"3":2014.0,"4":2014.0,"5":2014.0,"6":2014.0,"7":2014.0,"8":2015.0,"9":2015.0,"10":2015.0,"11":2015.0,"12":2015.0,"13":2015.0,"14":2015.0,"15":2015.0,"16":2015.0,"17":2016.0,"18":2016.0,"19":2016.0,"20":2016.0,"21":2016.0,"22":2016.0,"23":2016.0,"24":2016.0,"25":2016.0,"26":2016.0,"27":2016.0,"28":2016.0,"29":2016.0,"30":2016.0,"31":2016.0,"32":2016.0,"33":2016.0,"34":2016.0,"35":2016.0,"36":2016.0,"37":2016.0,"38":2017.0,"39":2017.0,"40":2017.0,"41":2017.0,"42":2017.0,"43":2017.0,"44":2017.0,"45":2017.0,"46":2017.0,"47":2017.0,"48":2017.0,"49":2017.0,"50":2017.0,"51":2017.0,"52":2017.0,"53":2017.0,"54":2017.0,"55":2017.0,"56":2017.0,"57":2017.0,"58":2017.0,"59":2017.0,"60":2018.0,"61":2018.0,"62":2018.0,"63":2018.0,"64":2018.0,"65":2018.0,"66":2018.0,"67":2018.0,"68":2018.0,"69":2018.0,"70":2018.0,"71":2018.0,"72":2018.0,"73":2018.0,"74":2018.0,"75":2018.0,"76":2018.0,"77":2018.0,"78":2018.0,"79":2018.0,"80":2018.0,"81":2018.0,"82":2019.0,"83":2019.0,"84":2019.0,"85":2019.0},"reg_month":{"0":3.0,"1":5.0,"2":6.0,"3":7.0,"4":9.0,"5":10.0,"6":11.0,"7":12.0,"8":1.0,"9":3.0,"10":5.0,"11":6.0,"12":7.0,"13":8.0,"14":9.0,"15":11.0,"16":12.0,"17":1.0,"18":1.0,"19":2.0,"20":3.0,"21":4.0,"22":4.0,"23":5.0,"24":6.0,"25":6.0,"26":7.0,"27":7.0,"28":8.0,"29":8.0,"30":9.0,"31":9.0,"32":10.0,"33":10.0,"34":11.0,"35":11.0,"36":
12.0,"37":12.0,"38":1.0,"39":2.0,"40":2.0,"41":3.0,"42":4.0,"43":4.0,"44":5.0,"45":5.0,"46":6.0,"47":6.0,"48":7.0,"49":7.0,"50":8.0,"51":8.0,"52":9.0,"53":9.0,"54":10.0,"55":10.0,"56":11.0,"57":11.0,"58":12.0,"59":12.0,"60":1.0,"61":1.0,"62":2.0,"63":2.0,"64":3.0,"65":3.0,"66":4.0,"67":4.0,"68":5.0,"69":5.0,"70":6.0,"71":6.0,"72":7.0,"73":7.0,"74":8.0,"75":8.0,"76":9.0,"77":10.0,"78":10.0,"79":11.0,"80":11.0,"81":12.0,"82":1.0,"83":1.0,"84":2.0,"85":2.0},"Total_Revenue":{"0":35852.02,"1":2623.97,"2":3526.67,"3":21466.71,"4":72784.1200000003,"5":103921.2899999999,"6":10852.87,"7":16522.07,"8":7443.76,"9":68962.1600000002,"10":10956.38,"11":193856.8799999985,"12":110766.6099999997,"13":123861.8599999987,"14":2722.34,"15":303488.6900000007,"16":6876.58,"17":17729.5,"18":4687.93,"19":26914.06,"20":2228.12,"21":15708.93,"22":859.58,"23":19164.89,"24":163164.4799999995,"25":33180.7300000001,"26":10033.01,"27":1114.48,"28":462613.2900000042,"29":9822.95,"30":70901.4400000003,"31":22370.29,"32":46711.8900000002,"33":2335.02,"34":7259.28,"35":11.83,"36":13590.51,"37":7677.77,"38":282.01,"39":358522.7900000003,"40":5844.0,"41":7027.28,"42":1908.71,"43":4032.35,"44":11072.6,"45":3973.15,"46":30706.23,"47":2644.13,"48":23831.75,"49":670.12,"50":6949.54,"51":4687.7,"52":9672.69,"53":7333.01,"54":12814.33,"55":689.39,"56":6962.86,"57":2283.16,"58":1259.5,"59":224.84,"60":12812.12,"61":247.68,"62":25452.65,"63":1245.02,"64":24211.36,"65":5255.25,"66":28402.76,"67":9148.55,"68":14822.61,"69":345.37,"70":12408.13,"71":989.93,"72":10601.33,"73":730.32,"74":169020.5000000001,"75":697.54,"76":3862038.6799997138,"77":6148750.9899984254,"78":194.06,"79":2379382.4500000761,"80":1174.11,"81":1729567.9000000793,"82":889650.029999995,"83":95.8,"84":415996.6999999974,"85":654.78}}'
g = sns.FacetGrid(toy, col='has_cus_id_but_not_acc_id', hue='reg_year')
g.map(sns.barplot, 'reg_month', 'Total_Revenue')
g.add_legend();
If I use bar in pyplot I get this:
g = sns.FacetGrid(toy, col='has_cus_id_but_not_acc_id', hue='reg_year')
g.map(plt.bar, 'reg_month', 'Total_Revenue')
g.add_legend();
Again, I would like to be able to define the white space of the grid.
In addition I would not like to have the bars stacked one over the other but rather one next to the other.
Some values for the year 2018 are really large compared to any of the values where has_cus_id_but_not_acc_id is 1. Hence the right plot is almost empty. It might make sense to use a logarithmic scale.
Now you have 6 years, so each month would need to show 6 bars next to each other. That makes the bars pretty small and the chart harder to read. Still, it's possible.
The following does not use seaborn, but pandas and matplotlib:
import matplotlib.pyplot as plt
import pandas as pd
toy = '{"has_cus_id_but_not_acc_id":{"0":0,"1":0,"2":0,"3":0,"4":0,"5":0,"6":0,"7":0,"8":0,"9":0,"10":0,"11":0,"12":0,"13":0,"14":0,"15":0,"16":0,"17":0,"18":1,"19":0,"20":0,"21":0,"22":1,"23":0,"24":0,"25":1,"26":0,"27":1,"28":0,"29":1,"30":0,"31":1,"32":0,"33":1,"34":0,"35":1,"36":0,"37":1,"38":0,"39":0,"40":1,"41":1,"42":0,"43":1,"44":0,"45":1,"46":0,"47":1,"48":0,"49":1,"50":0,"51":1,"52":0,"53":1,"54":0,"55":1,"56":0,"57":1,"58":0,"59":1,"60":0,"61":1,"62":0,"63":1,"64":0,"65":1,"66":0,"67":1,"68":0,"69":1,"70":0,"71":1,"72":0,"73":1,"74":0,"75":1,"76":0,"77":0,"78":1,"79":0,"80":1,"81":0,"82":0,"83":1,"84":0,"85":1},"reg_year":{"0":2014.0,"1":2014.0,"2":2014.0,"3":2014.0,"4":2014.0,"5":2014.0,"6":2014.0,"7":2014.0,"8":2015.0,"9":2015.0,"10":2015.0,"11":2015.0,"12":2015.0,"13":2015.0,"14":2015.0,"15":2015.0,"16":2015.0,"17":2016.0,"18":2016.0,"19":2016.0,"20":2016.0,"21":2016.0,"22":2016.0,"23":2016.0,"24":2016.0,"25":2016.0,"26":2016.0,"27":2016.0,"28":2016.0,"29":2016.0,"30":2016.0,"31":2016.0,"32":2016.0,"33":2016.0,"34":2016.0,"35":2016.0,"36":2016.0,"37":2016.0,"38":2017.0,"39":2017.0,"40":2017.0,"41":2017.0,"42":2017.0,"43":2017.0,"44":2017.0,"45":2017.0,"46":2017.0,"47":2017.0,"48":2017.0,"49":2017.0,"50":2017.0,"51":2017.0,"52":2017.0,"53":2017.0,"54":2017.0,"55":2017.0,"56":2017.0,"57":2017.0,"58":2017.0,"59":2017.0,"60":2018.0,"61":2018.0,"62":2018.0,"63":2018.0,"64":2018.0,"65":2018.0,"66":2018.0,"67":2018.0,"68":2018.0,"69":2018.0,"70":2018.0,"71":2018.0,"72":2018.0,"73":2018.0,"74":2018.0,"75":2018.0,"76":2018.0,"77":2018.0,"78":2018.0,"79":2018.0,"80":2018.0,"81":2018.0,"82":2019.0,"83":2019.0,"84":2019.0,"85":2019.0},"reg_month":{"0":3.0,"1":5.0,"2":6.0,"3":7.0,"4":9.0,"5":10.0,"6":11.0,"7":12.0,"8":1.0,"9":3.0,"10":5.0,"11":6.0,"12":7.0,"13":8.0,"14":9.0,"15":11.0,"16":12.0,"17":1.0,"18":1.0,"19":2.0,"20":3.0,"21":4.0,"22":4.0,"23":5.0,"24":6.0,"25":6.0,"26":7.0,"27":7.0,"28":8.0,"29":8.0,"30":9.0,"31":9.0,"32":10.0,"33":10.0,"34":11.0,"35":11.0
,"36":12.0,"37":12.0,"38":1.0,"39":2.0,"40":2.0,"41":3.0,"42":4.0,"43":4.0,"44":5.0,"45":5.0,"46":6.0,"47":6.0,"48":7.0,"49":7.0,"50":8.0,"51":8.0,"52":9.0,"53":9.0,"54":10.0,"55":10.0,"56":11.0,"57":11.0,"58":12.0,"59":12.0,"60":1.0,"61":1.0,"62":2.0,"63":2.0,"64":3.0,"65":3.0,"66":4.0,"67":4.0,"68":5.0,"69":5.0,"70":6.0,"71":6.0,"72":7.0,"73":7.0,"74":8.0,"75":8.0,"76":9.0,"77":10.0,"78":10.0,"79":11.0,"80":11.0,"81":12.0,"82":1.0,"83":1.0,"84":2.0,"85":2.0},"Total_Revenue":{"0":35852.02,"1":2623.97,"2":3526.67,"3":21466.71,"4":72784.1200000003,"5":103921.2899999999,"6":10852.87,"7":16522.07,"8":7443.76,"9":68962.1600000002,"10":10956.38,"11":193856.8799999985,"12":110766.6099999997,"13":123861.8599999987,"14":2722.34,"15":303488.6900000007,"16":6876.58,"17":17729.5,"18":4687.93,"19":26914.06,"20":2228.12,"21":15708.93,"22":859.58,"23":19164.89,"24":163164.4799999995,"25":33180.7300000001,"26":10033.01,"27":1114.48,"28":462613.2900000042,"29":9822.95,"30":70901.4400000003,"31":22370.29,"32":46711.8900000002,"33":2335.02,"34":7259.28,"35":11.83,"36":13590.51,"37":7677.77,"38":282.01,"39":358522.7900000003,"40":5844.0,"41":7027.28,"42":1908.71,"43":4032.35,"44":11072.6,"45":3973.15,"46":30706.23,"47":2644.13,"48":23831.75,"49":670.12,"50":6949.54,"51":4687.7,"52":9672.69,"53":7333.01,"54":12814.33,"55":689.39,"56":6962.86,"57":2283.16,"58":1259.5,"59":224.84,"60":12812.12,"61":247.68,"62":25452.65,"63":1245.02,"64":24211.36,"65":5255.25,"66":28402.76,"67":9148.55,"68":14822.61,"69":345.37,"70":12408.13,"71":989.93,"72":10601.33,"73":730.32,"74":169020.5000000001,"75":697.54,"76":3862038.6799997138,"77":6148750.9899984254,"78":194.06,"79":2379382.4500000761,"80":1174.11,"81":1729567.9000000793,"82":889650.029999995,"83":95.8,"84":415996.6999999974,"85":654.78}}'
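from io import StringIO
# newer pandas versions expect a file-like object rather than a literal JSON string
df = pd.read_json(StringIO(toy))
# astype returns a new Series, so assign it back to keep integer years
df['reg_year'] = df['reg_year'].astype(int)
u = df["has_cus_id_but_not_acc_id"].unique()
y = df['reg_year'].unique()
fig, axes = plt.subplots(1,len(u), sharey=True)
axes[0].set_yscale("log")
for ax, (n, grp) in zip(axes.flat, df.groupby("has_cus_id_but_not_acc_id")):
    # keyword arguments, since recent pandas versions no longer accept these positionally
    piv = grp.pivot(index='reg_month', columns='reg_year', values='Total_Revenue')
    # all-NaN frame covering months 1..12 and every year, so both panels get the same bars
    empty = pd.DataFrame(index=range(1,13), columns=y, dtype=float)
    empty.combine_first(piv).plot.bar(ax=ax, width=0.8, legend=False)
axes[1].legend()
plt.show()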
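The question also asked about the white space between the panels, which can be set on the figure with subplots_adjust. The sketch below is only an illustration: it uses eight rows taken from the question's data (values rounded), the wspace value of 0.05 is an arbitrary choice, and the Agg backend is an assumption so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); remove to get a window
import matplotlib.pyplot as plt
import pandas as pd

# Small subset of the question's rows (2018 and 2019, months 1 and 2), values rounded
df = pd.DataFrame({
    "has_cus_id_but_not_acc_id": [0, 0, 0, 0, 1, 1, 1, 1],
    "reg_year": [2018, 2018, 2019, 2019, 2018, 2018, 2019, 2019],
    "reg_month": [1, 2, 1, 2, 1, 2, 1, 2],
    "Total_Revenue": [12812.12, 25452.65, 889650.03, 415996.70,
                      247.68, 1245.02, 95.80, 654.78],
})

groups = list(df.groupby("has_cus_id_but_not_acc_id"))
fig, axes = plt.subplots(1, len(groups), sharey=True)
for ax, (flag, grp) in zip(axes, groups):
    # One column per year, so pandas draws the years side by side for each month
    piv = grp.pivot(index="reg_month", columns="reg_year", values="Total_Revenue")
    piv.plot.bar(ax=ax, width=0.8, legend=False)
    ax.set_title(f"has_cus_id_but_not_acc_id = {flag}")

axes[0].set_yscale("log")         # shared y-axis, so this applies to both panels
axes[-1].legend(title="reg_year")
fig.subplots_adjust(wspace=0.05)  # horizontal gap between panels, as a fraction of axes width
```

Larger wspace values widen the gap; hspace does the same for vertical spacing when the grid has more than one row.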
