I am trying to draw a pie chart using Matplotlib. Although there are no negative values in the data, I keep getting the error "pie doesn't allow negative values".
contrib = sales_data.groupby('Region')['Sales2016'].sum().round().reset_index()
contrib["Percentage"] = (contrib.Sales2016/sum(contrib.Sales2016))*100
contrib = contrib.drop(columns = ["Sales2016"])
contrib.plot(kind = "pie", subplots = True).plot(kind = "pie",subplots=True,legend=False,figsize=(12,5),autopct="%.2f%%")
plt.show()
Can someone point out where I am going wrong? The following is the output for contrib:
Region Percentage
0 Central 32.994771
1 East 42.701319
2 West 24.303911
Pass the column to plot as the y argument, and pass the region names themselves (not the literal string 'Region') as labels:
contrib.plot(kind = "pie",y="Percentage",labels=contrib["Region"],legend=False,figsize=(12,5),autopct="%.2f%%")
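For reference, here is a self-contained sketch of that call. The DataFrame is rebuilt by hand from the output shown in the question, and the Agg backend is an assumption made so that the snippet runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Rebuilt by hand from the contrib output shown in the question.
contrib = pd.DataFrame({
    "Region": ["Central", "East", "West"],
    "Percentage": [32.994771, 42.701319, 24.303911],
})

# y selects the numeric column; labels supplies one name per wedge.
ax = contrib.plot(
    kind="pie",
    y="Percentage",
    labels=contrib["Region"],
    legend=False,
    figsize=(12, 5),
    autopct="%.2f%%",
)
ax.set_ylabel("")  # drop the default "Percentage" axis label
```

In an interactive session, plt.show() would follow as in the question.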
I am having trouble with Altair repeating X axis label values.
Data:
rule_abbreaviation flagged_claim bill_month
0 CONCIDPROC 1 Apr2022
1 CONTUSMAT1 1 Apr2022
2 COVID05 1 Jun2021
3 FILTROTUB2 1 Sep2021
4 MEPIARTRO1 1 Mar2022
#Code to generate Altair Bar Chart
bar = alt.Chart(Data).mark_bar().encode(
x=alt.X('flagged_claim:Q', axis=alt.Axis(title='Flagged Claims', format= ',.0f'), stack='zero'),
y=alt.Y('rule_abbreaviation:N', axis=alt.Axis(title='Component Abbreviation'), sort=alt.SortField(field=measure, order='descending')),
tooltip=[alt.Tooltip('max(ClaimRuleName):N', title='Claim Component'), alt.Tooltip('flagged_claim:Q', title='Flagged Claims', format= ',.0f')],
color=alt.Color('bill_month', legend=None)
).properties(width=485,
title = alt.TitleParams(text = 'Bottom Components',
font = 'Arial',
fontSize = 16,
color = '#000080',
)
).interactive()
The X axis labels generated by this chart contain repeated 0s and 1s.
Image of Visualization: https://i.stack.imgur.com/0XdWB.png
This happens because format= ',.0f' tells Altair to show zero decimals in the axis labels, so fractional tick values such as 0.2 or 0.8 are rounded and displayed as 0 or 1. Remove the format or change it to ',.1f' to see decimals in the labels. In general, a good way to troubleshoot problems like this is to remove one part of your code at a time to identify which part is causing the unexpected behavior.
To reduce the number of ticks you can use alt.Axis(title='Flagged Claims', format='d', tickCount=1) or alt.Axis(title='Flagged Claims', format='d', values=[0, 1]). See also Changing Number of y-axis Ticks in Altair
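The rounding effect can be reproduced outside Altair. The sketch below uses Python's built-in format(), whose ',.0f' and ',.1f' specifiers give the same result as the axis format for these values. The tick positions are an assumption chosen for illustration, since the real ticks are picked by Vega-Lite:

```python
# Hypothetical tick positions on an axis whose data only spans 0 to 1.
ticks = [i / 5 for i in range(6)]  # 0.0, 0.2, 0.4, 0.6, 0.8, 1.0

# Zero decimals: every fractional tick is rounded to a whole number.
labels_0f = [format(t, ",.0f") for t in ticks]
# One decimal: the fractional ticks become distinguishable again.
labels_1f = [format(t, ",.1f") for t in ticks]

print(labels_0f)  # ['0', '0', '0', '1', '1', '1']
print(labels_1f)  # ['0.0', '0.2', '0.4', '0.6', '0.8', '1.0']
```

With zero decimals, six distinct ticks collapse into two distinct labels, which matches the repeated 0 and 1 seen in the chart.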
I am writing a latex document and using matplotlib for plots. I want to have the font and font size (9) of the captions of my latex document also for the plot axes and legend text.
Furthermore, I would like to fill out the \linewidth or \textwidth of my latex document, which is 369 pt. Now the matplotlib.pyplot.figure function accepts the input parameter figsize which should be in inches, so I duly specify it as 369/72 inches, 1/72 being the conversion factor from pt to inches.
Later I cut down excess white space by using the bbox_inches='tight' and pad_inches=0 options of the savefig function.
The font and font size part works as intended. It looks exactly identical between the figure text and the caption text. However, I am still dissatisfied with the figure width.
Below is a minimal example of a figure I produce.
import matplotlib
import matplotlib.pyplot as plt
plt.rcdefaults()
plt.rcParams['font.size'] = '9'
plt.rcParams['figure.autolayout'] = False
matplotlib.rc('font', family='sans-serif', serif=['Palatino'])
matplotlib.rc('text', usetex=True)
params = {'text.latex.preamble': [
r'\usepackage[american]{babel}',
r'\usepackage{mathpazo}',
r'\usepackage{amsmath,amssymb,amsfonts,mathrsfs}',
r'\usepackage{textcomp}',
]}
plt.rcParams.update(params)
plt.rcParams['mathtext.default'] = 'regular'
plt.rcParams['legend.handlelength'] = 1
delta_adjust = 0
plt.rcParams['figure.subplot.bottom'] = delta_adjust
plt.rcParams['figure.subplot.top'] = 1 - delta_adjust
plt.rcParams['figure.subplot.left'] = delta_adjust
plt.rcParams['figure.subplot.right'] = 1 - delta_adjust
plt.rcParams['figure.subplot.hspace'] = 0.55
plt.rcParams['figure.subplot.wspace'] = 0.55
default_figsize=(369/72, 369/72)
fig = plt.figure(figsize=default_figsize)
sp1 = plt.subplot(3,3,1)
sp2 = plt.subplot(3,3,2)
sp3 = plt.subplot(3,3,3)
for sp in sp1, sp2, sp3:
    sp.set_title('Title')
    sp.set_xlabel('Xlabel')
    sp.set_ylabel('Ylabel')
twin = sp3.twinx()
twin.set_ylabel('Ylabel')
fig.set_size_inches(default_figsize)
fig.savefig('./example.pdf', transparent=False, bbox_inches='tight', pad_inches=0, ending='.pdf')
This is the result of the above code. The figure has a width of 429.356 pt instead of the desired 369 pt. When I increase the delta_adjust parameter in the code, I get smaller pdf widths.
[philipp#desktop scripts]$ python minimal_example.py
[philipp#desktop scripts]$ pdfinfo example.pdf
Creator: matplotlib 3.1.2, http://matplotlib.org
Producer: matplotlib pdf backend 3.1.2
CreationDate: Thu Jan 13 11:41:13 2022 CET
Tagged: no
UserProperties: no
Suspects: no
Form: none
JavaScript: no
Pages: 1
Encrypted: no
Page size: 429.356 x 130.412 pts
Page rot: 0
File size: 102863 bytes
Optimized: no
PDF version: 1.4
When I scale the figsize parameter of the python code from 369 pt to 369*369/429 pt, I end up with a 386 pt pdf. I do not want to use a trial and error strategy to find the correct parameter. As a last resort, I could write an iterative program as a savefig routine but I would prefer to avoid this. For reference, here is the output of the program converted to png: image
In summary, I am looking for help on how to set the figure width reliably.
I am on Ubuntu 20.04, python 3.8, matplotlib 3.1.2, and I use the TkAgg backend which is the default.
Any help is appreciated.
Only after posting this question did this website recommend the following question to me: How to get figure size and fontsize right for PDFs exported from matplotlib?
It turns out that bbox_inches='tight' changes the size of the saved figure.
I removed this option and set delta_adjust = 0.1 in the code above.
Now the figure has the expected size of exactly 369x369 pt.
Most of it is whitespace, which I can remove using the pdfcrop command line utility.
The current script looks like this.
import os
import matplotlib
import matplotlib.pyplot as plt
plt.rcdefaults()
plt.rcParams['font.size'] = '9'
plt.rcParams['figure.autolayout'] = False
matplotlib.rc('font', family='sans-serif', serif=['Palatino'])
matplotlib.rc('text', usetex=True)
params = {'text.latex.preamble': [
r'\usepackage[american]{babel}',
r'\usepackage{mathpazo}',
r'\usepackage{amsmath,amssymb,amsfonts,mathrsfs}',
r'\usepackage{textcomp}',
]}
plt.rcParams.update(params)
plt.rcParams['mathtext.default'] = 'regular'
plt.rcParams['legend.handlelength'] = 1
delta_adjust_h = 0.1
delta_adjust_v = 0.1
plt.rcParams['figure.subplot.bottom'] = delta_adjust_v
plt.rcParams['figure.subplot.top'] = 1 - delta_adjust_v
plt.rcParams['figure.subplot.left'] = delta_adjust_h
plt.rcParams['figure.subplot.right'] = 1 - delta_adjust_h
plt.rcParams['figure.subplot.hspace'] = 0.55
plt.rcParams['figure.subplot.wspace'] = 0.55
#factor = 369/429.356
factor = 1
default_figsize=(369/72*factor, 369/72*factor)
fig = plt.figure(figsize=default_figsize)
sp1 = plt.subplot(3,3,1)
sp2 = plt.subplot(3,3,2)
sp3 = plt.subplot(3,3,3)
for sp in sp1, sp2, sp3:
    sp.set_title('Title')
    sp.set_xlabel('Xlabel')
    sp.set_ylabel('Ylabel')
twin = sp3.twinx()
twin.set_ylabel('Ylabel')
fig.set_size_inches(default_figsize)
#fig.savefig('./example.pdf', transparent=False, ending='.pdf', pad_inches=0)
# bbox_inches='tight', pad_inches=0, ending='.pdf')
fig.savefig('./example.pdf', transparent=True, pad_inches=0, ending='.pdf')
os.system('pdfcrop ./example.pdf ./example_cropped.pdf')
The output of pdfinfo is the following:
[philipp#desktop scripts]$ pdfinfo example_cropped.pdf
Creator: TeX
Producer: pdfTeX-1.40.20
CreationDate: Thu Jan 13 13:28:25 2022 CET
ModDate: Thu Jan 13 13:28:25 2022 CET
Tagged: no
UserProperties: no
Suspects: no
Form: none
JavaScript: no
Pages: 1
Encrypted: no
Page size: 364 x 112 pts
Page rot: 0
File size: 102853 bytes
Optimized: no
PDF version: 1.4
So it is still not perfect, as the figure width is slightly too small due to the cropping.
Also it is now not guaranteed that all the contents fit within the printed pdf, which previously was ensured by the bbox_inches option.
Nevertheless, this is an improvement as now the plot sizes can no longer exceed the latex \linewidth.
I may update this answer if I find a better solution.
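One further detail may account for a small residual mismatch: LaTeX reports lengths such as \textwidth in TeX points (72.27 per inch), whereas Matplotlib's figsize and the page size printed by pdfinfo are based on points of 1/72 inch. Assuming the 369 pt value came from \the\textwidth, a minimal conversion sketch looks like this (the helper name and constants are illustrative and not part of the script above):

```python
TEX_PT_PER_INCH = 72.27  # TeX points per inch (LaTeX's "pt" unit)
PDF_PT_PER_INCH = 72.0   # PostScript/PDF points per inch (LaTeX's "bp" unit)

def tex_pt_to_inches(width_pt):
    """Convert a LaTeX length given in TeX points to inches."""
    return width_pt / TEX_PT_PER_INCH

width_in = tex_pt_to_inches(369)        # value to pass as the figsize width
width_bp = width_in * PDF_PT_PER_INCH   # width that pdfinfo would then report

print(f"{width_in:.4f} in, {width_bp:.2f} PDF pt")
```

Under this assumption, a figure created with 369/72 inches would be slightly wider than the text block, so dividing by 72.27 instead may be the safer choice.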
Edit: There is a feature under development at matplotlib which would solve the problem: https://matplotlib.org/stable/tutorials/intermediate/constrainedlayout_guide.html
I tried it but it seems to work best when the subplots are all created at once, such as with plt.subplots. I will have to change all my scripts, because currently I add subplots one by one with plt.subplot. Moreover I need to set the vertical figure size explicitly instead of simply generating a square figure and cropping all the unused whitespace.
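As a rough sketch of the constrained-layout approach: the subplots are created in one call with plt.subplots, the figure height is set explicitly, and the file is saved without bbox_inches='tight' so that the page keeps the requested size. The height value and the in-memory buffer are assumptions for illustration, and usetex is left off so the example runs without a LaTeX installation:

```python
import io
import matplotlib
matplotlib.use("Agg")  # headless backend; no display needed
import matplotlib.pyplot as plt

width_in = 369 / 72       # width used in the original script
height_in = width_in / 3  # assumed height for a single row of three plots

# constrained_layout arranges titles and labels inside the given figsize.
fig, axes = plt.subplots(1, 3, figsize=(width_in, height_in),
                         constrained_layout=True)
for ax in axes:
    ax.set_title('Title')
    ax.set_xlabel('Xlabel')
    ax.set_ylabel('Ylabel')
twin = axes[2].twinx()
twin.set_ylabel('Ylabel')

# No bbox_inches='tight': the saved page keeps the figsize given above.
buffer = io.BytesIO()
fig.savefig(buffer, format='pdf')
```

Writing to a file path instead of the buffer works the same way.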
I am working on electrophysiological data which is in .abf format.
I want to obtain the hyperpolarization depth as indicated above in the figure. This is what I have done so far:
import matplotlib.pyplot as plt
import pyabf
import pandas as pd
abf = pyabf.ABF("test.abf")
abf.setSweep(10) # I can access a given sweep. Here sweep 10
df = pd.DataFrame({'time': abf.sweepX, 'current':abf.sweepY})
df1 = df.loc[15650:15800]
df1.plot(x='time', y='current')
I am thinking of using the change in the derivative to find the first point of interest (x1,y1) and then the lower point (x2,y2), but it looks complex. I would appreciate it if someone could give me a hint or a procedure.
The dataset is as follows:
time current
0.7825 -63.323975
0.78255 -63.171387
0.7826 -62.89673
0.78265 -62.713623
0.7827 -62.469482
0.78275 -62.37793
0.7828 -62.10327
0.78285 -61.950684
0.7829 -61.76758
0.78295 -61.584473
0.783 -61.401367
0.78305 -61.24878
0.7831 -61.035156
0.78315 -60.85205
0.7832 -60.72998
0.78325 -60.516357
0.7833 -60.455322
0.78335 -60.2417
0.7834 -60.08911
0.78345 -59.96704
0.7835 -59.814453
0.78355 -59.661865
0.7836 -59.509277
0.78365 -59.417725
0.7837 -59.23462
0.78375 -59.11255
0.7838 -58.95996
0.78385 -58.86841
0.7839 -58.685303
0.78395 -58.59375
0.784 -58.441162
0.78405 -58.34961
0.7841 -58.19702
0.78415 -58.044434
0.7842 -57.922363
0.78425 -57.769775
0.7843 -57.678223
0.78435 -57.434082
0.7844 -57.34253
0.78445 -56.9458
0.7845 -56.274414
0.78455 -54.96216
0.7846 -53.253174
0.78465 -51.208496
0.7847 -48.950195
0.78475 -46.325684
0.7848 -43.09082
0.78485 -38.42163
0.7849 -31.036377
0.78495 -22.033691
0.785 -13.397217
0.78505 -6.072998
0.7851 -0.61035156
0.78515 2.7160645
0.7852 3.9367676
0.78525 3.4179688
0.7853 1.3427734
0.78535 -1.4953613
0.7854 -5.0964355
0.78545 -9.185791
0.7855 -13.641357
0.78555 -18.249512
0.7856 -23.132324
0.78565 -27.98462
0.7857 -32.714844
0.78575 -37.261963
0.7858 -41.47339
0.78585 -45.22705
0.7859 -48.553467
0.78595 -51.54419
0.786 -53.985596
0.78605 -56.18286
0.7861 -58.013916
0.78615 -59.539795
0.7862 -60.760498
0.78625 -61.88965
0.7863 -62.652588
0.78635 -63.323975
0.7864 -63.934326
0.78645 -64.2395
0.7865 -64.60571
0.78655 -64.78882
0.7866 -65.00244
0.78665 -64.971924
0.7867 -65.093994
0.78675 -65.03296
0.7868 -64.971924
0.78685 -64.819336
0.7869 -64.78882
0.78695 -64.66675
0.787 -64.48364
0.78705 -64.42261
0.7871 -64.2395
0.78715 -64.11743
0.7872 -63.964844
0.78725 -63.842773
0.7873 -63.659668
0.78735 -63.568115
0.7874 -63.446045
0.78745 -63.26294
0.7875 -63.171387
0.78755 -62.98828
0.7876 -62.89673
0.78765 -62.74414
0.7877 -62.713623
0.78775 -62.530518
0.7878 -62.438965
0.78785 -62.37793
0.7879 -62.25586
0.78795 -62.164307
0.788 -62.042236
0.78805 -62.01172
0.7881 -61.88965
0.78815 -61.88965
0.7882 -61.73706
0.78825 -61.706543
0.7883 -61.645508
0.78835 -61.61499
0.7884 -61.523438
0.78845 -61.462402
0.7885 -61.431885
0.78855 -61.340332
0.7886 -61.37085
0.78865 -61.279297
0.7887 -61.279297
0.78875 -61.157227
0.7888 -61.187744
0.78885 -61.09619
0.7889 -61.157227
0.78895 -61.12671
0.789 -61.09619
0.78905 -61.12671
0.7891 -61.00464
0.78915 -61.00464
0.7892 -60.97412
0.78925 -60.97412
0.7893 -60.943604
0.78935 -61.00464
0.7894 -60.913086
0.78945 -60.97412
0.7895 -60.943604
0.78955 -60.913086
0.7896 -60.943604
0.78965 -60.85205
0.7897 -60.85205
0.78975 -60.821533
0.7898 -60.88257
0.78985 -60.88257
0.7899 -60.913086
0.78995 -60.88257
0.79 -60.913086
We can plot the difference in current between consecutive points (which, since the times are evenly spaced, is the derivative up to a constant factor). The first chart shows the actual diffs. Based on this we can choose a threshold, such as 0.3, and use it to filter the main DataFrame. The filtered values are shown in orange on the second chart:
# df is assumed to be indexed by time, e.g. df = df.set_index('time')
fig, ax = plt.subplots(2, figsize=(8,8))
# plot derivative
df['current'].diff().plot(ax=ax[0])
# current
threshold = 0.3
df['filtered'] = df.loc[df['current'].diff().abs() > threshold, 'current']
df.plot(ax=ax[1])
# add spans
x = df['filtered'].dropna()
ax[1].axhspan(x.iloc[0], x.iloc[-1], alpha=0.3, edgecolor='skyblue', facecolor="none", hatch='////')
ax[1].axvspan(x.index.min(), x.index.max(), alpha=0.3, edgecolor='orange', facecolor="none", hatch='\\\\')
Output:
If you're interested in range values, you can dropna values in the filtered subset and find min and max from the index:
print('min', df['filtered'].dropna().index.min())
print('max', df['filtered'].dropna().index.max())
Output:
min 0.78445
max 0.7865
For the value of the gap you can use:
abs(df['filtered'].dropna().iloc[-1] - df['filtered'].dropna().iloc[0])
Output:
7.6599100000000035
Note: We can alternatively also get left edges of these spans as points where diff in the point is lower than the threshold and diff in the next point is higher than the threshold, and similarly for the right edges. This would also work in case we have multiple peaks:
threshold = 0.3
x = df['current'].diff().abs()
spanA = df.loc[(x < threshold) & (x.shift(-1) >= threshold)]
spanB = df.loc[(x >= threshold) & (x.shift(-1) < threshold)]
print(spanA)
current
time
0.7844 -57.34253
print(spanB)
current
time
0.7865 -64.60571
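The edge-detection idea from the note can also be written as a small plain-Python helper, which may be easier to step through when there are several peaks. This is only a sketch: the function name is hypothetical and the short trace consists of made-up values loosely rounded from the table in the question:

```python
def find_spans(values, threshold):
    """Return (start, end) index pairs around runs of fast changes.

    start is the last sample before the absolute difference first reaches
    the threshold; end is the last sample reached by a fast change.
    """
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    spans, start = [], None
    for i, d in enumerate(diffs):
        if d >= threshold and start is None:
            start = i                  # fast change begins after sample i
        elif d < threshold and start is not None:
            spans.append((start, i))   # fast change ended at sample i
            start = None
    if start is not None:              # trace ended during a fast change
        spans.append((start, len(values) - 1))
    return spans

# Made-up trace: slow rise, spike, drop below the starting level.
trace = [-57.4, -57.3, -56.9, -31.0, 3.9, -32.7, -64.6, -64.8, -65.0, -64.9]
spans = find_spans(trace, threshold=0.3)
first, last = spans[0]
depth = abs(trace[last] - trace[first])
print(spans, round(depth, 2))
```

Here the start index plays the role of spanA and the end index the role of spanB, and depth is the gap between the two values.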
I am trying to segment 3D tomographs of porous networks in Python. I am able to calculate the distance map with ndimage.distance_transform_edt and the peaks with feature.peak_local_max. When I apply the watershed algorithm I get an acceptable result, but the peak markers are not located at the visible peaks of the distance map (see image).
Thanks in advance
Here is the code; a is the image:
D = ndimage.distance_transform_edt(a)
localMax = feature.peak_local_max(D, indices=False, min_distance=50,
labels=a)
localMax2 = feature.peak_local_max(D, indices=True, min_distance=50,
labels=a)
markers = ndimage.label(localMax, structure=np.ones((3,3,3)))[0]
labels = morphology.watershed(-D,markers,mask=a)
I found a way: I had to exclude the borders and apply a threshold.
D = ndimage.distance_transform_edt(a)
localMax = feature.peak_local_max(D, indices=False, min_distance=30,
labels=a,threshold_abs=9,exclude_border=1)
localMax2 = feature.peak_local_max(D, indices=True, min_distance=30,
labels=a,threshold_abs=9,exclude_border=1)
#markers = ndimage.label(localMax, structure=np.ones((3,3,3)))[0]
markers = ndimage.label(localMax, structure=np.ones((3,3,3)))[0]
labels = morphology.watershed(-D,markers,mask=a)
regions=measure.regionprops(labels,intensity_image=a)
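In recent scikit-image releases the indices argument of peak_local_max is no longer available and watershed lives in skimage.segmentation, so the same idea may need to be written slightly differently. The following is a minimal sketch on a synthetic volume of two overlapping spheres. The test volume, the parameter values, and the variable names are assumptions for illustration and would need tuning for real tomographs:

```python
import numpy as np
from scipy import ndimage
from skimage import feature, segmentation

# Synthetic binary volume: two overlapping spheres of radius 12.
zz, yy, xx = np.ogrid[:40, :40, :60]
sphere1 = (zz - 20) ** 2 + (yy - 20) ** 2 + (xx - 20) ** 2 <= 12 ** 2
sphere2 = (zz - 20) ** 2 + (yy - 20) ** 2 + (xx - 40) ** 2 <= 12 ** 2
a = sphere1 | sphere2

D = ndimage.distance_transform_edt(a)

# peak_local_max returns an (N, 3) array of peak coordinates.
coords = feature.peak_local_max(D, min_distance=5, threshold_abs=6,
                                exclude_border=1)

# Turn the coordinates into a boolean marker image, then label it.
peak_mask = np.zeros(D.shape, dtype=bool)
peak_mask[tuple(coords.T)] = True
markers = ndimage.label(peak_mask, structure=np.ones((3, 3, 3)))[0]

# Flood the inverted distance map from the markers, restricted to the pores.
labels = segmentation.watershed(-D, markers, mask=a)
print(len(coords), "peaks,", labels.max(), "regions")
```

The coordinate array replaces the old indices=True output, and the boolean marker image replaces the old indices=False output.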
I am trying to learn linearK estimates on a small linnet object from the CRC spatstat book (chapter 17) and when I use the linearK function, spatstat throws an error. I have documented the process in the comments in the r code below. The error is as below.
Error in seq.default(from = 0, to = right, length.out = npos + 1L) : 'to' cannot be NA, NaN or infinite
I do not understand how to resolve this. I am following this process:
# I have data of points for each day of the week
# d1 is district 1 of the city.
# I did the step below otherwise it was giving me tbl class
d1_data=lapply(split(d1, d1$openDatefactor),as.data.frame)
# I previously created a linnet and divided it into districts of the city
d1_linnet = districts_linnet[["d1"]]
# I create point pattern for each day
d1_ppp = lapply(d1_data, function(x) as.ppp(x, W=Window(d1_linnet)))
plot(d1_ppp[[1]], which.marks="type")
# I am then converting the point pattern to a point pattern on linear network
d1_lpp <- as.lpp(d1_ppp[[1]], L=d1_linnet, W=Window(d1_linnet))
d1_lpp
Point pattern on linear network
3 points
15 columns of marks: ‘status’, ‘number_of_’, ‘zip’, ‘ward’,
‘police_dis’, ‘community_’, ‘type’, ‘days’, ‘NAME’,
‘DISTRICT’, ‘openDatefactor’, ‘OpenDate’, ‘coseDatefactor’,
‘closeDate’ and ‘instance’
Linear network with 4286 vertices and 6183 lines
Enclosing window: polygonal boundary
enclosing rectangle: [441140.9, 448217.7] x [4640080, 4652557] units
# the errors start from plotting this lpp object
plot(d1_lpp)
"show.all" is not a graphical parameter
Error in plot.window(...) : need finite 'xlim' values
coords(d1_lpp)
x y seg tp
441649.2 4649853 5426 0.5774863
445716.9 4648692 5250 0.5435492
444724.6 4646320 677 0.9189631
3 rows
Consequently, I also get an error from linearK(d1_lpp):
Error in seq.default(from = 0, to = right, length.out = npos + 1L) : 'to' cannot be NA, NaN or infinite
I suspect the lpp object is the problem, but I find it hard to interpret the errors and work out how to resolve them. Could someone please guide me?
Thanks
I can confirm there is a bug in plot.lpp when trying to plot the marked point pattern on the linear network. That will hopefully be fixed soon. You can plot the unmarked point pattern using
plot(unmark(d1_lpp))
I cannot reproduce the problem with linearK. Which version of spatstat are you running? In the development version on my laptop (spatstat_1.51-0.073) everything works. There have been changes to this code recently, so it is likely that this will be solved by updating to the development version (see https://github.com/spatstat/spatstat).