Plot multi label (values) with multi bar chart - python-3.x

I have an issue that I hope you can help with.
I have this data:
import pandas as pd
import matplotlib.pyplot as plt
to_stack = pd.DataFrame([['CHILDREN', 0.42806248287201976, 0.0],
                         ['AMT_TOTAL', 165006, 179357],
                         ['SAL', 582065, 703917.0],
                         ['ANNUITY', 26851, 28416]], columns=('Variable','Id','Mean'))
When I run the code below
to_stack.plot.barh(x='Variable', figsize=(12,8), width = .9)
## First loop for the first variable, "Id"
for index,value in enumerate(to_stack['Id']):
    plt.text(value, index, str(value), va='top')
## Second loop for the second variable, "Mean"
for i,val in enumerate(to_stack['Mean']):
    plt.text(val, i, str(val), va='bottom')
I get this result
The values on each bar are not well centered.
I've tried several options in matplotlib's plt.text (ha: center, left, right; va: top, bottom, baseline) without good results; sometimes it's even worse, and the values overlap each other.
How can we get the values aligned with the bars?
Any ideas are really welcome.

It's better to extract the information from the bars and annotate them. That way, you have more control over how the text appears relative to the bars:
fig, ax = plt.subplots(figsize=(12,8))
to_stack.plot.barh(x='Variable', width = .9, ax=ax)
for patch in ax.patches:
    # Each patch is one bar: its width is the value, and y + h/2 is its vertical center
    w, h = patch.get_width(), patch.get_height()
    y = patch.get_y()
    ax.text(w - 0.1, h/2 + y, f'{w:.3f}', va='center')
Output:
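To see why the labels in the question end up off-center, here is a minimal sketch of the bar geometry. It assumes pandas centers each group of bars on its integer tick and splits the group width (0.9 here) evenly between the series. The helper name bar_centers is made up for illustration; it only reproduces the numbers that patch.get_y() + patch.get_height()/2 would report under that assumption.

```python
def bar_centers(n_rows, n_series, width=0.9):
    """Return {(row, series): y_center} for a grouped horizontal bar chart.

    Assumption: each group is centered on its integer tick and the group
    width is shared equally by the series (a sketch, not pandas' own code).
    """
    bar_height = width / n_series
    centers = {}
    for row in range(n_rows):
        bottom = row - width / 2  # lower edge of the whole group
        for series in range(n_series):
            centers[(row, series)] = bottom + (series + 0.5) * bar_height
    return centers


centers = bar_centers(n_rows=4, n_series=2)
# With two series, each bar sits a quarter of the group width away from the tick
print(round(centers[(0, 0)], 3), round(centers[(0, 1)], 3))
```

Text placed at the integer index therefore lands on the boundary between the two bars, which is why reading the position from each patch works better than tuning va.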

Related

Is it possible to extract the default tick locations from the primary axis and pass it to a secondary axis with matplotlib?

When making a plot with
fig, ax = plt.subplots()
x=[1,2,3,4,5,6,7,8,9,10]
y=[1,2,3,4,5,6,7,8,9,10]
ax.plot(x,y)
plt.show()
matplotlib will determine the tick spacing/location and the value of each tick. Is there a way to extract this automatic spacing/location AND the value? I want to do this so I can pass it to
set_xticks()
for my secondary axis (using twiny()) and then use set_ticklabels() with a custom label. I realise I could use secondary axes giving both a forward and inverse function; however, providing an inverse function is not feasible for the goal of my code.
So in the image below, the ticks are only showing at 2, 4, 6, 8, 10 rather than at all the values of x, and I want to somehow extract these values and positions so I can pass them to set_xticks() and then change the tick labels (on a second x axis created with twiny).
UPDATE
When using the fix suggested it works well for the x axis. However, it does not work well for the y-axis. For the y-axis it seems to take the dataset values for the y ticks only. My code is:
ax4 = ax.twinx()
ax4.yaxis.set_ticks_position('left')
ax4.yaxis.set_label_position('left')
ax4.spines["left"].set_position(("axes", -0.10))
ax4.set_ylabel(self.y_2ndary_label, fontweight = 'bold')
Y = ax.get_yticks()
ax4.yaxis.set_ticks(Y)
ax4.yaxis.set_ticklabels( Y*Y )
ax4.set_ylim(ax.get_ylim())
fig.set_size_inches(8, 8)
plt.show()
but this gives me the following plot. The plot after is the original Y axis. This is not the case when I do this on the x-axis. Any ideas?
# From "get_xticks" Doc: The locations are not clipped to the current axis limits
# and hence may contain locations that are not visible in the output.
current_x_ticks = ax.get_xticks()
current_x_limits = ax.get_xlim()
ax.set_yticks(current_x_ticks) # Use this before "set_ylim"
ax.set_ylim(current_x_limits)
plt.show()
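Because get_xticks() may return locations outside the visible range, it can help to filter them against the limits before building custom labels. Here is a minimal sketch in plain Python. The helper visible_ticks and the squared labels are made up for illustration, and the tick and limit values are stand-ins for what ax.get_xticks() and ax.get_xlim() might return.

```python
def visible_ticks(ticks, limits):
    """Keep only the tick locations that fall inside the axis limits."""
    low, high = sorted(limits)  # limits may be inverted on a flipped axis
    return [t for t in ticks if low <= t <= high]


# Stand-in values (assumed, not taken from a real figure)
ticks = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
limits = (0.55, 10.45)

shown = visible_ticks(ticks, limits)
labels = [f"{t * t:g}" for t in shown]  # custom labels, here the square of each tick
print(shown)
print(labels)
```

The filtered list and labels could then go to set_xticks() and set_xticklabels() on the twin axis, followed by set_xlim() with the original limits.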

Is it possible to run a for loop to plot graphs and produce a table of contents at the top which shows the location of each?

I'm working on a project where I have to create plots for multiple states. I know I can use subplots, but they are not producing visually pleasing results, so I would rather keep them printed one after the other.
At the end of the for loop, whenever I need to find info I have to scroll around trying to find certain states. Is it possible to have the locations printed at the top so I can quickly jump to the graph?
I have provided my code as it is now.
the08legend = []
print(the08legend)
for state in state_df['State'].unique():
    the08legend.append(state)
    plt.figure(figsize = (8,8))
    my_range = range(1,len(state_df[state_df['State']==state].index)+1)
    st_df = state_df[state_df['State'] == state]
    st_df = st_df.sort_values('rep08vot_perc')
    plt.axvline(x=50, c='green', alpha = 0.2)
    plt.hlines(y=my_range, xmin=st_df['rep08vot_perc'], xmax=st_df['dem08vot_perc'], color='ghostwhite', alpha=1)
    plt.scatter(y = my_range, x = st_df['rep08vot_perc'], c = 'red', label='Republican vote share')
    plt.scatter(y = my_range, x = st_df['dem08vot_perc'], c = 'blue', label = 'Democrat vote share')
    plt.xlim(0,100)
    plt.title(f'The vote shares in {state} 2008 election')
    plt.ylabel(f'The county in {state} where vote was counted')
    plt.xlabel('Percentage share of votes')
    plt.legend(loc = 1)
    plt.show()
I know what I did is obviously not going to print the resulting list, however, is it possible to make that happen?
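One possible approach, sketched in plain Python: collect the state names before the loop, print a numbered list first, and reuse the same number in each plot title so that a text search can jump to it. The helper format_toc and the sample state names are made up for illustration.

```python
def format_toc(names):
    """Build a numbered table of contents, one line per plot."""
    return "\n".join(f"{i}. {name}" for i, name in enumerate(names, start=1))


# Stand-in for state_df['State'].unique()
states = ["Alabama", "Alaska", "Arizona"]

print(format_toc(states))  # printed once, before any figure is drawn
for i, state in enumerate(states, start=1):
    title = f"{i}. The vote shares in {state} 2008 election"
    # plt.title(title) would go here, inside the plotting loop
```

Since the list is known before the first figure is drawn, it prints above all the plots instead of below them.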

Matplotlib and pie/donut chart labels

If y'all saw my previous question, I am coding a Python program to evaluate the data that I collect while playing a game of Clue. I have decided to implement a GUI (tkinter) in my program to make it faster and easier to work with. One of the main windows of the GUI illustrates the different cards that I know each player has in their hand, the cards that I know must be in the middle (the "murder cards"), and the unknown cards that cannot yet be conclusively placed in the above categories. I have decided to present this data in a matplotlib pie chart with five wedges, one for each of the previously mentioned categories.
Right now, I am unconcerned with how I implement this matplotlib function into my tkinter widget. I am solely focused on the design of the chart.
So far, I have documented the cards that are within each player's hand within a dictionary, wherein the keys are the player names, and the values are a set of cards that are in their hand. For example...
player_cards = { 'player1':{'Mustard', 'Scarlet', 'Revolver', 'Knife', 'Ballroom', 'Library'}, 'player2':{}, 'player3':{} }
So the data for the first three wedges of the pie chart will be extracted from the dictionary. For the other two wedges, the data will be stored within similarly organized sets.
After looking at the matplotlib.org website, I have seen an example that sort of demonstrates what I am looking for...
with the code...
import numpy as np
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(6, 3), subplot_kw=dict(aspect="equal"))
recipe = ["225 g flour",
          "90 g sugar",
          "1 egg",
          "60 g butter",
          "100 ml milk",
          "1/2 package of yeast"]
data = [225, 90, 50, 60, 100, 5]
wedges, texts = ax.pie(data, wedgeprops=dict(width=0.5), startangle=-40)
bbox_props = dict(boxstyle="square,pad=0.3", fc="w", ec="k", lw=0.72)
kw = dict(xycoords='data', textcoords='data', arrowprops=dict(arrowstyle="-"), bbox=bbox_props, zorder=0, va="center")
for i, p in enumerate(wedges):
    ang = (p.theta2 - p.theta1)/2. + p.theta1
    y = np.sin(np.deg2rad(ang))
    x = np.cos(np.deg2rad(ang))
    horizontalalignment = {-1: "right", 1: "left"}[int(np.sign(x))]
    connectionstyle = "angle,angleA=0,angleB={}".format(ang)
    kw["arrowprops"].update({"connectionstyle": connectionstyle})
    ax.annotate(recipe[i], xy=(x, y), xytext=(1.35*np.sign(x), 1.4*y),
                horizontalalignment=horizontalalignment, **kw)
ax.set_title("Matplotlib bakery: A donut")
plt.show()
However, what is lacking from this example code is: (1) the label for each wedge is a single string rather than a set of strings (which is what stores the cards in each player's hand); (2) I cannot seem to control the color of the wedges; (3) the outline of each wedge is black, rather than white, which is the background color of my GUI window; (4) I want to control the exact placement of the labels; and finally (5) I need to change the font/size of the labels. Other than that, the example code is perfect.
Just note that the actual size of each wedge in the pie chart will be dictated by the size of each of the five sets (so they will add up to 21).
Just in case that you all need some more substantive code to work with, here are five sets that make up the data needed for this pie chart...
player1_cards = {'Mustard', 'Plum', 'Revolver', 'Rope', 'Ballroom', 'Library'}
player2_cards = {'Scarlet', 'White', 'Candlestick'}
player3_cards = {'Green', 'Library', 'Kitchen', 'Conservatory'}
middle_cards = {'Peacock'}
unknown_cards = {'Lead Pipe', 'Wrench', 'Knife', 'Hall', 'Lounge', 'Dining Room', 'Study'}
Okay, that's it. Sorry for the rather long post, and thanks to those of you viewing and responding :)
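For point (1) and the wedge sizes, here is a minimal sketch of turning the sets into pie-chart inputs: one size and one multi-line label per wedge. The helper wedge_inputs and the category names are made up for illustration. The resulting lists are meant for ax.pie(sizes, labels=labels), whose colors, wedgeprops (for example an edgecolor) and textprops arguments are the usual places to look for wedge colors, outlines and fonts.

```python
def wedge_inputs(categories):
    """Return (sizes, labels) for a pie chart from {name: set_of_cards}."""
    sizes, labels = [], []
    for name, cards in categories.items():
        sizes.append(len(cards))
        # Sort for a stable order, then put one card per line under the name
        labels.append("\n".join([name] + sorted(cards)))
    return sizes, labels


categories = {
    "player1": {"Mustard", "Plum", "Revolver", "Rope", "Ballroom", "Library"},
    "player2": {"Scarlet", "White", "Candlestick"},
    "player3": {"Green", "Library", "Kitchen", "Conservatory"},
    "middle": {"Peacock"},
    "unknown": {"Lead Pipe", "Wrench", "Knife", "Hall", "Lounge", "Dining Room", "Study"},
}

sizes, labels = wedge_inputs(categories)
print(sizes, sum(sizes))
```

Sorting the cards keeps the label text stable between redraws, since sets have no guaranteed order.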

Plot the distance between every two points in 2 D

I have a table with three columns: the first column is the name of each point, the second column is numerical data (the mean), and the last column is the second column plus a fixed number. The following is an example of what the data looks like:
I want to plot this table so that I get the following figure
If it is possible, how can I plot it using Microsoft Excel, Python, or R (Bokeh)?
Alright, I only know how to do it in ggplot2, so I will answer regarding R here.
This method only works if the data frame is in the format you provided above.
I renamed your columns to Name.of.Method, Mean, Mean.2.2
Preparation
Loading csv data into R
df <- read.csv('yourdata.csv', sep = ',')
Change the column names (do this if you don't want to change the code below; otherwise you will need to go through each parameter to match your column names).
names(df) <- c("Name.of.Method", "Mean", "Mean.2.2")
Method 1 - Using geom_segment()
library(ggplot2)
ggplot() +
geom_segment(data=df,aes(x = Mean,
y = Name.of.Method,
xend = Mean.2.2,
yend = Name.of.Method))
So as you can see, geom_segment allows us to specify the end position of the line (hence xend and yend).
However, it does not look similar to the image you have above.
The line shape seems to represent an error bar, and ggplot provides an error bar function for that.
Method 2 - Using geom_errorbarh()
ggplot(df, aes(y = Name.of.Method, x = Mean)) +
geom_errorbarh(aes(xmin = Mean, xmax = Mean.2.2), linetype = 1, height = .2)
Usually we don't use this method just to draw a line. However, its functionality fits your requirement. You can see that we use xmin and xmax to specify the head and the tail of the line.
The height argument adjusts the height of the caps at both ends of the line.
I would use hbar for this:
from bokeh.io import show, output_file
from bokeh.plotting import figure
output_file("intervals.html")
names = ["SMB", "DB", "SB", "TB"]
p = figure(y_range=names, plot_height=350)
p.hbar(y=names, left=[4,3,2,1], right=[6.2, 5.2, 4.2, 3.2], height=0.3)
show(p)
However, Whisker would also be an option if you really want whiskers instead of interval bars.

How to get MultiCells in Pyfpdf Side by side?

I am making a table of about 10 cells with headings in them. They will not fit across the page unless I use the multi_cell option. However, I can't figure out how to get multi_cells side by side. When I make a new one, it automatically goes to the next line.
from fpdf import FPDF
import webbrowser
pdf=FPDF()
pdf.add_page()
pdf.set_font('Arial','B',16)
pdf.multi_cell(40,10,'Hello World!,how are you today',1,0)
pdf.multi_cell(100,10,'This cell needs to beside the other',1,0)
pdf.output('tuto1.pdf','F')
webbrowser.open_new('tuto1.pdf')
You will have to keep track of x and y coordinates:
from fpdf import FPDF
import webbrowser
pdf=FPDF()
pdf.add_page()
pdf.set_font('Arial','B',16)
# Save top coordinate
top = pdf.y
# Calculate x position of next cell
offset = pdf.x + 40
pdf.multi_cell(40,10,'Hello World!,how are you today',1,0)
# Reset y coordinate
pdf.y = top
# Move to computed offset
pdf.x = offset
pdf.multi_cell(100,10,'This cell needs to beside the other',1,0)
pdf.output('tuto1.pdf','F')
webbrowser.open_new('tuto1.pdf')
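The same bookkeeping generalizes to a full row of cells. Here is a minimal sketch in plain Python, assuming a left margin of 10 and column widths that you choose yourself. The helper column_offsets is made up for illustration and is not part of fpdf. Each x value is where the matching multi_cell would start after y is reset to the top of the row.

```python
def column_offsets(left_margin, widths):
    """Return the x coordinate at which each column of a row starts."""
    offsets = []
    x = left_margin
    for w in widths:
        offsets.append(x)
        x += w  # the next column begins where this one ends
    return offsets


# Assumed values: a left margin of 10 and the two widths from the example
print(column_offsets(10, [40, 100]))
```

In a loop over the cells you would set pdf.y back to the saved top and pdf.x to the offset before each multi_cell call, and remember the largest pdf.y reached so the next row can start below the tallest cell.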
NOTE : This is an alternate approach to the above problem
For my requirement, I need some of the columns to be wider and others narrower. A fixed width for all the columns makes my table run past the PDF margin, and multi_cell wraps the text and increases the height only for certain columns, not for every column. So my approach was to use enumerate and adjust the column width dynamically according to the column index, as shown below.
data1 = list(csv.reader(csvfile))
print(data1) ## [[row1],[row2],[row3],[row4]]
## row1, row2, etc. are the rows from the csv
for row in data1:
    for x,y in enumerate(row):
        if (x == 0) or (x == 1): ## dynamically change the column width with certain conditions
            pdf.cell(2.0, 0.15, str(y), align = 'C', border=1) ## width = 2.0
        else:
            pdf.cell(0.60, 0.15, str(y), align = 'L', border=1) ## width = 0.60
    pdf.ln(0.15) ## move to the start of the next row
Hope this helps.
