How to get MultiCells in Pyfpdf Side by side? - python-3.x

I am making a table of about 10 cells with headings in them. They will not fit across the page unless I use the multi_cell option. However, I can't figure out how to get multi_cells side by side. When I make a new one, it automatically goes to the next line.
from fpdf import FPDF
import webbrowser
pdf=FPDF()
pdf.add_page()
pdf.set_font('Arial','B',16)
pdf.multi_cell(40,10,'Hello World!,how are you today',1,0)
pdf.multi_cell(100,10,'This cell needs to beside the other',1,0)
pdf.output('tuto1.pdf','F')
webbrowser.open_new('tuto1.pdf')

You will have to keep track of x and y coordinates:
from fpdf import FPDF
import webbrowser
pdf=FPDF()
pdf.add_page()
pdf.set_font('Arial','B',16)
# Save top coordinate
top = pdf.y
# Calculate x position of next cell
offset = pdf.x + 40
pdf.multi_cell(40,10,'Hello World!,how are you today',1,0)
# Reset y coordinate
pdf.y = top
# Move to computed offset
pdf.x = offset
pdf.multi_cell(100,10,'This cell needs to beside the other',1,0)
pdf.output('tuto1.pdf','F')
webbrowser.open_new('tuto1.pdf')
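If there are more than two cells, the same bookkeeping can be wrapped in a small helper. This is only a sketch: multi_cell_row is a made-up name, it assumes the pdf object offers get_x, get_y, set_xy and multi_cell the way PyFPDF's FPDF class does, and it does not handle a page break in the middle of a row.

```python
def multi_cell_row(pdf, cells, line_height, border=1):
    """Draw several multi_cells side by side, then move below the tallest one.

    `pdf` is assumed to behave like fpdf.FPDF.
    `cells` is a list of (width, text) pairs (hypothetical input format).
    """
    left = pdf.get_x()
    top = pdf.get_y()
    x = left
    bottom = top
    for width, text in cells:
        # Jump back to the top of the row, just right of the previous cell
        pdf.set_xy(x, top)
        pdf.multi_cell(width, line_height, text, border)
        # multi_cell leaves the cursor below the cell it just drew
        bottom = max(bottom, pdf.get_y())
        x += width
    # Leave the cursor at the left edge, under the tallest cell of the row
    pdf.set_xy(left, bottom)
    return bottom
```

With PyFPDF the call would look like multi_cell_row(pdf, [(40, 'Hello World!,how are you today'), (100, 'This cell needs to beside the other')], 10).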

NOTE: This is an alternative approach to the problem above.
For my requirement, I need some columns to be wider and others narrower. A fixed width for every column pushes my table past the PDF margin, and multi_cell wraps the text and increases the height of only the wrapped cells, not of every cell in the row. So my approach was to use the enumerate function and choose the column width by column index, as shown below.
import csv
## csvfile is assumed to be an already-open CSV file and pdf an existing FPDF object
data1 = list(csv.reader(csvfile))
print(data1)  ## [[row1], [row2], [row3], [row4]]
## row1, row2, etc. are the rows from the csv
for row in data1:
    for x, y in enumerate(row):
        if x in (0, 1):  ## choose the column width by column index
            pdf.cell(2.0, 0.15, str(y), align='C', border=1)  ## width = 2.0
        else:
            pdf.cell(0.60, 0.15, str(y), align='L', border=1)  ## width = 0.60
    pdf.ln(0.15)  ## move to the start of the next row
Hope this helps.
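Before drawing, it can help to check that the chosen widths actually fit between the margins. A rough sketch, where column_widths and fits are made-up helper names and the page size and margins (in inches) are assumptions, not values from the post:

```python
def column_widths(n_columns, wide_columns=(0, 1), wide=2.0, narrow=0.60):
    """Return one width per column: `wide` for the listed indexes, else `narrow`."""
    return [wide if i in wide_columns else narrow for i in range(n_columns)]


def fits(widths, page_width, left_margin, right_margin):
    """True if the row is no wider than the printable area of the page."""
    return sum(widths) <= page_width - left_margin - right_margin


# Assumed page: 8.5 in wide with 0.4 in margins on each side
print(fits(column_widths(6), 8.5, 0.4, 0.4))
print(fits(column_widths(10), 8.5, 0.4, 0.4))
```

If the check fails, either the narrow width has to shrink or the text has to wrap.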

Issues when freezing wx.Grid columns with FreezeTo: bad cell position in grid, and HOME keyboard key behavior

Context
I have set up column freezing in a wx.grid.Grid, using the FreezeTo method.
def __init__(self, parent):
    # relevant lines
    self.grid = wx.grid.Grid(self.sbox_grid, size=(1000, 800))
    self.grid.CreateGrid(self.row_number, self.col_number)
    self.grid.FreezeTo(0, self.frozen_column_number)
The freezing itself works well, as long as I keep the standard label renderer (*).
The first few columns I have frozen always stay visible, and moving the horizontal scrollbar by hand is also ok.
(*) I was initially using the GridWithLabelRenderersMixin of wx.lib.mixins.gridlabelrenderer, but it totally breaks consistency between column label width and column width. Anyway I can deal with the standard renderer, so it is not really a problem.
I faced several issues, now all solved and detailed below.
Capture the cell position for frozen columns: cells or labels (SOLVED)
For cells, the window can be captured with GetFrozenColGridWindow.
So mouseover can be done simply with:
if widget == self.grid.GetFrozenColGridWindow():
    (x, y) = self.grid.CalcUnscrolledPosition(event.GetX(), event.GetY())
    row = self.grid.YToRow(y)
    col = self.grid.XToCol(x)
    # do whatever you want with row, col
For labels, the window exists but is NOT accessible with a method.
With a GetChildren on the grid, I have found that it is the last of the list (corresponding to the latest defined).
So it is not very reliable, but a relatively good placeholder for the missing GetGridFrozenColLabelWindow method.
wlist = self.grid.GetChildren()
frozen_col_label_window = wlist[-1]
if widget == frozen_col_label_window:
    x = event.GetX()
    y = event.GetY()
    col = self.grid.XToCol(x, y)
    # do stuff with col
Mouse position from non-frozen columns (labels or cells) is shifted (SOLVED)
The effective position for non-frozen column labels or cells is shifted by the total width of all the frozen columns.
This one is easily handled by adding that offset to the position before calling the YToRow or XToCol methods.
The following code shows the position corrections:
class Report(wx.Panel):
    def _freeze_x_shift(self):
        """Returns the horizontal position offset induced by columns freeze"""
        offset = 0
        for col in range(self.frozen_column_number):
            offset += self.grid.GetColSize(col)
        return offset
    def on_mouse_over(self, event):
        widget = event.GetEventObject()
        # grid header
        if widget == self.grid.GetGridColLabelWindow():
            x = event.GetX()
            y = event.GetY()
            x += self._freeze_x_shift()  # <-- position correction here
            col = self.grid.XToCol(x, y)
            # do whatever grid processing using col value
        # grid cells
        elif widget == self.grid.GetGridWindow():
            (x, y) = self.grid.CalcUnscrolledPosition(event.GetX(), event.GetY())
            x += self._freeze_x_shift()  # <-- and also here
            row = self.grid.YToRow(y)
            col = self.grid.XToCol(x)
            # do whatever grid cell processing using row and col values
        event.Skip()
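To see why the shift is needed, here is a toy model of the lookup in plain Python. It does not use wx at all: x_to_col is only a simplified stand-in for what XToCol does, the column sizes are invented, and x is assumed to be already unscrolled.

```python
def frozen_width(col_sizes, frozen_count):
    """Total pixel width of the first `frozen_count` columns."""
    return sum(col_sizes[:frozen_count])


def x_to_col(col_sizes, x):
    """Map an absolute x position to a column index, or -1 if outside the grid."""
    if x < 0:
        return -1
    right_edge = 0
    for col, size in enumerate(col_sizes):
        right_edge += size
        if x < right_edge:
            return col
    return -1


def col_under_mouse(col_sizes, frozen_count, x_in_window):
    """Column under the mouse when x is measured inside the non-frozen window."""
    shift = frozen_width(col_sizes, frozen_count)
    return x_to_col(col_sizes, x_in_window + shift)


sizes = [80, 80, 100, 100, 100]  # invented widths, first two columns frozen
print(x_to_col(sizes, 10))            # uncorrected: lands in a frozen column
print(col_under_mouse(sizes, 2, 10))  # corrected: first non-frozen column
```

The non-frozen window starts right after the frozen columns, so its x = 0 corresponds to the grid's x = total frozen width.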
HOME keyboard key not working as intended (SOLVED)
I generally use the HOME key to jump to the far left of the grid, and the END key to jump to the far right. This is the normal behavior with a non-frozen grid.
The END key does its job, but not the HOME key.
When pressing HOME on any grid cell, I got two effects:
the selected cell becomes the one in the first column: this is OK
but the scrollbar position is not changed at all: I would expect the scroll position to be fully left
I corrected it with a simple remapping using the EVT_KEY_DOWN event:
def __init__(self, parent):
    self.grid.Bind(wx.EVT_KEY_DOWN, self.on_key_event)
def on_key_event(self, event):
    """Remap the HOME key so it scrolls the grid to the left, as it did without the frozen columns
    :param event: wx.EVT_KEY_DOWN event on the grid
    :return:
    """
    key_code = event.GetKeyCode()
    if key_code == wx.WXK_HOME:
        self.grid.Scroll(0, -1)  # x = 0 scrolls fully left, -1 keeps the vertical position
    event.Skip()
My 3 issues concerning column freezing in a grid are now solved. I have edited my initial post with my solutions.

How to group-by twice, preserve original columns, and plot

I have the following data sets (only sample is shown):
I want to find the most impactful exercise per area and then plot it via Seaborn barplot.
I use the following code to do so.
# Create Dataset Using Only Area, Exercise and Impact Level Categories
CA_data = Data[['area', 'exercise', 'impact level']]
# Compute Mean Impact Level per Exercise per Area
mean_il_CA = CA_data.groupby(['area', 'exercise'])['impact level'].mean().reset_index()
mean_il_CA_hello = mean_il_CA.groupby('area')['impact level'].max().reset_index()
# Plot
cx = sns.barplot(x="impact level", y="area", data=mean_il_CA_hello)
plt.title('Most Impactful Exercises Considering Area')
plt.show()
The resulting dataset is:
This means that when I plot, only the area label appears on the y axis, NOT 'area label' + 'exercise label' as I would like.
How do I reinsert the 'exercise' column into my final dataset?
How do I get both the name of the area and the exercise on the y axis of the plot?
The problem of losing the values of 'exercise' when grouping by the maximum of 'area' can be solved by keeping the MultiIndex (i.e. not using reset_index) and using .transform to create a boolean mask to select the appropriate full rows of mean_il_CA that contain the maximum 'impact level' values per 'area'. This solution is based on the code provided in this answer by unutbu. The full labels for the bar chart can be created by concatenating the labels of 'area' and 'exercise'.
Here is an example using the titanic dataset from the seaborn package. The variables 'class', 'embark_town', and 'fare' are used in place of 'area', 'exercise', and 'impact level'. The categorical variables both contain three unique values: 'First', 'Second', 'Third', and 'Cherbourg', 'Queenstown', 'Southampton'.
import pandas as pd # v 1.2.5
import seaborn as sns # v 0.11.1
df = sns.load_dataset('titanic')
data = df[['class', 'embark_town', 'fare']]
data.head()
data_mean = data.groupby(['class', 'embark_town'])['fare'].mean()
data_mean
# Select max values in each class and create concatenated labels
mask_max = data_mean.groupby(level=0).transform(lambda x: x == x.max())
data_mean_max = data_mean[mask_max].reset_index()
data_mean_max['class, embark_town'] = data_mean_max['class'].astype(str) + ', ' \
                                      + data_mean_max['embark_town']
data_mean_max
# Draw seaborn bar chart
sns.barplot(data=data_mean_max,
            x=data_mean_max['fare'],
            y=data_mean_max['class, embark_town'])
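An alternative to the boolean mask, sketched here with the question's own column names, is to look up the row of each maximum with idxmax. The sample rows are invented for illustration; only the column names come from the question. Note that idxmax keeps a single row per area when there is a tie, whereas the mask above keeps all tied rows.

```python
import pandas as pd

# Invented sample standing in for the asker's data
data = pd.DataFrame({
    "area": ["north", "north", "north", "south", "south"],
    "exercise": ["squat", "squat", "plank", "plank", "lunge"],
    "impact level": [2.0, 4.0, 5.0, 1.0, 3.0],
})

# Mean impact level per exercise per area
mean_il = data.groupby(["area", "exercise"])["impact level"].mean().reset_index()

# Row label of the highest mean within each area
best_rows = mean_il.groupby("area")["impact level"].idxmax()

# Select the full rows, so the 'exercise' column is preserved
best = mean_il.loc[best_rows].reset_index(drop=True)
best["label"] = best["area"] + ", " + best["exercise"]
print(best)
```

The resulting frame can then go to sns.barplot with x="impact level" and y="label".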

Plot multi label (values) with multi bar chart

I have an issue that I hope you can help with.
I have this data:
to_stack = pd.DataFrame([['CHILDREN', 0.42806248287201976, 0.0],
                         ['AMT_TOTAL', 165006, 179357],
                         ['SAL', 582065, 703917.0],
                         ['ANNUITY', 26851, 28416]], columns=('Variable', 'Id', 'Mean'))
When I run the code below
to_stack.plot.barh(x='Variable', figsize=(12,8), width = .9)
## First loop for the first variable "Id"
for index, value in enumerate(to_stack['Id']):
    plt.text(value, index, str(value), va='top')
## Second loop for the second variable "Mean"
for i, val in enumerate(to_stack['Mean']):
    plt.text(val, i, str(val), va='bottom')
I get this result
The values are not centered on their bars.
I've tried several options of plt.text (ha: center, left, right; va: top, bottom, baseline) without good results. Sometimes it's even worse and the values overlap each other.
How can we get the values aligned with the bars?
Any ideas are really welcome.
It's better to extract the information from the bars and annotate them. That way, you have more control over how the text appears relative to the bars:
fig, ax = plt.subplots(figsize=(12,8))
to_stack.plot.barh(x='Variable', width=.9, ax=ax)
for patch in ax.patches:
    # each patch is one bar: its width is the value, y and height give its vertical extent
    w, h = patch.get_width(), patch.get_height()
    y = patch.get_y()
    # place the text at the end of the bar, vertically centered on it
    ax.text(w - 0.1, h/2 + y, f'{w:.3f}', va='center')
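If your Matplotlib is 3.4 or newer, Axes.bar_label can do the same placement for you. A small sketch, where label_bars is a made-up wrapper name and ax is assumed to be the Axes that the barh plot was drawn on:

```python
def label_bars(ax, fmt="%.3f", padding=3):
    """Label every bar on `ax` with its value via Axes.bar_label.

    `ax` is assumed to be a matplotlib Axes (Matplotlib 3.4 or newer).
    """
    texts = []
    # One container per plotted column (here: 'Id' and 'Mean')
    for container in ax.containers:
        # bar_label puts one text at the end of each bar in the container
        texts.extend(ax.bar_label(container, fmt=fmt, padding=padding))
    return texts
```

It would be called as label_bars(ax) right after to_stack.plot.barh(x='Variable', width=.9, ax=ax).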

How to label line chart with column from pandas dataframe (from 3rd column values)?

I have a data set I filtered to the following (sample data):
Name Time l
1 1.129 1G-d
1 0.113 1G-a
1 3.374 1B-b
1 3.367 1B-c
1 3.374 1B-d
2 3.355 1B-e
2 3.361 1B-a
3 1.129 1G-a
I got this data after filtering the data frame and converting it to CSV file:
# Assigns the new data frame to "df" with the data from only three columns
header = ['Names','Time','l']
df = pd.DataFrame(df_2, columns = header)
# Sorts the data frame by column "Names" as integers
df.Names = df.Names.astype(int)
df = df.sort_values(by=['Names'])
# Changes the data to match format after converting it to int
df.Time=df.Time.astype(int)
df.Time = df.Time/1000
csv_file = df.to_csv(index=False, columns=header, sep=" " )
Now, I am trying to graph lines for each label column data/items with markers.
I want the column l as my line names (labels) - each as a new line, Time as my Y-axis values and Names as my X-axis values.
So, in this case, I would have 7 different lines in the graph with these labels: 1G-d, 1G-a, 1B-b, 1B-c, 1B-d, 1B-e, 1B-a.
I have done the following so far which is the additional settings, but I am not sure how to graph the lines.
plt.xlim(0, 60)
plt.ylim(0, 18)
plt.legend(loc='best')
plt.show()
I used sns.lineplot, which comes with hue, but I do not want a title in the legend box. Also, in that case, I cannot have markers without adding a new column for style.
I also tried plt.plot, but in that case I am not sure how to draw more than one line. I can only give x and y values, which creates only one line.
If there's any other source, please let me know below.
Thanks
The final graph I want to have is like the following but with markers:
You can apply a few tweaks to seaborn's lineplot. Using some created data since your sample isn't really long enough to demonstrate:
# Create data
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
np.random.seed(2019)
categories = ['1G-d', '1G-a', '1B-b', '1B-c', '1B-d', '1B-e', '1B-a']
df = pd.DataFrame({'Name': np.repeat(range(1, 11), 10),
                   'Time': np.random.randn(100).cumsum(),
                   'l': np.random.choice(categories, 100)
                   })
# Plot
sns.lineplot(data=df, x='Name', y='Time', hue='l', style='l', dashes=False,
             markers=True, ci=None, err_style=None)
# Temporarily removing limits based on sample data
#plt.xlim(0, 60)
#plt.ylim(0, 18)
# Remove seaborn legend title & set new title (if desired)
# Note: this assumes a seaborn version that lists the title as the first legend entry;
# if yours does not, keep all handles and labels instead of slicing with [1:]
ax = plt.gca()
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles=handles[1:], labels=labels[1:], title='New Title', loc='best')
plt.show()
To apply markers, you have to specify a style variable. This can be the same as hue.
You likely want to remove dashes, ci, and err_style
To remove the seaborn legend title, you can get the handles and labels, then re-add the legend without the first handle and label. You can also specify the location here and set a new title if desired (or just remove title=... for no title).
Edits per comments:
Filtering your data to only a subset of level categories can be done fairly easily via:
categories = ['1G-d', '1G-a', '1B-b', '1B-c', '1B-d', '1B-e', '1B-a']
df = df.loc[df['l'].isin(categories)]
markers=True will fail if there are too many levels. If you are only interested in marking points for aesthetic purposes, you can simply multiply a single marker by the number of categories you are interested in (which you have already created to filter your data to categories of interest): markers='o'*len(categories).
Alternatively, you can specify a custom dictionary to pass to the markers argument:
points = ['o', '*', 'v', '^']
mult = len(categories) // len(points) + (len(categories) % len(points) > 0)
markers = {key: value for (key, value)
           in zip(categories, points * mult)}
This will return a dictionary of category-point combinations, cycling over the marker points specified until each item in categories has a point style.
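The same cycling can be written a little more compactly with itertools.cycle, which avoids computing the multiplier by hand. A small sketch (assign_markers is just an illustrative name):

```python
from itertools import cycle


def assign_markers(categories, points=("o", "*", "v", "^")):
    """Pair each category with a marker, reusing markers once they run out."""
    # zip stops at the end of `categories`, so the endless cycle is safe here
    return dict(zip(categories, cycle(points)))


categories = ['1G-d', '1G-a', '1B-b', '1B-c', '1B-d', '1B-e', '1B-a']
markers = assign_markers(categories)
print(markers)
```

The resulting dictionary can be passed as markers=markers in the sns.lineplot call.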

Plot the distance between every two points in 2 D

If I have a table with three columns where the first column represents the name of each point, the second column represent numerical data (mean) and the last column represent (second column + fixed number). The following an example how is the data looks like:
I want to plot this table so I have the following figure
If it is possible how I can plot it using either Microsoft Excel or python or R (Bokeh).
Alright, I only know how to do it in ggplot2, so I will answer for R here.
This method only works if the data frame is in the format you provided above.
I renamed your columns to Name.of.Method, Mean, Mean.2.2
Preparation
Loading ggplot2 and the csv data into R
library(ggplot2)
df <- read.csv('yourdata.csv', sep = ',')
Change the column names (do this if you don't want to change the code below; otherwise you will need to go through each parameter to match your column names).
names(df) <- c("Name.of.Method", "Mean", "Mean.2.2")
Method 1 - Using geom_segment()
ggplot() +
  geom_segment(data = df, aes(x = Mean,
                              y = Name.of.Method,
                              xend = Mean.2.2,
                              yend = Name.of.Method))
As you can see, geom_segment lets us specify the end position of the line (hence xend and yend).
However, it does not look like the image you have above.
The line shape seems to represent an error bar, and ggplot provides an error bar function for that.
Method 2 - Using geom_errorbarh()
ggplot(df, aes(y = Name.of.Method, x = Mean)) +
  geom_errorbarh(aes(xmin = Mean, xmax = Mean.2.2), linetype = 1, height = .2)
Usually we don't use this method just to draw a line. However, its functionality fits your requirement. You can see that we use xmin and xmax to specify the head and the tail of the line.
The height input adjusts the height of the bar at both ends of the line.
I would use hbar for this:
from bokeh.io import show, output_file
from bokeh.plotting import figure
output_file("intervals.html")
names = ["SMB", "DB", "SB", "TB"]
# Note: Bokeh 3.0 and later use height= instead of plot_height=
p = figure(y_range=names, plot_height=350)
p.hbar(y=names, left=[4,3,2,1], right=[6.2, 5.2, 4.2, 3.2], height=0.3)
show(p)
However Whisker would also be an option if you really want whiskers instead of interval bars.
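The left and right lists above are typed in by hand. Since the third column of the table is described as the second column plus a fixed number, they could also be derived from the means. A sketch, assuming the offset is 2.2 (as the hard-coded values suggest) and using a made-up helper name:

```python
def intervals(means, offset):
    """Return (left, right) lists for hbar, where right = mean + offset."""
    left = list(means)
    # round() only tidies floating-point noise such as 3.2000000000000006
    right = [round(m + offset, 10) for m in means]
    return left, right


left, right = intervals([4, 3, 2, 1], 2.2)
print(left, right)
```

These two lists can then be passed as p.hbar(y=names, left=left, right=right, height=0.3).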
