Selecting data for a specific coordinate, openpyxl - python-3.x

First, a sample of my data:
I need to plot a line chart with years on the X axis and people's height on the Y axis:
But this is what I get instead:
To plot the chart I am using the code below:
from openpyxl.chart import LineChart, Reference
chart = LineChart()
chart.title='Average height'
chart.y_axis.title = 'Height'
chart.x_axis.title = 'Year'
values = Reference(working_sheet, min_col = 7, min_row = 1, max_col=9, max_row = 102)
chart.add_data(values, titles_from_data=True)
working_sheet.add_chart(chart, 'K2')
So I need to know how to select the data for a specific coordinate (axis) using openpyxl.
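One possible approach, sketched under assumptions: the real worksheet layout is not visible here, so the example builds a toy sheet with years in column A and heights in columns B and C (the column positions, header names and numbers are all made up for illustration). The data Reference covers only the value columns, and a second Reference passed to chart.set_categories() supplies the X-axis labels.

```python
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

# Toy worksheet; layout and numbers are illustrative assumptions,
# not the asker's real data.
wb = Workbook()
ws = wb.active
ws.append(["Year", "Men", "Women"])
for row in [(1990, 172.1, 160.3), (2000, 173.4, 161.0),
            (2010, 174.8, 161.9), (2020, 175.2, 162.4)]:
    ws.append(row)

chart = LineChart()
chart.title = "Average height"
chart.y_axis.title = "Height"
chart.x_axis.title = "Year"

# Y values: columns B..C, header row included so it becomes the series titles.
values = Reference(ws, min_col=2, min_row=1, max_col=3, max_row=ws.max_row)
chart.add_data(values, titles_from_data=True)

# X labels: column A only, header row excluded.
years = Reference(ws, min_col=1, min_row=2, max_row=ws.max_row)
chart.set_categories(years)

ws.add_chart(chart, "E2")
# wb.save("heights.xlsx")  # uncomment to write the file
```

With the original sheet, the same idea would mean narrowing min_col/max_col to the height columns only and pointing the categories Reference at the year column.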

Related

Remove margins around subplots in Plotly

I have a plot made up of 3 choropleth subplots next to each other. I set the overall height and width to my desired dimensions (800 x 400 pixels). I want each subplot to go from top to bottom, but as it stands, the subplots retain the aspect ratio of 2:1, meaning I have wide margins at top and bottom. Those I want to remove.
As a minimal example, here are the data and the plotting code:
The toy dataset:
import geopandas as gpd
from shapely.geometry.polygon import Polygon
minidf = gpd.GeoDataFrame(dict(
    krs_code = ["08111", "08118"],
    m_rugged = [42.795776, 37.324421],
    bip = [83747, 43122],
    cm3_over_1999 = [47.454688, 47.545940],
    geometry = [Polygon(((9.0397, 48.6873),
                         (9.0397, 48.8557),
                         (9.3152, 48.8557),
                         (9.3152, 48.6873),
                         (9.0397, 48.6873))),
                Polygon(((8.8757, 48.7536),
                         (8.8757, 49.0643),
                         (9.4167, 49.0643),
                         (9.4167, 48.7536),
                         (8.8757, 48.7536)))]
)).set_index("krs_code")
The plotting code:
import json
from plotly.subplots import make_subplots
import plotly.graph_objects as go
fig = make_subplots(rows = 1, cols = 3,
                    specs = [[{"type": "choropleth"}, {"type": "choropleth"}, {"type": "choropleth"}]],
                    horizontal_spacing = 0.0025 )
fig.update_layout(height = 400, width = 800,
                  margin = dict(t=0, r=0, b=0, l=0),
                  coloraxis_showscale=False )
for i, column in enumerate(["m_rugged", "cm3_over_1999", "bip"]):
    fig.add_trace(
        go.Choropleth(
            locations = minidf.index,
            z = minidf[column].astype(float), # Data to be color-coded
            geojson = json.loads(minidf[["geometry"]].to_json()),
            showscale = False
        ),
        col = i+1, row = 1)
fig.update_geos(fitbounds="locations", visible=True)
fig.show()
Notice the margins at top and bottom, which retain the aspect ratio of each subplot, while they are supposed to stretch from top to bottom:
I tried several parameters within go.Choropleth() and .update_layout(), but to no avail.

Plotly : How to enable text label in line graph for the last value?

I am trying to build a line graph that shows the value of only the last data point of each line, nicely formatted.
(Image: line graph with no text at end)
The current method shows text for every data point, as plain text that collides with the other lines in the same graph and looks cluttered.
It would be great to achieve something like the image below.
(Image: desired line graph with text)
Toggling each end-of-line marker together with its line in the legend is now handled through:
legendgroup = d.name
Plot 1: All traces shown
Plot 2: Deselect GOOG in the legend and see that the marker disappears as well:
Complete code:
# imports
import pandas as pd
import plotly.express as px
# data
df = px.data.stocks()
df = df.drop('AMZN', axis = 1)
colors = px.colors.qualitative.T10
# plotly
fig = px.line(df,
              x = 'date',
              y = [c for c in df.columns if c != 'date'],
              template = 'plotly_dark',
              color_discrete_sequence = colors,
              title = 'Stocks',
              )
# move legend
fig.layout.legend.x = -0.3
# add traces for annotations and text for end of lines
for i, d in enumerate(fig.data):
    fig.add_scatter(x=[d.x[-1]], y = [d.y[-1]],
                    mode = 'markers+text',
                    text = d.y[-1],
                    textfont = dict(color=d.line.color),
                    textposition='middle right',
                    marker = dict(color = d.line.color, size = 12),
                    legendgroup = d.name,
                    showlegend=False)
fig.show()

How to update the date range on X-Axis with python-pptx

I have a multi-line chart whose data I'm trying to update. I can change the data for the series (1 to 5 in my case) using a dataframe, but I can't figure out how to change the range of the category axis.
Currently the date range starts in 2010; I can't figure out how to update it dynamically based on the input data.
My chart is as shown below:
My chart data is as below:
My code is as below:
import pandas as pd
from pptx import Presentation
from pptx.chart.data import CategoryChartData, ChartData
df = pd.DataFrame({
    'Date':['2010-01-01','2010-02-01','2010-03-01','2010-04-01','2010-05-01'],
    'Series 1': [0.262918, 0.259484,0.263314,0.262108,0.252113],
    'Series 2': [0.372340,0.368741,0.375740,0.386040,0.388732],
    'Series 3': [0.109422,0.109256,0.112426,0.123932,0.136620],
    'Series 4': [0.109422,0.109256,0.112426,0.123932,0.136620], # copy of series 3 for easy testing
    'Series 5': [0.109422,0.109256,0.112426,0.123932,0.136620], # copy of series 3 for easy testing
})
prs = Presentation(presentation_path)
def update_multiline(chart,df):
    plot = chart.plots[0]
    category_labels = [c.label for c in plot.categories]
    # series = plot.series[0]
    chart_data = CategoryChartData()
    chart_data.categories = [c.label for c in plot.categories]
    category_axis = chart.category_axis
    category_axis.minimum_scale = 1 # this should be a date
    category_axis.maximum_scale = 100 # this should be a date
    tick_labels = category_axis.tick_labels
    df = df.drop(columns=['Date'])
    for index in range(df.shape[1]):
        columnSeriesObj = df.iloc[:, index]
        chart_data.add_series(plot.series[index].name, columnSeriesObj)
    chart.replace_data(chart_data)
# ================================ slide index 3 =============================================
slide_3 = prs.slides[3]
slide_3_title = slide_3.shapes.title # assigning a title
graphic_frame = slide_3.shapes
# slide has only one chart and that's the 3rd shape, hence graphic_frame[2]
slide_3_chart = graphic_frame[2].chart
update_multiline(slide_3_chart, df)
prs.save(output_path)
How do I update the date range if the dates in my dataframe start from, say, 2015, i.e. 'Date':['2015-01-01','2015-02-01','2015-03-01','2015-04-01','2015-05-01']?
You are simply copying the categories of the old chart into the new chart with:
chart_data.categories = [c.label for c in plot.categories]
You must draw the category labels from the dataframe if you expect them to change.

Adding labels to bokeh pie chart wedge

I am new to Bokeh and want to render a pie chart using a Bokeh figure.
I used the reference at https://docs.bokeh.org/en/latest/docs/gallery/pie_chart.html to create my pie chart figure.
Now I need to add a label to each part of the pie chart showing that part's percentage, and the label should be centred within the part.
I could not find a simple way to do it in the documentation, so I tried to find ways to do it manually, like this example: Adding labels in pie chart wedge in bokeh
I tried to create a label set and add the layout to the plot, but I could not figure out whether there is a way to control the label position, size, and font. text_align (right, left, center) does not do the job for me.
Here is my code. This function creates and returns the HTML of the pie chart.
The chart argument contains the relevant data for the chart. In this case chart.series is a tuple (size 1), and series[0] contains the name of the series (series.title), a list of x values (series.x), and a list of y values (series.y).
def render_piechart(self, chart):
    """
    Renders PieChart object using Bokeh
    :param chart: Pie chart
    :return:
    """
    series = chart.series[0]
    data_dict = dict(zip(series.x, series.y))
    data = pd.Series(data_dict).reset_index(name='value').rename(columns={'index': 'Category'})
    data['angle'] = data['value'] / data['value'].sum() * 2 * pi
    data['color'] = palette[:len(series.x)]
    data['percentage'] = data['value'] / data['value'].sum() * 100
    data['percentage'] = data['percentage'].apply(lambda x: str(round(x, 2)) + '%')
    TOOLTIPS = [('Category', '@Category'), ('Value', '@value'), ('Percentage', '@percentage')]
    fig = figure(title=series.title,
                 plot_width=400 if chart.sizehint == 'medium' else 600,
                 plot_height=350 if chart.sizehint == 'medium' else 450,
                 tools='hover', tooltips=TOOLTIPS, x_range=(-0.5, 1.0))
    fig.wedge(x=0, y=1, radius=0.45, start_angle=cumsum('angle', include_zero=True),
              end_angle=cumsum('angle'), line_color='white', fill_color='color',
              legend='Category', source=data)
    fig.title.text_font_size = '20pt'
    source = ColumnDataSource(data)
    labels = LabelSet(x=0, y=1, text='percentage', level='glyph', angle=cumsum('angle', include_zero=True),
                      source=source, render_mode='canvas')
    fig.add_layout(labels)
    fig.axis.axis_label = None
    fig.axis.visible = False
    fig.grid.grid_line_color = None
    return bokeh.embed.file_html(fig, bokeh.resources.CDN)
And these are the results:
(Image: pie chart consisting of 3 parts)
(Image: pie chart consisting of 10 parts)
In both examples the series title is 'kuku'.
x and y values for the first example:
x=["A", "B", "C"]
y=[10, 20, 30]
and for the second example:
x=["A", "B", "C", "D", "E", "F", "G", "H", "I"]
y=[10, 20, 30, 100, 90, 80, 70, 60, 30 , 40 ,50]
I know that in the past I could do this easily with Donut, but it is deprecated.
I want to be able to get something like this:
(Image: example1)
or this: (Image: example2)
The problem, as you understand, is here:
labels = LabelSet(x=0, y=1, text='percentage', level='glyph', angle=cumsum('angle', include_zero=True), source=source, render_mode='canvas')
Creating labels in Bokeh is a bit confusing, but it can be done:
Add columns such as 'text_pos_x' and 'text_pos_y' to your data and fill each row with the coordinates where that row's text should be placed. Then pass them to LabelSet as x='text_pos_x' and y='text_pos_y', so that every part of the plot has its own label coordinates:
labels = LabelSet(x='text_pos_x', y='text_pos_y', text='percentage', level='glyph', angle=0, source=source, render_mode='canvas')
And yes, it's necessary to set angle=0 to keep the text from being rotated.
To complete @Higem's answer, here is a formula for centring the labels correctly on the pie chart. I modified your code as follows:
def render_piechart(self, chart):
    """
    Renders PieChart object using Bokeh
    :param chart: Pie chart
    :return:
    """
    radius = 0.45 # Radius of your pie chart
    series = chart.series[0]
    data_dict = dict(zip(series.x, series.y))
    data = pd.Series(data_dict).reset_index(name='value').rename(columns={'index': 'Category'})
    data['angle'] = data['value'] / data['value'].sum() * 2 * pi
    data['color'] = palette[:len(series.x)]
    data['percentage'] = data['value'] / data['value'].sum() * 100
    data['percentage'] = data['percentage'].apply(lambda x: str(round(x, 2)) + '%')
    # Projection on X and Y axis for label positioning
    data['label_x_pos'] = np.cos(data['angle'].cumsum()-data['angle'].div(2))*3*radius/4
    data['label_y_pos'] = np.sin(data['angle'].cumsum()-data['angle'].div(2))*3*radius/4
    TOOLTIPS = [('Category', '@Category'), ('Value', '@value'), ('Percentage', '@percentage')]
    fig = figure(title=series.title,
                 plot_width=400 if chart.sizehint == 'medium' else 600,
                 plot_height=350 if chart.sizehint == 'medium' else 450,
                 tools='hover', tooltips=TOOLTIPS, x_range=(-0.5, 1.0))
    fig.wedge(x=0, y=0, radius=radius, start_angle=cumsum('angle', include_zero=True),
              end_angle=cumsum('angle'), line_color='white', fill_color='color',
              legend='Category', source=data) # Change center of the pie chart to (0, 0)
    fig.title.text_font_size = '20pt'
    source = ColumnDataSource(data)
    labels = LabelSet(x='label_x_pos', y='label_y_pos', text='percentage', level='glyph', text_align='center', source=source, render_mode='canvas')
    fig.add_layout(labels)
    fig.axis.axis_label = None
    fig.axis.visible = False
    fig.grid.grid_line_color = None
    return bokeh.embed.file_html(fig, bokeh.resources.CDN)
The result is the following:
I used the basic formula to convert polar coordinates to cartesian coordinates, see Wikipedia.
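To make the formula concrete, here is a small standalone sketch of the same computation in plain Python, without Bokeh or pandas. The function name label_positions is hypothetical; it assumes the pie is centred on (0, 0) and places each label at three quarters of the radius along the bisector of its wedge, as in the code above.

```python
from math import cos, pi, sin

def label_positions(values, radius=0.45):
    """Return (x, y) label centres for a pie chart centred on (0, 0)."""
    total = sum(values)
    label_radius = 3 * radius / 4          # labels sit at 3/4 of the radius
    positions = []
    end_angle = 0.0
    for value in values:
        angle = value / total * 2 * pi     # angular size of this wedge
        end_angle += angle                 # running cumulative sum
        mid_angle = end_angle - angle / 2  # bisector of the wedge
        # Polar (label_radius, mid_angle) -> cartesian (x, y)
        positions.append((label_radius * cos(mid_angle),
                          label_radius * sin(mid_angle)))
    return positions

positions = label_positions([10, 20, 30])
```

For the values 10, 20, 30 the wedge bisectors fall at π/6, 2π/3 and 3π/2, so the third label ends up directly below the centre.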

How to give a chart a title with iterator value in a loop?

I would like to add a title (from a list titles) to each of my charts in a loop, but the loop is failing when I try to add a text box to the chart (with iterator). Are iterator values allowed to serve as titles for charts? I'm using Python 3.6. Any ideas?
cleaned = [['AF In', 0.281777948141098], ['AF Top', 0.11941492557525635], ['AF View', 12.46446630278714], ['AF Non', 'AF V', 0.6745020151138306, 0.6817344427108765], ['AF Cab', 'AF N', 'AF HB', 0.6878681182861328, 0.2603790760040283, 0.05175277590751648]]
titles = ['first', 'second', 'third', 'fourth', 'fifth']
from pptx import Presentation
from pptx.chart.data import ChartData
from pptx.chart.data import XyChartData
from pptx.enum.chart import XL_CHART_TYPE
from pptx.util import Inches,Pt
from pptx.enum.chart import XL_LABEL_POSITION
from pptx.dml.color import RGBColor
from pptx.dml import fill
from pptx.chart.chart import ChartTitle
from pptx.chart.chart import Chart
# create presentation
prs = Presentation()
#For each list items from query_results (cleaned)
for idx, (chart_result, t) in enumerate(zip(cleaned, titles)):
    #define size of chart
    x, y, cx, cy = Inches(2), Inches(2), Inches(6), Inches(4.5)
    try:
        #split the results into two arrays
        arr1, arr2 = np.array_split(chart_result,2)
        #create a table for chart data
        chart_data = CategoryChartData()
        #assign arr1 to categories
        chart_data.categories = arr1
        #assign arr2 to series
        chart_data.add_series('series_1',(float(x) for x in arr2))
        #add slide
        prs.slides.add_slide(prs.slide_layouts[6])
        #add chart to slide by index
        prs.slides[idx].shapes.add_chart(XL_CHART_TYPE.COLUMN_CLUSTERED, x, y, cx, cy, chart_data)
        #chart = shapes[0].chart
        chart.chart_title.text_frame.text = t
    except:
        print(f'chart load failed {idx}')
Make sure you actually have a reference to the chart object; I don't see the lines that would give that to you in your code. I recommend:
chart = prs.slides[idx].shapes.add_chart(...).chart
The object returned by .add_chart() is a graphic-frame shape. This shape contains the chart, which is obtained using the .chart property.
Then your assignment in the last line of the try block should succeed.
I figured out that the code I had was iterating through the slides twice.
I replaced the following code:
#add slide
prs.slides.add_slide(prs.slide_layouts[6])
#add chart to slide by index
prs.slides[idx].shapes.add_chart(XL_CHART_TYPE.COLUMN_CLUSTERED, x, y, cx, cy, chart_data)
#chart = shapes[0].chart
chart.chart_title.text_frame.text = t
With the following code:
#add slide
slide = prs.slides.add_slide(prs.slide_layouts[6])
#add chart to slide with title
chart = slide.shapes.add_chart(XL_CHART_TYPE.COLUMN_CLUSTERED, x, y, cx, cy, chart_data).chart
chart.chart_title.text_frame.text = t
I think the prs.slides[idx] was messing it up.
