Setting Altair FormatLocale isn't working - locale

import altair as alt
import pandas as pd
from urllib import request
import json

# fetch and enable the Brazilian (pt-BR) format and timeFormat locales
with request.urlopen('https://raw.githubusercontent.com/d3/d3-format/master/locale/pt-BR.json') as f:
    pt_format = json.load(f)
with request.urlopen('https://raw.githubusercontent.com/d3/d3-time-format/master/locale/pt-BR.json') as f:
    pt_time_format = json.load(f)

alt.renderers.set_embed_options(formatLocale=pt_format, timeFormatLocale=pt_time_format)

df = pd.DataFrame({
    'date': pd.date_range('2020-01-01', freq='M', periods=6),
    'revenue': [100000, 110000, 90000, 120000, 85000, 115000]
})

a = alt.Chart(df).mark_bar().encode(
    y='month(date):O',
    x=alt.X('revenue:Q', axis=alt.Axis(format='$,.0f'))
)
a.save('tst.html')
When I run the code, I expect the revenue to be formatted with "R$" as the prefix, but I'm still getting "$". I checked pt_format and can see "R$" for currency, as below:
{'decimal': ',', 'thousands': '.', 'grouping': [3], 'currency': ['R$', '']}
It seems alt.renderers.set_embed_options is not working. I have no clue; any help would be appreciated.

alt.renderers settings only apply to charts displayed by a renderer, e.g. in a Jupyter notebook; they do not affect charts saved to HTML via chart.save().
In this case you can pass the embed options directly to the save() command:
chart.save('chart.html', embed_options=dict(formatLocale=pt_format, timeFormatLocale=pt_time_format))
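As a sanity check of what the locale JSON changes, here is a rough pure-Python imitation of what d3-format does with `'$,.0f'` under the pt-BR definition. The `format_brl` helper is mine and purely illustrative; the real formatting happens in the browser via Vega-Embed:

```python
def format_brl(value, locale):
    # imitate d3-format '$,.0f' under a d3 locale definition:
    # group thousands with the locale separator, then add the currency affixes
    grouped = f"{value:,.0f}".replace(",", locale["thousands"])
    prefix, suffix = locale["currency"]
    return prefix + grouped + suffix

pt_format = {'decimal': ',', 'thousands': '.', 'grouping': [3], 'currency': ['R$', '']}
print(format_brl(100000, pt_format))   # R$100.000
print(format_brl(1234567, pt_format))  # R$1.234.567
```

Once the embed options reach the saved HTML, the axis labels should come out in exactly this shape.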

Related

Plotly Choropleth_mapbox plots with Dash interactivity

I am a GIS person fairly new to Plotly and exceptionally new to Dash. I'm trying to mostly copy an example solution from a post here:
drop down menu with dash / plotly
to build an interactive app for viewing various choropleth maps based on choropleth_mapbox figures. The last solution in that post, using Plotly and Dash by Rob Raymond, looks brilliant and is close to what I am trying to do. But in my case, the figures built for several data columns each also require an individual update_layout call and a hovertemplate built per data column, and I cannot figure out where to place those definitions within the posted solution.
This is my code for a single data column's figure, which gives me the functionality I want in the layout and hover tool:
fig = px.choropleth_mapbox(
    gdf_blks_results,
    geojson=gdf_blks.geometry,
    locations=gdf_blks_results.index,
    color=classme.yb,
    color_continuous_scale="YlOrRd",
    center={"lat": 18.2208, "lon": -66.49},
    mapbox_style="open-street-map",
    width=800,
    height=500,
    custom_data=[gdf_blks_results['GEOID'],
                 gdf_blks_results['overallBurden']]
)
fig.update_layout(
    margin={"r": 0, "t": 0, "l": 0, "b": 0},
    coloraxis_colorbar=dict(
        title="burden",
        thicknessmode="pixels",
        lenmode="pixels",
        yanchor="top", y=1,
        ticks="outside",
        tickvals=[0, 1, 2, 3, 4],
        ticktext=myclasses,
        dtick=5
    )
)
# hover template
hovertemp = '<i>Census ID :</i> %{customdata[0]}<br>'
hovertemp += '<i>burden : </i> %{customdata[1]:.5f}<br>'
fig.update_traces(hovertemplate=hovertemp)
fig.show()
My question is: how do I incorporate that into the list of figures for a set of data columns, with custom template and figure-update info for each? I tried adding it to the figure definitions in the cited post's example, before the "for c, color in zip(...)" statement, but I cannot get the syntax right, and I am not sure why.
I think you should create a Dropdown whose options are the gdf_blks_results columns, then use a callback on its value to update the choropleth map. Please refer to the code below:
import pandas as pd
import numpy as np
import plotly.express as px
import dash
import dash_html_components as html
import dash_core_components as dcc
from dash.dependencies import Input, Output
import dash_table
import dash_bootstrap_components as dbc

columns_list = list(gdf_blks_results.columns)

app = dash.Dash(__name__, external_stylesheets=[dbc.themes.LUX])
app.layout = html.Div([
    dbc.Row([
        dbc.Col([
            html.H5('Columns', className='text-center'),
        ], width={'size': 2, "offset": 0, 'order': 1}),
        dbc.Col([
            dcc.Dropdown(id='columns', placeholder="Please select columns",
                         options=[{'label': x, 'value': x} for x in columns_list],
                         value=[],
                         multi=False,
                         disabled=False,
                         clearable=True,
                         searchable=True),
        ], width={'size': 10, "offset": 0, 'order': 1})
    ], className='p-2 align-items-stretch'),
    dbc.Row([
        dbc.Col([
            dcc.Graph(id="choropleth_maps", figure={}, style={'height': 500}),  # choropleth plot
        ], width={'size': 12, 'offset': 0, 'order': 2}),
    ]),
])

@app.callback(Output('choropleth_maps', 'figure'),
              [Input('columns', 'value')])
def update_graph(columns):
    fig = px.choropleth_mapbox(
        gdf_blks_results,
        geojson=gdf_blks.geometry,
        locations=gdf_blks_results.index,
        color=columns,
        color_continuous_scale="YlOrRd",
        center={"lat": 18.2208, "lon": -66.49},
        mapbox_style="open-street-map",
        width=800,
        height=500,
        custom_data=[gdf_blks_results['GEOID'],
                     gdf_blks_results['overallBurden']])
    fig.update_layout(
        margin={"r": 0, "t": 0, "l": 0, "b": 0},
        coloraxis_colorbar=dict(
            title="burden",
            thicknessmode="pixels",
            lenmode="pixels",
            yanchor="top", y=1,
            ticks="outside",
            tickvals=[0, 1, 2, 3, 4],
            ticktext=myclasses,
            dtick=5
        ))
    # hover template
    hovertemp = '<i>Census ID :</i> %{customdata[0]}<br>'
    hovertemp += '<i>burden : </i> %{customdata[1]:.5f}<br>'
    fig.update_traces(hovertemplate=hovertemp)
    return fig

if __name__ == "__main__":
    app.run_server(debug=False, port=1116)
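The asker also wanted a different hovertemplate per data column. One way to handle that (my own sketch; the column names besides overallBurden are made up) is a plain dict lookup that the callback consults before calling update_traces:

```python
# map each data column to its own hovertemplate (column names are illustrative)
hovertemplates = {
    'overallBurden': ('<i>Census ID :</i> %{customdata[0]}<br>'
                      '<i>burden : </i> %{customdata[1]:.5f}<br>'),
    'population':    ('<i>Census ID :</i> %{customdata[0]}<br>'
                      '<i>population : </i> %{customdata[1]:,.0f}<br>'),
}

def hovertemplate_for(column):
    # fall back to a generic template for columns without a custom one
    default = '<i>Census ID :</i> %{customdata[0]}<br>'
    return hovertemplates.get(column, default)
```

Inside `update_graph` you would then call `fig.update_traces(hovertemplate=hovertemplate_for(columns))`; the same dict-lookup idea works for per-column colorbar settings passed to `fig.update_layout`.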

How do you pull weekly/monthly historical data from yahoo finance?

By default, this code obtains the daily closing data for several tickers:
tickers = ['SPY', 'QQQ', 'GLD ', 'EEM', 'IEMG', 'VTI', 'HYG', 'SJNK', 'USO']
ind_data = pd.DataFrame()
for t in tickers:
    ind_data[t] = wb.DataReader(t, data_source='yahoo', start='2015-1-1')['Adj Close']
ind_data.to_excel('C:/Users/micka/Desktop/ETF.xlsx')
How do you add a parameter to DataReader in order to obtain weekly/monthly historical data instead? I tried using freq and interval but it doesn't work.
What if you try to replace this in your code for weekly data:
# Don't forget to import pandas_datareader exactly in this way
import pandas_datareader
# Then replace this in for loop
pandas_datareader.yahoo.daily.YahooDailyReader(t, interval='w' , start='2015-1-1').read()['Adj Close']
And this for monthly data:
# Don't forget to import pandas_datareader exactly in this way
import pandas_datareader
# Then replace this in for loop
pandas_datareader.yahoo.daily.YahooDailyReader(t, interval='m' , start='2015-1-1').read()['Adj Close']
Check the official docs for further options. After executing this code for the weekly interval:
import pandas as pd
import pandas_datareader

tickers = ['SPY', 'QQQ', 'GLD ', 'EEM', 'IEMG', 'VTI', 'HYG', 'SJNK', 'USO']
ind_data = pd.DataFrame()
for t in tickers:
    ind_data[t] = pandas_datareader.yahoo.daily.YahooDailyReader(t, interval='w', start='2015-1-1').read()['Adj Close']
ind_data
I got the expected weekly 'Adj Close' data.
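If the Yahoo reader keeps misbehaving, another option (an alternative approach, not a pandas-datareader parameter) is to fetch daily data once and let pandas resample it to weekly or monthly frequency. Synthetic data stands in for the download here:

```python
import pandas as pd
import numpy as np

# synthetic daily 'Adj Close' series standing in for the DataReader output
idx = pd.date_range('2015-01-01', periods=90, freq='D')
daily = pd.Series(np.arange(90, dtype=float), index=idx, name='SPY')

weekly = daily.resample('W-FRI').last()   # last close of each week (weeks ending Friday)
monthly = daily.resample('M').last()      # last close of each calendar month
print(monthly)
```

`last()` mirrors "closing price of the period"; `.mean()` or `.ohlc()` work the same way if you want averages or bars instead.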

Callback error updating output-link.href - Parsing and downloading CSV file

Aim and objective: I am a beginner at web development, and I'm using Plotly Dash to let a user upload a CSV file, run a script that modifies the CSV file, and let the user download the modified CSV file.
Tools employed: I am using dash_core_components. I use dcc.Upload to let the user upload a file, and html.A to return a link to the modified CSV file so the user can download it. All modifications to the CSV file are performed with pandas.
Issue: When I try to return the href for the modified file, I run into a callback error.
Code description: In the code below, the function interval_cleaner reads the CSV file, identifies missing data, and fills the missing data in. interval_cleaner takes the contents component of dcc.Upload as input, decodes the contents, and returns a dataframe. In the callback function update_output_parser, I call interval_cleaner, which returns a pandas dataframe (clean_dat); that is converted to a CSV file, which I'm trying to return as the href of html.A.
Data: I have 21886 rows in the original file, but the output is around 17520 rows.
df = pd.DataFrame({'Date': ['7/1/2019 0:30', '7/1/2019 1:00', '7/1/2019 1:30', '7/1/2019 2:00'],
                   'Demand': [60.48, 52.92, 49.32, 53.28]})
import pandas as pd
import datetime
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output, State
import urllib
import base64
import io

app = dash.Dash()

app.layout = html.Div([
    html.H3('Unit of the data'),
    dcc.RadioItems(
        id='units',
        options=[{'label': 'kWh', 'value': 'kwh'},
                 {'label': 'kW', 'value': 'kw'}],
        value='kwh',
        labelStyle={'display': 'inline-block'}
    ),
    html.H3('Interval of load data'),
    dcc.RadioItems(
        id='intervals',
        options=[{'label': '15 minutes', 'value': 15},
                 {'label': '30 minutes', 'value': 30},
                 {'label': 'Hourly', 'value': 60}],
        value=60,
        labelStyle={'display': 'inline-block'}
    ),
    html.H3('Convert into other intervals'),
    dcc.RadioItems(
        id='conversion',
        options=[{'label': 'Yes', 'value': 'y'},
                 {'label': 'No', 'value': 'n'}],
        value='n',
        labelStyle={'display': 'inline-block'}
    ),
    dcc.Upload(
        id='upload-data',
        children=html.Div([
            'Drag and Drop or ',
            html.A('Select Files')
        ]),
        style={
            'width': '100%',
            'height': '60px',
            'lineHeight': '60px',
            'borderWidth': '1px',
            'borderStyle': 'dashed',
            'borderRadius': '5px',
            'textAlign': 'center',
            'margin': '10px'
        },
        # Allow multiple files to be uploaded
        multiple=True
    ),
    html.Div(id='Filename'),
    html.A('Download cleaned load', id='output-link', download='cleaned_load_data.csv', target='_blank'),
])

def interval_cleaner(csv_file_content, unit, frequency):
    content_type, content_string = csv_file_content.split(',')
    decoded = base64.b64decode(content_string)
    df = pd.read_csv(io.StringIO(decoded.decode('utf-8')))
    df['Date'] = df['Date'].astype('datetime64[ns]')
    df = df.set_index('Date').sort_index()
    if frequency == 15:
        mult = 4
    elif frequency == 30:
        mult = 2
    else:
        mult = 1
    # 365 days changed to 364 days
    cleaned_load = pd.DataFrame(pd.date_range(start=df.index[0],
                                              end=df.index[0] + pd.to_timedelta('364 days 23:59:00'),
                                              freq=str(frequency) + 'T'),
                                columns=['Date'])
    leap = len(cleaned_load[(cleaned_load['Date'].dt.month == 2) & (cleaned_load['Date'].dt.day == 29)])
    if leap != 0:
        cleaned_load = cleaned_load[~((cleaned_load['Date'].dt.month == 2) & (cleaned_load['Date'].dt.day == 29))]
        if frequency == 60:
            cleaned_load = cleaned_load.append(
                pd.DataFrame(pd.date_range(cleaned_load.iloc[len(cleaned_load) - 1, 0] + pd.to_timedelta('01:00:00'),
                                           cleaned_load.iloc[len(cleaned_load) - 1, 0] + pd.to_timedelta('1 day'),
                                           freq=str(frequency) + 'T'),
                             columns=['Date']),
                ignore_index=True)
        else:
            cleaned_load = cleaned_load.append(
                pd.DataFrame(pd.date_range(cleaned_load.iloc[len(cleaned_load) - 1, 0] + pd.DateOffset(minute=frequency),
                                           cleaned_load.iloc[len(cleaned_load) - 1, 0] + pd.to_timedelta('1 day'),
                                           freq=str(frequency) + 'T'),
                             columns=['Date']),
                ignore_index=True)
    cleaned_load = cleaned_load.set_index('Date')
    cleaned_load = cleaned_load.join(df, how='left')
    cleaned_load = cleaned_load[~cleaned_load.index.duplicated(keep='last')].reset_index()
    #print(cleaned_load.head())
    cleaned_load['MDH'] = cleaned_load['Date'].dt.strftime('%m-%d %H:%M')
    nans = cleaned_load[cleaned_load['Demand'].isnull()].index
    #print(nans)
    for item in nans:
        if item < 24 * mult * 7:
            # 70 days changed to len()
            ind = cleaned_load.iloc[item:len(cleaned_load) + item:24 * mult, 1].first_valid_index()
            cleaned_load.iloc[item, 1] = cleaned_load.iloc[ind:ind + 24 * mult * 7:24 * mult, 1].mean(skipna=True)
        else:
            cleaned_load.iloc[item, 1] = cleaned_load.iloc[item - 24 * mult * 7:item:24 * mult, 1].mean(skipna=True)
    cleaned_load = cleaned_load.sort_values(by='MDH')
    return cleaned_load

@app.callback(Output('output-link', 'href'),
              Input('upload-data', 'contents'),
              [State('units', 'value'),
               State('intervals', 'value')])
def update_output_parser(file_contents, unit, frequency):
    clean_dat = interval_cleaner(file_contents, unit, frequency)
    dat_csv = clean_dat.to_csv(encoding='utf-8')
    dat_csv = "data:text/csv;charset=utf-8," + urllib.quote(dat_csv)
    return dat_csv

if __name__ == '__main__':
    app.run_server(debug=True)
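One thing that stands out in the callback (a guess, not a confirmed diagnosis): `urllib.quote` is Python 2 only, so under Python 3 it raises an AttributeError inside the callback; the data URI would be built with `urllib.parse.quote` instead. Note also that with `multiple=True` the `contents` argument arrives as a list of strings, not a single string. A minimal sketch of the Python 3 href construction, with a tiny stand-in dataframe:

```python
import urllib.parse
import pandas as pd

# stand-in for the cleaned dataframe returned by interval_cleaner
clean_dat = pd.DataFrame({'Date': ['7/1/2019 0:30', '7/1/2019 1:00'],
                          'Demand': [60.48, 52.92]})

dat_csv = clean_dat.to_csv(index=False, encoding='utf-8')
# percent-encode the CSV text so it is safe inside a data: URI
href = "data:text/csv;charset=utf-8," + urllib.parse.quote(dat_csv)
```

Returning `href` from the callback should then populate the html.A link without the parsing error.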

insert using pandas to_sql() missing data into clickhouse db

It's my first time using sqlalchemy and pandas to insert data into a ClickHouse db.
Inserting the data with the clickhouse CLI works fine, but when I try to do the same thing with sqlalchemy, one row goes missing and I don't know why.
Have I done something wrong?
import pandas as pd
from sqlalchemy import create_engine, MetaData
from clickhouse_sqlalchemy import make_session

# created the dataframe
engine = create_engine(uri)
session = make_session(engine)
metadata = MetaData(bind=engine)
metadata.reflect(bind=engine)
conn = engine.connect()
df.to_sql('test', conn, if_exists='append', index=False)
Let's try this way:
import pandas as pd
from infi.clickhouse_orm.engines import Memory
from infi.clickhouse_orm.fields import UInt16Field, StringField
from infi.clickhouse_orm.models import Model
from sqlalchemy import create_engine

# define the ClickHouse table schema
class Test_Humans(Model):
    year = UInt16Field()
    first_name = StringField()

    engine = Memory()

engine = create_engine('clickhouse://default:@localhost/test')

# create table
with engine.connect() as conn:
    conn.connection.create_table(Test_Humans)  # https://github.com/Infinidat/infi.clickhouse_orm/blob/master/src/infi/clickhouse_orm/database.py#L142

pdf = pd.DataFrame.from_records([
    {'year': 1994, 'first_name': 'Vova'},
    {'year': 1995, 'first_name': 'Anja'},
    {'year': 1996, 'first_name': 'Vasja'},
    {'year': 1997, 'first_name': 'Petja'},
    # ! sqlalchemy-clickhouse ignores the last item so add a fake one
    {}
])
pdf.to_sql('test_humans', engine, if_exists='append', index=False)
Take into account that sqlalchemy-clickhouse ignores the last item, so add a fake one (see the source code and related issue 10).
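If you hit this often, the workaround can live in a tiny helper (the function name is mine) so every to_sql call gets the sacrificial row appended automatically:

```python
import pandas as pd

def pad_for_clickhouse(df):
    # sqlalchemy-clickhouse drops the last row on insert, so append an empty one
    return pd.concat([df, pd.DataFrame([{}])], ignore_index=True)

pdf = pd.DataFrame({'year': [1994, 1995], 'first_name': ['Vova', 'Anja']})
padded = pad_for_clickhouse(pdf)
# padded has 3 rows; the last one is all-NaN and gets discarded on insert
```

This reproduces the "add a fake one" trick from the answer without hand-writing the empty dict each time.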

Dash python plotly live update table

I am new to Plotly Dash. I want to draw a table whose rows will automatically be updated after a certain interval of time, but I do not know how to use dash table experiments. The table is already saved as a CSV file, but I am somehow unable to make it live.
Please help! Can someone guide me in the right direction? Your help will be highly appreciated. Following is the code.
import dash
import pandas as pd
from pandas import Series, DataFrame
from dash.dependencies import Input, Output, Event
import dash_core_components as dcc
import dash_html_components as html
import dash_table_experiments as dtable

app = dash.Dash()

def TP_Sort():
    address = 'E:/Dats Science/POWER BI LAB DATA/PS CORE KPIS/Excel Sheets/Throughput.xlsx'
    TP = pd.read_excel(address)
    TP1 = TP.head()
    TP1.to_csv('TP1.csv', index=False)
    return TP1

app.layout = html.Div([
    html.H1('Data Throughput Dashboard-NOC NPM Core'),
    dcc.Interval(id='graph-update', interval=240000),
    dtable.DataTable(id='my-table',
                     rows=[{}],
                     row_selectable=False,
                     filterable=True,
                     sortable=False,
                     editable=False)
])

@app.callback(
    dash.dependencies.Output('my-table', 'row_update'),
    events=[dash.dependencies.Event('graph-update', 'interval')])
def update_table(maxrows=4):
    TP_Sort()
    TP_Table1 = 'C:/Users/muzamal.pervez/Desktop/Python Scripts/TP1.csv'
    TP_Table2 = pd.read_csv(TP_Table1)
    return TP_Table2.to_dict('records')

if __name__ == '__main__':
    app.run_server(debug=False)
I am trying the above approach. Please correct me where I am wrong, as the output is "error loading dependencies".
BR
Rana
Your callback is wrong.
It should be:
@app.callback(Output('my-table', 'rows'),
              [Input('graph-update', 'n_intervals')])
def update_table(n, maxrows=4):
    # we are now in interval *n*
    # your code here
    return TP_Table2.to_dict('records')
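The `# your code here` part just has to produce a list of row dicts. A minimal stand-in (reading from a string instead of the asker's TP1.csv path) shows the shape the table's `rows` property expects:

```python
import pandas as pd
from io import StringIO

# stand-in for reading TP1.csv inside the callback
csv_text = "Site,Throughput\nA,10.5\nB,12.1\n"
TP_Table2 = pd.read_csv(StringIO(csv_text))
rows = TP_Table2.to_dict('records')
# rows == [{'Site': 'A', 'Throughput': 10.5}, {'Site': 'B', 'Throughput': 12.1}]
```

Each dict is one table row keyed by column name, which is exactly what the callback above returns on every interval tick.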
