Update a value in a column of a table based on input in Shiny - shiny-reactivity

I am new to Shiny and I need your knowledge. I will try to explain in detail:
In my application, the user uploads a CSV file. This information is visualized in 2 tabPanels. In the first tabPanel, named Original, the user can see the original dataset. In the other one, named Procesado, the user can see a select input with values from 1 to 4 and only 2 columns (internally, before the information is shown, the dataset is trimmed by eliminating columns and rows).
I want to update the values of the first column every time the user changes the value in the select input. I tried with renderUI and reactive, but it is not working. Thanks.
shinyUI(
  pageWithSidebar(
    headerPanel("Ciclismo - Calidad de la sesion"),
    sidebarPanel(
      p("Escoge el dataset generado por la aplicacion de ciclismo Polar"),
      fileInput("file1", "Escoge un archivo CSV",
                multiple = FALSE,
                accept = c(".csv")),
      tags$hr(),
      submitButton("Evaluar Dataset")
    ),
    mainPanel(
      tabsetPanel(
        tabPanel("Original", tableOutput("contents")),
        tabPanel("Procesado",
                 p("Selecciona la zona de entrenamiento a evaluar"),
                 selectInput("zone", "Zona de entrenamiento",
                             choices = c("1", "2", "3", "4")),
                 uiOutput("ui"),
                 tags$hr(),
                 tableOutput("modified")),
        tabPanel("Graficos",
                 h3(textOutput("output_text")),
                 plotOutput("output_plot"))
      )
    )
  )
)
The server code is:
shinyServer(function(input, output) {
  output$contents <- renderTable({
    req(input$file1)
    # when reading semicolon-separated files,
    # having a comma separator causes `read.csv` to error
    # tryCatch(
    #   {
    df <- read.csv(input$file1$datapath)
    #   },
    #   error = function(e) {
    #     # return a safeError if a parsing error occurs
    #     stop(safeError(e))
    #   }
    # )
    return(df)
  })
  output$ui <- renderUI({
    # note: dfProc is defined inside output$modified below, so it is
    # not visible here -- this is where the approach breaks
    switch(input$zone,
           "1" = cbind(ZONA = 1, dfProc),
           "2" = cbind(ZONA = 2, dfProc),
           "3" = cbind(ZONA = 3, dfProc),
           "4" = cbind(ZONA = 4, dfProc))
  })
  output$modified <- renderTable({
    req(input$file1)
    dff <- read.csv(input$file1$datapath)
    # the dataset is trimmed, eliminating rows and columns
    dfN <- dff[-c(1, 2, 3), -c(1, 2)]
    FC <- dfN[, -c(2:26)]
    # convert to a data frame so operations can be applied to it
    dfProc <- data.frame(FC)
    newdata_test <- cbind(ZONA = 1, dfProc)
    return(newdata_test)
  })
})

Finally, I did it.
I used two eventReactive expressions. The first one reads the CSV file. The second one reacts to the zone input and uses cbind to insert the selected value into the column. I hope this answer can help other people.
df <- eventReactive(input$csv, {
  req(input$csv)
  dff <- read.csv(input$csv$datapath)
  # the dataset is trimmed, eliminating rows and columns
  dfN <- dff[-c(1, 2, 3), -c(1, 2)]
  FC <- dfN[, -c(2:26)]
  dfProc <- data.frame(FC)
  return(dfProc)
})
datos <- eventReactive(input$zone, {
  newtable <- cbind(input$zone, df())
})
output$modified <- renderTable({ datos() })

Related

Optimizing memory usage of for loops in Python

Let's say I have multiple for loops that insert data into a database like the example below:
import random
from random import randint

categoria = ('Associação Nacional', 'Federação Nacional', 'Organização Não-Governamental', 'Sociedade Beneficente', 'Fundação', 'União', 'Instituto', 'Entidade')
nome = ('do Meio Ambiente', 'do Voluntariado', 'da Saúde', 'da Boa Vontade', 'da Preservação da Natureza', 'do Idoso', 'da Informática', 'do Empreendedorismo', 'da Assistência Social')
areas_atuacao = ('Ambientalismo Conservação', 'Conservação de recursos naturais', 'Controle da poluição Proteção de animais', 'Tecnologias alternativas')

group_nome = []
for x in range(5):
    nome_entidade = random.choice(categoria) + " " + random.choice(nome)
    group_nome.append(nome_entidade)

group_atuacao = []
for x in range(5):
    atuacao = random.choice(areas_atuacao)
    group_atuacao.append(AreaAtuacao.objects.filter(nome=atuacao)[0])

group_cep = []
for x in range(5):
    cep = randint(10000000, 99999999)
    group_cep.append(cep)

for x in range(5):
    entidade = Entidade(razao_social=group_nome[x],
                        cep=group_cep[x],
                        area_atuacao=group_atuacao[x])
    entidade.save()
My question: how can I restructure the code so that it uses less RAM in the process? I've tried, but it didn't work. Thanks in advance.
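One way to cut both the repetition and the intermediate lists is to build each object in a single pass and let the ORM insert them all at once. This is only a sketch, assuming the Entidade and AreaAtuacao Django models from the question and that bulk_create is acceptable here (it skips per-object save() signals):

import random
from random import randint

entidades = []
for _ in range(5):
    atuacao = random.choice(areas_atuacao)
    entidades.append(Entidade(
        razao_social=random.choice(categoria) + " " + random.choice(nome),
        cep=randint(10000000, 99999999),
        # .first() returns the first match, like .filter(...)[0]
        area_atuacao=AreaAtuacao.objects.filter(nome=atuacao).first(),
    ))
# one INSERT for all rows instead of five separate save() calls
Entidade.objects.bulk_create(entidades)

At n = 5 the memory savings are negligible, but with larger batches this avoids holding three parallel lists alongside the model instances, and it issues a single query instead of one per row.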

Extracting a particular word from an existing sentence

I have the following string. My idea is to extract particular words and write each one into a different column (e.g. Route, TTLoad, Out of). I could not figure out how to extract them at first; the code below shows the extraction:
import pandas as pd

my_string = "TT Load:Route1 TT Load for 30 out of 80 from Route 2"
description = my_string.split(":")[0]
route_start = my_string.find('Route')
route_end = my_string.find('Route') + 6
route = my_string[route_start:route_end]
TTLoad = my_string[route_end:].split('TT Load')
res = [int(i) for i in TTLoad[1].split() if i.isdigit()]
TTLoad_value = res[0]
Out_Of = my_string[route_end:].split('out of')
res = [int(i) for i in Out_Of[1].split() if i.isdigit()]
Out_Of_value = res[0]
required_dict = {"GivenDescription": my_string,
                 "Description": description,
                 "Route": route,
                 "TTLoad": TTLoad_value,
                 "Out of": Out_Of_value}
df = pd.DataFrame.from_dict(required_dict, orient='index').T
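A hedged alternative, assuming the sentence always follows the "<description>:<Route> TT Load for <n> out of <m>" pattern shown above: a single regular expression with named groups replaces the chained find/split calls and fails cleanly (no match) if the format changes. The column names here are illustrative variants of the ones above:

import re
import pandas as pd

my_string = "TT Load:Route1 TT Load for 30 out of 80 from Route 2"
pattern = re.compile(r'(?P<Description>[^:]+):(?P<Route>Route\s?\d+)'
                     r'.*?TT Load for (?P<TTLoad>\d+) out of (?P<OutOf>\d+)')
m = pattern.search(my_string)
if m:
    # groupdict() maps each named group to the matched text
    df = pd.DataFrame([m.groupdict()])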

How to handle blank lines, junk lines and \n while converting an input file to a CSV file

Below is the sample data in the input file. I need to process this file and turn it into a CSV file. With some help, I was able to convert it to a CSV file. However, it is not fully converted, since I am not able to handle \n, the junk line (2nd line) and the blank line (4th line). Also, I need help filtering the transaction_type, i.e., avoiding the "rewrite" transaction_type.
{"transaction_type": "new", "policynum": 4994949}
44uu094u4
{"transaction_type": "renewal", "policynum": 3848848,"reason": "Impressed with \n the Service"}
{"transaction_type": "cancel", "policynum": 49494949, "cancel_table":[{"cancel_cd": "AU"}, {"cancel_cd": "AA"}]}
{"transaction_type": "rewrite", "policynum": 5634549}
Below is the code
import ast
import csv

with open('test_policy', 'r') as in_f, open('test_policy.csv', 'w') as out_f:
    data = in_f.readlines()
    writer = csv.DictWriter(
        out_f,
        fieldnames=['transaction_type', 'policynum', 'cancel_cd', 'reason'],
        lineterminator='\n',
        extrasaction='ignore')
    writer.writeheader()
    for row in data:
        dict_row = ast.literal_eval(row)
        if 'cancel_table' in dict_row:
            cancel_table = dict_row['cancel_table']
            cancel_cd = []
            for cancel_row in cancel_table:
                cancel_cd.append(cancel_row['cancel_cd'])
            dict_row['cancel_cd'] = ','.join(cancel_cd)
        writer.writerow(dict_row)
Below is my output, leaving aside the junk line, the blank line and the "rewrite" transaction type.
transaction_type,policynum,cancel_cd,reason
new,4994949,,
renewal,3848848,,"Impressed with
the Service"
cancel,49494949,"AU,AA",
Expected output
transaction_type,policynum,cancel_cd,reason
new,4994949,,
renewal,3848848,,"Impressed with the Service"
cancel,49494949,"AU,AA",
Hmm, I tried to fix them, but I do not know how CSV files work; with my small knowledge I would suggest running this code before converting the file.
txt = {"transaction_type": "renewal",
       "policynum": 3848848,
       "reason": "Impressed with \n the Service"}
newTxt = {}
for i, j in txt.items():
    # local variables (temporary)
    lastX = ""
    correctJ = ""
    # check whether j contains a "\n" and take it out
    if "\n" in str(j):
        j = j.replace("\n", "")
    # for grammar purposes, check whether j has at least one space
    if " " in str(j):
        # if yes, check it closer (character by character)
        for x in [j[y:y + 1] for y in range(0, len(j), 1)]:
            # if 2 spaces are consecutive, skip the second one
            if x == " " and lastX == " ":
                pass
            # if not, append the character to correctJ
            else:
                correctJ += x
            # remember the last character checked
            lastX = x
        # at the end, make j the corrected string
        j = correctJ
    # add the corrections to the new dictionary
    newTxt[i] = j
# show the result
print(f"txt = {txt}\nnewTxt = {newTxt}")
The output:
txt = {'transaction_type': 'renewal', 'policynum': 3848848, 'reason': 'Impressed with \n the Service'}
newTxt = {'transaction_type': 'renewal', 'policynum': 3848848, 'reason': 'Impressed with the Service'}
Process finished with exit code 0
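For completeness, here is a sketch that folds that whitespace fix back into the original conversion loop and also handles the blank line, the junk line and the "rewrite" filter. It assumes, as in the sample, that junk lines never parse as dict literals:

import ast
import csv

with open('test_policy', 'r') as in_f, open('test_policy.csv', 'w') as out_f:
    writer = csv.DictWriter(
        out_f,
        fieldnames=['transaction_type', 'policynum', 'cancel_cd', 'reason'],
        lineterminator='\n',
        extrasaction='ignore')
    writer.writeheader()
    for line in in_f:
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        try:
            dict_row = ast.literal_eval(line)
        except (SyntaxError, ValueError):
            continue                      # skip junk lines like 44uu094u4
        if not isinstance(dict_row, dict):
            continue                      # a parseable line that isn't a record
        if dict_row.get('transaction_type') == 'rewrite':
            continue                      # drop "rewrite" transactions
        if 'reason' in dict_row:
            # collapse "\n" and repeated spaces into single spaces
            dict_row['reason'] = ' '.join(dict_row['reason'].split())
        if 'cancel_table' in dict_row:
            dict_row['cancel_cd'] = ','.join(
                row['cancel_cd'] for row in dict_row['cancel_table'])
        writer.writerow(dict_row)

Run against the sample input, this should produce the expected output shown above.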

Updating multiple line plots dynamically in callback in bokeh

I have a use case where I have multiple line plots (with legends), and I need to update the line plots based on a column condition. Below is an example with two data sets: based on the country, the column data source changes. The issue I am facing is that the number of columns in the data source is not fixed, and even the types can vary. So, when I update the data source in a callback after a new country is selected, I get this error:
Error: attempted to retrieve property array for nonexistent field 'pay_conv_7d.content'.
I am guessing this is because the pay_conv_7d.content column doesn't exist in the new data source, while those lines were already in my plot. I have been trying to fix this by various means (making common columns for every country selection, adding the missing columns to the data source in the callback), but I still get issues.
Is there any clean way to have multiple line plots update via a callback, without a lot of hackish workarounds? Any insights or help would be really appreciated. Thanks much in advance! :)
import numpy as np
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.palettes import Spectral11

def setup_multiline_plots(x_axis, y_axis, title_text, data_source, plot):
    num_categories = len(data_source.data['categories'])
    legends_list = list(data_source.data['categories'])
    colors_list = Spectral11[0:num_categories]
    # xs = [data_source.data['%s.' % x_axis].values] * num_categories
    # ys = [data_source.data['%s.%s' % (y_axis, column)] for column in data_source.data['categories']]
    # data_source.data['x_series'] = xs
    # data_source.data['y_series'] = ys
    # plot.multi_line('x_series', 'y_series', line_color=colors_list, legend='categories', line_width=3, source=data_source)
    plot_list = []
    for (colr, leg, column) in zip(colors_list, legends_list, data_source.data['categories']):
        xs, ys = '%s.' % x_axis, '%s.%s' % (y_axis, column)
        plot.line(xs, ys, source=data_source, color=colr, legend=leg, line_width=3, name=ys)
        plot_list.append(ys)
    data_source.data['plot_names'] = data_source.data.get('plot_names', []) + plot_list
    plot.title.text = title_text

def update_plot(country, timeseries_df, timeseries_source,
                aggregate_df, aggregate_source, category,
                plot_pay_7d, plot_r_pay_90d):
    aggregate_metrics = aggregate_df.loc[aggregate_df.country == country]
    aggregate_metrics = aggregate_metrics.nlargest(10, 'cost')
    category_types = list(aggregate_metrics[category].unique())
    timeseries_df = timeseries_df[timeseries_df[category].isin(category_types)]
    timeseries_multi_line_metrics = get_multiline_column_datasource(timeseries_df, category, country)
    # len_series = len(timeseries_multi_line_metrics.data['time.'])
    # previous_legends = timeseries_source.data['plot_names']
    # current_legends = timeseries_multi_line_metrics.data.keys()
    # common_legends = list(set(previous_legends) & set(current_legends))
    # additional_legends_list = list(set(previous_legends) - set(current_legends))
    # for legend in additional_legends_list:
    #     zeros = pd.Series(np.array([0] * len_series), name=legend)
    #     timeseries_multi_line_metrics.add(zeros, legend)
    # timeseries_multi_line_metrics.data['plot_names'] = previous_legends
    timeseries_source.data = timeseries_multi_line_metrics.data
    aggregate_source.data = aggregate_source.from_df(aggregate_metrics)

def get_multiline_column_datasource(df, category, country):
    df_country = df[df.country == country]
    df_pivoted = pd.DataFrame(df_country.pivot_table(index='time', columns=category, aggfunc=np.sum).reset_index())
    df_pivoted.columns = df_pivoted.columns.to_series().str.join('.')
    categories = list(set([column.split('.')[1] for column in list(df_pivoted.columns)]))[1:]
    data_source = ColumnDataSource(df_pivoted)
    data_source.data['categories'] = categories
    return data_source
Recently I had to update data on a Multiline glyph. Check my question if you want to take a look at my algorithm.
I think you can update a ColumnDataSource in at least three ways:
You can create a dataframe to instantiate a new CDS:
cds = ColumnDataSource(df_pivoted)
data_source.data = cds.data
You can create a dictionary and assign it to the data attribute directly:
d = {
    'xs0': [[7.0, 986.0], [17.0, 6.0], [7.0, 67.0]],
    'ys0': [[79.0, 69.0], [179.0, 169.0], [729.0, 69.0]],
    'xs1': [[17.0, 166.0], [17.0, 116.0], [17.0, 126.0]],
    'ys1': [[179.0, 169.0], [179.0, 1169.0], [1729.0, 169.0]],
    'xs2': [[27.0, 276.0], [27.0, 216.0], [27.0, 226.0]],
    'ys2': [[279.0, 269.0], [279.0, 2619.0], [2579.0, 2569.0]]
}
data_source.data = d
Here, if you need different column sizes or empty columns, you can fill the gaps with NaN values in order to keep the column sizes consistent. And I think this is the solution to your question:
import numpy as np
d = {
    'xs0': [[7.0, 986.0], [17.0, 6.0], [7.0, 67.0]],
    'ys0': [[79.0, 69.0], [179.0, 169.0], [729.0, 69.0]],
    'xs1': [[17.0, 166.0], [np.nan], [np.nan]],
    'ys1': [[179.0, 169.0], [np.nan], [np.nan]],
    'xs2': [[np.nan], [np.nan], [np.nan]],
    'ys2': [[np.nan], [np.nan], [np.nan]]
}
data_source.data = d
Or, if you only need to modify a few values, you can use the patch method. Check the documentation here.
The following example shows how to patch entire column elements. In this case,
source = ColumnDataSource(data=dict(foo=[10, 20, 30], bar=[100, 200, 300]))
patches = {
    'foo': [(slice(2), [11, 12])],
    'bar': [(0, 101), (2, 301)],
}
source.patch(patches)
After this operation, the value of source.data will be:
dict(foo=[11, 12, 30], bar=[101, 200, 301])
NOTE: It is important to make the update in one go to avoid performance issues
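Applied to the callback in the question, the NaN idea looks roughly like this sketch (new_data is assumed to be the dict built for the newly selected country, and the expected field names come from the source's own plot_names bookkeeping):

import numpy as np

def pad_missing_columns(new_data, expected_columns):
    # length of the columns that do exist in the new data
    n = len(next(iter(new_data.values())))
    for col in expected_columns:
        if col not in new_data:
            # pad absent series with NaN so every plotted field keeps existing
            new_data[col] = [np.nan] * n
    return new_data

# in update_plot, before the single data assignment:
# new_dict = dict(timeseries_multi_line_metrics.data)
# timeseries_source.data = pad_missing_columns(
#     new_dict, timeseries_source.data['plot_names'])

Padding before the one-go assignment keeps all column lengths equal, which is what the ColumnDataSource requires.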

Unknown column added in user input form

I have a simple data entry form that writes the inputs to a csv file. Everything seems to be working OK, except that extra columns are being added to the file somewhere in the process; it seems to happen during the user input phase. Here is the code:
import pandas as pd

# all spreadsheets in one list
Batteries = ["MAT0001.csv", "MAT0002.csv", "MAT0003.csv", "MAT0004.csv",
             "MAT0005.csv", "MAT0006.csv", "MAT0007.csv", "MAT0008.csv"]

# user selects the battery to log
choice = int(input("Which battery? (1-8):"))

def choosebattery(c):
    while True:
        if c in range(1, 9):
            return Batteries[c - 1]  # battery 1 maps to MAT0001.csv
        else:
            print('Sorry, selection must be between 1-8')

cfile = choosebattery(choice)
cbat = pd.read_csv(cfile)

# collect cycle input
print("Enter Current Cycle")
response = None
while response not in {"Y", "N", "y", "n"}:
    response = input("Please enter Y or N: ")
cy = response

# charger input
print("Enter Current Charger")
response = None
while response not in {"SC-G", "QS", "Bosca", "off", "other"}:
    response = input("Please enter one: 'SC-G', 'QS', 'Bosca', 'off', 'other'")
if response == "other":
    explain = input("Please explain")
    ch = response + ":" + explain
else:
    ch = response

# location
print("Enter Current Location")
response = None
while response not in {"Rack 1", "Rack 2", "Rack 3", "Rack 4", "EV001", "EV002", "EV003", "EV004", "Floor", "other"}:
    response = input("Please enter one: 'Rack 1 - 4', 'EV001 - 004', 'Floor' or 'other'")
if response == "other":
    explain = input("Please explain")
    lo = response + ":" + explain
else:
    lo = response

# voltage
done = False
while not done:
    choice = float(input("Enter Current Voltage:"))
    modchoice = choice * 10
    if modchoice in range(500, 700):
        vo = choice
        done = True
    else:
        print('Sorry, selection must be between 50 and 70')

# add the inputs to the current battery dataframe
log = pd.DataFrame([[cy, ch, lo, vo]], columns=["Cycle", "Charger", "Location", "Voltage"])
clog = pd.concat([cbat, log], axis=0)
clog.to_csv(cfile, index=False)
pd.read_csv(cfile)
And I receive:
Out[18]:
Charger Cycle Location Unnamed: 0 Voltage
0 off n Floor NaN 50.0
Where is the "Unnamed" column coming from?
There's an 'unnamed' column coming from your csv. The most likely reason is that the lines in your input csv files end with a comma (i.e. your separator), so pandas interprets that as an additional (nameless) column. If that's the case, check whether your lines end with your separator, and if so turn, for example:
Column1,Column2,Column3,
val_11, val12, val12,
...
Into:
Column1,Column2,Column3
val_11, val12, val12
...
Alternatively, try specifying the index column explicitly, as in this answer. I believe some of the confusion stems from pandas concat reordering your columns.
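As a concrete sketch of that second option, either of these reads should keep the phantom column out of cbat; the second variant is a more general guard that drops any column pandas had to auto-name:

import pandas as pd

# option 1: treat the extra leading field as the index
cbat = pd.read_csv(cfile, index_col=0)

# option 2: read normally, then drop auto-named columns
cbat = pd.read_csv(cfile)
cbat = cbat.loc[:, ~cbat.columns.str.startswith('Unnamed')]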
