My data is arranged in dictionaries within dictionaries, like so:
dict = {subdict1:{}, subdict2:{},...}
where
subdict1 = { subdict_a: {"date":A, "smallest_date":False}, subdict_b : {"date":B, "smallest_date": False},...}
I'd like to loop through the subdictionaries a,b,c... and identify which of the dates A, B, C... is the smallest in each subdictionary, and change the value of 'smallest_date' to True.
How to approach this problem? I tried something like this, but couldn't quite finish it:
for subdict_number, values1 in dict.items():
    smallest_date = None
    for subdict_alphabet, values2 in values1.items():
        if smallest_date is None or smallest_date > values2["date"]:
            smallest_date = values2["date"]
            smallest_subdict = subdict_alphabet
And then some magic that, once the loop within the subdict closes, sets
dict[subdict][smallest_subdict]["date"] = smallest_date
and then continues to the next subdict to do the same thing.
I can't finish this. Can you help me out? A completely different approach can be used, but as a beginner I couldn't think of one.
I've tried to keep the naming explanatory.
Given the input dictionary:
main_dict = {'subdict1': {'subdict_1a': {"date": 1, "smallest_date": False},
                          'subdict_1b': {"date": 2, "smallest_date": False}},
             'subdict2': {'subdict_2a': {"date": 3, "smallest_date": False},
                          'subdict_2b': {"date": 4, "smallest_date": False}}}
Iterate through the subdicts and declare variables:
for subdict in main_dict:
    min_date = 10000000
    min_date_subsubdict_name = None
Iterate through the subsubdicts and determine the minimum:
    for subsubdict in main_dict[subdict]:
        if main_dict[subdict][subsubdict]['date'] < min_date:
            min_date = main_dict[subdict][subsubdict]['date']
            min_date_subsubdict_name = subsubdict
Inside the first loop, but outside the second loop:
    main_dict[subdict][min_date_subsubdict_name]['smallest_date'] = True
This should produce the following main_dict:
{'subdict2': {'subdict_2a': {'date': 3, 'smallest_date': True}, 'subdict_2b': {'date': 4, 'smallest_date': False}}, 'subdict1': {'subdict_1a': {'date': 1, 'smallest_date': True}, 'subdict_1b': {'date': 2, 'smallest_date': False}}}
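A more compact variant of the same idea, as a sketch: the built-in min() accepts a key function, so the inner loop and the large sentinel value are not needed. The sample data is the main_dict from above. Skipping empty subdicts is an assumption added here, since min() raises ValueError on an empty sequence.

```python
main_dict = {
    'subdict1': {'subdict_1a': {"date": 1, "smallest_date": False},
                 'subdict_1b': {"date": 2, "smallest_date": False}},
    'subdict2': {'subdict_2a': {"date": 3, "smallest_date": False},
                 'subdict_2b': {"date": 4, "smallest_date": False}},
}

for subdict in main_dict.values():
    if not subdict:  # assumption: silently skip empty subdicts
        continue
    # Name of the sub-subdict whose "date" value is the smallest
    smallest_key = min(subdict, key=lambda name: subdict[name]["date"])
    subdict[smallest_key]["smallest_date"] = True

print(main_dict)
```

If two entries share the smallest date, min() returns the first one it encounters, so only that entry is flagged.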
I'm trying to add variables to a list that I created. I get a result from a session.execute call.
I've done this:
import datetime

import pytz
from sqlalchemy import text

def machine_id(session, machine_serial):
    stmt_raw = '''
        SELECT
            id
        FROM
            machine
        WHERE
            machine.serial = :machine_serial_arg
    '''
    utc_now = datetime.datetime.utcnow()
    utc_now_iso = pytz.utc.localize(utc_now).isoformat()
    utc_start = datetime.datetime.utcnow() - datetime.timedelta(days = 30)
    utc_start_iso = pytz.utc.localize(utc_start).isoformat()
    stmt_args = {
        'machine_serial_arg': machine_serial,
    }
    stmt = text(stmt_raw).columns(
        #ts_insert = ISODateTime
    )
    result = session.execute(stmt, stmt_args)
    ts = utc_now_iso
    ts_start = utc_start_iso
    ID = []
    # One dict per returned row; every row gets the same two timestamps
    for row in result:
        ID.append({
            'id': row[0],
            'ts': ts,
            'ts_start': ts_start,
        })
    return ID
I'm trying to return the result through the API like this:
def form_response(response, session):
    result_machine_id = machine_id(session, machine_serial)
    if not result_machine_id:
        # German for "Serial number not present/found"
        response['Error'] = 'Seriennummer nicht vorhanden/gefunden'
        return
    response['id_timerange'] = result_machine_id
The output looks fine:
{
    "id_timerange": [
        {
            "id": 1,
            "ts": "2020-08-13T08:32:25.835055+00:00",
            "ts_start": "2020-07-14T08:32:25.835089+00:00"
        }
    ]
}
Now I only want the id from it, to pass as a parameter to another function. The problem is that I don't think it's a list: I can't select the first element, and result_machine_id[0] gives me something like the output posted above. I think my first function only adds ts and ts_start to the first row? Is it possible to add empty rows and then add 'ts': ts as a value?
Any help would be appreciated.
If I have understood your question correctly:
Your output looks like a dict, so access its id_timerange key, which gives you a list. Access the first element of that list, which gives you another dict. That dict has an id key:
result_machine_id["id_timerange"][0]["id"]
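Which expression applies depends on which object is in hand. Going by the question's code, machine_id() returns a list of dicts and form_response() stores that list under the 'id_timerange' key of the response dict. The sketch below rebuilds both shapes from the posted output (no database involved, values copied from the question) to show the two access paths side by side.

```python
# What machine_id() returns, per the question's code: a list of dicts
result_machine_id = [
    {
        "id": 1,
        "ts": "2020-08-13T08:32:25.835055+00:00",
        "ts_start": "2020-07-14T08:32:25.835089+00:00",
    }
]

# What form_response() builds: a dict wrapping that list
response = {"id_timerange": result_machine_id}

# Starting from the list: first element, then its 'id' key
id_from_list = result_machine_id[0]["id"]

# Starting from the response dict: one extra key lookup first
id_from_response = response["id_timerange"][0]["id"]

print(id_from_list, id_from_response)
```

So result_machine_id[0] being a whole dict is expected: it is the first row, and the id is one more lookup away.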
I am working with object variables based off ICD-9 codes and am trying to create a dictionary which identifies all ICD-9 codes between E880.xx and E888.xx.
I want to code all values between E880.xx and E888.xx as a 1, and all other values as a 0.
I attempted this:
fallinjury_Dictionary = {1 : 'E880', 1 : 'E881', 1 : 'E882' ...}
but the key gets overwritten every time and I only end up with one value in the dictionary (1 = E888)
I've also tried this:
fallinjury_Dictionary = {1 : {'E880' or 'E881' or 'E882' .... or 'E888'}
which just doesn't work.
Dictionaries can only have one value for each key. If you want to map each code (such as E880) to the value 1, instead of the other way around, you can do that; otherwise you can't, because uniqueness of keys is an inherent part of dictionaries.
fallinjury_Dictionary = dict.fromkeys(['E880', 'E881', 'E882'], 1)
fallinjury_Dictionary.update(dict.fromkeys(['E880', 'E883'], 0)) # to update values
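For the full E880.xx to E888.xx range, it may be simpler to skip listing every code and test the first four characters instead. This is only a sketch: the helper name fall_injury_flag and the sample codes are made up for illustration, and it assumes the codes are strings that start with the E prefix.

```python
# Four-character prefixes E880 ... E888 (the range end is exclusive)
FALL_PREFIXES = {f"E{number}" for number in range(880, 889)}

def fall_injury_flag(code):
    """Return 1 if the code starts with E880 ... E888, otherwise 0."""
    prefix = str(code).strip().upper()[:4]
    return 1 if prefix in FALL_PREFIXES else 0

# Hypothetical sample codes, with and without a decimal part
sample_codes = ["E880.1", "E888", "E901.0", "E879.9"]
flags = {code: fall_injury_flag(code) for code in sample_codes}
print(flags)
```

Because each code is the key and the flag is the value, nothing gets overwritten.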
If I understood you correctly, the entry for E901 should be 0: 'E901':
fallinjury_Dictionary = {1 : 'E880', 1 : 'E881', 1 : 'E882', ... 0: 'E901'}
# Instead I would switch the key and the value for this case
fallinjury_Dictionary = {'E880': 1, 'E881': 1, 'E882':1, ... 'E901': 0}
# You could use boolean
fallinjury_Dictionary = {'E880': True, 'E881': True, 'E882': True, ... 'E901': False}
You could also think about creating classes.
I want to update the value of a particular key in the list of dictionaries.
For example I have the following list of dictionaries (Input values):
deviceDynamics = [{'updated': '2019-07-10T10:27:44.763Z',
'created': '2019-07-10T10:27:44.763Z'},
{'updated': '2019-07-10T10:27:44.763Z',
'created': '2019-07-10T10:27:44.763Z'},
{'updated': '2019-07-10T10:27:44.763Z',
'created': '2019-07-10T10:27:44.763Z'}]
My code is:
from datetime import datetime

for d in deviceDynamics:
    timestamp = ((datetime.strptime(d['updated'], '%Y-%m-%dT%H:%M:%S.%fZ')) - datetime(1970, 1, 1)).total_seconds()
    d = next(d for d in deviceDynamics)
    d['updated'] = timestamp
    print(deviceDynamics)
Instead of changing the 'updated' value in every dictionary, it only changes the first one. The following is the output:
[{'created': '2019-07-10T10:27:44.763Z', 'updated': 1562754464.763}, {'created': '2019-07-10T10:27:44.763Z', 'updated': '2019-07-10T10:27:44.763Z'}, {'created': '2019-07-10T10:27:44.763Z', 'updated': '2019-07-10T10:27:44.763Z'}]
The 'updated' values in the other dictionaries are not changed. Any suggestions, please?
Remove the line where you are setting d equal to the new iterator that is always stuck on the first item.
for d in deviceDynamics:
    timestamp = ((datetime.strptime(d['updated'], '%Y-%m-%dT%H:%M:%S.%fZ')) - datetime(1970, 1, 1)).total_seconds()
    d['updated'] = timestamp
    print(deviceDynamics)
If you would like to print the entire list with the updated KV pairs, include a print statement outside of your for loop.
for d in deviceDynamics:
    timestamp = ((datetime.strptime(d['updated'], '%Y-%m-%dT%H:%M:%S.%fZ')) - datetime(1970, 1, 1)).total_seconds()
    d['updated'] = timestamp
print(deviceDynamics)
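A self-contained version of the same loop, as a sketch. The helper name to_epoch_seconds is made up here. It marks the parsed value as UTC (which the trailing Z suggests) and calls timestamp() instead of subtracting the epoch by hand.

```python
from datetime import datetime, timezone

deviceDynamics = [
    {'updated': '2019-07-10T10:27:44.763Z', 'created': '2019-07-10T10:27:44.763Z'},
    {'updated': '2019-07-10T10:27:44.763Z', 'created': '2019-07-10T10:27:44.763Z'},
    {'updated': '2019-07-10T10:27:44.763Z', 'created': '2019-07-10T10:27:44.763Z'},
]

def to_epoch_seconds(stamp):
    """Parse a 'YYYY-MM-DDTHH:MM:SS.fffZ' string as UTC and return epoch seconds."""
    parsed = datetime.strptime(stamp, '%Y-%m-%dT%H:%M:%S.%fZ')
    return parsed.replace(tzinfo=timezone.utc).timestamp()

# d is each dictionary in turn, so assigning to d['updated'] edits it in place
for d in deviceDynamics:
    d['updated'] = to_epoch_seconds(d['updated'])

print(deviceDynamics)
```

The 'created' values stay as strings because the loop only touches 'updated'.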
I have a use case with multiple line plots (with legends) that I need to update based on a column condition. Below is an example with two data sets; the column data source changes depending on the country. The issue I am facing is that the number of columns in the data source is not fixed, and even the column types can vary. So when I update the data source in a callback after a new country is selected, I get this error:
Error: attempted to retrieve property array for nonexistent field 'pay_conv_7d.content'.
I am guessing this is because the pay_conv_7d.content column doesn't exist in the new data source, while the lines for it were already in my plot. I have tried to fix this in various ways (making the columns common to all country selections, adding the missing column to the data source in the callback), but I still get issues.
Is there a clean way to have multiple line plots update through a callback without resorting to a lot of hacks? Any insights or help would be really appreciated. Thanks in advance! :)
def setup_multiline_plots(x_axis, y_axis, title_text, data_source, plot):
    num_categories = len(data_source.data['categories'])
    legends_list = list(data_source.data['categories'])
    colors_list = Spectral11[0:num_categories]
    # xs = [data_source.data['%s.'%x_axis].values] * num_categories
    # ys = [data_source.data[('%s.%s')%(y_axis,column)] for column in data_source.data['categories']]
    # data_source.data['x_series'] = xs
    # data_source.data['y_series'] = ys
    # plot.multi_line('x_series', 'y_series', line_color=colors_list,legend='categories', line_width=3, source=data_source)
    plot_list = []
    for (colr, leg, column) in zip(colors_list, legends_list, data_source.data['categories']):
        xs, ys = '%s.'%x_axis, ('%s.%s')%(y_axis,column)
        plot.line(xs,ys, source=data_source, color=colr, legend=leg, line_width=3, name=ys)
        plot_list.append(ys)
    data_source.data['plot_names'] = data_source.data.get('plot_names',[]) + plot_list
    plot.title.text = title_text
def update_plot(country, timeseries_df, timeseries_source,
                aggregate_df, aggregate_source, category,
                plot_pay_7d, plot_r_pay_90d):
    aggregate_metrics = aggregate_df.loc[aggregate_df.country == country]
    aggregate_metrics = aggregate_metrics.nlargest(10, 'cost')
    category_types = list(aggregate_metrics[category].unique())
    timeseries_df = timeseries_df[timeseries_df[category].isin(category_types)]
    timeseries_multi_line_metrics = get_multiline_column_datasource(timeseries_df, category, country)
    # len_series = len(timeseries_multi_line_metrics.data['time.'])
    # previous_legends = timeseries_source.data['plot_names']
    # current_legends = timeseries_multi_line_metrics.data.keys()
    # common_legends = list(set(previous_legends) & set(current_legends))
    # additional_legends_list = list(set(previous_legends) - set(current_legends))
    # for legend in additional_legends_list:
    #     zeros = pd.Series(np.array([0] * len_series), name=legend)
    #     timeseries_multi_line_metrics.add(zeros, legend)
    # timeseries_multi_line_metrics.data['plot_names'] = previous_legends
    timeseries_source.data = timeseries_multi_line_metrics.data
    aggregate_source.data = aggregate_source.from_df(aggregate_metrics)
def get_multiline_column_datasource(df, category, country):
    df_country = df[df.country == country]
    df_pivoted = pd.DataFrame(df_country.pivot_table(index='time', columns=category, aggfunc=np.sum).reset_index())
    df_pivoted.columns = df_pivoted.columns.to_series().str.join('.')
    categories = list(set([column.split('.')[1] for column in list(df_pivoted.columns)]))[1:]
    data_source = ColumnDataSource(df_pivoted)
    data_source.data['categories'] = categories
    # update_plot() reads .data from the result, so the source has to be returned
    return data_source
Recently I had to update data on a Multiline glyph. Check my question if you want to take a look at my algorithm.
I think you can update a ColumnDataSource in at least three ways:
You can create a dataframe to instantiate a new CDS:
cds = ColumnDataSource(df_pivoted)
data_source.data = cds.data
You can create a dictionary and assign it to the data attribute directly:
d = {
    'xs0': [[7.0, 986.0], [17.0, 6.0], [7.0, 67.0]],
    'ys0': [[79.0, 69.0], [179.0, 169.0], [729.0, 69.0]],
    'xs1': [[17.0, 166.0], [17.0, 116.0], [17.0, 126.0]],
    'ys1': [[179.0, 169.0], [179.0, 1169.0], [1729.0, 169.0]],
    'xs2': [[27.0, 276.0], [27.0, 216.0], [27.0, 226.0]],
    'ys2': [[279.0, 269.0], [279.0, 2619.0], [2579.0, 2569.0]]
}
data_source.data = d
With this approach, if some columns are shorter or empty, you can fill the gaps with NaN values so that every column the plot refers to still exists and the column lengths stay consistent. I think this is the solution to your question:
import numpy as np
d = {
    'xs0': [[7.0, 986.0], [17.0, 6.0], [7.0, 67.0]],
    'ys0': [[79.0, 69.0], [179.0, 169.0], [729.0, 69.0]],
    'xs1': [[17.0, 166.0], [np.nan], [np.nan]],
    'ys1': [[179.0, 169.0], [np.nan], [np.nan]],
    'xs2': [[np.nan], [np.nan], [np.nan]],
    'ys2': [[np.nan], [np.nan], [np.nan]]
}
data_source.data = d
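Applied to the question's line plots, the NaN idea could look roughly like the sketch below. The helper pad_missing_columns is hypothetical (it is not part of Bokeh), and the 'pay_conv_7d.social' column is invented for the example. It works on plain dictionaries, so it runs without Bokeh installed.

```python
import math

def pad_missing_columns(new_data, expected_columns):
    """Return a copy of new_data that contains every expected column.

    Columns missing from new_data are filled with NaN, using the length
    of the longest column already present.
    """
    length = max((len(values) for values in new_data.values()), default=0)
    padded = dict(new_data)
    for name in expected_columns:
        if name not in padded:
            padded[name] = [math.nan] * length
    return padded

# Columns the existing line glyphs refer to (assumed for this example)
expected = ['time.', 'pay_conv_7d.content', 'pay_conv_7d.social']

# Data for a newly selected country that lacks the 'content' series
new_data = {'time.': [1, 2, 3], 'pay_conv_7d.social': [0.1, 0.2, 0.3]}

padded = pad_missing_columns(new_data, expected)
print(sorted(padded))
```

The padded dictionary could then be assigned in a single step, for example timeseries_source.data = padded, so that every field an existing glyph refers to is present.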
Or, if you only need to modify a few values, you can use the patch method. See the documentation for ColumnDataSource.patch.
The following example shows how to patch entire column elements:
source = ColumnDataSource(data=dict(foo=[10, 20, 30], bar=[100, 200, 300]))
patches = {
    'foo' : [ (slice(2), [11, 12]) ],
    'bar' : [ (0, 101), (2, 301) ],
}
source.patch(patches)
After this operation, the value of source.data will be:
dict(foo=[11, 12, 30], bar=[101, 200, 301])
NOTE: It is important to make the update in one go to avoid performance issues
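To see what those patches do element by element, here is a plain-Python emulation of the patch semantics. The helper apply_patches is hypothetical and is not Bokeh's own code. It only mimics the (index, new_value) convention on ordinary lists, where the index may be an int or a slice.

```python
def apply_patches(data, patches):
    """Apply (index, new_value) patches to columns stored as plain lists."""
    for column, changes in patches.items():
        for index, new_value in changes:
            # List item assignment accepts both int and slice indices
            data[column][index] = new_value
    return data

data = dict(foo=[10, 20, 30], bar=[100, 200, 300])
patches = {
    'foo': [(slice(2), [11, 12])],   # replace the first two elements
    'bar': [(0, 101), (2, 301)],     # replace elements 0 and 2
}
apply_patches(data, patches)
print(data)  # {'foo': [11, 12, 30], 'bar': [101, 200, 301]}
```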
Could you help me with my issue? Let's say I have a few lists with the IDs of their members, like below:
team_A = [1,2,3,4,5]
team_B = [6,7,8,9,10]
team_C = [11,12,13,14,15]
and now I have a dictionary with their values:
dictionary = {5:23, 10:68, 15:68, 4:1, 9:37, 14:21, 3:987, 8:3, 13:14, 2:98, 7:74, 12:47, 1:37, 6:82, 11:99}
I would like to take the matching elements from the dictionary and create a new dictionary for each of teams A, B and C, like below:
team_A_values = {5:23, 4:1, 3:987, 2:98, 1:37}
Could you give me advice on how to do that? Thanks for your help.
You can do something like the following by just iterating through the lists:
team_A = [1,2,3,4,5]
team_B = [6,7,8,9,10]
team_C = [11,12,13,14,15]
dictionary = {5:23, 10:68, 15:68, 4:1, 9:37, 14:21, 3:987, 8:3, 13:14, 2:98, 7:74, 12:47, 1:37, 6:82, 11:99}
team_A_values = {}
for i in team_A:
    team_A_values[i] = dictionary[i]
print(team_A_values)
You can repeat this for team B and team C.
In that case you can do it like this:
team_values = [{i: dictionary[i] for i in team_A },{i: dictionary[i] for i in team_B},{i: dictionary[i] for i in team_C}]
teamA,teamB,teamC = team_values
print(team_values)
print(teamA)
print(teamB)
print(teamC)
In one line you can do it like this:
team_values = [{i: dictionary[i] for i in team } for team in [team_A ,team_B ,team_C]]
teamA,teamB,teamC = team_values
print(team_values)
print(teamA)
print(teamB)
print(teamC)
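If the number of teams may grow, another option is to key the results by team name rather than unpacking a list. This is a sketch: the teams mapping and its 'A', 'B', 'C' labels are introduced here for illustration, and skipping IDs that are missing from the dictionary is an assumption (the versions above would raise KeyError instead).

```python
team_A = [1, 2, 3, 4, 5]
team_B = [6, 7, 8, 9, 10]
team_C = [11, 12, 13, 14, 15]
dictionary = {5: 23, 10: 68, 15: 68, 4: 1, 9: 37, 14: 21, 3: 987, 8: 3,
              13: 14, 2: 98, 7: 74, 12: 47, 1: 37, 6: 82, 11: 99}

# Hypothetical labels for the three member lists
teams = {'A': team_A, 'B': team_B, 'C': team_C}

# One sub-dictionary per team, keeping only IDs present in `dictionary`
team_values = {
    name: {member: dictionary[member] for member in members if member in dictionary}
    for name, members in teams.items()
}

print(team_values['A'])
```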