I am trying to write a shopping program in Python, so I need to categorize shopping items under default categories or under new categories that the user adds, like below:
1- The user can add categories and items and also update them.
shop = [category1[ [item name : apple , count : 2 , price:1$],[item name :orange , count :2 , price:3]],category2[[item name : spoon , count :2 , price :3],[item name :fork , count :4 , price:5]]]
You may be better off using a dictionary for the data:
shop = {
    'category1': {
        'apple': { 'count': 2, 'price': 1 },
        'orange': { 'count': 2, 'price': 3 }
    },
    'category2': {
        'spoon': { 'count': 2, 'price': 3 },
        'fork': { 'count': 4, 'price': 5 }
    }
}
You can still iterate over the keys if you want, and it provides sensible nesting because you can access the named keys instead of indexes.
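As a minimal sketch of how adding and updating might look with this layout (the helper names add_category, add_item, and update_item are hypothetical, not something from the question):

```python
# Nested dict: category name -> item name -> {'count', 'price'}
shop = {
    'category1': {
        'apple': {'count': 2, 'price': 1},
        'orange': {'count': 2, 'price': 3},
    },
}

def add_category(shop, category):
    """Create an empty category if it does not exist yet."""
    shop.setdefault(category, {})

def add_item(shop, category, name, count, price):
    """Add an item, creating the category on the fly if needed."""
    shop.setdefault(category, {})[name] = {'count': count, 'price': price}

def update_item(shop, category, name, **changes):
    """Update selected fields of an existing item; raises KeyError if it is missing."""
    shop[category][name].update(changes)

add_category(shop, 'category2')
add_item(shop, 'category2', 'spoon', count=2, price=3)
update_item(shop, 'category1', 'apple', count=5)

# Iterating over the keys still works, as noted above
for category, items in shop.items():
    for name, info in items.items():
        print(category, name, info['count'], info['price'])
```

Because every level is keyed by name, an update touches exactly one item without scanning a list for it.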
I have a DataFrame that has a website, categories, and keywords for that website.
Url | categories | keywords
Espn | [sport, nba, nfl] | [half, touchdown, referee, player, goal]
Tmz | [entertainment, sport] | [gossip, celebrity, player]
Goal | [sport, premier_league, champions_league] | [football, goal, stadium, player, referee]
Which can be created using this code:
data = [{ 'Url': 'ESPN', 'categories': ['sport', 'nba', 'nfl'] ,
'keywords': ["half", "touchdown", "referee", "player", "goal"] },
{ 'Url': 'TMZ', 'categories': ["entertainment", "sport"] ,
'keywords': ["gossip", "celebrity", "player"] },
{ 'Url': 'Goal', 'categories': ["sport", "premier_league", "champions_league"] ,
'keywords': ["football", "goal", "stadium", "player", "referee"]},
]
df = pd.DataFrame(data)  # assumes pandas was imported with: import pandas as pd
For each word in the keywords column, I want to get the frequency of the categories associated with it. The result might look like this:
{half: {sport: 1, nba: 1, nfl: 1}, touchdown : {sport: 1, nba: 1,
nfl: 1}, referee: {sport: 2, nba: 1, nfl: 1, premier_league: 1,
champions_league:1 }, player: {sport: 3, nba: 1, nfl: 1,
premier_league: 1, champions_league:1 }, gossip: {sport:1,
entertainment:1}, celebrity: {sport:1, entertainment:1}, goal:
{sport:2, premier_league:1, champions_league:1, nba: 1, nfl: 1},
stadium:{sport:1, premier_league:1, champions_league:1} }
Since the columns contain lists, you can explode them to repeat a row once for each element per list:
result = (
df.explode("keywords")
.explode("categories")
.groupby(["keywords", "categories"])
.size()
)
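Note that result is a Series indexed by (keyword, category) pairs rather than the nested dict shown in the question. As a cross-check, the same counts can be built in plain Python. This is only a sketch that swaps pandas for collections.Counter and reuses the data list from the question; the names freq and nested are my own:

```python
from collections import Counter, defaultdict

data = [
    {'Url': 'ESPN', 'categories': ['sport', 'nba', 'nfl'],
     'keywords': ['half', 'touchdown', 'referee', 'player', 'goal']},
    {'Url': 'TMZ', 'categories': ['entertainment', 'sport'],
     'keywords': ['gossip', 'celebrity', 'player']},
    {'Url': 'Goal', 'categories': ['sport', 'premier_league', 'champions_league'],
     'keywords': ['football', 'goal', 'stadium', 'player', 'referee']},
]

# keyword -> Counter of the categories that appear on the same row
freq = defaultdict(Counter)
for row in data:
    for keyword in row['keywords']:
        freq[keyword].update(row['categories'])

# Convert the Counters to plain dicts to get the nested shape from the question
nested = {keyword: dict(counts) for keyword, counts in freq.items()}
print(nested['referee'])
```

Every (keyword, category) pair on a row is counted once, which is the same pairing the two explode calls produce.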
In the AWS API documentation it wants me to call a function in the boto3 module like this:
response = client.put_metric_data(
Namespace='string',
MetricData=[
{
'MetricName': 'string',
'Dimensions': [
{
'Name': 'string',
'Value': 'string'
},
],
'Timestamp': datetime(2015, 1, 1),
'Value': 123.0,
'StatisticValues': {
'SampleCount': 123.0,
'Sum': 123.0,
'Minimum': 123.0,
'Maximum': 123.0
},
'Values': [
123.0,
],
'Counts': [
123.0,
],
'Unit': 'Seconds'|'Microseconds'|'Milliseconds'|'Bytes'|'Kilobytes'|'Megabytes'|'Gigabytes'|'Terabytes'|'Bits'|'Kilobits'|'Megabits'|'Gigabits'|'Terabits'|'Percent'|'Count'|'Bytes/Second'|'Kilobytes/Second'|'Megabytes/Second'|'Gigabytes/Second'|'Terabytes/Second'|'Bits/Second'|'Kilobits/Second'|'Megabits/Second'|'Gigabits/Second'|'Terabits/Second'|'Count/Second'|'None',
'StorageResolution': 123
},
]
)
So, I set a variable using the same format:
cw_metric = [
{
'MetricName': '',
'Dimensions': [
{
'Name': 'Protocol',
'Value': 'SSH'
},
],
'Timestamp': '',
'Value': 0,
'StatisticValues': {
'SampleCount': 1,
'Sum': 0,
'Minimum': 0,
'Maximum': 0
}
}
]
To my untrained eye, this looks simply like json and I am able to use json.dumps(cw_metric) to get a JSON formatted string output that looks, well, exactly the same.
But, apparently, in Python, when I use brackets I am creating a list and when I use curly brackets I am creating a dict. So what did I create above? A list of dicts or in the case of Dimensions a list of dicts with a list of dicts? Can someone help me to understand that?
And finally, now that I have created the cw_metric variable I want to update some of the values inside of it. I've tried several combinations. I want to do something like this:
cw_metric['StatisticValues']['SampleCount']=2
I am of course told that I can't use a str as an index on a list.
So, I try something like this:
cw_metric[4][0]=2
or
cw_metric[4]['SampleCount']=2
It all just ends up in errors.
I found that this works:
cw_metric[0]['StatisticValues']['SampleCount']=2
But, that just seems stupid. Is this the proper way to handle this?
cw_metric is a list of one dictionary. Thus, cw_metric[0] is that dictionary. cw_metric[0]['Dimensions'] is a list of one dictionary as well. cw_metric[0]['StatisticValues'] is just a dictionary. One of its elements is, for example, cw_metric[0]['StatisticValues']['SampleCount'] == 1.
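A short sketch that walks the structure level by level may make this easier to see. It only uses the cw_metric variable from the question plus the standard json module; the alias metric and the new 'HTTPS' value are mine, for illustration:

```python
import json

cw_metric = [
    {
        'MetricName': '',
        'Dimensions': [{'Name': 'Protocol', 'Value': 'SSH'}],
        'Timestamp': '',
        'Value': 0,
        'StatisticValues': {'SampleCount': 1, 'Sum': 0, 'Minimum': 0, 'Maximum': 0},
    }
]

print(type(cw_metric).__name__)                           # outer container
print(type(cw_metric[0]).__name__)                        # its single element
print(type(cw_metric[0]['Dimensions']).__name__)          # nested container
print(type(cw_metric[0]['StatisticValues']).__name__)     # nested mapping

# Binding the inner dict to a name avoids repeating cw_metric[0] everywhere;
# metric is the same object, so changes show up in cw_metric as well.
metric = cw_metric[0]
metric['StatisticValues']['SampleCount'] = 2
metric['Dimensions'][0]['Value'] = 'HTTPS'

# Lists and dicts of strings and numbers survive a JSON round trip unchanged,
# which is why the json.dumps output looks the same as the Python literal.
round_trip = json.loads(json.dumps(cw_metric))
```

So cw_metric[0]['StatisticValues']['SampleCount'] = 2 is the normal way to do it: integer indexes for lists, keys for dicts.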
I have a nested dictionary and am trying to iterate over it and get the values by key.
I have a payload which has route as its main node. Inside route I have many waypoints; I would like to iterate over all waypoints and set the value of each key into a protobuf variable.
Sample code below:
'payload':
{
'route':
{
'name': 'Argo',
'navigation_type': 2,
'backtracking': False,
'continuous': False,
'waypoints':
{
'id': 2,
'coordinate':
{
'type': 0,
'x': 51.435989,
'y': 25.32838,
'z': 0
},
'velocity': 0.55555582,
'constrained': True,
'action':
{
'type': 1,
'duration': 0
}
}
'waypoints':
{
'id': 2,
'coordinate':
{
'type': 0,
'x': 51.435989,
'y': 25.32838,
'z': 0
},
'velocity': 0.55555582,
'constrained': True,
'action':
{
'type': 1,
'duration': 0
}
}
},
'waypoint_status_list':
{
'id': 1,
'status': 'executing'
},
'autonomy_status': 3
},
# method to iterate over payload
def get_encoded_payload(self, payload):
    # 1. fill route proto from payload
    a = payload["route"]["name"]  # working fine
    b = payload["route"]["navigation_type"]  # working fine
    c = payload["route"]["backtracking"]  # working fine
    d = payload["route"]["continuous"]  # working fine
    self.logger.debug(type(payload["route"]["waypoints"]))  # type is dict
    # iterate over waypoints
    for waypoint in payload["route"]["waypoints"]:
        wp_id = waypoint["id"]  # Error: string indices must be integers
I would like to iterate over all waypoints and set the value of each key to a variable.
self.logger.debug(type(payload["route"]["waypoints"])) # type is dict
Iterating over a dict gives you its keys. Your later code seems to be expecting multiple waypoints as a list of dicts, which would work, but that's not what your structure actually contains.
Try print(waypoint) and see what you get.
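A small sketch of the difference, using a trimmed-down waypoint. The list-of-dicts layout and the second waypoint's values are my assumption about what you intended, not something your payload currently contains. Note also that a dict cannot hold the key 'waypoints' twice: in a dict literal the later entry silently replaces the earlier one.

```python
# A single waypoint stored as a dict, as in the question's structure
route = {'waypoints': {'id': 2, 'velocity': 0.55555582, 'constrained': True}}

# Looping over a dict yields its keys, so each "waypoint" is just a string
keys_seen = [waypoint for waypoint in route['waypoints']]
print(keys_seen)

# Assumed alternative: store the waypoints as a list of dicts
route_fixed = {
    'waypoints': [
        {'id': 1, 'velocity': 0.5, 'constrained': True},          # illustrative values
        {'id': 2, 'velocity': 0.55555582, 'constrained': True},
    ]
}

# Now each loop variable is a dict and can be indexed by key
wp_ids = [waypoint['id'] for waypoint in route_fixed['waypoints']]
print(wp_ids)
```

With the list layout, the loop in get_encoded_payload would receive one dict per waypoint, and waypoint["id"] would work as written.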
I have data in table in this format
emp_id,emp_name,title,supervisor_id,supervisor_name
11,Anant,Business Unit Executive,8,abc
15,Raina,Analysis Manager Senior,11,Anant
16,Kumar,Conversion Manager,11,Anant
18,amit,Analyst Specialist,11,Anant
25,anil,senior engineer,18,amit
35,Pang Pang,senior engineer,25,anil
38,Xiang Xiang,UE engineer,25,anil
I will enter a supervisor_id and it should return all employees under that supervisor, then continue the same way for each of them until the lowest level is reached. I want to do this in Node and SQL Server with a recursive function.
I want this data in a hierarchical form like this:
var ds ={ 'emp_id':11,
'name': 'Anant',
'title': 'Business Unit Executive',
'children': [
{ 'name': 'Raina','emp_id':15, 'title': 'Analysis Manager Senior' },
{ 'name': 'Kumar','emp_id':16, 'title': 'Conversion Manager' },
{ 'name': 'amit', 'emp_id':18, 'title': 'Analyst Specialist',
'children': [
{ 'name': 'anil','emp_id':25, 'title': 'senior engineer' ,
'children': [
{ 'name': 'Pang Pang','emp_id':35, 'title': 'engineer' },
{ 'name': 'Xiang Xiang', 'emp_id':38,'title': 'UE engineer' }
]
}
]
}
]
};
I'm not familiar with the library you are using to query the server, so I will pseudocode those portions.
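async function getEmployeesBySupervisorId(supervisor_id) {
  // <get-employees-query> is a placeholder for your own query; you may also need to map the results to {emp_id, name, title} depending on your query library. Default to [] if no employees are found.
  const employees = await <get-employees-query>
  // the async callback returns the employee, so Promise.all resolves to the filled-in array
  return Promise.all(employees.map(async (employee) => {
    employee.children = await getEmployeesBySupervisorId(employee.emp_id)
    return employee
  }))
}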
That will get you an array of employees, each with its children, until no more employees are found.
While this will work, it fires many queries; it may be better to leverage SQL and your ORM to make this more efficient in the future.
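One way to cut the query count is to fetch all relevant rows once and nest them in memory. The sketch below is in Python rather than Node, purely to illustrate the idea, and it assumes the rows have already been loaded somehow (a single query or a recursive CTE on the SQL Server side would be one option, but that depends on your setup). The function name build_tree is hypothetical, and the data is assumed to contain no cycles.

```python
from collections import defaultdict

# Rows as (emp_id, emp_name, title, supervisor_id), copied from the question's table
rows = [
    (11, 'Anant', 'Business Unit Executive', 8),
    (15, 'Raina', 'Analysis Manager Senior', 11),
    (16, 'Kumar', 'Conversion Manager', 11),
    (18, 'amit', 'Analyst Specialist', 11),
    (25, 'anil', 'senior engineer', 18),
    (35, 'Pang Pang', 'senior engineer', 25),
    (38, 'Xiang Xiang', 'UE engineer', 25),
]

def build_tree(rows, root_id):
    """Nest employees under their supervisor, starting from root_id."""
    nodes = {}
    children_of = defaultdict(list)
    for emp_id, name, title, supervisor_id in rows:
        nodes[emp_id] = {'emp_id': emp_id, 'name': name, 'title': title}
        children_of[supervisor_id].append(emp_id)

    def attach(emp_id):
        node = nodes[emp_id]
        kids = [attach(child_id) for child_id in children_of.get(emp_id, [])]
        if kids:  # leaf employees get no 'children' key, as in the desired output
            node['children'] = kids
        return node

    return attach(root_id)

ds = build_tree(rows, 11)
print(ds['name'], [child['name'] for child in ds['children']])
```

The same two-step approach (index rows by supervisor, then recurse over the index) carries over to JavaScript with a Map or plain object.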
So I have a set of data that has timestamps associated with it. I want Mongo to aggregate the entries that have duplicates within a 3-minute window. I'll show you an example of what I mean:
Original Data:
[{"fruit" : "apple", "timestamp": "2014-07-17T06:45:18Z"},
{"fruit" : "apple", "timestamp": "2014-07-17T06:47:18Z"},
{"fruit" : "apple", "timestamp": "2014-07-17T06:55:18Z"}]
After querying, it would be:
[{"fruit" : "apple", "timestamp": "2014-07-17T06:45:18Z"},
{"fruit" : "apple", "timestamp": "2014-07-17T06:55:18Z"}]
Because the second entry was within the 3-minute bubble created by the first entry. I've gotten the code to the point where it aggregates and removes dupes that have the same fruit, but now I only want to combine the ones that are within the timestamp bubble.
We should be able to do this! First let's split an hour into 3-minute 'bubbles':
[0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45, 48, 51, 54, 57]
Now to group these documents we need to modify the timestamp a little. As far as I know this isn't currently possible with the aggregation framework, so instead I will use the group() method.
In order to group fruits within the same time period we need to round the timestamp down to the start of its 3-minute 'bubble'. We can do this with timestamp.minutes -= (timestamp.minutes % 3).
Here is the resulting query:
db.collection.group({
    keyf: function (doc) {
        var timestamp = new ISODate(doc.timestamp);
        // seconds must be equal across a 'bubble'
        timestamp.setUTCSeconds(0);
        // round down to the nearest 3 minute 'bubble'
        var remainder = timestamp.getUTCMinutes() % 3;
        var bubbleMinute = timestamp.getUTCMinutes() - remainder;
        timestamp.setUTCMinutes(bubbleMinute);
        return { fruit: doc.fruit, 'timestamp': timestamp };
    },
    reduce: function (curr, result) {
        result.sum += 1;
    },
    initial: {
        sum : 0
    }
});
Example results:
[
{
"fruit" : "apple",
"timestamp" : ISODate("2014-07-17T06:45:00Z"),
"sum" : 2
},
{
"fruit" : "apple",
"timestamp" : ISODate("2014-07-17T06:54:00Z"),
"sum" : 1
},
{
"fruit" : "banana",
"timestamp" : ISODate("2014-07-17T09:03:00Z"),
"sum" : 1
},
{
"fruit" : "orange",
"timestamp" : ISODate("2014-07-17T14:24:00Z"),
"sum" : 2
}
]
To make this easier you could precompute the 'bubble' timestamp and insert it into the document as a separate field. The documents you create would look something like this:
[
{"fruit" : "apple", "timestamp": "2014-07-17T06:45:18Z", "bubble": "2014-07-17T06:45:00Z"},
{"fruit" : "apple", "timestamp": "2014-07-17T06:47:18Z", "bubble": "2014-07-17T06:45:00Z"},
{"fruit" : "apple", "timestamp": "2014-07-17T06:55:18Z", "bubble": "2014-07-17T06:54:00Z"}
]
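If the documents are written from application code, the bubble field could be computed just before inserting. Here is a minimal Python sketch of the same round-down rule, assuming timestamps are stored as ISO 8601 strings ending in Z as in the example; the function name bubble_start is hypothetical:

```python
from datetime import datetime, timezone

BUBBLE_MINUTES = 3
ISO_FORMAT = '%Y-%m-%dT%H:%M:%SZ'

def bubble_start(timestamp):
    """Round an ISO 8601 'Z' timestamp down to the start of its 3-minute bubble."""
    parsed = datetime.strptime(timestamp, ISO_FORMAT).replace(tzinfo=timezone.utc)
    minute = parsed.minute - parsed.minute % BUBBLE_MINUTES
    return parsed.replace(minute=minute, second=0, microsecond=0).strftime(ISO_FORMAT)

docs = [
    {'fruit': 'apple', 'timestamp': '2014-07-17T06:45:18Z'},
    {'fruit': 'apple', 'timestamp': '2014-07-17T06:47:18Z'},
    {'fruit': 'apple', 'timestamp': '2014-07-17T06:55:18Z'},
]

# Attach the precomputed bubble to each document before it is stored
for doc in docs:
    doc['bubble'] = bubble_start(doc['timestamp'])
    print(doc)
```

Since 60 is divisible by 3, the bubbles line up at the top of every hour, so only the minute and second fields need adjusting.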
Of course this takes up more storage. However, with this document structure you can use the aggregate function[0].
db.collection.aggregate(
    [
        { $group: { _id: { fruit: "$fruit", bubble: "$bubble" }, sum: { $sum: 1 } } },
    ]
)
Hope that helps!
[0] MongoDB aggregation comparison: group(), $group and MapReduce