How to compare list of dicts with list in python - python-3.x

I am working on a computer vision project where the model is predicting the objects in the frame. I am appending all the objects in a list detectedObjs. I have to create a list of dicts for these detected objects which will contain the name, start time and end time of the object. Start time basically means when the object was first detected and end time means when the object was last detected. So for this I have below code:
for obj in detectedObjs:
    if not objList:
        # First object is detected, save its information
        tmp = dict()
        tmp['Name'] = obj
        tmp['StartTime'] = datetime.datetime.utcnow().isoformat()
        tmp['EndTime'] = datetime.datetime.utcnow().isoformat()
        objList.append(tmp)
    else:
        # Here check if the object is already present in objList
        # If yes, then keep updating end time
        # If no, then add the object information in objList
        for objDict in objList:
            if objDict['Name'] == obj:
                objDict["EndTime"] = datetime.datetime.utcnow().isoformat()
                break
            else:
                tmp = dict()
                tmp['Name'] = obj
                tmp['StartTime'] = datetime.datetime.utcnow().isoformat()
                tmp['EndTime'] = datetime.datetime.utcnow().isoformat()
                objList.append(tmp)
The loop first saves the information for the first detected object. After that, in the else branch, it checks whether the current object is already in objList: if it is, the end time is updated; otherwise the object is added to objList.
The detectedObjs list contains item1, and a few seconds later item2 is added as well. In the output, objList shows item1 added correctly, but item2 is added many times. Is there a way to fix or optimize this code so that each object gets proper start and end times? Thanks.
Below is the full reproducible code. I cannot include the model's prediction code here, so I have added a thread that keeps adding items to the detectedObjs list.
from threading import Thread
import datetime
import time
detectedObjs = []
def doJob():
    global detectedObjs
    for i in range(2):
        if i == 0:
            detectedObjs.append("item1")
        elif i == 1:
            detectedObjs.append("item2")
        elif i == 2:
            detectedObjs.append("item3")
        elif i == 3:
            detectedObjs.remove("item1")
        elif i == 4:
            detectedObjs.remove("item2")
        elif i == 5:
            detectedObjs.remove("item3")
        time.sleep(3)
Thread(target=doJob).start()
while True:
    objList = []
    for obj in detectedObjs:
        if not objList:
            # First object is detected, save its information
            tmp = dict()
            tmp['Name'] = obj
            tmp['StartTime'] = datetime.datetime.utcnow().isoformat()
            tmp['EndTime'] = datetime.datetime.utcnow().isoformat()
            objList.append(tmp)
        else:
            # Here check if the object is already present in objList
            # If yes, then keep updating end time
            # If no, then add the object information in objList
            for objDict in objList:
                if objDict['Name'] == obj:
                    objDict["EndTime"] = datetime.datetime.utcnow().isoformat()
                    break
                else:
                    tmp = dict()
                    tmp['Name'] = obj
                    tmp['StartTime'] = datetime.datetime.utcnow().isoformat()
                    tmp['EndTime'] = datetime.datetime.utcnow().isoformat()
                    objList.append(tmp)
    print(objList)

I would recommend you use a dict containing dicts… here is an untested version of your code…
obj_dict = {}
for obj in detectedObjs:
    if obj not in obj_dict:  # checks the keys for membership
        # first entry
        time_seen = datetime.datetime.utcnow().isoformat()
        obj_dict[obj] = {
            "name": obj,
            "start": time_seen,
            "end": time_seen,
        }
    else:  # additional time(s) seen
        time_seen = datetime.datetime.utcnow().isoformat()
        obj_dict[obj]["end"] = time_seen
This also saves processing as the number of objects grows: a dict lookup by key does not have to search through every entry to find the one to update, which the list version has to do each time.
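As a rough, self-contained sketch of the dict-of-dicts idea: the helper name record_detections and its now parameter are hypothetical (they are not in the question or the answer), and now exists only so the example can run with fixed timestamps instead of the real clock.

```python
from datetime import datetime, timezone


def record_detections(tracked, detected_names, now=None):
    """Update `tracked` in place using one timestamp per call.

    `tracked` maps an object name to {"name", "start", "end"}.
    `now` is a hypothetical hook for supplying a fixed timestamp;
    by default the current UTC time is used.
    """
    if now is None:
        now = datetime.now(timezone.utc).isoformat()
    for name in detected_names:
        entry = tracked.get(name)
        if entry is None:
            # First sighting: start and end are the same instant.
            tracked[name] = {"name": name, "start": now, "end": now}
        else:
            # Seen before: only the end time moves forward.
            entry["end"] = now
    return tracked


tracked = {}  # created once, outside the per-frame loop
record_detections(tracked, ["item1"], now="t0")
record_detections(tracked, ["item1", "item2"], now="t1")
record_detections(tracked, ["item2"], now="t2")

# Convert back to the list-of-dicts shape the question asked for.
obj_list = list(tracked.values())
print(obj_list)
```

The dict has to live outside the loop that processes frames; if it is recreated on every pass, every start time is reset along with it.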

Related

Why am I getting this error? AttributeError: type object 'bubbleSort' has no attribute 'array'

I'm trying to write my own bubble-sort algorithm for learning purposes. I'm doing it by:
Making a random array
Checking whether the first two elements of the array need to be swapped
Doing this throughout the whole list
Repeating this over and over until a full pass through the list needs no swaps, at which point the loop breaks
But when I print any variable in the class, it says that the class has no attribute with that name.
This is my code right now:
from random import randint
class bubbleSort:
    def __init__(self, size):
        self.size = size # Array size
        self.array = [] # Random array
        self.sorted = self.array # Sorted array
        self.random = 0 # Random number
        self.count = 0
        self.done = False
        self.equal = 0
        while self.count != self.size:
            random = randint(1, self.size)
            if random in self.array:
                pass
            else:
                self.array.append(random)
                self.count += 1
    def sort(self):
        while self.done != True:
            self.equal = False
            for i in range(self.size):
                if i == self.size:
                    pass
                else:
                    if self.sorted[i] > [self.tmp]:
                        self.equal += 1
                    if self.equal == self.size:
                        self.done = True
                    else:
                        self.sorted[i], self.sorted[i + 1] = self.sorted[i+1], self.sorted[i]
new = bubbleSort(10)
print(bubbleSort.array)
This is what outputs
Traceback (most recent call last):
File "/home/musab/Documents/Sorting Algorithms/Bubble sort.py", line 38, in <module>
print(bubbleSort.array)
AttributeError: type object 'bubbleSort' has no attribute 'array'
In your case, you have a class called bubbleSort and an instance of this class called new, which you create using new = bubbleSort(10).
Since bubbleSort only refers to the class itself, it has no knowledge of the member fields of any particular instance (the fields you create using self.xyz = abc inside the class's methods). And this is a good thing: imagine having two instances
b1 = bubbleSort(10)
b2 = bubbleSort(20)
and you want to access the array of b1: you need to specify which instance you mean. The way to do it is to write b1.array.
Therefore, in your case you need to print(new.array).
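The difference between the class and its instances can be checked with a tiny example. The class name Box is made up for illustration and is not from the question.

```python
class Box:
    """Tiny illustrative class; the name is hypothetical."""

    def __init__(self, value):
        # Assigned through self, so the attribute exists only on instances.
        self.value = value


a = Box(10)
b = Box(20)

print(a.value, b.value)        # each instance carries its own value
print(hasattr(Box, "value"))   # the class itself has no such attribute

try:
    Box.value
except AttributeError as exc:
    message = str(exc)
    print(message)
```

The message printed at the end has the same shape as the one in the traceback above, with Box and value in place of bubbleSort and array.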
bubbleSort is a class; each object of this class has its own array. To access array, you must go through an instance of the class. __init__ is called when an instance is created.
Give the following a try:
bubbleSortObj = bubbleSort(10) # create a bubbleSort object
print(bubbleSortObj.array) # print the array before sort
bubbleSortObj.sort() # sort the array
print(bubbleSortObj.array) # print the array after sort
Notes
In __init__ you've got:
self.array = [] # Random array
self.sorted = self.array # Sorted array
In this case, array and sorted refer to the same list, so changing one changes the other. To make a copy of the list, one approach (among many) is self.sorted = list(self.array). Note that the list is still empty at that point in __init__, so the copy would have to be made after the loop that fills self.array.
Variables that are only used inside one method do not need self. For example, self.count = 0 can just be count = 0, since it is not needed again once __init__ has finished and does not have to be an instance attribute.
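To make the aliasing point concrete, here is a minimal sketch that contrasts an alias with a copy and then sorts with a plain bubble sort. The function bubble_sort is my own illustration of the algorithm, written as a function and not as a repaired version of the class in the question.

```python
def bubble_sort(values):
    """Return a sorted copy of `values` using bubble sort."""
    result = list(values)          # copy, so the input list is left untouched
    n = len(result)
    swapped = True
    while swapped:                 # stop after a pass that made no swaps
        swapped = False
        for i in range(n - 1):     # stop at n - 1 so i + 1 stays in range
            if result[i] > result[i + 1]:
                result[i], result[i + 1] = result[i + 1], result[i]
                swapped = True
    return result


original = [5, 2, 9, 1]
alias = original             # same list object under a second name
copy = list(original)        # new list with the same elements

alias.append(7)
print(original)              # the append made through `alias` shows up here
print(copy)                  # the copy is unaffected
print(bubble_sort(original))
```

Looping only to n - 1 avoids the index error that range(self.size) combined with self.sorted[i + 1] would run into on the last element.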

multiprocessing.Manager().dict() can't update second level value

I have a dict x_dic in a format like x_dic = {0: {'length': 2, 'current': 0}}. When I use Manager().dict() to pass x_dic to a child process, I find that the value of 'current' cannot be updated by the child process.
method 1:
dic[i]['current'] += 1
method 2:
current_val = dic[i]['current']
current_val += 1
dic[i]['current'] = current_val
import multiprocessing
import os
import time
from functools import partial
# start_process and multi_core are not shown in the question
if __name__ == '__main__':
    # set config of logger
    print("{}:{}:{}".format(time.localtime().tm_hour,
                            time.localtime().tm_min, time.localtime().tm_sec))
    print(os.getpid())
    # set parameter
    lock = multiprocessing.Lock()
    pool = multiprocessing.Pool(processes=2, initializer=start_process)
    # set test dic
    testdic = multiprocessing.Manager().dict()
    x = {0: {'length': 2, 'current': 0}}
    testdic.update(x)
    # before multi
    print('now value testdic', dict(testdic))
    # running
    partialmulti = partial(multi_core, testdic=testdic)
    for i, _ in enumerate(pool.imap_unordered(partialmulti, [0, 0, 0])):
        print('finish process: ', i)
    pool.close()
    pool.join()
    # after multiprocessing
    print('after multi', dict(testdic))
    pool.terminate()
You can try with multiprocessing.Process():
import multiprocessing as mp
m = mp.Manager()
x_dict = m.dict({0: {'length': 2, 'current': 0}})
procs = []
no_of_processes = 2
for i in range(no_of_processes):
    # func_name takes x_dict as its one argument and does all the manipulations required
    p = mp.Process(target=func_name, args=(x_dict,))
    p.start()
    procs.append(p)  # keep a reference to the process so it can be joined below
for proc in procs:
    proc.join()
print(x_dict)
When you put a plain Python dict inside a Manager().dict(), changes made inside that second-level dict are not propagated. The solution is to use another Manager().dict() as the second level. For example:
valuedic = multiprocessing.Manager().dict()
valuedic.update({'length':0,'current':1})
x = {0:valuedic}
testdic.update(x)
Then valuedic will be updated successfully after multiprocessing.
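The behaviour can be reproduced without a pool, because it comes from how the manager proxy hands out values, not from the child process as such. Below is a minimal sketch, assuming Python 3.6 or later (where nested proxies are supported); the helper name bump_by_reassignment is hypothetical. Reading shared[key] from a manager dict returns a local copy of a plain inner dict, so a change only reaches the manager when the key is assigned again or when the inner dict is itself a proxy.

```python
import multiprocessing as mp


def bump_by_reassignment(shared, key):
    """Increment shared[key]['current'] by writing the inner dict back."""
    inner = shared[key]        # for a manager dict this is a local copy
    inner["current"] += 1
    shared[key] = inner        # assigning the key is what the manager sees


def main():
    with mp.Manager() as manager:
        # 1) Plain nested dict: the in-place change is made on a copy.
        plain_nested = manager.dict({0: {"length": 2, "current": 0}})
        plain_nested[0]["current"] += 1
        print("in-place on plain inner dict:", plain_nested[0]["current"])

        # 2) Same data, but the inner dict is written back explicitly.
        bump_by_reassignment(plain_nested, 0)
        print("after reassignment:", plain_nested[0]["current"])

        # 3) The inner dict is itself a manager dict (a nested proxy).
        proxied = manager.dict()
        proxied[0] = manager.dict({"length": 2, "current": 0})
        proxied[0]["current"] += 1
        print("in-place on nested proxy:", proxied[0]["current"])


if __name__ == "__main__":
    main()
```

This also explains why method 2 in the question fails: it still assigns into dic[i]['current'], which is a write to the copy. Note that neither the reassignment nor the += on a proxy is atomic, so several processes updating the same counter would still need a lock around the update.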

Taking info from file and creating a dictionary

My goal is to create a dictionary called sum_by_department that has the department as the key and the combined annual salary of all its employees as the value. This is what I have so far, but I'm a bit lost on how to add all the department names along with the sum of the employees' salaries to that dictionary. The dictionary I currently build only shows each salary amount and how many times it appears in the data. This is where I need help.
import requests
# endpoint
endpoint = "https://data.cityofchicago.org/resource/xzkq-xp2w.json"
# optional parameters
parameters = {"$limit":20,}
# make request
response = requests.get(endpoint, params=parameters)
# Get the response data as a python object.
data = response.json()
count_by_department = {}
sum_by_department = {}
#loop through the data
for i in data:
    if ('department' and 'salary_or_hourly' and 'annual_salary' in i):
        department = i['department']
        pay_type = i['salary_or_hourly']
        anual_salary = i['annual_salary']
        # print(i['annual_salary'])
    else:
        # handle case where there is no department property in that record
        department = 'undefined'
        pay_type = 'n/a'
        anual_salary = 'n/a'
    # print(department,"," ,pay_type)
    # exclude the cases where the pay type is Hourly
    if(pay_type != 'Salary' ):
        pay_type = 0
    # print(department,"," ,pay_type)
    # update the sum_by_department and count_by_department dictionaries
    if (department in count_by_department):
        count_by_department[department] += 1
    else:
        count_by_department[department] = 1
    if (anual_salary in sum_by_department):
        sum_by_department[anual_salary] +=1
    else:
        sum_by_department[anual_salary] = 1
# print(count_by_department)
# print(sum_by_department)
You should add each person's annual_salary to the sum_by_department dictionary, keyed by department, while looping. Also, do not forget to convert annual_salary to float, because the API returns it as a string and adding strings together won't give you a sum.
Example script:
import requests
# endpoint
endpoint = "https://data.cityofchicago.org/resource/xzkq-xp2w.json"
# optional parameters
parameters = {"$limit": 20}
# make request
response = requests.get(endpoint, params=parameters)
# Get the response data as a python object.
data = response.json()
count_by_department = {}
sum_by_department = {}
# loop through the data
for i in data:
    # check each key separately; 'a' and 'b' and 'c' in i only tests the last key
    if all(key in i for key in ('department', 'salary_or_hourly', 'annual_salary')):
        department = i['department']
        pay_type = i['salary_or_hourly']
        annual_salary = float(i['annual_salary'])
        # print(i['annual_salary'])
    else:
        # handle case where there is no department property in that record
        department = 'undefined'
        pay_type = 'n/a'
        annual_salary = 0
    # print(department,"," ,pay_type)
    # exclude the cases where the pay type is Hourly
    if pay_type != 'Salary':
        pay_type = 0
    # print(department,"," ,pay_type)
    # update the sum_by_department and count_by_department dictionaries
    if department in count_by_department:
        count_by_department[department] += 1
        sum_by_department[department] += annual_salary
    else:
        count_by_department[department] = 1
        sum_by_department[department] = annual_salary
# import pdb; pdb.set_trace()
print('count_by_department = ', count_by_department)
print('sum_by_department = ', sum_by_department)
Tip:
Uncomment the pdb line to debug interactively. The Python Debugger (pdb for short) halts the program while it's still running (i.e. in memory), so you can interact with it and inspect all variables.
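The same accumulation can be written a little more compactly with collections.defaultdict. This is only a sketch: the records below are made-up sample data shaped like the API response, so it runs without a network request, and skipping hourly records entirely is my own choice, not something the answer above does.

```python
from collections import defaultdict

# Hypothetical sample records shaped like the API response in the question;
# the departments and numbers are invented for illustration.
records = [
    {"department": "POLICE", "salary_or_hourly": "Salary", "annual_salary": "90000.00"},
    {"department": "POLICE", "salary_or_hourly": "Salary", "annual_salary": "84000.00"},
    {"department": "FIRE", "salary_or_hourly": "Salary", "annual_salary": "100000.00"},
    {"department": "STREETS & SAN", "salary_or_hourly": "Hourly", "hourly_rate": "38.50"},
]

count_by_department = defaultdict(int)    # missing keys start at 0
sum_by_department = defaultdict(float)    # missing keys start at 0.0

for record in records:
    # Skip hourly employees and any record without an annual salary.
    if record.get("salary_or_hourly") != "Salary" or "annual_salary" not in record:
        continue
    department = record.get("department", "undefined")
    count_by_department[department] += 1
    sum_by_department[department] += float(record["annual_salary"])

print(dict(count_by_department))
print(dict(sum_by_department))
```

With defaultdict there is no need for the if/else that initialises a key the first time a department is seen.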

Issue with set object with condition

I can't figure out why I only get "No" as output from the code below.
Shouldn't it print "Yes" for those two set values?
import re
import subprocess
from plumbum import local, cmd
s = subprocess.check_output(["opatch", "lsinventory"])
output = s.decode("utf-8")
patches = [27923320, 27547329, 21171382, 21463894, 18961555, 28432129]
patches_found = set(re.findall(r'\b(?:%s)\b' % '|'.join(map(str, patches)), output))
patches_missing = set(map(str, patches)) - patches_found
for item in patches_missing:
    if item in ["27923320", "27547329"]:
        print("Yes", item)
    else:
        print("No")
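The items 27923320 and 27547329 in the list patches are integers, but patches_missing is built from set(map(str, patches)), so it contains strings such as "27923320". Comparing against the strings "27923320" and "27547329" is therefore the right type; comparing against the integers 27923320 and 27547329 would never match. If only "No" is printed, the likely cause is that those two patches were found in the opatch output and so are not in patches_missing at all. Printing both sets shows which case applies:
print(sorted(patches_found))
print(sorted(patches_missing))
for item in sorted(patches_missing):
    if item in {"27923320", "27547329"}:
        print("Yes", item)
    else:
        print("No", item)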
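A quick way to check which type the set actually holds is to run the same set arithmetic on a fixed string. The output text below is a made-up stand-in for the real opatch lsinventory output, which I do not have.

```python
import re

# Hypothetical stand-in for the `opatch lsinventory` output in the question.
output = "Patch  27923320 : applied\nPatch  21171382 : applied\n"

patches = [27923320, 27547329, 21171382]
pattern = r"\b(?:%s)\b" % "|".join(map(str, patches))

patches_found = set(re.findall(pattern, output))            # strings from the text
patches_missing = set(map(str, patches)) - patches_found    # also strings

print(sorted(patches_found))
print(sorted(patches_missing))
print("27547329" in patches_missing)   # membership test with a string
print(27547329 in patches_missing)     # membership test with an integer
```

In this sample, 27923320 appears in the text, so it lands in patches_found and never reaches the loop over patches_missing, which is one way to see only "No" for it.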

How to print results from this function

I'm new to Python and programming in general and need a little help with this (partially finished) function. It's calling a text file with a bunch of rows of comma delimited data (age, salary, education and so on). However, I've run into a problem from the outset. I don't know how to return the results.
My aim is to create dictionaries for each category and for each row to be sorted and tallied.
e.g. 100 people over 50, 200 people under 50 and so on.
Am I in the correct ball park?
file = "adultdata.txt"
def make_data(file):
    try:
        f = open(file, "r")
    except IOError as e:
        print(e)
        return none
    large_list = []
    avg_age = 0
    row_count_under50 = 0
    row_count_over50 = 0
    #create 2 dictionaries per category
    employ_dict_under50 = {}
    employ_dict_over50 = {}
    for row in f:
        edited_row = row.strip()
        my_list = edited_row.split(",")
        try:
            #Age Category
            my_list[0] = int(my_list[0])
            #Work Category
            if my_list[-1] == " <=50K":
                if my_list[1] in employ_dict_under50:
                    employ_dict_under50[my_list[1]] += 1
                else:
                    employ_dict_under50[my_list[1]] = 1
                row_count_u50 += 1
            else:
                if my_list[1] in emp_dict_o50:
                    employ_dict_over50[my_list[1]] += 1
                else:
                    employ_dict_over50[my_list[1]] = 1
                row_count_o50 += 1
            # Other categories here
            print(my_list)
            #print(large_list)
            #return
            # Ignored categories here - e.g. my_list[insert my list numbers here] = None
I do not have access to your file, but I had a go at correcting most of the errors in your code.
This is a list of the problems I found:
In the code shown, make_data is never finished and never called, so nothing is returned or printed. The version below drops the function and runs the code at module level.
A file object f can be read line by line, either by iterating over it (for row in f) or by calling readline repeatedly.
It is also best to use a with statement when using IO resources like files, so that the file is closed automatically.
Several variables in the inner loop were misnamed and did not exist (row_count_u50, row_count_o50, emp_dict_o50), and return none should be return None.
You opened a try in the inner loop without an except. You can remove the try because you are not trying to catch any error.
Some of these are very basic errors. Can I assume you're new to this? If that's the case, it is probably worth following some more beginner tutorials online until you get a grasp of the commands you need for basic tasks.
Try comparing your code to this and see if you can understand what I'm trying to say:
file = "adultdata.txt"
large_list = []
avg_age = 0
row_count_under50 = 0
row_count_over50 = 0
# create 2 dictionaries per category
employ_dict_under50 = {}
employ_dict_over50 = {}
with open(file, "r") as f:
    # iterating over the file yields one row at a time, like repeated readline calls
    for row in f:
        edited_row = row.strip()
        my_list = edited_row.split(",")
        # Age Category
        my_list[0] = int(my_list[0])
        # Work Category
        if my_list[-1] == " <=50K":
            if my_list[1] in employ_dict_under50:
                employ_dict_under50[my_list[1]] += 1
            else:
                employ_dict_under50[my_list[1]] = 1
            row_count_under50 += 1
        else:
            if my_list[1] in employ_dict_over50:
                employ_dict_over50[my_list[1]] += 1
            else:
                employ_dict_over50[my_list[1]] = 1
            row_count_over50 += 1
        # Other categories here
        print(my_list)
# print(large_list)
# return
I cannot say for certain if this code will work or not without your file but it should give you a head start.
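Since the original question was how to return the results, here is a small sketch of one way to do it: keep the function, build the tallies inside it, and return them as a tuple. The function name tally_by_income, the column layout, and the sample rows are assumptions for illustration; the real adultdata.txt may differ.

```python
import io
from collections import Counter


def tally_by_income(lines):
    """Count work categories (column 1), split by the income label in the last column."""
    at_most_50k = Counter()
    over_50k = Counter()
    for line in lines:
        fields = [field.strip() for field in line.split(",")]
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        work_class, income = fields[1], fields[-1]
        if income == "<=50K":
            at_most_50k[work_class] += 1
        else:
            over_50k[work_class] += 1
    return at_most_50k, over_50k


# Hypothetical rows in an "age, work class, education, income" layout.
sample = io.StringIO(
    "39, State-gov, Bachelors, <=50K\n"
    "50, Private, Masters, >50K\n"
    "38, Private, HS-grad, <=50K\n"
)

low, high = tally_by_income(sample)
print(dict(low))
print(dict(high))
```

With a real file, the same function can be called as low, high = tally_by_income(f) inside a with open("adultdata.txt") as f: block, because a file object yields its lines when iterated.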

Resources