Make key lowercase in List of Dictionaries (Python3)

I have been looking through Stack Overflow and the documentation for 2 days now. I am a beginner and I just can't make progress. I am using Python 3.8.
I have a list of dictionaries:
books = [{'Type': 'Book', 'Date': '2011', 'Publication Year': '2011', 'Place Published': 'New York', 'Publisher': 'Simon & Schuster', 'Author': 'Walter Isaacson', 'ISBN': '978-1-4516-4853-9', 'Title': 'Test Steve Jobs'}, {'Type': 'Book', 'Date': '2001', 'Publication Year': '2001', 'Place Published': 'Oxford', 'Publisher': 'Oxford University press', 'Author': 'Peter Hall', 'ISBN': '978-0-19-924775-2', 'Title': 'Test Varieties of capitalism: the institutional foundations of comparative advantage'}]
print(books)
I want to make the key "Type" into a lowercase "type".
But the following list comprehension somehow turns the key into the value and vice versa.
lower_list = [ { v:k.lower() for k,v in d.items() } for d in books ]
print(lower_list)
I end up with [{'Book': 'type',.... when it should be [{'type': 'Book',....
I am still struggling to understand list comprehension syntax, so I would be grateful if somebody could 1. explain in plain English what my list comprehension does and 2. show how to change it to achieve what I am looking for. :)
Thank you!

So, your first problem:
lower_list = [ { k.lower():v for k,v in d.items() } for d in books ]
You were swapping keys and values: the expression before the colon becomes the key, and the one after it becomes the value.
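In plain English, the comprehension says: for each dictionary d in books, build a new dictionary, and for each (key, value) pair in d store one entry in it. Written out as ordinary loops, the two versions look roughly like this. This is a sketch on a shortened one-book list (an assumption made for brevity), not the full data from the question:

```python
books = [{'Type': 'Book', 'Date': '2011', 'ISBN': '978-1-4516-4853-9'}]

# Question's version: { v: k.lower() ... } uses the value as the new key.
swapped = []
for d in books:
    new_d = {}
    for k, v in d.items():
        new_d[v] = k.lower()
    swapped.append(new_d)

# Corrected version: { k.lower(): v ... } keeps the value and lowercases the key.
fixed = []
for d in books:
    new_d = {}
    for k, v in d.items():
        new_d[k.lower()] = v
    fixed.append(new_d)

print(swapped)  # [{'Book': 'type', '2011': 'date', '978-1-4516-4853-9': 'isbn'}]
print(fixed)    # [{'type': 'Book', 'date': '2011', 'isbn': '978-1-4516-4853-9'}]
```

With the swapped version, two keys that share a value (such as 'Date' and 'Publication Year' in the full data, both '2011') would also collapse into a single entry.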
As for your last question, how to skip lowercasing the ISBN key (compare strings with ==, not is):
[ { k if k == "ISBN" else k.lower():v for k,v in d.items()} for d in books ]
But you should consider using a for loop: if you need more operations or conditions, the comprehension becomes difficult to modify further.
my_final_books = []
for d in books:
    new_book = {}
    for k,v in d.items():
        if k == "ISBN":
            key = k
        else:
            key = k.lower()
        # or ternary form: key = k if k == "ISBN" else k.lower()
        new_book[key] = v
        # do more logic here
    # append one dictionary per book, not one per key
    my_final_books.append(new_book)
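If more keys than ISBN ever need to stay untouched, the loop can be wrapped in a small helper. The function name lowercase_keys and its keep parameter are hypothetical and only serve as an illustration; the sketch assumes every key is a string:

```python
def lowercase_keys(records, keep=("ISBN",)):
    """Return new dicts with lowercased keys; keys listed in `keep` are left as they are."""
    result = []
    for record in records:
        new_record = {}
        for key, value in record.items():
            # Membership uses ==, so this is safe for strings (unlike `is`).
            new_key = key if key in keep else key.lower()
            new_record[new_key] = value
        result.append(new_record)
    return result


books = [{'Type': 'Book', 'ISBN': '978-1-4516-4853-9', 'Title': 'Test Steve Jobs'}]
lowered = lowercase_keys(books)
print(lowered)
# [{'type': 'Book', 'ISBN': '978-1-4516-4853-9', 'title': 'Test Steve Jobs'}]
```

The original list is not modified, because each book gets a freshly built dictionary.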

Related

How to remove common item from list of dictionaries after grouping

I have a list of dictionaries like the one below. I want to group the dictionaries by grade and convert the list into a single dictionary whose keys are the grade values and whose values are lists of the matching dictionaries.
Input:
[
{'name':'abc','mark':'99','grade':'A'},
{'name':'xyz','mark':'90','grade':'A'},
{'name':'123','mark':'70','grade':'C'},
]
I want my output like below:
{
A: [ {'name': 'abc','mark':'99'}, {'name': 'xyz','mark':'90'} ],
C: [ {'name': '123','mark':'70'} ]
}
I tried sorted and groupby, but I was not able to remove grade from the dictionaries.
Use a loop with dict.setdefault:
l = [{'name':'abc','mark':'99','grade':'A'},
{'name':'xyz','mark':'90','grade':'A'},
{'name':'123','mark':'70','grade':'C'},
]
out = {}
for d in l:
    # avoid mutating the original dictionaries
    d = d.copy()
    # get grade, try to get the key in "out"
    # if the key doesn't exist, initialize with an empty list
    out.setdefault(d.pop('grade'), []).append(d)
print(out)
Output:
{'A': [{'name': 'abc', 'mark': '99'},
{'name': 'xyz', 'mark': '90'}],
'C': [{'name': '123', 'mark': '70'}],
}
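Since the question mentions sorted and groupby, here is a sketch of how that route could also drop the grade key: each row is rebuilt without 'grade' while the groups are collected. The variable names rows, by_grade and grouped are illustrative only:

```python
from itertools import groupby
from operator import itemgetter

rows = [
    {'name': 'abc', 'mark': '99', 'grade': 'A'},
    {'name': 'xyz', 'mark': '90', 'grade': 'A'},
    {'name': '123', 'mark': '70', 'grade': 'C'},
]

by_grade = itemgetter('grade')

# groupby only merges adjacent items, so the rows are sorted by grade first.
grouped = {
    grade: [{k: v for k, v in row.items() if k != 'grade'} for row in group]
    for grade, group in groupby(sorted(rows, key=by_grade), key=by_grade)
}
print(grouped)
# {'A': [{'name': 'abc', 'mark': '99'}, {'name': 'xyz', 'mark': '90'}], 'C': [{'name': '123', 'mark': '70'}]}
```

The setdefault loop avoids the sort, so it is probably the simpler choice when the input is not already ordered by grade.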

create new dictionary based on keys and split the dictionary values

I am relatively new to Python programming. I was trying some online challenges to improve my programming skills, and I got stuck with the code below. Could someone please help?
ress = {'product': ['Mountain Dew Spark', 'pepsi'], 'quantity': ['7', '5']}
prods_list = []
prods_dict = {}
for k , v in ress.items():
    if "product" in k:
        if len(ress['product']) > 1:
            entity_names = {}
            entity_list = []
            for i in range(len(ress['product'])):
                prod = "product_" + str(i)
                entity_names['product'] = ress['product'][i]
                entity_names['quantity'] = ress['quantity'][i]
                entity_list.append(entity_names)
                prods_dict[prod] = entity_list
            prods_list.append(prods_dict)
print(prods_list)
I am expecting the following output:
[{"product_0":
{"quantity" : "7",
"product" : "mountain dew spark"}
},
{"product_1" : {
"quantity" : "5",
"product" : "pepsi"
}}]
Actual output:
[{'product_0': [{'product': 'pepsi', 'quantity': '5'},
{'product': 'pepsi', 'quantity': '5'}],
'product_1': [{'product': 'pepsi', 'quantity': '5'},
{'product': 'pepsi', 'quantity': '5'}]}]
Please note that I want my code to work for single values as well, like ress = {'product': ['Mountain Dew Spark'], 'quantity': ['7']}
This is one way you can achieve it with regular loops:
ress = {'product': ['Mountain Dew Spark', 'pepsi'], 'quantity': ['7', '5']}
prods_list = []
for key, value in ress.items():
    for ind, el in enumerate(value):
        prod_num = 'product_' + str(ind)
        # If this element is already present
        if (len(prods_list) >= ind + 1):
            # Add to existing dict
            prods_list[ind][prod_num][key] = el
        else:
            # Otherwise - create a new dict
            prods_list.append({ prod_num : { key : el } })
print(prods_list)
The first loop goes through the input dictionary, the second one through each of its lists. The code then determines whether a dictionary for that product is already in the output list by checking the output list's length. If it is, the code simply adds the new key and value to that product's existing inner dict. If it is not, the code creates an outer dict for that product and an inner one holding this key and value.
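The repeated 'pepsi' entries in the question's actual output come from reusing a single entity_names dict: every append stores a reference to the same object, so the last assignment shows up in every position. A stripped-down sketch of that effect and of the usual fix (a new dict per iteration); the names entity, shared and fresh are illustrative only:

```python
products = ['Mountain Dew Spark', 'pepsi']

# One dict reused across iterations: the list ends up holding two references to it.
entity = {}
shared = []
for p in products:
    entity['product'] = p
    shared.append(entity)

# A new dict per iteration: each list element is an independent object.
fresh = []
for p in products:
    fresh.append({'product': p})

print(shared)  # [{'product': 'pepsi'}, {'product': 'pepsi'}]
print(fresh)   # [{'product': 'Mountain Dew Spark'}, {'product': 'pepsi'}]
```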
Maybe using a list comprehension along with enumerate and zip might be easier:
>>> res = {'product': ['Mountain Dew Spark', 'pepsi'], 'quantity': ['7', '5']}
>>> prods_list = [
... {f'product_{i}': {'quantity': int(q), 'product': p.lower()}}
... for i, (q, p) in enumerate(zip(res['quantity'], res['product']))
... ]
>>> prods_list
[{'product_0': {'quantity': 7, 'product': 'mountain dew spark'}}, {'product_1': {'quantity': 5, 'product': 'pepsi'}}]
This assumes that there are no duplicate product entries. If duplicates are possible, you would need to use a traditional for loop instead.
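The question also asks for the single-value case to work. A sketch of the zip approach wrapped in a function shows that it handles one-element lists without special casing. The name build_products is hypothetical, and unlike the snippet above this version leaves the quantity as a string and the product name in its original case (an assumption about what is wanted):

```python
def build_products(data):
    """Build one {'product_<i>': {...}} dict per position in the two parallel lists."""
    return [
        {f'product_{i}': {'quantity': q, 'product': p}}
        for i, (p, q) in enumerate(zip(data['product'], data['quantity']))
    ]


single = build_products({'product': ['Mountain Dew Spark'], 'quantity': ['7']})
double = build_products({'product': ['Mountain Dew Spark', 'pepsi'], 'quantity': ['7', '5']})
print(single)  # [{'product_0': {'quantity': '7', 'product': 'Mountain Dew Spark'}}]
print(double)
```

Note that zip stops at the shorter list, so a missing quantity silently drops the corresponding product.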

extract dictionary elements from nested list in python

I have a question.
I have a nested list that looks like this.
x= [[{'screen_name': 'BreitbartNews',
'name': 'Breitbart News',
'id': 457984599,
'id_str': '457984599',
'indices': [126, 140]}],
[],
[],
[{'screen_name': 'BreitbartNews',
'name': 'Breitbart News',
'id': 457984599,
'id_str': '457984599',
'indices': [98, 112]}],
[{'screen_name': 'BreitbartNews',
'name': 'Breitbart News',
'id': 457984599,
'id_str': '457984599',
'indices': [82, 96]}]]
There are some empty lists inside the main list.
What I am trying to do is extract each screen_name into a new list, keeping an entry for the empty sublists as well (maybe marking them as 'null').
y=[]
for i in x :
    for j in i :
        if len(j)==0 :
            n = 'null'
        else :
            n = j['screen_name']
    y.append(n)
I don't know why the code above outputs this list,
['BreitbartNews',
'BreitbartNews',
'BreitbartNews',
'BreitbartNews',
'BreitbartNews']
which doesn't reflect the empty sublists.
Can anyone help me refine my code to make it right?
You are checking the lengths of the wrong lists. The empty lists are the values of i, not j: when i is empty, the inner loop never runs, so n keeps its previous value.
The correct code would be
y=[]
for i in x :
    if len(i) == 0:
        n = 'null'
    else:
        n = i[0]['screen_name']
    y.append(n)
It may help to print(i) in each iteration to better understand what is actually happening.
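The same logic can be written as a list comprehension, relying on the fact that an empty list is falsy. This is a sketch on a shortened version of x (fewer keys per dictionary, an assumption made for brevity), and like the loop above it only looks at the first dictionary in each sublist:

```python
x = [
    [{'screen_name': 'BreitbartNews', 'id': 457984599}],
    [],
    [],
    [{'screen_name': 'BreitbartNews', 'id': 457984599}],
]

# `if sub` is true only for non-empty sublists.
y = [sub[0]['screen_name'] if sub else 'null' for sub in x]
print(y)  # ['BreitbartNews', 'null', 'null', 'BreitbartNews']
```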

How to substring the column name in python

I have a column named 'comment1abc'.
I am writing a piece of code where I want to check whether a column contains a certain string, 'abc':
df['col1'].str.contains('abc') == True
Now, instead of hard-coding 'abc', I want to use a substring-like operation on the column name 'comment1abc' (to be precise, the column name, not the column values) so that I can get the 'abc' part out of it. For example, the code below does a similar job:
x = 'comment1abc'
x[8:11]
But how do I implement that for a column name? I tried the code below, but it's not working.
for col in ['comment1abc']:
    df['col123'].str.contains('col.names[8:11]')
Any suggestion will be helpful.
Sample dataframe:
import pandas as pd
f = {'name': ['john', 'tom', None, 'rock', 'dick'], 'DoB': [None, '01/02/2012', '11/22/2014', '11/22/2014', '09/25/2016'], 'location': ['NY', 'NJ', 'PA', 'NY', None], 'code': ['abc1xtr', '778abc4', 'a2bcx98', None, 'ab786c3'], 'comment1abc': ['99', '99', '99', '99', '99'], 'comment2abc': ['99', '99', '99', '99', '99']}
df1 = pd.DataFrame(data = f)
and sample code:
for col in ['comment1abc', 'comment2abc']:
    df1[col][df1['code'].str.contains('col.names[8:11]') == True] = '1'
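I think the answer is as simple as this:
for col in ['comment1abc', 'comment2abc']:
    x = col[8:11]  # slice the column name itself, giving 'abc'
    # pass the variable x, not the quoted string 'x'; na=False treats missing codes as no match
    df1.loc[df1['code'].str.contains(x, na=False), col] = '1'
Writing 'col.names[8:11]' in quotes inside .str.contains() makes pandas search for that literal text. Slice the column name first and pass the resulting string instead.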
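The difference between the quoted text and the sliced column name can be seen without pandas. This is a plain-Python sketch using the code values from the sample data; the `in` test only stands in for .str.contains() and is not the pandas call itself:

```python
codes = ['abc1xtr', '778abc4', 'a2bcx98', None, 'ab786c3']
col = 'comment1abc'

literal = 'col.names[8:11]'  # quoted: just a piece of text, never evaluated
suffix = col[8:11]           # slice of the column name

# None entries are counted as "no match".
hits_literal = [isinstance(c, str) and literal in c for c in codes]
hits_suffix = [isinstance(c, str) and suffix in c for c in codes]

print(suffix)        # abc
print(hits_literal)  # [False, False, False, False, False]
print(hits_suffix)   # [True, True, False, False, False]
```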

Comparing like words between two dictionaries

I am using Python 3.x.
I have 2 dictionaries (both very large, so I will substitute small examples here). The values of the dictionaries contain more than one word:
dict_a = {'key1': 'Large left panel', 'key2': 'Orange bear rug', 'key3': 'Luxo jr. lamp'}
dict_a
{'key1': 'Large left panel',
'key2': 'Orange bear rug',
'key3': 'Luxo jr. lamp'}
dict_b = {'keyX': 'titanium panel', 'keyY': 'orange Ball and chain', 'keyZ': 'large bear musket'}
dict_b
{'keyX': 'titanium panel',
'keyY': 'orange Ball and chain',
'keyZ': 'large bear musket'}
I am looking for a way to compare the individual words contained in the values of dict_a to the words contained in the values of dict_b and return a dictionary or data-frame that contains the word, and the keys from dict_a and dict_b it is associated with:
My desired output (not formatted any certain way):
bear: key2 (from dict_a), keyZ(from dict_b)
Luxo: key3
orange: key2 (from dict_a), keyY (from dict_b)
I've got code that works for looking up a specific word in a single dictionary but it's not sufficient for what I need to accomplish here:
def search(myDict, lookup):
    aDict = {}
    for key, value in myDict.items():
        for v in value:
            if lookup in v:
                aDict[key] = value
    return aDict
print (key, value)
dicts = {'a': {'key1': 'Large left panel', 'key2': 'Orange bear rug',
'key3': 'Luxo jr. lamp'},
'b': {'keyX': 'titanium panel', 'keyY': 'orange Ball and chain',
'keyZ': 'large bear musket'} }
from collections import defaultdict
index = defaultdict(list)
for dname, d in dicts.items():
    for key, words in d.items():
        for word in words.lower().split(): # lower() to make Orange/orange match
            index[word].append((dname, key))
index now contains:
{'and' : [('b', 'keyY')],
'ball' : [('b', 'keyY')],
'bear' : [('a', 'key2'), ('b', 'keyZ')],
'chain' : [('b', 'keyY')],
'jr.' : [('a', 'key3')],
'lamp' : [('a', 'key3')],
'large' : [('a', 'key1'), ('b', 'keyZ')],
'left' : [('a', 'key1')],
'luxo' : [('a', 'key3')],
'musket' : [('b', 'keyZ')],
'orange' : [('a', 'key2'), ('b', 'keyY')],
'panel' : [('a', 'key1'), ('b', 'keyX')],
'rug' : [('a', 'key2')],
'titanium': [('b', 'keyX')] }
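If only the words that occur in both dictionaries are of interest, the index can be filtered afterwards. A self-contained sketch, rebuilding the same index and keeping the words whose hits come from more than one source dictionary (the name shared is illustrative only):

```python
from collections import defaultdict

dicts = {
    'a': {'key1': 'Large left panel', 'key2': 'Orange bear rug', 'key3': 'Luxo jr. lamp'},
    'b': {'keyX': 'titanium panel', 'keyY': 'orange Ball and chain', 'keyZ': 'large bear musket'},
}

index = defaultdict(list)
for dname, d in dicts.items():
    for key, words in d.items():
        for word in words.lower().split():
            index[word].append((dname, key))

# Keep only words that occur in more than one source dictionary.
shared = {
    word: hits
    for word, hits in index.items()
    if len({dname for dname, _ in hits}) > 1
}
print(sorted(shared))  # ['bear', 'large', 'orange', 'panel']
```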
Update in response to comments
Since your actual dictionary maps strings to lists (and not strings to strings), change your loops to
for dname, d in dicts.items():
    for key, wordlist in d.items(): # changed "words" to "wordlist"
        for words in wordlist: # added extra loop to iterate over wordlist
            for word in words.split(): # removed .lower() since text is always uppercase
                index[word].append((dname, key))
Since your lists have only one item you could just do
for dname, d in dicts.items():
    for key, wordlist in d.items():
        for word in wordlist[0].split(): # assumes single item list
            index[word].append((dname, key))
If there are words that you don't want in your index, define a set of words to skip:
words_to_skip = {'-', ';', '/', 'AND', 'TO', 'UP', 'WITH', ''}
Then filter them out with
if word in words_to_skip:
    continue
I noticed that you have some words surrounded by parentheses (such as (342) and (221)). If you want to get rid of the parentheses, do
if word[0] == '(' and word[-1] == ')':
    word = word[1:-1]
Putting this all together we get
words_to_skip = {'-', ';', '/', 'AND', 'TO', 'UP', 'WITH', ''}
for dname, d in dicts.items():
    for key, wordlist in d.items():
        for word in wordlist[0].split(): # assumes single item list
            if word[0] == '(' and word[-1] == ')':
                word = word[1:-1] # remove outer parentheses
            if word in words_to_skip: # skip unwanted words
                continue
            index[word].append((dname, key))
I think you can do what you want pretty easily. This code produces output in the format {word: {key: name_of_dict_the_key_is_in}}:
def search(**dicts):
    result = {}
    for name, dct in dicts.items():
        for key, value in dct.items():
            for word in value.split():
                result.setdefault(word, {})[key] = name
    return result
You call it with the input dictionaries as keyword arguments. The keyword you use for each dictionary will be the string used to describe it in the output dictionary, so use something like search(dict_a=dict_a, dict_b=dict_b).
If your dictionaries might have some of the same keys, this code might not work right, since the keys could collide if they have the same words in their values. You could make the outer dict contain a list of (key, name) tuples, instead of an inner dictionary, I suppose. Just change the assignment line to result.setdefault(word, []).append((key, name)). That would be less handy to search in though.
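A sketch of that list-of-tuples variant, run on two tiny made-up dictionaries that deliberately share the key 'key1' to show that neither entry is overwritten (the sample data is an assumption for illustration, not the asker's data):

```python
def search(**dicts):
    """Map each word to a list of (key, dict_name) pairs."""
    result = {}
    for name, dct in dicts.items():
        for key, value in dct.items():
            for word in value.split():
                result.setdefault(word, []).append((key, name))
    return result


# Both dictionaries use 'key1', which would collide in the inner-dict version.
dict_a = {'key1': 'bear rug'}
dict_b = {'key1': 'bear musket'}
found = search(dict_a=dict_a, dict_b=dict_b)
print(found['bear'])  # [('key1', 'dict_a'), ('key1', 'dict_b')]
```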
