get dict value by part of the key python - python-3.x

I want to get a value from a dictionary by part of its key. For example, I have a dict with compound keys:
tr_dict = {'UTABI-OSGAN': {"properties": {"id": "789"}},
           'ABOKA-OSGAN': {"properties": {"id": "111"}},
           'FE-DERIG': {"properties": {"id": "243"}}}
and I want to get the values whose key starts with 'UTABI' (or, in the other case, whose key ends with e.g. 'DERIG').
I suppose it looks something like
start = 'UTABI'
tr_dict.get(start + '[-A-Z]{2,5}')
I know this syntax is incorrect but is it possible to do something like this?

Short answer: no. Dicts are not SQL databases, you have to give the exact key.
The brute-force solution is to loop over the dict's keys and use string methods to find the relevant ones, i.e.:
for key in tr_dict:
    if key.startswith("UTABI-"):
        print("found {} : {}".format(key, tr_dict[key]))
which of course is O(n) and kind of defeats the whole point of having dicts. This is OK if you only need to do this lookup once for a given tr_dict, but sub-optimal if tr_dict has a long lifetime and will be looked up more than once for a given "partial" key.
Another solution, which requires more upfront processing but allows for O(1) access afterward, is to preprocess the whole dict once to build a new one with keys you can look up directly:
from collections import defaultdict
lookups = [
    # (new key, matching callback)
    ("UTABI", lambda k: k.startswith("UTABI-")),
    ("DERIG", lambda k: k.endswith("-DERIG")),
]
index = defaultdict(list)
for newkey, match in lookups:
    for oldkey in tr_dict:
        if match(oldkey):
            index[newkey].append(tr_dict[oldkey])
After this, index["UTABI"] returns the list of matching values directly. This is overkill for a one-shot lookup, but much better if you have to look up those keys more than once for a given tr_dict.
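Since the question asks for something regex-like, here is a minimal sketch of the brute-force approach using the re module instead of string methods. The helper name find_by_pattern is made up for illustration, and the patterns are only a guess at the key format based on the three sample keys.

```python
import re

# Sample data copied from the question.
tr_dict = {
    'UTABI-OSGAN': {"properties": {"id": "789"}},
    'ABOKA-OSGAN': {"properties": {"id": "111"}},
    'FE-DERIG': {"properties": {"id": "243"}},
}

def find_by_pattern(d, pattern):
    """Return the items of d whose whole key matches the regex (hypothetical helper)."""
    compiled = re.compile(pattern)
    # fullmatch requires the entire key to match, not just a prefix.
    return {key: value for key, value in d.items() if compiled.fullmatch(key)}

starts_with_utabi = find_by_pattern(tr_dict, r'UTABI-[A-Z]{2,5}')
ends_with_derig = find_by_pattern(tr_dict, r'[A-Z]{2,5}-DERIG')
print(starts_with_utabi)
print(ends_with_derig)
```

This is still O(n) per lookup, so the same trade-off applies: fine for occasional queries, while a prebuilt index is the better fit for repeated ones.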

The syntax you propose is interpreted as "give me the value stored under the literal key 'UTABI[-A-Z]{2,5}'".
Since you want to filter by a condition, you can use a dict comprehension:
filtered_dict = {key: value for key, value in tr_dict.items() if key.startswith('UTABI')}

Here's one way of doing it:
return_values = {k: v for k, v in tr_dict.items() if k.startswith('UTABI') or k.endswith('DERIG')}
print(return_values)
Outputs:
{'UTABI-OSGAN': {'properties': {'id': '789'}}, 'FE-DERIG': {'properties': {'id': '243'}}}
And here's the expanded form as an explicit loop. Note that this version collects only the matching values in a list, rather than building a dict:
return_values = []
for k, v in tr_dict.items():
    if k.startswith('UTABI') or k.endswith('DERIG'):  # Change your requirements here
        return_values.append(v)
print(return_values)

Related

Extracting string from lists of dictionaries (or generator)

I am scraping data with scrapetube to get the video IDs of all the videos from a YouTube channel. The scrape code returns a generator object, which I have converted to a list of dictionaries containing other dictionaries, lists and strings. The scraping code works, but here is some sample data anyway. I am only interested in the videoId string.
How can I iterate through all the video IDs under the videoId key and save them in a new variable (a list or DataFrame) for further processing?
import scrapetube
vid = scrapetube.get_channel('UC_zxivooFdvF4uuBosUnJxQ')
type(vid)  # generator object
video = next(vid)  # extract values from generator & then convert it
videoL = list(vid)  # convert it to a list
# code not working
for item in videoL['videoId']:
    entry = {}
    videoId = item['videoId']
    for i in range(len(videoId)):
        entry.append(int(videoId[i][0:10]))
# error message: TypeError: list indices must be integers or slices, not str
I used code snippet from this post but can't seem to make it work.
It's helpful when you know the terminology so let's go through it step by step.
What is a generator?
A generator, like its name implies, generates values on demand.
Their usefulness in this case is that if you don't want to have all the data in memory, you only iterate over one generated value at a time and only extract what you need.
Consider this:
def gen_one_million():
    for i in range(0, 1_000_000):
        yield i
for i in gen_one_million():
    pass  # do something with i
Rather than having a million elements in a list or some container in memory, you only get one at a time. If you want them all in a list it's very easy to do with list(gen_one_million()) but you're not tied to having them all in memory if you don't need them.
What is a list and how do I use them?
A list in Python is a container represented by brackets []. To access elements in a list you can index into it (i = my_list[0]) or iterate over it.
for i in my_list:
    pass  # do something with i
What is a dict and how do I use them?
A dict is a Python key/value container type represented by curly braces, with a colon between the key and value: {key: value}
To access values in a dict you can reference the key whose value you want (i = my_dict[key]), where key is a string or integer or some other hashable type. You can also iterate over it.
for key in my_dict:
    pass  # do something with the key
for value in my_dict.values():
    pass  # do something with the value
for key, value in my_dict.items():
    pass  # do something with the key and value
How does my case fit into all this?
Looking at your sample data it looks like you already have it converted from a generator to a list.
[
    {
        'videoId': '8vCvSmAIv1s',
        'thumbnail': {
            'thumbnails': [
                {
                    'url': 'https://i.ytimg.com/vi/8vCvSmAIv1s/hqdefault.jpg?sqp=-oaymwEbCKgBEF5IVfKriqkDDggBFQAAiEIYAXABwAEG&rs=AOn4CLDn3-yb8BvctGrMxqabxa_nH-UYzQ',
                    'width': 168,
                    'height': 94},  # etc..
            ]
        }
    }
]
However, since you just need to iterate over it and access the 'videoId' key in each generated dict, there's no reason to convert.
Just iterate directly over the generator and access the key of each generated dict.
video_ids = []
for item in vid:
    video_ids.append(item['videoId'])
Or even better, as a list comprehension.
video_ids = [item['videoId'] for item in vid]
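One caveat worth illustrating: a generator can only be consumed once, so the next(vid) and list(vid) calls in the question's code each use up items before any later loop sees them. The sketch below uses a made-up stand-in generator, fake_channel, in place of scrapetube.get_channel so that it runs without network access; the second and third IDs are invented.

```python
def fake_channel():
    """Hypothetical stand-in for scrapetube.get_channel(); yields dicts shaped like the sample data."""
    for video_id in ['8vCvSmAIv1s', 'aaaaaaaaaaa', 'bbbbbbbbbbb']:
        yield {'videoId': video_id}

vid = fake_channel()
first = next(vid)                           # consumes the first item
rest = [item['videoId'] for item in vid]    # only the remaining items are left
again = [item['videoId'] for item in vid]   # the generator is now exhausted
print(first['videoId'], rest, again)
```

If every ID is needed, it is probably simplest to create the generator and run the list comprehension over it straight away, without calling next() or list() on it first.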

Iterating through a not-so-ordinary dictionary in Python 3.x

Maybe this is an ordinary issue with iterating through a dict. Below is the file imovel.txt, whose content is as follows:
{'Andar': ['primeiro', 'segundo', 'terceiro'], 'Apto': ['101','201','301']}
As you can see, this is not an ordinary dictionary of simple key/value pairs: one key holds a list of what should be the keys, and the other key holds a list of the corresponding values.
My code is:
#!/usr/bin/python
def load_dict_from_file():
    f = open('../txt/imovel.txt','r')
    data=f.read()
    f.close()
    return eval(data)
thisdict = load_dict_from_file()
for key,value in thisdict.items():
    print(value)
and yields:
['primeiro', 'segundo', 'terceiro']
['101', '201', '301']
I would like to print key/value pairs like
{'primeiro':'101', 'segundo':'201', 'terceiro':'301'}
Given the txt file above, is it possible?
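You should parse the file with a real parser rather than eval: the builtin json module if the file is valid JSON, or ast.literal_eval for content like this sample, which uses single quotes that JSON does not accept. Either way, you'll still have the same structure.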
There are a few things you can do.
If you know both of the base key names ('Andar' and 'Apto'), you can do it as a one-line dict comprehension by zipping the values together.
# what you'll get from the file
thisdict = {'Andar': ['primeiro', 'segundo', 'terceiro'], 'Apto': ['101','201','301']}
# One line dict comprehension
newdict = {key: value for key, value in zip(thisdict['Andar'], thisdict['Apto'])}
print(newdict)
If you don't know the names of the keys, you could call next on an iterator assuming they're the first 2 lists in your structure.
# what you'll get from the file
thisdict = {'Andar': ['primeiro', 'segundo', 'terceiro'], 'Apto': ['101','201','301']}
# create an iterator of the values since the keys are meaningless here
iterator = iter(thisdict.values())
# the first group of values are the keys
keys = next(iterator, None)
# and the second are the values
values = next(iterator, None)
# zip them together and have dict do the work for you
newdict = dict(zip(keys, values))
print(newdict)
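On the parsing point, here is a minimal sketch of reading the data without eval, assuming the file content is exactly the Python literal shown in the question. The text is inlined as a string here so the example runs without the file.

```python
import ast

# Text as it would be read from imovel.txt (content copied from the question).
text = "{'Andar': ['primeiro', 'segundo', 'terceiro'], 'Apto': ['101','201','301']}"

# literal_eval only accepts Python literals, so it cannot execute arbitrary code.
thisdict = ast.literal_eval(text)

# Pair the two lists up, as in the answer above.
newdict = dict(zip(thisdict['Andar'], thisdict['Apto']))
print(newdict)
```

Unlike eval, ast.literal_eval raises an error on anything that is not a plain literal, which is usually what you want for data files.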
As other folks have noted, that looks like JSON, and it'd probably be easier to parse it and read through it as such. But if that's not an option for some reason, you can loop through your dictionary this way, as long as the lists at each key are all the same length:
for i, _ in enumerate(thisdict[list(thisdict)[0]]):
    ith_values = [elem[i] for elem in thisdict.values()]
    print(ith_values)
If they're different lengths, then you'll need to add some logic to check for that and print a blank, or do some error handling for looking past the end of the list.
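For the unequal-length case, one possible approach is itertools.zip_longest, which pads the shorter lists with a fill value. This is a sketch with hypothetical data (the 'Apto' list has been shortened for illustration), not the only way to handle it.

```python
from itertools import zip_longest

# Hypothetical data: 'Apto' is one entry shorter than 'Andar'.
uneven = {'Andar': ['primeiro', 'segundo', 'terceiro'], 'Apto': ['101', '201']}

# zip_longest pads the shorter lists with fillvalue instead of stopping early.
rows = [list(row) for row in zip_longest(*uneven.values(), fillvalue='')]
for row in rows:
    print(row)
```

Here the last row comes out as ['terceiro', ''], i.e. a blank where the shorter list ran out.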

How to convert all dict keys from str to float

I have this dictionary:
mydict = { "123.23":10.50, "45.22":53, "12":123 }
and I would like to get this dictionary (with keys as floats):
mydict = { 123.23:10.50, 45.22:53, 12:123 }
I know that I can iterate over the keys and create a new dict like this:
new_dict = {}
for k in mydict.keys():
    new_dict[float(k)]=mydict[k]
but I expect that it may be possible to convert the dict keys "inline" (without having to create a new dict).
What is the most efficient method to do it?
I suggest using a dictionary comprehension, which is easy to understand, as follows:
my_dict = { "123.23":10.50, "45.22":53, "12":123 }
my_dict = {float(i):j for i,j in my_dict.items()}
print(my_dict) # {123.23: 10.5, 45.22: 53, 12.0: 123}
Use a comprehension:
new_dict = { float(k): v for k, v in mydict.items() }
I expect that it may be possible to convert dict key "inline" ( without to have to recreate a new dict ) ...
What is the most efficient method to do it ?
Unless it materially matters to your runtime and you have time to spend profiling and trying out various configurations, I'd strongly recommend just creating a second dict using a dict comprehension and focusing on more relevant concerns. Because dict views are "live", updating the dict as you iterate over its keys directly may have odd side effects: you might find yourself iterating twice, first over the original keys and then over the keys you added, or the iteration might break entirely as deletions lead to internal storage compaction and the iterator gets invalidated.
So to change the key types without creating a new dict, you need to first copy the keys to a list, then iterate over that and move values from one key to another:
for k in list(mydict.keys()):
    mydict[float(k)] = mydict.pop(k)
However, because of the deletions, this may or may not be more efficient than creating a new dict with the proper layout, so the "optimisation" would be anything but.
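One thing neither approach guards against is distinct string keys that become the same float (for example "12" and "12.0"), in which case one value silently overwrites the other. The following is a minimal sketch of a conversion that refuses to do that; the helper name keys_to_float is made up for illustration.

```python
def keys_to_float(d):
    """Return a new dict with float keys, refusing to silently merge colliding keys (hypothetical helper)."""
    converted = {}
    for key, value in d.items():
        new_key = float(key)
        if new_key in converted:
            # Two different strings mapped to the same float, e.g. "12" and "12.0".
            raise ValueError(f"keys collide after conversion: {key!r}")
        converted[new_key] = value
    return converted

mydict = {"123.23": 10.50, "45.22": 53, "12": 123}
print(keys_to_float(mydict))

try:
    keys_to_float({"12": 1, "12.0": 2})
except ValueError as exc:
    print(exc)
```

Whether a collision should raise, keep the first value, or keep the last is a design choice that depends on the data; raising is just the most conservative option.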

Merging two lists of nested dictionaries by the similar values in Python

I have two lists of nested dictionaries:
lofd1 = [{'A': {'facebook':{'handle':'https://www.facebook.com/pages/New-Jersey/108325505857259','logo_id': None}, 'contact':{'emails':['nj#nj.gov','state#nj.gov']},'state': 'nj', 'population':'12345', 'capital':'Jersey','description':'garden state'}}]
lofd2 = [{'B':{'building_type':'ranch', 'city':'elizabeth', 'state':'nj', 'description':'the state close to NY'}}]
I need to:
Merge similar dictionaries in the lists, using the value of the 'state' key (for example, merge all dictionaries where "state" = "nj" into a single dictionary).
It should include key/value combinations that are present in both dictionaries only once (for example, "state" for both should be "nj").
It should include key/value combinations that are present in only one of the dictionaries (for example, "population" and "capital" from lofd1, and "building_type" and "city" from lofd2).
Some of the values in the dictionaries should be excluded, for example 'logo_id': None.
Put the values of "description" from both dictionaries into a list of strings, for example "description": ['garden state', 'the state close to NY'].
The final dataset should look like this:
lofd_final = [{'state': 'nj', 'facebook':{'handle':'https://www.facebook.com/pages/New-Jersey/108325505857259'},'population':'12345', 'capital':'Jersey', 'contact':{'emails':['nj#nj.gov','state#nj.gov']}, 'description': ['garden state','the state close to NY'],'building_type':'ranch', 'city':'elizabeth'}]
What would be an efficient solution?
This is a solution very specific to your case. Its time complexity is O(n*m), n being the number of dictionaries in a list and m being the number of keys in a dictionary, since you only ever look at each key in each dictionary once.
def extract_data(lofd, output):
    for d in lofd:
        for top_level_key in d:  # This will be the A or B key from your example
            data = d[top_level_key]
            state = data['state']
            if state not in output:  # Create the state entry for the first time
                output[state] = {}
            # Now update the state entry with the data you care about
            for key in data:
                # Handle descriptions
                if key == 'description':
                    if 'description' not in output[state]:
                        output[state]['description'] = [data['description']]
                    else:
                        output[state]['description'].append(data['description'])
                # Handle all other keys
                else:
                    # Handle facebook key (exclude logo_id)
                    if key == 'facebook':
                        # pop with a default avoids a KeyError if logo_id is absent
                        data['facebook'].pop('logo_id', None)
                    output[state][key] = data[key]
output = {}
extract_data(lofd1, output)
extract_data(lofd2, output)
print(list(output.values()))
The output will be a dict of dicts, with the top level keys as the states. To convert it to how you specified just extract the values into a flat list: list(output.values()) (see above example).
Note: I am assuming a deep copy is not needed. So after you extract the data, I'm assuming you don't go and manipulate the values in lofd1 and lofd2. Also this is purely based on the specs that were given, e.g. if there are more nested keys that need to be excluded, you will need to add extra filters yourself.
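If the inputs do need to stay untouched, a variant along the following lines might work. It is a sketch under the same specs, with a made-up function name (merge_by_state) and shortened sample data (the 'h' handle is a placeholder); it deep-copies each entry before stripping logo_id.

```python
import copy

def merge_by_state(*lists_of_dicts):
    """Merge nested dicts sharing a 'state' value without mutating the inputs (hypothetical variant)."""
    merged = {}
    for lofd in lists_of_dicts:
        for d in lofd:
            for data in d.values():
                data = copy.deepcopy(data)  # work on a private copy
                entry = merged.setdefault(data['state'], {})
                # Collect descriptions into a list instead of overwriting them.
                description = data.pop('description', None)
                if description is not None:
                    entry.setdefault('description', []).append(description)
                # Exclude logo_id from the facebook sub-dict, if present.
                if 'facebook' in data:
                    data['facebook'].pop('logo_id', None)
                entry.update(data)
    return list(merged.values())

# Shortened sample data in the shape of lofd1 and lofd2.
lofd1 = [{'A': {'facebook': {'handle': 'h', 'logo_id': None}, 'state': 'nj', 'description': 'garden state'}}]
lofd2 = [{'B': {'city': 'elizabeth', 'state': 'nj', 'description': 'the state close to NY'}}]
result = merge_by_state(lofd1, lofd2)
print(result)
```

The deep copy costs extra time and memory on large inputs, so the original in-place version remains reasonable when lofd1 and lofd2 are discarded afterwards.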

How do you get a value that was randomly chosen from a dictionary

This is currently my code.
if Pokémon == 'Charmander':
    selectyourmove = input('Select your move: Flamethrower, Fire Fang, Scratch or Ember: ')  # select your move
    if selectyourmove == 'Flamethrower':
        numberchoosing1 = random.randint(20, 22)  # randomly decides damage of the chosen move in the range
        print(choice, 'has lost', numberchoosing1, 'health out of its', HP, 'health!')
My dictionary is quite simple. It is:
HP = {'Char':'60', 'Squir':'50', 'Pika':'80', 'Eve':'50', 'Bulb':'70', 'Clef':'100'}
Also all these have been defined.
How do I get a value that was randomly chosen from a dictionary?
The 1st way is to use dict.popitem:
Remove and return an arbitrary (key, value) pair from the dictionary.
popitem() is useful to destructively iterate over a dictionary, as often used in set algorithms. If the dictionary is empty, calling popitem() raises a KeyError.
Note that this method's "randomness" is really an implementation detail of how the dict lays out its elements, which is more obscurity than randomness. In fact, since Python 3.7 popitem() always removes the most recently inserted pair (LIFO order), so it is not random at all.
The 2nd way, the truly "random" one, is to use random.choice. It doesn't modify the dict, since it picks a random index into the list supplied to it:
import random
hp = {'Char':'60', 'Squir':'50', 'Pika':'80', 'Eve':'50', 'Bulb':'70', 'Clef':'100'}
print(random.choice(list(hp.keys())))
Illustration of working principle:
>>> random.choice(list(HP.keys()))
'Pika'
>>> random.choice(list(HP.keys()))
'Clef'
>>> random.choice(list(HP.keys()))
'Pika'
The list is constructed here from .keys(), but when you need pairs (like from popitem()) you could use .items():
>>> random.choice(list(HP.items()))
('Clef', '100')
>>> random.choice(list(HP.items()))
('Pika', '80')
>>> random.choice(list(HP.items()))
('Char', '60')
In the same way, .values() will of course work, but it produces only the right-hand elements of the dict items, so it is less useful here than .keys() or .items().
PS: If you then need to reproduce a previous run, you can fix the "randomness" with random.seed.
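To illustrate the random.seed remark, here is a small sketch showing that reseeding with the same value replays the same sequence of choices. The seed value 42 is arbitrary, and the HP dict is copied from the question.

```python
import random

HP = {'Char': '60', 'Squir': '50', 'Pika': '80', 'Eve': '50', 'Bulb': '70', 'Clef': '100'}

random.seed(42)  # 42 is an arbitrary seed
first_run = [random.choice(list(HP.items())) for _ in range(3)]

random.seed(42)  # same seed, so the same sequence of choices
second_run = [random.choice(list(HP.items())) for _ in range(3)]

print(first_run == second_run)

# The HP values are strings, so convert before doing arithmetic with them.
name, hp = first_run[0]
print(name, int(hp))
```

Seeding is handy for tests and debugging; for actual gameplay you would normally leave the generator unseeded.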
That depends on what you mean by "value". A dictionary is a set of key,value pairs, so technically the values of your dictionary are just the strings '50', '60', '50', '100', '70', '80', and the keys are the strings 'Eve', 'Char', 'Squir', 'Clef', 'Bulb', 'Pika'.
You can get these collections by using HP.keys() and HP.values(), and you can use list() to convert them to lists. Then, you can use random.choice to get a random element.
So to get a random key from your dictionary (which it seems like is what you actually want), you could do:
import random
keys = HP.keys()
key_list = list(keys)
choice = random.choice(key_list)
Or, more concisely:
import random
choice = random.choice(list(HP.keys()))
Then you can get the associated value for that key with HP[choice]
