How may I dynamically create global variables within a function based on input in Python - python-3.x

I'm trying to create a function that returns a dynamically-named list of columns. Usually I can manually name the list, but I now have 100+ csv files to work with.
My goal:
Function creates a list, and names it based on dataframe name
Created list is callable outside of the function
I've done my research, and this answer from an earlier post came very close to helping me.
Here is what I've adapted
def test1(dataframe):
    # Using globals() to get dataframe name
    df_name = [x for x in globals() if globals()[x] is dataframe][0]
    # Creating local dictionary to use exec function
    local_dict = {}
    # Trying to generate a name for the list, based on input dataframe name
    name = 'col_list_' + df_name
    exec(name + "=[]", globals(), local_dict)
    # So I can call this list outside the function
    name = local_dict[name]
    for feature in dataframe.columns:
        # Append feature/column if >90% of values are missing
        if dataframe[feature].isnull().mean() >= 0.9:
            name.append(feature)
    return name
To ensure the list name changes based on the DataFrame supplied to the function, I named the list using:
name = 'col_list_' + df_name
The problem comes when I try to make this list accessible outside the function:
name = local_dict[name].
I cannot find a way to assign a dynamic list name to the local dictionary, so I am forced to always call name outside the function to return the list. I want the list to be named based on the dataframe input (e.g. col_list_df1, col_list_df2, col_list_df99).
This answer was very helpful, but it seems specific to variables.
global 'col_list_' + df_name returns a syntax error.
Any help would be greatly appreciated!

Related

target_transform in torchvision.datasets.ImageFolder seems not to work

I am using PyTorch 1.13 with Python 3.10.
I have a problem where I import pictures from a folder structure using
data = ImageFolder(root='./faces/', loader=img_loader, transform=transform,
is_valid_file=is_valid_file)
In this command, labels are assigned automatically according to the subdirectory an image belongs to.
I wanted to assign different labels and use target_transform for this purpose (e.g. I wanted to use a word from the file name to assign an appropriate label).
I have used
def target_transform(id):
    print(2)
    return id * 2
data = ImageFolder(root='./faces/', loader=img_loader, transform=transform, target_transform=target_transform, is_valid_file=is_valid_file)
Next,
data = ImageFolder(root='./faces/', loader=img_loader, transform=transform, target_transform=lambda id:2*id, is_valid_file=is_valid_file)
or
data = ImageFolder(root='./faces/', loader=img_loader, transform=transform, target_transform=
torchvision.transforms.Lambda(lambda id:2*id), is_valid_file=is_valid_file)
But none of these affect the labels. In addition, in the first example I included the print statement to see whether the function is called, but it is not. I have searched for uses of this function, but the examples I have found do not work and the documentation is scarce in this respect. Any idea what is wrong with the code?

How to create Multi Dimensional Dictionary

How do I make a multidimensional dictionary with multiple keys and values, and how do I print its keys and values?
I want it in this format:
main_dictionary = {Mainkey: {keyA: value,
                             keyB: value,
                             keyC: value
                             }}
I tried to do it, but it gives me an error on the manufacturer. Here is my code:
car_dict[manufacturer] [type]= [( sedan, hatchback, sports)]
Here is my error:
File "E:/Programming Study/testupdate.py", line 19, in campany
car_dict[manufacturer] [type]= [( sedan, hatchback, sports)]
KeyError: 'Nissan'
And my printing code is:
for manufacuted_by, type,sedan,hatchback, sports in cabuyao_dict[bgy]:
    print("Manufacturer Name:", manufacuted_by)
    print('-' * 120)
    print("Car type:", type)
    print("Sedan:", sedan)
    print("Hatchback:", hatchback)
    print("Sports:", sports)
Thank you! I'm new in Python.
I think you have a slight misunderstanding of how a dict works, and how to "call back" the values inside of it.
Let's make two examples for how to create your data-structure:
car_dict = {}
car_dict["Nissan"] = {"types": ["sedan", "hatchback", "sports"]}
print(car_dict) # Output: {'Nissan': {'types': ['sedan', 'hatchback', 'sports']}}
from collections import defaultdict
car_dict2 = defaultdict(dict)
car_dict2["Nissan"]["types"] = ["sedan", "hatchback", "sports"]
print(car_dict2) # Output: defaultdict(<class 'dict'>, {'Nissan': {'types': ['sedan', 'hatchback', 'sports']}})
In both examples above, I first create a dictionary, and then on the next line I add the values I want it to contain. In the first example, I give car_dict the key "Nissan" and set its value to a new dictionary containing some values.
In the second example I use defaultdict(dict), which basically has the logic of "if I am not given a value for a key, then use the factory (dict) to create a value for it".
Can you see the difference in how the values are initialized in the two methods?
When you called car_dict[manufacturer][type] in your code, you hadn't yet initialized car_dict["Nissan"] = value, so when you tried to retrieve it, car_dict raised a KeyError.
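To make the failure concrete, here is a minimal sketch that reproduces the KeyError and then avoids it with dict.setdefault, a built-in alternative to the two approaches above. The "colors" entry is a made-up field added only for illustration.

```python
car_dict = {}

# Reproduce the error: the outer key "Nissan" does not exist yet,
# so looking it up fails before the inner assignment can happen.
try:
    car_dict["Nissan"]["types"] = ["sedan", "hatchback", "sports"]
except KeyError as exc:
    missing_key = exc.args[0]  # the key that could not be found

# setdefault creates the inner dict on first use and returns it,
# so the same line works whether or not the key already exists.
car_dict.setdefault("Nissan", {})["types"] = ["sedan", "hatchback", "sports"]
car_dict.setdefault("Nissan", {})["colors"] = ["red", "blue"]  # hypothetical extra field

print(missing_key)
print(car_dict)
```

Whether setdefault or defaultdict reads better is mostly a matter of taste; setdefault keeps car_dict a plain dict.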
As for printing out the values, you can do something like this:
for key in car_dict:
    manufacturer = key
    car_types = car_dict[key]["types"]
    print(f"The manufacturer '{manufacturer}' has the following types:")
    for t in car_types:
        print(t)
Output:
The manufacturer 'Nissan' has the following types:
sedan
hatchback
sports
When you loop through a dict, you are looping through only the keys that are contained in it by default. That means that we have to retrieve the values of key inside of the loop itself to be able to print them correctly.
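If you would rather get the key and its value in one step, dict.items() is an option. The sketch below assumes the same nested layout as above; the "Toyota" entry is invented just to show more than one manufacturer.

```python
car_dict = {
    "Nissan": {"types": ["sedan", "hatchback", "sports"]},
    "Toyota": {"types": ["sedan", "pickup"]},  # hypothetical second entry
}

lines = []
# items() yields (key, value) pairs, so no extra lookup is needed in the loop.
for manufacturer, details in car_dict.items():
    for car_type in details["types"]:
        lines.append(f"{manufacturer}: {car_type}")

print("\n".join(lines))
```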
Also, as a side note: you should avoid using built-in names such as type as variable names, because doing so shadows the built-in function in that scope, and you can run into problems later when you need to compare the types of variables.
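A small sketch of what that shadowing looks like in practice. The function name shadowing_demo is made up for this example.

```python
def shadowing_demo():
    type = "sedan"  # this local name now hides the built-in type()
    try:
        type(42)  # this tries to call the string "sedan"
    except TypeError as exc:
        return str(exc)
    return None


message = shadowing_demo()
print(message)

# With a different variable name, the built-in stays usable.
car_type = "sedan"
print(type(car_type) is str)
```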

converting a python string to a variable

I have read almost every similar question but none of them seems to solve my current problem.
In my Python code, I am importing a string from my bashrc, and further down I define a variable with that same name to index my dictionary. Here is a simple example:
obs = os.environ['obs']
>> obs = 'id_0123'
id_0123 = numpy.where(position1 == 456)
>> position1[id_0123] = 456
>> position2[id_0123] = 789
But of course, when I do positions[obs], it throws an error since obs is a string rather than an index (numpy.int64). I have looked for a way to convert my string into a variable, but every solution suggests either converting to a dictionary or assigning the string to an integer. I cannot do that, since my string is dynamic and will constantly change. In the end, I am going to have about 50 variables, and I need to check which variable the current obs corresponds to, so I can use it as an index to access the parameters.
Edit:
position1 and position2 are just NumPy arrays, so depending on the output of os.environ (which is 'id_0123' in this particular case), they will print an array element. So I cannot assign 'id_0123' to another string or number, since I am using that exact name as a variable.
The logic is that there are many different arrays, and I want to use the output of os.environ as an input to access elements of these arrays.
If you wanted to use a dictionary instead, this would work.
obs = 'id_0123'
my_dict = {obs: 'Tester'}
print (my_dict [obs])
print (my_dict ['id_0123'])
You could use the fact that (almost) everything is a dictionary in Python to create a storage container that you index with obs:
container = lambda:None
container.id_0123 = ...
...
print(positions[container.__dict__[obs]])
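A variant of the same idea, sketched with types.SimpleNamespace and getattr from the standard library instead of a lambda and __dict__. The positions list and index values here are placeholder data, and obs is hard-coded where the question reads it from os.environ.

```python
from types import SimpleNamespace

positions = [2, 3, 5, 7, 11, 13]  # placeholder data

# A namespace object holds the named indices as attributes.
container = SimpleNamespace()
container.id_0123 = 2
container.id_0456 = 4

obs = "id_0123"  # in the question this comes from os.environ['obs']
index = getattr(container, obs)  # look the attribute up by its string name
print(positions[index])
```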
Alternatively, you can use globals() or locals() to achieve the desired behavior:
import numpy
import os
def environment_to_variable(environment_key='obs', variable_values=globals()):
    # Get the name of the variable we want to reference from the bash environment
    variable_name = os.environ[environment_key]
    # Grab the variable value that matches the named variable from the environment. Variable you want to reference must be visible in the global namespace by default.
    variable_value = variable_values[variable_name]
    return variable_value
positions = [2, 3, 5, 7, 11, 13]
id_0123 = 2 # could use numpy.where here
id_0456 = 4
index = environment_to_variable(variable_values=locals())
print(positions[index])
If you place this in a file called a.py, you can observe the following:
$ obs="id_0123" python ./a.py
5
$ obs="id_0456" python ./a.py
11
This allows you to index the array differently at runtime, which is what it seems like your intention is.
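If you would prefer not to reach into globals() or locals(), the same runtime lookup can be sketched with an explicit dictionary. The names index_by_name and lookup_index are made up for this example, and the index values are the same placeholders as above.

```python
import os

positions = [2, 3, 5, 7, 11, 13]

# Explicit registry of named indices instead of loose global variables.
index_by_name = {
    "id_0123": 2,  # could come from numpy.where
    "id_0456": 4,
}


def lookup_index(environment_key="obs"):
    # Read the name from the environment; None if the variable is unset.
    name = os.environ.get(environment_key)
    if name not in index_by_name:
        raise KeyError(f"no index registered for {name!r}")
    return index_by_name[name]
```

With obs="id_0456" set in the environment, positions[lookup_index()] picks the same element as the globals() version, and an unknown name fails with a clear message rather than silently matching some unrelated variable.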

loop through a list with a function and store as variable

I have a list of variables I need to run through a function to produce a JSON table. I want to loop through this list (list_db) and create a new variable for each item so I can look through them manually in Spyder. I am having trouble creating those new variables from the for loop. I thought I could use the names of the items in the list as the new variable names, but I can't get it to work. Here is what I have:
for i in list_db:
    p = str(i)
    p = getDF(i) #function to run
What am I missing? Is there a more standard way of doing this that I can't think of?
It seems variable names don't really act how you are expecting. When you do p = str(i), you are assigning to the variable p. And then your next line, p = getDF(i) overwrites the value of p. It does not write to a variable whose name is str(i) like you are expecting.
If you want to store values into named slots, you can use a dictionary:
p = {}
for i in list_db:
    p[i] = getDF(i)
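The same thing can be written as a dict comprehension. In this sketch getDF is a hypothetical stand-in (the real one is not shown in the question), and the entries of list_db are invented.

```python
def getDF(name):
    # Hypothetical stand-in for the real getDF, which is not shown above.
    return {"source": name, "rows": []}


list_db = ["db_a", "db_b"]  # made-up names

# One entry per item, keyed by the item itself.
results = {name: getDF(name) for name in list_db}

# Each result can then be pulled out by name.
print(results["db_a"])
```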

Concatenating FOR loop output

I am very new to Python (first week of active use). I have some bash scripting experience but have decided to learn Python.
I have a variable holding multiple strings, which I am using to build a URL in a for loop. The output of each URL is JSON, and I would like to concatenate the complete output into one file.
I will use a random URL for privacy reasons.
The code looks like this:
==================
numbers = ['24246', '83367', '37643', '24245', '24241', '77968', '63157', '76004', '71665']
for id in numbers:
    restAPI = s.get(urljoin(baseurl, '/test/' + id + '&test2'))
    result = restAPI.json
==================
The problem is that if I do print(result), I only get the output of the last iteration, i.e. www.google.com/test/71665&test2
Creating a list by adding text = [] worked (content was concatenated) but I would like to keep the original format.
text = []
for id in numbers:
    restAPI = s.get(urljoin(baseurl, '/test/' + id + '&test2'))
Does anyone have an idea how to do this?
When the for loop ends, a variable assigned inside the loop only keeps its last value, i.e. on every iteration the restAPI variable is overwritten.
If you wanted to keep each result, you could append to a list defined outside the for loop on every iteration, i.e.
url_list = []
for id in numbers:
    restAPI = s.get(urljoin(baseurl, ...
    # Call .json() (assuming s is a requests session); without the parentheses you store the method itself
    url_list.append(restAPI.json())
Or if you just wanted to print...
for id in numbers:
    restAPI = s.get(urljoin(baseurl, ...
    print(restAPI.json())
If you added them to a list, you could perform separate functions with the new list of results.
If you think there might be duplicates, you can use a set() instead, which automatically drops duplicates as new values are added. Set members must be hashable, though, and parsed JSON (a dict or list) is not, so add a hashable form such as the raw response text, e.g. set_name.add(restAPI.text) (assuming s is a requests session).
Better still, you could use a dict and assign the id as the key and the JSON object as the value. So you could:
dict_obj = dict()
for id in numbers:
    restAPI = s.get(urljoin(baseurl, ...
    dict_obj[id] = restAPI.json()
That way you can query the dictionary later in the script.
Note that if you're querying many URLs, storing all the JSON objects in memory might be intensive depending on your hardware.
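Since the question asks for everything in one file, here is a rough sketch of writing the collected results out as a single JSON document. fetch_json is a hypothetical stand-in for the s.get(...).json() call so the example runs without a network, and save_combined and the shortened numbers list are likewise made up for illustration.

```python
import json


def fetch_json(record_id):
    # Hypothetical stand-in for s.get(...).json(); returns a parsed JSON object.
    return {"id": record_id, "status": "ok"}


def save_combined(path, results):
    # Write all responses as one JSON document keyed by id.
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(results, handle, indent=2)


numbers = ['24246', '83367', '37643']

# Collect every response, keyed by the id that produced it.
results = {record_id: fetch_json(record_id) for record_id in numbers}

# The same document as a string, e.g. for printing or inspection.
combined_text = json.dumps(results, indent=2)
print(combined_text)
```

Calling save_combined("combined.json", results) would then write the file; the file name is only an example.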
