target_transform in torchvision.datasets.ImageFolder seems not to work - pytorch

I am using PuyTorch 1.13 with Python 3.10.
I have a problem where I import pictures from a folder structure using
data = ImageFolder(root='./faces/', loader=img_loader, transform=transform,
is_valid_file=is_valid_file)
In this command labels are assigned automatically according to which subdirectory belongs an image.
I wanted to assign different labels and use target_transform for this purpose (e.g. I wanted to use a word from the file name to assign an appropriate label).
I have used
def target_transform(id):
print(2)
return id * 2
data = ImageFolder(root='./faces/', loader=img_loader, transform=transform, target_transform=target_transform, is_valid_file=is_valid_file)
Next,
data = ImageFolder(root='./faces/', loader=img_loader, transform=transform, target_transform=lambda id:2*id, is_valid_file=is_valid_file)
or
data = ImageFolder(root='./faces/', loader=img_loader, transform=transform, target_transform=
torchvision.transforms.Lambda(lambda id:2*id), is_valid_file=is_valid_file)
But none of these affect the labels. In addition, in the first example I included the print statemet to see whether the function is called but it is not. I have serached the use of this funciton but the exmaples I have found do not work and the documentation is scarce in this respect. Any idea what is wrogn with the code?

Related

Stuck using pandas to build RPG item generator

I am trying to build a simple random item generator for a game I am working on.
So far I am stuck trying to figure out how to store and access all of the data. I went with pandas using .csv files to store the data sets.
I want to add weighted probabilities to what items are generated so I tried to read the csv files and compile each list into a new set.
I got the program to pick a random set but got stuck when trying to pull a random row from that set.
I am getting an error when I use .sample() to pull the item row which makes me think I don't understand how pandas works. I think I need to be creating new lists so I can later index and access the various statistics of the items once one is selected.
Once I pull the item I was intending on adding effects that would change the damage and armor and such displayed. So I was thinking of having the new item be its own list then use damage = item[2] + 3 or whatever I need
error is: AttributeError: 'list' object has no attribute 'sample'
Can anyone help with this problem? Maybe there is a better way to set up the data?
here is my code so far:
import pandas as pd
import random
df = [pd.read_csv('weapons.csv'), pd.read_csv('armor.csv'), pd.read_csv('aether_infused.csv')]
def get_item():
item_class = [random.choices(df, weights=(45,40,15), k=1)] #this part seemed to work. When I printed item_class it printed one of the entire lists at the correct odds
item = item_class.sample()
print (item) #to see if the program is working
get_item()
I think you are getting slightly confused with lists vs list elements. This should work. I stubbed your dfs with simple ones
import pandas as pd
import random
# Actual data. Comment it out if you do not have the csv files
df = [pd.read_csv('weapons.csv'), pd.read_csv('armor.csv'), pd.read_csv('aether_infused.csv')]
# My stubs -- uncomment and use this instead of the line above if you want to run this specific example
# df = [pd.DataFrame({'weapons' : ['w1','w2']}), pd.DataFrame({'armor' : ['a1','a2', 'a3']}), pd.DataFrame({'aether' : ['e1','e2', 'e3', 'e4']})]
def get_item():
# I removed [] from the line below -- choices() already returns a list of length 1
item_class = random.choices(df, weights=(45,40,15), k=1)
# I added [0] to choose the first element of item_class which is a list of length 1 from the line above
item = item_class[0].sample()
print (item) #to see if the program is working
get_item()
prints random rows from random dataframes that I setup such as
weapons
1 w2

How may I dynamically create global variables within a function based on input in Python

I'm trying to create a function that returns a dynamically-named list of columns. Usually I can manually name the list, but I now have 100+ csv files to work with.
My goal:
Function creates a list, and names it based on dataframe name
Created list is callable outside of the function
I've done my research, and this answer from an earlier post came very close to helping me.
Here is what I've adapted
def test1(dataframe):
# Using globals() to get dataframe name
df_name = [x for x in globals() if globals()[x] is dataframe][0]
# Creating local dictionary to use exec function
local_dict = {}
# Trying to generate a name for the list, based on input dataframe name
name = 'col_list_' + df_name
exec(name + "=[]", globals(), local_dict)
# So I can call this list outside the function
name = local_dict[name]
for feature in dataframe.columns:
# Append feature/column if >90% of values are missing
if dataframe[feature].isnull().mean() >= 0.9:
name.append(feature)
return name
To ensure the list name changes based on the DataFrame supplied to the function, I named the list using:
name = 'col_list_' + df_name
The problem comes when I try to make this list accessible outside the function:
name = local_dict[name].
I cannot find away to assign a dynamic list name to the local dictionary, so I am forced to always call name outside the function to return the list. I want the list to be named based on the dataframe input (eg. col_list_df1, col_list_df2, col_list_df99).
This answer was very helpful, but it seems specific to variables.
global 'col_list_' + df_name returns a syntax error.
Any help would be greatly appreciated!

loop to read multiple files

I am using Obspy _read_segy function to read a segy file using following line of code:
line_1=_read_segy('st1.segy')
However I have a large number of files in a folder as follow:
st1.segy
st2.segy
st3.segy
.
.
st700.segy
I want to use a for loop to read the data but I am new so can any one help me in this regard.
Currently i am using repeated lines to read data as follow:
line_2=_read_segy('st1.segy')
line_2=_read_segy('st2.segy')
The next step is to display the segy data using matplotlib and again i am using following line of code on individual lines which makes it way to much repeated work. Can someone help me with creating a loop to display the data and save the figures .
data=np.stack(t.data for t in line_1.traces)
vm=np.percentile(data,99)
plt.figure(figsize=(60,30))
plt.imshow(data.T, cmap='seismic',vmin=-vm, vmax=vm, aspect='auto')
plt.title('Line_1')
plt.savefig('Line_1.png')
plt.show()
Your kind suggestions will help me a lot as I am a beginner in python programming.
Thank you
If you want to reduce code duplication, you use something called functions. And If you want to repeatedly do something, you can use loops. So you can call a function in a loop, if you want to do this for all files.
Now, for reading the files in folder, you can use glob package of python. Something like below:
import glob, os
def save_fig(in_file_name, out_file_name):
line_1 = _read_segy(in_file_name)
data = np.stack(t.data for t in line_1.traces)
vm = np.percentile(data, 99)
plt.figure(figsize=(60, 30))
plt.imshow(data.T, cmap='seismic', vmin=-vm, vmax=vm, aspect='auto')
plt.title(out_file_name)
plt.savefig(out_file_name)
segy_files = list(glob.glob(segy_files_path+"/*.segy"))
for index, file in enumerate(segy_files):
save_fig(file, "Line_{}.png".format(index + 1))
I have not added other imports here, which you know to add!. segy_files_path is the folder where your files reside.
You just need to dynamically open the files in a loop. Fortunately they all follow the same naming pattern.
N = 700
for n in range(N):
line_n =_read_segy(f"st{n}.segy") # Dynamic name.
data = np.stack(t.data for t in line_n.traces)
vm = np.percentile(data, 99)
plt.figure(figsize=(60, 30))
plt.imshow(data.T, cmap="seismic", vmin=-vm, vmax=vm, aspect="auto")
plt.title(f"Line_{n}")
plt.show()
plt.savefig(f"Line_{n}.png")
plt.close() # Needed if you don't want to keep 700 figures open.
I'll focus on addressing the file looping, as you said you're new and I'm assuming simple loops are something you'd like to learn about (the first example is sufficient for this).
If you'd like an answer to your second question, it might be worth providing some example data, the output result (graph) of your current attempt, and a description of your desired output. If you provide that reproducible example and clear description of the problem you're having it'd be easier to answer.
Create a list (or other iterable) to hold the file names to read, and another container (maybe a dict) to hold the result of your read_segy.
files = ['st1.segy', 'st2.segy']
lines = {} # creates an empty dictionary; dictionaries consist of key: value pairs
for f in files: # f will first be 'st1.segy', then 'st2.segy'
lines[f] = read_segy(f)
As stated in the comment by #Guimoute, if you want to dynamically generate the file names, you can create the files list by pasting integers to the base file name.
lines = {} # creates an empty dictionary; dictionaries have key: value pairs
missing_files = []
for i in range(1, 701):
f = f"st{str(i)}.segy" # would give "st1.segy" for i = 1
try: # in case one of the files is missing or can’t be read
lines[f] = read_segy(f)
except:
missing_files.append(f) # store names of missing or unreadable files

How to use extract the hidden layer features in H2ODeepLearningEstimator?

I found H2O has the function h2o.deepfeatures in R to pull the hidden layer features
https://www.rdocumentation.org/packages/h2o/versions/3.20.0.8/topics/h2o.deepfeatures
train_features <- h2o.deepfeatures(model_nn, train, layer=3)
But I didn't find any example in Python? Can anyone provide some sample code?
Most Python/R API functions are wrappers around REST calls. See http://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/_modules/h2o/model/model_base.html#ModelBase.deepfeatures
So, to convert an R example to a Python one, move the model to be the this, and all other args should shuffle along. I.e. the example from the manual becomes (with dots in variable names changed to underlines):
prostate_hex = ...
prostate_dl = ...
prostate_deepfeatures_layer1 = prostate_dl.deepfeatures(prostate_hex, 1)
prostate_deepfeatures_layer2 = prostate_dl.deepfeatures(prostate_hex, 2)
Sometimes the function name will change slightly (e.g. h2o.importFile() vs. h2o.import_file() so you need to hunt for it at http://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/index.html

Abaqus Python script -- Reading 'TENSOR_3D_FULL' data from *.odb file

What I want: strain values LE11, LE22, LE12 at nodal points
My script is:
#!/usr/local/bin/python
# coding: latin-1
# making the ODB commands available to the script
from odbAccess import*
import sys
import csv
odbPath = "my *.odb path"
odb = openOdb(path=odbPath)
assembly = odb.rootAssembly
# count the number of frames
NumofFrames = 0
for v in odb.steps["Step-1"].frames:
NumofFrames = NumofFrames + 1
# create a variable that refers to the reference (undeformed) frame
refFrame = odb.steps["Step-1"].frames[0]
# create a variable that refers to the node set ‘Region Of Interest (ROI)’
ROINodeSet = odb.rootAssembly.nodeSets["ROI"]
# create a variable that refers to the reference coordinate ‘REFCOORD’
refCoordinates = refFrame.fieldOutputs["COORD"]
# create a variable that refers to the coordinates of the node
# set in the test frame of the step
ROIrefCoords = refCoordinates.getSubset(region=ROINodeSet,position= NODAL)
# count the number of nodes
NumofNodes =0
for v in ROIrefCoords.values:
NumofNodes = NumofNodes +1
# looping over all the frames in the step
for i1 in range(NumofFrames):
# create a variable that refers to the current frame
currFrame = odb.steps["Step-1"].frames[i1+1]
# looping over all the frames in the step
for i1 in range(NumofFrames):
# create a variable that refers to the strain 'LE'
Str = currFrame.fieldOutputs["LE"]
ROIStr = Str.getSubset(region=ROINodeSet, position= NODAL)
# initialize list
list = [[]]
# loop over all the nodes in each frame
for i2 in range(NumofNodes):
strain = ROIStr.values [i2]
list.insert(i2,[str(strain.dataDouble[0])+";"+str(strain.dataDouble[1])+\
";"+str(strain.dataDouble[3]))
# write the list in a new *.csv file (code not included for brevity)
odb.close()
The error I get is:
strain = ROIStr.values [i2]
IndexError: Sequence index out of range
Additional info:
Details for ROIStr:
ROIStr.name
'LE'
ROIStr.type
TENSOR_3D_FULL
OIStr.description
'Logarithmic strain components'
ROIStr.componentLabels
('LE11', 'LE22', 'LE33', 'LE12', 'LE13', 'LE23')
ROIStr.getattribute
'getattribute of openOdb(r'path to .odb').steps['Step-1'].frames[1].fieldOutputs['LE'].getSubset(position=INTEGRATION_POINT, region=openOdb(r'path to.odb').rootAssembly.nodeSets['ROI'])'
When I use the same code for VECTOR objects, like 'U' for nodal displacement or 'COORD' for nodal coordinates, everything works without a problem.
The error happens in the first loop. So, it is not the case where it cycles several loops before the error happens.
Question: Does anyone know what is causing the error in the above code?
Here the reason you get an IndexError. Strains are (obviously) calculated at the integration points; according to the ABQ Scripting Reference Guide:
A SymbolicConstant specifying the position of the output in the element. Possible values are:
NODAL, specifying the values calculated at the nodes.
INTEGRATION_POINT, specifying the values calculated at the integration points.
ELEMENT_NODAL, specifying the values obtained by extrapolating results calculated at the integration points.
CENTROID, specifying the value at the centroid obtained by extrapolating results calculated at the integration points.
In order to use your code, therefore, you should get the results using position= ELEMENT_NODAL
ROIrefCoords = refCoordinates.getSubset(region=ROINodeSet,position= ELEMENT_NODAL)
With
ROIStr.values[0].data
You will then get an array containing the 6 independent components of your tensor.
Alternative Solution
For reading time series of results for a nodeset, you can use the function xyPlot.xyDataListFromField(). I noticed that this function is much faster than using odbread. The code also is shorter, the only drawback is that you have to get an abaqus license for using it (in contrast to odbread which works with abaqus python which only needs an installed version of abaqus and does not need to get a network license).
For your application, you should do something like:
from abaqus import *
from abaqusConstants import *
from abaqusExceptions import *
import visualization
import xyPlot
import displayGroupOdbToolset as dgo
results = session.openOdb(your_file + '.odb')
# without this, you won't be able to extract the results
session.viewports['Viewport: 1'].setValues(displayedObject=results)
xyList = xyPlot.xyDataListFromField(odb=results, outputPosition=NODAL, variable=((
'LE', INTEGRATION_POINT, ((COMPONENT, 'LE11'), (COMPONENT, 'LE22'), (
COMPONENT, 'LE33'), (COMPONENT, 'LE12'), )), ), nodeSets=(
'ROI', ))
(Of course you have to add LE13 etc.)
You will get a list of xyData
type(xyList[0])
<type 'xyData'>
Containing the desired data for each node and each output. It size will therefore be
len(xyList)
number_of_nodes*number_of_requested_outputs
Where the first number_of_nodes elements of the list are the LE11 at each nodes, then LE22 and so on.
You can then transform this in a NumPy array:
LE11_1 = np.array(xyList[0])
would be LE11 at the first node, with dimensions:
LE.shape
(NumberTimeFrames, 2)
That is, for each time step you have time and output variable.
NumPy arrays are also very easy to write on text files (check out numpy.savetxt).

Resources