I have to deal with csv image data from a camera which exports the data with a header. In that header is a simple function for converting CCD counts into power density. This equation includes both the dark offset level as well as a calibration factor. Here is an example from one line of an image file:
Power Density,=,(n - 232) * 4.182e-005 W/cm^2
Notice the commas. The csv header can be expected to have the same structure each time with different constants for dark level (232) and power density conversion (4.182e-005).
What I would like to be able to do is grab the last cell, strip off the units at the end (W/cm^2), and use what is left to define a function in Python. Something like
f = lambda n: '(n - 232) * 4.182e-005'
Is it possible to do so? If so, how?
eval and exec, which use compile, are both ways to dynamically convert code as text to a compiled function. If you dynamically create a new function, you only need to do the conversion once.
row = "Power Density,=,(n - 232) * 4.182e-005 W/cm^2".split(',')
expr = row[2].replace( ' W/cm^2', '')
# f = eval("lambda n:" + expr) # based on your original idea
exec("def f(n): return " + expr) # more flexible
print(f(0))
# -0.00970224
The lambda eval and def exec have the same result, other than f.__name__, but as usual, the def form is more flexible, even if the flexibility is not needed here.
The usual caveats about executing untrusted code apply. If you are working with photo files that are not your own and are worried about an adversary feeding you a poisoned file, then you might want to tokenize expr and check that it contains only the expected tokens.
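The check suggested above could be sketched roughly as follows. This is only one possible approach, using the standard ast module instead of a hand-written tokenizer; the helper name make_converter and the whitelist are assumptions made for illustration, not part of the original answer.

```python
import ast

# Node types permitted in the header expression; anything else is rejected.
_ALLOWED = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant, ast.Name, ast.Load,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.USub,
)

def make_converter(expr, var="n"):
    """Compile a simple arithmetic expression in `var` into a function."""
    tree = ast.parse(expr, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, _ALLOWED):
            raise ValueError(f"disallowed syntax: {type(node).__name__}")
        if isinstance(node, ast.Name) and node.id != var:
            raise ValueError(f"unexpected name: {node.id}")
        if isinstance(node, ast.Constant) and not isinstance(node.value, (int, float)):
            raise ValueError("only numeric constants are allowed")
    code = compile(tree, "<header>", "eval")
    # Evaluate with no builtins and only the single variable visible.
    return lambda n: eval(code, {"__builtins__": {}}, {var: n})

row = "Power Density,=,(n - 232) * 4.182e-005 W/cm^2".split(",")
f = make_converter(row[2].replace(" W/cm^2", ""))
print(f(232))  # 0.0 at the dark level
```

Anything beyond plain arithmetic on n (function calls, attribute access, other names) raises ValueError before it is evaluated.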
I found a way to do it using eval, but I expect that it isn't very pythonic so I would still be interested in seeing other answers.
Here row is the row of interest from a csv.reader object, i.e. the same string I posted in the question divided at the commas.
# Strip the units from the string
strng = row[2].replace( ' W/cm^2', '')
# Define a function based on the string
def f(n):
    return eval(strng)
# Evaluate a value
print( f( 0))
# Returns: -0.00970224
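Since the header is expected to have the same structure each time, another option is to skip eval entirely and pull out the two constants with a regular expression. The sketch below assumes the cell always looks like "(n - <dark>) * <factor> W/cm^2"; the names HEADER_RE and parse_power_density are made up for this example.

```python
import re

# Pattern for "(n - <dark>) * <factor> ..."; assumes this exact layout.
HEADER_RE = re.compile(r"\(\s*n\s*-\s*([-+.\deE]+)\s*\)\s*\*\s*([-+.\deE]+)")

def parse_power_density(cell):
    """Return a count -> power density function built from the header cell."""
    match = HEADER_RE.search(cell)
    if match is None:
        raise ValueError(f"unrecognised header cell: {cell!r}")
    dark, factor = float(match.group(1)), float(match.group(2))
    return lambda n: (n - dark) * factor

f = parse_power_density("(n - 232) * 4.182e-005 W/cm^2")
print(f(232))  # 0.0 at the dark level
```

The returned function only does arithmetic, so it should also work on a NumPy array of counts.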
Hello I'm struggling to find a cleaner way to handle multiple StringVar assignments from a text file import. Whatever method I end up using needs to be adaptable for more StringVars in the future as this project grows.
Each line in the text file import has a key preceding a value "key: value" where the delimiter is ": ". While I currently have the order of the text file lines fixed, that will not be the case in the future for this project. Certain settings the program I'm making may influence what values are stored in this text file and some keys may not be written if there is no data associated with them.
From what I understand, I need the StringVars to fill in dynamic tkinter GUI Label and Entry elements, so I don't see a way around using StringVars. Below is a snippet from my save-file import function, which assigns StringVar values from the input file data.
My current solution to manually set each StringVar to a value from a list is a bit clunky but better than a massive if-else statement evaluating the label which was my first pass.
# Read in the data
raw = open_regatta_save_file(file_path)
# Process the regatta save file.
row = 0
# Process Regatta Information
regatta_labels = ["Regatta_Name", "Regatta_Host", "Regatta_Location", "Regatta_Start_Day",
                  "Regatta_Start_Month", "Regatta_Start_Year", "Regatta_End_Day", "Regatta_End_Month",
                  "Regatta_End_Year", "Regatta_Type", "Regatta_Throw_Outs"]
regatta_values = [""] * len(regatta_labels)
while raw[row] != ":::BOAT INFORMATION:::":  # Header for next section of input file
    line = raw[row]
    if ': ' in line and not line.startswith(':::'):
        # This is a line with data in it and not a header
        label, data = line.split(': ', 1)
        # Blank or missing data is skipped; only known labels are stored
        if data and label in regatta_labels:
            label_index = regatta_labels.index(label)
            regatta_values[label_index] = data
    row += 1
# Set regatta parameter values. Since regatta_labels control the order of the regatta_values list,
# these variables can be assigned in order.
self.regatta_name.set(regatta_values[0])
self.regatta_host.set(regatta_values[1])
self.regatta_location.set(regatta_values[2])
self.regatta_start_day.set(regatta_values[3])
self.regatta_start_month.set(regatta_values[4])
self.regatta_start_year.set(regatta_values[5])
self.regatta_end_day.set(regatta_values[6])
self.regatta_end_month.set(regatta_values[7])
self.regatta_end_year.set(regatta_values[8])
# Process Rest of file using similar method.
I'm hoping someone can help me understand Python StringVars better so that I can approach something simpler like:
StringVar1, StringVar2 ... = [list with values to assign to StringVars]
Thank you for your help.
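One way the "key: value" import described above might be made order-independent is to keep the variables in a dictionary keyed by label and loop over whatever the file contains. This is only a sketch under assumptions: Var is a stand-in for tkinter.StringVar so the example runs without a display, and parse_section, regatta_vars and the ":::REGATTA INFORMATION:::" header are hypothetical.

```python
class Var:
    """Stand-in for tkinter.StringVar so the sketch runs without a display."""
    def __init__(self):
        self._value = ""
    def set(self, value):
        self._value = value
    def get(self):
        return self._value

def parse_section(lines):
    """Collect 'key: value' pairs, skipping ':::' headers and blank values."""
    values = {}
    for line in lines:
        if line.startswith(":::") or ": " not in line:
            continue
        label, data = line.split(": ", 1)  # split once so values may contain ': '
        if data.strip():
            values[label] = data
    return values

# One entry per setting; adding a field later means adding one line here.
regatta_vars = {"Regatta_Name": Var(), "Regatta_Host": Var(), "Regatta_Location": Var()}

raw = [":::REGATTA INFORMATION:::", "Regatta_Host: Some Club",
       "Regatta_Name: Spring Open", "Regatta_Location: "]
for label, data in parse_section(raw).items():
    if label in regatta_vars:
        regatta_vars[label].set(data)

print(regatta_vars["Regatta_Name"].get())  # Spring Open
```

Because the lookup is by label, the order of lines in the file does not matter, and keys that are absent simply leave their variable at its default.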
I have large 3D arrays of the same size holding data such as density, temperature, pressure, entropy, and so on. I want to run the same function (like divergence()) on each of these arrays. The easy way is as follows:
div_density = divergence(density)
div_temperature = divergence(temperature)
div_pressure = divergence(pressure)
div_entropy = divergence(entropy)
Considering the fact that I have several arrays (about 100), I'd like to use a loop as follows:
var_list = ['density', 'temperature', 'pressure', 'entropy']
div = np.zeros((len(var_list)))
for counter, variable in enumerate(var_list):
    div[counter] = divergence(STV(variable))
I'm looking for a function like STV() which simply changes "string" to the "variable name". Is there a function like that in python? If yes, what is that function (by using such function, data should not be removed from the variable)?
These 3D arrays are large and because of the RAM limitation cannot be saved in another list like:
main_data=[density, temperature, pressure, entropy]
So I cannot have a loop on main_data.
One workaround is to use exec as follows
var_list = ['density', 'temperature', 'pressure', 'entropy']
div = np.zeros((len(var_list)))
for counter, variable in enumerate(var_list):
    s = "div[counter] = divergence("+variable+")"
    exec(s)
exec basically executes the string given as the argument in the python interpreter.
How about using a dictionary that links the variable names to their contents?
Instead of using variable names (density = ...), use dict entries (data['density']) for the data:
data = {}
# load your variables like:
data['density'] = ...
divs = {}
for key, val in data.items():
    divs[key] = divergence(val)
Since the data you use is large and the operations you are trying to do are computationally expensive, I would have a look at some of the libraries that provide methods to handle such data structures. Some of them also use C/C++ bindings for the expensive calculations (as numpy does). Just to name some: numpy, pandas, xarray, iris (especially for earth data).
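A small sketch of the dictionary approach above. A Python dict (or list) stores references to existing arrays rather than copies, so collecting the arrays this way should not duplicate them in RAM. The divergence function here is only a placeholder, since the real one was not shown, and the tiny random arrays stand in for the real data.

```python
import numpy as np

def divergence(field):
    """Placeholder: sums the gradient components of a scalar field."""
    return sum(np.gradient(field))

# Small stand-in arrays; the real ones would be loaded from disk.
density = np.random.rand(4, 4, 4)
temperature = np.random.rand(4, 4, 4)

# The dict stores references to the existing arrays, not copies.
data = {"density": density, "temperature": temperature}
print(data["density"] is density)  # True

divs = {name: divergence(arr) for name, arr in data.items()}
print(divs["density"].shape)  # (4, 4, 4)
```

Only the results in divs are new allocations; the inputs are shared with the original variables.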
I am using the Obspy _read_segy function to read a segy file with the following line of code:
line_1=_read_segy('st1.segy')
However, I have a large number of files in a folder, as follows:
st1.segy
st2.segy
st3.segy
.
.
st700.segy
I want to use a for loop to read the data, but I am new to Python. Can anyone help me with this?
Currently I am using repeated lines to read the data, as follows:
line_1=_read_segy('st1.segy')
line_2=_read_segy('st2.segy')
The next step is to display the segy data using matplotlib. Again, I am using the following lines of code for each individual line, which makes for far too much repeated work. Can someone help me create a loop to display the data and save the figures?
data=np.stack([t.data for t in line_1.traces])
vm=np.percentile(data,99)
plt.figure(figsize=(60,30))
plt.imshow(data.T, cmap='seismic',vmin=-vm, vmax=vm, aspect='auto')
plt.title('Line_1')
plt.savefig('Line_1.png')
plt.show()
Your kind suggestions will help me a lot as I am a beginner in python programming.
Thank you
If you want to reduce code duplication, use functions. If you want to do something repeatedly, use loops. So, to do this for all files, call a function in a loop.
To list the files in a folder, you can use Python's glob module. Something like the following:
import glob, os
def save_fig(in_file_name, out_file_name):
    line_1 = _read_segy(in_file_name)
    # np.stack needs a sequence, so build a list rather than a generator
    data = np.stack([t.data for t in line_1.traces])
    vm = np.percentile(data, 99)
    plt.figure(figsize=(60, 30))
    plt.imshow(data.T, cmap='seismic', vmin=-vm, vmax=vm, aspect='auto')
    plt.title(out_file_name)
    plt.savefig(out_file_name)
    plt.close()  # release the figure before the next file
segy_files = glob.glob(os.path.join(segy_files_path, "*.segy"))
for index, file in enumerate(segy_files):
    save_fig(file, "Line_{}.png".format(index + 1))
I have not added the other imports here; you will know which ones to add. segy_files_path is the folder where your files reside.
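One caveat with the glob approach: the order of the returned file names is not guaranteed, so the running index may not match the number in the file name. A possible way around this, sketched here with the hypothetical helper numbered_segy_files and empty placeholder files, is to extract the number from each name and sort on it.

```python
import re
import tempfile
from pathlib import Path

def numbered_segy_files(folder):
    """Return (number, path) pairs for st<number>.segy files, sorted by number."""
    found = []
    for path in Path(folder).glob("st*.segy"):
        match = re.fullmatch(r"st(\d+)\.segy", path.name)
        if match:
            found.append((int(match.group(1)), path))
    return sorted(found)

# Demonstration with empty placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    for n in (10, 2, 1):
        (Path(folder) / f"st{n}.segy").touch()
    numbers = [n for n, _ in numbered_segy_files(folder)]
    print(numbers)  # [1, 2, 10]
```

Each (number, path) pair could then be used to read the file and to name the output, for example "Line_{}.png".format(number).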
You just need to dynamically open the files in a loop. Fortunately they all follow the same naming pattern.
N = 700
for n in range(1, N + 1):  # Files are numbered st1.segy to st700.segy.
    line_n = _read_segy(f"st{n}.segy")  # Dynamic name.
    data = np.stack([t.data for t in line_n.traces])
    vm = np.percentile(data, 99)
    plt.figure(figsize=(60, 30))
    plt.imshow(data.T, cmap="seismic", vmin=-vm, vmax=vm, aspect="auto")
    plt.title(f"Line_{n}")
    plt.savefig(f"Line_{n}.png")  # Save before show(); otherwise the saved figure may be blank.
    plt.show()
    plt.close()  # Needed if you don't want to keep 700 figures open.
I'll focus on addressing the file looping, as you said you're new and I'm assuming simple loops are something you'd like to learn about (the first example is sufficient for this).
If you'd like an answer to your second question, it might be worth providing some example data, the output result (graph) of your current attempt, and a description of your desired output. If you provide that reproducible example and clear description of the problem you're having it'd be easier to answer.
Create a list (or other iterable) to hold the file names to read, and another container (maybe a dict) to hold the results of your _read_segy calls.
files = ['st1.segy', 'st2.segy']
lines = {}  # creates an empty dictionary; dictionaries consist of key: value pairs
for f in files:  # f will first be 'st1.segy', then 'st2.segy'
    lines[f] = _read_segy(f)
As stated in the comment by @Guimoute, if you want to generate the file names dynamically, you can build each name by inserting an integer into the base file name.
lines = {}  # creates an empty dictionary; dictionaries have key: value pairs
missing_files = []
for i in range(1, 701):
    f = f"st{i}.segy"  # would give "st1.segy" for i = 1
    try:  # in case one of the files is missing or can't be read
        lines[f] = _read_segy(f)
    except Exception:
        missing_files.append(f)  # store names of missing or unreadable files
What I want: strain values LE11, LE22, LE12 at nodal points
My script is:
#!/usr/local/bin/python
# coding: latin-1
# making the ODB commands available to the script
from odbAccess import*
import sys
import csv
odbPath = "my *.odb path"
odb = openOdb(path=odbPath)
assembly = odb.rootAssembly
# count the number of frames
NumofFrames = 0
for v in odb.steps["Step-1"].frames:
    NumofFrames = NumofFrames + 1
# create a variable that refers to the reference (undeformed) frame
refFrame = odb.steps["Step-1"].frames[0]
# create a variable that refers to the node set ‘Region Of Interest (ROI)’
ROINodeSet = odb.rootAssembly.nodeSets["ROI"]
# create a variable that refers to the reference coordinate ‘REFCOORD’
refCoordinates = refFrame.fieldOutputs["COORD"]
# create a variable that refers to the coordinates of the node
# set in the test frame of the step
ROIrefCoords = refCoordinates.getSubset(region=ROINodeSet,position= NODAL)
# count the number of nodes
NumofNodes = 0
for v in ROIrefCoords.values:
    NumofNodes = NumofNodes + 1
# looping over all the frames in the step
for i1 in range(NumofFrames):
    # create a variable that refers to the current frame
    currFrame = odb.steps["Step-1"].frames[i1+1]
    # create a variable that refers to the strain 'LE'
    Str = currFrame.fieldOutputs["LE"]
    ROIStr = Str.getSubset(region=ROINodeSet, position= NODAL)
    # initialize list
    list = [[]]
    # loop over all the nodes in each frame
    for i2 in range(NumofNodes):
        strain = ROIStr.values [i2]
        list.insert(i2,[str(strain.dataDouble[0])+";"+str(strain.dataDouble[1])+\
        ";"+str(strain.dataDouble[3])])
# write the list in a new *.csv file (code not included for brevity)
odb.close()
The error I get is:
strain = ROIStr.values [i2]
IndexError: Sequence index out of range
Additional info:
Details for ROIStr:
ROIStr.name
'LE'
ROIStr.type
TENSOR_3D_FULL
ROIStr.description
'Logarithmic strain components'
ROIStr.componentLabels
('LE11', 'LE22', 'LE33', 'LE12', 'LE13', 'LE23')
ROIStr.__getattribute__
'__getattribute__ of openOdb(r'path to .odb').steps['Step-1'].frames[1].fieldOutputs['LE'].getSubset(position=INTEGRATION_POINT, region=openOdb(r'path to.odb').rootAssembly.nodeSets['ROI'])'
When I use the same code for VECTOR objects, like 'U' for nodal displacement or 'COORD' for nodal coordinates, everything works without a problem.
The error happens in the first loop. So, it is not the case where it cycles several loops before the error happens.
Question: Does anyone know what is causing the error in the above code?
Here is the reason you get an IndexError. Strains are calculated at the integration points, not at the nodes; according to the Abaqus Scripting Reference Guide:
A SymbolicConstant specifying the position of the output in the element. Possible values are:
NODAL, specifying the values calculated at the nodes.
INTEGRATION_POINT, specifying the values calculated at the integration points.
ELEMENT_NODAL, specifying the values obtained by extrapolating results calculated at the integration points.
CENTROID, specifying the value at the centroid obtained by extrapolating results calculated at the integration points.
In order to use your code, therefore, you should request the strains using position=ELEMENT_NODAL:
ROIStr = Str.getSubset(region=ROINodeSet, position=ELEMENT_NODAL)
With
ROIStr.values[0].data
you will then get an array containing the 6 independent components of your tensor.
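One thing to keep in mind with ELEMENT_NODAL, as far as I know, is that a node shared by several elements appears once per element, so the number of values need not equal the number of nodes. A rough sketch of averaging per node follows. It uses plain tuples as stand-in data; in the real script each record would presumably come from the nodeLabel and data attributes of an entry in ROIStr.values, which is an assumption here.

```python
from collections import defaultdict

# Stand-in records: (node_label, (LE11, LE22, LE33, LE12, LE13, LE23)).
records = [
    (1, (0.010, 0.020, 0.0, 0.004, 0.0, 0.0)),
    (1, (0.030, 0.040, 0.0, 0.008, 0.0, 0.0)),  # same node, neighbouring element
    (2, (0.050, 0.060, 0.0, 0.002, 0.0, 0.0)),
]

sums = defaultdict(lambda: [0.0] * 6)
counts = defaultdict(int)
for node_label, components in records:
    counts[node_label] += 1
    for i, value in enumerate(components):
        sums[node_label][i] += value

# Average per node and keep LE11, LE22 and LE12 (indices 0, 1 and 3).
nodal = {
    label: tuple(sums[label][i] / counts[label] for i in (0, 1, 3))
    for label in sums
}
print(sorted(nodal))  # [1, 2]
```

Looping over the values themselves, rather than over range(NumofNodes), also avoids indexing past the end of the sequence.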
Alternative Solution
For reading a time series of results for a node set, you can use the function xyPlot.xyDataListFromField(). I noticed that this function is much faster than using odbread. The code is also shorter. The only drawback is that you need an Abaqus license to use it (in contrast to odbread, which works with abaqus python and only needs an installed version of Abaqus, not a network license).
For your application, you should do something like:
from abaqus import *
from abaqusConstants import *
from abaqusExceptions import *
import visualization
import xyPlot
import displayGroupOdbToolset as dgo
results = session.openOdb(your_file + '.odb')
# without this, you won't be able to extract the results
session.viewports['Viewport: 1'].setValues(displayedObject=results)
xyList = xyPlot.xyDataListFromField(odb=results, outputPosition=NODAL, variable=((
'LE', INTEGRATION_POINT, ((COMPONENT, 'LE11'), (COMPONENT, 'LE22'), (
COMPONENT, 'LE33'), (COMPONENT, 'LE12'), )), ), nodeSets=(
'ROI', ))
(Of course you have to add LE13 etc.)
You will get a list of xyData
type(xyList[0])
<type 'xyData'>
containing the desired data for each node and each output. Its size will therefore be
len(xyList)
number_of_nodes*number_of_requested_outputs
where the first number_of_nodes elements of the list are LE11 at each node, then LE22, and so on.
You can then transform this into a NumPy array:
LE11_1 = np.array(xyList[0])
would be LE11 at the first node, with dimensions:
LE11_1.shape
(NumberTimeFrames, 2)
That is, for each time step you have time and output variable.
NumPy arrays are also very easy to write to text files (check out numpy.savetxt).
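As a small illustration of that last step, here is a sketch with made-up numbers standing in for np.array(xyList[0]); the column names and values are invented for the example, and an in-memory buffer is used in place of a real file.

```python
import io
import numpy as np

# Stand-in for np.array(xyList[0]): rows of (time, LE11) for one node.
LE11_1 = np.array([(0.0, 0.0), (0.5, 0.0012), (1.0, 0.0031)])

buffer = io.StringIO()  # a file name such as "LE11_node1.csv" works the same way
np.savetxt(buffer, LE11_1, delimiter=",", header="time,LE11", comments="")
print(buffer.getvalue().splitlines()[0])  # time,LE11
```

Passing comments="" keeps the header line free of the default "# " prefix, which makes the file easier to open in a spreadsheet.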
Can anyone tell me how I can write the output of my Fortran program in CSV format, so that I can open the CSV file in Excel to plot the data?
A slightly simpler version of the write statement could be:
write (1, '(1x, F, 3(",", F))') a(1), a(2), a(3), a(4)
Of course, this only works if your data is numeric or easily repeatable. You can leave the formatting to your spreadsheet program or be more explicit here.
I'd also recommend the csv_file module from FLIBS. Fortran is well equipped to read csv files, but not so much to write them. With the csv_file module, you put
use csv_file
at the beginning of your function/subroutine and then call it with:
call csv_write(unit, value, advance)
where unit = the file unit number, value = the array or scalar value you want to write, and advance = .true. or .false. depending on whether you want to advance to the next line or not.
Sample program:
program write_csv
use csv_file
implicit none
integer :: a(3), b(2)
open(unit=1,file='test.txt',status='unknown')
a = (/1,2,3/)
b = (/4,5/)
call csv_write(1,a,.true.)
call csv_write(1,b,.true.)
end program
output:
1,2,3
4,5
If you instead just want to use the write statement, I think you have to do it like this:
write(1,'(I1,A,I1,A,I1)') a(1),',',a(2),',',a(3)
write(1,'(I1,A,I1)') b(1),',',b(2)
which is very convoluted and requires you to know the maximum number of digits your values will have.
I'd strongly suggest using the csv_file module. It's certainly saved me many hours of frustration.
The Intel and gfortran (5.5) compilers recognize:
write(unit,'(*(G0.6,:,","))')array or data structure
which doesn't have excess blanks, and the line can have more than 999 columns.
To remove excess blanks with F95, first write into a character buffer and then use your own CSV_write program to take out the excess blanks, like this:
write(Buf,'(999(G21.6,:,","))')array or data structure
call CSV_write(unit,Buf)
You can also use
write(Buf,*)array or data structure
call CSV_write(unit,Buf)
where your CSV_write program replaces whitespace with "," in Buf. This is problematic in that it doesn't separate character variables unless there are extra blanks (i.e. 'a ','abc ' is OK).
I thought a full simple example without any other library might help. I assume you are working with matrices, since you want to plot from Excel (in any case it should be easy to extend the example).
tl;dr
Print one row at a time in a loop using the format format(1x, *(g0, ", "))
Full story
The purpose of the code below is to write in CSV format (that you can easily import in Excel) a (3x4) matrix.
The important line is the one labeled 101. It sets the format.
program testcsv
IMPLICIT NONE
INTEGER :: i, nrow
REAL, DIMENSION(3,4) :: matrix
! Create a sample matrix
matrix = RESHAPE(source = (/1,2,3,4,5,6,7,8,9,10,11,12/), &
shape = (/ 3, 4 /))
! Store the number of rows
nrow = SIZE(matrix, 1)
! Formatting for CSV
101 format(1x, *(g0, ", "))
! Open connection (i.e. create file where to write)
OPEN(unit = 10, access = "sequential", action = "write", &
status = "replace", file = "data.csv", form = "formatted")
! Loop across rows
do i=1,nrow
WRITE(10, 101) matrix(i,:)
end do
! Close connection
CLOSE(10)
end program testcsv
We first create the sample matrix. Then we store the number of rows in the variable nrow (this is useful when you are not sure of the matrix's dimensions beforehand). Skip the format statement for a moment. What we do next is open (create or replace) the CSV file, named data.csv. Then we loop over the rows of the matrix (do statement) to write one row at a time (write statement) to the CSV file; rows will be appended one after another.
In more detail, the write statement works like this: WRITE(U,FMT) WHAT. We write WHAT (the i-th row of the matrix: matrix(i,:)) to connection U (the one we created with the open statement), formatting WHAT according to FMT.
Note that in the example FMT=101, and 101 is the label of our format statement:
format(1x, *(g0, ", "))
What this does is: "1x" inserts a white space at the beginning of the row; the "*" is used for unlimited format repetition, which means that the format in the following parentheses is repeated for all the data left in the object we are printing (i.e. all elements in the matrix's row). Thus, each number in the row is formatted as: 'g0, ", "'.
g is a general format descriptor that handles floats as well as characters, logicals and integers; the trailing 0 basically means: "use the least amount of space needed to contain the object to be formatted" (this avoids unnecessary spaces). Then, after the formatted number, we require the comma plus a space: ", ". This produces our comma-separated values for a row of the matrix (you can use other separators instead of "," if you need). We repeat for every row and that's it.
(The spaces in the format are not really needed, so one could use format(*(g0,",")).)
Reference: Metcalf, M., Reid, J., & Cohen, M. (2018). Modern Fortran Explained: Incorporating Fortran 2018. Oxford University Press.
Ten seconds' work with a search engine finds me the FLIBS library, which includes a module called csv_file that will write strings, scalars and arrays out in CSV format.