Python PyVisa convert queried binary data to ascii data - python-3.x

I'm currently using a Keysight VNA and I control it with PyVISA. Since I have a rapidly changing system, I want to query binary data instead of ASCII data from the machine, as it is about 10 times faster. The issue I'm having is converting that data back to ASCII.
Minimal example code:
import pyvisa as visa
import numpy as np

device_address = 'TCPIP0::localhost::hislip1,4880::INSTR'
rm = visa.ResourceManager('C:\\Windows\\System32\\visa32.dll')
device = rm.open_resource(device_address)

# presetting device for SNP data measurement
# ...

device.query_ascii_values('CALC:DATA:SNP? 2', container=np.ndarray)  # works great but is slow
device.write('FORM:DATA REAL,64')
device.query_binary_values('CALC:DATA:SNP? 2', container=np.ndarray)  # 10 times faster, but how do I read the data?
The official docs on querying binary values don't tell me anything about this. I found the query functions on GitHub here and some helper functions for converting data here, but I am still unable to convert the data so that it matches what I get from the ASCII query command. If possible I would like to keep the container=np.ndarray argument.
Functions from the last link that I have tested:
bin_data = device.query_binary_values('CALC:DATA:SNP? 2', container = np.ndarray)
num = from_binary_block(bin_data) # "Convert a binary block into an iterable of numbers."
ascii_data = to_ascii_block(num) # "Turn an iterable of numbers in an ascii block of data."
but the data from query_ascii_values and the values of ascii_data don't match. Any help is highly appreciated.
Edit:
With the following code
device.write("SENS:SWE:POIN 5")
data_bin = device.query_binary_values('CALC:DATA? SDATA', container=np.ndarray)
I got
data_bin = array([-5.0535379e-34, 1.3452465e-43, -1.7349754e+09, 1.3452465e-43,
-8.6640313e+22, 8.9683102e-44, 5.0314407e-06, 3.1389086e-43,
4.8143607e-36, 3.1389086e-43, -4.1738553e-12, 1.3452465e-43,
-1.5767541e+11, 8.9683102e-44, -2.8241991e+32, 1.7936620e-43,
4.3024710e+16, 1.3452465e-43, 2.1990014e+07, 8.9683102e-44],
dtype=float32)
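For what it's worth, the values above look like a datatype mismatch rather than corrupted data: query_binary_values unpacks 4-byte floats ('f') by default, while FORM:DATA REAL,64 makes the instrument send 8-byte doubles. Below is a minimal sketch of matching the two sides, reusing device and np from the example above and assuming the VNA sends IEEE 64-bit reals (the byte-order command and its argument are assumptions to be checked against the instrument manual):

device.write('FORM:DATA REAL,64')    # instrument sends 8-byte doubles
device.write('FORM:BORD SWAP')       # assumed byte-order command: little-endian ("swapped")

# datatype='d' tells PyVISA to unpack 8-byte doubles instead of the default
# 4-byte floats; is_big_endian must agree with the byte order set above.
data_bin = device.query_binary_values('CALC:DATA? SDATA',
                                      datatype='d',
                                      is_big_endian=False,
                                      container=np.ndarray)

With the width and byte order consistent on both sides, the binary query should give the same numbers as query_ascii_values, already packed into the ndarray.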

Related

Reading a set of HDF5 files and then slicing the resulting datasets without storing them in the end

I think part of my question is answered here, but the difference in my case is that I'm wondering whether it is possible to do the slicing step without having to re-write the datasets to another file first.
Here is the code that reads in a single HDF5 file that is given as an argument to the script:
with h5py.File(args.H5file, 'r') as df:
    print('Here are the keys of the input file\n', df.keys())
    # interesting point here: you need the [:] behind each of these and we didn't need it when
    # creating datasets not using the 'with' formalism above. Adding that even handled the cases
    # in the 'hits' and 'truth_hadrons' where there are additional dimensions...go figure.
    jetdset = df['jets'][:]
    haddset = df['truth_hadrons'][:]
    hitdset = df['hits'][:]
Then later I do some slicing operations on these datasets.
Ideally I'd be able to pass a wild-card into args.H5file and then the whole set of files, all with the same data formats, would end up in the three datasets above.
I do not want to store or make persistent these three datasets at the end of the script as the output are plots that use the information in the slices.
Any help would be appreciated!
There are at least 2 ways to access multiple files:
1. If all files follow a naming pattern, you can use the glob module. It uses wildcards to find files. (Note: I prefer glob.iglob; it is an iterator that yields values without creating a list. glob.glob creates a list, which you frequently don't need.)
2. Alternatively, you could input a list of filenames and loop over the list.
Example of iglob:
import glob

for fname in glob.iglob('img_data_0?.h5'):
    with h5py.File(fname, 'r') as h5f:
        print('Here are the keys of the input file\n', h5f.keys())
Example with a list of names:
filenames = ['img_data_01.h5', 'img_data_02.h5', 'img_data_03.h5']
for fname in filenames:
    with h5py.File(fname, 'r') as h5f:
        print('Here are the keys of the input file\n', h5f.keys())
Next, your code mentions using [:] when you access a dataset. Whether or not you need to add indices depends on the object you want returned.
If you include [()], it returns the entire dataset as a numpy array. Note [()] is now preferred over [:]. You can use any valid slice notation, e.g., [0,0,:] for a slice of a 3-axis array.
If you don't include [:], it returns an h5py dataset object, which behaves like a numpy array. (For example, you can get dtype and shape, and slice the data.) The advantage? It has a smaller memory footprint. I use h5py dataset objects unless I specifically need an array (for example, when passing image data to another package).
Examples of each method:
jets_dset = h5f['jets'] # w/out [()] returns a h5py dataset object
jets_arr = h5f['jets'][()] # with [()] returns a numpy array object
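A small illustration of the practical difference, using one of the file names from the examples above (note the dataset object is only usable while the file is open):

import h5py

with h5py.File('img_data_01.h5', 'r') as h5f:
    jets_dset = h5f['jets']          # lazy h5py dataset object, nothing loaded yet
    print(jets_dset.shape, jets_dset.dtype)
    first_rows = jets_dset[0:10]     # only these rows are read from disk
    jets_arr = h5f['jets'][()]       # the whole dataset is loaded into memory
# outside the 'with' block, jets_dset can no longer be read, but jets_arr can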
Finally, if you want to create a single array that merges the values of one dataset from all 3 files, you have to create an array big enough to hold the data, then load it with slice notation. Alternatively, you can use np.concatenate() (but be careful: concatenating a lot of data can be slow).
A simple example of each method is shown below. It assumes you know the shape of the dataset and that it is the same for all 3 files (a0, a1 are the axis lengths for one dataset). If you don't know them, you can get them from the .shape attribute.
Example for method 1 (pre-allocating array jets3x_arr):
a0, a1 = 100, 100
jets3x_arr = np.empty(shape=(a0, a1, 3))  # add dtype= if not float
for cnt, fname in enumerate(glob.iglob('img_data_0?.h5')):
    with h5py.File(fname, 'r') as h5f:
        jets3x_arr[:, :, cnt] = h5f['jets']
Example for method 2 (using np.concatenate()):
a0, a1 = 100, 100
for cnt, fname in enumerate(glob.iglob('img_data_0?.h5')):
    with h5py.File(fname, 'r') as h5f:
        if cnt == 0:
            jets3x_arr = h5f['jets'][()].reshape(a0, a1, 1)
        else:
            jets3x_arr = np.concatenate(
                (jets3x_arr, h5f['jets'][()].reshape(a0, a1, 1)), axis=2)
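Coming back to the original goal (reading all three datasets from a whole set of files selected by a wildcard, without writing anything back out), a minimal sketch combining the pieces above might look like the following; the wildcard pattern is hypothetical and the dataset names are the ones from the question:

import glob
import h5py
import numpy as np

jets_list, had_list, hit_list = [], [], []
for fname in glob.iglob('my_data_*.h5'):            # hypothetical pattern
    with h5py.File(fname, 'r') as h5f:
        jets_list.append(h5f['jets'][()])
        had_list.append(h5f['truth_hadrons'][()])
        hit_list.append(h5f['hits'][()])

# merge along the first axis (assumes compatible shapes across files);
# use np.stack instead if you want a separate file axis
jetdset = np.concatenate(jets_list, axis=0)
haddset = np.concatenate(had_list, axis=0)
hitdset = np.concatenate(hit_list, axis=0)
# ...slice and plot here; nothing is written back to disk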

Stuck using pandas to build RPG item generator

I am trying to build a simple random item generator for a game I am working on.
So far I am stuck trying to figure out how to store and access all of the data. I went with pandas using .csv files to store the data sets.
I want to add weighted probabilities to what items are generated so I tried to read the csv files and compile each list into a new set.
I got the program to pick a random set but got stuck when trying to pull a random row from that set.
I am getting an error when I use .sample() to pull the item row which makes me think I don't understand how pandas works. I think I need to be creating new lists so I can later index and access the various statistics of the items once one is selected.
Once I pull the item I intend to add effects that change the displayed damage, armor and so on. So I was thinking of having the new item be its own list and then using damage = item[2] + 3 or whatever I need.
error is: AttributeError: 'list' object has no attribute 'sample'
Can anyone help with this problem? Maybe there is a better way to set up the data?
here is my code so far:
import pandas as pd
import random

df = [pd.read_csv('weapons.csv'), pd.read_csv('armor.csv'), pd.read_csv('aether_infused.csv')]

def get_item():
    # this part seemed to work: when I printed item_class it printed one of the entire lists at the correct odds
    item_class = [random.choices(df, weights=(45, 40, 15), k=1)]
    item = item_class.sample()
    print(item)  # to see if the program is working

get_item()
I think you are getting slightly confused with lists vs list elements. This should work. I stubbed your dfs with simple ones
import pandas as pd
import random

# Actual data. Comment it out if you do not have the csv files
df = [pd.read_csv('weapons.csv'), pd.read_csv('armor.csv'), pd.read_csv('aether_infused.csv')]

# My stubs -- uncomment and use this instead of the line above if you want to run this specific example
# df = [pd.DataFrame({'weapons': ['w1', 'w2']}), pd.DataFrame({'armor': ['a1', 'a2', 'a3']}), pd.DataFrame({'aether': ['e1', 'e2', 'e3', 'e4']})]

def get_item():
    # I removed [] from the line below -- choices() already returns a list of length 1
    item_class = random.choices(df, weights=(45, 40, 15), k=1)
    # I added [0] to choose the first element of item_class, which is a list of length 1 from the line above
    item = item_class[0].sample()
    print(item)  # to see if the program is working

get_item()
This prints random rows from the random dataframes that I set up, such as:
weapons
1 w2
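As a follow-up on the damage = item[2] + 3 idea from the question, here is a hedged sketch (reusing df and random from the answer above) of how you might read and tweak a stat from the sampled row; the 'damage' column name is hypothetical and should be replaced with whatever headers the CSV files actually use:

def get_item():
    item_class = random.choices(df, weights=(45, 40, 15), k=1)
    item = item_class[0].sample().iloc[0]       # one row as a pandas Series (a copy)
    if 'damage' in item.index:                  # hypothetical column name
        item['damage'] = item['damage'] + 3     # apply an effect to the copied row
    return item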

How to load a big numpy array from a text file with Dask?

I have a text file containing data that I read to memory with numpy.genfromtxt enforcing a custom numpy.dtype. Although the text file is smaller than the available RAM, I often get a MemoryError (which I don't understand, but it is not the point of this question). When looking for ways to resolve it, I came across dask. In the API I found methods for data loading but none of them reads from text files, not to mention my need to support converters in genfromtxt().
I see there is a dask.dataframe.read_csv() method, but in my case I don't use pandas, but rather a plain numpy.array with custom dtypes and column names, as mentioned above. The text file I have is not CSV anyway (hence the above-mentioned use of converters in genfromtxt()).
Any ideas on how I could handle this will be appreciated.
You should use the function dask.bytes.read_bytes with delimiter="\n" to read your file(s) and split them into blocks at line-endings. You get back a set of dask.delayed objects, which you can pass to numpy. Unfortunately, numpy wants a file-like, so you must pack the bytes again:
import io
import numpy
import dask
import dask.array as da

_, blocks = dask.bytes.read_bytes(files, delimiter="\n")

@dask.delayed
def parse(block):
    return numpy.genfromtxt(io.BytesIO(block), ...)

arrays = [da.from_delayed(parse(block), ...) for block in blocks]
arr = da.stack(arrays)  # or da.concatenate(arrays), depending on the shapes
SO editors rejected my edit to @mdurant's answer, so I post the working code (based on that answer) here:
import numpy
import dask
import dask.array as da
import io

fname = 'data.txt'
# data.txt is:
# 1 2
# 3 4
# 5 6
files = [fname]
_, blocks = dask.bytes.read_bytes(files, delimiter="\n")

my_type = numpy.dtype([
    ('f1', numpy.float64),
    ('f2', numpy.float64)
])
native_type = numpy.float64
used_type = numpy.float64
# If the below line is uncommented, then creating the dask array will work, but it won't
# be possible to perform any operations on it
# used_type = my_type

# Debug
# print('blocks', blocks)
# print('type(blocks)', type(blocks))
# print('blocks[0]', blocks[0])
# print('type(blocks[0])', type(blocks[0]))

@dask.delayed
def parse(block):
    r = numpy.genfromtxt(io.BytesIO(block[0]))
    print('parse() about to return:\n', r, '\n')
    return r

# Below I added shape, which seems compulsory, the reason for which I don't understand
arrays = [da.from_delayed(value=parse(block), shape=(3, ), dtype=used_type) for block in blocks]
# da.concatenate did not work for me
arr = da.stack(arrays)

# The below will not work if used_type is set to my_type
arr += 1
# Neither would the below work; it raises NotImplementedError
# arr['f1'] += 1

arr_np = arr.compute()
print('numpy array incremented by one: \n', arr_np)
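Since the original question mentions needing a custom dtype and converters in genfromtxt(), those keyword arguments can simply be passed inside the delayed parse() function. A sketch under that assumption, reusing the imports and blocks from the script above; the converter on column 1 is purely illustrative and not taken from the original post:

@dask.delayed
def parse(block):
    # Any genfromtxt keyword (dtype, names, converters, ...) can be used here;
    # the converter below is only an example.
    return numpy.genfromtxt(
        io.BytesIO(block[0]),
        dtype=numpy.float64,
        converters={1: lambda s: float(s) * 2.0},
    )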

pandas reading data from a column as float or int and not str despite dtype setting

I have an issue with pandas (0.23.4) on Python 3.7 where the data is being read in as scientific notation instead of just a string, despite setting the dtype option. Here is an example of the data that is being read in:
-------------------
codes
-------------------
001234544
00023455
123456789
A1253532
780E9000
00678E10
The problem comes with rows 5 and 6 of the above: because they contain 'E' characters, I think, they are being turned into scientific notation.
My reader is set up as follows:
accounts = pd.read_excel('gym_accounts.xlsx', sheet_name='Sheet1', dtype=str)
Despite that dtype=str setting, it appears that pandas uses something called a "sniffer" that detects the data type automatically, so the values are being changed back to what I assume is float or int, and then shown in scientific notation. One suggestion in another thread is to use converters within read_csv, like the following:
pd.read_csv('my.csv', converters = {i: str for i in range(0, 100)})
I am curious whether this is a possible solution to my problem, but I also have no idea how long that range should be, as it changes often. Is there any way to query the number of columns and feed that as a variable into that range call?
It looks like I can do something like len(accounts.index), but I can't do this until after the reader has read the file, so something like the below doesn't work:
accounts = pd.read_excel('gym_accounts.xlsx', sheet_name='Sheet1', converters={i: str for i in range(0, gym_length)})
gym_length = len(accounts.index)
The length check comes after the, I guess you'd call it, data reader, so obviously it doesn't work.
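One way around that chicken-and-egg problem, sketched under the assumption that your pandas version's read_excel accepts the nrows argument, is to read only the header first to learn the column names and then build the converters dict from them:

import pandas as pd

path = 'gym_accounts.xlsx'
# First pass: read the header only, just to discover the columns.
header = pd.read_excel(path, sheet_name='Sheet1', nrows=0)
# Second pass: force every column to be read as a string.
accounts = pd.read_excel(path, sheet_name='Sheet1',
                         converters={col: str for col in header.columns})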

Abaqus Python script -- Reading 'TENSOR_3D_FULL' data from *.odb file

What I want: strain values LE11, LE22, LE12 at nodal points
My script is:
#!/usr/local/bin/python
# coding: latin-1

# making the ODB commands available to the script
from odbAccess import *
import sys
import csv

odbPath = "my *.odb path"
odb = openOdb(path=odbPath)
assembly = odb.rootAssembly

# count the number of frames
NumofFrames = 0
for v in odb.steps["Step-1"].frames:
    NumofFrames = NumofFrames + 1

# create a variable that refers to the reference (undeformed) frame
refFrame = odb.steps["Step-1"].frames[0]

# create a variable that refers to the node set 'Region Of Interest (ROI)'
ROINodeSet = odb.rootAssembly.nodeSets["ROI"]

# create a variable that refers to the reference coordinate 'REFCOORD'
refCoordinates = refFrame.fieldOutputs["COORD"]

# create a variable that refers to the coordinates of the node
# set in the test frame of the step
ROIrefCoords = refCoordinates.getSubset(region=ROINodeSet, position=NODAL)

# count the number of nodes
NumofNodes = 0
for v in ROIrefCoords.values:
    NumofNodes = NumofNodes + 1

# looping over all the frames in the step
for i1 in range(NumofFrames):
    # create a variable that refers to the current frame
    currFrame = odb.steps["Step-1"].frames[i1+1]
    # create a variable that refers to the strain 'LE'
    Str = currFrame.fieldOutputs["LE"]
    ROIStr = Str.getSubset(region=ROINodeSet, position=NODAL)
    # initialize list
    list = [[]]
    # loop over all the nodes in each frame
    for i2 in range(NumofNodes):
        strain = ROIStr.values[i2]
        list.insert(i2, [str(strain.dataDouble[0]) + ";" + str(strain.dataDouble[1]) +
                         ";" + str(strain.dataDouble[3])])
    # write the list in a new *.csv file (code not included for brevity)

odb.close()
The error I get is:
strain = ROIStr.values [i2]
IndexError: Sequence index out of range
Additional info:
Details for ROIStr:
ROIStr.name
'LE'
ROIStr.type
TENSOR_3D_FULL
ROIStr.description
'Logarithmic strain components'
ROIStr.componentLabels
('LE11', 'LE22', 'LE33', 'LE12', 'LE13', 'LE23')
ROIStr.getattribute
'getattribute of openOdb(r'path to .odb').steps['Step-1'].frames[1].fieldOutputs['LE'].getSubset(position=INTEGRATION_POINT, region=openOdb(r'path to.odb').rootAssembly.nodeSets['ROI'])'
When I use the same code for VECTOR objects, like 'U' for nodal displacement or 'COORD' for nodal coordinates, everything works without a problem.
The error happens in the first loop iteration, so it is not a case where it cycles through several loops before the error occurs.
Question: Does anyone know what is causing the error in the above code?
Here is the reason you get an IndexError: strains are (obviously) calculated at the integration points. According to the ABQ Scripting Reference Guide:
A SymbolicConstant specifying the position of the output in the element. Possible values are:
NODAL, specifying the values calculated at the nodes.
INTEGRATION_POINT, specifying the values calculated at the integration points.
ELEMENT_NODAL, specifying the values obtained by extrapolating results calculated at the integration points.
CENTROID, specifying the value at the centroid obtained by extrapolating results calculated at the integration points.
In order to use your code, therefore, you should get the results using position=ELEMENT_NODAL:
ROIrefCoords = refCoordinates.getSubset(region=ROINodeSet, position=ELEMENT_NODAL)
With
ROIStr.values[0].data
You will then get an array containing the 6 independent components of your tensor.
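Putting that together with the componentLabels listed above ('LE11', 'LE22', 'LE33', 'LE12', 'LE13', 'LE23'), a sketch of the inner loop from the question that extracts LE11, LE22 and LE12 could look like this (using .data; switch to .dataDouble if your odb actually stores double-precision values):

ROIStr = Str.getSubset(region=ROINodeSet, position=ELEMENT_NODAL)
rows = []
for v in ROIStr.values:
    le11, le22, le12 = v.data[0], v.data[1], v.data[3]   # indices follow componentLabels
    rows.append(str(le11) + ";" + str(le22) + ";" + str(le12))

Note that at ELEMENT_NODAL there is typically one value per element/node pair, so len(ROIStr.values) can be larger than the number of nodes in the set.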
Alternative Solution
For reading a time series of results for a node set, you can use the function xyPlot.xyDataListFromField(). I noticed that this function is much faster than reading the odb directly. The code is also shorter; the only drawback is that you have to have an Abaqus license to use it (in contrast to reading the odb with abaqus python, which only needs an installed version of Abaqus and does not take a network license).
For your application, you should do something like:
from abaqus import *
from abaqusConstants import *
from abaqusExceptions import *
import visualization
import xyPlot
import displayGroupOdbToolset as dgo
results = session.openOdb(your_file + '.odb')
# without this, you won't be able to extract the results
session.viewports['Viewport: 1'].setValues(displayedObject=results)
xyList = xyPlot.xyDataListFromField(
    odb=results, outputPosition=NODAL,
    variable=(('LE', INTEGRATION_POINT,
               ((COMPONENT, 'LE11'), (COMPONENT, 'LE22'),
                (COMPONENT, 'LE33'), (COMPONENT, 'LE12'))), ),
    nodeSets=('ROI', ))
(Of course you have to add LE13 etc.)
You will get a list of xyData
type(xyList[0])
<type 'xyData'>
containing the desired data for each node and each requested output. Its size will therefore be
len(xyList)
number_of_nodes*number_of_requested_outputs
where the first number_of_nodes elements of the list are LE11 at each node, then LE22, and so on.
You can then transform this in a NumPy array:
LE11_1 = np.array(xyList[0])
would be LE11 at the first node, with dimensions:
LE11_1.shape
(NumberTimeFrames, 2)
That is, for each time step you have time and output variable.
NumPy arrays are also very easy to write on text files (check out numpy.savetxt).
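For example, a short sketch of dumping one of those arrays to a text file (the output file name is arbitrary):

import numpy as np

LE11_1 = np.array(xyList[0])          # column 0: time, column 1: LE11 at the first node
np.savetxt('LE11_node1.txt', LE11_1, header='time LE11', comments='# ')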
