How to iterate over column names with PyTable?

I have a large matrix (15000 rows x 2500 columns) stored using PyTables and cannot see how to iterate over the columns of a row. In the documentation I only see how to access each column by name manually.
I have columns like:
ID
X20160730_Day10_123a_2
X20160730_Day10_123b_1
X20160730_Day10_123b_2
The ID column value is a string like '10692.RFX7' but all other cell values are floats. This selection works and I can iterate the rows of results but I cannot see how to iterate over the columns and check their values:
from tables import *
import numpy
def main():
    h5file = open_file('carlo_seth.h5', mode='r', title='Three-file test')
    table = h5file.root.expression.readout
    condition = '(ID == b"10692.RFX7")'
    for row in table.where(condition):
        print(row['ID'].decode())
        for col in row.fetch_all_fields():
            print("{0}\t{1}".format(col, row[col]))
    h5file.close()
if __name__ == '__main__':
    main()
If I just iterate with "for col in row", nothing happens. With the code as written above, I get this stack trace:
10692.RFX7
Traceback (most recent call last):
File "tables/tableextension.pyx", line 1497, in tables.tableextension.Row.__getitem__ (tables/tableextension.c:17226)
KeyError: b'10692.RFX7'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "tables/tableextension.pyx", line 126, in tables.tableextension.get_nested_field_cache (tables/tableextension.c:2532)
KeyError: b'10692.RFX7'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./read_carlo_pytable.py", line 31, in <module>
main()
File "./read_carlo_pytable.py", line 25, in main
print("{0}\t{1}".format(col, row[col]))
File "tables/tableextension.pyx", line 1501, in tables.tableextension.Row.__getitem__ (tables/tableextension.c:17286)
File "tables/tableextension.pyx", line 133, in tables.tableextension.get_nested_field_cache (tables/tableextension.c:2651)
File "tables/utilsextension.pyx", line 927, in tables.utilsextension.get_nested_field (tables/utilsextension.c:8707)
AttributeError: 'numpy.bytes_' object has no attribute 'encode'
Closing remaining open files:carlo_seth.h5...done

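Iterating over row.fetch_all_fields() yields the cell values, not the column names, so row[col] ends up looking for a column called b'10692.RFX7'. You can access a column value by name in each row:
for row in table:
    print(row["ID"])
Iterate over all columns:
names = table.coldescrs.keys()
for row in table:
    for name in names:
        print(name, row[name])
table.colnames also holds the top-level column names as a list.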
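To see why looping over the fetched row gives values rather than names, here is a minimal sketch. It assumes fetch_all_fields() hands back a NumPy record (the numpy.bytes_ in the traceback suggests so). The array below is a hypothetical stand-in with made-up numbers, not data read from carlo_seth.h5.

```python
import numpy as np

# Hypothetical stand-in for one table row: a NumPy record with named fields.
dtype = [
    ("ID", "S16"),
    ("X20160730_Day10_123a_2", "f8"),
    ("X20160730_Day10_123b_1", "f8"),
]
record = np.array([(b"10692.RFX7", 1.5, 2.25)], dtype=dtype)[0]

# Iterating the record yields the cell values, starting with the ID bytes.
values = list(record)

# The field names live on the dtype, so iterate those and look each one up.
names = record.dtype.names
pairs = {name: record[name] for name in names}
for name, val in pairs.items():
    print(name, val)
```

With a real table, the same idea applies: take the names from the table's column metadata, then index the row by each name.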

Related

Openpyxl recognize data: TypeError: 'method' object is not subscriptable and is not a valid coordinate or range error

I'm new to openpyxl. I need to copy several columns from one file and paste them into another file that has the same columns.
I'm starting my code, but getting an error:
file1 = load_workbook('PRODUCTION.xlsx')
ws = file1.active
column = ws.cell['ID']
print (column)
I get this error:
Traceback (most recent call last):
  File "c:\Users\Ana\Documents\PRODUCTION Project\production.py", line 29, in <module>
    column = ws.cell['ID']
TypeError: 'method' object is not subscriptable
And when I tried only column = ws ['ID']
I get:
Traceback (most recent call last):
  File "c:\Users\Ana\Documents\PRODUCTION Project\production.py", line 29, in <module>
    column = ws ['ID']
  File "C:\Users\Ana\AppData\Local\Programs\Python\Python310\lib\site-packages\openpyxl\worksheet\worksheet.py", line 290, in __getitem__
    min_col, min_row, max_col, max_row = range_boundaries(key)
  File "C:\Users\Ana\AppData\Local\Programs\Python\Python310\lib\site-packages\openpyxl\utils\cell.py", line 135, in range_boundaries
    raise ValueError(msg)
ValueError: ID is not a valid coordinate or range
Thanks in advance.

How to call VBA WorksheetFunction.Match using xlwings worksheet.api?

I have an Excel table from which I need to look up a specific value in column A and want to get the row number. For example, column A in the Excel sheet contains numbers from 1 to 50 and cell B2=10.
I have tried to call WorksheetFunction.Match(arg1,arg2,arg3) (https://learn.microsoft.com/en-us/dotnet/api/microsoft.office.interop.excel.worksheetfunction.match?view=excel-pia#Microsoft_Office_Interop_Excel_WorksheetFunction_Match_System_Object_System_Object_System_Object_)
using xlwings worksheet.api, but I get an "AttributeError: .WorksheetFunction" when using it in Python (WorksheetFunction.Match() works fine in VBA).
import xlwings as xw
wb=xw.Book(r'test.xlsm')
ws=wb.sheets['Test']
row_number=ws.api.WorksheetFunction.Match(Range("B2").Value, Range("A1:A50"), 0)
In this example I expect to get row_number=10, but instead get:
Traceback (most recent call last):
File "C:\tools\anaconda3\5.3.0\envs\nilspy\lib\site-packages\xlwings\_xlwindows.py", line 117, in __getattr__
v = getattr(self._inner, item)
File "C:\tools\anaconda3\5.3.0\envs\nilspy\lib\site-packages\win32com\client\dynamic.py", line 527, in __getattr__
raise AttributeError("%s.%s" % (self._username_, attr))
AttributeError: <unknown>.WorksheetFunction
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\tools\anaconda3\5.3.0\envs\nilspy\lib\site-packages\IPython\core\interactiveshell.py", line 2961, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-24-2c337b4c3c07>", line 5, in <module>
ws.api.WorksheetFunction.Match(Range("B2").Value, Range("A1:A50"), 0)
File "C:\tools\anaconda3\5.3.0\envs\nilspy\lib\site-packages\xlwings\_xlwindows.py", line 137, in __getattr__
self._oleobj_.GetIDsOfNames(0, item)
pywintypes.com_error: (-2147352570, 'Unknown name.', None, None)
I'm grateful for all help!
Edit: The last row in the code should probably refer to ws.range, like this:
import xlwings as xw
wb=xw.Book(r'test.xlsm')
ws=wb.sheets['Test']
ws.api.WorksheetFunction.Match(ws.range('B2').value, ws.range('A1:A50'), 0)
However, it results in the same error.

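You could use find instead (I can't find any documentation on calling worksheet functions through xlwings). Note that in the Excel object model WorksheetFunction is a member of Application, not of Worksheet, which would explain the "Unknown name" error from ws.api.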
ws = wb.sheets['Test']
search_range = ws.range("A1:A" + str(ws.range("A" + str(ws.cells.rows.count)).end('up').row))
search_range.api.find(ws.range("B2").value).row
The value may not be present, so it is safer to assign the result of find to a variable first and test whether it is None:
found = search_range.api.find(ws.range("B2").value)
r = found.row if found is not None else '-999' #some value to indicate not found

How to split strings from a CSV column into a list?

I want to make a list out of the words of a CSV column called 'text' that is composed of strings
This is what I have:
def html_words():
    legits_text = pd.read_csv('/Users/pmpilla/Documents/phishing/html_text.csv', delimiter=',')
    list_text = legits_text["text"].split(" ")
This is the error that I am getting:
Traceback (most recent call last):
File "/Users/pmpilla/Documents/phishing/html_words/legit_path_words.py", line 70, in <module>
html_words()
File "/Users/pmpilla/Documents/phishing/html_words/legit_path_words.py", line 30, in html_words
list_text = legits_text["text"].split(" ")
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/generic.py", line 3614, in __getattr__
return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute 'split'

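A pandas Series has no split method of its own; string methods are reached through the .str accessor:
list_text = legits_text["text"].str.split(" ")
This returns a Series holding one list of words per row. Pass expand=True if you want the words spread across new columns instead of lists.
Refer: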
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.split.html
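A minimal sketch of the .str.split approach, using a small made-up DataFrame in place of html_text.csv (the two sample strings are assumptions for illustration only). It also shows one way to flatten the per-row lists into a single list of words, if that is what "a list out of the words" means here.

```python
import pandas as pd

# Hypothetical stand-in for the CSV's "text" column.
legits_text = pd.DataFrame({"text": ["verify your account", "login now"]})

# .str.split returns a Series with one list of words per row.
per_row = legits_text["text"].str.split(" ")

# Flatten into one list of words; dropna() skips rows with missing text.
words = [word for row in per_row.dropna() for word in row]
print(words)
```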

Itertuples() in Tkinter function reveals an AttributeError: 'tuple' object has no attribute 'A'

Within a Tkinter function, I need to build a list named 'value' that holds every 10th value of the DataFrame column df['A'].
The following for-loop works perfectly outside a Tkinter function:
value = []; i = 0
for row in df.itertuples():
    i = 1 + i
    if i == 10:
        value_app = row.A
        value.append(value_app)
        i=0
However within Tkinter function I have the following error:
Exception in Tkinter callback
Traceback (most recent call last):
File "/Users/anaconda/lib/python3.6/tkinter/__init__.py", line 1699, in __call__
return self.func(*args)
File "<ipython-input-1-38aed24ba6fc>", line 4174, in start
dfcx = self.mg(a,b,c,d,e)
File "<ipython-input-1-38aed24ba6fc>", line 4093, in mg
value_app = r.A
AttributeError: 'tuple' object has no attribute 'A'
A similar for-loop structure is running in another part of the same Tkinter function and is executed without errors.

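If A is your first column, you can index the tuple by position. Element 0 is the index when itertuples() is called with its default index=True, so the first column is element 1:
value_app = row[1]
I had the same problem. The rows are coming back as regular tuples rather than namedtuples; on Python versions before 3.7, the pandas documentation describes this fallback for DataFrames with more than 254 columns.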
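A minimal sketch of a position-based lookup that does not depend on namedtuple attributes. The DataFrame here is made up for illustration, and name=None is passed only to force plain tuples so the behaviour matches the error in the question.

```python
import pandas as pd

# Hypothetical data: 30 rows, with "A" deliberately not the first column.
df = pd.DataFrame({"B": range(101, 131), "A": range(1, 31)})

# Position of "A" inside each tuple; +1 because element 0 is the index.
pos = df.columns.get_loc("A") + 1

value = []
for i, row in enumerate(df.itertuples(name=None), start=1):
    if i % 10 == 0:  # every 10th row
        value.append(row[pos])
print(value)
```

If only the column values are needed, slicing the column directly with df["A"].iloc[9::10].tolist() picks the same rows without a loop.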

create a dictionary from a csv file

My CSV file is:
studentid,firstname,midterm,final,total,grade
20135447,Delta,47.00,37.00,65.00,DC
20144521,Jeffrey,36.00,22.00,27.60,FF
I tried this code:
with open('marks.csv') as file:
    line=csv.reader(file)
    mydict={rows[0]:rows[1:] for rows in line}
    print(mydict)
I got the following traceback:
Traceback (most recent call last):
File "", line 3, in
File "", line 3, in
IndexError: list index out of range
But my desired output is:
{20135447:['Delta','47.00','37.00','65.00','DC'], '20144521':['Jeffrey','36.00','22.00','27.60','FF']}
please help me

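import csv
mydic = dict()
with open('marks.csv', newline='') as myfile:
    reader = csv.reader(myfile)
    next(reader, None)  # skip the header row, which is not in the desired output
    for row in reader:
        if row:  # a blank line yields an empty list, and row[0] would raise IndexError
            mydic[row[0]] = row[1:]
print(mydic)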
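The IndexError in the question is what indexing row[0] raises when csv.reader returns an empty list, which it does for a blank line. The sketch below assumes that is the cause in marks.csv; it uses an in-memory copy of the question's rows with a blank line added (that blank line is an assumption) so it can run without the file.

```python
import csv
import io

# Hypothetical file contents: the question's rows plus a trailing blank line.
data = (
    "studentid,firstname,midterm,final,total,grade\n"
    "20135447,Delta,47.00,37.00,65.00,DC\n"
    "20144521,Jeffrey,36.00,22.00,27.60,FF\n"
    "\n"
)

reader = csv.reader(io.StringIO(data))
next(reader, None)  # skip the header row

# "if row" drops the empty list produced by the blank line.
mydict = {row[0]: row[1:] for row in reader if row}
print(mydict)
```

Note that the keys come back as strings; wrap row[0] in int() if integer student IDs are wanted.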
