Python passing a string value from a table to a function - python-3.x

I'm trying to go through a list of items and pass each one to a function, one at a time, to create an Excel file with the same name as the argument passed. I am getting the error below, which I believe is related to the '/' in the string. Can anyone advise how to get around this?
>>> test.createExcel(filename)
Traceback (most recent call last):
File "<pyshell#97>", line 1, in <module>
test.createExcel(filename)
File "C:\Users\danie\OneDrive\JVC\project1.py", line 52, in createExcel
wb2.save(modelname+'.xlsx')
File "C:\Users\danie\AppData\Local\Programs\Python\Python37\lib\site-packages\openpyxl\workbook\workbook.py", line 392, in save
save_workbook(self, filename)
File "C:\Users\danie\AppData\Local\Programs\Python\Python37\lib\site-packages\openpyxl\writer\excel.py", line 291, in save_workbook
archive = ZipFile(filename, 'w', ZIP_DEFLATED, allowZip64=True)
File "C:\Users\danie\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1240, in __init__
self.fp = io.open(file, filemode)
FileNotFoundError: [Errno 2] No such file or directory: '14 A4/32GB BLU.xlsx'

On Windows, a filename cannot contain any of the following characters: \ / : * ? " < > |
In your case, the '/' is treated as a path separator, so you could replace it with '-' (or any other allowed character) using str.replace('/', '-').
For example:
wb.save(filename.replace('/', '-'))
Alternatively, a regular expression can replace all of the invalid characters in one pass.
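A minimal sketch of the regular-expression approach mentioned above. The names `INVALID_CHARS` and `safe_filename` are made up for this illustration, and the choice of '-' as the replacement is just an assumption:

```python
import re

# Characters that Windows does not allow in a file name: \ / : * ? " < > |
INVALID_CHARS = re.compile(r'[\\/:*?"<>|]')

def safe_filename(name, replacement="-"):
    """Return name with every invalid character swapped for replacement."""
    return INVALID_CHARS.sub(replacement, name)

print(safe_filename("14 A4/32GB BLU") + ".xlsx")  # 14 A4-32GB BLU.xlsx
```

The cleaned name could then be passed to the save call, e.g. wb2.save(safe_filename(modelname) + '.xlsx').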

Related

Import Excel xlsx to Python using Panda - Error Message - How to resolve?

import pandas as pd
data = pd.read_excel (r'C:\Users\royli\Downloads\Product List.xlsx',sheet_name='Sheet1' )
df = pd.DataFrame(data, columns= ['Product'])
print (df)
Error Message
Traceback (most recent call last):
File "main.py", line 3, in <module>
data = pd.read_excel (r'C:\Users\royli\Downloads\Product List.xlsx',sheet_name='Sheet1' )
File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/util/_decorators.py", line 296, in wrapper
return func(*args, **kwargs)
File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/io/excel/_base.py", line 304, in read_excel
io = ExcelFile(io, engine=engine)
File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/io/excel/_base.py", line 867, in __init__
self._reader = self._engines[engine](self._io)
File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/io/excel/_xlrd.py", line 22, in __init__
super().__init__(filepath_or_buffer)
File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/io/excel/_base.py", line 353, in __init__
self.book = self.load_workbook(filepath_or_buffer)
File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/io/excel/_xlrd.py", line 37, in load_workbook
return open_workbook(filepath_or_buffer)
File "/opt/virtualenvs/python3/lib/python3.8/site-packages/xlrd/__init__.py", line 111, in open_workbook
with open(filename, "rb") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\royli\\Downloads\\Product List.xlsx'

KeyboardInterrupt

When I run into this problem, I usually change each \ in the path to \\, and that generally solves it. Try it.
I had this problem in Visual Studio Code.
table = pd.read_excel('Sales.xlsx')
When running the program in PyCharm, there were no errors.
When running the same, unchanged program in Visual Studio Code, it raised an error.
To fix it, I had to give the full path to the file, with doubled backslashes (\\). Ex:
table = pd.read_excel('C:\\Users\\paste\\Desktop\\archives\\Sales.xlsx')
I am using PyCharm, and after reviewing the post and replies I was able to get this resolved (thanks very much). I didn't need to specify a worksheet, as there is only one sheet in the Excel file I am reading.
I had to add the r prefix (raw string), and I also removed the drive letter (C:):
data = pd.read_excel(r'\folder\subfolder\filename.xlsx')
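The answers above come down to two things: writing the backslashes so Python keeps them, and making sure the path really exists from where the script runs. Here is a small sketch for checking both before calling pd.read_excel. The helper name `describe_path` is hypothetical, and the idea that the editor's working directory differs is only an assumption about the VS Code case:

```python
from pathlib import Path

def describe_path(path_str):
    """Return (absolute_path, exists) for path_str, resolved against the current working directory."""
    p = Path(path_str)
    absolute = p if p.is_absolute() else Path.cwd() / p
    return absolute, absolute.exists()

# A raw string and a string with doubled backslashes are the same text.
print(r'C:\Users\paste\Desktop\archives\Sales.xlsx' == 'C:\\Users\\paste\\Desktop\\archives\\Sales.xlsx')

# A relative name such as 'Sales.xlsx' is looked up in the current working directory.
absolute, exists = describe_path('Sales.xlsx')
print(absolute, exists)
```

If exists comes back False, the printed absolute path shows where Python was actually looking, which may not be the folder that contains the script.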

ignore missing files in loop - data did not show up

I have thousands of files covering the year range shown below. Files for some dates are missing, so I want to skip over them. But when I try the method below and then refer to data_in, the variable doesn't exist. Any help would be truly appreciated. I am new to Python. Thank you.
path = r'file path here'
DataYears = ['2012','2013','2014', '2015','2016','2017','2018','2019', '2020']
Years = np.float64(DataYears)
NumOfYr = Years.size
DataMonths = ['01','02','03','04','05','06','07','08','09','10','11','12']
daysofmonth=[31,28,31,30,31,30,31,31,30,31,30,31]
for yy in range(NumOfYr):
    for mm in range (12):
        try:
            data_in = pd.read_csv(path+DataYears[yy]+DataMonths[mm]+'/*.dat', skiprows=4, header=None, engine='python')
            print('Reached data_in') # EDIT
            a=data_in[0] #EDIT
        except IOError:
            pass
            #print("File not accessible")
EDIT: Error added
Traceback (most recent call last):
File "Directory/Documents/test.py", line 23, in <module>
data_in = pd.read_csv(path+'.'+DataYears[yy]+DataMonths[mm]+'/*.cod', skiprows=4, header=None, engine='python')
File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 676, in parser_f
return _read(filepath_or_buffer, kwds)
File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 448, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 880, in __init__
self._make_engine(self.engine)
File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 1126, in _make_engine
self._engine = klass(self.f, **self.options)
File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 2269, in __init__
memory_map=self.memory_map,
File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/common.py", line 431, in get_handle
f = open(path_or_buf, mode, errors="replace", newline="")
FileNotFoundError: [Errno 2] No such file or directory: 'Directory/Documents/201201/*.dat'
You can adapt the code below to get a list of your date folders:
import glob
# Gives you a list of your folders with the different dates
folder_names = glob.glob("Directory/Documents/*/")
print(folder_names)
Then, with the list of folders, you can iterate through their contents. If you just want a list of all .dat files, you can do something like:
import glob
# Gives you a list of all .dat files inside the date folders
file_names = glob.glob("Directory/Documents/*/*.dat")
print(file_names)
The code above searches the contents of your directories so you bypass your problem with missing dates. The prints are there so you can see the results of glob.glob().
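If you want to keep the year/month loop from the question, here is a rough sketch of how glob can be combined with it. The function name `find_monthly_files` and its parameters are invented for this example, and it assumes the folders are named like 201201 as in the traceback:

```python
import glob
import os

def find_monthly_files(base_dir, years, months, pattern="*.dat"):
    """Map each 'YYYYMM' folder name to its sorted list of matching files.

    Folders that are missing or contain no matching files are skipped.
    """
    found = {}
    for year in years:
        for month in months:
            folder = os.path.join(base_dir, year + month)
            matches = sorted(glob.glob(os.path.join(folder, pattern)))
            if matches:  # an empty list means nothing was found for that month
                found[year + month] = matches
    return found
```

Each path in the returned lists can then be passed to pd.read_csv one at a time. read_csv does not expand a wildcard like '*.dat' itself, which is why the original call raised FileNotFoundError.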

Uppercases convert to lowercase when loading a file with h5py

Hello, I can't load an HDF5 file with h5py:
$ python verif.py
Traceback (most recent call last):
File "verif.py", line 4, in <module>
h5f = h5py.File("../DeepFISH-Github_projects/DeepFISH/dataset/'+'LowRes_13434_overlapping_pairs.h5",'r')
File "/home/jeanpat/VirtualEnv/venv3/lib/python3.5/site-packages/h5py/_hl/files.py", line 272, in __init__
fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
File "/home/jeanpat/VirtualEnv/venv3/lib/python3.5/site-packages/h5py/_hl/files.py", line 92, in make_fid
fid = h5f.open(name, flags, fapl=fapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-at6d2npe-build/h5py/_objects.c:2684)
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-at6d2npe-build/h5py/_objects.c:2642)
File "h5py/h5f.pyx", line 76, in h5py.h5f.open (/tmp/pip-at6d2npe-build/h5py/h5f.c:1930)
OSError: Unable to open file (Unable to open file: name = '../deepfish-github_projects/deepfish/dataset/'+'lowres_13434_overlapping_pairs.h5', errno = 2, error message = 'no such file or directory', flags = 0, o_flags = 0
The string containing the path to the file:
../DeepFISH-Github_projects/DeepFISH/dataset'+'LowRes_13434_overlapping_pairs.h5
seems to be modified by h5py
../deepfish-github_projects/deepfish/dataset/lowres_13434_overlapping_pairs.h5
I could modify the directory name, but it's weird.
In this line
h5f = h5py.File("../DeepFISH-Github_projects/DeepFISH/dataset/'+'LowRes_13434_overlapping_pairs.h5",'r')
you're trying to open a file with a literal '+' in its name. The outer quotes are double quotes, so the single quotes within the string are just part of the name. What you probably wanted to use is:
h5f = h5py.File("../DeepFISH-Github_projects/DeepFISH/dataset/" + "LowRes_13434_overlapping_pairs.h5",'r')
I don't know why the error message is all lower case. Maybe the library falls back to a case-insensitive lookup when it doesn't find the file under its original name, or the underlying file system is case-insensitive and this is simply how the OS reports the missing file.
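To make the quoting issue visible, here is a short sketch that builds the path three ways without touching h5py at all. The variable names are only for illustration:

```python
import os

# Single quotes inside a double-quoted string are ordinary characters,
# so the '+' here is part of the text and nothing is concatenated.
wrong = "../DeepFISH-Github_projects/DeepFISH/dataset/'+'LowRes_13434_overlapping_pairs.h5"

# Two separate string literals joined by the + operator.
right = "../DeepFISH-Github_projects/DeepFISH/dataset/" + "LowRes_13434_overlapping_pairs.h5"

# os.path.join inserts the separator for you.
joined = os.path.join("..", "DeepFISH-Github_projects", "DeepFISH", "dataset",
                      "LowRes_13434_overlapping_pairs.h5")

print(wrong)
print(right)
print(joined)
```

Printing the string before passing it to h5py.File is a quick way to spot stray quote characters like these.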

shutil.copy to a subdir

If I am trying to copy files to a subdir, as:
dirname = os.path.join(sys.argv[1], optdir)
print("dirname: "+dirname)
if not os.path.exists(dirname):
    os.makedirs(dirname)
shutil.copy(files, dirname)
shutil.copy is giving error as:
dirname: ./8/opt2
Traceback (most recent call last):
File "/home/rudra/bin/latres.py", line 84, in <module>
shutil.copy(files, dirname)
File "/usr/lib64/python3.5/shutil.py", line 234, in copy
dst = os.path.join(dst, os.path.basename(src))
File "/usr/lib64/python3.5/posixpath.py", line 139, in basename
i = p.rfind(sep) + 1
AttributeError: 'list' object has no attribute 'rfind'
This is possibly due to dst = os.path.join(dst, os.path.basename(src)) in the error message, so it is only getting opt2 and not the ./8 part of the directory name.
In this situation, how can I copy files to a subdir?
files is a list of file names, but shutil.copy only copies a single file. So put it in a loop:
for fn in files:
    shutil.copy(fn, dirname)
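Putting the directory creation and the loop together, a self-contained sketch could look like this. The helper name `copy_all` is made up for the example:

```python
import os
import shutil

def copy_all(files, dest_dir):
    """Copy every path in files into dest_dir, creating dest_dir if needed.

    Returns the list of destination paths reported by shutil.copy.
    """
    os.makedirs(dest_dir, exist_ok=True)  # no error if the directory already exists
    return [shutil.copy(fn, dest_dir) for fn in files]
```

With the names from the question, that would be called as copy_all(files, dirname). Using exist_ok=True replaces the separate os.path.exists check.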

Converting a supposed excel file in csv in python

I am having an issue with some code for converting a file to CSV.
I am using the code below as a start
directory = 'C:\OI Data'
filename = 'OpenInterest08-24-16'
data_xls = pd.read_excel(os.path.join(directory,filename), 'Sheet1', index_col=None)
data_xls.to_csv(os.path.join(directory,filename +'.csv'), encoding='utf-8')
and I am getting the following error:
Traceback (most recent call last):
File "", line 1, in
File "C:\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 714, in runfile
execfile(filename, namespace)
File "C:\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 74, in execfile
exec(compile(scripttext, filename, 'exec'), glob, loc)
File "C:/Users/Public/Documents/Python Scripts/work.py", line 26, in
data_xls = pd.read_excel(os.path.join(directory,filename), 'Sheet1', index_col=None)
File "C:\Anaconda2\lib\site-packages\pandas\io\excel.py", line 170, in read_excel
io = ExcelFile(io, engine=engine)
File "C:\Anaconda2\lib\site-packages\pandas\io\excel.py", line 227, in __init__
self.book = xlrd.open_workbook(io)
File "C:\Anaconda2\lib\site-packages\xlrd\__init__.py", line 441, in open_workbook
ragged_rows=ragged_rows,
File "C:\Anaconda2\lib\site-packages\xlrd\book.py", line 91, in open_workbook_xls
biff_version = bk.getbof(XL_WORKBOOK_GLOBALS)
File "C:\Anaconda2\lib\site-packages\xlrd\book.py", line 1230, in getbof
bof_error('Expected BOF record; found %r' % self.mem[savpos:savpos+8])
File "C:\Anaconda2\lib\site-packages\xlrd\book.py", line 1224, in bof_error
raise XLRDError('Unsupported format, or corrupt file: ' + msg)
xlrd.biffh.XLRDError: Unsupported format, or corrupt file: Expected BOF record; found '\n\n\n\n\n '
I am struggling to figure out the file format I am using
https://www.theice.com/marketdata/reports/icefuturesus/PreliminaryOpenInterest.shtml?futuresExcel=&tradeDate=8%2F24%2F16
Opening the file myself, I get the following (image not preserved).
I am still a beginner at python and some help would be much appreciated.
Thanks
You can start by fixing this part:
data_xls.to_csv(os.path.join(directory,filename,'.csv'), encoding='utf-8')
What happens when you do that is:
'C:\\OI Data\\OpenInterest08-24-16\\.csv'
Which is not what you want. Instead do:
os.path.join(directory,filename+'.csv')
Which will give you:
'C:\\OI Data\\OpenInterest08-24-16.csv'
Also, this is not a problem here, but in general be careful with this because a single backslash and a character can indicate an escape sequence, e.g. \n is a newline:
directory = 'C:\OI Data'
Instead, escape the backslash like so (a raw string, r'C:\OI Data', also works):
directory = 'C:\\OI Data'
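On the question of which format the file actually is: the "Expected BOF record; found '\n\n\n\n\n '" message suggests the file does not begin with the bytes xlrd expects from a real .xls file. One way to check is to look at the first few bytes yourself. This is only a sketch, the function name `sniff_spreadsheet_format` is invented here, and it only recognizes the two well-known signatures (OLE2 for legacy .xls, ZIP for .xlsx):

```python
def sniff_spreadsheet_format(path):
    """Guess a spreadsheet's container format from its leading bytes."""
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"):
        return "xls"      # OLE2 compound file, used by legacy Excel
    if head.startswith(b"PK\x03\x04"):
        return "xlsx"     # ZIP archive, used by modern Excel
    return "unknown"      # possibly plain text or HTML saved under another name
```

If this returns "unknown" for the downloaded file, it is probably not a real Excel workbook, and pd.read_excel will not be able to parse it whatever the path looks like.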
