FastText Error! ValueError: (file-name) cannot be opened for training - nlp

I Have installed fasttext module in Python and loaded the model [ 'cc.en.300.bin'].
I already made the data frame format according to the fasttext. and then generating the files
train.to_csv(" ecomm.train",columns=['Category_description'], index= False, header= False)
test.to_csv("ecom.test", columns=['Category_description'], index= False, header= False)
the files created successfully! then when I run this code
import fasttext
mod= fasttext.train_supervised(input='ecomm.train')
I get this error:
Traceback (most recent call last):
File "/Users/rosie/Documents/ProGraMinG/Python/pythonProject/FastText/FastText_overview.py", line 97, in <module>
mod= fasttext.train_supervised(input='ecomm.train')
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/fasttext/FastText.py", line 533, in train_supervised
fasttext.train(ft.f, a)
ValueError: ecomm.train cannot be opened for training!
{ UPDATE } !!!
Used both isfile() and exists() functions to check if the file exists:
path = 'Users/rosie/Documents/ProGraMinG/Python/pythonProject/FastText/ecomm.train'
check_file = os.path.isfile(path)
print("isfile method ",check_file)
check_file = os.path.exists(path)
print("exists method ",check_file)
Both methods returns ' False '.
I also checked if the file is readable or not
doc= open(' ecomm.train', 'r')
print('checking if the file is readable', doc.readable())
However, it returned 'True', now I'm confused. As for the size of the ' ecomm.train', it is 29.4 MB

Are you sure the file is readable, at the simple (local) path 'ecomm.train', from your Python process, given its current local orking directory?
For example, try specifying the file as its full absolute path – on MacOS probably something like /Users/yourusername/yourdirectory/etc/etc/ecomm.train. If that works, the problem was that your Python code's effective directory wasn't what you expected.
Alternatively, if the process that wrote the file was in some way a different user than the later process trying to read it, there might be permission errors.
Totally separate from fasttext, you could check, from the same code that's about to try fasttext operations, if the file is readable (via either the local path, or the absolute path) using a recipe lie that in this other answer: https://stackoverflow.com/a/44213239/130288
Even if it fails, it might give a more-explanatory error.

Related

Python xlwings: EventError: Command failed: Parameter error. (-50)

I wish I could use python to execute the Excel macro, so I tried to use the package xlwings to implement it.
The OS of my laptop is macOS Catalina (ver.: 10.15.7), my compiler is PyCharm (ver.: 2021.2.3), my Python version is 3.8.8, I used Anaconda (Ver.: 22.11.1) as my interpreter, my excel version is 16.66.1 (Microsoft Excel for Mac).
I faced the error "Command Error -1743: The User has declined permission" when I tried to use this package originally, and I solved this issue by installing an old compiler & using the old version of the compiler to run my code. My privacy setting for automation in the app Setting was shown below: (This is NOT the question I want to ask, but I'm not sure if it is also related to the issue I faced, so I still attached it here. I had uninstalled the old version of my compiler already.)
I wish I could implement an existing macro (called Hi) in an existing Excel file (called [StakeOverflow]HelloWorld.xlsm) through Python (represented as MY_PYTHON_FILE.py below), like the snapshot below:
My Excel file and my python code were stored on OneDrive. My macro code was shown below:
Sub Hi()
MsgBox "Good morning!"
End Sub
My python code was shown below:
import xlwings as xw
import time
wb = xw.Book('/Users/<MY NAME>/OneDrive/MY PATH DETAILS/[StakeOverflow]HelloWorld.xlsm')
time.sleep(10)
app = wb.app
macro_vba = app.macro("Hi")
macro_vba()
The code looks really simple, but I still faced the error. My Excel was not opened automatically, and I even could not open the Excel file manually thereafter. The error was shown below:
/Users/.../.conda/envs/Program/bin/python "/Users/.../OneDrive/.../MY_PYTHON_FILE.py"
Traceback (most recent call last):
File "/Users/.../.conda/envs/Program/lib/python3.9/site-packages/xlwings/main.py", line 4914, in open
impl= self.impl(name)
File "/Users/.../.conda/envs/Program/lib/python3.9/site-packages/xlwings/_xlmac.py", line 366, in __call__
raise KeyError(name_or_index)
KeyError: '[stakeoverflow] helloworld.xlsm'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/.../.conda/envs/Program/lib/python3.9/site-packages/aeosa/appscript/reference.py", line 482, in __call___
return self.AS_appdata.target().event (self._code, params, atts, codecs=self.AS_appdata).send(timeout, sendflags)
File "/Users/.../.conda/envs/Program/lib/python3.9/site-packages/aeosa/aem/aemsend.py", line 92, in send
raise EventError(errornum, errormsg, eventresult)
aem.aemsend.EventError: Command failed: Parameter error. (-50)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/.../OneDrive/.../MY_PYTHON_FILE.py", line 18, in <module>
wb = xw.Book('/Users/.../OneDrive/.../[stakeoverflow] helloworld.xlsm')
File "/Users/.../.conda/envs/Program/lib/python3.9/site-packages/xlwings/main.py", line 876, in __init__
impl= app.books.open(
File "/Users/.../.conda/envs/Program/lib/python3.9/site-packages/xlwings/main.py", line 4921, in open
impl = self.impl.open(
File "/Users/.../.conda/envs/Program/lib/python3.9/site-packages/xlwings/ xlmac.py", line 420, in open
self.app.xl.open_workbook(
File "/Users/.../.conda/envs/Program/lib/python3.9/site-packages/aeosa/appscript/reference.py", line 518, in __call__
raise CommandError(self, (args, kargs), e, self.AS_appdata) from e
appscript.reference.CommandError: Command failed:
OSERROR: -50
MESSAGE: Parameter error.
COMMAND: app(pid=1647).open_workbook (workbook_file_name='/users/.../onedrive/.../[stakeoverflow] helloworld.xlsm', update_links=k.do_not_update_links, read_only=None, format=None, password=None, write_reserved_password=None, ignore_read_only_recommended=None, origin=None, delimiter=None, editable=None, notify=None, converter=None, add_to_mru=None, timeout=-1)
Process finished with exit code 1
I tried to use terminal to run my python code, but I could not solve the problem, either.
I tried to open the Excel file manually thereafter, the error was shown below:
Excel cannot open the file ’[StakeOverflow]HelloWorld.xlsm’
because the file format or file extension is not valid. Verify
that the file has not been corrupted and that the file
extension matches the format of the file.
I tried to Google this error, but few solutions was found. It seems that it is related to the issue of external storage location. I tried to create another Excel file with the same name & macro code on my desktop and try again, and the error would disappear. (However, our company stored the files on OneDrive, so I wish I could utilise the file online.)
Just wondering if anyone here faced this situation before?
I found the answer by myself today. The issue is related to the naming issue rather than the permission issue.
As we could see that the error is KeyError: '[stakeoverflow] helloworld.xlsm', which implies that the problem is here. (Maybe because the system could not find the Excel file with this name.)
I tried to change the name from [stakeoverflow] helloworld.xlsm to helloworld.xlsm, and the error was gone. It seems that when xlwings want to open an Excel file, it would check the validity of the file name and change the name into smaller cases. If the file name contains special characters which are not allowed (e.g., "[]"), then the error would occur.
Notice that I could store the Excel file with these special characters in our laptop & OneDrive, but xlwings did not accept them.
Hope it is helpful to those who face this issue when using xlwings!

Python3 eyed3 module, reading comments from mp3 file?

My goal is to use python3 (3.8 preferred) to read and eventually set the comments tag on an MP3 file.
I have installed eyed3 (python3.8 -m pip install eyed3) and the following code will load and can read all the tags, except when it comes to the comments. I've tried reading the documentation from the creator's website (http://eyed3.nicfit.net/) and the GitHub site with no luck and I'm not fully understanding the output from the following code:
import eyed3
music = "/path/to/valid/music/file.mp3"
audio = eyed3.load(music)
print(audio.tag.comments)
this spits out the following:
<eyed3.id3.tag.CommentsAccessor object at 0x7f54ca55d2b0>
I've tried doing a dir(audio.tag.comments) and that doesn't provide anything except a bunch of class bits, and the following functions "get", "remove", "set"
       
Using something like:
moo = audio.tag.comments.get()
throws an error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/mrhobbits/.local/lib/python3.8/site-packages/eyed3/utils/__init__.py", line 138, in wrapped_fn
return fn(*args, **kwargs)
TypeError: get() missing 1 required positional argument: 'description'
Yet I can't find anything that would show me what 'description' is, or maybe I have to drill deeper to get the comment information?
I should also mention that doing:
print(audio.tag.artist)
works just fine and returns:
Alestorm
I'm a bit lost here. Any help would be great.
Nevermind,
A quick search here revealed that the 'comments' is a list sort of object, and that doing:
audio.tag.comments[0].text
reveals the information I was looking for.

FileNotFoundError: [WinError 2] The system cannot find the file specified while loading model from s3

I have recently saved a model into s3 using joblib
model_doc is the model object
import subprocess
import joblib
save_d2v_to_s3_current_doc2vec_model(model_doc,"doc2vec_model")
def save_d2v_to_s3_current_doc2vec_model(model,fname):
model_name = fname
joblib.dump(model,model_name)
s3_base_path = 's3://sd-flikku/datalake/current_doc2vec_model'
path = s3_base_path+'/'+model_name
command = "aws s3 cp {} {}".format(model_name,path).split()
print('saving...'+model_name)
subprocess.call(command)
It was successful, but after that when i try to load the model back from s3 it gives me an error
model = load_d2v("doc2vec_model")
def load_d2v(fname):
model_name = fname
s3_base_path='s3://sd-flikku/datalake/current_doc2vec_model'
path = s3_base_path+'/'+model_name
command = "aws s3 cp {} {}".format(path,model_name).split()
print('loading...'+model_name)
subprocess.call(command)
model=joblib.load(model_name)
return model
This is the error i get:
loading...doc2vec_model
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 7, in load_d2v
File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 339, in call
with Popen(*popenargs, **kwargs) as p:
File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 800, in __init__
restore_signals, start_new_session)
File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 1207, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
I don't even understand why it is saying File not found, this was the path i used to save the model but now i'm unable to get the model back from s3. Please help me!!
I suggest that rather than your generic print() lines, showing your intent, you should print the actual command you've composed, to verify that it makes sense upon observation.
If it does, then also try that exact same aws ... command directly, at the command prompt where you had been launching your python code, to make sure it runs that way. If it doesn't, you may get a more clear error.
Note that the error you're getting doesn't particularly look like it's coming from the aws command, of from the S3 service - which might talk about 'paths' or 'objects'. Rather, it's from the Python subprocess system & Popen' call. I think those are via your call tosubprocess.call(), but for some reason your line-of-code isn't shown. (How are you running the block of code with theload_d2v()`?)
That suggests the file that's no found might be the aws command itself. Are you sure it's installed & runnable from the exact working-directory/environment that your Python is running in, and invoking via subprocess.call()?
(BTW, if my previous answer got you over your sklearn.externals.joblib problem, it'd be good for you to mark the answer as accepted, to save other potential answerers from thinking that's still an unsolved question that's blocking you.)
try to add extension of your model file to your fname if you are confident the model file is there.
e.g. doc2vec_model.h3

shutil.rmtree() error when trying to remove NFS-mounted directory

Attempting to execute shutil.rmtree(path) on a directory managed by NFS consistently fails. Below you can see that os.rmdir(path) within shutil.rmtree(path) causes the exception. Is there a more robust way for me to achieve the expected result?
It appears that it removes all of the files, yet a hidden .nfs file remains in the directory for a short amount of time. I'm guessing that the process from which I'm calling rmtree has an open file handle to one of the files inside the directory, which, when deleted, apparently causes NFS to write a new hidden file. That would cause os.rmdir to fail on attempting to remove a non-empty directory.
Traceback (most recent call last):
File "/home/me/pre3/lib/python3.6/shutil.py", line 484, in rmtree
onerror(os.rmdir, path, sys.exc_info())
File "/home/me/pre3/lib/python3.6/shutil.py", line 482, in rmtree
os.rmdir(path)
OSError: [Errno 39] Directory not empty:
NFS details:
$ nfsstat -m
/home/me/nfs from XXX.YYY.ZZZ:/mnt/path/to/nfs
Flags: rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=50,retrans=2,sec=sys,mountaddr=REDACTED,mountvers=3,mountport=832,mountproto=udp,local_lock=none,addr=REDACTED
I'm using Python 3.6.6 on Ubuntu 16.04.
If the python logging module is logging to the target output directory, it will maintain an open file. A workaround is to call logging.shutdown() first, then called shutil.rmtree(path). This is not a general answer to the broader question, however.
You could try defining an error handler function to be passed to the onerror arg for shutil.rmtree: https://docs.python.org/3/library/shutil.html#shutil.rmtree
def handle_rmtree_err(function, path, excinfo):
...
shutil.rmtree(my_path, onerror=handle_rmtree_err)
There are all sorts of reasons why a process may be holding onto a file, so I can't tell you what the error handler should do exactly.
If you haven't figured out what is holding onto the file, try $ lsof | grep .nfsXXXX.
If all else fails you could time.sleep(secs) and retry shutil.rmtree.

I cant read the image file in the particular directory

I took some images in my camera and tried to resize them using opencv library but i think that i can't read the images I don't know the reason why.Thank you for the help in advance.
I have a python 3.8 version and the updated opencv library version.Not much of a background I guess.
import os,cv2
count=0
for file in os.listdir('E:\Projects\Python\Resixing images\Images'):
if file.endswith('.jpg'):
print(file)
img=cv2.imread(file)
img2=img.copy()
img2=cv2.resize(img2,(700,700))
name="resize"+str(count)+".jpg"
cv2.imwrite(name,img2)
count+=1
I receive an error message
P_20191107_214848_SRES.jpg
Traceback (most recent call last):
File "E:\Projects\Python\Resixing images\image changing res.py", line 7, in
img2=img.copy()
AttributeError: 'NoneType' object has no attribute 'copy'
[Finished in 9.7s]
Try this:
img=cv2.imread('E:\Projects\Python\Resixing images\Images' + '\' + file)
The problem was that you were sending only the name of the file to the python program, so the program tried to look for the image in the current directory and not at your specified path. The above change should fix the problem.
also, a good idea would be to have 2 // instead of 1 /, just to avoid any format specifier in the middle of things, or you could just use r to mention the path to be raw string
img=cv2.imread('E:\\Projects\\Python\\Resixing images\\Images' + '\\' + file)
img=cv2.imread(r'E:\Projects\Python\Resixing images\Images\' + file)

Resources