I'm trying to create a word augmenter using GloVe. This is my code:
import nlpaug.augmenter.word.word_embs as nawwe
aug = nawwe.WordEmbsAug(model_type='glove', model_path='/content/drive/MyDrive/data/Glove/glove.6B.300d.txt', action='insert')
(I'm reading the GloVe .txt file from my Google Drive.)
But when I execute it, it gives this error:
load_word2vec_format() got an unexpected keyword argument 'no_header'
How should I fix that?
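For what it's worth, this error usually indicates a gensim/nlpaug version mismatch: the no_header keyword was only added to KeyedVectors.load_word2vec_format in gensim 4.0, and recent nlpaug releases pass it when model_type='glove'. A minimal check, assuming upgrading gensim is acceptable in your environment:

import gensim

# nlpaug passes no_header=True for GloVe files; that keyword only exists
# in gensim >= 4.0, so older versions raise exactly this TypeError.
print(gensim.__version__)  # if this prints 3.x, upgrade with: pip install -U gensim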
I am using machine learning in my Python (version 3.8.5) code. In the preprocessing step, I need to hash encode a few features. Earlier, during the training phase, I dumped a hash encoder to a pickle file named 'hash_encoder.pkl'. Now, in the testing phase, I need to transform the features using this pickle file. I'm using the code shown in the screenshot to hash encode the three string features given in its first line.
On the encoder.transform line, I'm getting an error at "data_lock=mutiprocessing.Manager().Lock()".
At the end it also raises an EOFError.
I have tried using the same version of pandas (1.1.3) both to dump the hash_encoder file and to load it. I'm not sure why this is coming up.
Can someone help me understand or debug this part?
I have added the screenshot of the error.
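Without the screenshot it is hard to be precise, but the multiprocessing Manager()/Lock() in the traceback suggests the encoder spins up worker processes inside transform (category_encoders' HashingEncoder does this). Below is a minimal sketch of the load-and-transform step; the feature names are hypothetical, and treating max_process=1 as the fix is an assumption, not a confirmed solution:

import pickle
import pandas as pd

# Hypothetical stand-ins for the three string features in the screenshot.
X_test = pd.DataFrame({'col_a': ['x'], 'col_b': ['y'], 'col_c': ['z']})

# Load the encoder dumped during the training phase.
with open('hash_encoder.pkl', 'rb') as f:
    encoder = pickle.load(f)

# Assumption: forcing single-process hashing sidesteps the
# multiprocessing Manager()/Lock() machinery at transform time.
encoder.max_process = 1

X_test_hashed = encoder.transform(X_test[['col_a', 'col_b', 'col_c']])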
I am trying to visualize an EEG data file in .edf format. For this purpose I am using MNE-Python.
Here is my code:
import mne
file = "/home/test.edf"
data = mne.io.read_raw_edf(file,preload=True)
Whenever I run this code, the error message below shows up:
ValueError: not enough values to unpack (expected 3, got 0)
I could not figure out where I went wrong.
It's not possible to use the file by specifying that path; even if you add a "~", the directory won't be resolved. It's better to be in the exact directory and read the file from there, i.e. go to your home directory and then specify the file by name alone.
import mne
file = "test.edf"
data = mne.io.read_raw_edf(file)
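Alternatively, since plain Python file handling does not expand "~" by itself, you can keep an absolute path by expanding it explicitly. A minimal sketch (the path is only an example):

import os
import mne

# os.path.expanduser turns "~/test.edf" into an absolute path;
# open() and most libraries will not do this expansion for you.
file = os.path.expanduser("~/test.edf")
data = mne.io.read_raw_edf(file, preload=True)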
I am writing an application which trains machine learning models ad hoc. When I try to fetch a model like so:
model = tf.keras.models.load_model('./models/model.h5')
I get an error:
Unable to open file (unable to open file: name = 'models/model.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
In some special cases, however, the model might not be present on disk, at which point it should be created, trained, and saved for later use. What would be the right approach to checking whether a model is present? I could use Python's built-in functionality to check if the file exists, but it seems obvious to me that load_model should have a parameter that returns None instead of throwing an error when the file is not present.
The Python way of checking if the file exists is the right way to go.
This may be personal taste, but it's not obvious that None should be returned: when you open a file, the file must exist.
You can:
import os.path

if os.path.isfile(fname):
    model = load_model(fname)
else:
    model = createAndTrainModel()
Or you can:

try:
    model = load_model(fname)
except OSError:  # load_model raises OSError when the file is missing
    model = createAndTrainModel()
I prefer the first.
I am trying to use Gensim with GloVe instead of word2vec. To make the GloVe file's format compatible with Gensim, I am using the following lines of code:
import gensim
from gensim.scripts.glove2word2vec import glove2word2vec

glove_in = 'glove.840B.300d.txt'
word2vec_format_out = 'glove.840B.300d.txt.word2vec'
glove2word2vec(glove_in, word2vec_format_out)

model = gensim.models.KeyedVectors.load_word2vec_format(
    word2vec_format_out, encoding='utf-8', binary=True)
However, this last line of code gives the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbd in position 0: invalid start byte
I have tried opening the GloVe file, writing it out as a CSV file, and then re-opening it with encoding='utf-8' specified. I also tried several other things mentioned here, but the error keeps coming back. Does anyone know a solution for this?
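For reference, glove2word2vec writes a plain-text file, so passing binary=True makes gensim try to parse the text as a binary word2vec dump, which is a typical cause of this UnicodeDecodeError. A minimal sketch of the load with the flag flipped, reusing the file name from above:

import gensim

# The converted GloVe file is plain text, so binary must be False.
model = gensim.models.KeyedVectors.load_word2vec_format(
    'glove.840B.300d.txt.word2vec',
    encoding='utf-8',
    binary=False,
)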
Trying to load a file in gensim with this line of code:
model = gensim.models.KeyedVectors.load_word2vec_format(r"C:/Users/dan/txt_sentoken/pos/cv000_29590.tx", binary=False)
However, I am getting this error:
ValueError: invalid literal for int() with base 10: 'films'
How do I solve this error?
Each file in word2vec format needs to start with a header line containing the vocab size and the vector size, in that order. The file you are loading appears to be a plain-text movie review rather than a vector file, so gensim fails when it tries to parse its first token ('films') as the vocab count.
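To illustrate, a word2vec-format text file begins with that header line, followed by one word and its vector per line. With gensim >= 4.0 you can also load a header-less GloVe-style file directly via no_header=True. A minimal sketch with made-up numbers and hypothetical paths:

from gensim.models import KeyedVectors

# word2vec text format: first line is "<vocab_size> <vector_size>", e.g.
#   2 4
#   films 0.1 0.2 0.3 0.4
#   movie 0.5 0.6 0.7 0.8
kv = KeyedVectors.load_word2vec_format('vectors.txt', binary=False)

# If the file is GloVe-style (no header line), gensim >= 4.0 can skip
# the header requirement entirely:
kv = KeyedVectors.load_word2vec_format('glove_style.txt', binary=False, no_header=True)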