When running the Hugging Face BERT NLP model, it raises the OSError shown below. Here are the code and the error:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import requests
from bs4 import BeautifulSoup
import re
## Instantiate Model
# the pre-trained BERT model used here is from the Hugging Face Hub: https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment
tokenizer = AutoTokenizer.from_pretrained('nlptown/bert-base-multilingual-uncased-sentiment')
model = AutoModelForSequenceClassification.from_pretrained('nlptown/bert-base-multilingual-uncased-sentiment')
## Encode and Calculate Sentiment (passing a test string to check the sentiment score)
tokens = tokenizer.encode('I hated this, absolutely the worst', return_tensors='pt')
result = model(tokens)
print(result)
The error:
OSError: Can't load the model for
'nlptown/bert-base-multilingual-uncased-sentiment'. If you were trying
to load it from 'https://huggingface.co/models', make sure you don't
have a local directory with the same name. Otherwise, make sure
'nlptown/bert-base-multilingual-uncased-sentiment' is the correct path
to a directory containing a file named pytorch_model.bin, tf_model.h5,
model.ckpt or flax_model.msgpack.
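A minimal diagnostic sketch (my addition, following the hints in the error message itself): check whether a local directory is shadowing the model id, and retry with force_download=True, a standard from_pretrained argument, to fetch fresh weights from the Hub:
import os
from transformers import AutoModelForSequenceClassification

# A local folder named like the model id can shadow the Hub repository
if os.path.isdir('nlptown/bert-base-multilingual-uncased-sentiment') or os.path.isdir('nlptown'):
    print('A local directory with the same name exists; rename or remove it.')

# Retry while forcing a fresh download of the model weights
model = AutoModelForSequenceClassification.from_pretrained(
    'nlptown/bert-base-multilingual-uncased-sentiment', force_download=True)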
I am doing everything in a single Jupyter notebook.
I am trying to predict a new store description by its category using a logistic regression classifier and a count vectorizer.
All the code below is in sequence, whether used or unused.
Below is my code:
from sklearn.feature_extraction.text import CountVectorizer
cv=CountVectorizer(stop_words='english', ngram_range=(1,1))
X_train_cv=cv.fit_transform(X_train.values.astype('str'))
X_test_cv=cv.transform(X_test.values.astype('str'))
from sklearn.linear_model import LogisticRegression
lr=LogisticRegression(solver='lbfgs')
lr.fit(X_train_cv,y_train)
y_pred_cv=lr.predict(X_test_cv)
from sklearn.metrics import classification_report
print(classification_report(y_test,y_pred_cv,target_names=['electronics','fashion','F&B','services']))
# I never use the code below, since I am not working across two notebooks
import os
import pickle
from datetime import datetime
model_path=['drive','mydrive','I125','models']
time=datetime.now().strftime("%Y-%m-%d")
filename='lr-{}.pkl'.format(time)
templist=[]
templist.append(filename)
path1=os.sep.join(model_path+templist)
filename='countvectorizer-{}.pkl'.format(time)
templist=[]
templist.append(filename)
path2=os.sep.join(model_path+templist)
with open(path1,'wb') as f1:
    pickle.dump(lr,f1)
with open(path2,'wb') as f2:
    pickle.dump(cv,f2)
I am trying to predict a new description using my current classifier. I only know how to use the current classifier to predict a new description when it lives in a separate notebook.
This is the code I have to predict a new description:
# I never use the code below, since I am not working across two notebooks
import os
import pickle
from google.colab import drive
drive.mount('/content/drive')
model_path=['drive','mydrive','I125','models']
filename1=['lr-2022-10-10.pkl']
filename2=['countvectorizer-2022-10-10.pkl']
path2=os.sep.join(model_path+filename2)
with open(path2,'rb') as f:
    trained_cv=pickle.load(f)
path1=os.sep.join(model_path+filename1)
with open(path1,'rb') as f:
    model=pickle.load(f)
# I used the code below
import re
import string
def preprocess(text):
    pattern_alphanumeric=r"\w*\d\w*"
    pattern_punctuation="["+re.escape(string.punctuation)+"]"
    text=re.sub(pattern_alphanumeric,'',text)
    text=re.sub(pattern_punctuation,'',text).lower()
    return text
new_text="This clothes so nice"
new_text_processed=preprocess(new_text)
def encode_text_to_vector(cv, text):
    text_vector = cv.transform([text])
    return text_vector
new_text_vector=encode_text_to_vector(trained_cv,new_text_processed)  # <-- line with error
print(new_text_vector)
Error:
trained_cv is undefined. (trained_cv is supposed to hold the saved count vectorizer, and model the saved logistic regression, as if I had used a different Jupyter notebook.)
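A hedged sketch of the likely fix (my reading of the setup, not part of the original post): since everything runs in one notebook, the fitted cv and lr objects are still in memory, so the pickle-loading block can be skipped and the in-memory objects used directly:
# cv and lr were fitted earlier in this same notebook, so reuse them directly
new_text = "This clothes so nice"
new_text_processed = preprocess(new_text)
new_text_vector = cv.transform([new_text_processed])  # the fitted CountVectorizer
prediction = lr.predict(new_text_vector)              # the fitted LogisticRegression
print(prediction)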
I'm trying to obtain the "last_hidden_state" (as explained here) for the code generation models over here. I cannot figure out how to proceed, other than manually downloading each code generation model and checking whether its output keys include that attribute, using the following code:
import numpy as np
from datasets import load_dataset
from transformers import AutoTokenizer
from transformers import AutoModel, AutoModelForCausalLM
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoTokenizer, AutoModelWithLMHead
tokenizer = AutoTokenizer.from_pretrained("codeparrot/codeparrot")
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = AutoModelWithLMHead.from_pretrained("codeparrot/codeparrot").to(device)
inputs = tokenizer("def hello_world():", return_tensors="pt")
inputs = {k:v.to(device) for k,v in inputs.items()}
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.keys())
So far, I tried this strategy on CodeParrot and InCoder with no success. Perhaps there is a better way to access the values of the hidden layers?
The hidden_states in the output of CodeGenForCausalLM is already the last_hidden_state for the CodeGen model. See: link
There, hidden_states = transformer_outputs[0] is the output of CodeGenModel (link), and transformer_outputs[0] is the last_hidden_state:
if not return_dict:
    return tuple(v for v in [hidden_states, presents, all_hidden_states, all_self_attentions] if v is not None)

return BaseModelOutputWithPast(
    last_hidden_state=hidden_states,
    past_key_values=presents,
    hidden_states=all_hidden_states,
    attentions=all_self_attentions,
)
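A minimal usage sketch (assuming the model and inputs from the question; output_hidden_states is a standard forward-pass argument in transformers):
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# hidden_states is a tuple with one tensor per layer (plus the embeddings);
# the last entry is the output of the final transformer block
last_hidden_state = outputs.hidden_states[-1]
print(last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)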
I have trained a transformer model with simpletransformers on Colab, downloaded the serialized model, and I have a few issues using it to make inferences.
Loading the model in Jupyter works, but using it with FastAPI gives an error.
This is how I'm using it in Jupyter:
import torch
from scipy.special import softmax
label_cols = ['art', 'politics', 'health', 'tourism']
model = torch.load("model.bin")
pred = model.predict(['i love politics'])[1]
preds = softmax(pred,axis=1)
preds
It gives the following result: array([[0.00230123, 0.97465035, 0.00475409, 0.01829433]])
I have tried using FastAPI as follows but keep getting an error:
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Message(BaseModel):
    text : str

model = torch.load("model.bin")

@app.post("/predict")
def predict_health(data: Message):
    prediction = model.predict(data.text)[1]
    preds = softmax(prediction, axis=1)
    return {"results": preds}
Could you please specify the error you get? Otherwise it's quite hard to debug.
Also, it seems that the model.predict function in the Jupyter code gets a list as input, while in your FastAPI code you are passing a string directly to that function.
So maybe try
...
prediction = model.predict([data.text])[1]
...
It's hard to say without the error.
In case it helps, you could have a look at this article that shows how to build a classification API with Hugging Face transformers (Bart Large MNLI model) and FastAPI: https://nlpcloud.io/nlp-machine-learning-classification-api-production-fastapi-transformers-nlpcloud.html
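A hedged sketch pulling those suggestions together (assuming the model object behaves as in the Jupyter snippet; the list wrapping and the .tolist() conversion for JSON serialization are my additions):
import torch
from fastapi import FastAPI
from pydantic import BaseModel
from scipy.special import softmax

app = FastAPI()
model = torch.load("model.bin")

class Message(BaseModel):
    text: str

@app.post("/predict")
def predict_health(data: Message):
    # predict expects a list of texts, as in the Jupyter example
    prediction = model.predict([data.text])[1]
    preds = softmax(prediction, axis=1)
    # numpy arrays are not JSON-serializable, so convert to a plain list
    return {"results": preds.tolist()}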
I am trying to use this model: https://github.com/aninda052/Disasters-on-social-media-NLP/blob/master/Disasters%20on%20social%20media.ipynb
I searched for a way to save this model and use it with a new dataset in another application, found out I could use pickle, and added this to the code:
import pickle
model_tfidf=LogisticRegression( C=30.0,class_weight='balanced', solver='newton-cg',
multi_class='multinomial', n_jobs=-1, random_state=5)
model_tfidf.fit(x_train_tfidf, y_train)
predicted_tfidf=model_tfidf.predict(x_test_tfidf)
Pkl_Filename = "Pickle_RL_Model.pkl"
with open(Pkl_Filename, 'wb') as file:
    pickle.dump(model_tfidf, file)
After that I created a new project to load and use this model; the code is:
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
import pandas as pd
import pickle
with open('Pickle_RL_Model.pkl', 'rb') as file:
    Pickled_LR_Model = pickle.load(file)
x=["hi disaster","flood disaster","cry sad bad ","srong storm"]
tfd=TfidfVectorizer()
new_data_vec=tfd.fit_transform(x)
Ypredict = Pickled_LR_Model.predict(new_data_vec)
but I got an error saying:
X has 8 features per sample; expecting 16988
I don't know what I did wrong. Any help, please.
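A hedged sketch of the likely fix (my addition, not from the original post): fitting a fresh TfidfVectorizer on the four new strings builds a brand-new vocabulary (8 features) instead of the 16988-feature vocabulary the classifier was trained on. The vectorizer fitted on the training data needs to be pickled alongside the model and reused with transform on new data (the tfidf_vectorizer name and filename below are hypothetical):
import pickle

# In the training project: save the fitted vectorizer as well as the model
with open('tfidf_vectorizer.pkl', 'wb') as file:   # hypothetical filename
    pickle.dump(tfidf_vectorizer, file)            # the vectorizer fitted on the training data

# In the new project: load it and only transform (never fit) new data
with open('tfidf_vectorizer.pkl', 'rb') as file:
    trained_tfd = pickle.load(file)

x = ["hi disaster", "flood disaster", "cry sad bad ", "srong storm"]
new_data_vec = trained_tfd.transform(x)   # same 16988-column feature space
Ypredict = Pickled_LR_Model.predict(new_data_vec)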
I want to use my TensorFlow algorithm in an Android app. The TensorFlow Android example starts by downloading a GraphDef that contains the model definition and weights (in a *.pb file). This should now come from my Scikit Flow algorithm (part of TensorFlow).
At first glance it seems easy: you just have to call classifier.save('model/'). But the files saved to that folder are not *.ckpt, *.def, and certainly not *.pb. Instead you have to deal with a *.pbtxt file and a checkpoint file (without an extension).
I've been stuck there for quite a while. Here is a code example that exports something:
#imports
import tensorflow as tf
import tensorflow.contrib.learn as skflow
import tensorflow.contrib.learn.python.learn as learn
from sklearn import datasets, metrics
#skflow example
iris = datasets.load_iris()
feature_columns = learn.infer_real_valued_columns_from_input(iris.data)
classifier = learn.LinearClassifier(n_classes=3, feature_columns=feature_columns,model_dir="modeltest")
classifier.fit(iris.data, iris.target, steps=200, batch_size=32)
iris_predictions = list(classifier.predict(iris.data, as_iterable=True))
score = metrics.accuracy_score(iris.target, iris_predictions)
print("Accuracy: %f" % score)
The files you get are:
checkpoint
graph.pbtxt
model.ckpt-1.meta
model.ckpt-1-00000-of-00001
model.ckpt-200.meta
model.ckpt-200-00000-of-00001
Many of the possible workarounds I found would require having the GraphDef in a variable (I don't know how to get that with Scikit Flow), or a TensorFlow session, which doesn't seem to be exposed when using Scikit Flow.
To save as a .pb file, you need to extract the graph_def from the constructed graph. You can do that as follows:
from tensorflow.python.framework import tensor_shape, graph_util
from tensorflow.python.platform import gfile
sess = tf.Session()
final_tensor_name = 'results:0' #Replace final_tensor_name with name of the final tensor in your graph
#########Build your graph and train########
## Your tensorflow code to build the graph
###########################################
output_filename = 'output_graph.pb'
output_graph_def = sess.graph.as_graph_def()
with gfile.FastGFile(output_filename, 'wb') as f:
    f.write(output_graph_def.SerializeToString())
If you want to convert your trained variables to constants (to avoid using ckpt files to load the weights), you can use:
output_graph_def = graph_util.convert_variables_to_constants(sess, sess.graph.as_graph_def(), [final_tensor_name])
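A short follow-up sketch (my addition): the frozen graph_def returned by convert_variables_to_constants can be serialized the same way, so the resulting .pb file carries the weights as constants (the frozen_graph.pb filename is hypothetical):
# Freeze variables into constants, then serialize the frozen graph
output_graph_def = graph_util.convert_variables_to_constants(
    sess, sess.graph.as_graph_def(), [final_tensor_name])
with gfile.FastGFile('frozen_graph.pb', 'wb') as f:
    f.write(output_graph_def.SerializeToString())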
Hope this helps!