catalogue.RegistryError: [E893] Could not find function 'Custom_Candidate_Gen.v1' in function registry 'misc'

I am currently building a spaCy pipeline with custom NER, entity linker, and textcat components. For my entity linker component, I have modified the candidate_generator() to suit my use-case. I have used the ner_emersons demo project for reference. Following is my custom_functions code.
import spacy
from functools import partial
from pathlib import Path
from typing import Iterable, Callable
from spacy.training import Example
from spacy.tokens import DocBin
from spacy.kb import Candidate, KnowledgeBase, get_candidates

@spacy.registry.misc("Custom_Candidate_Gen.v1")
def create_candidates():
    return custom_get_candidates

def custom_get_candidates(kb, span):
    return kb.get_alias_candidates(span.text.lower())

@spacy.registry.readers("MyCorpus.v1")
def create_docbin_reader(file: Path) -> Callable[["Language"], Iterable[Example]]:
    return partial(read_files, file)

def read_files(file: Path, nlp: "Language") -> Iterable[Example]:
    # we run the full pipeline and not just nlp.make_doc to ensure we have
    # entities and sentences, which are needed during training of the entity linker
    with nlp.select_pipes(disable="entity_linker"):
        doc_bin = DocBin().from_disk(file)
        docs = doc_bin.get_docs(nlp.vocab)
        for doc in docs:
            yield Example(nlp(doc.text), doc)
After training my entity linker and adding my textcat component to the pipeline, I am getting the following error:
catalogue.RegistryError: [E893] Could not find function 'Custom_Candidate_Gen.v1' in function registry 'misc'. If you're using a custom function, make sure the code is available. If the function is provided by a third-party package, e.g. spacy-transformers, make sure the package is installed in your environment.
Available names: spacy.CandidateGenerator.v1, spacy.EmptyKB.v1, spacy.KBFromFile.v1, spacy.LookupsDataLoader.v1, spacy.ngram_range_suggester.v1, spacy.ngram_suggester.v1
Why isn't my custom Candidate Generator getting registered?

Your options for having custom code loaded and registered when you load a model (a minimal sketch follows this list):
- import this code directly in your script before loading the model
- package it with your model with spacy package --code and load the model from the installed package name (rather than the directory)
- provide this code in a separate package that uses entry points in setup.cfg to register the methods (which works fine, but wouldn't be my first choice in this situation)
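The first option is usually the simplest. A minimal sketch, assuming the registered functions live in a file called custom_functions.py next to your script and the trained pipeline directory is ./my_pipeline (both names are placeholders):
import spacy
# Importing the module runs the @spacy.registry decorators, so that
# Custom_Candidate_Gen.v1 and MyCorpus.v1 are registered before the
# pipeline config is resolved.
import custom_functions  # noqa: F401

nlp = spacy.load("./my_pipeline")
The same applies on the command line, where the --code flag tells spaCy which file to import first, e.g. python -m spacy train config.cfg --code custom_functions.py.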

Related

Azure ML model deployment fail: Module not found error

I'm trying to deploy a model locally using Azure ML before deploying to AKS. I have a custom script that I want to import into my entry script (scoring script), but the deployment fails with a ModuleNotFoundError for rake_refactored.
Here's my entry script with the custom script import on line 1:
import rake_refactored as rake
from operator import itemgetter
import pandas as pd
import datetime
import re
import os  # needed for os.path.join/os.getenv below (missing in the original)
import operator
import numpy as np
import json

# Called when the deployed service starts
def init():
    global stopword_path
    # AZUREML_MODEL_DIR is an environment variable created during deployment.
    # It is the path to the model folder (./azureml-models/$MODEL_NAME/$VERSION)
    # For multiple models, it points to the folder containing all deployed models (./azureml-models)
    stopword_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'models/SmartStoplist.txt')

# load models
def preprocess(df):
    df = rake.prepare_data(df)
    text = rake.process_response(df, "RESPNS")
    return text

# Use model to make predictions
def predict(df):
    text = preprocess(df)
    return rake.extract_keywords(stopword_path, text)

def run(data):
    try:
        # Find the data property of the JSON request
        df = pd.read_json(json.loads(data))
        prediction = predict(df)
        return json.dumps(prediction)  # json.dump() writes to a file; dumps() returns a string
    except Exception as e:
        return str(e)
And my model artifact directory in Azure ML shows that the custom script is in the same directory as the entry script (rake_score.py).
What am I doing wrong? I had a similar issue before with a sklearn package that I was able to add to the pip-package list when I built the environment, but my custom script isn't a pip package.
I was not able to find rake_refactored in the documentation or on the internet.
You can try the steps below for importing rake.
Using pip
pip install rake-nltk
Directly from the repository
git clone https://github.com/csurfer/rake-nltk.git
python rake-nltk/setup.py install
Sample Code:
from rake_nltk import Rake
# Uses stopwords for english from NLTK, and all punctuation characters by
# default
r = Rake()
# Extraction given the text.
r.extract_keywords_from_text(<text to process>)
# Extraction given the list of strings where each string is a sentence.
r.extract_keywords_from_sentences(<list of sentences>)
# To get keyword phrases ranked highest to lowest.
r.get_ranked_phrases()
# To get keyword phrases ranked highest to lowest with scores.
r.get_ranked_phrases_with_scores()
Refer - https://github.com/csurfer/rake-nltk
In order to access my custom script in my scoring script, I needed to explicitly define the source directory in my inference configuration:
from azureml.core.model import InferenceConfig

inference_config = InferenceConfig(
    environment=env,
    entry_script="rake_score.py",
    source_directory="./models"
)
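For context, here is how that inference configuration can feed the local deployment mentioned in the question. A minimal sketch, assuming the v1 azureml-core SDK, with ws a Workspace and model a registered Model (all placeholders):
from azureml.core import Model
from azureml.core.webservice import LocalWebservice

# Deploy to a local Docker container for testing before moving to AKS.
deployment_config = LocalWebservice.deploy_configuration(port=8890)
service = Model.deploy(ws, "rake-local", [model], inference_config, deployment_config)
service.wait_for_deployment(show_output=True)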

How to undo @tf.keras.utils.register_keras_serializable

You use this decorator to register a function with the Keras serialization framework. I did this in a notebook cell:
@tf.keras.utils.register_keras_serializable()
def foo():
    return tf.constant(1)
But if I make changes to the function and run the cell again, I get a ValueError: [...] has already been registered to [...].
Is there a way to unregister it and then re-register the updated function?
In TensorFlow 2.9.2 this can be done via:
import tensorflow as tf
tf.keras.utils.get_custom_objects().clear()
The tf.keras.utils method get_custom_objects() returns the dict _GLOBAL_CUSTOM_OBJECTS, which contains all the functions added by @tf.keras.utils.register_keras_serializable().
So including
tf.keras.utils.get_custom_objects().clear()
after importing TensorFlow fixes this ValueError, at least in the IPython console from Spyder that I used for testing.
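Put together, a minimal sketch of the edit-and-re-register workflow in a fresh notebook cell (foo is the toy function from the question):
import tensorflow as tf

# Clear the global custom-object registry so the name can be reused.
tf.keras.utils.get_custom_objects().clear()

# Re-registering an updated version of the function no longer raises ValueError.
@tf.keras.utils.register_keras_serializable()
def foo():
    return tf.constant(2)  # updated body
Note that clear() empties the whole registry, not just the one entry, so any other custom objects have to be registered again afterwards.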

Derived Class of Pytorch nn.Module Cannot be Loaded by Module Import in Python

Using Python 3.6 with PyTorch 1.3.1. I have noticed that some saved nn.Modules cannot be loaded when the whole module is imported into another module. To give an example, here is the template of a minimal working example.
#!/usr/bin/env python3
# encoding: utf-8
# file 'dnn_predict.py'
import os
import torch
from torch import nn

class NN(nn.Module):  # NN network
    # Initialisation and other class methods
    ...

# torch.load() runs at import time, which matters for the failure below.
networks = [torch.load(f=os.path.join(resource_directory, 'nn-classify-cpu_{fold}.pkl'.format(fold=fold)))
            for fold in range(5)]
...

if __name__ == '__main__':
    # Some testing snippets
    pass
The whole file works just fine when I run it in the shell directly. However, when I want to use the class and load the neural network in another file using this code, it fails.
#!/usr/bin/env python3
#encoding:utf-8
from dnn_predict import *
The error reads AttributeError: Can't get attribute 'NN' on <module '__main__'>
Does loading saved variables or importing modules work differently in PyTorch than in other common Python libraries? Some help or a pointer to the root cause would be really appreciated.
When you save a model with torch.save(model, PATH), the whole object gets serialised with pickle. Pickle does not save the class itself, only a path to the file containing the class, so loading the model requires the exact same directory and file structure to find the correct class. When you run a Python script directly, the module of that file is __main__, so the saved models here reference NN under __main__, and to load them your NN class must be defined in the script you're running.
That is very inflexible, so the recommended approach is not to save the entire model, but to save just the state dictionary, which contains only the parameters of the model.
# Save the state dictionary of the model
torch.save(model.state_dict(), PATH)
Afterwards, the state dictionary can be loaded and applied to your model.
from dnn_predict import NN
# Create the model (will have randomly initialised parameters)
model = NN()
# Load the previously saved state dictionary
state_dict = torch.load(PATH)
# Apply the state dictionary to the model
model.load_state_dict(state_dict)
More details on the state dictionary and saving/loading the models: PyTorch - Saving and Loading Models

Python: cannot find reference to class when using import

I am new to Python. I am using the Anaconda prompt to run my code. I am trying to import a class from another module, but I keep getting errors such as "cannot find reference to the class".
I have already provided an __init__ function.
The module itself runs just fine.
I have used from parser import Parser.
I have also tried the from parser import * statement.
My Parser class:
class Parser(object):
    def __init__(self, tokens):
        self.tokens = tokens
        self.token_index = 0
My main script:
from parser import Parser
I expected it to import the class normally, but it is unable to.
I keep getting cannot import name 'Parser' from 'parser' (unknown location) when I use from parser import Parser.
parser is also the name of a module in the Python standard library.
It's likely there is a name conflict, and Python is importing the parser module from the standard library instead of your file.
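A quick way to confirm the clash is to check what the name parser actually resolves to (a minimal sketch; on Python 3.9 and earlier, parser is a deprecated standard-library module):
import parser

# If this prints a standard-library module instead of the path to your
# own parser.py, your file is being shadowed.
print(parser)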
I changed the module name to mparser and it works!

Why the import "from tensorflow.train import Feature" doesn't work

That's probably a totally noob question which has something to do with Python module importing, but I can't understand why the following is valid:
> import tensorflow as tf
> f = tf.train.Feature()
> from tensorflow import train
> f = train.Feature()
But the following statement causes an error:
> from tensorflow.train import Feature
ModuleNotFoundError: No module named 'tensorflow.train'
Can somebody please explain why it doesn't work this way? My goal is to use shorter notation in the code, like this:
> example = Example(
      features=Features(feature={
          'x1': Feature(float_list=FloatList(value=feature_x1.ravel())),
          'x2': Feature(float_list=FloatList(value=feature_x2.ravel())),
          'y': Feature(int64_list=Int64List(value=label))
      })
  )
tensorflow version is 1.7.0
Solution
Replace
from tensorflow.train import Feature
with
from tensorflow.core.example.feature_pb2 import Feature
Explanation
Remarks about TensorFlow's Aliases
In general, you have to remember that, for example:
from tensorflow import train
is actually an alias for
from tensorflow.python.training import training
You can easily check the real module name by printing the module. For the current example you will get:
from tensorflow import train
print(train)
<module 'tensorflow.python.training.training' from ....
Your Problem
In TensorFlow 1.7 you can't use from tensorflow.train import Feature, because the from clause needs an actual module name (not an alias). Since train is only an alias, the import fails with the ModuleNotFoundError shown above.
By doing
from tensorflow import train
print(train.Feature)
<class 'tensorflow.core.example.feature_pb2.Feature'>
you'll get the real location of Feature. Now you can use that import path, as shown in the solution above.
Note
In TensorFlow 1.9.0, from tensorflow.train import Feature will work, because tensorflow.train is an actual package, which you can therefore import. (This is what I see in my installed Tensorflow 1.9.0, as well as in the documentation, but not in the Github repository. It must be generated somewhere.)
Info about the path of the modules
You can find the complete module path in the docs: every module's page has a "Defined in" section (see, for example, the page for Module: tf.train).
I would advise against importing Feature (or any other object) from the non-public API, which is inconvenient (you have to figure out where Feature is actually defined), verbose, and subject to change in future versions.
I would suggest as an alternative to simply define
import tensorflow as tf
Feature = tf.train.Feature
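With those aliases bound once, the short notation from the question works through the public API alone (a minimal sketch; feature_x1, feature_x2, and label are the placeholders from the question):
import tensorflow as tf

# Bind the public aliases once...
Example = tf.train.Example
Features = tf.train.Features
Feature = tf.train.Feature
FloatList = tf.train.FloatList
Int64List = tf.train.Int64List

# ...then use the short names as intended, with no private-module imports.
example = Example(
    features=Features(feature={
        'x1': Feature(float_list=FloatList(value=feature_x1.ravel())),
        'x2': Feature(float_list=FloatList(value=feature_x2.ravel())),
        'y': Feature(int64_list=Int64List(value=label))
    })
)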
