Are there any pre-trained NER entity-linking models available?

I want to run entity linking for a project of mine. I used spaCy for NER on a corpus of documents. Is there an existing linking model I can simply use to link the entities it found?
The documentation I have found only seems to cover how to train a custom one.
Examples:
https://spacy.io/api/kb
https://github.com/explosion/spaCy/issues/4511
Thanks!

spaCy does not distribute pre-trained entity linking models. See here for some comments on why not.

I found one - Facebook GENRE:
https://github.com/facebookresearch/GENRE

Related

Should I use tok2vec before ner in spaCy?

I'm currently training a model for named entity recognition, and I could not find out how the pipeline in spaCy should be structured to achieve better results. Does it make sense to use tok2vec before the ner component?
The NER component requires a tok2vec (or transformer) component as a source of features and will not work without it.
For more details about pipeline structure and feature sources, this section of the docs may be helpful.

Is there a pretrained model that can detect and classify if a human is in a photo?

I am trying to find a pre-trained model that will classify images based on whether a human is present in the photo.
You can use models trained on the COCO dataset for this.
For example, for PyTorch you can have a look at the official documentation on the provided models here.
A quick search will turn up a wider variety of models, both for PyTorch and for other frameworks.
You can check out the COCO homepage if you need more information about the dataset and the tasks it supports.
You may also find these useful:
Detecting people using YOLO-OpenCV
YOLO object detection in PyTorch
Another YOLO implementation in PyTorch
Similar question on ai.stackexchange
You can also use frameworks such as Detectron2 or mmdetection for these tasks (or TensorFlow's Object Detection API, etc.).
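Whichever COCO-trained detector you pick, the "is there a human" decision comes down to checking for a confident `person` detection. Here is a minimal sketch in plain Python. It assumes the detector output has already been converted to parallel lists of label ids and scores, and that `person` has label id 1 (this is the case for torchvision's COCO detection models, but other frameworks may number classes differently). The helper name `contains_person` and the 0.7 threshold are made up for illustration.

```python
from typing import Sequence

# Assumption: COCO "person" id as used by torchvision's detection models.
PERSON_LABEL = 1


def contains_person(labels: Sequence[int], scores: Sequence[float],
                    threshold: float = 0.7) -> bool:
    """Return True if any detection is a person with score >= threshold."""
    return any(
        label == PERSON_LABEL and score >= threshold
        for label, score in zip(labels, scores)
    )


# Hypothetical detector output for one image (label ids and confidences).
labels = [18, 1, 1]
scores = [0.95, 0.82, 0.30]
print(contains_person(labels, scores))
```

The threshold is worth tuning on a few of your own photos, since a low value will flag false positives and a high one will miss small or partly hidden people.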

What does the merges.txt file mean in BERT-based models in the HuggingFace library?

I am trying to understand what the merges.txt file means in the tokenizer for the RoBERTa model in the HuggingFace library. However, nothing is said about it on their website. Any help is appreciated.
You can find a description here:
https://github.com/huggingface/transformers/issues/4777
https://github.com/huggingface/transformers/issues/1083#issuecomment-524303077
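In short: RoBERTa's tokenizer uses byte-level BPE, and `merges.txt` lists the learned merge rules, one pair of symbols per line, in priority order. The vocabulary file then maps the resulting tokens to ids. Below is a toy sketch of how such rules get applied to a single word. The merge rules are made up for illustration, `apply_merges` is a hypothetical helper, and the real tokenizer additionally works on byte-level symbols rather than plain characters.

```python
def apply_merges(word, merges):
    """Greedily apply BPE merge rules to a word, highest-priority rule first."""
    ranks = {pair: i for i, pair in enumerate(merges)}  # lower rank = higher priority
    symbols = list(word)
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no applicable rule left
        merged, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


# Made-up rules in the same "left right" per-line layout as merges.txt.
toy_merges_txt = """l o
lo w
e r"""
merges = [tuple(line.split()) for line in toy_merges_txt.splitlines()]
print(apply_merges("lower", merges))
```

With these toy rules, `lower` ends up split into `low` and `er`. A word with no applicable rules stays split into single characters.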

OpenNLP sample training data for disease

I'm using OpenNLP for data classification. I could not find a TokenNameFinderModel for disease here. I know I can create my own model, but I was wondering whether any large sample training data is available for disease.
You can easily create your own training dataset using the modelbuilder addon, following the rules mentioned here to train a good NER model.
You can find some help on using the modelbuilder addon here.
Basically, you put all the text in one file and the NER entities in another. The addon searches for each entity and replaces it with the required tag, producing the tagged data. The tool should be pretty easy to use!
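To make the idea concrete, here is a rough Python sketch of that search-and-tag step. It is not the addon's actual implementation. It wraps each known entity in OpenNLP's name-finder training markup (`<START:type> ... <END>`). The `disease` type, the sample sentence, and the `tag_entities` helper are assumptions for illustration.

```python
import re


def tag_entities(sentence, entities, entity_type="disease"):
    """Wrap each known entity in OpenNLP-style <START:type> ... <END> tags."""
    if not entities:
        return sentence
    # Longest names first so "type 2 diabetes" wins over "diabetes".
    names = sorted(entities, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in names) + r")\b",
        flags=re.IGNORECASE,
    )
    return pattern.sub(
        lambda m: f"<START:{entity_type}> {m.group(0)} <END>", sentence
    )


# Hypothetical entity list and whitespace-tokenized sentence.
entities = ["diabetes", "type 2 diabetes", "asthma"]
print(tag_entities("She has type 2 diabetes and mild asthma .", entities))
```

Plain dictionary matching like this will miss spelling variants and can tag ambiguous words, so it is worth reviewing a sample of the tagged output before training.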
Hope this helps!

DBpedia NLP dataset used for named entity extraction

I went through their GitHub files as well as the official site, but I can't find the named entity tagging training corpus they used in Spotlight.
How can I find the dataset instead of a trained model?
See this link: https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki/Web-service
It explains how to set up DBpedia Lookup offline. They also provide 4 tar files, which are:
redirects_en.nt
short_abstracts_en.nt
instance_types_en.nt
article_categories_en.nt
These are supposed to be the training data for it.
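The `.nt` extension indicates N-Triples, a line-based RDF format, so these files can be inspected with a few lines of Python. Below is a minimal sketch, assuming `instance_types_en.nt` holds `rdf:type` triples whose subject and object are both IRIs. The sample lines are illustrative and not copied from the real dump, and `load_types` is a hypothetical helper.

```python
import re
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Matches N-Triples lines whose subject, predicate, and object are all IRIs.
TRIPLE = re.compile(r"^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$")


def load_types(lines):
    """Map each resource IRI to the set of type IRIs asserted for it."""
    types = defaultdict(set)
    for line in lines:
        match = TRIPLE.match(line)
        if not match:
            continue  # skips comments, blank lines, and literal-valued triples
        subject, predicate, obj = match.groups()
        if predicate == RDF_TYPE:
            types[subject].add(obj)
    return dict(types)


# Illustrative lines only; a real file would be read with open(...).
sample = [
    "# a comment line",
    "<http://dbpedia.org/resource/Berlin> <" + RDF_TYPE + "> <http://dbpedia.org/ontology/City> .",
    '<http://dbpedia.org/resource/Berlin> <http://www.w3.org/2000/01/rdf-schema#label> "Berlin"@en .',
]
print(load_types(sample))
```

For the full dumps, passing an open file object to `load_types` streams the file line by line, though the resulting dictionary may still be large.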
