Is there a way to code a product recommendation system in RStudio using SVM or KNN (without using recommenderlab)?

I want to code a product recommendation system in RStudio using KNN or SVM, but I don't want to use recommenderlab.
Is there an easy way to code it? I have variables such as name, gender, price, product type, product brand, etc.
I tried running SVM and KNN, and I got confusion matrices for the product type.
But I don't know how to get a list of recommended products for a particular user.

Related

Should the dataset be domain specific when it comes to Named Entity Recognition?

For my final-year undergraduate project, I intend to use named entity recognition to classify fiction summaries based on entities such as LOCATION, PERSON, and so on. When I was looking into datasets, I couldn't find any labelled dataset of fiction summaries.
My question is whether the training dataset for NER should be specific to the domain, in my case fiction. If not, can I use a dataset like 'conll2003', which comes from the news domain, even though I'm developing a model for fiction?
I would love replies, as I'm stuck on this and unable to proceed with my project.
Thanks in advance :)
I tried labelling an unlabelled fiction summary dataset manually, but it looks like it will take far more time than I can afford. That's why I wanted to know whether I can use labelled datasets that are not specific to the domain.

Building on existing models in spaCy

This is a question about training models in spaCy 3.x.
I couldn't find a good answer or solution on Stack Overflow, hence the query.
I am using an existing spaCy model, such as the en model, and want to add my own entities to it and train it. Since I work in the biomedical domain, these would be things like virus name, shape, length, temperature, temperature value, etc. I don't want to lose the entities already tagged by spaCy, such as organization names, countries, etc.
All suggestions are appreciated.
Thanks
There are a few ways to do that.
The best way is to train your own model separately and then combine both models in one pipeline, with one before the other. See the double NER example project for an overview of that.
It's also possible to update the pretrained NER model; see this example project. However, this isn't usually a good idea, and definitely not if you're adding completely different entities. You'll run into what's called "catastrophic forgetting", where even though you're technically updating the model, it ends up forgetting everything not represented in your current training data.

Predicting Trending Product

I am a beginner in machine learning. I want to build a model for predicting trending products. Can you please tell me what layout and which parameters I need in my dataset? Let's say I want to predict a certain product from a certain category, so I will be collecting a dataset for that category from various e-commerce sites, e.g. eBay, Amazon, etc.
Please tell me in detail.
You will need a dataset with features like:
Number of sales
Ratings
Recommendations
And many more.
This is a classification problem: you need to classify the products as trendy or not trendy. That means you will also need labels that mark each product in the data as trendy or not trendy.
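As a rough illustration of that layout, here is a minimal Python sketch with one row per product, the features as columns, and a 0/1 label. The field names (`num_sales`, `rating`, `recommendations`, `trendy`) and all values are made up for the example; they are not from any real e-commerce dataset.

```python
# Hypothetical rows: one per product. All names and numbers are invented
# purely to show the layout (features + a trendy / not-trendy label).
products = [
    {"name": "A", "num_sales": 1200, "rating": 4.6, "recommendations": 340, "trendy": 1},
    {"name": "B", "num_sales": 35, "rating": 3.1, "recommendations": 2, "trendy": 0},
    {"name": "C", "num_sales": 860, "rating": 4.2, "recommendations": 190, "trendy": 1},
    {"name": "D", "num_sales": 12, "rating": 2.8, "recommendations": 0, "trendy": 0},
]

FEATURES = ["num_sales", "rating", "recommendations"]  # model inputs
LABEL = "trendy"                                       # 1 = trendy, 0 = not trendy


def to_xy(rows, features=FEATURES, label=LABEL):
    """Split rows into a feature matrix X and a label vector y."""
    X = [[float(row[f]) for f in features] for row in rows]
    y = [int(row[label]) for row in rows]
    return X, y


X, y = to_xy(products)
```

`X` and `y` are in the shape most classification libraries expect: one feature row per product and one label per row.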

Can I predict a price based on a survey in Azure Machine Learning?

I want to predict my input price based on a list of questions/answers using Azure Machine Learning.
I built a model using "Bayesian Linear Regression", but it seems to be predicting the price based on the prices I have in my dataset and not based on the Q/A.
Am I on the wrong path, or am I missing something?
Any suggestions would be helpful.
Check that the Q/As you are using do not have missing values. If there are any missing values, use data preprocessing techniques to fill them.
What kind of answers do you have as inputs (yes/no, numeric values, different textual answers, etc.)? In my opinion, numerical values and yes/no inputs make your model more accurate.
Try different regression algorithms (https://azure.microsoft.com/en-us/documentation/articles/machine-learning-algorithm-cheat-sheet/) and check their accuracy.
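To make the preprocessing suggestions above concrete, here is a small Python sketch (outside Azure ML, standard library only) that fills missing numeric answers with the column mean and encodes yes/no answers as numbers. The survey columns (`has_garage`, `rooms`, `price`) and the values are hypothetical, and mean imputation is just one common choice, not necessarily the best one for your data. Note that `price` is used only as the label and is kept out of the features.

```python
from statistics import mean

# Hypothetical survey rows; None marks a missing answer.
survey = [
    {"has_garage": "yes", "rooms": 3, "price": 210000.0},
    {"has_garage": "no", "rooms": None, "price": 150000.0},
    {"has_garage": None, "rooms": 5, "price": 320000.0},
]


def fill_numeric(rows, column):
    """Replace missing numeric answers with the mean of the known ones."""
    known = [row[column] for row in rows if row[column] is not None]
    fallback = mean(known)
    return [row[column] if row[column] is not None else fallback for row in rows]


def encode_yes_no(rows, column, default=0.5):
    """Map yes/no to 1.0/0.0; a missing answer gets a neutral default."""
    mapping = {"yes": 1.0, "no": 0.0}
    return [mapping.get(row[column], default) for row in rows]


rooms = fill_numeric(survey, "rooms")
garage = encode_yes_no(survey, "has_garage")

# Features come only from the answers; price is the label to predict.
features = [[g, r] for g, r in zip(garage, rooms)]
labels = [row["price"] for row in survey]
```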
You need to set the features and the label properly. If you publish your experiment in the Gallery using unlisted mode and paste the link here, we can take a look.

How can I convert probability into score?

I am now working on a document recommendation program and I am kinda stuck here.
For each document, I have a score assigned according to the user's actions. Then, when a new document comes in, I need to predict how much the user will like it and rerank all the documents according to their scores. My solution is to use a threshold to divide those scores into "recommend" and "not recommend". Then naive Bayes or another classification model can either give me a label or return the probability of that label (I am using the NLTK package to do text analytics).
Am I on the right track? My question is: when I get that probability, how can I convert it into the score that I use to do the ranking? Or should I use logistic regression in scikit-learn instead?
Thanks!
It sounds like you are trying to force a ranking problem into a classification problem. What you really want to do is learn how to rank the documents given a "query".
I would suggest trying out something like the SVM-Rank algorithm. It takes as input a set of "recommended" and "not recommended" vectors and then learns how to rank them so that the recommended ones come first. There is also a simple Python tool in dlib you can use to do it. See here for an example: http://dlib.net/svm_rank.py.html
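To show the idea behind pairwise ranking, here is a simplified Python sketch. It is not dlib's API and not the actual SVM-Rank implementation; it is a plain linear ranker trained with a hinge loss on (recommended, not recommended) pairs, which is the core idea those tools build on. The two-dimensional feature vectors are hypothetical stand-ins for whatever document features you extract.

```python
import random


def train_pairwise_ranker(recommended, not_recommended,
                          epochs=200, lr=0.1, reg=0.01, seed=0):
    """Learn weights w so that w.x_rec exceeds w.x_not by a margin of 1."""
    rng = random.Random(seed)
    w = [0.0] * len(recommended[0])
    pairs = [(p, n) for p in recommended for n in not_recommended]
    for _ in range(epochs):
        rng.shuffle(pairs)
        for pos, neg in pairs:
            diff = [a - b for a, b in zip(pos, neg)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # L2 regularization: shrink the weights slightly on every step.
            w = [wi * (1.0 - lr * reg) for wi in w]
            # Hinge loss: update only when the pair is not separated enough.
            if margin < 1.0:
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w


def score(w, x):
    """Ranking score of one document: higher means rank it earlier."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical feature vectors for documents the user liked / did not like.
recommended = [[1.0, 0.2], [0.9, 0.1]]
not_recommended = [[0.2, 0.8], [0.1, 0.9]]

w = train_pairwise_ranker(recommended, not_recommended)

# Rerank new documents directly by their learned score.
new_docs = [[0.3, 0.7], [0.8, 0.3], [0.5, 0.5]]
ranked = sorted(new_docs, key=lambda doc: score(w, doc), reverse=True)
```

The learned score is used directly for sorting, so there is no need to go through a threshold or a class probability first.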
