How do recommendation systems work in production? - apache-spark

I am a newbie to this and have started learning Spark. I have a general question about how recommendation systems work in production environments, or rather how they are deployed to production.
Below is a small example of a system for e-commerce website.
I understand that once the system is built, we can feed data to the engine at the start (by running the jobs or the program/process for the engine), and it will produce results that are stored back in the database against each user. The next time the user logs in, the website can fetch that previously computed data from the database and show it as recommended items.
The confusion I have is how these systems generate outputs on the fly based on user activity. For example, if I view a video on YouTube and refresh the page, YouTube starts showing me similar videos.
So, do we have these recommendation engines always running in the background, continually updating their results based on the user's activity? How is it done so fast?

Short answer:
Inference is fast; training is the slow part.
Retraining is a tweak to the existing model, not from scratch.
Retraining is only periodic, rarely "on demand".
Long answer:
They generate this output based on a long baseline of user activity. This is not something the engine derives from your recent view: anyone viewing that video on the same day will get the same recommendations.
Some recommender systems will take your personal history into account, but this usually just means sorting the recommended list by the ratings predicted for you personally. This is done by applying a generic model to your own ratings, broken down by genre and the other characteristics of each video.
In general, updates are done once a day, and the changes are small. They don't retrain a model from scratch every time you make a request; the parameters and weights of your personal model are stored under your account. When you add a rating (not just view a video), your account gets flagged, and the model will be updated at the next convenient opportunity.
This update will begin with your current model, and will need to run for only a couple of epochs -- unless you've rated a significant number of videos with ratings that depart significantly from the previous predictions.
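As a rough illustration of that warm-start update, here is a toy matrix-factorization-style sketch, assuming the stored model parameters are a per-user latent vector plus shared item vectors (all names and numbers here are illustrative, not any real system's internals):

```python
import numpy as np

def update_user_vector(user_vec, item_vecs, ratings, lr=0.05, reg=0.01, epochs=3):
    """Warm-start update: refine one user's latent vector against their
    new ratings for a couple of epochs, leaving the shared item vectors
    untouched. `ratings` is a list of (item_index, rating) pairs."""
    v = user_vec.copy()
    for _ in range(epochs):
        for item_idx, r in ratings:
            pred = v @ item_vecs[item_idx]      # current predicted rating
            err = r - pred                      # prediction error
            v += lr * (err * item_vecs[item_idx] - reg * v)  # SGD step
    return v

# Hypothetical stored model state: 4 items, 2 latent factors
rng = np.random.default_rng(0)
item_vecs = rng.normal(size=(4, 2))
user_vec = rng.normal(size=2)

# The user just rated items 1 and 3; tweak only their vector
new_vec = update_user_vector(user_vec, item_vecs, [(1, 5.0), (3, 1.0)])
```

Because the update starts from the existing vector and only touches one user's parameters, it is cheap enough to run at the "next convenient opportunity" rather than as a full retrain.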

To add to Prune's answer, depending on the design of the system it may also be possible to take into account the user's most recent interactions.
There are two main ways of doing so:
Fold-in: your most recent interactions are used to recompute the parameters of your personal model while leaving the remainder of the model unchanged. This is usually quite fast and could be done in response to every request.
A model that takes interactions as input directly: some models can compute recommendations directly by taking a user's interactions as input, without storing user representations that can only be updated via retraining. For example, you can represent the user by simply averaging the representations of the items she most recently interacted with. Recomputing this representation can be done in response to every request.
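The averaging approach in the second bullet can be sketched in a few lines; the item names and embedding values below are made up, and in a real system the vectors would come from the trained model:

```python
import numpy as np

# Hypothetical item embedding table (item_id -> learned vector)
item_embeddings = {
    "cat_video": np.array([0.9, 0.1]),
    "dog_video": np.array([0.8, 0.3]),
    "news_clip": np.array([0.1, 0.9]),
    "kitten_compilation": np.array([0.95, 0.05]),
}

def user_vector(recent_items):
    """Represent the user as the mean of recently watched item vectors."""
    return np.mean([item_embeddings[i] for i in recent_items], axis=0)

def recommend(recent_items, k=2):
    """Score every candidate by dot product with the user vector,
    skip already-watched items, and return the top-k."""
    u = user_vector(recent_items)
    scores = {i: float(u @ v) for i, v in item_embeddings.items()
              if i not in recent_items}
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend(["cat_video", "dog_video"]))  # -> ['kitten_compilation', 'news_clip']
```

Since no per-user parameters are stored, recomputing the user vector on every request is just an average plus a few dot products, which is why this style of model can react immediately to the latest interactions.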
In fact, this last approach seems to form part of the YouTube system. You can find details in the Covington et al. Deep Neural Networks For YouTube Recommendations paper.

Related

Training process to fine-tune a LLM to answer questions regarding movies in dialogue

I'm trying to fine-tune a GPT-2 pre-trained model so that it can talk about movies in a dialogue manner (inspired by ChatGPT), and I'm not quite sure how to go about doing it.
This was my initial idea but I'm not entirely confident that it is the way to go:
Freeze most of the initial attention blocks and fine-tune the rest on lots of movie plots and information (release year, cast, etc.) in order to acquire knowledge;
Freeze most of the initial attention blocks and fine-tune on a dataset of questions-answers regarding movies (using only the questions and answers, discarding the context since the knowledge should have been acquired in step 1). Using a prompt format that will simulate the conversations, for example: <|startofuserinput|> who dies at the end of the spider-man 3? <|endofuserinput|> \n####\n Harry, Peter's friend dies at the end of spider-man 3 to help Peter. <|endofresponse|>
Use the resulting model to keep generating responses to the user, taking into account the most recent prompt as well as the conversation history.
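As a sketch of how the training strings for step 2 could be assembled, using the special tokens from the question (they would also need to be registered with the tokenizer, e.g. via `add_special_tokens`, before fine-tuning; the helper name here is hypothetical):

```python
# Special tokens from the proposed prompt format; these must also be
# added to the tokenizer's vocabulary before training.
USER_START = "<|startofuserinput|>"
USER_END = "<|endofuserinput|>"
RESP_END = "<|endofresponse|>"
SEP = "\n####\n"

def format_example(question, answer):
    """Build one training string in the prompt format described above."""
    return f"{USER_START} {question} {USER_END}{SEP}{answer} {RESP_END}"

sample = format_example(
    "Who dies at the end of Spider-Man 3?",
    "Harry, Peter's friend, dies at the end of Spider-Man 3 to help Peter.",
)
print(sample)
```

At inference time the same format would frame each new user turn (plus the truncated conversation history), with generation stopped at the end-of-response token.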
Is there anything inherently wrong with this method, any obvious flaw? If so, how should I proceed to accomplish such a task?

Recommendation System using Tensorflow.js and to show the result by Graphql

I've been thinking of an app that recommends similar users to a user based on some digitized criteria she can choose and change.
Every time the number of users increases or decreases, or a user changes her values for certain criteria, the recommendation results would change too.
What part of TensorFlow.js should I use to implement this: NLP, or simply training on datasets? (I know the question is somewhat random, but my knowledge of tfjs is extremely vague even though I've tried, sorry.)
Can I show the ever-changing results to the user via a GraphQL subscription?
I guess the basic idea is a form of collaborative filtering, but...

Can chatbots learn or unlearn while chatting with trusted users

Can chatbots like [Rasa] learn from a trusted user - new employees, product IDs, product categories or properties - or unlearn these entities when they are no longer current?
Or do I have to go through formal data collection, training sessions, and testing (confidence rates > a given ratio) before the new version can be made operational?
If you have entity values that are being checked against a shifting list of valid values, it's more scalable to check those values against a database that is always up to date (e.g. your backend systems probably have a queryable list of current employees). Then if a user provides a value that used to be valid and now isn't, it will act the same as if a user provided an invalid value in the first place.
This way, the entity extraction can stay the same regardless of whether some training examples go out of relevance -- though of course it's always good to try to keep your data up to date!
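A minimal sketch of that pattern, with a stub standing in for the real backend query (`fetch_current_employees` and its return values are hypothetical):

```python
# Validate an extracted entity against a live lookup rather than baking
# the valid values into the training data.
def fetch_current_employees():
    """Stub; in a real bot this would query the backend database."""
    return {"alice", "bob", "carol"}

def validate_employee(entity_value):
    """Return the canonical value if it is currently valid, else None."""
    current = fetch_current_employees()
    value = entity_value.strip().lower()
    return value if value in current else None

print(validate_employee("Alice"))  # still employed -> "alice"
print(validate_employee("dave"))   # never/no longer valid -> None
```

Because the list lives in the database, an employee leaving the company makes their name invalid on the very next request, with no retraining involved.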
Many chatbots do not have such a function, except advanced ones like Alexa, which gained the keyword "Remember" around 2017; the user wants Alexa to commit certain facts to memory.
IMHO such a feature is a mark of "intelligence". It is not trivial to implement in ML systems, where the coefficients of the neural network models are updated by back-propagation after passing learning examples. Rule-based systems (such as CHAT80, a QA system on geography) store their knowledge in relations that can be updated more transparently.

tensorflow for classification of strings vs elasticsearch

So, a little bit on my problem.
TL;DR
Can I use machine learning instead of Elasticsearch to find results based on the user's text input? Is it a good idea?
I am working on a car spare parts project, and we have split the car into 300 parts that we store on the database, with some data for each part (weight, availability, etc).
When the customer inputs the text of his part, we need to be able to classify the part, and map it to one in our database.
The current way it's being done is by people on our team manually mapping the customer text with the parts on our database, we want to automate that process.
We tried using MongoDB text search, but it was often inaccurate since parts have different names in different parts of the country.
So we wanted something that gives more accurate results and improves as we gather more data. We immediately considered TensorFlow; after some research and taking part of Google's Machine Learning Crash Course, I got to the point where it specified:
Models can't learn from string values, so you'll have to perform some feature engineering to convert those values to something numeric
That would be useful if we had a limited number of features as strings, but we don't know what the user will input as text.
So, my questions are:
1- Can we use Machine Learning to map text input by the user with some documents on our database?
2- If we can do that, is it a good idea to favor it over other search tools like ElasticSearch?
3- Can ElasticSearch improve its results the more data we have? How?
4- How would you go about this problem?
Note: I'd be doing this in Node.js, and since TensorFlow.js is new, I am inclined to go for other solutions, but if push comes to shove and the results are much better, I would definitely go there.
TL;DR: Yes and yes.
TS;WM:
This is a perfectly suited problem for machine learning, especially if you have a database of past customer texts that have already been mapped to parts. Ideally, you have hundreds of texts mapped to each part. If that data is present, you can design and train a network. And models can learn from string values with some engineering -- it's not that bad.
I'm not sure ElasticSearch would improve much on the network. I don't know much about auto parts trading, but as a wild guess, "the large round thingy that helps change direction" would never be mapped to "steering wheel" by ES but could be learned easily by a network - provided there are at least some examples of people using that text to specify steering wheel.
You can, but don't necessarily have to, use TensorFlow.js for your network. The AI could run on your server as a web service; you'd just send over the customer's text, and it would send back its recommendations of part SKUs and names.
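As a dependency-free baseline (not the neural network described above), even character-trigram similarity against the already-mapped texts tolerates typos and some regional naming variation; the parts, texts, and SKUs below are made up:

```python
from collections import Counter
from math import sqrt

# Hypothetical training data: past customer texts already mapped to SKUs.
labeled_texts = [
    ("steering wheel", "SKU-001"),
    ("wheel for steering the car", "SKU-001"),
    ("front brake pad", "SKU-002"),
    ("brake pads front axle", "SKU-002"),
]

def trigrams(text):
    """Bag of character trigrams, padded so word edges count too."""
    t = f"  {text.lower()}  "
    return Counter(t[i:i + 3] for i in range(len(t) - 2))

def cosine(a, b):
    """Cosine similarity between two trigram bags."""
    num = sum(a[g] * b[g] for g in a)
    return num / (sqrt(sum(v * v for v in a.values())) *
                  sqrt(sum(v * v for v in b.values())))

def predict_sku(customer_text):
    """Map free-form customer text to the SKU of the most similar past text."""
    grams = trigrams(customer_text)
    best = max(labeled_texts, key=lambda tv: cosine(grams, trigrams(tv[0])))
    return best[1]

print(predict_sku("steering whel"))  # typo-tolerant -> SKU-001
```

A learned model would go further (mapping "the large round thingy that helps change direction" to a steering wheel needs semantics, not character overlap), but a baseline like this gives you something to measure it against.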

suggest list of how-to articles based on text content

I have 20,000 messages (combination of email and live chat) between my customer and my support staff. I also have a knowledge base for my product.
Oftentimes, the questions customers ask are quite simple, and my support staff simply point them to the right knowledge base article.
What I would like to do, in order to save my support staff time, is to show my staff a list of articles that may likely be relevant based on the initial user's support request. This way they can just copy and paste the link to the help article instead of loading up the knowledge base and searching for the article manually.
I'm wondering what solutions I should investigate.
My current line of thinking is to run analysis on existing data and use a text classification approach:
For each message, see if there is a response with a link to a how-to article
If yes, extract key phrases (Microsoft Cognitive Services)
TF-IDF?
Treat each how-to as a 'classification' that belongs to sets of key phrases
Use some supervised machine learning (support vector machines, maybe) to predict which 'classification', a.k.a. how-to article, matches the key phrases extracted from a new support ticket.
Feed new responses back into the set to make the system smarter.
Not sure if I'm overcomplicating things. Any advice on how this is done would be appreciated.
PS: naive approach of just dumping 'key phrases' into search query of our knowledge base yielded poor results since the content of the help article is often different than how a person phrases their question in an email or live chat.
A simple classifier along the lines of a "spam" classifier might work, except that each FAQ would be its own target, as opposed to the single spam / not-spam target.
Most spam classifiers start off with a dictionary of words/phrases. You already have a start on this with your naive approach. However, unlike your approach, a spam classifier does much more than a text search: each word in the customer's email is given a weight, and the sum of the weights indicates whether the message is spam or not-spam. Now, extend this to as many targets as you have FAQs; that is: FAQ1 or not-FAQ1, FAQ2 or not-FAQ2, etc.
Since your support people can easily identify which of the FAQs an email requires, a supervised learning algorithm would be appropriate. To reduce the impact of any misclassification errors, consider having the application present a support person with the customer's email followed by the computer-generated response; all the support person would have to do is approve the response or modify it. Modifying a response should result in a new entry in the training set.
Support vector machines are one method of implementing machine learning. However, you are probably suggesting this solution way too early: first identify the problem, then get a simple method working as well as possible before moving to more sophisticated ones. After all, if a multi-feature spam classifier works, why invest more time and money in something else that also works?
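A toy sketch of the per-FAQ word-weight idea described above (messages and labels are made up; a real system would use proper tokenization, TF-IDF or learned weights, and far more data):

```python
from collections import Counter, defaultdict

# Toy training set: (customer message, FAQ label) pairs; in practice
# these come from past tickets your staff already answered with a link.
training = [
    ("how do i reset my password", "FAQ-password"),
    ("forgot password cannot log in", "FAQ-password"),
    ("how do i export my data to csv", "FAQ-export"),
    ("export report as csv file", "FAQ-export"),
]

# One word-frequency profile per FAQ -- the weighted "dictionary of
# words" from the spam-classifier analogy.
profiles = defaultdict(Counter)
for text, label in training:
    profiles[label].update(text.split())

def classify(message, threshold=1):
    """Score each FAQ by summed word weights; return the best FAQ,
    or None if nothing scores above the threshold (route to a human)."""
    words = message.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in profiles.items()}
    label = max(scores, key=scores.get)
    return label if scores[label] >= threshold else None

print(classify("i forgot my password"))  # -> FAQ-password
print(classify("where is your office"))  # -> None
```

The `None` branch matters operationally: anything the classifier is unsure about falls back to the current manual workflow, and the staff's answer becomes a new training example.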
Finally, depending on your system, this is something I would like to work on.
