How to use secondary user actions to improve recommendations with Spark ALS?

Is there a way to use secondary user actions derived from the user click stream to improve recommendations when using Spark MLlib ALS?
I have gone through the explicit and implicit feedback example mentioned here: https://spark.apache.org/docs/latest/mllib-collaborative-filtering.html; it uses the same ratings RDD for the train() and trainImplicit() methods.
Does this mean I need to call trainImplicit() on the same model object with an RDD(user, item, action) for each secondary user action? Or should I train multiple models, retrieve recommendations based on each action, and then combine them linearly?
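For concreteness, here is a rough sketch of the second option as I imagine it, using the RDD-based MLlib API (the blend weights, hyperparameters, and toy data are just placeholders):

    import org.apache.spark.mllib.recommendation.{ALS, Rating}

    // Hypothetical per-action RDDs built from the click stream
    // (an existing SparkContext sc is assumed).
    val buys  = sc.parallelize(Seq(Rating(1, 101, 1.0), Rating(2, 102, 2.0)))
    val views = sc.parallelize(Seq(Rating(1, 102, 5.0), Rating(2, 101, 3.0)))

    // Option 2: one implicit-feedback model per action
    // (rank = 10, iterations = 10, lambda = 0.01, alpha = 40.0 are placeholders).
    val buyModel  = ALS.trainImplicit(buys, 10, 10, 0.01, 40.0)
    val viewModel = ALS.trainImplicit(views, 10, 10, 0.01, 40.0)

    // ...then combine the scores linearly, weighting buys above views.
    def blendedScore(user: Int, item: Int): Double =
      0.7 * buyModel.predict(user, item) + 0.3 * viewModel.predict(user, item)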
For additional context, the crux of the question is whether Spark ALS can model secondary actions the way Mahout's spark-itemsimilarity job does. Any pointers would help.

Disclaimer: I work with Mahout's Spark Item Similarity.
ALS does not work well for multiple actions in general. First, an illustration. The way we consume multiple actions in ALS is to weight one above the other, for instance buy = 5, view = 3. ALS was designed in the days when ratings seemed important and predicting them was the question. We now know that ranking is more important. In any case, ALS uses predicted ratings/weights to rank results. This means that a view really tells ALS nothing, because what does a rating of 3 mean? Like? Dislike? ALS tries to get around this by adding a regularization parameter, which will help in deciding whether a 3 is a like or not.
But the problem is more fundamental than that; it is one of user intent. When a user views a product (continuing the e-commerce example above), how much "buy" intent is involved? From my own experience there may be none or there may be a lot: the product was new, or had a flashy image or other clickbait; or I'm shopping and look at 10 things before buying. I once tested this with a large e-commerce dataset and found no combination of regularization parameter (used with ALS trainImplicit) and action weights that would beat the offline precision of "buy" events used alone.
So if you are using ALS, check your results before assuming that combining different events will help. Using two models with ALS doesn't solve the problem either: from buy events you are recommending that a person buy something; from views (or any secondary dataset) you are recommending that a person view something. The fundamental problem of intent is not solved. A linear combination of recs still mixes the intents and may very well lead to decreased quality.
What Mahout's Spark Item Similarity does is correlate views with buys--actually, it correlates a primary action, one where you are clear about user intent, with other actions or information about the user. It builds a correlation matrix that in effect scrubs out the views that did not correlate with buys. Only then can we use the view data. This is a very powerful idea because now almost any user attribute, or action (virtually the entire clickstream), may be used in making recs, since the correlation is always tested. Often there is little correlation, but that's OK; removing such data from the calculation is an optimization, since it would add very little to the recs.
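For intuition, here is a minimal self-contained sketch (in Scala) of the statistic behind that correlation test, Dunning's log-likelihood ratio (LLR), which Mahout uses for the scrubbing; the counts and items are invented:

    // xLogX(0) is defined as 0 so empty cells don't blow up the sum.
    def xLogX(x: Long): Double = if (x == 0L) 0.0 else x * math.log(x.toDouble)

    // Unnormalized Shannon entropy of a set of counts.
    def entropy(counts: Long*): Double = xLogX(counts.sum) - counts.map(xLogX).sum

    // LLR for the 2x2 contingency table of one (secondary item, primary item)
    // pair: k11 = users who did both, k12 = viewed only, k21 = bought only,
    // k22 = did neither.
    def llr(k11: Long, k12: Long, k21: Long, k22: Long): Double = {
      val rowEntropy    = entropy(k11 + k12, k21 + k22)
      val columnEntropy = entropy(k11 + k21, k12 + k22)
      val matrixEntropy = entropy(k11, k12, k21, k22)
      2.0 * (rowEntropy + columnEntropy - matrixEntropy)
    }

    // Invented counts: of 10000 users, 100 viewed item X, 120 bought item Y,
    // and 80 did both. A high score keeps "view X" as evidence for "buy Y".
    println(llr(80, 20, 40, 9860))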
BTW, if you find integration of Mahout's Spark Item Similarity daunting compared to using MLlib ALS, I'm about to donate an end-to-end implementation as a template for PredictionIO, all of which is Apache-licensed open source.

Related

What are the best metrics for Multi-Object Tracking (MOT) evaluation and why?

I want to compare multiple computer vision Multi-Object Tracking (MOT) methods on my own dataset, so first I want to choose the best metrics for this task. I have carried out some research in the scientific literature and have come to the conclusion that there are three main metric sets:
Metrics from "Tracking of Multiple, Partially Occluded Humans based on Static Body Part Detection"
CLEAR MOT metrics
ID scores
Therefore, I wonder which of the above metrics I should attach the greatest importance to.
I would also like to ask whether anyone has encountered a similar issue and has any thoughts on this topic that could justify and help me choose the best metrics for the above task.
I know this is old, but I see nobody mentioning HOTA (https://arxiv.org/pdf/2009.07736.pdf). This metric has become the new standard for multi-object tracking, as can be seen in the latest SOTA tracking research: https://arxiv.org/abs/2202.13514 and https://arxiv.org/pdf/2110.06864.pdf
The reason for using a metric other than MOTA or IDF1 is that they overemphasize detection and association, respectively. HOTA explicitly measures both types of error and combines them in a balanced way. HOTA also incorporates measuring the localization accuracy of tracking results, which isn't present in either MOTA or IDF1.
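To make that contrast concrete, here is a toy sketch (Scala) of how the headline numbers are assembled from error counts. It is simplified: HOTA is shown at a single localization threshold, while the real metric also averages over thresholds; all counts are invented:

    // MOTA folds detection errors (FN, FP) and association errors (ID
    // switches) into a single number relative to the ground-truth count.
    def mota(fn: Long, fp: Long, idsw: Long, gt: Long): Double =
      1.0 - (fn + fp + idsw).toDouble / gt

    // Detection accuracy at the chosen localization threshold.
    def detA(tp: Long, fn: Long, fp: Long): Double = tp.toDouble / (tp + fn + fp)

    // HOTA (at one threshold) is the geometric mean of the detection and
    // association accuracies, so neither error type can dominate the score.
    def hota(detAcc: Double, assAcc: Double): Double = math.sqrt(detAcc * assAcc)

    // Invented counts for illustration.
    println(mota(fn = 500, fp = 300, idsw = 50, gt = 10000)) // 0.915
    println(hota(detA(9500, 500, 300), 0.80))                // ~0.859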
You can refer to the metrics used in the MOT Challenge.
Here are the results for the MOT 2020 Challenge, and they have included the metrics used here:
https://motchallenge.net/results/MOT20/
Based on the MOT20 paper, they say in Section 4.1.7 (page 7):
As we have seen in this section, there are a number of reasonable performance measures to assess the quality of a tracking system, which makes it rather difficult to reduce the evaluation to one single number. To nevertheless give an intuition on how each tracker performs compared to its competitors, we compute and show the average rank for each one by ranking all trackers according to each metric and then averaging across all performance measures.
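A toy sketch (Scala) of that average-rank aggregation; the tracker names and scores are invented, and higher is assumed better for both metrics:

    // Rank every tracker under each metric, then average the ranks.
    val scores = Map(
      "trackerA" -> Map("MOTA" -> 0.62, "IDF1" -> 0.71),
      "trackerB" -> Map("MOTA" -> 0.58, "IDF1" -> 0.66)
    )
    val metrics = Seq("MOTA", "IDF1")
    val avgRank = scores.keys.map { t =>
      val ranks = metrics.map { m =>
        // 1-based rank of tracker t when sorted by metric m, descending.
        scores.toSeq.sortBy { case (_, s) => -s(m) }.indexWhere(_._1 == t) + 1
      }
      t -> ranks.sum.toDouble / ranks.size
    }.toMap
    // avgRank: trackerA -> 1.0, trackerB -> 2.0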
The metrics you choose should relate to your goal for the tracking task. For example, if your goal is tracking the people inside a scene, you want the ID-switch metric to be very low, and so on.
You should find the metrics related to your goals.

How to use Spark ALS for multi-behavior implicit-feedback recommendation

I want to use Spark ALS for multi-behavior implicit-feedback recommendation. There are several kinds of implicit user-behavior data, such as browses, carts, deals, etc.
I have checked numerous online sources for ALS implicit-feedback recommendation, but almost all of them utilized only a single source of data; in the shopping case, the deal data.
I wonder whether only the deal data is needed, or whether utilizing all kinds of data gives better results.
There is no general purpose, principled way to use ALS with multiple behaviors. Sometimes different behaviors are used to vary implicit ratings -- for example, viewing an item might be worth 0.1, viewing in multiple sessions might be worth 0.3, putting it in a cart 0.5, and a purchase 1.0. But this feels a bit hacky, and doesn't readily provide a way to exploit all the data you might have.
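As a concrete illustration of that (admittedly hacky) weighting scheme, one could collapse the event log into a single implicit-ratings RDD before training; the weights, column layout, and hyperparameters below are assumptions, and an existing SparkContext sc is assumed:

    import org.apache.spark.mllib.recommendation.{ALS, Rating}

    // Hypothetical event log: (userId, itemId, behavior).
    val events = sc.parallelize(Seq(
      (1, 101, "browse"), (1, 101, "cart"), (2, 101, "deal"), (2, 102, "browse")
    ))

    // Map each behavior to a weight and sum per (user, item),
    // so repeated and stronger signals raise the implicit rating.
    val weights = Map("browse" -> 0.1, "cart" -> 0.5, "deal" -> 1.0)
    val ratings = events
      .map { case (user, item, behavior) => ((user, item), weights(behavior)) }
      .reduceByKey(_ + _)
      .map { case ((user, item), w) => Rating(user, item, w) }

    // rank = 10, iterations = 10, lambda = 0.01, alpha = 40.0: all placeholders.
    val model = ALS.trainImplicit(ratings, 10, 10, 0.01, 40.0)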
For a more principled approach that scales to handling lots of different features, I would take a look at the Universal Recommender. Disclaimer: I've never used it, I just think it sounds promising.
Yes, you had better use all the deal data and user data. You use ALS to obtain user vectors and deal vectors, then compute the similarity between deals and users; if a user or deal has no vector, we can't get the similarity for the next recommendation.
I ran a test of ALS, using the similarity of users and deals to train my model, and it gave me a big surprise. The AUC was as follows:
2018-06-05 21:25:28,138 INFO [21:25:28] [58] test-auc:0.968764 train-auc:0.972966
2018-06-05 21:25:28,442 INFO [21:25:28] [59] test-auc:0.968865 train-auc:0.973075
That is because I used all the deal and user information to train the model. But the RMSE is 3.6; maybe I should tune my parameters.

Spark Item Similarity Interpretation (Cross-Similarity and Similarity)

I've been using Spark Item Similarity through mahout by following the steps in this article:
https://mahout.apache.org/users/algorithms/intro-cooccurrence-spark.html
I was able to clean my data, set up a local-only Spark/Hadoop node, and all that.
Now, my question lies more in the interpretation of the matrices. I've tried some Google queries with limited success.
I'm creating a multi-modal recommender - and one of my datasets is very similar to the Mahout example.
Example input:
Customer ActionName Product
11064612 view 241505
11086047 purchase 110915
11121878 view CERT_DL
11149030 purchase CERT_FS
11104130 view 111401
The output of mahout is two matrices: a similarity matrix and a cross-similarity matrix.
This is my similarity matrix (I assume mahout uses my "filter1", purchases):
791207-WP 791520-WP:11.350536461453885 791520:9.547158147208393 76130142:7.938639976084232 711215:7.0641921646893024 751309:6.805891904514283
So how would I interpret this? If someone purchased 791207-WP they could be interested in 791520-WP? (so I'd use the left part against purchases of a customer and rank products in the right part?).
The row for 791520-WP looks like this:
791520-WP 76151220:18.954662238247693 791604-WP:13.951210170984268
So, in theory, I'd recommend 76151220 to someone who bought 791520-WP, correct?
Part 2 of the question is interpreting the cross-similarity matrix. Remember my filter2 is "views".
How would I interpret this:
790907 76120956:14.2824428207241 791500-LXQ2:13.864741460885853 190907:10.735807818360627
I take this matrix as "someone who visited the 76120956 web page ended up purchasing 790907". So I should promote 790907 to customers who bought 76120956 and maybe even add a link between these 2 products on our site, for example.
Or is it "people who visited the webpage of 790907 ended up buying 76120956"?
My plan is not to use these as-is. I'll still use RowSimilarity and different sources to rank products - but I'm missing the basic interpretation of the outputs from mahout.
If you know of any documentation that clarifies this, that would be a great asset to have.
Thank you.
In both cases the matrix is telling you that the item-id key is similar to the listed items, by the LLR value attached to each similar item. Similar in the sense that similar users purchased the items. In the second case it is saying that similar people viewed the items, and the view also appears to have led to a purchase of the same item.
Cooccurrence works on purchases alone; cross-occurrence adds a check to make sure a view also correlated with a purchase. This allows you to use both for recommendations.
The output is generally meant to be used with a search engine: you would use a user's history of purchases and views as a two-field query against the matrices, one action type per field.
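As a toy sketch of that pattern (the case classes and scoring below are my own illustration, not Mahout output; a real deployment would index the matrix rows in a search engine and let its relevance scoring do this work):

    // One indexed document per item; each field holds one matrix row.
    case class ItemDoc(itemId: String,
                       similarByPurchase: Seq[String], // similarity-matrix row
                       similarByView: Seq[String])     // cross-similarity row

    // A user's history becomes a two-field query: purchases are matched
    // against the similarity field, views against the cross-similarity field.
    case class UserHistory(purchases: Seq[String], views: Seq[String])

    // Toy scorer: count overlaps per field (a search engine would use TF-IDF).
    def score(doc: ItemDoc, user: UserHistory): Int =
      user.purchases.count(doc.similarByPurchase.contains) +
      user.views.count(doc.similarByView.contains)

    // Recommend the highest-scoring items the user hasn't bought yet.
    def recommend(docs: Seq[ItemDoc], user: UserHistory, k: Int): Seq[String] =
      docs.filterNot(d => user.purchases.contains(d.itemId))
        .sortBy(d => -score(d, user)).take(k).map(_.itemId)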
There are analogous methods to find item-based recommendations.
Better yet, use something like the Universal Recommender here: actionml.com/docs/ur with PredictionIO for an end-to-end system.

Content based recommendation in scale

This question is probably very repeated in the blogging and Q&A websites but I couldn't find any concrete answer yet.
I am trying to build a recommendation system for customers using only their purchase history.
Let's say my application has n products.
Compute item similarities for all n products based on their attributes (like country, type, price).
When a user needs recommendations, loop over the products p previously purchased by user u and fetch the similar products (similarity having been computed in the previous step).
If I am right, we call this content-based recommendation, as opposed to collaborative filtering, since it doesn't involve co-occurrence of items or user preferences for an item.
My problem is multi-fold:
Is there any existing scalable ML platform that addresses content-based recommendation? (I am fine with adopting different technologies/languages.)
Is there a way to tweak Mahout to get this result?
Is classification a way to handle content based recommendation?
Is it something that a graph database good at solving?
Note: I looked at Mahout (since I am familiar with Java, and Mahout apparently utilizes Hadoop for distributed processing) for doing this at scale, with the advantage of having well-tested ML algorithms.
Your help is appreciated. Any examples would be really great. Thanks.
The so-called item-item recommenders are natural candidates for precomputing the similarities, because the attributes of the items rarely change. I would suggest you precompute the item similarity between each pair of items, and perhaps store the top K for each item; if you have enough resources you could load the similarity matrix into main memory for real-time recommendation.
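A minimal sketch (Scala) of that precompute step, assuming items are described by numeric attribute vectors; the cosine measure and the value of K are illustrative choices:

    // Precompute content-based item-item similarity, keeping top K per item.
    type Vec = Array[Double]

    def cosine(a: Vec, b: Vec): Double = {
      val dot  = a.zip(b).map { case (x, y) => x * y }.sum
      val norm = math.sqrt(a.map(x => x * x).sum) * math.sqrt(b.map(x => x * x).sum)
      if (norm == 0.0) 0.0 else dot / norm
    }

    // For each item, score every other item and keep the K most similar;
    // the result can be held in main memory for real-time lookups.
    def topK(items: Map[String, Vec], k: Int): Map[String, Seq[(String, Double)]] =
      items.map { case (id, v) =>
        val neighbors = items.toSeq
          .collect { case (other, w) if other != id => (other, cosine(v, w)) }
          .sortBy { case (_, sim) => -sim }
          .take(k)
        id -> neighbors
      }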
Check out my answer to this question for a way to do this in Mahout: Does Mahout provide a way to determine similarity between content (for content-based recommendations)?
The example shows how to compute the textual similarity between the items, and then load the precomputed values into main memory.
For a performance comparison of different data structures to hold the values, check out this question: Mahout precomputed Item-item similarity - slow recommendation

Topic modeling using mallet

I'm trying to use topic modeling with Mallet but have a question.
How do I know when I need to rebuild the model? For instance, I have a certain number of documents crawled from the web; using the topic modeling provided by Mallet, I can create the model and infer new documents with it. But over time, as new data is crawled, new subjects may appear. In that case, how do I know whether I should rebuild the model from scratch on all the data collected so far?
I was thinking of doing so for the documents I crawl each month. Can someone please advise?
Also, is topic modeling more suitable for text with a fixed number of topics (the input parameter k, the number of topics)? If not, how do I determine what number to use?
The answers to your questions depend in large part on the kind of data you're working with and the size of the corpus.
Regarding frequency, I'm afraid you'll just have to estimate how often your data changes in a meaningful way and remodel at that rate. You could start with a week and see if the new data lead to a significantly different model. If not, try two weeks and so on.
The number of topics you select is determined by what you're looking for in the model. The higher the number, the more fine-grained the results. If you want a broad overview of what's in your corpus, you could select, say, 10 topics. For a closer look, you could use 200 or some other suitably high number.
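For reference, training with a chosen k through Mallet's Java API looks roughly like this (sketched in Scala; the minimal two-pipe pipeline, the alpha/beta priors, and the iteration count are common defaults, not tuned values):

    import cc.mallet.pipe.{CharSequence2TokenSequence, Pipe, SerialPipes, TokenSequence2FeatureSequence}
    import cc.mallet.pipe.iterator.StringArrayIterator
    import cc.mallet.topics.ParallelTopicModel
    import cc.mallet.types.InstanceList

    // Turn raw document strings into Mallet feature sequences.
    val docs: Array[String] = Array("first crawled document", "second crawled document")
    val pipes = new SerialPipes(Array[Pipe](
      new CharSequence2TokenSequence(), new TokenSequence2FeatureSequence()))
    val instances = new InstanceList(pipes)
    instances.addThruPipe(new StringArrayIterator(docs))

    // k = 10 for a broad overview; rerun with a larger k for finer topics.
    val numTopics = 10
    val model = new ParallelTopicModel(numTopics, 50.0, 0.01) // alphaSum, beta
    model.addInstances(instances)
    model.setNumIterations(1000)
    model.estimate()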
I hope that helps.
