Using edge features for GCN in DGL - conv-neural-network

I'm trying to implement a graph convolutional network (GCN) in the Deep Graph Library (DGL) package for Python. In many papers, edges have discrete features, and each possible value is associated with a different weight matrix or set of weight matrices. An example would be here. Is anyone familiar with how to implement a model like this in DGL? The DGL team's example of GCNs for graph classification doesn't use edge features, and neither does another example I found online.

Not sure whether the question still needs to be answered, but I guess it boils down to how to implement models like R-GCN or HGT with DGL. Some of these layers come built-in with DGL here. But it is also easy to implement your own computations. The following explanation only makes sense if you know the basic computational process of DGL during a forward pass through a graph layer (message, reduce, apply_node); if not, DGL has good tutorials on that as well. To extend the usual graph computation to, for example, edges of different types, you need to create a heterogeneous graph object and call multi_update_all on that graph object. You can pass a dictionary to that function which specifies the computation per edge type.
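To make the per-edge-type idea concrete without depending on DGL's API, here is a minimal NumPy sketch of the underlying math: an R-GCN-style layer where each discrete edge type selects its own weight matrix. All names and sizes here are illustrative, not DGL code — in DGL you would express the same thing as per-edge-type message functions passed to multi_update_all.

```python
import numpy as np

rng = np.random.default_rng(0)

num_nodes, feat_dim, num_edge_types = 4, 3, 2
h = rng.standard_normal((num_nodes, feat_dim))           # node features
# One weight matrix per discrete edge type (the R-GCN idea)
W = rng.standard_normal((num_edge_types, feat_dim, feat_dim))

# Edge list as (src, dst, edge_type) triples
edges = [(0, 1, 0), (2, 1, 1), (3, 1, 0), (1, 2, 1)]

def rgcn_layer(h, W, edges):
    """For each destination node, sum W[etype] @ h[src] over incoming edges."""
    out = np.zeros_like(h)
    for src, dst, etype in edges:
        out[dst] += W[etype] @ h[src]
    return out

h_new = rgcn_layer(h, W, edges)
print(h_new.shape)  # (4, 3)
```

In DGL the loop over edges is replaced by vectorized message passing, with one (message, reduce) pair per edge type in the dictionary given to multi_update_all.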

Related

How to batch a nested list of graphs in pytorch geometric

I am currently training a model which is a mix of graph neural networks and LSTM. However, that means for each of my training samples, I need to pass in a list of graphs. The current batch class in torch_geometric supports batching with torch_geometric.data.Batch.from_data_list(), but this only allows one graph for each data point. How else can I go about batching the graphs?
Use diagonal batching:
https://pytorch-geometric.readthedocs.io/en/latest/notes/batching.html
Simply, you will put all the graphs as subgraphs into one big graph. All the subgraphs will be isolated.
See the example from TUDataset:
https://colab.research.google.com/drive/1I8a0DfQ3fI7Njc62__mVXUlcAleUclnb?usp=sharing
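The mechanics of diagonal batching are easy to see without PyG: merge the graphs into one big graph by offsetting node indices, and keep a vector recording which graph each node came from. The helper below is an illustrative stand-in for what Batch.from_data_list does internally, not PyG code; for a nested list of graphs you can flatten it and rely on this batch vector to recover the grouping.

```python
import numpy as np

def batch_graphs(edge_indices, num_nodes_list):
    """edge_indices: list of (2, E_i) arrays; returns merged edge index + batch vector."""
    offset, merged, batch = 0, [], []
    for gid, (ei, n) in enumerate(zip(edge_indices, num_nodes_list)):
        merged.append(ei + offset)      # shift node ids so subgraphs stay isolated
        batch.extend([gid] * n)         # which graph each node belongs to
        offset += n
    return np.concatenate(merged, axis=1), np.array(batch)

g1 = np.array([[0, 1], [1, 2]])   # 3 nodes, edges 0->1 and 1->2
g2 = np.array([[0], [1]])         # 2 nodes, edge 0->1
ei, batch = batch_graphs([g1, g2], [3, 2])
print(ei)     # [[0 1 3]
              #  [1 2 4]]
print(batch)  # [0 0 0 1 1]
```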

pyspark ALS Collaborative Filtering - generating explanation of predictions

The pyspark ml recommendation package includes an ALS implementation based on the paper by Hu, Koren and Volinsky: http://yifanhu.net/PUB/cf.pdf for implicit feedback datasets.
https://spark.apache.org/docs/2.3.0/ml-collaborative-filtering.html
https://spark.apache.org/docs/2.3.1/api/python/_modules/pyspark/mllib/recommendation.html
Does the implementation provide a built-in function call to generate, for a prediction for a given user and item, p_ui, the linear decomposition into contributing past actions as represented by equation (7) in the HKV article?
I.e. is there a built-in way to retrieve the s^u_ij and c_uj matrices of equation (7)?
Looking over the API web docs, I don't see it. Hoping others may have come across it.
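I'm not aware of a built-in call either, but equation (7) can be recomputed from the item-factor matrix Y that Spark does expose. In HKV, p_ui = sum_j s^u_ij * c_uj over the user's past items j, with s^u_ij = y_i^T (Y^T C^u Y + lam*I)^-1 y_j. The following is a toy NumPy sketch of that decomposition (all data and names here are illustrative, not pyspark API):

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, rank, lam = 5, 3, 0.1
Y = rng.standard_normal((n_items, rank))       # item factors, as produced by ALS
c_u = np.array([1.0, 2.5, 1.0, 4.0, 1.0])      # user u's confidences c_uj
p_u = (c_u > 1.0).astype(float)                # binary preference: items u acted on

Cu = np.diag(c_u)
Wu = np.linalg.inv(Y.T @ Cu @ Y + lam * np.eye(rank))   # user-specific "metric"

i = 0                                          # item whose prediction we explain
s_ui = Y @ Wu @ Y[i]                           # similarities s^u_ij for all j
contributions = s_ui * c_u * p_u               # contribution of each past action
p_hat = contributions.sum()                    # equation (7) prediction

# Sanity check: same prediction via the closed-form user factor x_u
x_u = Wu @ Y.T @ Cu @ p_u
assert np.isclose(p_hat, x_u @ Y[i])
```

The per-item entries of `contributions` are the explanation: how much each past action pushed the prediction up or down.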

Is there any support for BiPlots when using PCA in spark.ml?

I have used k-means and PCA to attempt to visualise high-dimensional k-means clusters in two dimensions, but have lost the meaning of the clusters in 2D.
Is there any way to project the features onto the 2D plot to recover some interpretability?
Any non-linear dimensionality reduction method might work better (also called "manifold learning", e.g. see sklearn's suite). The t-SNE method is generally quite popular for this.
However, these do not take your cluster labels into account. If you wanted to do that (although generally you do not), you could add a penalty to the manifold learning technique that forces same-cluster points to be close together, for example.
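A common pattern, sketched below with synthetic data, is to cluster in the original high-dimensional space and use t-SNE only for the 2-D picture, colouring points by the k-means labels (this is sklearn rather than spark.ml, which has no t-SNE; parameters here are illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Three well-separated synthetic clusters in 10 dimensions
X = np.vstack([rng.normal(loc=c, size=(30, 10)) for c in (0.0, 5.0, 10.0)])

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
emb = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(X)

print(emb.shape)  # (90, 2)
# Then e.g.: plt.scatter(emb[:, 0], emb[:, 1], c=labels)
```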

Is it possible to achieve something similar to word2vec using a graphdb?

Otherwise said replace eigen vectors with pattern matching and graph traversal and emulate dimension reduction?
I mean that given a semantic graph of english words compute something similar to:
king - man = queen
Which means that I can subtract from a graph a subgraph and score the resulting subgraph given a metric.
I don't expect that this will be a single neo4j or gremlin query. I'm interested in the underlying mechanic involved in reasoning at the same time globally and locally over a graph database.
I think it's important to remember the difference between using graph databases as a storage solution and using machine learning to extract connected graphs as vectors that represent features, which are then used to train an ML model proper.
The difference is that you can structure your data in such a way that makes it easier to find patterns that are suitable for creating a machine learning model. It's certainly a good idea to use Neo4j to do this but it's not something that comes out of the box. I've created a plugin to Neo4j that will extract hierarchical pattern matches from text using a genetic algorithm that I thought up. You can take a look here: http://www.kennybastani.com/2014/08/using-graph-database-for-deep-learning-text-classification.html
You can then use the resulting data to construct a word2vec model.
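For reference, the "king - man = queen" arithmetic the question alludes to is usually stated as king - man + woman ≈ queen in embedding space. A toy illustration with hand-made vectors (a "royalty" axis and a "gender" axis, not vectors learned from a graph or text):

```python
import numpy as np

# Hand-crafted 2-D "embeddings": [royalty, maleness]
vecs = {
    "king":     np.array([1.0,  1.0]),
    "queen":    np.array([1.0, -1.0]),
    "prince":   np.array([0.5,  1.0]),
    "princess": np.array([0.5, -1.0]),
    "man":      np.array([0.0,  1.0]),
    "woman":    np.array([0.0, -1.0]),
}

def nearest(v, exclude):
    """Word with the highest cosine similarity to v, excluding the query words."""
    return max((w for w in vecs if w not in exclude),
               key=lambda w: v @ vecs[w] / (np.linalg.norm(v) * np.linalg.norm(vecs[w])))

target = vecs["king"] - vecs["man"] + vecs["woman"]
result = nearest(target, exclude={"king", "man", "woman"})
print(result)  # queen
```

Whatever the graph-based pipeline produces, this vector arithmetic is the property it would need to reproduce.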

Setting feature weights for KNN

I am working with sklearn's implementation of KNN. While my input data has about 20 features, I believe some of the features are more important than others. Is there a way to:
1. set the feature weights for each feature when "training" the KNN learner.
2. learn what the optimal weight values are, with or without pre-processing of the data.
On a related note, I understand that generally KNN does not require training, but since sklearn implements it using KD-trees, the tree must be generated from the training data. However, this sounds like it's turning KNN into a binary tree problem. Is that the case?
Thanks.
kNN is simply based on a distance function. When you say "feature two is more important than others", it usually means that a difference in feature two is worth, say, 10x a difference in the other coordinates. A simple way to achieve this is by multiplying coordinate #2 by its weight. So you put into the tree not the original coordinates but the coordinates multiplied by their respective weights.
In case your features are combinations of the coordinates, you might need to apply an appropriate matrix transform to your coordinates before applying the weights; see PCA (principal component analysis). PCA is likely to help you with question 2.
The answer to question two is called "metric learning" and is currently not implemented in scikit-learn. Using the popular Mahalanobis distance amounts to rescaling the data using StandardScaler. Ideally you would want your metric to take the labels into account.
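The coordinate-scaling trick from the first answer can be verified in a few lines: multiplying each feature by a weight before building the index is exactly equivalent to using a weighted Euclidean distance. The weights below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))      # 50 training points, 3 features
query = rng.standard_normal(3)
w = np.array([1.0, 10.0, 1.0])        # feature 2 counts 10x as much

def knn(X, q, k):
    """Indices of the k nearest neighbours of q under plain Euclidean distance."""
    d = np.linalg.norm(X - q, axis=1)
    return np.argsort(d)[:k]

# Weighted distance computed directly...
d_weighted = np.sqrt((((X - query) * w) ** 2).sum(axis=1))
# ...gives the same neighbours as plain kNN on pre-scaled coordinates
idx = knn(X * w, query * w, k=5)
assert np.array_equal(idx, np.argsort(d_weighted)[:5])
```

With sklearn you would apply the same scaling to the data before fitting KNeighborsClassifier, which then builds its KD-tree on the scaled coordinates.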
