Plot a graph with ipycytoscape, problems with number of nodes - layout

I am trying to visualize a graph using ipycytoscape, and this is what I have so far:
import networkx as nx
import ipycytoscape

G2 = nx.Graph()
G2.add_nodes_from([*'ABCDE'])
# add_edges_from also registers node 'F', which was not in the initial node list
G2.add_edges_from([('A', 'B'), ('B', 'C'), ('D', 'E'), ('F', 'A'), ('C', 'A')])

mynodes = G2.nodes
myedges = G2.edges
# convert the networkx graph into the JSON structure Cytoscape expects
nodes = [{"data": {"id": n}} for n in mynodes]
edges = [{"data": {"source": n0, "target": n1}} for n0, n1 in myedges]
JSON_graph_data = {"nodes": nodes, "edges": edges}

cytoscapeobj = ipycytoscape.CytoscapeWidget()
cytoscapeobj.graph.add_graph_from_json(JSON_graph_data)
cytoscapeobj
It works so far; G2 is a very small graph for training purposes.
But what could be the reason that, when I pass a bigger one (40 nodes, several hundred edges), I don't get an error but nothing is displayed?

This seems like a bug. Sorry I missed your question, I'll be more active on Stack Overflow from now on.
If you're still interested please open an issue on: https://github.com/QuantStack/ipycytoscape/
Or join the chat: https://gitter.im/QuantStack/Lobby
I'll be happy to help!
Maybe the error won't even be reproducible anymore since we're moving kind of fast with the API.
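In the meantime, one workaround worth trying, sketched against the API at the time of writing (so set_layout and add_graph_from_networkx may have changed since): pass the networkx graph to the widget directly and request an explicit layout, since the title of this question already points at layout as a suspect:
cytoscapeobj = ipycytoscape.CytoscapeWidget()
cytoscapeobj.graph.add_graph_from_networkx(G2)   # skips the manual JSON conversion
cytoscapeobj.set_layout(name='cola')             # 'cola' is a force-directed layout
cytoscapeobj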

Related

Blender Geometry Nodes - remove double instance and end pieces

I'm very new to blender and started testing out geometry nodes following a tutorial.
(Donuts and Gummies)
I am now attempting a small project to polish my skills.
I want to have only one instance of the traverse on the long shaft every n times. Currently, I have two instances in the middle and four at the ends. (see image below)
My objectives are:
Get one instance every n times on the shaft
Remove the end pieces completely.
I managed to solve my first issue by merging the edges like so.
I still need assistance with the second issue, please.
My second issue was solved using the Trim Curve node.
I had to do some calculations to plug in the values, but that was the easy part.
Hope someone finds this useful.

How to write a multi-threaded program for plotting interactively?

I am trying to read data online from a website every second using the following code, and then plot the result in real time. That is, I want to append the new last_price to the previous plot every second and update it.
import time
import requests

for i in range(2600):
    time.sleep(1)
    with requests.Session() as s:
        data = {'ContractCode': 'SAFTR98'}
        r = s.post('http://cdn.ime.co.ir/Services/Fut_Live_Loc_Service.asmx/GetContractInfo', json=data).json()
        for key, value in r.items():
            print(r[key]['ContractCode'])
            last_prices = r[key]['LastTradedPrice']
I used animation.FuncAnimation, but it didn't work: it plots all the results only after the 2600 iterations are done or after I stop the program. So I thought maybe I could do this with multi-threading, but I don't know exactly how to use it. I searched for examples but couldn't understand them or how to map them onto my problem.
This is not a duplicate question. In the linked question I tried to solve the problem using animations; here I am trying to find a new way using multi-threading.
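Since no answer is posted here, below is a minimal sketch of one common pattern for this kind of task. The endpoint, ContractCode, and response shape are taken from the question itself; the queue-based design is an assumption about what fits, not the only option. A background thread fetches prices into a thread-safe queue, and FuncAnimation drains the queue and redraws on the main thread, so the plot updates while the loop is still running:
import threading
import queue
import time
import requests
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

price_queue = queue.Queue()

def fetch_prices():
    # Worker thread: poll the endpoint once per second and queue each price.
    with requests.Session() as s:
        for _ in range(2600):
            data = {'ContractCode': 'SAFTR98'}
            r = s.post('http://cdn.ime.co.ir/Services/Fut_Live_Loc_Service.asmx/GetContractInfo', json=data).json()
            for key in r:
                price_queue.put(r[key]['LastTradedPrice'])
            time.sleep(1)

prices = []
fig, ax = plt.subplots()
line, = ax.plot([], [])

def update(frame):
    # Main thread: drain whatever the worker has queued and redraw the line.
    while not price_queue.empty():
        prices.append(price_queue.get())
    line.set_data(range(len(prices)), prices)
    ax.relim()
    ax.autoscale_view()
    return line,

threading.Thread(target=fetch_prices, daemon=True).start()
ani = FuncAnimation(fig, update, interval=1000, cache_frame_data=False)
plt.show()
GUI toolkits generally require that all drawing happen on the main thread, which is why the worker only produces data and never touches the figure.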

n columns of data frame discarded

I am using the spatstat package in R to read my road network shapefile, which also has some additional attributes.
When I read my shapefiles and convert them with as.psp (before making them an object of class linnet), I get the warning "n columns of data frame discarded", and I do not understand why. The columns being discarded are my covariates for the linear network, so I cannot bring them into my analysis.
Could someone give me an idea why this happens and how to correct it?
Why it happens:
I would guess that we (spatstat authors) need to spend a bit of time discussing with the maptools guys how to handle the additional info in the SpatialLinesDataFrame object, and we haven't done that yet.
How to correct it:
You have to write some code on your own at the moment. You can extract the data from the SpatialLinesDataFrame object by accessing the @data slot. Please provide specific data and say how you need to use the additional data (what format you need it in) if you need more help. You can find a few helpful commands here: https://cran.r-project.org/web/packages/spatstat/vignettes/shapefiles.pdf

How to detect near duplicate rows in Azure Machine Learning?

I am new to Azure Machine Learning. We are trying to implement a question-similarity algorithm using Azure Machine Learning. We have a large set of questions and answers. Our objective is to identify whether newly added questions are duplicates or not, just like Stack Overflow suggests existing questions when we ask a new one. Can we use Azure Machine Learning services to solve this? Can someone guide us in the right direction?
Yes you can use Azure Machine Learning studio and could use the method Jennifer proposed.
However, I would assume it is much better to run an R script against a database containing all current questions in your experiment and return a similarity metric for each comparison.
Have a look at the following paper for some examples (from simple/basic to more advanced) how you could do this:
https://www.researchgate.net/publication/4314910_Question_Similarity_Calculation_for_FAQ_Answering
A simple way to start would be to implement a simple "bag of words" comparison. This yields a distance matrix that you could use for clustering, or use directly to return similar questions. The following R code does such a thing: in essence you build a character vector whose first element is the new question, followed by all known questions. This method will, obviously, not really take the meaning of the questions into consideration; it only triggers on shared word usage.
library(tm)      # text mining: corpus handling and term-document matrices
library(Matrix)  # sparse matrix support

# term-document matrix over the new question plus all known questions
x <- TermDocumentMatrix( Corpus( VectorSource( strings.with.all.questions ) ) )
# convert the triplet representation into a sparse matrix
y <- sparseMatrix( i=x$i, j=x$j, x=x$v, dimnames = dimnames(x) )
# hierarchical clustering on the distance matrix between documents
plot( hclust(dist(t(y))) )
Yes, you can definitely do this with Azure Machine Learning. It sounds like you have a clustering problem (you are trying to group together similar questions).
There is a "Clustering: Find similar companies" sample that does a similar thing at https://gallery.cortanaanalytics.com/Experiment/60cf8e46935c4fafbf86f669121a24f0. You can read the description on that page and click the "Open in Studio" button in the right-hand sidebar to actually open the workspace in Azure Machine Learning Studio. In that sample, they are finding similar companies based on the text from the company's Wikipedia article (for example: Microsoft and Apple are similar companies because the word "computer" appears a lot in both articles). Your problem is very similar except you would use the text in your questions to find similar questions and cluster them into groups accordingly.
In k-means clustering, "k" is the number of clusters that you want to form, so this number will probably be pretty big for your specific problem. If you have 500 questions, perhaps start with 250 centroids? But mess around with this number and see what works. For performance reasons, you might want to start with a small dataset for testing and then run all of your data through the model after it seems to be grouping well.
Also, the documentation for K-means clustering is here.
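If you want to prototype the clustering idea locally before building it in Azure Machine Learning Studio, here is a minimal Python sketch using scikit-learn (the sample questions and parameter choices below are placeholders, not part of the Azure workflow):
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# hypothetical sample questions; replace with your own data
questions = [
    "How do I sort a list in Python?",
    "What is the fastest way to sort a Python list?",
    "How do I connect to a SQL database from Java?",
]

# bag-of-words vectors weighted by TF-IDF
vectors = TfidfVectorizer(stop_words="english").fit_transform(questions)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(vectors)
print(kmeans.labels_)  # questions sharing a label are candidate duplicates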

How to find maximum independent set of a directed acyclic graph?

Say we have a graph that is similar to a linked list (or, more generally, a directed acyclic graph). An independent set consists of nodes no two of which share an edge. If each node is weighted, how can we calculate the maximum possible total weight of an independent set of nodes? I understand we have to use dynamic programming, so I have a slight clue, but I'm hoping someone could explain how they would approach it. Thank you!
I believe that this problem is NP-hard for arbitrary directed acyclic graphs. The corresponding problem for undirected graphs is known to be NP-hard, and that problem can be converted into the directed version of the problem by directing all of the edges in a way that makes the resulting graph a DAG. Any independent set in the original graph will be an independent set in the directed graph and vice-versa, so any solution to the directed case will solve the undirected case.
Your question talks about solving this problem on a linked list. If you're solving the problem just for linked lists, there is a polynomial-time solution using dynamic programming. As a hint: if you choose one node in the linked list, you have to skip the next node and then maximize over what remains; if you don't choose the node, you just maximize over the rest of the list. Taking the better of these two options and evaluating this bottom-up will give you a really fast DP algorithm.
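For the linked-list case, here is a minimal Python sketch of that bottom-up DP (the function name and example weights are illustrative):
def max_weight_independent_set(weights):
    # include: best total if the current node is taken
    # exclude: best total if the current node is skipped
    include, exclude = 0, 0
    for w in weights:
        # taking this node forces us to have skipped the previous one
        include, exclude = exclude + w, max(include, exclude)
    return max(include, exclude)

print(max_weight_independent_set([3, 2, 7, 10, 1]))  # 13: pick weights 3 and 10
Each node is processed once, so this runs in O(n) time with O(1) extra space.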
Hope this helps!
