How to label branches of a phylogenetic tree with specific mutations? - variant

I want to generate a phylogenetic tree of a bacterial population [MRSA] that can describe/annotate mutations per branch for that population. I was able to generate the phylogenetic tree but have not been able to assign specific mutations that could define each branch.
I came across a website called ‘NextClade’ that generates the exact graph we want, but it's only for viruses. I am looking for other tools that can generate the tree with the required annotations for other species as well.
You can have a look at the tree by running an example in the site.
Nextclade website: https://clades.nextstrain.org/results
You can click on run an example
See the comparison graphs
Click the tree option in the upper right corner.
We can click on each individual branch and look at mutations leading to that branch.
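For illustration, here is a minimal sketch of what "mutations per branch" means computationally. It assumes that aligned sequences are already available for every node of the tree, including internal nodes (for example from an ancestral state reconstruction step). The names `branch_mutations`, `annotate_tree`, `parent_of` and `seqs` are hypothetical and not taken from Nextclade or any other tool.

```python
from typing import Dict, List, Tuple


def branch_mutations(parent_seq: str, child_seq: str) -> List[str]:
    """Return substitutions on one branch as <parent base><1-based position><child base>."""
    if len(parent_seq) != len(child_seq):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{p}{i}{c}"
        for i, (p, c) in enumerate(zip(parent_seq, child_seq), start=1)
        # Skip gaps and ambiguous calls so they are not reported as mutations.
        if p != c and p not in "-N" and c not in "-N"
    ]


def annotate_tree(
    parent_of: Dict[str, str], seqs: Dict[str, str]
) -> Dict[Tuple[str, str], List[str]]:
    """Map every (parent, child) branch to the substitutions that occur on it."""
    return {
        (parent, child): branch_mutations(seqs[parent], seqs[child])
        for child, parent in parent_of.items()
    }


# Toy data (made up): a root, one internal node and three tips.
seqs = {"root": "ACGTAC", "n1": "ACGTTC", "A": "ACGTTC", "B": "GCGTTC", "C": "ACGTAA"}
parent_of = {"n1": "root", "A": "n1", "B": "n1", "C": "root"}
labels = annotate_tree(parent_of, seqs)
```

The resulting dictionary could then be attached to the branches as labels in whatever tree viewer is used.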

Related

position nodes in force layout graph vertically

I read a couple of posts on position nodes in force layout but didn't find an answer to what I was looking for.
I have an object with nodes and links.
I'm trying to create a graph which would show all the nodes top to bottom.
I was looking at the example code from here:
https://github.com/danielstern/force-graph-example
Here's a screenshot of the result:
I'm trying to find a way to position each node so the nodes without parents would be on the top and the ones connecting to them would be under them and so forth.
Here's an image to illustrate it:
Right now, all the nodes are scattered randomly.
I wanted to know if I need to actually calculate the position of each node in a vertical view or if there is a smarter/built-in way to achieve it.
I looked at this example which looked promising:
How to organise node positions in D3 Force layout
But in my case I don't have a way to differentiate between node levels so I don't think the yPostion would help.
I was also looking at this post:
d3.js - How can I expand force directed graph horizontally?
According to Lars Kotthoff:
"The point of the force layout is to automatically lay out a graph like this so that you don't have to specify the positions of the nodes yourself".
Since my graph is not really a tree, I don't think the tree view would match.
What would be my best approach to position the nodes?
Or perhaps there's a better library to achieve what I need?
I found this package:
d3-dag
It basically supports what I need:
"Often data sets are hierarchical, but are not in a tree structure..."
Here's an example:
example
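If the links are directed and contain no cycles, the missing "level" of each node can be derived from the links themselves. The sketch below uses longest-path layering: nodes without parents get layer 0, and every other node is placed one layer below its deepest parent. It is only an illustration; `assign_layers` and the `(source, target)` tuple format are assumptions, not part of D3 or d3-dag.

```python
from collections import defaultdict, deque


def assign_layers(nodes, links):
    """Return {node: layer}; roots get 0, others 1 + the deepest parent's layer."""
    children = defaultdict(list)
    indegree = {node: 0 for node in nodes}
    for source, target in links:
        children[source].append(target)
        indegree[target] += 1

    layer = {node: 0 for node in nodes}
    queue = deque(node for node in nodes if indegree[node] == 0)
    visited = 0
    while queue:  # Kahn's algorithm: process a node once all its parents are done
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            layer[child] = max(layer[child], layer[node] + 1)
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    if visited != len(nodes):
        raise ValueError("graph contains a cycle; layers are undefined")
    return layer


layers = assign_layers(
    ["a", "b", "c", "d"],
    [("a", "c"), ("b", "c"), ("c", "d"), ("a", "d")],
)
```

Multiplying the layer by a fixed row height gives a y coordinate per node, which can then be pinned or used as the target of a vertical positioning force while the layout spreads the nodes horizontally.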

2D graph rendering algorithm supporting dynamically adding/removing nodes/edges

I've spent hours searching for an answer to this, but in most cases either the
question is about plots/charts (rather than graphs as in "control flow graph"),
or the answer "just use graphviz" is a valid answer.
However I have some constraints and requirements that make "just use graphviz" a
non-answer.
The full graph is large enough that it's not possible to generate a graphviz for all of it.
Nodes and edges will be dynamically added and removed.
Nodes have lots of information that will be hidden by default and will be expanded on request (imagine every node as a table with expandable rows/cols).
I want to be able to show only a subset of the graph on request, e.g. for features like "only show reachable part of the graph from this node" or "show all simple paths from this node to this node".
Basically I want to be able to start drawing nodes and edges on a 2D plane, and
add new nodes and edges dynamically. It's fine if nodes/edges move around as new
stuff is added. While I don't yet have hard requirements for this, it'd be good
if it looked "nice" -- for example if a node has lots of incoming edges (this is
a directed graph) ideally it'd be in a central place on the plane with all other
nodes around it etc.
Anything that gets me going would be helpful. Thanks.
(I don't know what label to add to this, adding "graph-theory" because I don't know what else to add)
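Independent of the rendering question, the "only show reachable part of the graph from this node" feature can be prototyped on a plain adjacency structure that allows nodes and edges to be added and removed at any time. The following is a minimal sketch under that assumption; the class `DirectedGraph` and its method names are hypothetical and not taken from any layout library.

```python
from collections import deque


class DirectedGraph:
    """Mutable directed graph stored as node -> set of successor nodes."""

    def __init__(self):
        self.successors = {}

    def add_node(self, node):
        self.successors.setdefault(node, set())

    def add_edge(self, source, target):
        self.add_node(source)
        self.add_node(target)
        self.successors[source].add(target)

    def remove_edge(self, source, target):
        self.successors.get(source, set()).discard(target)

    def remove_node(self, node):
        self.successors.pop(node, None)
        for targets in self.successors.values():
            targets.discard(node)

    def reachable_from(self, start):
        """Return (nodes, edges) of the subgraph reachable from start via BFS."""
        if start not in self.successors:
            return set(), set()
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in self.successors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        # Every successor of a reachable node is itself reachable.
        edges = {(u, v) for u in seen for v in self.successors[u]}
        return seen, edges


graph = DirectedGraph()
for edge in [("a", "b"), ("b", "c"), ("x", "a")]:
    graph.add_edge(*edge)
nodes, edges = graph.reachable_from("a")
```

Only the returned subset would need to be handed to whatever layout or drawing code is used, which keeps the rendered graph small even when the full graph is large.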

How is a monotree organized with git?

I recently came across an article by Greg Kroah-Hartman on why the Linux Kernel does not have a stable API and how the Kernel repository is organized as a monotree. When I discussed the article with a friend it became clear that we had different understandings of what the term tree applied to:
Tree refers to the different sub-folders of a project.
Tree refers to the different forks of the git master branch.
In the first case contributors would not check out the complete project, e.g. the Linux Kernel, but only a sub-folder. These could then be combined with e.g. git-subtree.
In the second case contributors would have to check out the complete project and basically create a fork of a monorepo.
So what does tree in monotree refer to and how can a project be organized as a monotree with git?
Let's make a few notes here:
The phrase monotree, or even the partial word mono, never appears in the referenced article.
The article has seven occurrences of the word tree.
In six of these seven occurrences, the entire phrase here is the main kernel tree. The one reference that does not use this full phrase just says the tree but clearly has the same intent as the other six.
You have tagged this with git linux monorepo (in case the tags change).
Your question amounts to either: What does the author mean by the phrase "the main kernel tree"? or What do people in general mean when they refer to a tree? These are valid questions but not particularly relevant to Git.
Tree in computer science tends to refer to the data structure, which is also pretty loosely defined; see the Wikipedia entry. We have some collection of nodes and edges (mathematically, a graph G defined by its set of vertices V and edges E, where each vertex connects by edges to other vertices), and there are constraints on the graph so that it is minimally connected, or equivalently, maximally acyclic. (See https://en.wikiversity.org/wiki/Introduction_to_graph_theory/Proof_of_Theorem_4 and the answers to What's the difference between the data structure Tree and Graph?)
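As a small illustration of "minimally connected": an undirected graph is a tree exactly when it is connected and has one edge fewer than it has vertices. The sketch below checks both conditions. The function name `is_tree` is made up for this example, and it assumes every edge endpoint appears in the vertex collection.

```python
def is_tree(vertices, edges):
    """Return True if the undirected graph (vertices, edges) is a tree."""
    vertices = set(vertices)
    # A tree on n vertices has exactly n - 1 edges (the empty graph is excluded here).
    if not vertices or len(edges) != len(vertices) - 1:
        return False

    neighbours = {v: set() for v in vertices}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    # Depth-first search from an arbitrary vertex to test connectivity.
    seen = set()
    stack = [next(iter(vertices))]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(neighbours[node] - seen)
    return seen == vertices
```

Removing any edge from such a graph disconnects it, and adding any edge creates a cycle, which is what the two equivalent phrasings express.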
A tree object in Git specifically refers to the stored Git object of Git-type "tree" (one of four Git object types that are stored in the repository database; the other three are commit, blob, and annotated tag). Such an object stores <mode, name, hash-ID> triples, where the mode and hash-ID identify additional Git objects to associate with the name, which is an arbitrary [1] string of bytes excluding NUL and slash (codes 0 and 0x2f or 47 respectively). A commit object stored in Git includes the hash ID of a single tree object. Reading the tree object and locating the sub-objects it lists, then recursively reading their own sub-objects if those objects are trees, results in constructing the minimally-connected graph that is a CS-style tree.
[1] There's a length limit due to the cache entry ce_namelen field, which has a 32-bit integer type. So no name component can exceed about 4 GB in length. Practically speaking, none should probably exceed 255 bytes, but tree objects in Git don't enforce any particular limit, as far as I know.
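The recursive reading described above can be sketched with a toy in-memory model. This is not Git's storage format or API: the dictionary `store`, the function `walk_tree` and the short object IDs are all made up for illustration. Only the <mode, name, hash-ID> triple shape and the modes 40000 (sub-tree) and 100644 (regular file) follow Git's conventions.

```python
TREE_MODE = "40000"  # mode Git records for an entry that is itself a tree


def walk_tree(store, tree_id, prefix=""):
    """Yield (path, object_id) for every non-tree entry below tree_id."""
    for mode, name, object_id in store[tree_id]:
        path = prefix + name
        if mode == TREE_MODE:
            # Recurse into the sub-tree, extending the path with a slash.
            yield from walk_tree(store, object_id, path + "/")
        else:
            yield path, object_id


# Toy object store: tree ID -> list of (mode, name, hash-ID) triples.
store = {
    "t-root": [("100644", "README", "b-1"), ("40000", "src", "t-src")],
    "t-src": [("100644", "main.c", "b-2")],
}
files = list(walk_tree(store, "t-root"))
```

Because names cannot contain a slash, joining the name components with "/" during the walk reconstructs unambiguous file paths.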
A file system tree in Linux is really just a string identifying an entity within the file system, though naming anything other than a directory results in a degenerate tree with just one node in it. By naming a directory, though, you can imply that anyone interpreting this string should read the directory's contents, which are names that (by being concatenated with the string identifying the directory itself) name another Linux file system tree, possibly a degenerate one with a single file or device node or whatever. This kind of recursive enumeration leads to building up a minimally-connected graph, just as with the Git tree object. (Perhaps unsurprisingly, the Linux directory objects have essentially the same constraints on names as the Git tree objects, though they usually have a much smaller maximum component name length, typically 255 bytes or fewer.)
Finally, the way the phrase the main kernel tree is used in the article refers to the Linux kernel repository (Linus Torvalds's Git repository for the Linux kernel) and the entire ecosystem around it. There is a lot of room for arguments about the details. Here, I will just include a link to this particular InfoWorld article, which seems like a reasonable summary of the state of affairs as of the time it was written (August 2016).

How do I work with .osm files in terms of creating nodes and edges so I can run Dijkstra's algorithm on them?

I have a MetroManila.osm file and I filtered it so that it contains roads only, using this command: --keep = --keep-nodes=highway --keep-ways=. Is this the right command for filtering only the roads? What I want after filtering is to create a node wherever there is an intersection or a curve, or is it possible to just use all the nodes in MetroManila.osm? Can I create edges from them, and how do I do it? Currently, I'm really lost on what to do since I'm fairly new to Android Studio and osmdroid.
This doesn't really have anything to do with Android or osmdroid.
For learning how to create a routing graph from an OSM file see the related answer at help.openstreetmap.org. Quoting from it:
parse all ways; throw away those that are not roads, and for the others, remember the node IDs they consist of, by incrementing a "link counter" for each node referenced.
parse all ways a second time; a way will normally become one edge, but if any nodes apart from the first and the last have a link counter greater than one, then split the way into two edges at that point. Nodes with a link counter of one and which are neither first nor last can be thrown away unless you need to compute the length of the edge.
(if you need geometry for your graph nodes) parse the nodes section of the XML now, recording coordinates for all nodes that you have retained.
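The two passes quoted above can be sketched as follows. This is a minimal illustration, assuming the OSM file has already been parsed and filtered into a mapping from way ID to the ordered list of node IDs of each road. The function name `build_edges` and the input format are assumptions, not part of osmdroid or any OSM library.

```python
from collections import Counter


def build_edges(road_ways):
    """Split road ways into routing edges at nodes shared by more than one reference.

    road_ways: {way_id: [node_id, ...]} containing roads only.
    Returns a list of edges, each a list of node IDs from start to end.
    """
    # Pass 1: count how often each node is referenced (the "link counter").
    link_count = Counter()
    for node_ids in road_ways.values():
        link_count.update(node_ids)

    # Pass 2: cut each way at interior nodes with a link counter above one.
    edges = []
    for node_ids in road_ways.values():
        if len(node_ids) < 2:
            continue  # a single node cannot form an edge
        current = [node_ids[0]]
        for node_id in node_ids[1:-1]:
            current.append(node_id)
            if link_count[node_id] > 1:
                edges.append(current)
                current = [node_id]
        current.append(node_ids[-1])
        edges.append(current)
    return edges


# Toy data: way 2 branches off way 1 at node 12.
edges = build_edges({1: [10, 11, 12, 13], 2: [12, 20]})
```

The first and last node of each edge are the graph nodes for Dijkstra's algorithm. The interior nodes are kept here only so that the edge length can be computed from their coordinates.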
Also take a look at routing in the OSM wiki to get some general hints and existing tools and libraries.
Regarding osmdroid: If you don't rely on offline routing then take a look at osmbonuspack, it supports online routing.
If you don't want to reinvent the wheel then just use one of the existing offline routing tools mentioned in the OSM wiki.

How to layout breadthfirst with compound nodes in Cytoscape?

I'm modifying the breadthfirst example on the Cytoscape site (the one with the cat at the top). When I create a new "predators" node and put cat and dog nodes in it, then change the edge that goes from cat to bird to now go from predators to bird, the layout gets completely messed up.
How can I use compound nodes with breadthfirst, or is there another layout I should be using? I don't want anything fancy, I just want the compound nodes to follow the same rules that a normal node follows.
I understand you want something simple, but the problem is that compound graphs are more complex than ordinary graphs.
A compound node does not have independent dimensions: http://js.cytoscape.org/#notation/compound-nodes
A layout has to take this into account and have a generic, customisable algorithm for placing the children such that the constraints for both the children and the parents are satisfied.
This doesn't fit with most layouts, and so only those layouts marked with explicit compound support will give you the results you're looking for. Ordinary layouts ignore compound nodes.
Typically, force-directed (physics simulation) layouts work best with compound graphs -- like Cola or Cose Bilkent. Tree-like layouts don't. There are too many cases where satisfying the layout rules for compound graphs would be impossible.
You can try Cola with the directionality constraint to make the result tree-like. Or you can write your own layout using the API if your compound graphs are simple enough and you can make enough assumptions about the topology of your graphs.
