Is there a way to construct a matrix with simplex columns in Stan? The model I want to construct is similar to the following, where I model counts as dirichlet-multinomial:
data {
  int g;
  int c;
  int<lower=0> counts[g, c];
}
parameters {
  simplex[g] p;
}
model {
  for (j in 1:c) {
    p ~ dirichlet(rep_vector(1.0, g));
    counts[, j] ~ multinomial(p);
  }
}
However I would like to use a latent [g, c] matrix for further layers of a hierarchical model similar to the following:
parameters {
  // simplex_matrix would have columns which are each a simplex.
  simplex_matrix[g, c] p;
}
model {
  for (j in 1:c) {
    p[, j] ~ dirichlet(rep_vector(1.0, g));
    counts[, j] ~ multinomial(p[, j]);
  }
}
If there's another way to construct this latent variable, that would of course also be great! I'm not massively familiar with Stan, having only implemented a few hierarchical models.

To answer the questions that you asked, you can declare an array of simplexes in the parameter block of a Stan program and use them to fill a matrix. For example,
parameters {
  simplex[g] p[c];
}
model {
  matrix[g, c] col_stochastic_matrix;
  // copy the i-th simplex into the i-th column
  for (i in 1:c) col_stochastic_matrix[, i] = p[i];
}
However, you do not actually need to form a column stochastic matrix in the example you gave, since you can do the multinomial-Dirichlet model by indexing an array of simplexes like
data {
  int g;
  int c;
  int<lower=0> counts[g, c];
}
parameters {
  simplex[g] p[c];
}
model {
  for (j in 1:c) {
    p[j] ~ dirichlet(rep_vector(1.0, g));
    counts[, j] ~ multinomial(p[j]);
  }
}
Finally, you do not actually need to declare an array of simplexes at all, since they can be integrated out of the posterior distribution and recovered in the generated quantities block of the Stan program. See Wikipedia for details, but the essence of it is given by this Stan function:
functions {
  real DM_lpmf(int[] n, vector alpha) {
    int N = sum(n);
    real A = sum(alpha);
    return lgamma(A) - lgamma(N + A)
           + sum(lgamma(to_vector(n) + alpha))
           - sum(lgamma(alpha));
  }
}
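To sanity-check the marginalized likelihood outside Stan, the same expression can be sketched in Python. This is only an illustration: the name dm_lpmf is made up here, and, like the Stan function, it leaves out the multinomial coefficient, which does not depend on alpha.

```python
import math

def dm_lpmf(n, alpha):
    """Dirichlet-multinomial log-probability, up to the multinomial coefficient.

    n is a list of counts and alpha a list of positive concentrations
    of the same length (both are assumptions of this sketch).
    """
    N = sum(n)
    A = sum(alpha)
    return (math.lgamma(A) - math.lgamma(N + A)
            + sum(math.lgamma(n_i + a_i) for n_i, a_i in zip(n, alpha))
            - sum(math.lgamma(a_i) for a_i in alpha))
```

With alpha = [1, 1] and a single observation, each outcome has probability 1/2, so dm_lpmf([1, 0], [1.0, 1.0]) should equal log(0.5).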


Check Delaunay "flip condition" using only cosines

While searching for methods of determining whether a point is within a circumcircle, I came across this answer, which used an interesting method of constructing a quadrilateral between the point and triangle, and testing the flip condition to see if the new point makes a better Delaunay triangle, and therefore is within the original triangle's circumcircle.
The Delaunay flip condition deals with angles, however, the answer I found instead just calculates the cosines of the angles. Rather than checking that the sum of angles is less than or equal to 180°, it takes the minimum of all (negated) cosines, comparing the two results to decide if the point is in the circle.
Here is the code from that answer (copied here for convenience):
#include <array>
#include <algorithm>
#include <cmath>    // std::hypot
#include <utility>  // std::swap

struct pnt_t
{
  int x, y;

  pnt_t ccw90() const
    { return { -y, x }; }

  double length() const
    { return std::hypot(x, y); }

  pnt_t &operator -=(const pnt_t &rhs)
  {
    x -= rhs.x;
    y -= rhs.y;
    return *this;
  }

  friend pnt_t operator -(const pnt_t &lhs, const pnt_t &rhs)
    { return pnt_t(lhs) -= rhs; }

  friend int operator *(const pnt_t &lhs, const pnt_t &rhs)
    { return lhs.x * rhs.x + lhs.y * rhs.y; }
};

int side(const pnt_t &a, const pnt_t &b, const pnt_t &p)
{
  int cp = (b - a).ccw90() * (p - a);
  return (cp > 0) - (cp < 0);
}

void make_ccw(std::array<pnt_t, 3> &t)
{
  if (side(t[0], t[1], t[2]) < 0)
    std::swap(t[0], t[1]);
}

// negated cosine of the angle at o between the rays o->a and o->b
double ncos(pnt_t a, const pnt_t &o, pnt_t b)
{
  a -= o;
  b -= o;
  return -(a * b) / (a.length() * b.length());
}

bool inside_circle(std::array<pnt_t, 3> t, const pnt_t &p)
{
  std::array<int, 3> s =
    { side(t[0], t[1], p), side(t[1], t[2], p), side(t[2], t[0], p) };
  unsigned outside = std::count(std::begin(s), std::end(s), -1);
  if (outside != 1)
    return outside == 0;
  while (s[0] >= 0)
  {
    std::rotate(std::begin(t), std::begin(t) + 1, std::end(t));
    std::rotate(std::begin(s), std::begin(s) + 1, std::end(s));
  }
  double
    min_org = std::min({
      ncos(t[0], t[1], t[2]), ncos(t[2], t[0], t[1]),
      ncos(t[1], t[0], p), ncos(p, t[1], t[0]) }),
    min_alt = std::min({
      ncos(t[1], t[2], p), ncos(p, t[2], t[0]),
      ncos(t[0], p, t[2]), ncos(t[2], p, t[1]) });
  return min_org <= min_alt;
}
I'm having trouble understanding how this works.
How do "sum of angles" and "minimum of all cosines" relate? Cosines of certain angles are always negative, and I would think you could position your triangle to arbitrarily fall within that negative range. So how is this test valid?
Additionally, after collecting the two sets of "minimum cosines" (rather than the two sets of angle sums), the final test is to see which minimum is smallest. Again, I don't see how this relates to the original test of determining whether a triangle is valid by using the flip condition.
What am I missing?
The good news is that there is a well-known function for finding out whether a point D lies within the circumcircle of triangle ABC: compute the determinant shown below. If InCircle comes back greater than zero, then D lies within the circumcircle and a flip is required. The test assumes that the triangle ABC is given in counterclockwise order (and so has a positive area).
InCircle(A, B, C, D) = det
| Ax - Dx   Ay - Dy   (Ax - Dx)^2 + (Ay - Dy)^2 |
| Bx - Dx   By - Dy   (Bx - Dx)^2 + (By - Dy)^2 |
| Cx - Dx   Cy - Dy   (Cx - Dx)^2 + (Cy - Dy)^2 |
I got this equation from the book Cheng, et al. "Delaunay Mesh Generation" (2013), but you should be able to find it in other places. An open-source Java implementation is also available, and I'm sure you can find other examples, some of which may be better suited to your needs.
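As a rough illustration of the in-circle test described above, here is a minimal Python sketch of the common 3x3 form of the determinant, with D subtracted from each vertex. The function name in_circle and the tuple-based point representation are assumptions made for this example, not taken from the book. With integer coordinates the arithmetic is exact; with floats, nearly cocircular points may need a robust predicate.

```python
def in_circle(a, b, c, d):
    """In-circle determinant for a counterclockwise triangle a, b, c.

    Positive: d is strictly inside the circumcircle.
    Zero:     d lies on the circumcircle.
    Negative: d is outside.
    Points are (x, y) tuples.
    """
    rows = []
    for px, py in (a, b, c):
        dx, dy = px - d[0], py - d[1]
        rows.append((dx, dy, dx * dx + dy * dy))
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = rows
    # cofactor expansion along the first row
    return (a1 * (b2 * c3 - b3 * c2)
            - a2 * (b1 * c3 - b3 * c1)
            + a3 * (b1 * c2 - b2 * c1))
```

For the triangle (0, 0), (1, 0), (0, 1), the point (1, 1) lies exactly on the circumcircle, so the determinant is zero there and negative for (2, 2).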

Algorithm for doing many substring reversals?

Suppose I have a string S of length N, and I want to perform M of the following operations:
choose 1 <= L,R <= N and reverse the substring S[L..R]
I am interested in what the final string looks like after all M operations. The obvious approach is to do the actual swapping, which leads to O(MN) worst-case behavior. Is there a faster way? I'm trying to just keep track of where an index ends up, but I cannot find a way to reduce the running time (though I have a gut feeling O(M lg N + N) -- for the operations and the final reading -- is possible).
Yeah, it's possible. Make a binary tree structure like
#include <stdbool.h>
#include <stddef.h>

struct node {
  struct node *child[2];
  struct node *parent;
  char label;
  bool subtree_flipped;
};
Then you can have a logical getter/setter for left/right child:
struct node *get_child(struct node *u, bool right) {
  return u->child[u->subtree_flipped ^ right];
}

void set_child(struct node *u, bool right, struct node *c) {
  u->child[u->subtree_flipped ^ right] = c;
  if (c != NULL) { c->parent = u; }
}
Rotations have to preserve flipped bits:
struct node *detach(struct node *u, bool right) {
  struct node *c = get_child(u, right);
  if (c != NULL) { c->subtree_flipped ^= u->subtree_flipped; }
  return c;
}

void attach(struct node *u, bool right, struct node *c) {
  set_child(u, right, c);
  if (c != NULL) { c->subtree_flipped ^= u->subtree_flipped; }
}

// rotates one of |p|'s children up.
// does not fix up the pointer to |p|.
void rotate(struct node *p, bool right) {
  struct node *u = detach(p, right);
  struct node *c = detach(u, !right);
  attach(p, right, c);
  attach(u, !right, p);
}
Implement splay with rotations. It should take a "guard" pointer that is treated as a NULL parent for the purpose of splaying, so that you can splay one node to the root and another to its right child. Do this and then you can splay both endpoints of the flipped region and then toggle the flip bits for the root and the two subtrees corresponding to segments left unaffected.
Traversal looks like this.
void traverse(struct node *u, bool flipped) {
  if (u == NULL) { return; }
  flipped ^= u->subtree_flipped;
  traverse(u->child[flipped], flipped);
  putchar(u->label);  // visit u between its two subtrees (needs <stdio.h>)
  traverse(u->child[!flipped], flipped);
}
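The lazy flip-bit idea can also be tried out with a different balanced tree. The sketch below swaps in an implicit treap (split/merge with random priorities) instead of a splay tree, purely because it is shorter to write; the names Node, split, merge and apply_reversals are made up for this illustration, and positions are 1-based and inclusive as in the question.

```python
import random

class Node:
    def __init__(self, ch):
        self.ch, self.prio = ch, random.random()
        self.size, self.rev = 1, False
        self.left = self.right = None

def size(t):
    return t.size if t else 0

def push(t):
    # Apply a pending reversal to t and pass it on to the children.
    if t.rev:
        t.left, t.right = t.right, t.left
        for child in (t.left, t.right):
            if child:
                child.rev ^= True
        t.rev = False

def update(t):
    t.size = 1 + size(t.left) + size(t.right)

def split(t, k):
    # Returns (tree with the first k elements, tree with the rest).
    if t is None:
        return None, None
    push(t)
    if size(t.left) >= k:
        a, b = split(t.left, k)
        t.left = b
        update(t)
        return a, t
    a, b = split(t.right, k - size(t.left) - 1)
    t.right = a
    update(t)
    return t, b

def merge(a, b):
    # Concatenate two treaps; every element of a precedes every element of b.
    if a is None or b is None:
        return a or b
    if a.prio > b.prio:
        push(a)
        a.right = merge(a.right, b)
        update(a)
        return a
    push(b)
    b.left = merge(a, b.left)
    update(b)
    return b

def apply_reversals(s, ops):
    root = None
    for ch in s:
        root = merge(root, Node(ch))
    for l, r in ops:
        a, bc = split(root, l - 1)
        b, c = split(bc, r - l + 1)
        if b:
            b.rev ^= True          # O(1): just toggle the lazy flag
        root = merge(a, merge(b, c))
    # Iterative in-order traversal, pushing flags down on the way.
    out, stack, cur = [], [], root
    while stack or cur:
        while cur:
            push(cur)
            stack.append(cur)
            cur = cur.left
        cur = stack.pop()
        out.append(cur.ch)
        cur = cur.right
    return "".join(out)
```

Each reversal costs an expected O(log N) for the two splits and two merges, and the final read is O(N), which matches the O(M log N + N) bound the question was hoping for (in expectation rather than amortized).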
A splay tree may help: it supports reversing a range of a sequence, with total complexity O(M log N).
@F. Ju is right, splay trees are one of the best data structures to achieve your goal.
However, if you don't want to implement them, or a solution in O((N + M) * sqrt(M)) is good enough, you can do the following:
We will perform sqrt(M) consecutive queries and then rebuild the array from scratch in O(N) time.
In order to do that, for each query, we will store whether the queried segment [a, b] is reversed or not (if you reverse some range of elements twice, they become unreversed).
The key is to maintain this information for disjoint segments. Since we perform at most sqrt(M) queries before rebuilding the array, we will have O(sqrt(M)) disjoint segments, and we can perform a query on them in O(sqrt(M)) time. Let me know if you need a detailed explanation of how to "reverse" these disjoint segments.
This trick is very useful for problems like this one, and it is worth knowing.
I solved the problem exactly corresponding to yours on HackerRank, during their contest, using the method I described.
Here is the problem
Here is my solution in C++.
Here is the discussion about the problem and a brief description of my method, please check my 3rd message there.
I'm trying to just keep track of where an index ends up
If you're just trying to follow one entry of the starting array, it's easy to do that in O(M) time.
I was going to just write pseudocode, but no hand-waving was needed so I ended up with what's probably valid C++.
// untested C++, but it does compile to code that looks right.
struct swap {
    int l, r;
    // or make these non-member functions for C
    bool covers(int pos) { return l <= pos && pos <= r; }
    int apply_if_covering(int pos) {
        // startpos - l = r - endpos;
        // endpos = l - startpos + r
        if (covers(pos))
            pos = l - pos + r;
        return pos;
    }
};

int follow_swaps (int pos, int len, struct swap swaps[], int num_swaps)
{
    // pos = starting position of the element we want to track
    // return value = where it will be after all the swaps
    for (int i = 0 ; i < num_swaps ; i++) {
        pos = swaps[i].apply_if_covering(pos);
    }
    return pos;
}
This compiles to very efficient-looking code.

NxN matrix is given and we have to find

There are N things to assign to N people. You are given an NxN matrix with a weight for each (person, thing) pair, and you need to find the assignment with the maximum total weight such that each person gets exactly one thing.
I'm having difficulty defining the DP state.
Please help me and, if possible, also write code for it.
C++ style code:
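#include <algorithm>
#include <limits>
#include <vector>

// Returns the best total for rows r..n-1 and stores the chosen columns in c[r..n-1].
double max_rec(int n, int r, int* c, double** m, bool* f)
{
    if (r < n)
    {
        double max_v = -std::numeric_limits<double>::infinity();
        std::vector<int> best(n - r, -1);  // best columns found so far for rows r..n-1
        for (int i = 0; i < n; i++)
        {
            if (f[i] == false)
            {
                f[i] = true;  // column i is taken by row r
                double value = m[r][i] + max_rec(n, r + 1, c, m, f);
                if (value > max_v)
                {
                    max_v = value;
                    c[r] = i;
                    // later recursive calls overwrite c[r+1..], so save the winner
                    std::copy(c + r, c + n, best.begin());
                }
                f[i] = false;
            }
        }
        std::copy(best.begin(), best.end(), c + r);
        return max_v;
    }
    return 0.0;
}

int* max_comb(int n, double** m)
{
    bool* f = new bool[n]();  // value-initialized: no column is used yet
    int* c = new int[n];
    max_rec(n, 0, c, m, f);
    delete [] f;
    return c;  // the caller owns this array and must delete[] it
}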
Call max_comb with N and your NxN matrix (2d array). Returns the column indices of the maximum combination.
Time complexity: O(N!)
I know this is bad but the problem does not have a greedy structure.
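And as @mszalbach said, try to attempt the problem yourself before asking.
EDIT: memoizing on the set of already-used columns reduces this to O(N * 2^N), which is still exponential; for polynomial time, use the Hungarian algorithm, which solves this assignment problem in O(N^3).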
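Since the question asks for a DP state, here is one possible sketch in Python: the state is the set of columns already handed out, stored as a bitmask, and the current row is the number of bits set in it. The name max_assignment and the list-of-lists matrix format are assumptions of this example. It runs in O(N * 2^N) time, so it is only practical for N up to roughly 20.

```python
from functools import lru_cache

def max_assignment(m):
    """Return (best total, tuple of chosen column per row) for square matrix m."""
    n = len(m)

    @lru_cache(maxsize=None)
    def best(used):
        r = bin(used).count("1")      # rows 0..r-1 are already assigned
        if r == n:
            return 0.0, ()
        top = None
        for i in range(n):
            if not used & (1 << i):   # column i is still free
                value, rest = best(used | (1 << i))
                value += m[r][i]
                if top is None or value > top[0]:
                    top = (value, (i,) + rest)
        return top

    return best(0)
```

For example, max_assignment([[4, 1, 3], [2, 0, 5], [3, 2, 2]]) picks columns (0, 2, 1) for a total of 11.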

Reduce a string using grammar-like rules

I'm trying to find a suitable DP algorithm for simplifying a string. For example I have a string a b a b and a list of rules
a b -> b
a b -> c
b a -> a
c c -> b
The purpose is to get all single chars that can be received from the given string using these rules. For this example it will be b, c. The length of the given string can be up to 200 symbols. Could you please prompt an effective algorithm?
Rules always are 2 -> 1. I've got an idea of creating a tree, root is given string and each child is a string after one transform, but I'm not sure if it's the best way.
If you read those rules from right to left, they look exactly like the rules of a context free grammar, and have basically the same meaning. You could apply a bottom-up parsing algorithm like the Earley algorithm to your data, along with a suitable starting rule; something like
start <- start a
| start b
| start c
and then just examine the parse forest for the shortest chain of starts. The worst case remains O(n^3) of course, but Earley is fairly effective, these days.
You can also produce parse forests when parsing with derivatives. You might be able to efficiently check them for short chains of starts.
For a DP problem, you always need to understand how you can construct the answer for a big problem in terms of smaller sub-problems. Assume you have your function simplify which is called with an input of length n. There are n-1 ways to split the input in a first and a last part. For each of these splits, you should recursively call your simplify function on both the first part and the last part. The final answer for the input of length n is the set of all possible combinations of answers for the first and for the last part, which are allowed by the rules.
In Python, this can be implemented like so:
from functools import lru_cache

rules = {'ab': set('bc'), 'ba': set('a'), 'cc': set('b')}
all_chars = set(c for cc in rules.values() for c in cc)

@lru_cache(maxsize=None)  # memoize
def simplify(s):
    if len(s) == 1:  # base case to end recursion
        return set(s)
    possible_chars = set()
    # iterate over all the possible splits of s
    for i in range(1, len(s)):
        head = s[:i]
        tail = s[i:]
        # check all possible combinations of answers of sub-problems
        for c1 in simplify(head):
            for c2 in simplify(tail):
                possible_chars.update(rules.get(c1+c2, set()))
                # speed hack
                if possible_chars == all_chars:  # won't get any bigger
                    return all_chars
    return possible_chars
Quick check:
In [53]: simplify('abab')
Out[53]: {'b', 'c'}
To make this fast enough for large strings (to avoid exponential behavior), you should use a memoize decorator. This is a critical step in solving DP problems; otherwise you are just doing a brute-force calculation. A further tiny speedup can be obtained by returning from the function as soon as possible_chars == set('abc'), since at that point you are already sure that you can generate all possible outcomes.
Analysis of running time: for an input of length n, there are 2 substrings of length n-1, 3 substrings of length n-2, ... n substrings of length 1, for a total of O(n^2) subproblems. Due to the memoization, the function is called at most once for every subproblem. Maximum running time for a single sub-problem is O(n) due to the for i in range(1, len(s)) loop (the inner loops over c1 and c2 are bounded by the alphabet size), so the overall running time is at most O(n^3).
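The same O(n^3) recurrence can also be filled in bottom-up, in the style of the CYK parsing algorithm, which avoids deep recursion on a 200-symbol input. This is only a sketch: reducible_chars and the table layout are names chosen for this example, and rules uses the same dictionary format as above.

```python
def reducible_chars(s, rules):
    """Set of single characters that s can be reduced to under 2 -> 1 rules."""
    n = len(s)
    if n == 0:
        return set()
    # table[i][j] = characters that the substring s[i..j] (inclusive) reduces to
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(s):
        table[i][i].add(ch)
    for length in range(2, n + 1):           # substring length
        for i in range(n - length + 1):      # start index
            j = i + length - 1               # end index
            for k in range(i, j):            # split between k and k + 1
                for c1 in table[i][k]:
                    for c2 in table[k + 1][j]:
                        table[i][j] |= rules.get(c1 + c2, set())
    return table[0][n - 1]

rules = {'ab': set('bc'), 'ba': set('a'), 'cc': set('b')}
print(sorted(reducible_chars('abab', rules)))  # ['b', 'c']
```

Indexing the table by (start, end) instead of by the substring itself also avoids hashing substrings of length up to n at every lookup.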
Let N - length of given string and R - number of rules.
Expanding a tree in a top-down manner yields computational complexity O((N-1)! R^(N-1)) in the worst case (for example, an input string of the form aaa... with the rule aa -> a).
The root of the tree has up to (N-1)R children (N-1 positions times up to R applicable rules), each of those has up to (N-2)R children, and so on down to the strings of length 1 (the leaves). So the tree has up to (N-1)R * (N-2)R * ... * 1R = (N-1)! R^(N-1) leaves.
Recursive Java implementation of this naive approach:
// requires java.util.HashMap, java.util.HashSet, java.util.Map, java.util.Set
public static void main(String[] args) {
    Map<String, Character[]> rules = new HashMap<String, Character[]>() {{
        put("ab", new Character[]{'b', 'c'});
        put("ba", new Character[]{'a'});
        put("cc", new Character[]{'b'});
    }};
    System.out.println(simplify("abab", rules));
}

public static Set<String> simplify(String in, Map<String, Character[]> rules) {
    Set<String> result = new HashSet<String>();
    simplify(in, rules, result);
    return result;
}

private static void simplify(String in, Map<String, Character[]> rules, Set<String> result) {
    if (in.length() == 1) {
        result.add(in);
        return;
    }
    for (int i = 0; i < in.length() - 1; i++) {
        String two = in.substring(i, i + 2);
        Character[] rep = rules.get(two);
        if (rep != null) {
            for (Character c : rep) {
                simplify(in.substring(0, i) + c + in.substring(i + 2, in.length()), rules, result);
            }
        }
    }
}
Bas Swinckels's O(RN^3) Java implementation (with HashMap as a memoization cache):
public static Set<String> simplify2(final String in, Map<String, Character[]> rules) {
    Map<String, Set<String>> cache = new HashMap<String, Set<String>>();
    return simplify2(in, rules, cache);
}

private static Set<String> simplify2(final String in, Map<String, Character[]> rules, Map<String, Set<String>> cache) {
    final Set<String> cached = cache.get(in);
    if (cached != null) {
        return cached;
    }
    Set<String> ret = new HashSet<String>();
    if (in.length() == 1) {
        ret.add(in);
        return ret;
    }
    for (int i = 1; i < in.length(); i++) {
        String head = in.substring(0, i);
        String tail = in.substring(i, in.length());
        // pass the shared cache for both halves so results are reused
        for (String c1 : simplify2(head, rules, cache)) {
            for (String c2 : simplify2(tail, rules, cache)) {
                Character[] rep = rules.get(c1 + c2);
                if (rep != null) {
                    for (Character c : rep) {
                        ret.add(c.toString());
                    }
                }
            }
        }
    }
    cache.put(in, ret);
    return ret;
}
Output in both approaches:
[b, c]

Is there a circular hash function?

Thinking about this question on testing string rotation, I wondered: is there such a thing as a circular/cyclic hash function? E.g.
h(abcdef) = h(bcdefa) = h(cdefab) etc
Uses for this include scalable algorithms which can check n strings against each other to see where some are rotations of others.
I suppose the essence of the hash is to extract information which is order-specific but not position-specific. Maybe something that finds a deterministic 'first position', rotates to it and hashes the result?
It all seems plausible, but slightly beyond my grasp at the moment; it must be out there already...
I'd go along with your deterministic "first position" - find the "least" character; if it appears twice, use the next character as the tie breaker (etc). You can then rotate to a "canonical" position, and hash that in a normal way. If the tie breakers run for the entire course of the string, then you've got a string which is a rotation of itself (if you see what I mean) and it doesn't matter which you pick to be "first".
"abcdef" => hash("abcdef")
"defabc" => hash("abcdef")
"abaac" => hash("aacab") (tie-break between aa, ac and ab)
"cabcab" => hash("abcabc") (it doesn't matter which "a" comes first!)
Update: As Jon pointed out, the first approach doesn't handle strings with repetition very well. Problems arise as duplicate pairs of letters are encountered and the resulting XOR is 0. Here is a modification that I believe fixes the original algorithm. It uses Euclid-Fermat sequences to generate pairwise coprime integers for each additional occurrence of a character in the string. The result is that the XOR for duplicate pairs is non-zero.
I've also cleaned up the algorithm slightly. Note that the array containing the EF sequences only supports characters in the range 0x00 to 0xFF. This was just a cheap way to demonstrate the algorithm. Also, the algorithm still has runtime O(n) where n is the length of the string.
static int Hash(string s)
{
    int H = 0;
    if (s.Length > 0)
    {
        //any arbitrary coprime numbers
        int a = s.Length, b = s.Length + 1;
        //an array of Euclid-Fermat sequences to generate additional coprimes for each duplicate character occurrence
        //(0x100 entries so that every character from 0x00 to 0xFF has a slot)
        int[] c = new int[0x100];
        for (int i = 1; i < c.Length; i++)
        {
            c[i] = i + 1;
        }
        Func<char, int> NextCoprime = (x) => c[x] = (c[x] - x) * c[x] + x;
        Func<char, char, int> NextPair = (x, y) => a * NextCoprime(x) * x.GetHashCode() + b * y.GetHashCode();
        //for i=0 we need to wrap around to the last character
        H = NextPair(s[s.Length - 1], s[0]);
        //for i=1...n we use the previous character
        for (int i = 1; i < s.Length; i++)
        {
            H ^= NextPair(s[i - 1], s[i]);
        }
    }
    return H;
}

static void Main(string[] args)
{
    Console.WriteLine("{0:X8}", Hash("abcdef"));
    Console.WriteLine("{0:X8}", Hash("bcdefa"));
    Console.WriteLine("{0:X8}", Hash("cdefab"));
    Console.WriteLine("{0:X8}", Hash("cdfeab"));
    Console.WriteLine("{0:X8}", Hash("a0a0"));
    Console.WriteLine("{0:X8}", Hash("1010"));
    Console.WriteLine("{0:X8}", Hash("0abc0def0ghi"));
    Console.WriteLine("{0:X8}", Hash("0def0abc0ghi"));
}
The output is now:
First Version (which isn't complete): Use XOR which is commutative (order doesn't matter) and another little trick involving coprimes to combine ordered hashes of pairs of letters in the string. Here is an example in C#:
static int Hash(char[] s)
{
    //any arbitrary coprime numbers
    const int a = 7, b = 13;
    int H = 0;
    if (s.Length > 0)
    {
        //for i=0 we need to wrap around to the last character
        H ^= (a * s[s.Length - 1].GetHashCode()) + (b * s[0].GetHashCode());
        //for i=1...n we use the previous character
        for (int i = 1; i < s.Length; i++)
        {
            H ^= (a * s[i - 1].GetHashCode()) + (b * s[i].GetHashCode());
        }
    }
    return H;
}
static void Main(string[] args)
The output is:
You could find a deterministic first position by always starting at the position with the "lowest" (in terms of alphabetical ordering) substring. So in your case, you'd always start at "a". If there were multiple "a"s, you'd have to take two characters into account etc.
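A minimal Python sketch of this "rotate to a canonical start, then hash" idea follows. It simply takes the lexicographically smallest of all rotations, which resolves ties between repeated characters automatically at the cost of O(n^2) time; the names canonical_rotation and circular_hash, and the choice of zlib.crc32 as the underlying hash, are assumptions made for this illustration.

```python
import zlib

def canonical_rotation(s):
    """Lexicographically smallest rotation of s (simple O(n^2) version)."""
    if not s:
        return s
    return min(s[i:] + s[:i] for i in range(len(s)))

def circular_hash(s):
    """Hash that is identical for all rotations of s."""
    return zlib.crc32(canonical_rotation(s).encode("utf-8"))

print(canonical_rotation("defabc"))  # abcdef
print(canonical_rotation("abaac"))   # aacab
```

For long strings, a linear-time least-rotation algorithm such as Booth's could replace the min over all rotations without changing the result.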
I am sure that you could find a function that can generate the same hash regardless of character position in the input, however, how will you ensure that h(abc) != h(efg) for every conceivable input? (Collisions will occur for all hash algorithms, so I mean, how do you minimize this risk.)
You'd need some additional checks even after generating the hash to ensure that the strings contain the same characters.
Here's an implementation using Linq
public string ToCanonicalOrder(string input)
{
    char first = input.OrderBy(x => x).First();
    string doubledForRotation = input + input;
    string canonicalOrder
        = (-1)
        .GenerateFrom(x => doubledForRotation.IndexOf(first, x + 1))
        .Skip(1) // the -1
        .TakeWhile(x => x < input.Length)
        .Select(x => doubledForRotation.Substring(x, input.Length))
        .OrderBy(x => x)
        .First();
    return canonicalOrder;
}
assuming generic generator extension method:
public static class TExtensions
{
    public static IEnumerable<T> GenerateFrom<T>(this T initial, Func<T, T> next)
    {
        var current = initial;
        while (true)
        {
            yield return current;
            current = next(current);
        }
    }
}
sample usage:
var sequences = new[]
{
    "abcdef", "bcdefa", "cdefab",
    "defabc", "efabcd", "fabcde",
    "abaac", "cabcab"
};
foreach (string sequence in sequences)
then call .GetHashCode() on the result if necessary.
sample usage if ToCanonicalOrder() is converted to an extension method:
One possibility is to combine the hash functions of all circular shifts of your input into one meta-hash which does not depend on the order of the inputs.
More formally, consider
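int result = 0;
for(int i=0; i<string.length; i++) {
    // rotate(string, i) stands for the string shifted circularly by i positions
    result ^= hash(rotate(string, i));
}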
Where you could replace the ^= with any other commutative operation.
As an example, consider the input
abcd
To get the hash, we take
hash("abcd") ^ hash("dabc") ^ hash("cdab") ^ hash("bcda").
As we can see, starting from any of these rotations only changes the order in which the XOR terms are evaluated, which won't change its value.
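A small Python sketch of this meta-hash follows, using zlib.crc32 as a stand-in for the per-rotation hash (that choice, and the name rotation_xor_hash, are assumptions of this example).

```python
import zlib

def rotation_xor_hash(s):
    """XOR of the hashes of all circular shifts of s."""
    result = 0
    for i in range(len(s)):
        rotated = s[i:] + s[:i]          # s shifted circularly by i
        result ^= zlib.crc32(rotated.encode("utf-8"))
    return result

print(rotation_xor_hash("abcd") == rotation_xor_hash("dabc"))  # True
```

Two things to keep in mind: this hashes n rotations of length n, so it costs O(n^2), and with XOR a string such as "abab" yields 0 because every distinct rotation appears an even number of times. A commutative operation that does not cancel, such as addition modulo 2^32, avoids the second problem.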
I did something like this for a project in college. There were 2 approaches I used to try to optimize a Travelling-Salesman problem. I think if the elements are NOT guaranteed to be unique, the second solution would take a bit more checking, but the first one should work.
If you can represent the string as a matrix of associations so abcdef would look like
  a b c d e f
a   x
b     x
c       x
d         x
e           x
f x
But so would any combination of those associations. It would be trivial to compare those matrices.
Another quicker trick would be to rotate the string so that the "first" letter is first. Then if you have the same starting point, the same strings will be identical.
Here is some Ruby code:
def normalize_string(string)
  myarray = string.split(//)          # split into an array
  index = myarray.index(myarray.min)  # find the index of the minimum element
  index.times do
    myarray.push(myarray.shift)       # move stuff from the front to the back
  end
  return myarray.join
end
p normalize_string('abcdef').eql?(normalize_string('defabc')) # should print true
Maybe use a rolling hash for each offset (Rabin-Karp-like) and return the minimum hash value? There could be collisions though.
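One way this suggestion could look in Python is sketched below: a polynomial hash is computed once and then rolled through every rotation in O(1) per step, and the minimum is returned. The name min_rotation_hash and the base and mod defaults are arbitrary choices for this illustration.

```python
def min_rotation_hash(s, base=257, mod=(1 << 61) - 1):
    """Minimum polynomial hash over all rotations of s, in O(n) total."""
    n = len(s)
    if n == 0:
        return 0
    h = 0
    for ch in s:
        h = (h * base + ord(ch)) % mod
    top = pow(base, n - 1, mod)          # weight of the leading character
    best = h
    for ch in s[:-1]:
        # drop the leading character, shift left, append it at the end
        h = ((h - ord(ch) * top) * base + ord(ch)) % mod
        best = min(best, h)
    return best

print(min_rotation_hash("abcdef") == min_rotation_hash("defabc"))  # True
```

All rotations of a string produce the same set of n hash values, so the minimum is rotation-invariant; as the answer notes, two unrelated strings can still collide, so equal hashes should be confirmed with a direct rotation check.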
