Is generating all string permutations NP-complete? - traveling-salesman

Calculating all permutations of a given string can be done in O(n!) by trying all possibilities.
Now, looking at the Traveling Salesman Problem, we can solve it by trying all permutations of cities. Let's say we have cities A, B and C.
Let's say we start at city A. By calculating all permutations of the string BC we get ABC and ACB; then we just sum (in polynomial time) the distances AB, BC and CA for the first case, and likewise for the second.
So isn't this a reduction of the all-string-permutations problem to the Traveling Salesman Problem, and isn't it therefore an NP-complete problem?

I think you're confusing some concepts:
What you describe is not "reducing the all-permutations problem to TSP", but the opposite: reducing TSP to the all-permutations problem.
What that proves is that generating all permutations is NP-hard (at least as hard as the hardest problems in NP).
To prove that something is NP-complete, you would also have to prove that it is in NP. But that fails right out of the gate: NP is a set of decision problems, and the problem you described isn't a decision problem.
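To make the direction of the reduction concrete, here is a minimal Python sketch of brute-force TSP that uses a permutation generator as its subroutine. The fourth city D and all distances are hypothetical, added only so that the candidate tours differ in length.

```python
from itertools import permutations


def brute_force_tsp(dist, start):
    """Return (best_length, best_tour) by trying every ordering of the other cities."""
    others = [city for city in dist if city != start]
    best_length, best_tour = float("inf"), None
    for order in permutations(others):  # the "all permutations" subroutine
        tour = (start, *order, start)
        # Summing the legs of one tour takes time linear in the number of cities.
        length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if length < best_length:
            best_length, best_tour = length, tour
    return best_length, best_tour


# Hypothetical symmetric distances between four cities.
dist = {
    "A": {"B": 1, "C": 4, "D": 3},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"A": 3, "B": 5, "C": 1},
}
best_length, best_tour = brute_force_tsp(dist, "A")
print(best_length, best_tour)
```

TSP is the caller here and the permutation generator is the callee, which is why the construction says something about permutation generation being at least as hard as TSP, not the other way round.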
See also: What are the differences between NP, NP-Complete and NP-Hard?

Related

Flipping a three-sided coin

I have two related questions on population statistics. I'm not a statistician, but would appreciate pointers to learn more.
I have a process that results from flipping a three sided coin (results: A, B, C) and I compute the statistic t=(A-C)/(A+B+C). In my problem, I have a set that randomly divides itself into sets X and Y, maybe uniformly, maybe not. I compute t for X and Y. I want to know whether the difference I observe in those two t values is likely due to chance or not.
Now if this were a simple binomial distribution (i.e., I'm just counting who ends up in X or Y), I'd know what to do: I compute n=|X|+|Y|, σ=sqrt(np(1-p)) (and I assume my p=.5), and then I compare to the normal distribution. So, for example, if I observed |X|=45 and |Y|=55, I'd say σ=5 and so I expect to have this variation from the mean μ=50 by chance 68.27% of the time. Alternately, I expect greater deviation from the mean 31.73% of the time.
There's an intermediate problem, which also interests me and which I think may help me understand the main problem, where I measure some property of members of A and B. Let's say 25% in A measure positive and 66% in B measure positive. (A and B aren't the same cardinality -- the selection process isn't uniform.) I would like to know if I expect this difference by chance.
As a first draft, I computed t as though it were measuring coin flips, but I'm pretty sure that's not actually right.
Any pointers on what the correct way to model this is?
First problem
For the three-sided coin problem, have a look at the multinomial distribution. It is the distribution to use for a "binomial" problem with more than 2 outcomes.
Here is the example from Wikipedia (https://en.wikipedia.org/wiki/Multinomial_distribution):
Suppose that in a three-way election for a large country, candidate A received 20% of the votes, candidate B received 30% of the votes, and candidate C received 50% of the votes. If six voters are selected randomly, what is the probability that there will be exactly one supporter for candidate A, two supporters for candidate B and three supporters for candidate C in the sample?
Note: Since we’re assuming that the voting population is large, it is reasonable and permissible to think of the probabilities as unchanging once a voter is selected for the sample. Technically speaking this is sampling without replacement, so the correct distribution is the multivariate hypergeometric distribution, but the distributions converge as the population grows large.
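The election example can be checked numerically. This is a small sketch using only the Python standard library; the function name multinomial_pmf is my own, not from any package.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Probability of observing exactly `counts` when outcomes have probabilities `probs`."""
    n = sum(counts)
    coefficient = factorial(n)
    for k in counts:
        coefficient //= factorial(k)  # n! / (k1! * k2! * ...), exact at every step
    return coefficient * prod(p ** k for p, k in zip(probs, counts))


# One supporter of A, two of B, three of C, with vote shares 20%, 30%, 50%.
probability = multinomial_pmf([1, 2, 3], [0.2, 0.3, 0.5])
print(round(probability, 3))  # 0.135
```

The same function applies to the three-sided coin by passing the counts of A, B and C and the assumed face probabilities.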
Second problem
The second problem seems to be a problem for cross-tabs. Use the "Chi-squared test for association" to test whether there is a significant association between your variables, and use the "standardized residuals" of your cross-tab to identify which of the associations occur more often than expected and which less often.
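As a rough illustration of the cross-tab approach, the sketch below computes the chi-squared statistic, its p-value and the Pearson residuals for a 2x2 table with the standard library only. The counts are hypothetical (10 of 40 positive in A, 40 of 60 positive in B), chosen to match the 25% and roughly 66% in the question. No continuity correction is applied, and the p-value formula used here holds only for one degree of freedom.

```python
from math import erfc, sqrt


def chi_squared_2x2(table):
    """Chi-squared test of association for a 2x2 table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected counts if row and column variables were independent.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Pearson residuals: (observed - expected) / sqrt(expected).
    residuals = [
        [(o - e) / sqrt(e) for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(table, expected)
    ]
    chi2 = sum(r * r for row in residuals for r in row)
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-squared with 1 degree of freedom
    return chi2, p_value, residuals


# Hypothetical counts. Rows: group A, group B. Columns: positive, negative.
observed = [[10, 30], [40, 20]]
chi2, p_value, residuals = chi_squared_2x2(observed)
print(round(chi2, 3), p_value)
```

A negative residual marks a cell that occurs less often than independence would predict, a positive one a cell that occurs more often.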

What does N((1,0)T, I) mean in relation to the Gaussian distribution?

Hi everyone, I am reading the book "The Elements of Statistical Learning" and came across the paragraph below, which I don't understand. (It explains how the training data was generated.)
We generated 10 means mk from a bivariate Gaussian distribution N((1,0)T,I) and labeled this class as blue. Similarly, 10 more were drawn from N((0,1)T,I) and labeled class Orange. Then for each class we generated 100 observations as follows: for each observation, we picked an mk at random with probability 1/10, and then generated a N(mk, I/5), thus leading to a mixture of Gaussian clusters for each class.
I would appreciate it if you could explain the above paragraph, and especially N((0,1)T,I).
By the way, the T in (0,1)T is a superscript denoting the transpose.
Is this notation mathematically common, or is it related to a specific programming language?
In the paragraph, N stands for the normal distribution; more specifically, in this case it stands for the multivariate normal distribution. The notation is not specific to any programming language. It comes from statistics and probability theory, but because of the numerous appealing properties and important applications of this distribution it is also widely used in programming, so you should be able to perform the described procedure in any language.
The part (0,1)^T is a vector of means. That is, we have in mind a random vector of length two, where the first element on average is 0, and the second one on average is 1.
"I" stands for the 2x2 identity matrix whose role is the variance-covariance matrix. That is, the variance of both random vector components is 1 (i.e., the diagonal terms), while off-diagonal points are 0 and correspond to the covariance between the two random variables.

Clarification on the formulation of the Traveling Salesman

I have been doing research in the traveling salesman problem, and I have a question about how it is formulated. Or this might be a question on classification or name of sub-problems or variations on the problem.
In the traveling salesman problem, are the cities placed in a space and the distances between them measured to form a graph with weighted connections, or can the weights on the edges be chosen arbitrarily, even though they might make it impossible to lay the cities out on a map?
If one of those is considered the standard traveling salesman problem, is there a name for the other one?
TSP can be defined in a lot of ways. You're describing the symmetric Euclidean TSP, where the weights correspond to the actual distances between the nodes, so a tour has the same length whether it is traveled clockwise or counterclockwise. As suggested by Phpdna, the triangle inequality is satisfied.
However, that's not the standard definition of the TSP. In fact, this IS the sub-problem or special case. The general problem can have any weight between each pair of nodes, and it doesn't have to be a Euclidean distance.
For example, if you were trying to formulate the shortest tour by the cost of travel rather than distance, you'd have the cost of travel between cities as the weight between the vertices... that could be anything. City A might be closest to city B on a Euclidean map, but the cost of travel from A to B might be phenomenally greater than from A to C to B for whatever reason. This is the general scenario. But either way, they're both NP-hard.
In the metric TSP the weights satisfy the triangle inequality, but if you have one-way streets or obstacles like mountains, canyons and so on, it is not the metric TSP.
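The distinction between the metric special case and the general problem can be checked mechanically. The following sketch tests whether a weight table is symmetric and satisfies the triangle inequality; both example tables are hypothetical, the second one mimicking the travel-cost scenario where going from A to B directly costs far more than going via C.

```python
from itertools import permutations


def is_metric(weights):
    """Return True if the weights are symmetric and satisfy the triangle inequality."""
    nodes = list(weights)
    symmetric = all(
        weights[a][b] == weights[b][a] for a, b in permutations(nodes, 2)
    )
    triangle = all(
        weights[a][c] <= weights[a][b] + weights[b][c]
        for a, b, c in permutations(nodes, 3)
    )
    return symmetric and triangle


# Hypothetical map distances (a 3-4-5 triangle): a metric instance.
map_distances = {
    "A": {"B": 3, "C": 5},
    "B": {"A": 3, "C": 4},
    "C": {"A": 5, "B": 4},
}
# Hypothetical travel costs: A to B directly costs more than A to C to B.
travel_costs = {
    "A": {"B": 100, "C": 10},
    "B": {"A": 100, "C": 10},
    "C": {"A": 10, "B": 10},
}
print(is_metric(map_distances), is_metric(travel_costs))  # True False
```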

How to know whether one system is significantly better than another?

I am studying lexical semantics. I have 65 pairs of synonyms with their sense relatedness. The dataset is derived from the paper:
Rubenstein, Herbert, and John B. Goodenough. "Contextual correlates of synonymy." Communications of the ACM 8.10 (1965): 627-633.
I extract sentences containing those synonyms, turn the neighbouring words appearing in those sentences into vectors, calculate the cosine distance between the different vectors, and finally compute the Pearson correlation between the distances we calculate and the sense relatedness given by Rubenstein and Goodenough.
I get a Pearson correlation of 0.79 for Method 1 and 0.78 for Method 2, for example. How do I determine whether Method 1 is significantly better than Method 2?
Well, strictly speaking this is not a programming question, but since it is unanswered on the other Stack Exchange sites, I'll describe the approach I would take.
I would say there are other benchmarks to check your approaches on similar tasks. You can check how your method performs on those benchmarks and analyze the results. Some methods may capture similarity more, others relatedness, and some both.
Here is a link: WordVec Demo, which automatically scores your vectors and gives you the results.
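For reference, the scoring pipeline described in the question (cosine between context vectors, then Pearson correlation against the human ratings) might look roughly like this in plain Python. The vectors and ratings are made up for illustration, and the sketch uses cosine similarity rather than cosine distance; the two differ only by a sign flip in the resulting correlation.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical context vectors for three word pairs, and hypothetical human ratings.
vector_pairs = [
    ((1, 0, 2), (1, 0, 2)),
    ((1, 1, 0), (1, 0, 1)),
    ((1, 0, 0), (0, 1, 0)),
]
human_ratings = [3.9, 2.5, 0.1]

similarities = [cosine_similarity(u, v) for u, v in vector_pairs]
score = pearson(similarities, human_ratings)
print(score)
```

Running the same scoring on several benchmark datasets, as suggested above, gives a broader picture than a single correlation on 65 pairs.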

How to rewrite the halve function in J?

In the J programming language,
-: i. 5
the above expression computes the halves of all integers in [0,4]. Now let's say I'd like to rewrite the -: function, just for the fun of it. My best guess so far was
]&%.2
but that doesn't seem to cut it. How do you do it?
%&2 NB. divide by two
0.5&* NB. multiply by one half
Note that ] % 2: would also work, but to ensure proper grammar you would want either to use it as the definition of a name or to put the expression in parentheses.
I saw you were using %., probably because you were dividing a matrix and thought you needed to do a "matrix divide".
The matrix divide and matrix inverse they are talking about there are for matrix algebra, where you have a list of, essentially, polynomials, and you want to transform them all at once so as to solve the equations. Matrix algebra is one of the things you can do really easily in J: there are builtins for matrix divide and for inverting a matrix (as you have seen), and the phrases section has short phrases for all of the typical matrix transformations, such as taking the determinant.
But when you are simply dividing a vector by a scalar to get a vector, or dividing a matrix by the corresponding elements of another matrix, that is just the % division symbol.
If you want to try to understand this, look at Project Euler problem 101 (http://projecteuler.net/problem=101) and then search for curve fitting on the Jsoftware.com site. Creating the matrices from the observations, together with the basic matrices shown there, allows you to solve ax^2+bx+c = y where you have x and y and want to determine a, b, and c. Just remember to use extended arithmetic for everything: without it the resulting equations are very good but not perfect, and to solve the problem you need perfect equations.
Just a thought; unless you want to play with matrix algebra, you might not care.
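For readers who do not know J, the contrast between element-wise division and solving a system of equations can be sketched in Python (not J). The halve function mirrors what %&2 does to a list, and solve_exact is a hypothetical helper that solves a square system with exact fractions, standing in for the combination of matrix divide and extended arithmetic. It assumes the matrix is non-singular and does not attempt least-squares fitting. The three observations are made up and lie on y = x^2 + 2x + 3.

```python
from fractions import Fraction


def halve(values):
    """Element-wise division by two, analogous to applying %&2 to a list in J."""
    return [Fraction(v, 2) for v in values]


def solve_exact(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination over exact fractions."""
    n = len(rhs)
    # Augmented matrix with every entry converted to an exact Fraction.
    rows = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Assumes a non-zero pivot exists, i.e. the matrix is non-singular.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


halves = halve(range(5))

# Hypothetical observations lying on y = x^2 + 2x + 3.
xs = [1, 2, 3]
ys = [6, 11, 18]
design = [[x * x, x, 1] for x in xs]  # one row (x^2, x, 1) per observation
a, b, c = solve_exact(design, ys)
print(halves, a, b, c)
```

Using Fraction throughout keeps every intermediate value exact, which is the point of the advice about extended arithmetic.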
