Concorde TSP Solver - Asymmetric Instances - traveling-salesman

I am trying to use Concorde to solve some asymmetric instances of the TSP. Although the official website says Concorde does solve this kind of instance, I've seen people saying it doesn't (https://cs.stackexchange.com/a/16336, http://www.math.uwaterloo.ca/tsp/road/austria.html). I only doubt the official website because of the following test instance:
NAME: test
TYPE: ATSP
DIMENSION: 4
EDGE_WEIGHT_TYPE: EXPLICIT
EDGE_WEIGHT_FORMAT: FULL_MATRIX
EDGE_WEIGHT_SECTION
999 | 2 | 2 | 2
2 |999| 2 | 2
2 | 2 |999| 2
2 | 2 | 2 |999
EOF
Concorde gives me, as expected:
Optimal Solution: 8.00
and on the .sol file, the route: 0 1 3 2.
But if I change the matrix to:
EDGE_WEIGHT_SECTION
999 |100| 3 |100
2 |999| 2 | 2
2 | 2 |999| 2
2 | 2 | 2 |999
EOF
The solution given now is 106 with the sequence 0 3 1 2.
No matter what numbers I put in the first row, Concorde never chooses the third city (index 2, value 3).
Does anybody have any idea why that is? Am I reading the input wrong?
--EDIT--
Actually, the instances for the ATSP are NOT from the official website. They are from this one:
http://comopt.ifi.uni-heidelberg.de/software/TSPLIB95/
Because this library is named TSPLIB (the same as the official set of problems for Concorde), I confused the two. I'm not sure how this TSPLIB95 is related to the Concorde solver (or whether it is related at all).

Concorde is only for symmetric TSP. You'll have to do the standard trick to convert an ATSP into a symmetric TSP (with additional nodes).
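
As a rough illustration of that trick, here is a minimal Python sketch of one common formulation: every city gets a "ghost" copy, the city and its ghost are tied together by a very negative edge, and the directed cost i -> j is placed on the undirected edge between city i and the ghost of j. The function names and the constant `big` are assumptions made for this example, not part of Concorde. Whether a particular solver accepts negative weights should be checked separately (weights may need shifting by a constant).

```python
from itertools import permutations

def atsp_to_symmetric(cost, big=10**6):
    """Return a 2n x 2n symmetric matrix for an n x n asymmetric cost matrix.

    City i keeps index i; its ghost copy gets index n + i.
    `big` is an assumed constant that must exceed the cost of any tour.
    """
    n = len(cost)
    # Start with every edge forbidden (real-real and ghost-ghost stay at +big).
    sym = [[big] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        sym[i][i] = sym[n + i][n + i] = 0       # diagonal is never used
        sym[i][n + i] = sym[n + i][i] = -big    # force each city next to its ghost
        for j in range(n):
            if i != j:
                # directed edge i -> j becomes undirected edge (i, ghost of j)
                sym[i][n + j] = sym[n + j][i] = cost[i][j]
    return sym

def brute_force_tour_cost(matrix):
    """Cheapest closed tour by exhaustive search (tiny instances only)."""
    m = len(matrix)
    best = None
    for perm in permutations(range(1, m)):
        tour = (0,) + perm
        total = sum(matrix[tour[k]][tour[(k + 1) % m]] for k in range(m))
        if best is None or total < best:
            best = total
    return best

# The second matrix from the question (diagonal entries are ignored).
cost = [
    [999, 100, 3, 100],
    [2, 999, 2, 2],
    [2, 2, 999, 2],
    [2, 2, 2, 999],
]
big = 10**6
sym = atsp_to_symmetric(cost, big)
# Adding back the n forced ghost edges recovers the directed tour cost.
directed_optimum = brute_force_tour_cost(sym) + len(cost) * big
```

On this tiny instance the recovered value agrees with a direct exhaustive search over the asymmetric matrix, which is how the sketch can be sanity-checked before handing the doubled instance to a symmetric solver.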


How to split data and assign it into designated variables?

I have data in Stata regarding the feeling of the current situation. There are seven types of feeling. The data is stored in the following format (note that the data type is a string, and one person can respond to more than 1 answer)
feeling
4,7
1,3,4
2,5,6,7
1,2,3,4,5,6,7
Since the data is a string, I tried to separate it by
split feeling, parse (,)
and I got the result
feeling1 | feeling2 | feeling3 | feeling4 | feeling5 | feeling6 | feeling7
4        | 7        |          |          |          |          |
1        | 3        | 4        |          |          |          |
2        | 5        | 6        | 7        |          |          |
1        | 2        | 3        | 4        | 5        | 6        | 7
However, this is not the result I want. Each number should go into the variable that corresponds to that feeling. For instance:
feeling1 | feeling2 | feeling3 | feeling4 | feeling5 | feeling6 | feeling7
         |          |          | 4        |          |          | 7
1        |          | 3        | 4        |          |          |
         | 2        |          |          | 5        | 6        | 7
1        | 2        | 3        | 4        | 5        | 6        | 7
I am not sure if there is any built-in command or function for this kind of problem. I am thinking about using forval to loop through every value in each variable and move it into the correct variable.
A loop over the distinct values would be enough here. I give your example in a form explained in the Stata tag wiki as more helpful and then give code to get the variables you want as numeric variables.
* Example generated by -dataex-. For more info, type help dataex
clear
input str13 feeling
"4,7"
"1,3,4"
"2,5,6,7"
"1,2,3,4,5,6,7"
end
forval j = 1/7 {
gen wanted`j' = `j' if strpos(feeling, "`j'")
gen better`j' = strpos(feeling, "`j'") > 0
}
l feeling wanted1-better3
     +---------------------------------------------------------------------------+
     |       feeling   wanted1   better1   wanted2   better2   wanted3   better3 |
     |---------------------------------------------------------------------------|
  1. |           4,7         .         0         .         0         .         0 |
  2. |         1,3,4         1         1         .         0         3         1 |
  3. |       2,5,6,7         .         0         2         1         .         0 |
  4. | 1,2,3,4,5,6,7         1         1         2         1         3         1 |
     +---------------------------------------------------------------------------+
If you wanted a string result that would be yielded by
gen wanted`j' = "`j'" if strpos(feeling, "`j'")
Had the number of feelings been 10 or more you would have needed more careful code as for example a search for "1" would find it within "10".
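
To illustrate that pitfall outside Stata, here is a small Python sketch (the function name `indicators` and the sample response are made up for this example). It splits the response on commas and compares whole tokens, so code 1 is not found inside 10 or 12.

```python
def indicators(response, n_codes):
    """Return {code: 0 or 1} for codes 1..n_codes, given a string like "2,10,12"."""
    # Compare whole comma-separated tokens rather than searching for substrings.
    chosen = {token.strip() for token in response.split(",") if token.strip()}
    return {code: int(str(code) in chosen) for code in range(1, n_codes + 1)}

row = indicators("2,10,12", 12)
naive_hit = "1" in "2,10,12"   # substring search wrongly "finds" code 1
```

Here `row[1]` is 0 while `naive_hit` is true, which is exactly the false match the substring approach would produce with two-digit codes.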
Indicator (some say dummy) variables with distinct values 1 or 0 are immensely more valuable for most analysis of this kind of data.
Note Stata-related sources such as
this FAQ
this paper
and this paper.

Counting 15's in Cribbage Hand

Background
This is a followup question to my previous finding a straight in a cribbage hand question and Counting Pairs in Cribbage Hand
Objective
Count the number of ways cards can be combined to a total of 15, then score 2 points for each such combination. An ace is worth 1, and J, Q, K are each worth 10.
What I have Tried
So my first poke at a solution required 26 different formulas. Basically I checked each possible way to combine cards to see if the total was 15. 1 way to add 5 cards, 5 ways to add 4 cards, 10 ways to add 3 cards, and 10 ways to add 2 cards. I thought I had this licked until I realized I was only looking at combinations, I had not considered the fact that I had to cap the value of cards 11, 12, and 13 to 10. I initially tried an array formula something along the lines of:
MIN(MOD(B1:F1-1,13)+1,10)
But the problem with this is that MIN takes the minimum value of all results not the individual results compared to 10.
I then tried it with an IF function, which worked, but it required a CSE formula even when used with SUMPRODUCT, which is something I try to avoid when I can:
IF(MOD(B1:F1-1,13)+1<11,MOD(B1:F1-1,13)+1,10)
Then I stumbled on an answer to a code golf question, which I modified to arrive at this formula. I kind of like it for some strange reason, but it's a bit long for repeated use:
--MID("01020304050607080910101010",1+(MOD(B1:F1-1,13)*2),2)
My current working formulas are:
5 card check
=(SUMPRODUCT(--MID("01020304050607080910101010",1+(MOD(B1:F1-1,13)*2),2))=15)*2
4 card checks
=(SUM(AGGREGATE(15,6,--MID("01020304050607080910101010",1+(MOD(B1:F1-1,13)*2),2),{1,2,3,4}))=15)*2
=(SUM(AGGREGATE(15,6,--MID("01020304050607080910101010",1+(MOD(B1:F1-1,13)*2),2),{1,2,3,5}))=15)*2
=(SUM(AGGREGATE(15,6,--MID("01020304050607080910101010",1+(MOD(B1:F1-1,13)*2),2),{1,2,4,5}))=15)*2
=(SUM(AGGREGATE(15,6,--MID("01020304050607080910101010",1+(MOD(B1:F1-1,13)*2),2),{1,3,4,5}))=15)*2
=(SUM(AGGREGATE(15,6,--MID("01020304050607080910101010",1+(MOD(B1:F1-1,13)*2),2),{2,3,4,5}))=15)*2
3 card checks
same as 4 card checks using all combinations for 3 cards in the {1,2,3}.
There are 10 different combinations, so 10 different formulas.
The 2 card check was based on the solution by Tom in Counting Pairs in Cribbage Hand and all two cards are checked with a single formula. (yes it is CSE)
2 card check
{=SUM(--(--MID("01020304050607080910101010",1+(MOD(B1:F1-1,13)*2),2)+TRANSPOSE(--MID("01020304050607080910101010",1+(MOD(B1:F1-1,13)*2),2))=15))}
Question
Can the 3 and 4 card combination sum check be brought into a single formula similar to the 2 card check?
Is there a better way to convert cards 11,12,13 to a value of 10?
Sample Data
| B | C | D | E | F | POINTS
+----+----+----+----+----+
| 1 | 2 | 3 | 17 | 31 | <= 2 (all 5 add to 15)
| 1 | 2 | 3 | 17 | 32 | <= 2 (Last 4 add to 15)
| 11 | 18 | 31 | 44 | 5 | <= 16 ( 4x(J+5), 4X(5+5+5) )
| 6 | 7 | 8 | 9 | 52 | <= 4 (6+9, 7+8)
| 1 | 3 | 7 | 8 | 52 | <= 2 (7+8)
| 2 | 3 | 7 | 9 | 52 | <= 2 (2+3+K)
| 2 | 4 | 6 | 23 | 52 | <= 0 (nothing add to 15)
Excel Version
Excel 2013
For 5:
=(SUMPRODUCT(CHOOSE(MOD(A1:E1-1,13)+1,1,2,3,4,5,6,7,8,9,10,10,10,10))=15)*2
For 4:
=SUMPRODUCT(--(MMULT(INDEX(CHOOSE(MOD(A1:E1-1,13)+1,1,2,3,4,5,6,7,8,9,10,10,10,10)*ROW($1:$10)^0,ROW($1:$5),{1,2,3,4;1,2,3,5;1,2,4,5;1,3,4,5;2,3,4,5}),ROW($1:$4)^0)=15))*2
For 3
=SUMPRODUCT(--(MMULT(INDEX(CHOOSE(MOD(A1:E1-1,13)+1,1,2,3,4,5,6,7,8,9,10,10,10,10)*ROW($1:$10)^0,ROW($1:$10),{1,2,3;1,2,4;1,2,5;1,3,4;1,3,5;1,4,5;2,3,4;2,3,5;2,4,5;3,4,5}),ROW($1:$3)^0)=15))*2
For 2:
=SUMPRODUCT(--((CHOOSE(MOD(A1:E1-1,13)+1,1,2,3,4,5,6,7,8,9,10,10,10,10))+(TRANSPOSE(CHOOSE(MOD(A1:E1-1,13)+1,1,2,3,4,5,6,7,8,9,10,10,10,10)))=15))
All together:
=(SUMPRODUCT(CHOOSE(MOD(A1:E1-1,13)+1,1,2,3,4,5,6,7,8,9,10,10,10,10))=15)*2+
SUMPRODUCT(--(MMULT(INDEX(CHOOSE(MOD(A1:E1-1,13)+1,1,2,3,4,5,6,7,8,9,10,10,10,10)*ROW($1:$10)^0,ROW($1:$5),{1,2,3,4;1,2,3,5;1,2,4,5;1,3,4,5;2,3,4,5}),ROW($1:$4)^0)=15))*2+
SUMPRODUCT(--(MMULT(INDEX(CHOOSE(MOD(A1:E1-1,13)+1,1,2,3,4,5,6,7,8,9,10,10,10,10)*ROW($1:$10)^0,ROW($1:$10),{1,2,3;1,2,4;1,2,5;1,3,4;1,3,5;1,4,5;2,3,4;2,3,5;2,4,5;3,4,5}),ROW($1:$3)^0)=15))*2+
SUMPRODUCT(--((CHOOSE(MOD(A1:E1-1,13)+1,1,2,3,4,5,6,7,8,9,10,10,10,10))+(TRANSPOSE(CHOOSE(MOD(A1:E1-1,13)+1,1,2,3,4,5,6,7,8,9,10,10,10,10)))=15))
For older versions we need to "trick" INDEX into accepting the arrays as Row and Column References:
We do that by using N(IF({1},[thearray]))
=(SUMPRODUCT(CHOOSE(MOD(A1:E1-1,13)+1,1,2,3,4,5,6,7,8,9,10,10,10,10))=15)*2+
SUMPRODUCT(--(MMULT(INDEX(CHOOSE(MOD(A1:E1-1,13)+1,1,2,3,4,5,6,7,8,9,10,10,10,10)*ROW($1:$10)^0,N(IF({1},ROW($1:$5))),N(IF({1},{1,2,3,4;1,2,3,5;1,2,4,5;1,3,4,5;2,3,4,5}))),ROW($1:$4)^0)=15))*2+
SUMPRODUCT(--(MMULT(INDEX(CHOOSE(MOD(A1:E1-1,13)+1,1,2,3,4,5,6,7,8,9,10,10,10,10)*ROW($1:$10)^0,N(IF({1},ROW($1:$10))),N(IF({1},{1,2,3;1,2,4;1,2,5;1,3,4;1,3,5;1,4,5;2,3,4;2,3,5;2,4,5;3,4,5}))),ROW($1:$3)^0)=15))*2+
SUMPRODUCT(--((CHOOSE(MOD(A1:E1-1,13)+1,1,2,3,4,5,6,7,8,9,10,10,10,10))+(TRANSPOSE(CHOOSE(MOD(A1:E1-1,13)+1,1,2,3,4,5,6,7,8,9,10,10,10,10)))=15))
This is a CSE (array) formula that must be confirmed with Ctrl-Shift-Enter instead of Enter when exiting edit mode.
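
For cross-checking the spreadsheet results, the same scoring rule can be sketched in Python. This is not a translation of the formulas above, just an independent brute-force count over all 2- to 5-card combinations. The function names are chosen for this example, and cards are assumed to be numbered 1 to 52 as in the sample data.

```python
from itertools import combinations

def card_value(card):
    """Map a card numbered 1..52 to its counting value (ace 1, J/Q/K 10)."""
    rank = (card - 1) % 13 + 1
    return min(rank, 10)

def fifteens_points(hand):
    """Score 2 points for every combination of cards whose values total 15."""
    values = [card_value(c) for c in hand]
    ways = sum(
        1
        for size in range(2, len(values) + 1)
        for combo in combinations(values, size)
        if sum(combo) == 15
    )
    return 2 * ways

sample_scores = [
    fifteens_points(hand)
    for hand in [(1, 2, 3, 17, 31), (11, 18, 31, 44, 5), (6, 7, 8, 9, 52)]
]
```

Applied to the sample rows from the question, this should reproduce the POINTS column, which makes it a convenient check when editing the array formulas.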

How to characterize a distribution of values?

I try to explain it with an example.
In a school there are n classes. In each class there are k students, with k from 1 to 700; both n and k are known.
I need a way to characterize, for each class, the distribution of the names of students. For example, in class A there are 10 students, 3 are named "John", 3 "Mark" and 3 "Anne". In another class there are 100 students and everyone is named "Anton".
I need a measure able to be indicative of names distribution in each class. For example, (it's not important), it may be 1 if everyone in a class has the same name and 0 if there aren't 2 identical names in the same class.
In other words a way to sort classes by the distribution of names.
Sounds like you want a "contingency table". It's arbitrary which of your variables you want to have as rows vs. columns, but the table entries are either counts or proportions of how many occurrences fall in the intersection of the categories.
With the example you gave:
                      Class
                   A        B
              _________________
        Anne  |    3   |    0   |   3
Names   Anton |    0   |  100   | 100
        John  |    3   |    0   |   3
        Mark  |    3   |    0   |   3
      Unknown |    1   |    0   |   1
              |--------|--------|----
                  10      100   | 110
Values at the right and along the bottom are called the "marginal totals", or if proportions, "marginal distributions". The bottom right corner is the grand total of your data, obtained by summing the row or column margins. (They better come out the same!) For proportions, the sum must be 1.
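
A minimal Python sketch of building such a table from raw (name, class) records follows. The record list mirrors the example above, including the one "Unknown" student added to make class A total 10. The function name `contingency` is chosen for this illustration.

```python
from collections import Counter

def contingency(records):
    """Count (name, class) pairs; return cells, row totals, column totals, grand total."""
    cells = Counter(records)
    row_totals = Counter(name for name, _ in records)
    col_totals = Counter(cls for _, cls in records)
    return cells, row_totals, col_totals, len(records)

records = (
    [("John", "A")] * 3
    + [("Mark", "A")] * 3
    + [("Anne", "A")] * 3
    + [("Unknown", "A")]
    + [("Anton", "B")] * 100
)
cells, row_totals, col_totals, grand_total = contingency(records)
```

Dividing each cell by `grand_total` (or by its column total) turns the counts into proportions, and the row and column totals must both sum to the grand total.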

What are co-occurrence matrices and how are they used in NLP?

The pypi docs for a google ngram downloader say that "sometimes you need an aggregate data over the dataset. For example to build a co-occurrence matrix."
The Wikipedia article on co-occurrence matrices is about image processing, and googling the term seems to bring up some sort of SEO trick.
So what are co-occurrence matrices (in computational linguistics/NLP)? How are they used in NLP?
What is a co-occurrence matrix ?
Generally speaking, a co-occurrence matrix will have specific entities in rows (ER) and columns (EC). The purpose of this matrix is to present the number of times each ER appears in the same context as each EC.
As a consequence, in order to use a co-occurrence matrix, you have to define your entities and the context in which they co-occur.
In NLP, the most classic approach is to define each entity (i.e., rows and columns) as a word present in a text, and the context as a sentence.
Consider the following text :
Roses are red. Sky is blue.
With the classic approach described before, we'll have the following matrix :
      | Roses | are | red | Sky | is | blue
Roses |   1   |  1  |  1  |  0  | 0  |  0
are   |   1   |  1  |  1  |  0  | 0  |  0
red   |   1   |  1  |  1  |  0  | 0  |  0
Sky   |   0   |  0  |  0  |  1  | 1  |  1
is    |   0   |  0  |  0  |  1  | 1  |  1
blue  |   0   |  0  |  0  |  1  | 1  |  1
Here, each cell indicates whether the two items co-occur or not. You may replace it with the number of times they co-occur, or with a more sophisticated measure. You may also change the entities themselves, for example by putting nouns in columns and adjectives in rows instead of every word.
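
The matrix for the two-sentence example can be sketched in Python as follows. This assumes a deliberately naive tokenizer (split sentences on "." and words on whitespace), and the function name is chosen for this illustration.

```python
from collections import defaultdict
from itertools import product

def sentence_cooccurrence(text):
    """Binary co-occurrence: 1 if two words share at least one sentence, else 0."""
    matrix = defaultdict(lambda: defaultdict(int))
    for sentence in text.split("."):
        words = sentence.split()
        # Every ordered pair of words in the same sentence co-occurs.
        for row_word, col_word in product(words, repeat=2):
            matrix[row_word][col_word] = 1
    return matrix

matrix = sentence_cooccurrence("Roses are red. Sky is blue.")
```

Looking up `matrix["Roses"]["red"]` gives 1 and `matrix["Roses"]["blue"]` gives 0, matching the table. Replacing the assignment with `+= 1` would count co-occurrences instead of flagging them.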
What are they used for in NLP ?
The most evident use of these matrices is their ability to provide links between notions. Let's suppose you're working on product reviews. Let's also suppose for simplicity that each review is composed only of short sentences. You'll have something like this:
ProductX is amazing.
I hate productY.
Representing these reviews as one co-occurrence matrix will enable you to associate products with appreciations.
The co-occurrence matrix indicates how many times the row word (e.g. 'digital') is surrounded (in a sentence, or in the ±4 word window - depends on the application) by the column word (e.g. 'pie').
The entry '5' in the following table, for example, means that we had 5 sentences in our text where 'digital' was surrounded by 'pie'.
These sentences could have been:
I love a digital pie.
What's digital is often a pie.
May I have some digital pie?
Digital world necessitates pie-eating.
There's something digital about this pie.
Note that the co-occurrence matrix is always symmetric - the entry with the row word 'pie' and the column word 'digital' will be 5 as well (as these words co-occur in the very same sentences!).
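
A short Python sketch of this counting, using the sentence as the context, is given below. The tokenizer (lowercase, keep only runs of letters, so "pie-eating" yields "pie") is an assumption made for this example, as is the function name.

```python
import re
from collections import Counter
from itertools import permutations

def cooccurrence_counts(sentences):
    """For each ordered pair of distinct words, count the sentences containing both."""
    counts = Counter()
    for sentence in sentences:
        # A set ensures each pair is counted at most once per sentence.
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        counts.update(permutations(words, 2))
    return counts

sentences = [
    "I love a digital pie.",
    "What's digital is often a pie.",
    "May I have some digital pie?",
    "Digital world necessitates pie-eating.",
    "There's something digital about this pie.",
]
counts = cooccurrence_counts(sentences)
```

Because both orderings of a pair are counted from the same sentences, `counts[("digital", "pie")]` and `counts[("pie", "digital")]` are equal, which is the symmetry described above.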

Send DNS data: MSB or LSB first?

I'm implementing DNS (multicast DNS, in fact) in C#.
I just want to know whether I must encode my uint/int/ushort/... values with the LSB or the MSB first (leftmost). More generally, how could I find this out? Is one of these the standard?
I couldn't find anything about this in the IETF description. I found a lot of other details (each header field's length and position), but not this.
Thank you!
The answer is in RFC 1035 (2.3.2. Data Transmission Order)
Here is the link: http://www.ietf.org/rfc/rfc1035.txt
And the interesting part
2.3.2. Data Transmission Order
The order of transmission of the header and data described in this
document is resolved to the octet level. Whenever a diagram shows a
group of octets, the order of transmission of those octets is the
normal order in which they are read in English. For example, in the
following diagram, the octets are transmitted in the order they are
numbered.
 0                   1
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       1       |       2       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       3       |       4       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       5       |       6       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Whenever an octet represents a numeric quantity, the left most bit in
the diagram is the high order or most significant bit. That is, the
bit labeled 0 is the most significant bit. For example, the following
diagram represents the value 170 (decimal).
 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|1 0 1 0 1 0 1 0|
+-+-+-+-+-+-+-+-+
Similarly, whenever a multi-octet field represents a numeric quantity
the left most bit of the whole field is the most significant bit.
When a multi-octet quantity is transmitted the most significant octet
is transmitted first.
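
In other words, multi-octet fields go on the wire in big-endian ("network") byte order. The question is about C#, but as a language-neutral illustration here is a minimal Python sketch using the standard `struct` module. The helper names are chosen for this example.

```python
import struct

def encode_u16(value):
    """Encode an unsigned 16-bit integer with the most significant octet first."""
    return struct.pack("!H", value)   # "!" selects network (big-endian) order

def encode_u32(value):
    """Encode an unsigned 32-bit integer with the most significant octet first."""
    return struct.pack("!I", value)

encoded = encode_u16(0x1234)          # octet 0x12 is sent before octet 0x34
bits_of_170 = format(170, "08b")      # MSB-first bit pattern of the RFC example
```

On a little-endian machine, the in-memory layout of an integer is the reverse of this, so the bytes need to be swapped (or written out explicitly, high byte first) before sending.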
