Send DNS data: MSB or LSB first? - dns

I'm implementing a DNS(multicast DNS in fact) in c#.
I just want to know if I must encode my uint/int/ushort/... with the LSB on the left or the MSB on the left. And more globally how I could know this? One of this is standard?
Because I didn't found anything in the IETF description. I found a lot of things(each header field length, position), but I didn't found this.
Thank you!

The answer is in RFC 1035 (2.3.2. Data Transmission Order)
Here is the link: http://www.ietf.org/rfc/rfc1035.txt
And the interesting part
2.3.2. Data Transmission Order
The order of transmission of the header and data described in this
document is resolved to the octet level. Whenever a diagram shows a
group of octets, the order of transmission of those octets is the
normal order in which they are read in English. For example, in the
following diagram, the octets are transmitted in the order they are
numbered.
0 1
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 1 | 2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 3 | 4 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 5 | 6 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Whenever an octet represents a numeric quantity, the left most bit in
the diagram is the high order or most significant bit. That is, the
bit labeled 0 is the most significant bit. For example, the following
diagram represents the value 170 (decimal).
0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|1 0 1 0 1 0 1 0|
+-+-+-+-+-+-+-+-+
Similarly, whenever a multi-octet field represents a numeric quantity
the left most bit of the whole field is the most significant bit.
When a multi-octet quantity is transmitted the most significant octet
is transmitted first.

Related

How to split data and assign it into designated variables?

I have data in Stata regarding the feeling of the current situation. There are seven types of feeling. The data is stored in the following format (note that the data type is a string, and one person can respond to more than 1 answer)
feeling
4,7
1,3,4
2,5,6,7
1,2,3,4,5,6,7
Since the data is a string, I tried to separate it by
split feeling, parse (,)
and I got the result
feeling1
feeling2
feeling3
feeling4
feeling5
feeling6
feeling7
4
7
1
3
4
2
5
6
7
1
2
3
4
5
6
7
However, this is not the result I want. which is that the representative number of feelings should go into the correct variable. For instance.
feeling1
feeling2
feeling3
feeling4
feeling5
feeling6
feeling7
4
7
1
3
4
2
5
6
7
1
2
3
4
5
6
7
I am not sure if there is any built-in command or function for this kind of problem. I am thinking about using forval in looping through every value in each variable and try to juggle it around into the correct variable.
A loop over the distinct values would be enough here. I give your example in a form explained in the Stata tag wiki as more helpful and then give code to get the variables you want as numeric variables.
* Example generated by -dataex-. For more info, type help dataex
clear
input str13 feeling
"4,7"
"1,3,4"
"2,5,6,7"
"1,2,3,4,5,6,7"
end
forval j = 1/7 {
gen wanted`j' = `j' if strpos(feeling, "`j'")
gen better`j' = strpos(feeling, "`j'") > 0
}
l feeling wanted1-better3
+---------------------------------------------------------------------------+
| feeling wanted1 better1 wanted2 better2 wanted3 better3 |
|---------------------------------------------------------------------------|
1. | 4,7 . 0 . 0 . 0 |
2. | 1,3,4 1 1 . 0 3 1 |
3. | 2,5,6,7 . 0 2 1 . 0 |
4. | 1,2,3,4,5,6,7 1 1 2 1 3 1 |
+---------------------------------------------------------------------------+
If you wanted a string result that would be yielded by
gen wanted`j' = "`j'" if strpos(feeling, "`j'")
Had the number of feelings been 10 or more you would have needed more careful code as for example a search for "1" would find it within "10".
Indicator (some say dummy) variables with distinct values 1 or 0 are immensely more valuable for most analysis of this kind of data.
Note Stata-related sources such as
this FAQ
this paper
and this paper.

Spark: count events based on two columns

I have a table with events which are grouped by a uid. All rows have the columns uid, visit_num and event_num.
visit_num is an arbitrary counter that occasionally increases. event_num is the counter of interactions within the visit.
I want to merge these two counters into a single interaction counter that keeps increasing by 1 for each event and continues to increase when then next visit has started.
As I only look at the relative distance between events, it's fine if I don't start the counter at 1.
|uid |visit_num|event_num|interaction_num|
| 1 | 1 | 1 | 1 |
| 1 | 1 | 2 | 2 |
| 1 | 2 | 1 | 3 |
| 1 | 2 | 2 | 4 |
| 2 | 1 | 1 | 500 |
| 2 | 2 | 1 | 501 |
| 2 | 2 | 2 | 502 |
I can achieve this by repartitioning the data and using the monotonically_increasing_id like this:
df.repartition("uid")\
.sort("visit_num", "event_num")\
.withColumn("iid", fn.monotonically_increasing_id())
However the documentation states:
The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive. The current implementation puts the partition ID in the upper 31 bits, and the record number within each partition in the lower 33 bits. The assumption is that the data frame has less than 1 billion partitions, and each partition has less than 8 billion records.
As the id seems to be monotonically increasing by partition this seems fine. However:
I am close to reaching the 1 billion partition/uid threshold.
I don't want to rely on the current implementation not changing.
Is there a way I can start each uid with 1 as the first interaction num?
Edit
After testing this some more, I notice that some of the users don't seem to have consecutive iid values using the approach described above.
Edit 2: Windowing
Unfortunately there are some (rare) cases where more thanone row has the samevisit_numandevent_num`. I've tried using the windowing function as below, but due to this assigning the same rank to two identical columns, this is not really an option.
iid_window = Window.partitionBy("uid").orderBy("visit_num", "event_num")
df_sample_iid=df_sample.withColumn("iid", fn.rank().over(iid_window))
The best solution is the Windowing function with rank, as suggested by Jacek Laskowski.
iid_window = Window.partitionBy("uid").orderBy("visit_num", "event_num")
df_sample_iid=df_sample.withColumn("iid", fn.rank().over(iid_window))
In my specific case some more data cleaning was required but generally, this should work.

RFC 1035 Header Structure

I'm studying about dns and would like to understand about this information, because I could not fully understand.
The header contains the following fields:
1 1 1 1 1 1
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| ID |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|QR| Opcode |AA|TC|RD|RA| Z | RCODE |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| QDCOUNT |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| ANCOUNT |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| NSCOUNT |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| ARCOUNT |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
I would to know what mean this numbers on top.
The numbers across the top are simply the bit numbers within the 16 bit word, although as is common with the RFC series of documents they're ordered from most significant bit to least, instead of the (more intuitive) other way around.
So, for example, given an array data of octets containing that header, the ID would be:
(data[0] << 8) | data[1]
and the QR bit would be the most significant bit of data[2]

Concorde TSP Solver - Asymmetric Instances

I am trying to use Concorde to solve some asymmetric instances of the TSP. Although the official website says Concorde does solve this kind of instances, I've seen people saying it doesn't (https://cs.stackexchange.com/a/16336, http://www.math.uwaterloo.ca/tsp/road/austria.html). I'm only doubting the official website because I have the following test instance:
NAME: test
TYPE: ATSP
DIMENSION: 4
EDGE_WEIGHT_TYPE: EXPLICIT
EDGE_WEIGHT_FORMAT: FULL_MATRIX
EDGE_WEIGHT_SECTION
999 | 2 | 2 | 2
2 |999| 2 | 2
2 | 2 |999| 2
2 | 2 | 2 |999
EOF
Concorde gives me, as expected:
Optimal Solution: 8.00
and on the .sol file, the route: 0 1 3 2.
But if I change the matrix to:
EDGE_WEIGHT_SECTION
999 |100| 3 |100
2 |999| 2 | 2
2 | 2 |999| 2
2 | 2 | 2 |999
EOF
The solution given now is 106 with the sequence 0 3 1 2.
No matter what numbers I put in the first row, Concorde never chooses the third city (index 2, value 3).
Does anybody have any idea why that is? Am I reading the input wrong?
--EDIT--
Actually, the instances for the ATSP problem are NOT from the official website. It's from this one:
http://comopt.ifi.uni-heidelberg.de/software/TSPLIB95/
Because the name of this library is TSPLIB (the same as the official set of problems for Concorde) I made this confusion. I'm not sure how this TSPLIB95 is related with Concorde Solver (or if it is related whatsoever).
Concorde is only for symmetric TSP. You'll have to do the standard trick to convert an ATSP into a symmetric TSP (with additional nodes).

Spoofing an echo reply

Given that I'm on a local network, if I can capture a ICMP echo request packet, and considering that I want to spoof a echo reply, what part of the original packet would I need to change supposing I make a copy of the original before i send it back? I'm guessing the IP header would need to change, (the destination IP of the original would become the source, and vice versa) as well as the ICMP header (the type would need to change to ECHO_REPLYPACKET). But besides those 2 are there any others?
Quoting RFC 792 :
Echo or Echo Reply Message
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Code | Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identifier | Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data ...
+-+-+-+-+-
To form an echo reply message, the source and destination addresses
are simply reversed, the type code changed to 0, and the checksum
recomputed.
Identifier and Sequence Number must be 0 as well.
RFC 1071 shows you how to calculate the Checksum

Resources