wireshark and tcpdump -r: strange tcp window sizes - linux

I'm capturing http traffic with tcpdump and am interested in TCP slow start and how window sizes increase:
$ sudo tcpdump -i eth1 -w wget++.tcpdump tcp and port 80
When I view the dump file with Wireshark the progression of window sizes looks normal, i.e. 5840, 5888, 5888, 8576, 11264, etc...
But when I view the dump file via
$ tcpdump -r wget++.tcpdump -tnN | less
I get what seem to be nonsensical windows sizes ( IP addresses omitted for brevity ):
: S 1069713761:1069713761(0) win 5840 <mss 1460,sackOK,timestamp 24220583 0,nop,wscale 7>
: S 1198053215:1198053215(0) ack 1069713762 win 5672 <mss 1430,sackOK,timestamp 2485833728 24220583,nop,wscale 6>
: . ack 1 win 46 <nop,nop,timestamp 24220604 2485833728>
: . 1:1419(1418) ack 1 win 46 <nop,nop,timestamp 24220604 2485833728>
: P 1419:2002(583) ack 1 win 46 <nop,nop,timestamp 24220604 2485833728>
: . ack 1419 win 133 <nop,nop,timestamp 2485833824 24220604>
: . ack 2002 win 178 <nop,nop,timestamp 2485833830 24220604>
Is there a way to get normal / absolute window sizes on the command line?

The window sizes are correct - they're just unscaled.
The connection initiator has set a wscale (window scaling factor) of 7, so its subsequent win values must be multiplied by 128 to get the window size in bytes. Thus the win 46 indicates a window of 5888 bytes.
The connection recipient has set a wscale of 6, so its win values must be multiplied by 64. Thus win 133 indicates a window of 8512 bytes, and win 178 indicates 11392 bytes.

Also, if the tool (wireshark or tcpdump, it doesn't matter) doesn't see the syn, it has to print the unscaled value, which can fool you

Related

Linux OpenSuse42.3 - port status - filtered

I have a problem, which I hope somebody can point me to the right direction.
Problem >>>
A) Our external provider (connects via VPN) needs to access "OpenSuse42.3" to specific ports, which
"nmap" or "ncat" tools shows as "filtered" or "refused".
B) No services are listening on these ports.
C) No firewall is running on this server.
D) Security team opened these ports on firewall with evidence that connection get reset by server
"OpenSuse42.3".
Test runs from "10.10.10.2" to "10.10.10.1" (problem server) from provider VPN connection (from my computer)
Example 1 : from "10.10.10.2"
>>> nmap -sT -p1101,3050 10.10.10.1
>>>
PORT STATE SERVICE
1101/tcp filtered pt2-discover
3050/tcp filtered gds_db
Example 2 : from "10.10.10.2"
nc -z -v 10.10.10.1 1101
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connection refused.
nc -z -v 10.10.10.1 3050
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connection refused.
Example 3: on server "10.10.10.1"
tcpdump -n -i eth0 port 1101 or port 3050 -v
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
13:00:28.940582 IP (tos 0x0, ttl 64, id 32383, offset 0, flags [DF], proto TCP (6), length 60)
10.10.10.2.58000 > 10.10.10.1.1101: Flags [S], cksum 0xa3fc (correct), seq 3906215335, win 29200,
options [mss 1460,sackOK,TS val 1388733400 ecr 0,nop,wscale 7], length 0
13:00:28.940662 IP (tos 0x0, ttl 64, id 40440, offset 0, flags [DF], proto TCP (6), length 40)
10.10.10.1.1101 > 10.10.10.2.58000: Flags [R.], cksum 0x347b (correct), seq 0, ack 3906215336,
win 0, length 0
13:00:31.263502 IP (tos 0x0, ttl 64, id 60627, offset 0, flags [DF], proto TCP (6), length 60)
10.10.10.2.40830 > 10.10.10.1.3050: Flags [S], cksum 0x8bc2 (correct), seq 3504308280, win 29200,
options [mss 1460,sackOK,TS val 1388735723 ecr 0,nop,wscale 7], length 0
13:00:31.263569 IP (tos 0x0, ttl 64, id 40888, offset 0, flags [DF], proto TCP (6), length 40)
10.10.10.1.3050 > 10.10.10.2.40830: Flags [R.], cksum 0x2554 (correct), seq 0, ack 3504308281,
win 0, length 0
BUT
As soon as I put something on the server like - "nc -l 1101" or "nc -l 3050"
problem disappears, which probably makes sense. To my knowledge "nmap" tool usually shows port status as "closed" if port is not firewalled and service is not running and "open" if service is running on this port.
Question
Are ports opened or closed ??? What else do I check, because provider keep insisting that ports are closed on "10.10.10.1" and he cannot continue his work.
Please let me knoe if something is unclear in this situation and I will respond.
Appreciate it !!!!

How to detect CDP by tcpdump

I would like to ask you for help: Does somebody know how to detect Cisco Discovery Protocol via tcpdump?
Currently I'm using following command, but I'm not sure by this:
tcpdump -i eth0 -nn "ether[20:2]==0x2000"
Some hints are appreciated. Thank you ...
Charkh
I normally use this filters
tcpdump -nvi bce0 -s 1500 ether dst 01:00:0c:cc:cc:cc
replace bce0 with your network interface.
This will output the hole CDP information, received from ether the switch or the host itself (if you have a cdpd running on the host)
This will output Switch-Name, Port, Switch Type, Software, VLAN and so on...
the output will look similar to this:
$tcpdump -nvi bce0 -s 1500 ether dst 01:00:0c:cc:cc:cc
tcpdump: WARNING: bce0: no IPv4 address assigned
tcpdump: listening on bce0, link-type EN10MB (Ethernet), capture size 1500 bytes
11:43:24.327197 DTPv1, length 39
Domain TLV (0x0001) TLV, length 18, domain-internal
Status TLV (0x0002) TLV, length 5, 0x81
DTP type TLV (0x0003) TLV, length 5, 0xa5
Neighbor TLV (0x0004) TLV, length 10, 6c:50:4d:06:64:01
11:43:44.820865 CDPv2, ttl: 180s, checksum: 692 (unverified), length 477
Device-ID (0x01), length: 40 bytes: 'my-switch.mydomain.net'
Version String (0x05), length: 247 bytes:
Cisco IOS Software, CBS30X0 Software (CBS30X0-IPBASEK9-M), Version 12.2(58)SE1, RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2011 by Cisco Systems, Inc.
Compiled Thu 05-May-11 03:57 by prod_rel_team
Platform (0x06), length: 20 bytes: 'cisco WS-CBS3020-HPQ'
Address (0x02), length: 13 bytes: IPv4 (1) 1.2.3.4
Port-ID (0x03), length: 18 bytes: 'GigabitEthernet0/1'
Capability (0x04), length: 4 bytes: (0x00000028): L2 Switch, IGMP snooping
Protocol-Hello option (0x08), length: 32 bytes:
VTP Management Domain (0x09), length: 13 bytes: 'doman-internal'
Native VLAN ID (0x0a), length: 2 bytes: 358
Duplex (0x0b), length: 1 byte: full
AVVID trust bitmap (0x12), length: 1 byte: 0x00
AVVID untrusted ports CoS (0x13), length: 1 byte: 0x00
Management Addresses (0x16), length: 13 bytes: IPv4 (1) [IP]
unknown field type (0x1a), length: 12 bytes:
0x0000: 0000 0001 0000 0000 ffff ffff
I use the following command:
tcpdump -nn -v -xx -i eth? -s 1500 -c 1 'ether dst 01:00:0c:cc:cc:cc and (ether[24:2] = 0x2000 or ether[20:2] = 0x2000)'
Where eth? is your ethernet adapter.
It can be used with IBM SEA over trunked connections or over standard copper connections.

Why don't the tcp server reply my syn packet when I try to connect it through raw socket?

It depends on the iphdr.saddr field.
When it was set to my own address or a random multicast address, I can see the server replied with the syn/ack packet.
If set to other ips, the server didn't reply.
How to explain it?
The multicast address case:
13:55:08.242535 IP 240.151.224.61.13579 > localhost.5223: Flags [S], seq 123456, win 4096, length 0
E..(g+..#......=....5..g...#....P...$X..
13:55:14.906511 IP 239.151.224.61.13579 > localhost.5223: Flags [S], seq 123456, win 4096, length 0
E..(g+..#......=....5..g...#....P...%X..
13:55:14.906549 IP localhost.5223 > 239.151.224.61.13579: Flags [S.], seq 3502093187, ack 123457, win 43690, options [mss 65495], length 0
E..,..#.#..........=.g5........A...N.......
13:55:15.904599 IP localhost.5223 > 239.151.224.61.13579: Flags [S.], seq 3502093187, ack 123457, win 43690, options [mss 65495], length 0
`
my own address case:
14:14:22.989225 IP slave1.domain.com.13579 > localhost.5223: Flags [S], seq 123456, win 4096, length 0
E..(g+..#......m....5..g...#....P...3...
14:14:22.989236 IP localhost.5223 > slave1.domain.com.13579: Flags [S.], seq 3228604881, ack 123457, win 43690, options [mss 65495], length 0
E..,..#.#..........m.g5..p.....A...A5......
14:14:22.989259 IP slave1.domain.com.13579 > localhost.5223: Flags [.], ack 3228604882, win 4096, length 0
E..(..#.#......m....5..g...A.p..P.......
`
no syn/ack reply case:
14:16:18.719629 IP 223.151.224.61.13579 > localhost.5223: Flags [S], seq 123456, win 4096, length 0
E..(g+..#......=....5..g...#....P...5X..
14:16:46.511299 IP 240.151.224.61.13579 > localhost.5223: Flags [S], seq 123456, win 4096, length 0
E..(g+..#......=....5..g...#....P...$X..
iphdr.saddr represents the source address of the IP packet. I assume that the receiving end of your SYN packet will try to respond with an ACK to whatever source address you provide in the IP packet.

how does the tcp packet “size” reported by tcpdump correlate to the actual IP packet sent?

When I run a tcpdump – what exactly does the number reported between the () after the lowest and highest sequence number mean?
If it correlates to the IP packet sent, how can it be larger than
the mtu of 1500?
On the receive side of this flow tcpdump always
sees packets of 1448 or smaller – where are these packets being
defragmented and reassembled? Can I see that with tcpdump?
Is there any relationship between this number and the size of the tcp socket
buffer?
I.e: Typical tcpdump on the sender:
11:47:18.278352 IP 1.1.1.36.41034 > 1.1.1.37.60960: . 3263771:3272459(8688) ack 1 win 54
11:47:18.278371 IP 1.1.1.36.41034 > 1.1.1.37.60960: . 3272459:3282595(10136) ack 1 win 54
11:47:18.278620 IP 1.1.1.36.41034 > 1.1.1.37.60960: . 3282595:3298523(15928) ack 1 win 54
11:47:18.278727 IP 1.1.1.36.41034 > 1.1.1.37.60960: . 3298523:3301419(2896) ack 1 win 54
11:47:18.278731 IP 1.1.1.36.41034 > 1.1.1.37.60960: P 3301419:3301719(300) ack 1 win 54
11:47:18.777160 IP 1.1.1.36.41034 > 1.1.1.37.60960: P 3301719:3301723(4) ack 1 win 54
11:47:18.777175 IP 1.1.1.36.41034 > 1.1.1.37.60960: . 3301723:3303171(1448) ack 1 win 54
And the receiver:
11:47:18.277895 IP 1.1.1.36.41034 > 1.1.1.37.60960: P 990413:990417(4) ack 1 win 54
11:47:18.277948 IP 1.1.1.36.41034 > 1.1.1.37.60960: . 990417:991865(1448) ack 1 win 54
11:47:18.277953 IP 1.1.1.36.41034 > 1.1.1.37.60960: . 991865:993313(1448) ack 1 win 54
11:47:18.277958 IP 1.1.1.36.41034 > 1.1.1.37.60960: . 993313:994761(1448) ack 1 win 54
11:47:18.278028 IP 1.1.1.36.41034 > 1.1.1.37.60960: . 994761:996209(1448) ack 1 win 54
That is the length of the data, as you can see if you substract the two sequence numbers (after accounting +1 for SYN and FIN flags, if they are present in the packet).
As to why it's greater than the MTU, either your netword card can send jumbo-frames (quite normal on 1Gbps cards), or your card can accept large segments and divide them on hardware (called LSO/GSO or some similar acronym I cannot seem to recall).
Edit: See Large Segment Offload on Wikipedia.

TCP messages getting coalesced

I have a Java application that is writing to the network. It is writing messages in the region of 764b, +/- 5b. A pcap shows that the stream is getting IP fragmented and we can't explain this.
Linux 2.6.18-238.1.1.el5
A strace shows:
(strace -vvvv -f -tt -o strace.out -e trace=network -p $PID)
1: 2045 12:48:23.984173 sendto(45, "\0\0\0\0\0\0\2\374\0\0\0\0\0\3\n\0\0\0\0\3upd\365myData"..., 764, 0, NULL, 0) = 764
2: 15206 12:48:23.984706 sendto(131, "\0\0\0\0\0\0\2\374\0\0\0\0\0\3\n\0\0\0\0\3upd\365myData"..., 764, 0, NULL, 0 <unfinished ...>
3: 2046 12:48:23.984811 sendto(46, "\0\0\0\0\0\0\2\374\0\0\0\0\0\3\n\0\0\0\0\3upd\365myData"..., 764, 0, NULL, 0 <unfinished ...>
4: 15206 12:48:23.984893 <... sendto resumed> ) = 764
5: 2046 12:48:23.984948 <... sendto resumed> ) = 764
I am seeing packets larger than the MTU when I capture the network, which is causing fragmentation.
4809 5.848987 10.0.0.2 -> 10.0.0.5 TCP 40656 > taiclock [ACK] Seq=325501 Ack=1 Win=46 Len=1448 TSV=344627654 TSER=270108068 # First Fragment
4810 5.848991 10.0.0.5 -> 10.0.0.2 TCP taiclock > 40656 [ACK] Seq=1 Ack=326949 Win=12287 Len=0 TSV=270108081 TSER=344627643 # TCP ack
4811 5.849037 10.0.0.2 -> 10.0.0.5 TCP 40656 > taiclock [PSH, ACK] Seq=326949 Ack=1 Win=46 Len=82 TSV=344627654 TSER=270108081 # Second Frag
Questions:
1) It appears the server trying to batch the two sendto() into one IP packet, which is larger than the MTU and is therefore getting fragmented. Why?
2) Looking at the strace output for PID 2046, is the figure after the equal sign <... sendto resumed> line a total for what was sent? I.e. 764b was sent in total for line 3 and line 5? Or is 764 bytes being sent per line?
3) Are there any options I can pass to strace to log all of the sendto() output? Can't seem to find anything..
To answer your questions, in order:
1) It is perfectly normal for multiple send calls to be coalesced when using TCP as it is a stream protocol so does not preserve user level send boundaries in any way. I don't see any evidence of IP fragmentation (which would be bad) in your trace, just of TCP segmentation (which is completely normal).
2) Yes, that is the size - more specifically it is reporting the value that the system call returned after it resumed.
3) You can use "-e write=all" or "-e write=" to get strace to report the whole of the written data.

Resources