Open vSwitch, in-band control: how does it work?

I'm trying to measure the impact of control traffic on Open vSwitch performance when in-band connections are used.
For this task I need to count the messages sent from the controller to every switch in the network that uses in-band control.
I'm also trying to understand how the controller installs flows into Open vSwitch over an in-band connection.
I've created an example topology using mininet and this article:
http://tocai.dia.uniroma3.it/compunet-wiki/index.php/In-band_control_with_Open_vSwitch
The topology contains 5 switches connected in a line (as shown in the first picture of the article).
The controller runs on host h3; in my case it is the POX controller. Everything is pingable.
When I sniff the traffic on the s1 ... s5 interfaces, I see OpenFlow messages (PacketIn, PacketOut, etc.) only on the s3 interface. On the other interfaces I don't see any TCP or OpenFlow packets.
The question is: how does the controller install new flows on the s1, s2, s4 and s5 switches? And how are controller messages delivered to a switch that is not directly connected to the controller?
Thanks.

Look no further than the OVS documentation! The Open vSwitch design document has a section describing this in detail:
Design Decisions In Open vSwitch:
Github, direct link to In-Band Control section
official HTML
official plaintext
The implementation subsection says:
Open vSwitch implements in-band control as "hidden" flows, that is,
flows that are not visible through OpenFlow, and at a higher priority
than wildcarded flows can be set up through OpenFlow. This is done so
that the OpenFlow controller cannot interfere with them and possibly
break connectivity with its switches. It is possible to see all flows,
including in-band ones, with the ovs-appctl "bridge/dump-flows"
command.
(...)
The following rules (with the OFPP_NORMAL action) are set up on any
bridge that has any remotes:
(a) DHCP requests sent from the local port.
(b) ARP replies to the local port's MAC address.
(c) ARP requests from the local port's MAC
address.
In-band also sets up the following rules for each unique next-hop MAC
address for the remotes' IPs (the "next hop" is either the remote
itself, if it is on a local subnet, or the gateway to reach the
remote):
(d) ARP replies to the next hop's MAC address.
(e) ARP requests from the next hop's MAC address.
In-band also sets up the following rules for each unique remote IP
address:
(f) ARP replies containing the remote's IP address as a target.
(g) ARP requests containing the remote's IP address as a source.
In-band also sets up the following rules for each unique remote
(IP,port) pair:
(h) TCP traffic to the remote's IP and port.
(i) TCP traffic from the remote's IP and port.
The goal of these rules is to be as narrow as possible to allow a
switch to join a network and be able to communicate with the remotes.
As mentioned earlier, these rules have higher priority than the
controller's rules, so if they are too broad, they may prevent the
controller from implementing its policy. As such, in-band actively
monitors some aspects of flow and packet processing so that the rules
can be made more precise.
In-band control monitors attempts to add flows into the datapath that
could interfere with its duties. The datapath only allows exact match
entries, so in-band control is able to be very precise about the flows
it prevents. Flows that miss in the datapath are sent to userspace to
be processed, so preventing these flows from being cached in the "fast
path" does not affect correctness. The only type of flow that is
currently prevented is one that would prevent DHCP replies from being
seen by the local port. For example, a rule that forwarded all DHCP
traffic to the controller would not be allowed, but one that forwarded
to all ports (including the local port) would.
The document also contains more information about special cases and potential problems, but is quite long so I'll omit it here.
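As a rough illustration of the quoted list, the sketch below enumerates rules (a) through (i) for one bridge. It is only a model of the text above, not how Open vSwitch builds the rules internally; the HiddenRule type, the in_band_rules function, and the MAC/IP/port values are hypothetical names and inputs chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HiddenRule:
    label: str   # letter used in the design document's list
    match: str   # human-readable description of what the rule matches


def in_band_rules(local_mac, next_hop_macs, remotes):
    """Enumerate rules (a)-(i) for one bridge.

    remotes is an iterable of (ip, tcp_port) pairs. All arguments are
    illustrative inputs, not values read from a running switch.
    """
    remotes = list(remotes)
    if not remotes:
        return []  # the rules exist only on bridges that have remotes
    rules = [
        HiddenRule("a", "DHCP requests sent from the local port"),
        HiddenRule("b", f"ARP replies to {local_mac}"),
        HiddenRule("c", f"ARP requests from {local_mac}"),
    ]
    # One pair of rules per unique next-hop MAC address.
    for mac in sorted(set(next_hop_macs)):
        rules.append(HiddenRule("d", f"ARP replies to {mac}"))
        rules.append(HiddenRule("e", f"ARP requests from {mac}"))
    # One pair of rules per unique remote IP address.
    for ip in sorted({ip for ip, _ in remotes}):
        rules.append(HiddenRule("f", f"ARP replies with target IP {ip}"))
        rules.append(HiddenRule("g", f"ARP requests with source IP {ip}"))
    # One pair of rules per unique remote (IP, port) pair.
    for ip, port in sorted(set(remotes)):
        rules.append(HiddenRule("h", f"TCP to {ip}:{port}"))
        rules.append(HiddenRule("i", f"TCP from {ip}:{port}"))
    return rules


if __name__ == "__main__":
    # Hypothetical addresses for a switch whose controller runs on h3.
    for rule in in_band_rules("00:00:00:00:00:01",
                              ["00:00:00:00:00:03"],
                              [("10.0.0.3", 6633)]):
        print(f"({rule.label}) {rule.match}")
```

On a live switch, the real hidden flows can be listed with the ovs-appctl "bridge/dump-flows" command named in the quoted text, passing the bridge name (for example s1 from the question).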

Related

Internet socket behavior when communicating within the same host

I have recently been writing a tool for testing network processes that run across different hosts.
I am tempted to run the client and server on a single host when testing, instead of on different hosts.
Since the client and server communicate over TCP, I think this should be fine, except for one point:
Is the TCP socket behavior the same when communicating data within the same host as the case of across hosts?
Will the data be physically present to the NIC interface and then routed to the target socket? Or the kernel will bypass the NIC interface under such scenarios? (Let's limit the discussion to Linux.)
There seems to be little documentation covering this case.
==== EDIT ====
I have actually noticed a difference between intra-host and inter-host communication.
With inter-host communication, my program successfully gets hardware timestamps. But when the exact same code runs within a single host, the hardware timestamps disappear. When supported and enabled, the hardware timestamp of a TCP packet is returned as ancillary data of recvmsg along with the received TCP data. The Linux kernel timestamping documentation has all the related info.
I checked the source code; the only difference is whether the sender is on the same host as the receiver.
So I am wondering whether the Linux kernel bypasses the NIC and presents the data directly to the receiver for intra-host communication, thus causing the issue.
Will the data be physically present to the NIC interface and then routed to the target socket?
No. There is typically no device that provides this capability, nor is there any need for one.
Or the kernel will bypass the NIC interface under such scenarios?
The kernel will not use the NIC unless it needs to send or receive a packet on a network. Typically, NICs can only return local packets if put in a test or loopback mode, which would require them to stop listening to the network.
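To illustrate the point, here is a minimal sketch of a TCP round trip that never leaves the host. On Linux such traffic is delivered in software through the loopback device, so no NIC hardware (and hence no hardware timestamping) is involved, while the socket API behaves the same as across hosts. The function names echo_once and loopback_round_trip are made up for this example.

```python
import socket
import threading


def echo_once(server_sock):
    """Accept one connection and echo back whatever arrives first."""
    conn, _ = server_sock.accept()
    with conn:
        conn.sendall(conn.recv(1024))


def loopback_round_trip(payload=b"ping"):
    """Send a small payload to a local TCP server; return (reply, peer_ip)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        worker = threading.Thread(target=echo_once, args=(server,))
        worker.start()
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(payload)
            reply = client.recv(1024)
            peer_ip = client.getpeername()[0]
        worker.join()
    return reply, peer_ip
```

Running a capture on the lo interface while calling loopback_round_trip() should show the segments there rather than on a physical interface.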

Test setup on AWS to test TCP transparent proxy (TPROXY) and spoofing sockets

I'm developing a proof-of-concept of some kind of transparent proxy on Linux.
Transparent proxy intercepts TCP traffic and forwards it to backend.
I use https://www.kernel.org/doc/Documentation/networking/tproxy.txt and spoofing sockets for the outgoing TCP connections.
On my dev PC I was able to emulate the network using Docker, and everything works fine.
But I need to deploy a test environment on AWS.
Proposed design:
Three VMs within the same subnet:
client, 192.168.0.2
proxy, 192.168.0.3
backend, 192.168.0.4
On the client I add a route to 192.168.0.4 via 192.168.0.3.
On the proxy I configure TPROXY to intercept TCP packets and forward them to the backend with 192.168.0.2 as the source IP address. This is where the transparent proxy runs.
On the backend I run a simple web server. I also add a route to 192.168.0.2 via 192.168.0.3; otherwise packets would go back directly to 192.168.0.2.
The question:
Will proposed network design work as expected?
AWS uses some kind of software-defined network, and I don't know whether it will behave the same way as three Linux boxes connected to one Ethernet switch.
Will proposed network design work as expected?
Highly unlikely.
The IP network in VPC that instances can access is, from all appearances, an IP network (Layer 3), not an Ethernet network (Layer 2), even though it's presented to the instances as though it were Ethernet.
The from/to address that is "interesting" to an Ethernet switch is the MAC address. The from/to address of interest to the EC2 network is the IP address. If you tweak your instances' IP stacks by spoofing the addresses and manipulating the route tables, there should be only two possible outcomes: the packets will actually arrive at the correct instance according to the infrastructure's knowledge of where that IP address should exist... or the packets will be dropped by the network. Most likely, the latter.
There is an IP Source/Destination Check Flag on each EC2 instance that disables some of the network's built-in blocking of packets the network would otherwise have considered spoofed, but this should only apply to traffic with IP addresses outside the VPC supernet CIDR block -- the IP address of each instance is known to the infrastructure and not subject to the kind of tweaking you're contemplating.
You could conceivably build tunnels among the instances using the Generic Routing Encapsulation (GRE) protocol, or OpenVPN, or some other tunneling solution. The instances would then have additional network interfaces in different IP subnets, where they could directly exchange traffic using a different subnet and rules they make up, since the network wouldn't see the addresses on the packets encapsulated in the tunnels and wouldn't impose any restrictions on the inner payload.
Possibly related: In a certain cloud provider other than AWS, a provider with a network design that is far less sensible than VPC, I use inter-instance tunnels (built with OpenVPN) to build my own virtual private subnets that make more sense than what that other cloud provider offers, so I would say this is potentially a perfectly viable alternative -- the increased latency of my solution is sub-millisecond.
But this all assumes that you have a valid reason for choosing a solution involving packet mangling. There should be a better, more inside-the-box way of solving the exact problem you are trying to solve.
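For reference, the "spoofing socket" mentioned in the question is usually a socket with the IP_TRANSPARENT option set, which lets it bind to an address the host does not own. The sketch below shows only that outgoing side, assuming Linux and sufficient privileges (CAP_NET_ADMIN); make_spoofing_socket is a hypothetical helper name, and whether a VPC would deliver such packets is exactly the concern raised above.

```python
import socket

# Linux value from <linux/in.h>; some Python builds do not export the name.
IP_TRANSPARENT = getattr(socket, "IP_TRANSPARENT", 19)


def make_spoofing_socket(source_ip, source_port=0):
    """Return a TCP socket bound to an address the host need not own.

    Assumes Linux and elevated privileges (CAP_NET_ADMIN); raises
    OSError when the option or the bind is refused.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Allow binding to a non-local address.
        sock.setsockopt(socket.IPPROTO_IP, IP_TRANSPARENT, 1)
        sock.bind((source_ip, source_port))
    except OSError:
        sock.close()
        raise
    return sock


if __name__ == "__main__":
    try:
        # 192.168.0.2 is the client address from the question.
        s = make_spoofing_socket("192.168.0.2")
    except OSError as exc:
        print("could not create spoofing socket:", exc)
    else:
        print("bound to", s.getsockname())
        s.close()
```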

Should source IP address filtering be implemented in the Application layer itself or delegated by Application to the Firewall

Let's say my application has a listening UDP socket and knows from which IP addresses it may receive UDP datagrams. Anything coming from other IP addresses would be considered a malicious datagram and should be discarded as early as possible to prevent DoS attacks. The hard part is that the set of legitimate IP addresses can change dynamically over the application's lifetime (e.g. by receiving them over a control channel).
How would you implement filtering based on the source IP address in the case above?
I see two solutions where to put this source IP filtering logic:
1. Implement it in the application itself after the recvfrom() call.
2. Install a default drop policy in the Firewall and then let the application install Firewall rules that dynamically whitelist legitimate IP addresses.
There are pros and cons to each solution. Some that come to my mind:
iptables could end up with O(n) filtering complexity (con for iptables)
iptables drop packets before they even get to the socket buffer (pro for iptables)
iptables might not be very portable (con for iptables)
iptables from my application could interfere with other applications that potentially would also install iptables rules (con for iptables)
if my application installs iptables rules then it can potentially become attack vector itself (con for iptables)
Where would you implement source IP filtering and why?
Can you name any Applications that follow convention #2 (Administrator manually installing static Firewall rules does not count)?
My recommendation (with absolutely no authority behind it) is to use iptables to do rate-limiting to dampen any DoS attacks and do the actual filtering inside your application. This will give you the least-bad of both worlds, allowing you to use the performance of iptables to limit DoS throughput as well as the ability to change which addresses are allowed without introducing a potential security hole.
If you do decide to go about it with iptables alone, I would create a new chain to do the application-specific filtering so that the potential for interference is lowered.
Hope this helps.
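The in-application half of that recommendation could look roughly like the sketch below: a set of allowed source addresses that a control channel can update at run time, checked right after recvfrom(). The names SourceAllowlist and recv_filtered are invented for this example, and only IPv4 sockets are assumed.

```python
import socket
import threading


class SourceAllowlist:
    """Thread-safe set of source IPs permitted to send datagrams."""

    def __init__(self, initial=()):
        self._lock = threading.Lock()
        self._allowed = set(initial)

    def allow(self, ip):
        with self._lock:
            self._allowed.add(ip)

    def revoke(self, ip):
        with self._lock:
            self._allowed.discard(ip)

    def permits(self, ip):
        with self._lock:
            return ip in self._allowed  # average O(1) set lookup


def recv_filtered(sock, allowlist, bufsize=2048):
    """Read one datagram; return its payload, or None if the sender is not allowed."""
    data, (src_ip, _port) = sock.recvfrom(bufsize)
    return data if allowlist.permits(src_ip) else None
```

The set lookup avoids the O(n) rule traversal mentioned in the question, at the cost that every datagram still reaches the socket buffer before it is dropped, which is why pairing it with rate limiting in the firewall is suggested above.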
Hope this link helps you.
Network layer firewalls or packet filters operate at the TCP/IP protocol stack, not allowing packets to pass through the firewall unless they match the established rule set defined by the administrator or applied by default. Modern firewalls can filter traffic based on many packet attributes such as source IP address, source port, destination IP address or port, or destination service like WWW or FTP. They can filter based on protocols, TTL values, netblock of originator, of the source, and many other attributes.
Application layer firewalls work on the application level of the TCP/IP stack, intercepting all packets travelling to or from an application, dropping unwanted outside traffic from reaching protected machines, without acknowledgment to the sender. The additional inspection criteria can add extra latency to the forwarding of packets to their destination.
Media access control (MAC) address filtering protects vulnerable services by allowing or denying access based on the MAC addresses of the specific devices allowed to connect to a specific network.
Proxy servers or services can run on dedicated hardware devices or as software on a general-purpose machine, responding to input packets such as connection requests, while blocking other packets. Abuse of an internal system would not necessarily cause a security breach, although methods such as IP spoofing could transmit packets to a target network.
Network address translation (NAT) functionality allows hiding the IP addresses of protected devices by numbering them with addresses in the "private address range", as defined in RFC 1918. This functionality offers a defence against network reconnaissance.

gsoap client multiple ethernets

I have a Linux system with two Ethernet cards, eth0 and eth1. I am creating a client that sends requests to the endpoint 1.2.3.4.
I call my web service with the soap_call_ functions. How can I select eth1 instead of eth0?
The code looks like this:
soap_call_ns__add(&soap, server, "", a, b, &result);
How can I specify eth0 or eth1 in the &soap variable?
(gSOAP does not have a bind for clients, like soap_bind.)
You want outgoing packets from your host to take a specific route (in this case a specific NIC)? If that's the case, then you have to adjust the kernel's routing tables.
Shorewall has excellent documentation on that kind of setup. You'll find there info about how to direct certain traffic through a particular network interface.
For gSOAP, we need to manually call bind(2) before connect(2) in tcp_connect.
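gSOAP itself is C/C++, but the bind-before-connect idea is the same at the socket level in any language. Below is a minimal Python sketch; connect_from is a hypothetical helper, and local_ip is assumed to be an address configured on the interface you want (eth1 in the question).

```python
import socket


def connect_from(local_ip, remote_addr, timeout=5.0):
    """Open a TCP connection whose source address is local_ip.

    Binding before connecting pins the source IP; port 0 leaves the
    choice of source port to the kernel.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout)
        sock.bind((local_ip, 0))
        sock.connect(remote_addr)
    except OSError:
        sock.close()
        raise
    return sock
```

Note that binding fixes the source address only; which interface the packets actually leave through is still decided by the routing table, which is why the routing-based answer above remains relevant. On Linux the SO_BINDTODEVICE socket option is another way to tie a socket to an interface, though it may require extra privileges.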

First packet to be sent when starting to browse

Imagine a user sitting at an Ethernet-connected PC. He has a browser open. He types "www.google.com" in the address bar and hits enter.
Now tell me what the first packet to appear on the Ethernet is.
I found this question here: Interview Questions on Socket Programming and Multi-Threading
As I'm not a networking expert, I'd like to hear the answer (I'd assume it is "It depends" ;) ).
With a tool like Wireshark, I can obviously check my own computer's behaviour. I'd like to know whether the packets I see (e.g. ARP, DNS, VRRP) are the same in every Ethernet configuration (does it depend on the OS? the driver? even the browser?) and under which conditions they appear. Since this is the data-link layer, does it maybe even depend on the physical network (connected to a hub/switch/router)?
The answers that talk about using ARP to find the DNS server are generally wrong.
In particular, IP address resolution for off-net IP addresses is never done using ARP, and it's not the router's responsibility to answer such an ARP query.
Off-net routing is done by the client machine knowing which IP addresses are on the local subnets to which it is connected. If the requested IP address is not local, then the client machine refers to its routing table to find out which gateway to send the packet to.
Hence in most circumstances the first packet sent out will be an ARP request to find the MAC address of the default gateway, if it's not already in the ARP cache.
Only then can it send the DNS query via the gateway. In this case the packet is sent with the DNS server's IP address in the IP destination field, but with the gateway's MAC address on the ethernet packet.
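The on-link versus off-net decision described above can be sketched as follows. This is a simplified model, assuming a flat list of directly connected subnets and a single default gateway (a real routing table uses longest-prefix matching); next_hop and the example addresses are hypothetical.

```python
import ipaddress


def next_hop(dest_ip, local_networks, default_gateway):
    """Return the IP whose MAC address the host must resolve to reach dest_ip."""
    dest = ipaddress.ip_address(dest_ip)
    for net in local_networks:
        if dest in ipaddress.ip_network(net):
            return dest_ip      # on-link: ARP for the destination itself
    return default_gateway      # off-net: ARP for the gateway instead


if __name__ == "__main__":
    subnets = ["192.168.1.0/24"]
    # An off-net DNS server is reached through the gateway's MAC address.
    print(next_hop("8.8.8.8", subnets, "192.168.1.1"))
    # An on-link DNS server is resolved directly.
    print(next_hop("192.168.1.53", subnets, "192.168.1.1"))
```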
You can always download wireshark and take a look.
Though to spoil the fun.
Assuming the IP address of the host is not cached, and the MAC address of the DNS server is not cached, the first thing that will be sent will be a broadcast ARP message trying to find out the MAC address of the DNS server (which the router will respond to with its own address).
Next, the host name will be resolved using DNS. Then the returned IP address will be resolved using ARP (again the router will respond with its own address), and finally, the HTTP message will actually be sent.
Actually, it depends on a variety of initial conditions you left unspecified.
Assuming the PC is running an operating system containing a local DNS caching resolver (mine does), the first thing that happens before any packets are sent is the cache is searched for an IP address. This is complicated, because "www.google.com" isn't a fully-qualified domain name, i.e. it's missing the trailing dot, so the DNS resolver will accept any records already in its cache that match its search domain list first. For example, if your search domain list is "example.com." followed by "yoyodyne.com." then cached resources matching the names "www.google.com.example.com." "www.google.com.yoyodyne.com." and finally "www.google.com." will be used if available. Also note: if the web browser is one of the more popular ones, and the PC is running a reasonably current operating system, and the host has at least one network interface with a global scope IPv6 address assigned (and the host is on a network where www.google.com has AAAA records in its DNS horizon), then the remote address of the server might be IPv6 not IPv4. This will be important later.
If the remote address of the Google web server was locally cached in DNS, and the ARP/ND6 cache contains an entry for the IPv4/IPv6 address (respectively) of a default router, then the first transmitted packet will be a TCP SYN packet sourced from the interface address attached to the router and destined for the cached remote IPv4/IPv6 address. Alternatively, the default router could be reachable over some kind of layer-2 or layer-3 tunnel, in which case, the SYN packet will be appropriately encapsulated.
If the remote address of the Google web server was not locally cached, then the host will first need to query for the A and/or AAAA records in the DNS domain search list in sequence until it gets a positive response. If the first DNS resolving server address in the resolver configuration is in one of the local IPv4 subnet ranges, or in a locally attached IPv6 prefix with the L=1 bit set in the router advertisement, and the ARP/ND6 cache already contains an entry for the address in question, then the first packet the host will send is a direct DNS query for either an A record or a AAAA record matching the first fully-qualified domain name in the domain search list. Alternatively, if the first DNS server is not addressable on-link, and a default router has an ARP/ND6 cache entry already, then the DNS query packet will be sent to the default router to forward to the DNS server.
In the event the local on-link DNS server or a default router (respectively, as the case above may be) has no entry in the ARP/ND6 cache, then the first packet the host will send is either an ARP request or an ICMP6 neighbor solicitation for the corresponding address.
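The main cases above can be condensed into a small decision function. This is only a sketch of the reasoning in this answer, reduced to three boolean inputs; it ignores tunnels, search-domain iteration, and the edge cases mentioned next, and first_packet is a name made up for the example.

```python
def first_packet(name_cached, dns_server_on_link, neighbor_cached):
    """Name the first packet the host emits, per the cases described above.

    name_cached:        the resolver already holds an address for the name
    dns_server_on_link: the first DNS server is directly reachable
    neighbor_cached:    the ARP/ND6 cache has an entry for the next hop that
                        matters (default router, or the on-link DNS server)
    """
    if not neighbor_cached:
        return "ARP request or ICMPv6 neighbor solicitation"
    if name_cached:
        return "TCP SYN via the default router"
    if dns_server_on_link:
        return "DNS query sent directly to the DNS server"
    return "DNS query sent via the default router"
```

For example, first_packet(False, False, True) describes a host with a warm ARP cache but no cached name and an off-link DNS server, for which the function returns the DNS query sent via the default router.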
Oh, but wait... it's even more horrible. There are tweaky weird edge cases where the first packet the host sends might be a LLMNR query, an IKE initiation, or... or... or... how much do you really care about all this, buckaroo?
It depends
Got that right. E.g. does the local DNS cache contain the address? If not then a DNS lookup is likely to be the first thing.
If the host name is not in DNS cache nor in hosts file, first packet will go to DNS.
Otherwise, the first packet will be HTTP GET.
Well, whatever you try to do, the first thing that happens is some Ethernet protocol-related activity. Notably, Ethernet adapters have to decide whether the Ethernet bus is available (so there's some collision detection taking place here).
It's hard to answer your question because it depends a lot on the type of ethernet network you're using. More information on Ethernet transmission can be found here and here
