Where does the Linux kernel router code replace the MAC address?

A router replaces the source MAC address of a packet it receives with the address of the previous hop, and the destination MAC address with the address of the next hop.
Linux can work as a router. My question is: how does the kernel code implement this MAC address update during packet forwarding, and where is that part of the code?
I tried to find the code in /net/ipv4, but could not find anything...

That is not what actually happens.
IP is not dependent on ethernet, so what happens is dependent upon the underlying protocol of the lower layer.
The same thing happens if it is a locally-originated IP packet, or if it is one which has been routed for another host.
Linux's IPv4 stack is not ethernet-dependent in any way; in fact, lots of other link-layer protocols are supported by the kernel. Because IP is a WAN protocol, you can route between different underlying protocols. Some examples are:
ppp, slip (serial lines)
PPTP, GRE (for tunnels, mostly VPNs)
IP over ATM
Token ring (mostly legacy, I think)
Loopback and dummy (for local communication only)
Wifi (although this is actually mostly identical to ethernet)
So what actually happens when routing IP packets from one ethernet interface to another is that the link-layer header is stripped off completely, then a new link-layer header is built after routing. If the outgoing protocol were not ethernet, an appropriate link-layer header for that protocol would be used instead.
So nobody "changes the MAC address", but rather, the link-layer packet is just completely rebuilt.
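To make the "strip and rebuild" idea concrete, here is a minimal userspace sketch in Python. It is not kernel code; the function names are made up for illustration, and it assumes plain Ethernet II framing with an IPv4 payload on both sides. A real router would also decrement the TTL and update the IP header checksum before re-encapsulating.

```python
import struct

ETH_HEADER_LEN = 14        # dst MAC (6) + src MAC (6) + EtherType (2)
ETHERTYPE_IPV4 = 0x0800


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def strip_link_layer(frame: bytes) -> bytes:
    """Throw away the incoming Ethernet header and keep only the IP packet."""
    return frame[ETH_HEADER_LEN:]


def build_link_layer(ip_packet: bytes, out_if_mac: str, next_hop_mac: str) -> bytes:
    """Wrap the IP packet in a brand-new Ethernet header for the outgoing link."""
    header = struct.pack(
        "!6s6sH",
        mac_to_bytes(next_hop_mac),   # destination: the next hop on the new link
        mac_to_bytes(out_if_mac),     # source: the router's outgoing interface
        ETHERTYPE_IPV4,
    )
    return header + ip_packet


def forward(frame: bytes, out_if_mac: str, next_hop_mac: str) -> bytes:
    """Nothing from the old header is reused; the new one is built from scratch."""
    return build_link_layer(strip_link_layer(frame), out_if_mac, next_hop_mac)
```

Note that `forward` never looks at the old MAC addresses at all, which is the point of the answer above.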

Related

What is an interface identifier

I have a laptop that is connected to my organization's network using one or more network adapters. I am trying to write a tool that will continuously monitor the connectivity status and connection quality of each network. However, my networking knowledge is limited and the terminology confuses me.
Specifically, I need to find all the network adapters. Someone suggested that I use the command ifconfig, and it gave me what are called "interface identifiers".
ex:
['lo0', 'gif0', 'stf0', 'en0', 'en1', 'en2', 'bridge0', 'p2p0']
I'm not quite sure how this helps me solve my problem because I don't know what interface identifiers are and I am not sure how to leverage this information. My assumption is that they represent a computer or a router in the network.
If someone could clear this up or explain it to me in layman's terms that would be really helpful.

First of all, you need to understand that there may be physical network cards (and/or logical network adapters) present in the computer to identify and manage connections.
Next, you have an incorrect notion about the interface identifier.
What you talked about (eth, virbr, lo) are interfaces. In the IPv4 addressing scheme, there is no interface ID. Interface IDs exist in IPv6 addresses.
As mentioned in The Payoff of IPv6’s Very Large Address Size:
In IPv4, IP addresses have no relationship to the addresses used for underlying data link layer network technologies. A host that connects to a TCP/IP network using an Ethernet network interface card (NIC) has an Ethernet MAC address and an IP address, but the two numbers are distinct and unrelated in any way.
With the overhaul of addressing in IPv6, an opportunity presented itself to create a better way of mapping IP unicast addresses and physical network addresses. Implementing this superior mapping technique was one of the reasons why IPv6 addresses were made so large. With 128 total bits, even with a full 48 bits reserved for network prefix and 16 bits for site subnet, we are still left with 64 bits to use for the interface identifier, which is analogous to the host ID under IPv4.
Having so many bits at our disposal gives us great flexibility. Instead of using arbitrary “made-up” identifiers for hosts, we can base the interface ID on the underlying data link layer hardware address, as long as that address is no greater than 64 bits in length. Since virtually all devices use layer two addresses of 64 bits or fewer, there is no problem in using those addresses for the interface identifier in IP addresses. This provides an immediate benefit: it makes networks easier to administer, since we don't have to record two arbitrary numbers for each host. The IP address can be derived from the MAC address and the network identifier. It also means we can in the future tell the IP address from the MAC address and vice-versa.
Visit this link for a clearer understanding of the interface ID.
Now, returning to clear up your confusion:
all of the connections (interfaces) such as Ethernet-0, Ethernet-1, WiFi-1, etc. have their own interface identifier. You can think of it as a kind of special identification number which identifies the interfaces available at that moment!
When you type ifconfig in Linux, it displays the status of the currently active interfaces.
Now, coming to the example, let's say you have two Ethernet connections on your system, say eth0 and eth1 (these are interface names), so ifconfig will print these two as part of its output!
So, to identify these two separate interfaces, there must be an interface identifier. The interface identifier (generally 64-bit) is either automatically generated from the interface's MAC address using the modified EUI-64 format, obtained from a DHCPv6 server, automatically established randomly, or assigned manually.
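As an illustration of the modified EUI-64 case only, here is a small Python sketch that derives a 64-bit interface identifier from a 48-bit MAC address. The function name and the sample MAC are made up for the example. The steps are: flip the universal/local bit of the first octet, then insert ff:fe in the middle.

```python
import ipaddress


def mac_to_interface_id(mac: str) -> str:
    """Derive a modified EUI-64 interface identifier from a 48-bit MAC address."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address like 00:11:22:33:44:55")
    octets[0] ^= 0x02                                   # flip the universal/local bit
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]      # insert ff:fe in the middle
    groups = [f"{(eui64[i] << 8) | eui64[i + 1]:04x}" for i in range(0, 8, 2)]
    return ":".join(groups)


# Example with a made-up MAC address: combine the identifier with the
# link-local prefix fe80::/64 to get a full IPv6 address.
iid = mac_to_interface_id("00:11:22:33:44:55")
link_local = ipaddress.IPv6Address("fe80::" + iid)
print(iid)          # 0211:22ff:fe33:4455
print(link_local)   # fe80::211:22ff:fe33:4455
```

The other sources mentioned above (DHCPv6, random, manual) do not involve the MAC address at all, so this only covers one of the four options.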
Also, the interfaces which you have mentioned are some of the most commonly used interfaces:
'lo0', 'gif0', 'stf0', 'en0', 'en1', 'en2', 'bridge0', 'p2p0'
lo0---loopback interface, for traffic that stays on the local machine (0 for the 1st such interface)
en0---ethernet connection (0, 1, 2 for the 1st, 2nd and 3rd Ethernet connection)
bridge0---a bridged connection to this machine
p2p0---a peer-to-peer connection
I don't know about gif and stf. Please note that there can be logical/virtual connections too, not only physical connections (using NIC cards)!

I discovered that there are man entries for gif and stf -- on OSX, at least. These are the generic tunnel interface and the IPv6-to-IPv4 tunnel interface ("Six To Four"), respectively.

Modifying H323 tcp packet using linux router

I have a Linux router on which I use CONFIG_IP_NF_QUEUE, iptables userland and the Perl module IPTables::IPv4::IPQueue to examine H323 - H.225 packets and pass or drop them. I need to not only accept or drop the packet but also modify it; to be more specific, I would like to change the IP address of the MCU (in the packet) returned from the H323 gatekeeper to the client.
This would require me to examine the TCP packet body and change the IP address in the packet body. Does anyone know how I can accomplish this? Is there any open source layer 7 router capable of doing this?

In the old days I used "ip masquerade" to do something similar to what you are describing.
http://www.tldp.org/HOWTO/IP-Masquerade-HOWTO/supported-client-software.html
But the best solution is to place a gatekeeper as a proxy. In that way you are not fooling the protocol, you are actually remaking the call.
I would look for gnugk routed mode here:
http://www.gnugk.org/h323-proxy.html

If you have already got the IP packet, which from your statement you have succeeded in doing, I don't see the problem with changing the IP address of the packet before passing it on.
Just do some bit manipulation to change the IP address in the IP header (and update the IP header checksum). Also note that you have to update the TCP header checksum, as its calculation involves a pseudo-header that includes the IP addresses.
Reading RFC 791 and RFC 793 should give you an idea of how to do this. It's pretty straightforward.
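A rough Python sketch of the checksum work described above, under some assumptions: the helper names are invented for this example, the IPv4 header is passed in whole, and the TCP segment is passed with its checksum field already zeroed. It is not tied to IPQueue or any particular packet-capture API. Since the TCP checksum also covers the payload, changing an address inside the packet body needs the same TCP recalculation.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by both IPv4 and TCP."""
    if len(data) % 2:
        data += b"\x00"                       # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                        # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def rewrite_ipv4_source(header: bytes, new_src: str) -> bytes:
    """Return a copy of an IPv4 header with a new source address and a fresh checksum."""
    hdr = bytearray(header)
    hdr[12:16] = socket.inet_aton(new_src)    # source address lives at bytes 12..15
    hdr[10:12] = b"\x00\x00"                  # zero the checksum field before summing
    struct.pack_into("!H", hdr, 10, internet_checksum(bytes(hdr)))
    return bytes(hdr)


def tcp_checksum(src: str, dst: str, segment: bytes) -> int:
    """Checksum over the pseudo-header plus the TCP segment (header and payload)."""
    pseudo = (
        socket.inet_aton(src)
        + socket.inet_aton(dst)
        + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(segment))
    )
    return internet_checksum(pseudo + segment)
```

A quick sanity check is that running `internet_checksum` over a header that already contains a correct checksum gives 0.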

Resolving MAC address for IP address using C++ on Linux

I need to generate an Ethernet header that includes the destination MAC address (since libnfnetlink gives only the IP header before prerouting takes place). The outgoing interface number is also known, so the lookup can be made in the correct network.
What's the library/function to resolve the MAC address from an IP address?

It's unclear why you need the MAC address, since that's usually handled for you at a lower level.
However, assuming your target is on your local Ethernet segment, you can use the arp command to look up values in the local cache. If the value is not cached... Well, that's a problem. Perhaps arping would help...
(Normally you'd send a packet to, for example, IP address 10.10.10.10, and your system would send an ARP packet out querying who-has 10.10.10.10, and a response would come back from that target system with its MAC address, and then it would be cached. (You can watch this happening with tcpdump.) Or when a system comes on line it would send out a broadcast message informing everyone else of its MAC address. Naturally, if your destination is on another Ethernet segment, you're routing to a gateway rather than directly to the destination, and no destination-MAC address is available.)
You might read further at:
http://linux.die.net/man/8/arp
http://linux.die.net/man/8/arping
http://linux.die.net/man/7/arp
http://www.kernel.org/doc/man-pages/online/pages/man7/arp.7.html

Obviously you can only find the MAC address for directly connected IP addresses, but there's no platform-independent way of doing it. On Linux, you can look in /proc/net/arp after sending something to the target to trigger the kernel to send the ARP.
Edit to add: you could also use the SIOCGARP ioctl(), though that just looks in the ARP cache, so it won't send an ARP request if there isn't an entry already there.
Otherwise, you would have to craft your own ARP request packet. You could probably reuse a bunch of code from arping if you go that route.
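For the /proc/net/arp route, a minimal Python sketch of the lookup might look like this (the question asks for C++, but the parsing logic carries over directly). The function names are made up; the column layout and the 0x2 "entry complete" flag are assumed from the usual Linux format of that file.

```python
ATF_COM = 0x2   # flag bit meaning the entry is complete (hardware address is valid)


def parse_arp_table(text: str) -> dict:
    """Parse the contents of /proc/net/arp into {(ip, device): mac}."""
    table = {}
    for line in text.splitlines()[1:]:        # skip the column header line
        fields = line.split()
        if len(fields) < 6:
            continue
        ip, _hw_type, flags, mac, _mask, device = fields[:6]
        if not int(flags, 16) & ATF_COM:      # incomplete entry, no usable MAC yet
            continue
        table[(ip, device)] = mac
    return table


def lookup_mac(ip: str, device: str, path: str = "/proc/net/arp"):
    """Return the cached MAC for ip on device, or None if it is not in the cache."""
    with open(path) as handle:
        return parse_arp_table(handle.read()).get((ip, device))
```

As noted above, this only reads the cache. If `lookup_mac` returns None, something still has to send traffic (or an ARP request) to the target first.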

You cannot in general get the MAC address from the IP address, and in fact as IP can run on data link protocols other than ethernet, some IP addresses have no corresponding MAC address.
The MAC address is only available and only relevant on the same ethernet segment. On that segment, it can be retrieved by an ARP request.
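For reference, this is roughly what such an ARP request looks like on the wire, sketched in Python. The function name and the addresses in the usage comment are made up. The sketch only builds the frame; actually sending it on Linux would need a raw AF_PACKET socket and root privileges, which is left out here.

```python
import socket
import struct

ETHERTYPE_ARP = 0x0806
BROADCAST_MAC = b"\xff" * 6


def build_arp_request(src_mac: bytes, src_ip: str, target_ip: str) -> bytes:
    """Build a broadcast Ethernet frame asking 'who has target_ip? tell src_ip'."""
    ethernet = struct.pack("!6s6sH", BROADCAST_MAC, src_mac, ETHERTYPE_ARP)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                              # hardware type: Ethernet
        0x0800,                         # protocol type: IPv4
        6,                              # hardware address length
        4,                              # protocol address length
        1,                              # operation: request
        src_mac,                        # sender hardware address
        socket.inet_aton(src_ip),       # sender protocol address
        b"\x00" * 6,                    # target hardware address: unknown
        socket.inet_aton(target_ip),    # target protocol address
    )
    return ethernet + arp


# Usage with made-up addresses:
# frame = build_arp_request(bytes.fromhex("020000000001"), "10.10.10.5", "10.10.10.10")
```

The reply carries the target's MAC in its sender hardware address field, which is what ends up in the ARP cache.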

Selecting an Interface when Multicasting on Linux

I'm working with a cluster of about 40 nodes running Debian 4. Each node runs a daemon which sits and listens on a multicast IP.
I wrote some client software to send out a multicast over the LAN with a client computer on the same switch as the cluster, so that each node in the cluster would receive the packet and respond.
It works great, except when I run the client software on a computer that has both LAN and WAN interfaces. If there is a WAN interface, the multicast doesn't work. So obviously, I figure the multicast is incorrectly going over the WAN interface (eth0), rather than the LAN (eth1.) So, I use the SO_BINDTODEVICE socket option to force the multicast socket to use eth1, and all is well.
But I thought that the kernel's routing table should determine that the LAN (eth1) is obviously a lower cost destination for the multicast. Is there some reason I have to explicitly force the socket to use eth1? And, is there some way (perhaps an ioctl call) that I can have the application automatically determine if a particular interface is a LAN or WAN?

If you don't explicitly bind to an interface, I believe Linux uses the interface for the default unicast route for multicast sending.
Linux needs a multicast route; if none exists you will get an EHOSTUNREACH or ENETUNREACH error. The LCM project documents this possible problem. The routing will be overridden if you use the socket option IP_MULTICAST_IF or IPV6_MULTICAST_IF. You are supposed to be able to specify the interface via the scope-id field in IPv6 addresses, but not all platforms properly support it. As dameiss points out, Stevens' Unix Network Programming book covers these details; you can browse most of the chapter on multicast via Google Books for free.
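A minimal sketch of the IP_MULTICAST_IF approach in Python (the same socket option exists in C). The function name is made up, and the interface address, group and port in the usage comment are placeholders rather than values from the question.

```python
import socket


def make_multicast_sender(interface_ip: str, ttl: int = 1) -> socket.socket:
    """Create a UDP socket whose outgoing multicast leaves via the given local address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Keep the datagrams on the local network segment by default.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    # Choose the outgoing interface by its local IPv4 address, overriding the routing table.
    # Passing 0.0.0.0 hands the choice back to the kernel.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF, socket.inet_aton(interface_ip))
    return sock


# Usage with placeholder values (the LAN address of eth1, a group and a port):
# sender = make_multicast_sender("192.168.0.10")
# sender.sendto(b"hello", ("239.1.2.3", 5000))
```

Compared with SO_BINDTODEVICE, this only affects outgoing multicast on that socket and does not normally need elevated privileges.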

If you don't explicitly bind to an interface, I believe Linux uses the interface for the default unicast route for multicast sending. So my guess is that your default route is via the WAN interface.
Richard Stevens' "Unix Network Programming, Vol. 1", chapter 17 (at least in the 3rd edition), has some good information and examples of how to enumerate the network interfaces.

First packet to be sent when starting to browse

Imagine a user sitting at an Ethernet-connected PC. He has a browser open. He types "www.google.com" in the address bar and hits enter.
Now tell me what the first packet to appear on the Ethernet is.
I found this question here: Interview Questions on Socket Programming and Multi-Threading
As I'm not a networking expert, I'd like to hear the answer (I'd assume it is "It depends" ;) ).
With a tool like Wireshark, I can obviously check my own computer's behaviour. I'd like to know whether the packets I see (e.g. ARP, DNS, VRRP) are the same in each ethernet configuration (is it dependent on the OS? the driver? the browser even :)?) and under which conditions they appear. Being on the data-link layer, is it maybe even dependent on the physical network (connected to a hub/switch/router)?

The answers that talk about using ARP to find the DNS server are generally wrong.
In particular, IP address resolution for off-net IP addresses is never done using ARP, and it's not the router's responsibility to answer such an ARP query.
Off-net routing is done by the client machine knowing which IP addresses are on the local subnets to which it is connected. If the requested IP address is not local, then the client machine refers to its routing table to find out which gateway to send the packet to.
Hence in most circumstances the first packet sent out will be an ARP request to find the MAC address of the default gateway, if it's not already in the ARP cache.
Only then can it send the DNS query via the gateway. In this case the packet is sent with the DNS server's IP address in the IP destination field, but with the gateway's MAC address on the ethernet packet.
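The on-link versus off-link decision described above can be sketched in a few lines of Python. This is a simplification with a made-up function name: it assumes a single default gateway and a flat list of directly connected subnets instead of a full routing table.

```python
import ipaddress


def next_hop(destination: str, local_networks: list, default_gateway: str) -> str:
    """Return the IP address whose MAC the host must resolve before sending."""
    dst = ipaddress.ip_address(destination)
    for network in local_networks:
        if dst in ipaddress.ip_network(network):
            return destination        # on-link: ARP for the destination itself
    return default_gateway            # off-link: ARP for the gateway instead


# A host on 192.168.1.0/24 with gateway 192.168.1.1 and DNS server 8.8.8.8:
print(next_hop("192.168.1.20", ["192.168.1.0/24"], "192.168.1.1"))  # 192.168.1.20
print(next_hop("8.8.8.8", ["192.168.1.0/24"], "192.168.1.1"))       # 192.168.1.1
```

In the second case the IP destination stays 8.8.8.8, and only the Ethernet destination is the gateway's MAC.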

You can always download wireshark and take a look.
Though, to spoil the fun:
Assuming the IP address of the host is not cached, and the MAC address of the DNS server is not cached, the first thing that will be sent will be a broadcast ARP message trying to find out the MAC address of the DNS server (which the router will respond to with its own address).
Next, the host name will be resolved using DNS. Then the returned IP address will be resolved using ARP (again the router will respond with its own address), and finally, the HTTP message will actually be sent.

Actually, it depends on a variety of initial conditions you left unspecified.
Assuming the PC is running an operating system containing a local DNS caching resolver (mine does), the first thing that happens before any packets are sent is the cache is searched for an IP address. This is complicated, because "www.google.com" isn't a fully-qualified domain name, i.e. it's missing the trailing dot, so the DNS resolver will accept any records already in its cache that match its search domain list first. For example, if your search domain list is "example.com." followed by "yoyodyne.com." then cached resources matching the names "www.google.com.example.com." "www.google.com.yoyodyne.com." and finally "www.google.com." will be used if available. Also note: if the web browser is one of the more popular ones, and the PC is running a reasonably current operating system, and the host has at least one network interface with a global scope IPv6 address assigned (and the host is on a network where www.google.com has AAAA records in its DNS horizon), then the remote address of the server might be IPv6 not IPv4. This will be important later.
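To illustrate the search-list expansion, here is a tiny Python sketch that produces the candidate names in the order given in this answer. The function name is made up, and real resolvers may order the candidates differently (for example depending on the ndots option in resolv.conf), so treat the ordering as an assumption.

```python
def candidate_names(name: str, search_domains: list) -> list:
    """List the fully-qualified names to try for a possibly relative host name."""
    if name.endswith("."):
        return [name]                 # already fully qualified, no search list applied
    candidates = [f"{name}.{domain.rstrip('.')}." for domain in search_domains]
    candidates.append(name + ".")     # finally, the name taken as absolute
    return candidates


print(candidate_names("www.google.com", ["example.com.", "yoyodyne.com."]))
# ['www.google.com.example.com.', 'www.google.com.yoyodyne.com.', 'www.google.com.']
```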
If the remote address of the Google web server was locally cached in DNS, and the ARP/ND6 cache contains an entry for the IPv4/IPv6 address (respectively) of a default router, then the first transmitted packet will be a TCP SYN packet sourced from the interface address attached to the router and destined for the cached remote IPv4/IPv6 address. Alternatively, the default router could be reachable over some kind of layer-2 or layer-3 tunnel, in which case, the SYN packet will be appropriately encapsulated.
If the remote address of the Google web server was not locally cached, then the host will first need to query for the A and/or AAAA records in the DNS domain search list in sequence until it gets a positive response. If the first DNS resolving server address in the resolver configuration is in one of the local IPv4 subnet ranges, or in a locally attached IPv6 prefix with the L=1 bit set in the router advertisement, and the ARP/ND6 cache already contains an entry for the address in question, then the first packet the host will send is a direct DNS query for either an A record or a AAAA record matching the first fully-qualified domain name in the domain search list. Alternatively, if the first DNS server is not addressable on-link, and a default router has an ARP/ND6 cache entry already, then the DNS query packet will be sent to the default router to forward to the DNS server.
In the event the local on-link DNS server or a default router (respectively, as the case above may be) has no entry in the ARP/ND6 cache, then the first packet the host will send is either an ARP request or an ICMP6 neighbor solicitation for the corresponding address.
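The cases so far boil down to a small decision table. Here it is as a Python sketch with made-up parameter names. It deliberately ignores tunnels and other edge cases, and "next hop" means whichever neighbour the first packet has to go through (the on-link DNS server or a default router).

```python
def first_packet(dns_answer_cached: bool, next_hop_resolved: bool, ipv6: bool = False) -> str:
    """Name the first packet on the wire under the simplified cases above."""
    if not next_hop_resolved:
        # The link-layer address of the next hop is unknown, so resolve it first.
        return "ICMPv6 neighbor solicitation" if ipv6 else "ARP request"
    if dns_answer_cached:
        # Server address and next hop are both known: open the TCP connection.
        return "TCP SYN"
    # The next hop is known but the server address is not: ask the DNS server.
    return "DNS query"


for cached in (True, False):
    for resolved in (True, False):
        print(cached, resolved, first_packet(cached, resolved))
```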
Oh, but wait... it's even more horrible. There are tweaky weird edge cases where the first packet the host sends might be a LLMNR query, an IKE initiation, or... or... or... how much do you really care about all this, buckaroo?

It depends
Got that right. E.g. does the local DNS cache contain the address? If not then a DNS lookup is likely to be the first thing.

If the host name is neither in the DNS cache nor in the hosts file, the first packet will go to the DNS server.
Otherwise, the first packet will be HTTP GET.

Well, whatever you try to do, the first thing happening is some Ethernet protocol related data. Notably, Ethernet adapters have to decide whether the Ethernet bus is available (so there's some collision detection taking place here)
It's hard to answer your question because it depends a lot on the type of ethernet network you're using. More information on Ethernet transmission can be found here and here
