Improve Ethernet throughput for jumbo frames - linux

We are running a throughput test on the GigE port of a Macnica Helio board with 1 GB of DDR3. We are currently achieving 60% throughput with jumbo frames, but we expect higher throughput in our application.
The throughput percentage is calculated as follows:
(100 MB / time taken × 8 bits / 1 Gbps) × 100%
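The formula above can be checked with a quick one-liner; the 1.33 s transfer time below is a hypothetical example value (roughly the 60% case), not a measurement from the board:

```shell
# Worked example of the throughput formula above.
bytes=100000000   # 100 MB transferred
elapsed=1.33      # hypothetical measured transfer time in seconds
awk -v b="$bytes" -v t="$elapsed" 'BEGIN {
    # bits transferred / (time * line rate), as a percentage of 1 Gbps
    printf "%.1f%%\n", (b * 8 / t) / 1e9 * 100
}'
```

For t = 1.33 this prints 60.2%, i.e. a transfer time of about 1.3 s corresponds to the 60% figure quoted above.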
What we did:
-Transferred 100 MB using our server and client code
Server (Cyclone V)
-Changed the eth0 MTU to 7500, then ran the server code. (The MTU change only succeeds if we first turn off TX checksum offload with "ethtool -K eth0 tx off"; otherwise we can only raise the MTU to 3500.)
Client (laptop running Ubuntu)
-Changed the eth0 MTU to 9000, then ran the client code and measured the throughput with Wireshark
We also tried changing the kernel network settings with the commands below, but the throughput was unchanged. (Note that the multi-value keys must be quoted, or sysctl -w only receives the first value.)
-sysctl -w net.core.rmem_max=18388608
-sysctl -w net.core.wmem_max=18388608
-sysctl -w net.core.rmem_default=1065536
-sysctl -w net.core.wmem_default=1065536
-sysctl -w net.ipv4.tcp_rmem="4096 87380 18388608"
-sysctl -w net.ipv4.tcp_wmem="4096 87380 18388608"
-sysctl -w net.ipv4.tcp_mem="18388608 18388608 18388608"
-sysctl -w net.ipv4.route.flush=1
-sysctl -w net.ipv4.tcp_mtu_probing=1
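Settings applied with sysctl -w are lost on reboot. A sketch of persisting the same values as a sysctl.conf fragment (the file name is my choice; the three fields of the multi-value keys are min, default, and max):

```shell
# /etc/sysctl.d/90-throughput.conf -- same values as the commands above.
net.core.rmem_max = 18388608
net.core.wmem_max = 18388608
net.core.rmem_default = 1065536
net.core.wmem_default = 1065536
net.ipv4.tcp_rmem = 4096 87380 18388608
net.ipv4.tcp_wmem = 4096 87380 18388608
net.ipv4.tcp_mtu_probing = 1
```

Apply without rebooting via `sudo sysctl -p /etc/sysctl.d/90-throughput.conf`.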
Question
Is there any method or solution to achieve higher throughput?
Is there any side effect to turning off the TX checksum?
What is the difference between the cubic and bic settings for tcp_congestion_control, and will the choice affect throughput performance?

Use ntop.org's PF_RING sockets instead of PF_INET sockets. We have been able to get up to 75% throughput with the GigE Vision protocol (UDP) using Intel (e1000) NICs, without using the NIC-specific PF_RING drivers.
AFAIK tcp_congestion_control will only help you at the start of the TCP session and has no effect once the session is established.

Related

How to raise the buffer value with ethtool on the network

ifconfig shows dropped packets on eth0, and the dropped counter keeps increasing by large amounts.
$ ifconfig eth0
eth0 Link encap:Ethernet HWaddr 4c:72:b9:f6:27:a8
inet addr:192.168.1.102 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::4e72:b9ff:fef6:27a8/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2745254558 errors:0 **dropped:1003363** overruns:0 frame:0
TX packets:7633337281 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1583378766375 (1.5 TB) TX bytes:10394167206386 (10.3 TB)
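The same counter can also be read without parsing ifconfig output, since the kernel exposes per-interface statistics under /sys/class/net. A small sketch; the helper name is mine, and the optional second argument (an alternative sysfs root) exists only so the function can be exercised against a test directory:

```shell
# Print the RX dropped-packet counter for an interface by reading the
# kernel's per-interface statistics files under /sys/class/net.
rx_dropped() {
    # $1 = interface name, $2 = optional sysfs root (defaults to /sys)
    cat "${2:-/sys}/class/net/$1/statistics/rx_dropped"
}
# Usage: rx_dropped eth0
```

Running it periodically (e.g. under `watch`) shows whether the drops are ongoing or historical.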
So I'll use ethtool to raise the network buffer value.
$ sudo ethtool -g eth0
Ring parameters for eth0:
Cannot get device ring settings: Operation not supported
So I can't check eth0's ring settings.
Also, I don't understand what the ring is.
Is this a virtual machine?
From the counters you pasted, I assume the drops are on the RX side.
You need to raise the RX ring buffer with ethtool -G eth0 rx 4096.
Show more info: ethtool -i eth0 and netstat -s.
There is a lot more to tuning eth0 than just ring buffers.
Try raising net.core.netdev_max_backlog.
Check it with sysctl net.core.netdev_max_backlog and set a new value with sysctl -w net.core.netdev_max_backlog=<number>.
EDIT:
Please also show the card's hardware info:
sudo lshw -C network
Also, google for "Cannot get device ring settings r8169".
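Gathering the steps from this answer into one sequence (run as root). The 4096 and 65536 values are examples, not recommendations for this specific card, and the `-G` step will fail on hardware like the r8169 that does not expose ring settings at all:

```shell
ethtool -g eth0                        # show supported/current ring sizes
ethtool -G eth0 rx 4096                # enlarge the RX ring, if supported
sysctl net.core.netdev_max_backlog     # current input-queue backlog limit
sysctl -w net.core.netdev_max_backlog=65536
ethtool -i eth0                        # driver name, to search for known issues
netstat -s                             # protocol-level drop/error counters
```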

Multipath TCP : Multiple connections Not Showing

I installed the MPTCP kernel on my machine. I tried to test MPTCP by running iperf -c multipath-tcp.org (both endpoints are MPTCP-capable).
I wanted to check whether iperf lists the subflows created. I have an active Wi-Fi interface and an active wired interface, but iperf showed only the connection on the wired interface:
Client connecting to multipath-tcp.org, TCP port 5001
TCP window size: 45.0 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.42.123 port 52983 connected with 130.104.230.45 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-22.7 sec 384 KBytes 139 Kbits/sec
This shouldn't be the case: my wired link is very slow, so even if the flow started there, a subflow should surely have been created on the Wi-Fi interface as well.
How can I actually see that MPTCP is in fact creating subflows?
I saw the question here but my cat proc... file is showing
sl loc_tok rem_tok v6 local_address remote_address st ns tx_queue rx_queue inode
0: B491F32C CDF952DC 0 0B2BA8C0:8E9C 2DE66882:1389 01 02 00000000:00000000 203077
which doesn't seem to relate to any subflows.
Maybe you can check the MPTCP settings with sysctl net.mptcp; the path manager should be set to fullmesh rather than default in order to establish multiple subflows:
sysctl -w net.mptcp.mptcp_path_manager=fullmesh
sysctl -w net.mptcp.mptcp_enabled=1
Further explanation of mptcp setting can be viewed at http://multipath-tcp.org/pmwiki.php/Users/ConfigureMPTCP
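Once fullmesh is enabled, one rough way to see subflows on this kernel fork is to count the data lines in the /proc/net/mptcp table quoted in the question (one line per subflow after the header). A sketch; the helper name is mine, and the optional argument exists only so it can read a saved copy of the file instead of the live /proc entry:

```shell
# Count MPTCP subflows: every line after the header row of the
# /proc/net/mptcp table corresponds to one subflow.
mptcp_subflows() {
    awk 'NR > 1 { n++ } END { print n + 0 }' "${1:-/proc/net/mptcp}"
}
# Usage: mptcp_subflows   -> 1 means a single subflow, i.e. no multipath yet
```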

How to Capture Remote System network traffic?

I have been using Wireshark to analyse the packets of my socket programs. Now I want to see the traffic of other hosts. I found that I need to use monitor mode, which is only supported on Linux, so I tried it, but I couldn't capture any packets transferred on my network; it lists 0 packets captured.
Scenario:
My network consists of 50+ hosts (all running Windows except mine), and my IP address is 192.168.1.10. When I initiate communication with any 192.168.1.xx host, the captured traffic shows up.
But my requirement is to monitor the traffic between 192.168.1.21 and 192.168.1.22 from my own host, i.e. from 192.168.1.10.
1: Is it possible to capture the traffic as described?
2: If it is possible, is Wireshark the right tool for it, or should I use a different one?
3: If it is not possible, why not?
Just adapt this a bit with your own filters and IPs (run on the local host):
ssh -l root <REMOTE HOST> tshark -w - not tcp port 22 | wireshark -k -i -
or using bash :
wireshark -k -i <(ssh -l root <REMOTE HOST> tshark -w - not tcp port 22)
You can use tcpdump instead of tshark if needed :
ssh -l root <REMOTE HOST> tcpdump -U -s0 -w - -i eth0 'not port 22' |
wireshark -k -i -
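If process substitution isn't available (plain sh rather than bash), the same plumbing can be sketched with an explicit named pipe; <REMOTE HOST> is a placeholder as above, and /tmp/remote_cap is an arbitrary path of my choosing:

```shell
# Same idea with a named pipe instead of <(...). 'not port 22' excludes
# the SSH session itself, so the capture doesn't feed back into the
# stream that carries it.
mkfifo /tmp/remote_cap
ssh -l root <REMOTE HOST> "tcpdump -U -s0 -w - -i eth0 'not port 22'" > /tmp/remote_cap &
wireshark -k -i /tmp/remote_cap
```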
You are connected to a switch which is "switching" traffic. It decides which traffic to send you based on your MAC address, and it will NOT send you traffic that is not destined to your MAC address. If you want to monitor all the traffic, you need to configure your switch with a "port mirror" and plug your sniffer into that port. There is no software you can install on your machine that will circumvent the way network switching works.
http://en.wikipedia.org/wiki/Port_mirroring

Why does Ubuntu terminal shut down while running load tests?

I'm facing a peculiar problem when doing load testing on my laptop with 2000 concurrent users using CometD, following all the steps in http://cometd.org/documentation/2.x/howtos/loadtesting.
The tests run fine for about 1000 concurrent clients.
But when I increase the load to about 2000 CCUs, the terminal just shuts down.
Any idea what's happening here?
BTW, I have applied all the OS-level settings from that page, i.e.:
# ulimit -n 65536
# ifconfig eth0 txqueuelen 8192 # replace eth0 with the ethernet interface you are using
# /sbin/sysctl -w net.core.somaxconn=4096
# /sbin/sysctl -w net.core.netdev_max_backlog=16384
# /sbin/sysctl -w net.core.rmem_max=16777216
# /sbin/sysctl -w net.core.wmem_max=16777216
# /sbin/sysctl -w net.ipv4.tcp_max_syn_backlog=8192
# /sbin/sysctl -w net.ipv4.tcp_syncookies=1
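One thing worth double-checking: `ulimit -n` only applies to the shell it was run in and to that shell's children, so the values should be verified from the same terminal that launches the load test. A quick check sketch:

```shell
# Confirm the tuning above actually took effect for this session.
ulimit -n                                  # expect 65536 after the setting above
cat /proc/sys/net/core/somaxconn           # expect 4096
cat /proc/sys/net/core/netdev_max_backlog  # expect 16384
```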
Also, I have noticed this happens even when I run load tests for other platforms. I know it has to be something related to the OS, but I cannot figure out what it could be.
Has the ulimit command been executed correctly? I read something about this in the Ubuntu forum archive and in an Ubuntu Apache problem thread.

Simulate dropped packets on Linux, based on protocol (UDP, TCP etc)

I know I can use tc and netem to do
tc qdisc add dev eth0 root netem loss 50%
This would drop 50% of packets in all of eth0 traffic. However, I would like to specify a protocol (UDP, TCP etc), so only packets of this protocol would be dropped.
This is an annoying feature of iptables. Even though the docs say that DROP silently drops the packet on the floor, it still tells the calling program, causing sendmsg (or whatever) to fail with errno set to ENETUNREACH or EPERM. There doesn't seem to be a "no, really, silently drop the packet and don't tell anyone about it" option.
I have found the following workaround, however: if the packets are going to be leaving your local machine, you can set the TTL to 0 in the mangle table:
iptables -t mangle -A SomeChain -m ttl -j TTL --ttl-gt 0 --ttl-set 0
I have successfully used this to deal with a DoS reflection attack.
Use iptables instead - it has a probability option that should allow you to do this, for example:
iptables -A INPUT -m statistic -p tcp --mode random --probability 0.5 -j DROP
Adjust the various values to match the desired traffic/direction/probability.
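If the errno side effect of iptables DROP described in the other answer is a problem, the protocol match can also be done entirely inside tc, so the sending application never sees an error. A sketch for dropping 50% of UDP only (17 is UDP's IP protocol number; use 6 for TCP); run as root:

```shell
# Attach netem to one band of a prio qdisc and steer only UDP into it
# with a u32 classifier; other traffic bypasses the loss entirely.
tc qdisc add dev eth0 root handle 1: prio
tc qdisc add dev eth0 parent 1:3 handle 30: netem loss 50%
tc filter add dev eth0 parent 1: protocol ip prio 3 u32 \
    match ip protocol 17 0xff flowid 1:3
```

Remove it afterwards with `tc qdisc del dev eth0 root`.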
