I am running a C++ client and server implemented using gRPC locally on different ports. What I want to do is run both of them under different bandwidth limits so that I can see a difference in the time taken to finish the entire communication.
I have tried wondershaper and trickle, but neither seemed to work.
I also tried to use tc to do the traffic control, as follows:
tc qdisc add dev lo root tbf rate 10mbit burst 10mbit latency 900ms
I used this command to limit the local bandwidth to 10 Mbit/s. Is this the right way to simulate limited bandwidth locally?
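As an aside on measuring the difference: a minimal sketch of timing the whole exchange on the client side with std::chrono, assuming the client's existing logic is wrapped in a callable (DoEntireCommunication is just a placeholder name, not real gRPC API):

#include <chrono>
#include <iostream>
#include <utility>

// Minimal timing sketch: run the client's whole exchange once per bandwidth
// setting and print how long it took, so the runs can be compared.
template <typename Fn>
long long MeasureMillis(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
}

int main() {
    // Replace the lambda body with the real client calls, e.g. DoEntireCommunication().
    const long long ms = MeasureMillis([] { /* DoEntireCommunication(); */ });
    std::cout << "Total communication time: " << ms << " ms\n";
}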
I have a Linux network application that I am trying to optimize for low latency. This application consumes UDP and produces TCP traffic. I have set up a network tap and written some scripts that correlate the UDP traffic with the application's TCP response to compute end-to-end latency. I have also set up tracing within the application so I can measure internal latency. I have found that typical end-to-end latency as measured by the capture device is about 20us, but in about 5% of cases the latency can spike to 2000us or even more. Correlating the internal logs with the capture device logs indicates this spike originates in the kernel's TCP transmission.
Any suggestions on how I could get a better understanding of what is going on and hopefully fix it? I am running on a machine with 4 hardware cores, three of which are dedicated to the application and the remaining one left for the OS.
Update: Further investigation of the PCAP file shows that TCP messages that exhibit high latency are always immediately preceded by an ACK from the system that is the target of the TCP data (i.e. the system to which the machine under test is sending its TCP data). This leads me to believe that the system under test is trying to keep the data in flight under some minimum, and that is why it deliberately delays its responses. I have not been able to tune this behavior out, though.
Thanks in advance
I'm pretty sure it's too late for you, but maybe it'll help someone in the future. I'm almost sure that you haven't turned off the Nagle algorithm by setting the TCP_NODELAY socket option.
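For anyone landing here later, a minimal sketch of what that looks like on a plain POSIX TCP socket (fd is assumed to be the application's already-connected socket):

#include <netinet/in.h>   // IPPROTO_TCP
#include <netinet/tcp.h>  // TCP_NODELAY
#include <sys/socket.h>   // setsockopt
#include <cstdio>

// Disable Nagle's algorithm so small writes are sent immediately instead of
// being coalesced while waiting for the peer's ACK.
bool DisableNagle(int fd) {
    int flag = 1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag)) != 0) {
        std::perror("setsockopt(TCP_NODELAY)");
        return false;
    }
    return true;
}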
I have a setup with 2 machines. I am using one as the server and the other as the client. They are connected directly using a 1 Gbit/s link. Both machines have 4 cores, 8 GB of RAM and almost 100 GB of disk space. I need to tune the Nginx server (it's the one I'm trying with, but I can use any other as well) to handle 85,000 concurrent connections. I have a 1 kB file on the server and I am using curl on the client to get the same file over all the connections.
After trying various tuning settings, I have 1,500 established connections and around 30,000 TIME_WAIT connections when I call curl around 40,000 times. Is there a way I can turn those TIME_WAIT connections into ESTABLISHED ones?
Any help in tuning both the server and the client would be much appreciated. I am pretty new to Linux and trying to get the hang of it. The version of Linux on both machines is Fedora 20.
Besides tuning Nginx, you will also need to tune your Linux installation with respect to limits on the number of TCP connections, sockets, open files, etc.
These two links should give you a great overview:
https://www.nginx.com/blog/tuning-nginx/
https://mrotaru.wordpress.com/2013/10/10/scaling-to-12-million-concurrent-connections-how-migratorydata-did-it/
You might want to check how much memory TCP buffers etc are using for all those connections.
See this SO thread: How much memory is consumed by the Linux kernel per TCP/IP network connection?
Also, this page is good: http://www.psc.edu/index.php/networking/641-tcp-tune
Given that your two machines are on the same physical network and delays are very low, you can use fairly small TCP window buffer sizes. Modern Linux kernels (you didn't mention which kernel you're using) have TCP autotuning that automatically adjusts these buffers, so you should not have to worry about this unless you're using an old kernel.
However, an application can allocate its send and receive buffers explicitly, which disables TCP autotuning for those sockets, so if you're running an application that does this, you might want to limit how much buffer space an application can request per connection (the net.core.wmem_max and net.core.rmem_max variables mentioned in the SO article).
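For illustration, this is the kind of per-socket buffer setting that switches autotuning off for a connection (a sketch, not code from the question; the 64 KiB value is an arbitrary example, and the kernel caps the request at the wmem_max/rmem_max limits):

#include <sys/socket.h>  // setsockopt, SOL_SOCKET, SO_SNDBUF, SO_RCVBUF
#include <cstdio>

// Explicitly sizing a socket's buffers disables TCP autotuning for it,
// which is why capping wmem_max/rmem_max matters for such applications.
bool SetFixedBuffers(int fd, int bytes) {
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) != 0 ||
        setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) != 0) {
        std::perror("setsockopt(SO_SNDBUF/SO_RCVBUF)");
        return false;
    }
    return true;
}

// Example: SetFixedBuffers(sock_fd, 64 * 1024);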
I would recommend https://github.com/eunyoung14/mtcp to achieve 1 million concurrent connections. I did some tuning of mTCP and tested it on a used Dell PowerEdge R210 with 32 GB of RAM and 8 cores, and it reached 1 million concurrent connections.
I want to simulate the following scenario: I have 4 Ubuntu server machines A, B, C and D. I want to reduce the network bandwidth by 20% between machines A and C, and by 10% between A and B. How can I do this using network simulation/throttling tools?
Ubuntu comes with a tool called NetEm (driven through tc). It can control most network-layer metrics (bandwidth, delay, packet loss). There are tons of tutorials online.
Dummynet is another tool that can do this.
KauNet, a tool developed by Karlstad University, can introduce packet-level control.
The simple program wondershaper fits here very well.
Just execute:
sudo wondershaper eth0 1024 1024
It will limit your network bandwidth on the eth0 interface to 1Mbps download and upload rate.
Hey, I'm developing a 3G-connected device with a Raspberry Pi. My mobile provider allows me to use 50 MB/month.
This device will be installed somewhere that nobody can physically access.
My concern is to avoid data traffic overuse. I need a tool to measure all the accumulated traffic going through (in and out) the ppp0 interface in order to disconnect the interface until next month if the 50MB limit is reached.
I tried ifconfig, but since I have some disconnections, the counter is reset at each reconnection.
I tried ntop and iftop but from what I understood these are tools for measuring real-time traffic.
I'm really looking for some kind of cumulative traffic use, like we usually can find on smartphones.
Any idea?
Take a look at IPtraf :)
I'm not sure if it will go into enough detail for you, as it is relatively lightweight, though it may not be wise to go too heavy on the Raspberry Pi's processor anyway. You could also try looking around for NetFlow- or SNMP-based solutions, though I think that might be overkill.
Good luck!
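If a ready-made tool doesn't fit, one option is to accumulate the kernel's per-interface byte counters yourself. Here is a rough sketch of that idea, meant to be run periodically (e.g. from cron); the state-file path is an assumption, and handling the monthly reset and actually bringing ppp0 down are left to the caller:

#include <cstdint>
#include <fstream>
#include <iostream>
#include <string>

// Reads one numeric counter such as /sys/class/net/ppp0/statistics/rx_bytes.
static uint64_t ReadCounter(const std::string& path) {
    std::ifstream in(path);
    uint64_t value = 0;
    in >> value;
    return value;
}

int main() {
    const uint64_t rx = ReadCounter("/sys/class/net/ppp0/statistics/rx_bytes");
    const uint64_t tx = ReadCounter("/sys/class/net/ppp0/statistics/tx_bytes");

    // Previous total and previous raw counters, persisted between runs.
    uint64_t total = 0, last_rx = 0, last_tx = 0;
    std::ifstream state("/var/lib/ppp0-usage.state");
    state >> total >> last_rx >> last_tx;  // all remain 0 on the first run

    // If ppp0 reconnected, the kernel counters started again from 0,
    // so the whole current value counts as new traffic.
    total += (rx >= last_rx) ? rx - last_rx : rx;
    total += (tx >= last_tx) ? tx - last_tx : tx;

    std::ofstream out("/var/lib/ppp0-usage.state", std::ios::trunc);
    out << total << ' ' << rx << ' ' << tx << '\n';

    std::cout << "Cumulative ppp0 traffic: " << total << " bytes\n";
    // A wrapper script could compare this against the 50 MB cap and take
    // the interface down until the next month once it is exceeded.
    return 0;
}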
I want to introduce latency while accessing some files from my system, so that I can measure the effect of latency on my application when it accesses data over the network (to be simulated using the netem module).
I did the following to achieve this:
I used two machines, Host1 and Host2. I placed the files to be accessed by the application on Host1's hard disk, where they can be accessed via /net/<login>/Host1/data. I then launched my application on Host2 and accessed the data from Host1 using the path mentioned above.
I also introduced latency on Host1 using tc qdisc add dev eth0 root netem delay 20ms, so that whenever the files are accessed from the application on Host2, the access to the data on Host1 has a latency of 20ms.
I have a couple of doubts:
Is there a way I could run the application on the same machine where the latency is set? I do NOT want the latency to affect the application I will be running (sometimes the application could be accessed from another server, so if I launch the application on the machine that has the latency, the application would also be affected). So, is there a way I could introduce latency only for file access?
Am I using the tc command correctly for testing my scenario? I just need confirmation that my usage of tc is correct.
Just to be clear, netem is intended for testing network traffic shaping, not hard disk traffic...
You could limit your netem rules to a specific test port on your localhost by building a tree. It's fairly abstruse, but possible.
The general scenario looks correct to me. The host that serves the resource should have tc running on it. The second host should be the monitor/measurer.