Can someone check this calculation?
I want to calculate the speed of my internet connection by downloading a file from a remote server.
My time unit is 1/60th of a second. Let's say the file on the remote server is 32K.
timeBegin = ticks        <- 1/60ths of a second since the beginning of some date
get.url( file )
timeEnd = ticks
Mbps = ( size of file * 8) / ( timeEnd - timeBegin ) / 60 / 1048576
Does anyone know of a way to test bandwidth (upload/download) from the command line (unix)?
I don't know the exact command off the top of my head to do what you want.
But you may not get a very accurate reading of your internet BW based on this test.
There are 2 issues I see:
1) You could be limited by latency. Download time is a function of both latency (the amount of time for a packet to do a round trip between source and destination) and BW.
2) The server, and not you, may be the one with limited BW.
You can probably get a more accurate number by checking out sites like this:
speakeasy
Your calculation is not quite correct; you are missing some parentheses.
Mbps = ( size of file * 8) / ( ( timeEnd - timeBegin ) / 60 ) / 1048576
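As a quick sanity check with made-up numbers (a 32 KB file downloaded in 30 ticks, i.e. half a second), in Perl:
# made-up numbers: 32 KB file, 30 ticks (30/60 = 0.5 s) between timeBegin and timeEnd
my $size_bytes = 32 * 1024;
my $ticks      = 30;
my $mbps       = ($size_bytes * 8) / ($ticks / 60) / 1048576;
printf "%.2f Mbps\n", $mbps;   # prints 0.50 Mbps
which is what you would expect: 262144 bits in half a second is about 0.5 Mbit/s.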
I see DasBoot already pointed out some of the potential sources of inaccuracy in this method. I'll just add to #2 that the critical bandwidth limitation may also exist at some hop in between you and the server.
One way I use to check "bandwidth" between servers is to look at the results from scp between remote and local (and vice versa). You could also consider using a large file like 30-40MB...
Another way is to use the wget command, which shows the download speed too (like 1Mb/s).
Hope it helps.
Try to use IPTRAF to monitor it.
Related
I am writing an application which needs to read thousands (let's say 10000) of small (let's say 4KB) files from a remote location (e.g., an S3 bucket).
A single read from the remote location takes around 0.1 seconds, and processing the file takes just a few milliseconds (10 to 30). What is the best solution in terms of performance?
The worst I can do is to download everything serially: 0.1*10000 = 1000 seconds (~16 minutes).
Compressing the files in a single big one would surely help, but it is not an option in my application.
I tried to spawn N threads (where N is the number of logical cores) that download and process 10000/N files each. This gave me 10000/N * 0.1 seconds; with N = 16 it indeed takes around 1 minute to complete.
I also tried to spawn K*N threads (this sometimes helps on high-latency systems, as with GPUs), but I did not get much speed-up for any value of K (I tried 2, 3, 4), maybe due to a lot of thread context switching.
My general question is: how would you design such a similar system? Is 1 minute really the best I can achieve?
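For reference, my N-thread version looks roughly like this (Perl just for illustration; fetch_file() and process_file() are placeholders for the real remote read and processing):
use strict;
use warnings;
use threads;

my $N       = 16;                            # number of logical cores
my @files   = map { "file_$_" } 1 .. 10_000; # placeholder file names
my $per_thr = int(@files / $N);

# placeholders: ~0.1 s remote read, a few ms of processing
sub fetch_file   { select(undef, undef, undef, 0.1); return "data for $_[0]" }
sub process_file { return length $_[0] }

my @pool;
for my $t (0 .. $N - 1) {
    my @chunk = @files[ $t * $per_thr .. ($t == $N - 1 ? $#files : ($t + 1) * $per_thr - 1) ];
    push @pool, threads->create(sub {
        process_file(fetch_file($_)) for @chunk;
    });
}
$_->join() for @pool;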
Thank you,
Giuseppe
The scope of this work is to query two machines' high-resolution timers at the "same time" and measure the clock inaccuracy between the two systems. This is done by having a third machine send an SNMP GET for a custom OID, where the SNMP agent is configured to invoke a Perl script that returns the high-resolution timer. All works fine, in that the snmpget manages to return the expected result. However, it appears that regardless of how frequently I send the snmpget queries, the SNMP agent only performs a fresh query to the script at ~5-second intervals. I am running NET-SNMP version 5.4.3.
After some research I've seen that it is typical of NET-SNMP to cache the results, and that this is done on a per-MIB-tree basis. There is a MIB table (nsCacheTable) with the respective intervals, which can be seen by running snmpwalk on 1.3.6.1.4.1.8072.1.5.3. Apparently the values can be changed to 0 to remove the caching, although some of them are read-only: I've only managed to set a few of them to 0 using snmpset, as most of them return a "Bad object type" error.
I know very basic SNMP so I followed a guide online and mapped the below custom OID to the perl script with this line in the snmpd.conf.
extend .1.3.6.1.4.1.27654.3 return_date /usr/bin/perl [directory]/[perl script name].pl
Then the actual OID containing the output (time in epoch) is:
iso.3.6.1.4.1.27654.3.3.1.1.11.114.101.116.117.114.110.95.100.97.116.101
Does anyone have any ideas how I can disable the caching for this OID?
Thanks in advance.
---EDIT---
According to this blog post, instead of disabling the caching one can use pass_persist scripts, which look more complex to implement at first glance. The Perl script I have been calling is the one below:
#!/usr/bin/perl
# THIS SCRIPT RETURNS THE EPOCH TIME OF DAY IN MICROSECONDS
use Time::HiRes qw(gettimeofday);
($s, $usec) = gettimeofday();
$newtime = sprintf("%d%06d", $s, $usec);  # zero-pad the microseconds so they always take six digits
print($newtime);
Can anyone provide help with converting this script to pass_persist, and with what the snmpd.conf should look like?
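In case it helps, here is a rough, untested pass_persist sketch based on the protocol described in the snmpd.conf(5) man page. The snmpd.conf line would replace the extend line with something like (same bracketed placeholders as above):
pass_persist .1.3.6.1.4.1.27654.3 /usr/bin/perl [directory]/[pass_persist script name].pl
With pass_persist the value would then live directly under the base OID (assumed here to be .1.3.6.1.4.1.27654.3.0) rather than at the long nsExtendOutput OID above, and the script keeps running and answers snmpd on stdin, which according to the blog post avoids the extend-output caching:
#!/usr/bin/perl
# Minimal pass_persist sketch: answers snmpd's PING / get / getnext requests
# on stdin with the epoch time in microseconds for a single scalar OID.
use strict;
use warnings;
use Time::HiRes qw(gettimeofday);

$| = 1;                                    # snmpd needs unbuffered replies

my $base = '.1.3.6.1.4.1.27654.3';         # assumed base OID, as registered above
my $oid  = "$base.0";                      # the one instance this script serves

while (defined(my $cmd = <STDIN>)) {
    chomp $cmd;
    if ($cmd eq 'PING') {
        print "PONG\n";
    }
    elsif ($cmd eq 'get' || $cmd eq 'getnext') {
        chomp(my $req = <STDIN>);
        # simplified matching: a get must hit our OID exactly, a getnext on the
        # base returns it -- enough for snmpget/snmpwalk of a one-OID subtree
        my $hit = $cmd eq 'get' ? $req eq $oid : $req eq $base;
        if ($hit) {
            my ($s, $usec) = gettimeofday();
            printf "%s\nstring\n%d%06d\n", $oid, $s, $usec;
        }
        else {
            print "NONE\n";
        }
    }
    elsif ($cmd eq 'set') {
        <STDIN>; <STDIN>;                  # consume the OID and "type value" lines
        print "not-writable\n";
    }
    else {
        print "NONE\n";
    }
}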
What's the best way to generate 1000K text files (with Perl and Windows 7)? I want to generate those text files in as little time as possible (ideally within 5 minutes). Right now I am using Perl threading with 50 threads, but it is still taking a long time.
What would be the best solution? Do I need to increase the thread count? Is there any other way to write 1000K files in under five minutes? Here is my code:
use threads;

my $start = 0;
my $end   = 10000;
my $start_run = time();
my @thr;
for (my $t = 0; $t < 50; $t++) {
    $thr[$t] = threads->create(\&files_write, $start, $end);
    # start again from 10000 to 20000 loop
    .........
}
for (my $t = 0; $t < 50; $t++) {
    $thr[$t]->join();
}
my $end_run = time();
my $run_time = $end_run - $start_run;
print "Job took $run_time seconds\n";
I don't want the return result of those threads. I tried detach() as well, but it didn't work for me.
Generating 500k files (each only 20 KB in size) took 1564 seconds (~26 min). Can I achieve this within 5 min?
Edited: files_write only takes values from a predefined array structure and writes them into a file, that's it.
Any other solution?
The time needed depends on lots of factors, but heavy threading is probably not the solution:
creating files in the same directory at the same time probably needs locking in the OS, so it's better not to do too much of it in parallel
the layout of how the data gets written to disk depends on the amount of data and on how many writes you do in parallel. A bad layout can hurt performance a lot, especially on an HDD. But even an SSD cannot do lots of parallel writes. This all depends heavily on the disk you use, e.g. whether it is a desktop system optimized for sequential writes or a server system which can do more parallel writes, as required by databases.
... lots of other factors, often depending on the system
I would suggest that you use a thread pool with a fixed number of threads and benchmark what the optimal number of threads is for your specific hardware. E.g. start with a single thread and slowly increase the number. My guess is that the optimal number might be between 0.5 and 4 times the number of processor cores you have, but like I said, it heavily depends on your real hardware.
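As a starting point for that benchmark, a fixed-size pool with Thread::Queue (which ships with Perl) might look roughly like this; write_file() is just a stand-in for your files_write logic, and WORKERS is the number you vary:
use strict;
use warnings;
use threads;
use Thread::Queue;

my $WORKERS = 4;                             # benchmark 1, 2, 4, 8, ... here

# stand-in for files_write: writes one ~20 KB dummy file
sub write_file {
    my ($n) = @_;
    open my $fh, '>', "file_$n.txt" or die "file_$n.txt: $!";
    print {$fh} 'x' x 20_000;
    close $fh;
}

my $queue = Thread::Queue->new();
$queue->enqueue("$_:" . ($_ + 9_999)) for map { $_ * 10_000 } 0 .. 99;  # 100 batches of 10000 files
$queue->enqueue(undef) for 1 .. $WORKERS;    # one stop marker per worker

my @pool = map {
    threads->create(sub {
        while (defined(my $batch = $queue->dequeue())) {
            my ($from, $to) = split /:/, $batch;
            write_file($_) for $from .. $to;
        }
    });
} 1 .. $WORKERS;

$_->join() for @pool;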
The slow performance is probably due to Windows having to lock the filesystem down while creating files.
If it is only for testing - and not critical data - a RAMdisk may be ideal. Try Googling DataRam RAMdisk.
On a Linux system, your standard uptime command returns the following:
6.46, 6.00, 4.51
As most of you will probably know, these correspond to 1, 5 and 15 minute load averages.
I was wondering whether it is possible to either pull from somewhere ( /proc/ ... ? ) or manually parse ( ps aux? ) a more real-time snapshot of the system's load average.
Can this be parsed / pulled from anywhere?
You could watch /proc/uptime, which contains the total uptime and the time spent idle. By monitoring this repeatedly, you can effectively acquire samples for any EWMA window of your choosing.
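For example, a quick sketch of that sampling in Perl (assuming the second field of /proc/uptime is idle time summed over all CPUs, which is why it is divided by the CPU count):
use strict;
use warnings;

# read (uptime, idle) in seconds from /proc/uptime
sub read_uptime {
    open my $fh, '<', '/proc/uptime' or die $!;
    my ($up, $idle) = split ' ', <$fh>;
    return ($up, $idle);
}

# count CPUs from /proc/cpuinfo
sub cpu_count {
    open my $fh, '<', '/proc/cpuinfo' or die $!;
    return scalar grep { /^processor\s*:/ } <$fh>;
}

my $ncpu = cpu_count();
my ($up0, $idle0) = read_uptime();
sleep 5;
my ($up1, $idle1) = read_uptime();

my $busy = 1 - ($idle1 - $idle0) / (($up1 - $up0) * $ncpu);
printf "CPU utilisation over the last 5 s: %.1f%%\n", 100 * $busy;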
Or, if you like a challenge, you can obtain old_loadavg_value and new_loadavg_value from /proc/loadavg and solve the linear system
new_loadavg1_value = alpha_1 * old_loadavg1_value + (1-alpha_1) * new_sample
new_loadavg5_value = alpha_5 * old_loadavg5_value + (1-alpha_5) * new_sample
new_loadavg15_value = alpha_15 * old_loadavg15_value + (1-alpha_15) * new_sample
for new_sample, and then do the calculation forwards again with an alpha reflecting your desired window.
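A rough sketch of that back-solve for the 1-minute average (assuming the kernel's usual behaviour of updating the averages every 5 seconds with alpha_1 = exp(-5/60)):
use strict;
use warnings;

# return the 1-minute field of /proc/loadavg
sub loadavg1 {
    open my $fh, '<', '/proc/loadavg' or die $!;
    my ($one) = split ' ', <$fh>;
    return $one;
}

my $alpha = exp(-5 / 60);     # decay factor per 5-second kernel update

my $old = loadavg1();
sleep 5;                      # wait for (roughly) one kernel update
my $new = loadavg1();

# invert new = alpha*old + (1-alpha)*sample
my $sample = ($new - $alpha * $old) / (1 - $alpha);
printf "approximate instantaneous load: %.2f\n", $sample;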
You could try grabbing a sysstat package and using sar.
http://linux.die.net/man/1/sar
On a Linux box, I need to display the average CPU utilisation per hour for the last week. Is that information logged somewhere? Or do I need to write a script that wakes up every 15 minutes to copy /proc/loadavg to a logfile?
EDIT: I'm not allowed to use any tools other than those that come with Linux.
You might want to check out sar (man page), it fits your use case nicely.
System Activity Reporter (SAR) - capture important system performance metrics at periodic intervals.
Example from IBM Developer Works Article:
Add an entry to your root crontab
# Collect measurements at 10-minute intervals
0,10,20,30,40,50 * * * * /usr/lib/sa/sa1
# Create daily reports and purge old files
0 0 * * * /usr/lib/sa/sa2 -A
Then you can simply query this information using a sar command (display all of today's info):
root ~ # sar -A
Or just for a certain days log file:
root ~ # sar -f /var/log/sa/sa16
You can usually find it in the sysstat package for your Linux distro.
As far as I know it's not stored anywhere... It's a trivial thing to write, anyway. Just add something like
cat /proc/loadavg >> /var/log/loads
to your crontab.
Note that there are monitoring tools (like Munin) which can do this kind of thing for you, and generate pretty graphs of it to boot... they might be overkill for your situation though.
I would recommend looking at Multi Router Traffic Grapher (MRTG).
Using snmpd to read the load average, it will automatically calculate averages at any time interval and length, along with nice charts for analysis.
Someone has already posted a CPU usage example.