Measure connection speed between two specific sites - Linux

I would really like to measure the connection speed between two specific sites. Naturally, one of the sites is ours. Somehow I need to prove that it is not our internet connection that is flaky, but that the site at the other end is overloaded.
At our end I have windows and linux machines available for this.
I imagine I would run a script at certain times of day which, for example, tries to download an image from that site and measures the download time. It would then store the download time in a database, and I would create a graph from the records. (I know this is simplistic and not sophisticated enough, hence my question.)
I need help on the time measurement.
The perceived speed differences are big: sometimes the application works flawlessly, but sometimes we get timeout errors.
Right now I use speedtest to check whether our internet connection is OK, but that does not show that the misbehaving site is slow, so I can't provide hard numbers to support my case.
It may be worth mentioning that the application we are trying to use at the other end is Java based.

Here's how I would do it in Linux:
Use wget to download whatever URL you think represents your site best. Parse the output into a file (sed, awk) and use crontab to trigger the download multiple times a day.
wget www.google.com
...
2014-02-24 22:03:09 (1.26 MB/s) - 'index.html' saved [11251]
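A minimal sketch of such a measurement script, assuming the URL, log path and cron schedule are placeholders you would adapt: instead of parsing wget's human-readable output with sed/awk, it simply times the download and appends one CSV record per run, which is easy to load into a database or plot later.

#!/bin/sh
# measure-site.sh - hypothetical monitoring script; URL and LOG are placeholders
URL="http://othersite.example.com/some-image.jpg"
LOG="$HOME/site-speed.csv"

START=$(date +%s.%N)
wget -q -O /dev/null "$URL"
STATUS=$?
END=$(date +%s.%N)

# record timestamp, wget exit status and elapsed seconds
echo "$(date -Iseconds),$STATUS,$(echo "$END - $START" | bc)" >> "$LOG"

A matching crontab entry to run it every 15 minutes could look like:

*/15 * * * * /usr/local/bin/measure-site.sh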

Related

Are downloads from spark distribution archive often slow?

I was trying to download the Spark Hadoop distribution from the website - https://archive.apache.org/dist/spark/spark-3.1.2/. I often find that downloads from this site are slow. Is it due to some general issue with the site itself?
I have verified that the download is slow in two ways:
In Colab I ran the command !wget -q https://archive.apache.org/dist/spark/spark-3.1.2/spark-3.1.2-bin-hadoop3.2.tgz, which often keeps running for more than, say, 10 minutes, while at other times it finishes within 1 minute.
I also tried downloading it from the website directly, and even then the download speed is occasionally extremely slow.
It may be because:
You download multiple times
You download with a non-browser client, for example curl/wget
Your location is physically far from the file server, or the network is unstable
or something else, for example the file server itself is slow
I think most public servers have some kind of "safeguard" to prevent DDoS, so this safeguard throttles download traffic per second.
I faced a similar issue: when I downloaded from a browser it took 1 minute, but it took 10 minutes when I used curl.
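If the slowdown really is keyed on the client, one quick experiment (just a sketch; the User-Agent string below is an arbitrary browser-like example and may make no difference at all) is to repeat the download with curl sending a browser-style User-Agent:

curl -L -A "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36" \
     -o spark-3.1.2-bin-hadoop3.2.tgz \
     https://archive.apache.org/dist/spark/spark-3.1.2/spark-3.1.2-bin-hadoop3.2.tgz

If this is consistently faster than a plain curl/wget call, that supports the rate-limiting theory; if not, the bottleneck is more likely the server or the network path.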

What do unrecognized GET requests mean?

It's a new Amazon EC2 instance, it's live maybe a couple of hours a day, and I've just installed Node.js and am still playing with it. Then I get this in my PuTTY SSH window:
I don't know what those last two requests mean. I don't have a robots.txt and I definitely don't have any HTML files whatsoever (it's all in Jade).
Do I have to be concerned?
It's probably a bot. Generally you don't need to worry about it unless it generates so many requests that it causes performance issues with your application.

using torrents to back up VHDs

Hi, this question may be redundant, but I have a hunch there is a tool for this - or there should be, and if there isn't I might just make it - or maybe I am barking up the wrong tree, in which case correct my thinking.
My problem is this: I am looking for a way to migrate large virtual disk images off a server once a week over an internet connection of only moderate speed, with a solution that can be bandwidth-throttled because the connection is always in use.
I thought about it and the problem is familiar: large files that can be moved, that can be throttled, and that can easily survive disconnection and reconnection. The only solution I am familiar with that just does all of this perfectly is torrents.
Is there a way to automatically create torrents and automatically "send" them to a remote client's download list? I am working on a Windows Hyper-V host, but I use only Linux for the guests, and I could easily cook up a guest to do the copying, so consider it either a Windows or a Linux problem.
PS: the VHDs are "offline" copies of guest servers by the time I am moving them - consider them merely 20-30 GB dumb files.
PPS: I'd rather avoid spending money
Bittorrent is an excellent choice, as it handles both incremental updates and automatic resume after connection loss very well.
To create a .torrent file automatically, use the btmakemetainfo script found in the original bittorrent package, or one from the numerous rewrites (bittornado, ...) -- all that matters is that it's scriptable. You should take care to set the "disable DHT" flag in the .torrent file.
You will need to find a tracker that allows you to track files with arbitrary hashes (because you do not know these in advance); you can either use an existing open tracker, or set up your own, but you should take care to limit the client IP ranges appropriately.
This reduces the problem to transferring the .torrent files -- I usually use rsync via ssh from a cronjob for that.
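A rough sketch of how the whole loop could be scripted. Note this uses transmission-create rather than btmakemetainfo (any scriptable torrent creator will do), and the tracker URL, paths and remote watch directory are placeholders:

# create a private torrent (the private flag keeps DHT/PEX out of the picture)
transmission-create -p -t http://tracker.example.internal/announce \
    -o /backup/torrents/weekly-vhd.torrent /backup/exports/weekly.vhd

# push the small .torrent file to the remote client's watch directory over ssh;
# clients such as Transmission or rTorrent can be configured to pick it up automatically
rsync -e ssh /backup/torrents/weekly-vhd.torrent user@offsite:/var/lib/transmission/watch/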
For point to point transfers, torrent is an expensive use of bandwidth. For 1:n transfers it is great as the distribution of load allows the client's upload bandwidth to be shared by other clients, so the bandwidth cost is amortised and everyone gains...
It sounds like you have only one client in which case I would look at a different solution...
wget allows throttling and can resume transfers where it left off, provided the FTP/HTTP server supports resuming... That is what I would use.
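A minimal example of such an invocation (the rate and URL are placeholders): -c resumes a partial download and --limit-rate caps the bandwidth used.

wget -c --limit-rate=500k http://server.example.internal/exports/weekly.vhd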
You can use rsync for that (http://linux.die.net/man/1/rsync). Search for the --partial option in the man page and that should do the trick: when a transfer is interrupted, the unfinished result (file or directory) is kept. I was not 100% sure whether it works with ssh transport when you send from a local to a remote location (never checked that), but it should work with an rsync daemon on the remote side.
You can also use it to sync two local storage locations.
rsync --partial [-r for directories] source destination
edit: just confirmed that it works with ssh transport as well.
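Since the question also asks for throttling, a sketch of combining the two (host and paths are placeholders): --partial keeps interrupted files so the next run can resume, and --bwlimit caps the transfer rate in KB/s.

rsync --partial -r --bwlimit=500 -e ssh /backup/vhds/ user@offsite:/backup/vhds/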

If I download a hacked Joomla website on my laptop to fix it

If I download a hacked website onto my laptop to fix it and I run PHP code that someone else could have modified, am I risking damage to my local computer?
Let's say I need to assign some privileges to run a MySQL database; this could be potentially dangerous, right?
It is a hacked Joomla website.
You cannot be sure what can happen. For maximum protection, I recommend putting everything in a virtual machine and then disabling its internet access.
Yes, there is a risk: the PHP code will have the same permissions as the user running the code on the computer. If you give the PHP code access to a database, it will be able to do anything the MySQL user can do.
If you're going for 100% safety, run all of it in a virtual machine to avoid accidents with your actual laptop.
Update: of course, a good first step would be to diff the PHP code with the Joomla! official PHP code of the appropriate version, to identify differences between the two.
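A sketch of that comparison, assuming a pristine copy of the same Joomla! version has been unpacked into joomla-clean/ and the downloaded site lives in site/:

# list files that differ or exist only on one side
diff -rq joomla-clean/ site/ > changed-files.txt

# then inspect individual files in detail
diff -u joomla-clean/index.php site/index.php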
That depends.
If the hacker put in some malicious script (JS/HTML) that exploits vulnerabilities in your browser, or something similar, then you may damage your machine.
Usually the modified PHP files provide backdoors (also known as shells), proxies, or something similar. They are used for remote access and are not usually intended to break the machine. However, that's not always true.
If your site was running in a Unix environment and your laptop runs Windows, the risk is smaller.
I would recommend at least using a firewall. For full protection, you should do everything inside a virtual machine.
Use any comparison tool to find the modified places.
As for the database, use only a local copy. When you've corrected everything, replace the version on the server with it.
When code has been modified by someone else, running/executing it is always dangerous. Therefore, you must take care that it can't be executed:
Don't download with a web browser. Use a tool that just makes a binary copy, like rsync or wget, or log into your server, create a ZIP archive of the modified scripts and then download that.
Always make a backup copy of everything before you look at it. That includes the database, all scripts, HTML pages, templates, everything.
Run the code on an isolated computer (no network connection). If you don't have a spare laptop, run it in a virtual machine with networking turned off. This isn't as secure as the first option because virtual machines have bugs, too, but it's better than nothing.
Never execute the code unless you know it's safe. First, compare it against a known good copy. If there isn't one, read the code and try to figure out what it does. If that's beyond your limits, mark it down as experience, scrap the whole thing and start from scratch.
You don't want users of your site to sue you when they get hacked because you failed to remove all the malicious code, do you?
The bad code might not be in the scripts; if your site is vulnerable to script injection, then it can be in the database and only be visible when the pages are rendered. If this is the case, fix all places where database values are pasted into the HTML verbatim before you try to view them in a web browser.
Joomla hacks are usually pretty straightforward (but time-consuming) to clean up (old Joomla versions can be pretty vulnerable to attack). Follow some of the tips here to keep yourself safe, and remember to:
Replace all the Joomla system files with the latest version from Joomla!
If you have a fairly recent backup it would be much easier to just remove the hacked site and restore the backup, and then update it to the latest version of Joomla to help secure it.

Call Visitors web stat program from PHP

I've been looking into different web statistics programs for my site, and one promising one is Visitors. Unfortunately, it's a C program and I don't know how to call it from the web server. I've tried using PHP's shell_exec, but my web host (NFSN) has PHP's safe mode on and it's giving me an error message.
Is there a way to execute the program within safe mode? If not, can it work with CGI? If so, how? (I've never used CGI before)
Visitors looks like a log analyzer and report generator. It's probably best set up as a cron job to create static HTML pages once a day or so.
If you don't have shell access to your hosting account, or some sort of control panel that lets you set up cron jobs, you'll be out of luck.
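If you do have cron access, a sketch of such a job (the paths are placeholders; the -A flag matches the invocation used further down this page):

# crontab entry: rebuild the Visitors report every night at 01:30
30 1 * * * visitors -A /home/logs/access_log > /home/public/stats.html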
Is there any reason not to just use Google Analytics? It's free, and you don't have to write it yourself. I use it, and it gives you a lot of information.
Sorry, I know it's not a "programming" answer ;)
I second Jonathan's answer: this is a log analyzer, meaning that you must feed it the web server's log file as input and it generates a summary of it. Given that you are on a shared host, it is unlikely that you can access that file, and even if you could, it probably contains entries for all the websites hosted on the machine (setting up separate logging for each VirtualHost is certainly possible with Apache, but I don't know whether it is common practice).
One possible workaround would be to write out a log file from your own pages. However, this is rather difficult and can have a severe performance impact (for one, you have to serialize the writes to the log file if you don't want to get garbage from time to time). All in all, I would suggest going with an online analytics service, like Google Analytics.
As fortune would have it I do have access to the log file for my site. I've been able to generate the HTML page on the server manually - I've just been looking for a way to get it to happen automatically. All I need is to execute a shell command and get the output to display as the page.
Sounds like a good job for an intern.
=)
Call your host and see if you can work out a deal for doing a shell execute.
I managed to solve this problem on my own. I put the following lines in a file named visitors.cgi:
#!/bin/sh
printf "Content-type: text/html\n\n"
exec visitors -A /home/logs/access_log
