Simulate user load on a hardware router - Linux

I am trying to simulate user load on a hardware router. I am specifically trying to emulate the average load of a home router.
What I need to do is load it up over a week-long period at different times and perform the following:
Data Transfer
Torrent Downloads
HTTP/HTTPS page requests to different pages (static content, dynamic content, etc.)
I would need to repeat this at specific intervals and be able to test multiple routers at once.
Does anyone know of any software or scripts that will achieve this?
Cheers

Sure. You might be surprised to learn that the load on an average home router is probably pretty low most of the time. Do the math: even downloading at maximum DSL or cable router speed (even if it were small packet sizes, which in higher loads is not usually the case) is just not a significant load on a modern CPU these days.
Scripting loads is easy. I have a script that I bang against Comcast sometimes when I doubt their last mile link to my home. It simply uses wget (or try curl) to download a file of reasonable size repeatedly and records the download statistics (time and/or data rate) of the transfers. Just find a .pdf or other file of the size you need from around the net somewhere, or use a busy website with lots of content. Just avoid the little guys who might have to pay for that bandwidth you are consuming in your test. Better yet, Amazon S3 storage (and transfer bandwidth) is very cheap these days and easy to use. You could put some files of your own choosing up there, and download those repeatedly for your test environment instead of stealing bandwidth from someone else! ;)
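A minimal sketch of that kind of script in Python rather than shell, in case it helps. It is not the script described above: the function names, the chunk size, and the idea of passing in a test URL of your own are all assumptions made for illustration. It downloads the same URL repeatedly and records size, elapsed time, and data rate for each transfer.

```python
import time
import urllib.request


def fetch_bytes(url, timeout=30):
    """Download url, discard the content, and return the byte count."""
    total = 0
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        while True:
            chunk = resp.read(64 * 1024)  # read in 64 KiB chunks
            if not chunk:
                break
            total += len(chunk)
    return total


def run_load(url, repeats, pause_s=0.0, fetch=fetch_bytes,
             clock=time.perf_counter):
    """Fetch url `repeats` times.

    Returns a list of (bytes, seconds, bits_per_second) tuples, one per
    transfer. `fetch` and `clock` are injectable so the loop can be tried
    without touching the network.
    """
    results = []
    for _ in range(repeats):
        start = clock()
        size = fetch(url)
        elapsed = clock() - start
        rate = (size * 8 / elapsed) if elapsed > 0 else 0.0
        results.append((size, elapsed, rate))
        if pause_s:
            time.sleep(pause_s)  # idle gap between transfers
    return results
```

Calling something like `run_load(my_test_url, 20, pause_s=5)` from cron at the intervals you need, once per router under test, would cover the plain data-transfer part; the torrent part is not covered by this sketch.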
Never played with any torrent clients, so I can't help you there, but I bet there are some you can script.
Also, you might check out netperf. I don't know the status of that project, but I've used it in the past to generate very high network loads. Google for it.
Have fun and good luck!
-Chris

Related

Prevent bottleneck on bandwidth for mobile internet

I am sure that this question has already been answered, but unfortunately I do not know the keywords. Therefore my search remained unsuccessful until now.
Scenario: I want to transmit a live stream over mobile Internet using a Raspberry Pi and, depending on the bandwidth, downscale the streams and upscale them again when bandwidth is available.
My two questions for the network specialists among you:
I know I can actively check the bandwidth, but how would you do this without interfering with the existing processes that are transmitting? Should I commit a bandwidth to the processes and then slowly determine the remaining bandwidth using a test tool? Or are there already practical solutions?
Can I determine, in the mobile Internet or at the network interface, when a bottleneck is reached?
Passive methods would be my preference, where I wouldn't have to load the link. E.g. I could know how much bandwidth the stream uses and how much arrives. But how do I make sure there is enough capacity before I go up with the bitrate?
Thanks for your wisdom ;)

How much bandwidth is needed for my website

I have a website that can be used by 50 users at the same time. Those users will be in the same room.
My problem is to know how much bandwidth (in Mb/s) I need to rent for that room so that they can access my website comfortably (upload and download speed).
The average page size of my website is 1 MB.
I searched for answers on the internet and all I found was bandwidth used per month (for servers).
Sorry if my question is "vague", I did my best to make it clear.
Thank you in advance for your answers.
Using https://gtmetrix.com/ you can test your website's speed, page size, and load times.
There are several alternatives; you just have to do the research.
The more important issue to focus on is why your page is 1 MB. Resolving that should be your first priority, and tools like GTmetrix can help.
I recommend load testing your site to figure that out. If you're at all familiar with JMeter, you can use it to create a script that simulates a user navigating your site, then run multiple instances of that user (in your case, 50) to see how the site holds up under load.
You can learn more about JMeter here:
https://jmeter.apache.org/
If you're not familiar with creating JMeter scripts, you can record and auto-generate basic scripts using the Blazemeter Chrome Extension.
For low-load testing (50 users is pretty low), you can upload your JMeter script to Blazemeter, and with a free tier Blazemeter account, you can perform some basic tests to see how your site holds up. If you go that route, I recommend focusing on avg. response time and hits/second in order to determine what your bandwidth need truly is under load.
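Before load testing, a back-of-envelope estimate gives a first number to compare against. The sketch below uses the figures from the question (50 users, 1 MB pages); the 2-second target load time and the share of users loading a page at the same instant are assumptions you would replace with your own, and 1 MB is taken as 8 Mb.

```python
def required_mbps(users, page_mb, load_seconds, concurrency=1.0):
    """Downlink in Mb/s needed so that `users * concurrency` simultaneous
    page loads of `page_mb` megabytes each finish within `load_seconds`.

    concurrency is the assumed fraction of users requesting a page at the
    same moment (1.0 = everybody clicks at once, the worst case).
    """
    megabits = users * concurrency * page_mb * 8  # 1 MB = 8 Mb
    return megabits / load_seconds


# Worst case: all 50 users load a 1 MB page at once, 2 s target.
worst_case = required_mbps(50, 1, 2)

# Milder assumption: only 20% of the users load a page at the same time.
typical = required_mbps(50, 1, 2, concurrency=0.2)
```

With these assumptions the worst case works out to 200 Mb/s and the milder one to about 40 Mb/s. Browser caching of static assets would lower both, which is another reason to measure with a real load test.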

How much does a single request to the server cost

I was wondering how much you win by putting all of your CSS, scripts, and other stuff that needs to be downloaded in one file.
I know that you would win a lot by using sprites, but at some point it might actually hurt to do that.
For example, my website uses a lot of small icons, and most of the pages have different icons. After combining all those icons together I might get over 500 KB in total, but if I make one sprite per page it is reduced to almost 50 KB/page, so that's cool.
But what about JS/CSS scripts? How much would I win by making a script for each page when each has just over ~100 lines? Or maybe I wouldn't win at all?
Basically, I want to know how much a single request costs to download a file, and whether it is really bad to have many script/image files with today's modern browsers and high-speed connections.
EDIT
Thank you all for your answers. It was hard to choose just one because every answer did answer my question. I chose to reward the one that, in my opinion, answered my question about request cost most directly. I will not accept any answer as correct because every one of them was.
Multiple requests means more latency, so that will often make a difference. Exactly how costly that is will depend on the size of the response, the performance of the server, where in the world it's hosted, whether it's been cached, etc... To get real measurements you should experiment with your real world examples.
I often use PageSpeed, and generally follow the documented best practices: https://developers.google.com/speed/docs/insights/about.
To try answering your final question directly: additional requests will cost more. It's not necessarily "really bad" to have many files, but it's generally a good idea to combine content into a single file when you can.
Your question isn't answerable in a real generic way.
There are a few reasons to combine scripts and stylesheets.
Browsers using HTTP/1.1 will open multiple connections, typically 2-4 for every host (current browsers commonly allow around 6). Because almost every site has the actual HTML file and at least one other resource like a stylesheet, script or image, these connections are created right when you load the initial URL like index.html.
TCP connections are costly. That's why browsers open multiple connections right away, ahead of time.
Connections are usually limited to a small number and each connection can only transfer one file at a time.
That said, you could split your files across multiple hosts (e.g. an additional static.example.com), which increases the number of hosts / connections and can speed up the download. On the other hand, this brings additional overhead, because of more connections and additional DNS lookups.
On the other hand, there are valid reasons to leave your files split.
The most important one is HTTP/2. HTTP/2 uses only a single connection and multiplexes all file downloads over that connection. There are multiple demos online that demonstrate this, e.g. http://www.http2demo.io/
If you leave your files split, they can also be cached separately. If you have just small parts changing, the browser could just reload the changed file and all others would be answered using 304 Not Modified. You should have appropriate caching headers in place of course.
That said, if you have the resources, you could serve all your files separately using HTTP/2 for clients that support it. If you have a lot of older clients, you could fall back to combined files for them when they make requests using HTTP/1.1.
Tricky question :)
Of course, the trivial answer is that more requests take more time, but it is not necessarily that simple.
Browsers open multiple HTTP connections to the same host, see http://sgdev-blog.blogspot.hu/2014/01/maximum-concurrent-connection-to-same.html. Because of that, downloading one huge file instead of using parallel downloads is considered a performance bottleneck by http://www.sitepoint.com/seven-mistakes-that-make-websites-slow/
Web servers should use gzip content-encoding whenever possible, so text resources such as HTML, JS, and CSS are compressed considerably.
Most of those assets are static content, so a standard web server should use ETag caching on them. That means the next download will be something like 26 bytes, since the server answers "not changed" instead of sending the 32 KB of JavaScript over again.
Because of the ETag cache, the whole web site should be cacheable (I assume you're programming a game or something like that, not some old-school J2EE servlet page).
I would suggest making 2-4 big files and downloading those, if you really want to go for big files.
So to put it together:
If you have only static content, then it is all the same, because ETag caching will short-circuit any real download from the server: the server returns a 304 Not Modified answer.
If you have some generated dynamic content (such as servlet pages), keep the JS and CSS separate, as they can be ETag cached separately and only the servlet page needs to be downloaded.
Check that your server supports gzip content encoding for compression; this helps a lot :)
If you have multiple pieces of dynamic content (such as multiple dynamically changing images), it makes sense to have them represented as 2-4 separate images to utilize the parallel HTTP connections for download (although I can hardly imagine this use case in real life).
Please ensure that you're not serving static content dynamically. I.e. try to load the image in a web browser, open the network traffic view, reload with F5, and check that you get 304 Not Modified from the server instead of 200 OK and real traffic.
The biggest performance optimization is that you don't pull anything from the server, and it comes out of the box if used properly :)
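To make the 304 mechanism concrete, here is a small sketch of the decision a server makes for a conditional GET. It is an illustration only, not how any particular web server implements it: `make_etag` and `respond` are hypothetical names, and deriving the tag from a SHA-256 hash of the body is just one possible choice.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive an ETag from the body (quoted, as HTTP requires)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match=None):
    """Return (status, headers, payload) for a GET of a static asset.

    if_none_match is the value of the client's If-None-Match header,
    or None on a first visit.
    """
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = []
        for tag in if_none_match.split(","):
            tag = tag.strip()
            if tag.startswith("W/"):  # weak validators compare by the tag itself
                tag = tag[2:]
            candidates.append(tag)
        if "*" in candidates or etag in candidates:
            return 304, headers, b""  # client copy is current: no body sent
    return 200, headers, body  # first visit or stale copy: full download
```

A first request gets 200 plus the full body and an ETag header; a reload that sends that tag back in If-None-Match gets 304 with an empty payload, which is the tiny "not changed" response described above.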
I think #DigitalDan has the best answer.
But the question belies the real one: how do I make my page load faster? Or at least APPEAR to load faster...
I would add something about "above the fold": basically you want to inline as much as is needed for your page to render the main visible content on the first round trip, as that is what the user perceives as fastest, and make sure nothing else on the page blocks that...
Archibald explains it well:
https://www.youtube.com/watch?v=EVEiIlJSx_Y
How much you win with any of these approaches will vary with your specific needs, but I will talk about my case. In my web application we don't combine all files. Instead, we have two types of files: common files that are needed globally by our application, and per-page files that are used only by their own page. Here is why.
From a request timing analysis of my web application, here is what you need to consider:
The DNS lookup happens only once because the result is cached afterwards; the DNS name might even be cached already.
On each request we have:
request start + initial connection + SSL negotiation + time to first byte + content download
The main factor here, which takes the majority of the request time in most cases, is the content download size. So if I have multiple files that are all needed on all pages, I combine them into one file to save the TCP stack time. On the other hand, if I have files that are needed only on specific pages, I keep them separate to save the content download time on the other pages.
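The trade-off can be put into a very rough model: every request pays a fixed overhead (request start, connection, SSL negotiation, time to first byte) plus a download time proportional to its size. The sketch below is a simplification under stated assumptions: requests are counted one after another, caching and parallel connections are ignored, and the overhead and bandwidth figures are invented for illustration.

```python
def request_seconds(size_kb, overhead_s, bandwidth_kbit_s):
    """Fixed per-request overhead plus the content download time."""
    return overhead_s + size_kb * 8 / bandwidth_kbit_s


def page_seconds(file_sizes_kb, overhead_s, bandwidth_kbit_s):
    """Total time if the listed files are fetched one after another."""
    return sum(request_seconds(size, overhead_s, bandwidth_kbit_s)
               for size in file_sizes_kb)


# Assumed numbers: 200 ms overhead per request, 8000 kbit/s link.
OVERHEAD_S = 0.2
LINK_KBIT_S = 8000

# Two 50 KB files that every page needs: separate vs. combined.
separate = page_seconds([50, 50], OVERHEAD_S, LINK_KBIT_S)
combined = page_seconds([100], OVERHEAD_S, LINK_KBIT_S)

# A page that only needs one of several page-specific 50 KB scripts:
# shipping everything in one 300 KB bundle costs extra download time.
own_file_only = page_seconds([50], OVERHEAD_S, LINK_KBIT_S)
whole_bundle = page_seconds([300], OVERHEAD_S, LINK_KBIT_S)
```

Under these assumptions, combining the two common files saves one overhead (0.3 s instead of 0.5 s), while forcing a page to pull the whole bundle roughly doubles its time (0.5 s instead of 0.25 s). That is the same reasoning as the common versus per-page split above.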
Actually, this is a very relevant question (topic) that many web developers face.
I would like to add my answer to those of the other contributors.
Introduction before going to the answer
High-performance web sites depend on different factors. Here are some considerations:
Website size
Content type of website (primary content Text, image, video or mixture)
Traffic on your website (How many people visiting your website average)
Web-host location vs your primary visitor location (within your country, region, or worldwide). It matters a lot if your website is for Europe and your host is in the US.
Web-host server (hardware) technology, I prefer SSD disks.
How web-server (software) is setup and optimized
Is it dynamic or static web site
If dynamic, how your code and database is structured and designed
By defining your need you might be able to find the proper strategy.
Regarding your question in general
As regards your website, I recommend looking at Steve Souders' 14 recommendations in his book High Performance Web Sites.
Steve Souders' 14 pieces of advice:
Make fewer HTTP requests
Use a Content Delivery Network (CDN)
Add an Expires Header
Gzip Components
Put Style-sheets at the Top
Put Scripts at the Bottom
Avoid CSS Expressions
Make JavaScript and CSS External if possible
Reduce DNS Lookups
Minify JavaScript
Avoid Redirects
Remove Duplicate Scripts
Configure ETags
Make Ajax Cacheable
Regarding your question
So if we take JS/CSS into consideration, the following will help a lot:
It is better to keep different code in different files.
Example: you might have page1, page2, page3 and page4.
Page1 and page2 uses js1 and js2
Page3 uses only js3
Page4 uses all js1, js2 and js3
So it is a good idea to have the JavaScript in 3 files. You do not want to include everything you have on pages that do not use it.
CSS Sprites
CSS at top and JS at the end
Minifying JavaScript
Put your JavaScript and CSS in external files
CDN: if you use jQuery, for example, do not host a copy on your website; just use the recommended CDN address.
Conclusion
I am pretty sure there are more details to write. Not all of the advice needs to be implemented, but it is important to be aware of it. As I mentioned before, I suggest reading this tiny book; it gives you more details. And finally, there is no perfect final solution. You need to start somewhere, do your best, and improve it. Nothing is permanent.
Good luck.
The answer to your question is: it really depends.
The ultimate goal of page load optimization is to make your users feel that your page loads fast.
Some suggestions:
Do not merge common library JS/CSS files like jQuery into your own bundles, because they might already have been cached by the browser when the user visited other sites, so they may not need to be downloaded at all;
Merge resources, but at least separate the resources required for the first screen from the others, because the earlier users can see something meaningful, the faster your page feels to them;
If several of your pages share some resources, keep the merged file for shared resources separate from the page-specific resources, so that when the user visits a second page the shared ones may already be cached by the browser and the page loads faster;
Users might be on a phone with a slow or inconsistent 3G/4G network, so even 50 KB of data or 2 more requests can make a noticeable difference to them.
It is really bad to have a lot of 100-line files, and it is also really bad to have just one or two big files for each type (CSS/JS/markup).
Desktops mostly have high-speed connections, while mobile connections also have high latency.
Given all the theory about this topic, I think the best approach is a more practical, less exact one, based on actual connection speeds and device types from a statistical point of view.
For example, I think this is the best way to go today:
1) Put everything needed to show the first page/functionality to the user in one file, which should be under 100 KB. This is an absolute requirement.
2) After that, split or group the files into sizes such that the latency is no longer noticeable relative to the download time.
To make it simple and concrete: if we assume a time to first byte of around ~200 ms, the size of each file should be between ~120 KB and ~200 KB, which is good for most of today's connections, on average.

Is there such a thing as a reverse CDN? (content 'retrieval' network)

Our clients upload a serious amount of data from all over the world and we'd like to do our best to make that as painless as possible. Our clients upload 2GB worth of files over their sometimes very 'retail' broadband packages (with capped upload speeds) that draw out upload times to 24-48 hours. At any given time we have 10 or more concurrent uploads, and in peak periods we can have 100 concurrent uploads. So we decided to consider ways to reduce latency and keep our clients' traffic local... so just as a CDN has download servers in various locations, we'd like upload servers.
Any experience or thoughts?
We're not a huge company but this is a problem worth solving so we'll consider all options.
What about putting some servers physically closer to your clients ?
Same ISP, or at the very least in the same countries. Then you just collect it on schedule. I don't imagine that they're getting top speeds when there's 100 of them uploading to you either, so the sooner you can get them completed the better.
Also, do they need to upload this stuff immediately ?? Can some of them post DVD for whatever isn't time sensitive ? I know it sux dealing with media in the post.... so it's hardly ideal.
A reverse CDN sort of situation would only really happen if you had multiple clients using torrents and seeding their uploads (somehow) to one of your servers.
You haven't really said if this is a problem for you, or your clients. So, some more info is going to get you a better answer here.
2GB per what time period? Hour? Day?
If your operation is huge, I wouldn't be too surprised if Akamai or one of the other usual CDN suspects can provide this service to you for the right price. You might get your bizdev folks (or purchasing) in touch with them.

Logging requests on high traffic websites

I wonder how high traffic websites handle traffic logging, for example a website like myspace.com receives a lot of hits, I can imagine it would take a lot of space to log all those requests, so, do they log every single request or how do they handle this?
If you view source on a MySpace page, you get the answer:
<script type="text/javascript">
var pageTracker = _gat._getTracker("UA-6293770-1");
pageTracker._setDomainName(".myspace.com");
pageTracker._setSampleRate("1"); //sets sampling rate to 1 percent
pageTracker._trackPageview();
</script>
That script means they're using Google Analytics.
They can't just gauge traffic using IIS logs because they may sell ads to third parties, and third parties won't take your word for how much traffic you get. They want independent numbers from a separate company, and that's where Google Analytics comes in.
Just for future reference - whenever you've got a question about how a web site is doing something, try viewing the source. You'd be amazed at what you can find there in plain view.
We had a similar issue with our intranet, which is used by hundreds of people. The disk activity was huge and performance was being hurt.
The short answer is Asynchronous non-blocking logging.
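As an illustration of what asynchronous, non-blocking logging can look like, here is a minimal sketch using Python's standard `logging.handlers.QueueHandler` and `QueueListener`. It is not the setup used on the intranet mentioned above; the helper name `start_async_logger` and the logger name in the usage note are made up for the example.

```python
import logging
import logging.handlers
import queue


def start_async_logger(name, target_handler):
    """Return (logger, listener).

    Calls on the logger only put a record on an in-memory queue and
    return immediately; a background thread owned by the listener hands
    the records to `target_handler`, which does the slow disk or
    network I/O.
    """
    log_queue = queue.Queue(-1)  # -1 means no size limit
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records away from the root handlers
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    listener = logging.handlers.QueueListener(log_queue, target_handler)
    listener.start()
    return logger, listener
```

In a request handler you would call something like `logger.info("GET /index.html 200")` and carry on; `listener.stop()` at shutdown drains whatever is still queued. An unbounded queue trades memory for never blocking, so a size limit may be the better choice under sustained overload.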
Probably like Google Analytics.
Use JavaScript to load a page on a different server, etc.
I don't know how they track it since I don't work there. I am pretty sure that they have enough storage to record every little thing about their users if they wanted.
If I were them, I would use AWStats if I just wanted to know basic stuff about my users.
It is more likely that they have developed their own scripts for tracking their users. Stuff they would log:
-ip_address
-referrer
-time
-browser
-OS
and so on. Then a script to view different data about the users, broken down by day, week, or month. As brulak said, something along the lines of Analytics, but since they have access to the actual database, they can learn much more about their users.
ZXTM traffic shaping and logging, speaking from experience here
I'd be extremely surprised if they didn't log every single request, yes, and operations with particularly high traffic volumes usually roll their own log-management solutions against the raw server logs, in some form or other -- sometimes as simple batch-type processes, sometimes as complete subsystems.
One company I worked for, back in the dot-com heyday, got upwards of twenty million pageviews a day; for that site (actually a set of them, running across a few dozen machines in all, as I recall), our ops team wrote a quite sophisticated, clustered solution in C that parsed, translated (into relational storage), compressed and distributed the logs daily. Log files, especially verbose ones, pile up fast, and the commercial solutions available at the time just couldn't cut it.
If by logging you mean collecting server-related information (request and response times, DB and CPU usage per request, etc.), I think they sample only 10% or 1% of the traffic. That gives the same results (providing developers with auditing information) without filling up the disks or slowing the site down.
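Sampling of that kind is easy to sketch. The helper below is hypothetical (the name `make_sampler` and the idea of passing in a random generator are assumptions, not anything a particular site is known to use): it decides per request whether to record detailed information, so that roughly the chosen fraction of requests gets logged.

```python
import random


def make_sampler(rate, rng=None):
    """Return a function that answers True for roughly `rate` of its calls.

    rate is a fraction between 0 and 1 (0.01 = log about 1% of requests).
    rng can be a seeded random.Random instance for repeatable behaviour.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    rng = rng or random.Random()

    def should_log():
        # random() is uniform on [0, 1), so this is True with probability rate
        return rng.random() < rate

    return should_log
```

A request handler would create the sampler once, e.g. `should_log = make_sampler(0.01)`, and wrap the expensive measurements in `if should_log():`. The averages stay representative as long as traffic is high enough for 1% to be a large sample.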

Resources