How to test a web service with up to 1 million users? - performance-testing

I'd like to know how to test a web service with up to 1 million active users, all accessing the site at the same time.
This is in theory - I don't have a web service like this, but I was recently reading this article on how to build a scalable app for > 500K users, and it got me wondering how people would test this.
For the sake of discussion, let's assume that I'm in full control of the service and have 1 million test accounts already created, with the usernames test1 -> test1000000 available. I'd prefer that the accounts were accessing my service from places all over the world, but am open to any suggestions!
EDIT: I'm familiar with JMeter and Selenium, but I was concerned that if all the client activity were run from a single location, it would be bottlenecked by the local network and thus not be a great test. So instead of having, say, 10 JMeter clients at different locations running 100K users each, I was thinking it might be better to have 1000 JMeter clients testing 1000 users each, all from different locations... but maybe this isn't much of a concern?

I think at a high level, there could be test nodes distributed around the world. Each would contain the logic to authenticate and execute a certain type of transaction. Blocks of test accounts could be distributed to each node and each node would launch the tests in parallel.
At a practical level, I would start by looking at an existing framework; locust.io claims to do exactly this in its tagline :)
http://locust.io/
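A minimal sketch of that block-assignment idea (the node hostnames and block size are made up for illustration; the actual requests would be driven by whatever framework each node runs, e.g. Locust or JMeter):

```javascript
// Hypothetical sketch: split the 1,000,000 test accounts into equal blocks,
// one block per distributed test node.
const TOTAL_ACCOUNTS = 1_000_000;
const nodes = [
  'loadgen-us.example.com',
  'loadgen-eu.example.com',
  'loadgen-apac.example.com',
  'loadgen-sa.example.com',
];

const blockSize = Math.ceil(TOTAL_ACCOUNTS / nodes.length);

const assignments = nodes.map((host, i) => ({
  host,
  // Each node authenticates and drives its own range of accounts in parallel.
  firstUser: `test${i * blockSize + 1}`,
  lastUser: `test${Math.min((i + 1) * blockSize, TOTAL_ACCOUNTS)}`,
}));

console.table(assignments);
```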

You can use Apache JMeter or, my personal preference, Siege.
In the case of Siege, I would generate a urls.txt file with a million URLs, each representing a call from a user, and run them concurrently.
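As a minimal sketch (the endpoint and query parameter are placeholders, and the test1..test1000000 accounts come from the question), the urls.txt file could be generated with a short Node script:

```javascript
// Generate a urls.txt file for Siege: one URL per test account.
// The endpoint and the "user" parameter are made up for illustration.
const fs = require('fs');

const out = fs.createWriteStream('urls.txt');
for (let i = 1; i <= 1_000_000; i++) {
  out.write(`https://example.com/api/orders?user=test${i}\n`);
}
out.end();
```

The file could then be fed to Siege with something like `siege -c 255 -f urls.txt`, where -c sets the number of concurrent simulated users.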
As for your concern about locations,
BlazeMeter has geo-distributed stress testing too.

You can take a look at the Tsung tool: http://tsung.erlang-projects.org/
It is really lightweight and allows you to run hundreds of thousands of virtual users from an average machine (depending on your script complexity).

While you can't do multi-step automation at these sites, the following services will let you hit a URL from different client locations (e.g. Asia, North America, Australia) and throttle the bandwidth if you like for testing purposes:
WebPageTest - https://www.webpagetest.org/
- from their about page: "WebPagetest is an open source project that is primarily being developed and supported by Google as part of our efforts to make the web faster." This site also has an API, is open source, and allows you to automate it via a Node API and CLI (see the sketch after this list).
Pingdom
https://tools.pingdom.com/
More info in this blog post from KeyCDN
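As a rough sketch of that Node API (this assumes the webpagetest npm package and an API key; the target URL and location string are placeholders - check the list of available test locations):

```javascript
// Run a test from a chosen client location via the WebPageTest Node API.
// Requires: npm install webpagetest, plus an API key from webpagetest.org.
const WebPageTest = require('webpagetest');

const wpt = new WebPageTest('https://www.webpagetest.org', 'YOUR_API_KEY');

wpt.runTest('https://example.com', {
  location: 'Sydney:Chrome',  // placeholder; pick from the available locations
  connectivity: '3G',         // throttle bandwidth for the run
  runs: 3,
  pollResults: 5,             // poll every 5 seconds until results are ready
}, (err, result) => {
  if (err) throw err;
  console.log(result.data.median.firstView.loadTime, 'ms median load time');
});
```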

Related

NodeJS / ERP - Performance / Scalability

In the company I work for,
we plan to renew and re-code our 12-year-old online sales web application.
Our traffic is fairly high: over 100,000 sales orders a day,
which means at least 1 million interactions per day on the web application.
I want to use NodeJS as the web server, integrated with our ERP system running on an Oracle Exadata database.
My question is:
Performance is very critical for us, and I'm not sure NodeJS is scalable enough for this transaction count.
I've read some blogs on the internet stating that some very big companies already use NodeJS,
but I'm not sure whether they use it as their main backbone system or only for smaller corporate applications.
Can you share your experiences, if possible with examples including transaction counts?
Thanks in advance!
Why are you looking at Node.js? What other options are you considering? Why choose one over the other? What expertise does your team have?
Node.js is quite scalable, provided you know what you're doing. How much of your load is mid-tier vs database? If there's a lot going on in the mid-tier, then you need to be able to scale it out horizontally. Here are a few high-level things to consider:
Many people use Docker to containerize their apps and scale them out with Kubernetes (though those aren't Node.js specific).
You'll likely want to learn about PM2 to keep your Node.js processes running.
Use node-oracledb connection pools.
Use bind variables for security and performance (both are sketched below).
Look into using DRCP if you are using Kubernetes and each container has its own connection pool.
Consider looking through this guide to creating a REST API with Node.js and Oracle Database to get an idea of how things work:
https://jsao.io/2018/03/creating-a-rest-api-with-node-js-and-oracle-database/
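As a minimal sketch of the connection-pool and bind-variable points above (the connect string, table, and column names are made up; see the node-oracledb documentation for real configuration):

```javascript
// Shared node-oracledb connection pool plus a query using bind variables.
const oracledb = require('oracledb');

async function init() {
  await oracledb.createPool({
    user: 'app_user',
    password: process.env.DB_PASSWORD,
    connectString: 'exadata-host/orclpdb1',  // placeholder connect string
    poolMin: 10,
    poolMax: 10,
    poolIncrement: 0,
  });
}

async function getOrder(orderId) {
  let conn;
  try {
    conn = await oracledb.getPool().getConnection();
    // Bind variables (:id) let Oracle reuse the parsed statement and
    // protect against SQL injection.
    const result = await conn.execute(
      'SELECT order_id, status, total FROM sales_orders WHERE order_id = :id',
      { id: orderId },
      { outFormat: oracledb.OUT_FORMAT_OBJECT }
    );
    return result.rows[0];
  } finally {
    if (conn) await conn.close();  // returns the connection to the pool
  }
}
```

With PM2 in cluster mode, each worker process holds its own pool like this, which is the situation where the DRCP suggestion above becomes relevant.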

How to gather user metrics for an Electron desktop app?

I would like to gather some metrics about usage for an Electron-based cross-platform desktop app. This would consist of basic information on the user's environment (OS, screen size, etc) as well as the ability to track usage, for example track how many times the app is opened or specific actions within the app.
These metrics should be sent to an analytics server, so they can be viewed in aggregate. Ideally I could host the server-side component myself, but would certainly consider a solution hosted by a third party.
There are various analytics solutions for the web (Google Analytics, Piwik), and for mobile apps, as well as solutions for Node.js server-side apps. Is it feasible to adapt one of these solutions for desktop Electron-based apps? How? Or are there any good analytics solutions specifically designed for use with desktop apps which work with Electron / javascript?
Unlike a typical webpage, the user might be using the app offline, so offline actions should be recorded, queued, and sent later when the user comes online. A desktop app is typically loading pages from the file system, not HTTP, so the solution needs to be able to cope with that.
Unlike a Node.js server-side application, there could be a large number of clients rather than just a single (or a few) server instances. Analytics for a desktop app would be user-centric, whereas a server-side Node.js app might not be.
Ease of setup is also a big factor - an ideal solution would just have a few lines of configuration to gather basic metrics, then could be extended as necessary with custom actions/events.
The easiest thing will be to use Google Analytics or a similar offering.
With most of them you'll have two major issues to solve compared to hosting on a website:
Electron does not store cookies or state between runs, so you have to store this manually.
Most analytics libraries ignore file: URLs, so that they only count hits from the internet.
Use an existing library and most of these issues will already be solved for you.
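If you do end up rolling part of this yourself, a rough sketch of the offline queue-and-flush idea looks like this (the endpoint, payload shape, and storage key are placeholders, not any particular provider's API):

```javascript
// Electron renderer process: queue events locally, flush when online.
const queue = JSON.parse(localStorage.getItem('pendingEvents') || '[]');

function track(name, props = {}) {
  queue.push({ name, props, ts: Date.now() });
  localStorage.setItem('pendingEvents', JSON.stringify(queue));
  flush();
}

async function flush() {
  if (!navigator.onLine || queue.length === 0) return;
  const batch = queue.splice(0, queue.length);
  try {
    await fetch('https://analytics.example.com/collect', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(batch),
    });
  } catch (err) {
    queue.unshift(...batch);  // send failed: put the events back in the queue
  }
  localStorage.setItem('pendingEvents', JSON.stringify(queue));
}

// Retry whenever connectivity returns, and record an event at startup.
window.addEventListener('online', flush);
track('app_opened', { platform: navigator.platform });
```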

SaaS architecture and design suggestion - any existing products that simplify the design?

I have the following setup
Customer access -> Web application -> Database
A server application (console based) for each customer, running continuously on the server, that downloads data from various locations and updates the database.
So if I have 100 customers, I will need to run 100 console applications on the server.
If there is any problem/crash with one server application (because of the specific kind of data it downloads), I can fix it by restarting or patching it.
I took this approach because I initially thought it would be easy to maintain, but I don't feel that way anymore. I am sure there are better tools available to manage this kind of scenario. If you know any, please let me know. I should be able to start/restart/patch the server applications, monitor server usage, and check for crashes through some nice GUI.
Or maybe there is a way to write one multi-threaded application to serve all customers instead of one for each, with a way to shut down/restart any customer's thread.
Thanks
The right way is to use a threaded application that sets the tenant context for the work each thread performs.
This way, you have one app for all customers and can make use of application events and mailers to notify you in case of any error.
An audit table that tracks the status of the various data-processing steps can back a GUI for monitoring progress on a per-tenant basis.
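A minimal sketch of that one-app-for-all-tenants idea using Node worker threads (the tenant IDs and the tenant-job.js worker file are hypothetical; the same supervise-and-restart pattern applies in any language with threads or child processes):

```javascript
// main.js: one process supervises a worker per tenant and restarts crashed ones.
const { Worker } = require('worker_threads');

const tenants = ['customer-001', 'customer-002', 'customer-003'];  // placeholder IDs
const workers = new Map();

function startTenant(tenantId) {
  // tenant-job.js would download that tenant's data, update the database,
  // and write its progress to the audit table used by the monitoring GUI.
  const worker = new Worker('./tenant-job.js', { workerData: { tenantId } });
  workers.set(tenantId, worker);

  worker.on('exit', (code) => {
    workers.delete(tenantId);
    if (code !== 0) {
      console.error(`Tenant ${tenantId} crashed (code ${code}), restarting...`);
      startTenant(tenantId);  // only this tenant is affected, not the others
    }
  });
}

tenants.forEach(startTenant);

// Stopping or restarting a single tenant from the GUI:
// await workers.get('customer-002').terminate();
```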
HTH

Testing a Windows Azure web app for maximum user load

I am conducting some research on emerging web technologies and have created a very simple Azure website which makes use of WebSockets and MongoDB as the database. I have managed to get all the components working together and now must perform load testing on the application.
The main criterion is the maximum user load that the app can support. At the moment there is one web role instance, so I would probably need to test the max user load for that instance, then try with two instances, and so on.
I found some solutions online such as Loadstorm, however I cannot afford to pay to use these services so I need to be able to do this from my own development machine OR from another cloud service.
I have come across Visual Studio Load Tests and they seem quite useful; however, it seems they require VS Ultimate and an active MSDN subscription (the prerequisites are listed here). Also, from this video showing the basics of load tests, it seems these load tests are created completely separately from the actual web project. Does that mean I can only see metrics related to the user, i.e. I cannot see the amount of RAM or processor being used, etc.?
Any suggestions?
You might create a Linux virtual machine in Azure itself or another hosting provider and use ApacheBench (ab) or JMeter to do simple load testing on your application. Be aware that in such a setup your benchmark servers may be a bottleneck themselves.
Another approach is to use online load-testing services which allow some free usage, such as:
loader.io, by SendGrid Labs
LoadStorm
Blazemeter
Blitz
Neotys
Loadimpact
For load testing, LoadStorm is very reasonably priced, especially compared to on-premises software (and has a free tier with up to 25 virtual clients). You can install software such as JMeter, but you'll still need machines (or VMs) to host and run it from, and you need to make sure that the load-generator machines aren't the bottleneck in your tests.
When you run your tests, you may want to consider separating your web tier from MongoDB. MongoDB will consume as much memory as possible (as that's what gives MongoDB its speed). In a real-world scenario, you'll likely have MongoDB in its own environment. So for your tests, I'd consider offloading MongoDB to its own instance(s), and 10gen has a Worker Role setup that's fairly straightforward to install.
Also remember that NIC bandwidth is 100Mbps per core, which could be a limiting factor on your tests, depending on how much load you're driving.
One alternative to self-hosting MongoDB: Offload MongoDB to a hoster such as MongoLab. This will allow you to test the capacity of your web app without worrying about the details around MongoDB setup, configuration, optimization, etc. Currently MongoLab offers their free tier hosted in Azure, US West and US East data centers.
Edit: I didn't read the question carefully the first time.
Check out this thread for various tools and links:
Open source Tool for Stress, Load and Performance testing
If you are interested in the performance counters of the application under test, you can look at some of the latest features added to the Visual Studio cloud-based load testing service.
http://blogs.msdn.com/b/visualstudioalm/archive/2014/04/07/get-application-performance-data-during-load-runs-with-visual-studio-online.aspx
To get more info on Visual Studio Cloud Load Testing solution - https://www.visualstudio.com/features/vso-cloud-load-testing-vs

Building a website backend in C#, compiled to a binary

I am creating a novel website that integrates web feeds from around the internet. I want to build a backend that does CPU-intensive analysis of the web data on a regular basis and continuously adds the results to a database.
This database will be accessible to the website through a normal ASP.NET backend that will serve the page up to the client.
Is it advisable, and best practice, to build the complex CPU operations as C# binaries that run continuously on the server?
Sounds like you want a .NET executable that either runs on a schedule (cronjob-style) or that schedules itself. In any case it's wise to have it completely separate to your website process. It sounds like data-generation and data-serving are separate concerns, so they should be kept separate. This also means that you can move it off the web-serving machine if load becomes an issue. If you're updating a live database remember to take transactions into account.
