How to get precise performance measurements on desktop client-server application - performance-testing

As part of our quality metrics, we need to measure the time the application under test (a Windows client-server desktop application) takes to execute some common tasks, such as opening a window or saving a document.
Each common use case specifies the maximum time allowed to complete it on a specified machine, and we need to validate these times every sprint.
We use HP Unified Functional Testing (UFT) as our testing tool, but the times we get on a sample application vary widely. We have disabled everything on the computer and the CPU is near 0%, yet our measurements still vary by 15%.
Do you have experience gathering this kind of metric in projects? What tool did you use, or how did you get the information? Doing it manually is not an option.

QTP provides a timing mechanism in its MercuryTimers object. It's a collection of named timers that you can start, stop, continue, and reset, and whose elapsed time you can read. Because it's a collection, you can set several simultaneous timers and read them independently of each other.
You don't need to declare or dimension them at all; they come into existence as soon as you call a method on them. They also persist across actions, and because they use literal strings for names, you can parameterize the names in any way you choose.
Here's an example:
MercuryTimers("App Life Time").Start
MercuryTimers("Load Time").Start
systemUtil.Run TestURL
If Not Browser("IE").Exist(60) Then
    FailTestBecause "Browser load timeout"
End If
MercuryTimers("Load Time").Stop
TimeTakenToLoad = MercuryTimers("Load Time").ElapsedTime
RunAction "Do Other Stuff"
RunAction "Exit and Close Browser"
MercuryTimers("App Life Time").Stop
TotalTimeAppExisted = MercuryTimers("App Life Time").ElapsedTime
SomeOutputFunction "Load Time:" & TimeTakenToLoad & " - Exist Time:" & TotalTimeAppExisted & " - in milliseconds"
As for eliminating all unnecessary processes on the test machine, it sounds like you already did that. I always check blackviper.com for Windows services information.

Related

C# Algorithmic Stock Trading

We are working on algorithmic trading software in C#. We monitor the market price and then, based on certain conditions, we want to buy the stock.
User input is taken from the GUI (WPF) and sent to the back end for monitoring.
The back end receives data continuously from the stock exchange and checks whether the user-entered price is met within certain limits and conditions. If all are satisfied, then we buy / sell the stock (in Futures FUT).
Now, I want to design my back-end service.
I need the Task Parallel Library or a custom thread pool, where I want to create my tasks / threads / pool when the application starts (maybe incremental, or fixed at, say, 5000).
All will be in a waiting state.
Once the user creates an algorithm, we will activate one thread from the pool, which monitors the price for each incoming string. If it matches, it buys / sells and then goes into the waiting state again. (I don't want to create and destroy the threads / tasks, as that is time consuming.)
So can you please help me in this regard? Is the above approach good, or is there a better approach?
I am stuck on this idea and not able to think outside the box.
The above approach is definitely not "good"
Given the idea above, the architecture is wrong in many cardinal aspects. If your project aspires to survive in 2017+ markets, try to learn from the mistakes already made in the years 2007-2016.
The percentages demonstrate the NBBO flutter for all U.S. Stocks from 2007-01 ~ 2012-01. ( Lower values means better NBBO stability. Higher values: Instability ) ( courtesy NANEX )
Financial Markets operate on nanosecond scales
Yes, a few inches of glass-fibre signal propagation transport delay decide on PROFIT or LOSS.
If you plan to trade in stock markets, your system will meet the HFT crowd doing the dirty practices of Quote Stuffing and Vacuum-Cleaning right in front of your nose, at such scales that your single-machine multi-threaded execution will just move through thin air, or fall into a gap already created many microseconds before your decision took place on your localhost CPU.
The rise of HFT from 2007-01 ~ 2012-01 ( courtesy NANEX ).
You may read more about the illusion of liquidity here.
See the expansion of Quotes against the level of Trades:
( courtesy NANEX )
Even if one decides to trade in a single instrument, on FX, the times are prohibitively short ( more than 20% of the ToB Bids change in less than 2 ms and do not arrive at your localhost before your trading algorithm may react accordingly ).
If your TAMARA-measurements at your localhost are similar to this, simply forget about trading in any HF/MF/LF-HFT instruments -- you simply do not see the real market ( only the tip of the iceberg ) -- as the +20% of price-events happen in the very first column ( 1 .. 2 ms ), where you do not see a single event at all!
5000 threads is bad; don't ever do that. You'll lose far more performance to context switching than you gain from parallel execution. Traditionally, the number of threads for your application should by default equal the number of cores in your system. There are other possible variants, but they probably aren't the best option for you.
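To make the sizing advice concrete, here is a minimal sketch in Python rather than C#, purely as an illustration: one small pool, sized to the core count, checks 5000 pending orders against a single price tick instead of giving each order its own thread. The order layout and the `limit_reached` rule are made up for the example, and Python threads will not actually run this CPU-bound check in parallel, so treat it as a picture of the structure only.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def limit_reached(order, price):
    """Hypothetical rule: the order triggers once the price is at or below its limit."""
    return price <= order["limit"]


def evaluate_tick(pool, orders, price):
    """Check every pending order against one price tick on a shared, fixed-size pool."""
    hits = pool.map(lambda order: limit_reached(order, price), orders)
    return [order["id"] for order, hit in zip(orders, hits) if hit]


# One worker per core (assumption: the default advice above), not one per order.
worker_count = os.cpu_count() or 1
orders = [{"id": i, "limit": 100 + i} for i in range(5000)]

with ThreadPoolExecutor(max_workers=worker_count) as pool:
    triggered = evaluate_tick(pool, orders, price=102.5)
```

The 5000 user algorithms become 5000 small work items on a handful of threads, so nothing is created or destroyed per order and there is no pile of idle threads to context-switch between.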
So you can use a ThreadPool with a work-item method running an infinite loop. This is very low level, but it gives you control over what is going on in your system. A callback function could update the UI so the user is notified about the trading results.
However, since you say you can use the TPL, I suggest considering these two options for your case:
Use a collection of tasks that run forever, checking for new trading requests. You should still tune the number of simultaneously running tasks, because you probably don't want them to fight each other for CPU time. As LongRunning tasks are created with a dedicated background thread, many of them will degrade your application's performance as well. With this approach you should perhaps introduce a strategy pattern implementation for the algorithm being run inside the task.
Set up a TPL Dataflow process within your application. For this approach you should encapsulate the information about the algorithm inside a DTO object and introduce a pipeline:
BufferBlock for storing all the incoming requests. You could use a BroadcastBlock here if you want to check the sell and buy options in parallel. You can link the blocks with a boolean predicate so that different blocks process different types of requests.
ActionBlock (maybe one block for each algorithm from the user) for running the algorithmic check for the pattern on which you base the decision.
ActionBlock for storing all the buy / sell requests for data that successfully passed the algorithm.
BufferBlock for the UI reaction with Reactive Extensions (Introductory book for Rx, if you aren't familiar with it)
This solution still has to be tuned through the block creation options, and it will tell you more about how exactly your data flows through the trading algorithm, the speed of the decision making, and overall performance. You should examine the defaults for TPL Dataflow blocks carefully; you can find them in the official documentation. Another good place to start is Stephen Cleary's introductory blog posts (Part 1, Part 2, Part 3) and chapter 4, which covers this library, in his book.
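The shape of that pipeline can be sketched without the library. The following is not TPL Dataflow; it is a rough Python stand-in where plain queues play the role of the blocks and each stage runs on its own thread. The `check` rule, the `place` step, and the tick layout are assumptions made up for the illustration.

```python
import queue
import threading

STOP = object()  # sentinel passed down the pipeline to shut each stage down


def stage(inbox, outbox, handler):
    """Start a thread that applies handler to each item and forwards non-None results."""
    def run():
        while True:
            item = inbox.get()
            if item is STOP:
                outbox.put(STOP)
                return
            result = handler(item)
            if result is not None:
                outbox.put(result)

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread


def check(tick):
    # Hypothetical algorithm: signal a buy when the price is at or below 100.
    if tick["price"] <= 100:
        return ("BUY", tick["symbol"], tick["price"])
    return None


def place(signal):
    # A real system would send the order here; this sketch only formats a message.
    return f"{signal[0]} {signal[1]} @ {signal[2]}"


ticks = queue.Queue()          # incoming requests (the buffer stage)
signals = queue.Queue()        # decisions that passed the algorithm
notifications = queue.Queue()  # what the UI would consume

threads = [stage(ticks, signals, check), stage(signals, notifications, place)]

for price in (101, 100, 99.5, 103):
    ticks.put({"symbol": "FUT1", "price": price})
ticks.put(STOP)
for thread in threads:
    thread.join()

messages = []
while True:
    message = notifications.get()
    if message is STOP:
        break
    messages.append(message)
```

Each stage only knows its inbox and outbox, which is the property that lets you tune or replace one step (for example, one check stage per user algorithm) without touching the others.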
With C# 5.0, the natural approach is to use async methods running on top of the default thread pool.
This way, you are creating Tasks quite often, but the most prominent cost of that is in GC. And unless you have very high performance requirements, that cost should be acceptable.
I think you would be better with an event loop, and if you need to scale, you can always shard by stock.
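A minimal sketch of what sharding by stock could look like, written in Python with asyncio as a stand-in for whatever event loop you pick. The shard count, the use of crc32 as the hash, and the tick tuples are all assumptions for the illustration.

```python
import asyncio
import zlib


def shard_for(symbol, shard_count):
    """Stable shard index for a symbol (crc32 gives the same value on every run)."""
    return zlib.crc32(symbol.encode()) % shard_count


async def shard_worker(inbox, seen):
    """Handle the ticks of one shard strictly in arrival order."""
    while True:
        tick = await inbox.get()
        if tick is None:  # None is the shutdown signal in this sketch
            return
        seen.append(tick)  # a real worker would evaluate the user's algorithms here


async def run(ticks, shard_count=2):
    inboxes = [asyncio.Queue() for _ in range(shard_count)]
    seen = [[] for _ in range(shard_count)]
    tasks = [asyncio.create_task(shard_worker(q, s)) for q, s in zip(inboxes, seen)]
    for symbol, price in ticks:
        await inboxes[shard_for(symbol, shard_count)].put((symbol, price))
    for inbox in inboxes:
        await inbox.put(None)
    await asyncio.gather(*tasks)
    return seen


seen = asyncio.run(run([("AAA", 1.0), ("BBB", 2.0), ("AAA", 1.1)]))
```

Because a symbol always maps to the same shard, all ticks for one stock are processed in order by one worker, and scaling out means raising the shard count or moving shards to other processes.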

Jmeter - how to get higher randomize effect?

I need to simulate "real traffic" on a web farm; in other words, I need to generate high peaks but also periods with fewer or even no HTTP requests (hits) at all. The reason is to test some automated mechanisms for adding and removing CPU and memory for the web servers themselves (that is another story). That is why I need "totally random" scenarios with loads but also periods with little or no traffic (so I can add or reduce compute power).
This is the situation I get now: as you can see, I always have some average load that stays around a certain number of hits, even if I change from 10 to 100 threads. The results always have some average value. There are no periods with less or more traffic separated by 10+ minutes or so, only by a few seconds.
Current situation
I would like to get "higher" variation in hits/requests, with some time breaks in between.
Situation that I want: i.stack.imgur.com/I4LhU.png
I tried several timers without success, and I do not want to use "Ultimate Thread Group" and similar components, because I want the test to be totally random and not predefined with time breaks and pause periods (thread delays). I would like a test that is totally randomized by itself, which could, for example, generate from 1 to 100 users per XY time.
This is my current Jmeter setup: i.stack.imgur.com/I4LhU.png
I do not know if I am missing some parameter in the current setup or whether there is a totally different way to do this.
Thanks a lot!
If this is something you really want (I strongly believe that the test needs to be repeatable, not random), I would suggest using the Constant Throughput Timer for this. Despite the word "Constant" in its name, you can use a function or a variable there, for instance __Random(), and you will get different controllable "spikes" each iteration.
Moreover, you can put a __P() function there and amend its value via the Beanshell Server while the test is running.

difference between passing control to different program using return() and calling a program using xctl

Say I have 2 screens. The first is the prompt screen, which asks for, say, some record key, and the next screen displays the information about the record.
Now when I want to transfer control to the second screen (after doing the job of the 1st screen) I can do that with:
exec cics
    return transid(trans-id)
    commarea(ws-commarea)
end-exec.
where trans-id is that of the 2nd screen.
Then what is the need for a calling function such as xctl when we already have return() available in CICS?
Using XCTL or LINK or dynamic CALLs confines your processing to one CICS transaction.
If you so desire, you can design your application to spread different business functions across multiple transactions, passing data with a commarea.
Historically this wasn't done for a number of reasons. Thirty years ago, some CICS Systems Programmers felt transaction IDs were a limited resource and encouraged application designers to keep processing to the minimum number of transactions possible.
Security in CICS is handled at the transaction level, so your user must have authority to execute all transactions that comprise the business function they must perform.
Resources such as temporary storage queues are often named in part using the transaction ID to differentiate and keep them separate.
Prior to CICS TS version 3.1 (I think) the data to be shared between those transactions was limited to the size of a commarea (32K). All supported versions of CICS now have channels and containers, allowing you to pass significantly larger amounts of data.
My experience is that it is simpler to code and easier to maintain pseudo-conversational transactions with screen interactions if the code is all in one transaction. You really want your transactions to be pseudo-conversational or non conversational. I believe this to be the overriding reason you see transactions designed to use XCTL, LINK, or dynamic CALLs.
XCTL also doesn't allow dynamic routing (you always stay in the same CICS region), and is one way only. Pseudo-conversational return as above will let the user update the screen, and then only when they press an Attention Identifier (such as Enter) will the next program run. XCTL will run immediately.

Reporting progress on a million call process

I have a console/desktop application that crawls a lot of data (think a million calls) from various web services. At any given time I have about 10 threads performing these calls and aggregating the data into a MySQL database. All seeds are also stored in a database.
What would be the best way to report its progress? By progress I mean:
How many calls already executed
How many failed
What's the average call duration
How much is left
I thought about logging all of them somehow and tailing the log to get the data. Another idea was to send some kind of output to an always-open TCP endpoint, where some form of UI could read the data and display an aggregation. Both ways look too rough and too complicated.
Any other ideas?
The "best way" depends on your requirements. If you use a logging framework like NLog, you can plug in a variety of logging targets like files, databases, the console or TCP endpoints.
You can also use a viewer like Harvester as a logging target.
When logging multi-threaded applications I sometimes have an additional thread that writes a summary of progress to the logger once every so often (e.g. every 15 seconds).
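A minimal sketch of that summary thread, in Python for illustration. The `Progress` class, its field names, and the `log` callable are assumptions; in a real application `log` would be a call into your logging framework.

```python
import threading


class Progress:
    """Thread-safe counters shared by the worker threads."""

    def __init__(self, total):
        self.total = total
        self.done = 0
        self.failed = 0
        self.seconds = 0.0
        self._lock = threading.Lock()

    def record(self, duration, ok=True):
        """Called by a worker after each call, with the call's duration in seconds."""
        with self._lock:
            self.done += 1
            self.failed += 0 if ok else 1
            self.seconds += duration

    def summary(self):
        with self._lock:
            avg = self.seconds / self.done if self.done else 0.0
            return (f"done={self.done} failed={self.failed} "
                    f"avg={avg:.3f}s left={self.total - self.done}")


def report_every(progress, interval, stop, log=print):
    """Log a summary every `interval` seconds until `stop` is set, then once more."""
    while not stop.wait(interval):
        log(progress.summary())
    log(progress.summary())


progress = Progress(total=4)
stop = threading.Event()
lines = []
reporter = threading.Thread(target=report_every,
                            args=(progress, 0.05, stop, lines.append))
reporter.start()
for ok in (True, True, False, True):
    progress.record(duration=0.5, ok=ok)
stop.set()
reporter.join()
```

The workers only touch cheap counters, and the one reporter thread decides how often a line is written, so the log stays readable even at a million calls.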
Since it is a console application, just use WriteLine and have the application spit the important stuff out to the console.
I did something similar in an application that I created to export PDFs from a SQL Server database back into PDF format.
You can do it many different ways. If you are counting records and their size, you can keep a running tally and show the total every so many records.
I also wrote out to a text file, so that I could keep track of all the PDFs, what case numbers they went to, and things like that. That information is in the answer that I gave to the above linked question.
You could also write the statistics out to a text file every so often.
The logger that Eric J. mentions is probably going to be a little bit easier to implement, and would be a nice tool for your toolbox.
These options are just as valid, depending on your specific needs.

Progress bar and multiple threads, decoupling GUI and logic - which design pattern would be the best?

I'm looking for a design pattern that would fit my application design.
My application processes large amounts of data and produces some graphs.
Data processing (fetching from files, CPU intensive calculations) and graph operations (drawing, updating) are done in separate threads.
The graph can be scrolled - in this case new data portions need to be processed.
Because there can be several series on a graph, multiple threads can be spawned (two threads per series, one for dataset update and one for graph update).
I don't want to create multiple progress bars. Instead, I'd like to have a single progress bar that informs about global progress. At the moment I can think of MVC and Observer/Observable, but it's a little bit blurry :) Maybe somebody could point me in the right direction, thanks.
I once spent the best part of a week trying to make a smooth, non-hiccupy progress bar over a very complex algorithm.
The algorithm had 6 different steps. Each step had timing characteristics that were seriously dependent on A) the underlying data being processed, not just the "amount" of data but also the "type" of data and B) 2 of the steps scaled extremely well with increasing number of cpus, 2 steps ran in 2 threads and 2 steps were effectively single-threaded.
The mix of data effectively had a much larger impact on execution time of each step than number of cores.
The solution that finally cracked it was really quite simple. I made 6 functions that analyzed the data set and tried to predict the actual run-time of each analysis step. The heuristic in each function analyzed both the data sets under analysis and the number of cpus. Based on run-time data from my own 4 core machine, each function basically returned the number of milliseconds it was expected to take, on my machine.
f1(..) + f2(..) + f3(..) + f4(..) + f5(..) + f6(..) = total runtime in milliseconds
Now given this information, you can effectively know what percentage of the total execution time each step is supposed to take. Now if you say step1 is supposed to take 40% of the execution time, you basically need to find out how to emit 40 1% events from that algorithm. Say the for-loop is processing 100,000 items, you could probably do:
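int itemsPerEvent = numItems / percentageOfTotalForThisStep;
if (itemsPerEvent == 0) itemsPerEvent = 1; // avoid a modulo by zero when there are fewer items than percent ticks
for (int i = 0; i < numItems; i++) {
    if (i % itemsPerEvent == 0) emitProgressEvent();
    // ... do the actual processing ...
}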
This algorithm gave us a silky smooth progress bar that performed flawlessly. Your implementation technology can have different forms of scaling and features available in the progress bar, but the basic way of thinking about the problem is the same.
And yes, it did not really matter that the heuristic reference numbers were worked out on my machine - the only real problem is if you want to change the numbers when running on a different machine. But you still know the ratio (which is the only really important thing here), so you can see how your local hardware runs differently from the one I had.
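For illustration, here is a minimal Python sketch of the whole idea: turn the predicted per-step runtimes into percent shares, then have each step emit exactly its share of 1% events. The six millisecond figures are invented, and `step_shares` and `run_step` are hypothetical helper names, not the original implementation.

```python
def step_shares(predicted_ms):
    """Turn predicted per-step runtimes into whole-percent shares that sum to 100."""
    total = sum(predicted_ms)
    shares = [100 * ms // total for ms in predicted_ms]
    # Give the rounding remainder to the longest step so the shares add up to 100.
    shares[predicted_ms.index(max(predicted_ms))] += 100 - sum(shares)
    return shares


def run_step(items, share, emit, process=lambda item: None):
    """Process items, calling emit() exactly `share` times, spread evenly over the loop."""
    n = len(items)
    for i, item in enumerate(items):
        process(item)
        # Emit one event for every 1% boundary this item carries us across.
        for _ in range((i + 1) * share // n - i * share // n):
            emit()


# Made-up predictions for six steps, in milliseconds.
shares = step_shares([400, 100, 250, 50, 150, 50])

events = []
for step, share in enumerate(shares):
    run_step(range(1000), share, lambda: events.append(step))
```

Only the ratios between the predictions matter here, which is why numbers measured on one reference machine still give a smooth bar on different hardware.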
Now the average SO reader may wonder why on earth someone would spend a week making a smooth progress bar. The feature was requested by the head salesman, and I believe he used it in sales meetings to get contracts. Money talks ;)
In situations with threads or asynchronous processes/tasks like this, I find it helpful to have an abstract type or object in the main thread that represents (and ideally encapsulates) each process. So, for each worker thread, there will presumably be an object (let's call it Operation) in the main thread to manage that worker, and obviously there will be some kind of list-like data structure to hold these Operations.
Where applicable, each Operation provides the start/stop methods for its worker, and in some cases - such as yours - numeric properties representing the progress and expected total time or work of that particular Operation's task. The units don't necessarily need to be time-based, if you know you'll be performing 6,230 calculations, you can just think of these properties as calculation counts. Furthermore, each task will need to have some way of updating its owning Operation of its current progress in whatever mechanism is appropriate (callbacks, closures, event dispatching, or whatever mechanism your programming language/threading framework provides).
So while your actual work is being performed off in separate threads, a corresponding Operation object in the "main" thread is continually being updated/notified of its worker's progress. The progress bar can update itself accordingly, mapping the total of the Operations' "expected" times to its total, and the total of the Operations' "progress" times to its current progress, in whatever way makes sense for your progress bar framework.
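A minimal sketch of that arrangement, in Python for illustration. The `Operation` class, its `advance` method, and the work counts are assumptions chosen to match the description above, not a specific framework's API.

```python
import threading


class Operation:
    """Main-thread handle for one worker: expected work and progress so far."""

    def __init__(self, expected):
        self.expected = expected  # e.g. number of calculations this task will perform
        self.progress = 0
        self._lock = threading.Lock()

    def advance(self, amount=1):
        """Called by the worker thread as it completes units of work."""
        with self._lock:
            self.progress = min(self.expected, self.progress + amount)


def overall_fraction(operations):
    """Value for the single progress bar: total progress over total expected work."""
    expected = sum(op.expected for op in operations)
    done = sum(op.progress for op in operations)
    return done / expected if expected else 1.0


def worker(op, units):
    for _ in range(units):
        op.advance()  # the real work for one unit would happen before this call


# Two operations, each run only halfway so the bar should read 50%.
ops = [Operation(6230), Operation(1770)]
threads = [threading.Thread(target=worker, args=(op, op.expected // 2)) for op in ops]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

fraction = overall_fraction(ops)
```

A GUI would call `overall_fraction` from a timer or from a notification raised in `advance`, so the progress bar never needs to know how many worker threads exist.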
Obviously there's a ton of other considerations/work that needs to be done in actually implementing this, but I hope this gives you the gist of it.
Multiple progress bars aren't such a bad idea, mind you. Or maybe a complex progress bar that shows several threads running (like download manager programs sometimes have). As long as the UI is intuitive, your users will appreciate the extra data.
When I try to answer such design questions I first try to look at similar or analogous problems in other application, and how they're solved. So I would suggest you do some research by considering other applications that display complex progress (like the download manager example) and try to adapt an existing solution to your application.
Sorry I can't offer more specific design, this is just general advice. :)
Stick with Observer/Observable for this kind of thing. Some object observes the various series processing threads and reports status by updating the summary bar.
