Visual Studio Load Test Agent Weighting - visual-studio-2012

I have a question about Visual Studio 2013. We're running load tests with agents. In total, we have 5 agents and 1 controller. In the agent properties (from the Manage Test Controller option), I have set the weighting to 15 for each agent. This totals 75, so does that mean that the controller handles the rest?
The documentation is a little vague: it suggests that if you make one agent 20 (e.g. A) and another 40 (e.g. B), then B will run double the load of A. However, I am not sure how this works when we have 5 agents all set to 15.
Thanks in advance!

The weights just specify ratios. If all the values are the same (i.e. 15 in your example) then each agent will get the same load. Because they are ratios rather than percentages, the full load is divided among the agents and nothing is left over for the controller.
Suppose you want to run a test with 300 simulated users with your 5 agents each having a weight of 15. Then each agent gets approximately (300*15)/(5*15) users, i.e. 60. The "approximately" is there because other values may not divide evenly into whole numbers.
Suppose instead that the 5 agents have weights of 7, 11, 13, 17 and 19. Then the 300 simulated users will be spread as approximately 31, 49, 58, 76 and 86, respectively (rounded so that the total is still 300).
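A quick sketch of the arithmetic in plain Python (this just mirrors the ratio calculation above, not anything Visual Studio does internally; in particular, handing the rounding remainder to the last agent is an assumption):

# Distribute simulated users across agents in proportion to their weights.
def distribute(total_users, weights):
    total_weight = sum(weights)
    loads = [(total_users * w) // total_weight for w in weights]
    loads[-1] += total_users - sum(loads)  # assumption: rounding remainder goes to the last agent
    return loads

print(distribute(300, [15, 15, 15, 15, 15]))  # [60, 60, 60, 60, 60]
print(distribute(300, [7, 11, 13, 17, 19]))   # [31, 49, 58, 76, 86]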

Related

Why does Azure Synapse limit the Storage Node size to 60?

I see that Synapse provisioned SQL pool (SQL DW) design keeps the data distribution limited to 60 nodes. Am I understanding that limitation correctly?
If so, how and why did Microsoft arrive at this specific number? Why 60 and not (say) 50 or 70? I am asking for an explanation of the design decision that led to the product having this configuration limit.
It was a number that had many factors :)
60 is the number of SQL distributions, which are supported on 1 to 60 nodes.
We can use 1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30 or 60 nodes per scale configuration; those are all of the divisors of 60, so the 60 distributions always divide evenly across the nodes.
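To spell out the pun in plain Python: 60 has an unusually large number of divisors, so every allowed node count splits the 60 distributions evenly.

# Every allowed node count is a divisor of 60.
divisors = [n for n in range(1, 61) if 60 % n == 0]
print(divisors)                        # [1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, 60]
print({n: 60 // n for n in divisors})  # distributions per node at each scale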

Can we modify the number of tablets/shards per table after we create a universe?

As described in this example, each tserver started with 12 tablets because we set the number of shards to 4.
And when we added a new node, the number of tablets per tserver became 9. It seems the total number of tablets, which is 36, will not increase.
My questions are:
How many nodes could we add while we have 36 tablets in total (in this example)?
And is it possible to increase the shard count in a running universe, to be able to add more nodes?
How many nodes could we add while we have 36 tablets in total (in this example)?
In this example, you can expand to 12 nodes (each node would end up with 1 leader and 2 followers).
Reasoning: There are 36 total tablets for this table and the replication factor is 3. So there will be 12 tablet leaders and 24 tablet followers. Leaders are responsible for handling writes and reads (unless you're doing follower reads; let's assume that is not the case). If you go to 12 nodes, each node would have at least one leader and be doing some work.
Ideally, you should create enough tablets upfront so that you end up with 4 tablets per node eventually.
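The arithmetic from the example above, written out in plain Python (nothing YugabyteDB-specific here):

# 36 total tablet peers for the table, replication factor 3.
total_tablet_peers = 36
replication_factor = 3

leaders = total_tablet_peers // replication_factor   # 12 tablet leaders
followers = total_tablet_peers - leaders             # 24 tablet followers

# Scaling out stops adding useful leader work once there are more nodes
# than leaders, so 12 nodes is the practical maximum for this table.
max_useful_nodes = leaders                               # 12
peers_per_node = total_tablet_peers // max_useful_nodes  # 3: one leader + two followers
print(leaders, followers, max_useful_nodes, peers_per_node)  # 12 24 12 3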
And is it possible to increase the shard count in a running universe, to be able to add more nodes?
This is currently not possible, but it is being worked on and is getting close to the finish; it is expected to be released in Q1 2020. If you are interested in this feature, please subscribe to this GitHub issue for updates.
Until that is ready, as a workaround, you can split the table into a sufficient number of tablets.

Customize the OpenAI Gym Taxi-v2 Environment

I would like to modify the Taxi-v2 environment in OpenAI Gym.
Is it possible to pick up 2 passengers before I reach the destination point?
Yes, it is possible: you can modify the taxi.py file in envs in the gym folder. For two passengers the number of states (the state space) will increase from 500 (5*5*5*4) to 10,000 (5*5*5*4*5*4); the extra factor of 5*4 covers the second passenger's location and destination. You can modify the code accordingly. You can also increase the action space from 6 to 8 by adding pickup and drop actions for passenger 2, to keep track of both passengers if you want.
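A minimal sketch of how the enlarged state could be encoded, mirroring the mixed-radix encode/decode helpers in gym's taxi.py; the two-passenger layout (second passenger's location and destination as extra factors of 5 and 4) is an assumption about how you might extend the file:

# Hypothetical encode/decode for a two-passenger Taxi variant.
# 5 (rows) * 5 (cols) * 5 (pass1 loc) * 4 (dest1) * 5 (pass2 loc) * 4 (dest2) = 10,000 states.
def encode(taxi_row, taxi_col, pass1_loc, dest1, pass2_loc, dest2):
    i = taxi_row
    i = i * 5 + taxi_col
    i = i * 5 + pass1_loc   # 4 pickup points + "in taxi"
    i = i * 4 + dest1
    i = i * 5 + pass2_loc
    i = i * 4 + dest2
    return i

def decode(i):
    out = []
    out.append(i % 4); i //= 4   # dest2
    out.append(i % 5); i //= 5   # pass2_loc
    out.append(i % 4); i //= 4   # dest1
    out.append(i % 5); i //= 5   # pass1_loc
    out.append(i % 5); i //= 5   # taxi_col
    out.append(i)                # taxi_row
    return list(reversed(out))

assert encode(4, 4, 4, 3, 4, 3) == 9999                        # largest state index
assert decode(encode(2, 1, 3, 0, 4, 2)) == [2, 1, 3, 0, 4, 2]

The two extra actions (pickup and dropoff for passenger 2) would then be handled in the step logic the same way the existing pickup and dropoff actions are.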

Cassandra data modeling timeseries data

I have this data about user visits for an app/service:
contentId (assume uuid),
platform (e.g. website, mobile, etc.),
softwareVersion (e.g. sw1.1, sw2.5, etc.),
regionId (e.g. us144, uk123, etc.)
....
I have modeled it very simply like
id(String) time(Date) count(int)
contentid1-sw1.1 Feb06 30
contentid1-sw2.1 Feb06 20
contentid1-sw1.1 Feb07 10
contentid1-sw2.1 Feb07 10
contentid1-us144 Feb06 23
contentid1-sw1.1-us144 Feb06 10
....
The reason is that there's a popular query where someone can ask for contentId=foo, platform=bar, regionId=baz, or any combination of those, for a range of time (say between Jan 01 and Feb 05).
But another query that's not easily answerable is:
Return the top K 'platform' values for contentId=foo between Jan 01 and Feb 05. "Top" here means sorted by the sum of 'count' over that range. So for the above data, a query for the top 2 platforms for contentId=contentId1 between Feb 06 and Feb 08 must return:
sw1.1 40
sw2.1 30
I am not sure how to model that in C* to get answers for top-K queries; does anyone have any ideas?
PS: there are 1 billion+ entries for each day.
Also, I am open to using Spark or any other framework along with C* to get these answers.
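Since Spark is on the table, one way to answer the top-K query offline is with the spark-cassandra-connector. A rough sketch (the keyspace, table and column names here are assumptions based on the model above, not an existing schema):

# Rough sketch: top-K platforms by summed count over a date range.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("top-k-platforms").getOrCreate()

# Assumed table: metrics.daily_counts(content_id, platform, day, count)
df = (spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(keyspace="metrics", table="daily_counts")
      .load())

top_k = (df.filter((F.col("content_id") == "contentid1") &
                   F.col("day").between("2019-02-06", "2019-02-08"))
           .groupBy("platform")
           .agg(F.sum("count").alias("total"))
           .orderBy(F.col("total").desc())
           .limit(2))

top_k.show()  # e.g. sw1.1 -> 40, sw2.1 -> 30 for the sample data above

At a billion rows per day this is a batch job rather than an interactive query; for low-latency answers you would typically pre-aggregate counts into a table keyed by the dimensions you want to rank.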

Tracking metrics using StatsD (via etsy) and Graphite, graphite graph doesn't seem to be graphing all the data

We have a metric that we increment every time a user performs a certain action on our website, but the graphs don't seem to be accurate.
So going off this hunch, we investigated carbon's updates.log and discovered that the action had happened over 4 thousand times today (using grep and wc), but according to the integral result of the graph it returned only around 220.
What could be the cause of this? Data is being reported to statsd using the statsd php library, and calling statsd::increment('metric'); and as stated above, the log confirms that 4,000+ updates to this key happened today.
We are using:
graphite 0.9.6 with statsD (etsy)
After some research through the documentation, and some conversations with others, I've found the problem - and the solution.
The way the whisper file format is designed, it expects you (or your application) to publish updates no faster than the minimum interval in your storage-schemas.conf file. This file is used to configure how much data retention you have at different time-interval resolutions.
My storage-schemas.conf file was set with a minimum retention time of 1 minute. The default StatsD daemon (from etsy) is designed to update carbon (the graphite daemon) every 10 seconds. The reason this is a problem is that over a 60 second period StatsD reports 6 times, and each write overwrites the previous one within that 60 second interval, because you're updating faster than once per minute. This produces really weird results on your graph, because the last 10 seconds in a minute could be completely dead and report a 0 for the activity during that period, which results in completely nuking all of the data you had written for that minute.
To fix this, I had to re-configure my storage-schemas.conf file to store data at a finest resolution of 10 seconds, so every update from StatsD would be saved in the whisper database without being overwritten.
Etsy published the storage-schemas.conf configuration that they were using for their installation of carbon, which looks like this:
[stats]
priority = 110
pattern = ^stats\..*
retentions = 10:2160,60:10080,600:262974
This has a 10 second minimum retention time, and stores 6 hours worth of them. However, due to my next problem, I extended the retention periods significantly.
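For reference, each entry in that retentions line uses the 0.9.x seconds-per-point:number-of-points format, so the durations work out like this:

# Convert the etsy retention string into rough durations.
retentions = "10:2160,60:10080,600:262974"
for entry in retentions.split(","):
    seconds_per_point, points = map(int, entry.split(":"))
    days = seconds_per_point * points / 86400
    print(f"{seconds_per_point}s precision kept for {days:.1f} days")
# 10s precision kept for 0.2 days
# 60s precision kept for 7.0 days
# 600s precision kept for 1826.2 days

So the etsy defaults keep 10-second points for 6 hours, minute points for a week, and 10-minute points for roughly five years.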
As I let this data collect for a few days, I noticed that it still looked off (and was under reporting). This was due to 2 problems.
StatsD (older versions) only reported an average number of events per second for each 10 second reporting period. This means, if you incremented a key 100 times in 1 second and 0 times for the next 9 seconds, at the end of the 10th second statsD would report 10 to graphite, instead of 100 (100/10 = 10). This failed to report the total number of events for a 10 second period (obviously).
Newer versions of statsD fix this problem, as they introduced the stats_counts bucket, which logs the total # of events per metric for each 10 second period (so instead of reporting 10 in the previous example, it reports 100).
After I upgraded StatsD, I noticed that the last 6 hours of data looked great, but as I looked beyond the last 6 hours, things looked weird, and the next reason is why:
As graphite stores data, it moves data from high precision retention to lower precision retention. This means, using the etsy storage-schemas.conf example, after 6 hours of 10 second precision, data was moved to 60 second (1 minute) precision. In order to move 6 data points from 10s to 60s precision, graphite takes the average of those 6 data points: it takes the total value of the oldest 6 data points and divides it by 6. This gives an average # of events per 10 seconds for that 60 second period (and not the total # of events, which is what we care about).
This is just how graphite is designed, and for some cases it might be useful, but in our case, it's not what we wanted. To "fix" this problem, I increased our 10 second precision retention time to 60 days. Beyond 60 days, I store the minutely and 10-minutely precisions, but they're essentially there for no reason, as that data isn't as useful to us.
I hope this helps someone, I know it annoyed me for a few days - and I know there isn't a huge community of people that are using this stack of software for this purpose, so it took a bit of research to really figure out what was going on and how to get a result that I wanted.
After posting my comment above I found Graphite 0.9.9 has a (new?) configuration file, storage-aggregation.conf, in which one can control the aggregation method per pattern. The available options are average, sum, min, max, and last.
http://readthedocs.org/docs/graphite/en/latest/config-carbon.html#storage-aggregation-conf
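For example, an entry like the following (the section name and pattern are assumptions, matching the stats_counts bucket mentioned earlier) keeps counter totals intact when they are rolled up, by summing instead of averaging:

[stats_counts_sum]
pattern = ^stats_counts\..*
xFilesFactor = 0
aggregationMethod = sum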
