Sample User Data for multiple tests across parallel workers in Playwright - node.js

I have a use case for testing authentication functionality where there are multiple test cases, like logging in to an app, forgetting a password, and logging in to MFA-enabled applications. I have a set of multiple users that can be used in any of the test cases, but an issue arises when trying to run them in multiple browser contexts. I have stored my test data in a JSON file with the username and password of multiple sample users.
When, say, the test for logging in to an MFA-enabled application runs, all three browser workers are launched simultaneously and all of them try to get user details from the test data file.
BUT here is the issue: all of them pick up the first object, say user A. All three browser tests pass up to the password step, but when the MFA code is entered there is a race condition: the worker that submits the OTP first passes and the rest fail, because the OTP for that 30-second window has already been redeemed.
I want something that works like a synchronized method in Java: if a worker is using one user, don't make that user available to another worker; instead give the other worker the next user from the test data.
Please guide me on how to do that in Playwright!

I'm sure there are far more elegant ways of doing this, but one approach is to use the worker index / parallel index feature described in the docs here:
https://playwright.dev/docs/test-parallel#worker-index-and-parallel-index
It looks like parallel index may be a better fit for your use case.
If each of your rows of test data includes the index of the worker it is intended for, then your code can ensure that worker 0 only picks up worker 0's test data and worker 1 picks up worker 1's test data.
Alternatively, you could use testConfig.workers to determine how many workers there are and then use the remainder/modulo operator (%) in JS to split the rows up between the different workers:
https://playwright.dev/docs/api/class-testconfig#test-config-workers
So you would compare TestDataRowNumber % testConfig.workers to process.env.TEST_PARALLEL_INDEX to split the file up amongst however many workers you have. A row belongs to a worker when the two values are equal.
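As a rough sketch of the modulo idea above (not a definitive implementation): the helper names rowsForWorker and userForWorker, the sample users, and the hard-coded worker count of 3 are all assumptions made for illustration. Only process.env.TEST_PARALLEL_INDEX comes from the Playwright docs linked above.

```javascript
// Hypothetical test data; in the question this lives in a JSON file.
const users = [
  { username: 'userA', password: 'passA' },
  { username: 'userB', password: 'passB' },
  { username: 'userC', password: 'passC' },
];

// Returns the rows assigned to one worker: row i belongs to the worker
// whose parallel index equals i % workerCount.
function rowsForWorker(rows, workerCount, parallelIndex) {
  return rows.filter((_, rowNumber) => rowNumber % workerCount === parallelIndex);
}

// Picks a single user for the current worker, failing loudly if there
// are fewer users than workers.
function userForWorker(rows, workerCount, parallelIndex) {
  const mine = rowsForWorker(rows, workerCount, parallelIndex);
  if (mine.length === 0) {
    throw new Error(`No test user available for parallel index ${parallelIndex}`);
  }
  return mine[0];
}

// Playwright sets TEST_PARALLEL_INDEX in each worker process.
// Fall back to 0 when this file is run outside Playwright.
const parallelIndex = Number(process.env.TEST_PARALLEL_INDEX ?? 0);
const workerCount = 3; // assumption: matches `workers` in the Playwright config

console.log(userForWorker(users, workerCount, parallelIndex).username);
```

With three workers and three users, each worker ends up with its own user, so no two workers redeem the same OTP. Inside a test, the same index should also be readable from the test info object rather than the environment variable.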


Submitting multiple runs to the same node on AzureML

I want to perform hyperparameter search using AzureML. My models are small (around 1GB) thus I would like to run multiple models on the same GPU/node to save costs but I do not know how to achieve this.
The way I currently submit jobs is the following (resulting in one training run per GPU/node):
from azureml.core import Experiment, ScriptRunConfig

experiment = Experiment(workspace, experiment_name)
config = ScriptRunConfig(source_directory="./src",
                         script="train.py",
                         compute_target="gpu_cluster",
                         environment="env_name",
                         arguments=["--args args"])
run = experiment.submit(config)
ScriptRunConfig can be provided with a distributed_job_config. I tried to use MpiConfiguration there but if this is done the run fails due to an MPI error that reads as if the cluster is configured to only allow one run per node:
Open RTE detected a bad parameter in hostfile: [...]
The max_slots parameter is less than the slots parameter:
slots = 3
max_slots = 1
[...] ORTE_ERROR_LOG: Bad Parameter in file util/hostfile/hostfile.c at line 407
Using HyperDriveConfig also defaults to submitting one run to one GPU and additionally providing a MpiConfiguration leads to the same error as shown above.
I guess I could always rewrite my train script to train multiple models in parallel, s.t. each run wraps multiple trainings. I would like to avoid this option though, because then logging and checkpoint writes become increasingly messy and it would require a large refactor of the train pipeline. Also this functionality seems so basic that I hope there is a way to do this gracefully. Any ideas?
Use the Run.create_children method, which will start child runs that are “local” to the parent run and don’t need authentication.
For AmlCompute, max_concurrent_runs maps to the maximum number of nodes that will be used for a hyperparameter tuning run, so there would be 1 execution per node.
With a single deployed service, you can load multiple model versions in the init function; the score function then uses a particular model version depending on the request’s param.
Or use the new ML Endpoints (Preview):
What are endpoints (preview) - Azure Machine Learning | Microsoft Docs

Running a repetitive task in Node.js for each row in a postgres table on a different interval for each row

What would be a good approach to running a repetitive task for each row in a large postgres db table, on a different per-row interval, in Node.js?
To give you some more context, here's a quick description of the application:
It's a chat based customer support app.
It consists of teams, which can be either a client team or a support team. Teams have users, which can be either client users or support users.
Client users send messages to a support team and wait for one of that team's users to answer their question.
When there's an unanswered client message waiting for a response, every agent for the receiving support team will receive a notification every n seconds (n being set on a per-team basis by the team admin).
So this task needs to infinitely loop through the rows in the teams table and send notifications if:
The team has messages waiting to be answered.
N seconds have passed since the last notification was sent (N being the number of seconds set by the team admin).
There might be a better approach to this condition altogether.
So my questions are:
What is an efficient way to infinitely loop through a postgres table with no upper limit on the number of rows?
Should I load 1 row at a time? Several at a time?
What would be a good way to do this in Node?
I'm using Knex. Does Knex provide a mechanism for lazy loading a table and iterating through the rows?
A) Running a repetitive task in Node can be done with the built-in JS function 'setInterval'.
// run intervalFnc() every 5 seconds
const timerId = setInterval(intervalFnc, 5000);
function intervalFnc() { console.log("Hello"); }
// to quit running it:
clearInterval(timerId);
Then your interval function can do the actual work. An alternative would be to use cron (linux), or some OS process scheduler to trigger the function. I would use this method if you want to do it every minute, and a cron job if you want to do it every hour (in between these times becomes more debatable).
B) An efficient way...
B-1) Retrieving a block of records from a DB will be more efficient than one at a time. Knex has .offset and .limit clauses to choose a group of records to retrieve. A sample from the knex doc:
knex.select('*').from('users').limit(10).offset(30)
B-2) Indexed database access is important for performance if your tables are very large. I would recommend including a status flag field in your table to note which records are 'in-process', plus a "next-review-timestamp" field, with both fields indexed. Retrieve the records that have status_flag='in-process' AND next_review_timestamp <= now(). Sample:
knex('users').where('status_flag', 'in-process').whereRaw('next_review_timestamp <= now()')
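To make the "status flag plus next-review timestamp" idea concrete, here is a minimal in-memory sketch. It assumes a hypothetical row shape (id, status_flag, interval_seconds, next_review_timestamp in milliseconds) and hypothetical helpers dueTeams and tick. A plain array stands in for the table, so in a real app the filter would be the indexed query above and the update would be written back to the database.

```javascript
// Selects the rows that are waiting for an answer and whose next review
// time has arrived. Mirrors: status_flag = 'in-process' AND
// next_review_timestamp <= now().
function dueTeams(rows, now) {
  return rows.filter(
    (row) => row.status_flag === 'in-process' && row.next_review_timestamp <= now
  );
}

// Processes one polling pass: notifies every due team, then pushes its
// next review time forward by that team's own interval.
function tick(rows, now, notify) {
  for (const team of dueTeams(rows, now)) {
    notify(team);
    team.next_review_timestamp = now + team.interval_seconds * 1000;
  }
}

// Example wiring (commented out so the sketch does not run forever):
// const teams = [
//   { id: 1, status_flag: 'in-process', interval_seconds: 30, next_review_timestamp: 0 },
// ];
// setInterval(() => tick(teams, Date.now(), (t) => console.log('notify team', t.id)), 1000);
```

Because each row carries its own next review time, one short fixed-interval timer can serve every team regardless of its configured interval, rather than one timer per row.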
Hope this helps!

Getting Multiple Last Price Quotes from Interactive Brokers's API

I have a question regarding the Python API of Interactive Brokers.
Can multiple asset and stock contracts be passed into the reqMktData() function to obtain the last prices? (I can set snapshot = TRUE in reqMktData to get the last price. You can assume that I have subscribed to the appropriate data services.)
To put things in perspective, this is what I am trying to do:
1) Call reqMktData, get last prices for multiple assets.
2) Feed the data into my prediction engine, and do something
3) Go to step 1.
When I contacted Interactive Brokers, they said:
"Only one contract can be passed to reqMktData() at one time, so there is no bulk request feature in requesting real time data."
Obviously one way to get around this is to do a loop but this is too slow. Another way to do this is through multithreading but this is a lot of work plus I can't afford the extra expense of a new computer. I am not interested in either one.
Any suggestions?
You can only specify 1 contract in each reqMktData call. There is no choice but to use a loop of some type. The speed shouldn't be an issue as you can make up to 50 requests per second, maybe even more for snapshots.
The speed issue could be that you want too much data (> 50/s) or that you're using an old version of the IB Python API. Check connection.py for lock.acquire; I've deleted all of those calls. Also, if there has been no trade for >10 seconds, IB will wait for a trade before sending a snapshot. Test with active symbols.
However, what you should do is request live streaming data by setting snapshot to false and just keep track of the last price in the stream. You can stream up to 100 tickers with the default minimums. You keep them separate by using unique ticker ids.
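The bookkeeping for "keep track of the last price in the stream" could look roughly like the sketch below. It makes no calls to the IB API: LastPriceTracker and its methods are hypothetical names, and the tick type value 4 for "last price" is an assumption to check against your API version. The idea is that whatever tick-price callback your client library exposes would forward its ticker id, tick type and price to onTickPrice.

```javascript
// Assumption: tick type 4 identifies "last price" ticks.
const LAST_PRICE_TICK = 4;

class LastPriceTracker {
  constructor() {
    this.symbolByTickerId = new Map(); // ticker id -> symbol
    this.lastPriceBySymbol = new Map(); // symbol -> most recent last price
  }

  // Call once per streaming subscription, using a unique ticker id.
  register(tickerId, symbol) {
    this.symbolByTickerId.set(tickerId, symbol);
  }

  // Call from the streaming tick-price callback.
  // Ignores non-last ticks (bid, ask, ...) and unknown ticker ids.
  onTickPrice(tickerId, tickType, price) {
    const symbol = this.symbolByTickerId.get(tickerId);
    if (symbol === undefined || tickType !== LAST_PRICE_TICK) return;
    this.lastPriceBySymbol.set(symbol, price);
  }

  // Returns the latest known last price per symbol as a plain object.
  snapshot() {
    return Object.fromEntries(this.lastPriceBySymbol);
  }
}
```

Step 1 of the loop in the question then becomes a call to snapshot(), which reads from memory instead of issuing one request per contract.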

JMeter reports are different in Jenkins

I have a JMeter test that has two thread groups. The first thread group goes out and gets auth and audit tokens. The second requires the tokens to test the APIs on which I'm interested in gathering performance data. I have Listeners set up as children of the samplers in the second thread group only. Running JMeter I get the results I want. But when I execute the same test from Jenkins, I get results from both of the thread groups. I don't want the results from the first thread group. They clutter up my graphs, and since there is only one execution of each, their performance fluctuates enough to routinely trigger my unstable/failed percentages. Is there a way to get Jenkins to report on only the listeners/samplers I want? Do I have to run one test to get the tokens and another to test? If so, how do I pass the tokens from one test to the other?
You can execute 2 Jenkins jobs:
The first job writes the tokens to a file using a BeanShell/JSR223 PostProcessor.
The second job reads the tokens from the file using a CSV Data Set Config.

JMeter multiple or nested users/threads

The system I am trying to test involves various users interacting with various events.
I'd like to create a JMeter scenario as follows:
There are 10 users and 50 events. The test will generate a total of 50 interactions (API calls) per second, with interactions distributed evenly across all users and events. In other words, each user will generate 5 interactions with a different event each second. There will be only one interaction per event per second, and no users will overlap in the events they are interacting with.
Using either a CSV/TXT file with a list of 10 users or a CSV/TXT file of a list of 50 events, I am able to separately create either 10 threads of users or 50 threads of events. However, I am not able to create them together in the same script.
As a result, I am only able to create a script that generates 10 users interacting with one event, or one user interacting with 50 events.
Any thoughts?
Thanks in advance.
I'd create a test with 10 threads and two different CSV files: one CSV file with 10 users and one with 50 events. If the CSV Data Set Config sharing mode is set to All threads, then you can have each of the 10 users calling different events.
