How to download files or custom logs written as part of a test in performance center? - performance-testing

I have a script that creates an ID each time it runs, and I want the list of IDs from a 30-minute test executed with this script. I want to run the test on Performance Center, which uses a load generator (LG) that I don't have access to. Basically, how do I get the files that the script creates with functions like fopen and fprintf after every test executed in Performance Center?

Use lr_output_message(). This sends the data to the controller for tracking. You could also use Virtual Table Server or a queue service running in a cloud provider.
Realistically, you do not want to write data to the local load generator. You would turn the local drive into a bottleneck for the entire test as multiple users compete for access to the write head. This is also why no logging, or logging only on error, is recommended for execution.
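Once the IDs are in the output log instead of a local file, collecting them afterwards is a small parsing job. The sketch below is in Python and assumes the script logs each ID with a marker of its own choosing, for example lr_output_message("GENERATED_ID=%s", id). The GENERATED_ID marker and the sample log lines are made up for illustration and are not a LoadRunner convention.

```python
import re
from typing import List

# Hypothetical marker: assumes the script logs each ID as "GENERATED_ID=<value>".
ID_PATTERN = re.compile(r"GENERATED_ID=(\S+)")


def extract_ids(log_text: str) -> List[str]:
    """Return every logged ID in order of first appearance, without duplicates."""
    seen = set()
    ids = []
    for match in ID_PATTERN.finditer(log_text):
        value = match.group(1)
        if value not in seen:
            seen.add(value)
            ids.append(value)
    return ids


# Made-up sample of collected log output.
sample_log = (
    "Action.c(12): GENERATED_ID=A1001\n"
    "Action.c(12): GENERATED_ID=A1002\n"
    "Action.c(12): GENERATED_ID=A1001\n"
)
print(extract_ids(sample_log))  # ['A1001', 'A1002']
```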


Azure Automation Use Case

I have a certain script (python), which needs to be automated that is relatively memory and CPU intensive. For a monthly process, it runs ~300 times, and each time it takes somewhere from 10-24 hours to complete, based on input. It takes certain (csv) file(s) as input and produces certain file(s) as output, after processing of course. And btw, each run is independent.
We need to use configs and be able to pass command-line arguments to the script. Certain imports that are not default Python packages need to be installed as well (requirements.txt). We also need to take care of the logging pipeline (EFK) setup: Elasticsearch and Kibana can be centralised, but where do we keep the log files and the fluentd config?
Last bit is monitoring - will we be able to restart in case of unexpected closure?
Best way to automate this, tools and technologies?
My thoughts
Create a Docker image of the whole setup (Python script, fluentd config, Python packages etc.). Then we somehow auto-deploy this image (on a VM, or something else?), execute the Python process, save the output files to some central location (a data lake, for example) and destroy the instance upon successful completion of the process.
So, is what I'm thinking possible in Azure? If it is, what are the cloud components I need to explore -- answer to my somehows and somethings? If not, what is probably the best solution for my use case?
Any lead would be much appreciated. Thanks.
Normally for short-lived jobs I'd say use an Azure Function. The thing is, they have a maximum runtime of 10 minutes unless you put them on an App Service plan. But that will cost more unless you manually stop/start the App Service plan.
If you can containerize the whole thing I recommend using Azure Container Instances because you then only pay for what you actually use. You can use an Azure Function to start the container, based on an HTTP request, a timer or something like that.
You can set a restart policy to indicate what should happen in case of unexpected failures, see the docs.
Configuration can be passed from the Azure Function to the container instance or you could leverage the Azure App Configuration service.
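Whichever service starts the container, the script inside it has to pick the configuration up somehow. Here is a minimal sketch of one way to do that, assuming settings arrive as command-line arguments, environment variables and an optional JSON file. The names JOB_OUTPUT_DIR and JOB_CONFIG and the default path are hypothetical, not anything Azure defines.

```python
import argparse
import json
import os


def load_settings(argv=None, environ=None):
    """Merge an optional JSON config file, environment variables and CLI arguments."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(description="Monthly processing job (sketch)")
    parser.add_argument("--input", required=True, help="path to the input CSV")
    # JOB_OUTPUT_DIR and JOB_CONFIG are hypothetical variable names.
    parser.add_argument("--output-dir", default=environ.get("JOB_OUTPUT_DIR", "/data/out"))
    parser.add_argument("--config", default=environ.get("JOB_CONFIG"),
                        help="optional JSON file with extra settings")
    args = parser.parse_args(argv)

    settings = {}
    if args.config:
        with open(args.config, encoding="utf-8") as fh:
            settings.update(json.load(fh))
    # Explicit arguments win over values from the file.
    settings["input"] = args.input
    settings["output_dir"] = args.output_dir
    return settings


print(load_settings(["--input", "run_001.csv"], environ={"JOB_OUTPUT_DIR": "/mnt/out"}))
# {'input': 'run_001.csv', 'output_dir': '/mnt/out'}
```

With this shape, the starter only needs to set a command line and a few environment variables per run, and the image itself stays the same for all ~300 runs.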
Though I don't know all the details, this sounds like a good candidate for Azure Batch. There is no additional charge for using Batch. You only pay for the underlying resources consumed, such as the virtual machines, storage, and networking. Batch works well with intrinsically parallel (also known as "embarrassingly parallel") workloads.
The following high-level workflow is typical of nearly all applications and services that use the Batch service for processing parallel workloads:
Basic Workflow
1. Upload the data files that you want to process to an Azure Storage account. Batch includes built-in support for accessing Azure Blob storage, and your tasks can download these files to compute nodes when the tasks are run.
2. Upload the application files that your tasks will run. These files can be binaries or scripts and their dependencies, and are executed by the tasks in your jobs. Your tasks can download these files from your Storage account, or you can use the application packages feature of Batch for application management and deployment.
3. Create a pool of compute nodes. When you create a pool, you specify the number of compute nodes for the pool, their size, and the operating system. When each task in your job runs, it's assigned to execute on one of the nodes in your pool.
4. Create a job. A job manages a collection of tasks. You associate each job to a specific pool where that job's tasks will run.
5. Add tasks to the job. Each task runs the application or script that you uploaded to process the data files it downloads from your Storage account. As each task completes, it can upload its output to Azure Storage.
6. Monitor job progress and retrieve the task output from Azure Storage.
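Because each run is independent, the "add tasks to the job" step boils down to one task per input file. The sketch below only builds the task descriptions in plain Python and deliberately does not call the Batch SDK. TaskSpec, build_tasks, process.py and the --input/--output flags are hypothetical names for illustration; each resulting entry would then be mapped to a real Batch task by whatever client you use.

```python
import shlex
from dataclasses import dataclass
from typing import List


@dataclass
class TaskSpec:
    """Plain description of one task; not a Batch SDK type."""
    task_id: str
    command_line: str
    output_name: str


def build_tasks(input_files: List[str], script: str = "process.py") -> List[TaskSpec]:
    """Create one independent task per input file."""
    tasks = []
    for index, name in enumerate(input_files):
        stem = name.rsplit(".", 1)[0]
        output_name = f"{stem}_result.csv"
        # shlex.join quotes arguments so odd file names survive the shell.
        command = shlex.join(["python3", script, "--input", name, "--output", output_name])
        tasks.append(TaskSpec(f"task-{index:03d}", command, output_name))
    return tasks


for task in build_tasks(["jan_a.csv", "jan_b.csv"]):
    print(task.task_id, "->", task.command_line)
# task-000 -> python3 process.py --input jan_a.csv --output jan_a_result.csv
# task-001 -> python3 process.py --input jan_b.csv --output jan_b_result.csv
```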
I would go with Azure DevOps and a custom agent pool. This agent pool could include some virtual machines (maybe only one) with Docker installed. I would then install all the necessary packages that you mentioned in a Docker container, along with the DevOps agent (it is needed to communicate with the agent pool).
You could pass every parameter needed to the build container agents through Azure DevOps tasks and also have a common storage layer for the build and release pipelines. This way you could manipulate/process your files in the build pipeline and then, using the same folder, create a task in the release pipeline to export/upload those files somewhere.
As this script should run many times through the month, you could have many containers so that more than one job can run at a given time.
I follow the same procedure in a corporate environment. I keep a VM running Windows with multiple Docker containers to compile different code frameworks. Each container includes different tools and is registered to a custom agent pool. Jobs are distributed across those containers, and the build and release pipelines run several jobs in parallel.
You should probably use Azure Data Factory for moving and transforming the data.
You can then also use ADF to call Azure Batch, which would run the Python script.
If you add more information, people may be able to suggest better options.

Build an extensible system for scraping websites

Currently, I have a server running. Whenever I receive a request, I want some mechanism to start the scraping process on some other resource (preferably dynamically created), as I don't want to perform scraping on my main instance. Further, I don't want the other instance to keep running and charging me when I am not scraping data.
So, preferably a system that I can request to start scraping the site and close when it finishes.
So far I have looked at Google Cloud Functions, but they are capped at 9 minutes per function, so they won't fit my requirement, as scraping would take much more time than that. I have also looked at the AWS SDK. It allows us to create VMs at runtime and also shut them down, but I can't figure out how to push my API script onto the newly created AWS instance.
Further, the system should be extensible. Like I have many different scripts that scrape different websites. So, a robust solution would be ideal.
I am open to using any technology. Any help would be greatly appreciated. Thanks
I can't figure out how to push my API script onto the newly created AWS instance.
This is achieved by using UserData:
When you launch an instance in Amazon EC2, you have the option of passing user data to the instance that can be used to perform common automated configuration tasks and even run scripts after the instance starts.
So basically, you would construct your UserData to install your scripts, all dependencies and run them. This would be executed when new instances are launched.
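As a rough sketch of that idea in Python: the first function builds a UserData shell script that downloads the scraper, runs it, and shuts the machine down. The second launches an instance with boto3 and sets the shutdown behaviour to terminate, so the instance goes away when the script finishes. The script URL, the file paths and the assumption that the image already has curl and python3 are all hypothetical, and the scraper is assumed to upload its own results before it exits.

```python
import shlex


def build_user_data(script_url: str) -> str:
    """Build a shell script for EC2 UserData: fetch the scraper, run it, power off."""
    lines = [
        "#!/bin/bash",
        # Assumes the image already ships curl and python3.
        f"curl -fsSL {shlex.quote(script_url)} -o /opt/scraper.py",
        "python3 /opt/scraper.py > /var/log/scraper.log 2>&1",
        # Power off even if the scraper failed, so the instance does not keep billing.
        "shutdown -h now",
    ]
    return "\n".join(lines) + "\n"


def launch_scraper(image_id: str, user_data: str, instance_type: str = "t3.small") -> str:
    """Launch one instance that terminates itself when the UserData script shuts it down."""
    import boto3  # third-party; imported here so build_user_data has no dependencies

    ec2 = boto3.client("ec2")
    response = ec2.run_instances(
        ImageId=image_id,
        InstanceType=instance_type,
        MinCount=1,
        MaxCount=1,
        UserData=user_data,
        InstanceInitiatedShutdownBehavior="terminate",
    )
    return response["Instances"][0]["InstanceId"]


# Hypothetical URL, shown only to illustrate the generated script.
print(build_user_data("https://example.com/scrapers/site_a.py"))
```

Supporting another website then means pointing build_user_data at a different script, which keeps the launcher itself unchanged.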
If you want the system to be scalable, you can launch your instances in an Auto Scaling group and scale it up or down as you require.
The other option is running your scripts as Docker containers, for example using AWS Fargate.
By the way, AWS Lambda has a limit of 15 minutes, so not much more than Google Cloud Functions.

Cloud-based node.js console app needs to run once a day

I'm looking for what I would assume is quite a standard solution: I have a node app that doesn't do any web-work - simply runs and outputs to a console, and ends. I want to host it, preferably on Azure, and have it run once a day - ideally also logging output or sending me the output.
The only solution I can find is to create a VM on Azure, and set a cron job - then I need to either go fetch the debug logs daily, or write node code to email me the output. Anything more efficient available?
Azure Functions would be worth investigating. A function can be timer-triggered, and it would avoid the overhead of a VM.
I would also investigate Azure Container Instances, which is a good match for this use case. You can build a container image that contains your Node app and run it on an ACI instance.

When using Azure Batch Processing, what is the best way to create and use a configuration file which can change per instance?

So I'm new to Azure and currently working on a project in which I will be using Azure Batch to run an application in several instances with different configurations.
I was wondering what the best practice for doing this is, with reference to how easy it is to change the configuration files, how to deploy, how to link them with source control etc.
Any thoughts/knowledge would be helpful, as I can't seem to find much about Azure Batch and configuration files.
You would manage configuration in the Batch client, which is a normal application running on the client side, outside of the Batch job. This application creates pools, jobs and tasks, i.e. sends them to the Batch queue. You can store configuration for this client in whatever way you are used to (app.config, JSON files etc.).
Scheduling a job to run in Batch involves specifying job parameters like the pool id, task id, resource files etc., and the command line for an executable to run. This is where you pass the required parameters for a task instance to use.
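To make that concrete, here is a small sketch of how a client might turn shared defaults plus per-instance overrides into a command line for each task. It is plain Python and does not use the Batch SDK. BASE_CONFIG, render_task, the myapp executable and its --config-json flag are hypothetical; in practice the defaults would live in a JSON file kept under source control next to the client.

```python
import json
import shlex
from typing import Any, Dict

# Hypothetical defaults; in practice loaded from a versioned JSON file.
BASE_CONFIG: Dict[str, Any] = {"mode": "full", "threads": 4}


def render_task(task_id: str, overrides: Dict[str, Any]) -> Dict[str, str]:
    """Merge defaults with per-instance overrides and render the task command line."""
    config = {**BASE_CONFIG, **overrides}  # overrides win over defaults
    payload = json.dumps(config, sort_keys=True)
    # shlex.join quotes the JSON so it reaches the executable as a single argument.
    command_line = shlex.join(["myapp", "--config-json", payload])
    return {"id": task_id, "command_line": command_line}


instances = {"task-eu": {"region": "eu"}, "task-us": {"region": "us", "threads": 8}}
for task_id, overrides in instances.items():
    print(render_task(task_id, overrides))
```

Changing the configuration of one instance then only touches its overrides entry, while the defaults and the client code stay untouched.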

Running (& compiling) untrusted user code

I want to create an application with a feature that allows users to submit code, which the server will compile and run, similar to Ideone and Spoj. How do I do this securely and in a scalable manner?
Partial Solutions I'm aware of:
IDEA 1 - 3rd Party Services
The Sphere Engine. However this costs a LOT of money!
I'm not aware of any open source application I can run on my server to achieve this, or a cheaper alternative. Please correct me if I'm wrong.
This would be the next most sensible choice. However, I'm unsure how to implement it. For example, let's say I created a VM and started to run the user's code. This would restrict damage to MY system, but not damage to the VM, which other users would have to use. Does that mean I have to create a new VM each and every time I want to compile and run a user's code (which clearly is not scalable, correct me if I'm wrong)?
Having not set up such a thing myself, I assume that services like TravisCI (which compiles code and runs it against test cases you provide) have a base virtual machine image, which boots up and processes your code. The next user to come along gets a separate VM booted from the same base image, and your changes aren't stored.
So inside the VM, the user code can do whatever. All of its effects, except stuff written to the console will be erased at the end of the time limit.
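The lifecycle described here (fresh environment, time limit, keep only console output) can be sketched in a few lines of Python. To be clear, this is NOT a sandbox: a temporary directory and a subprocess give no real isolation, so in a real system this function would run inside the throwaway VM or container, not on the host. run_untrusted and its return format are made up for illustration, and it runs Python source rather than compiling anything.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_untrusted(source: str, time_limit: float = 5.0) -> dict:
    """Run submitted Python source in a scratch directory and keep only console output.

    Illustrates the lifecycle only; it provides no security isolation by itself.
    """
    with tempfile.TemporaryDirectory() as workdir:
        program = Path(workdir) / "main.py"
        program.write_text(source, encoding="utf-8")
        try:
            done = subprocess.run(
                [sys.executable, "-I", str(program)],  # -I: isolated mode
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=time_limit,  # the child is killed when the limit expires
            )
        except subprocess.TimeoutExpired:
            return {"status": "timeout", "stdout": "", "stderr": ""}
        status = "ok" if done.returncode == 0 else "error"
        return {"status": status, "stdout": done.stdout, "stderr": done.stderr}
    # Leaving the with-block deletes the scratch directory and everything written to it.


print(run_untrusted("print('hello from user code')"))
print(run_untrusted("while True: pass", time_limit=1.0)["status"])
```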
